CN117332067A - Question-answer interaction method and device, electronic equipment and storage medium - Google Patents
Info
- Publication number: CN117332067A
- Application number: CN202311427205.3A
- Authority
- CN
- China
- Prior art keywords
- question
- information
- knowledge
- language model
- similarity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F16/3329: Natural language query formulation (under G06F16/33 Querying; G06F16/332 Query formulation)
- G06F18/22: Matching criteria, e.g. proximity measures (under G06F18/20 Analysing)
- G06N5/022: Knowledge engineering; Knowledge acquisition (under G06N5/02 Knowledge representation; Symbolic representation)
Abstract
The invention discloses a question-answer interaction method and device, an electronic device, and a storage medium. In embodiments of the invention, first knowledge in a first storage pool that is similar to the question information is used as a second prompt word, and the second prompt word, together with a first prompt word corresponding to the question information, forms the overall prompt with which a question-answer request is sent to the large language model. This reduces the length of the data the large language model must predict, achieving the aim of saving cost.
Description
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a question-answer interaction method and device, an electronic device, and a storage medium.
Background
Smart phones are an indispensable tool of modern life, providing help with work, study, and entertainment alike. The voice assistant is an important smartphone function: through speech recognition it can help the user complete various tasks quickly, and it can also converse with the user for relaxation and entertainment. However, traditional on-device voice assistants are based on rule-driven multi-turn dialogue, and the answers they generate come from fixed response templates, so the answers often lack personalization and creativity and cannot meet users' diverse needs.
To meet those diverse needs, the related art proposes using a large language model as the voice assistant, for example ChatGPT (Chat Generative Pre-trained Transformer). ChatGPT differs markedly from a traditional on-device voice assistant: through natural language understanding and generation it can provide more personalized and creative answers, generating responses that better match the user's needs and context. It can therefore satisfy diverse user demands and offer richer, more varied services.
Using a large language model as the voice assistant naturally benefits the terminal device, and the user experience is directly better. In this scheme, however, the terminal device acts merely as a forwarding relay for the user's dialogue, handing all conversation logic to the large language model to answer. The scheme has cost problems: large-model inference requires substantial computing resources, so each call to the large model is relatively expensive, and because the large model must communicate with a cloud server, additional network transmission cost is incurred. Moreover, once the number of users and the access volume grow sharply, the cost of calling the large model becomes unacceptable, especially for terminal device manufacturers.
How to reduce the cost of using a large model for voice interaction is therefore a technical problem that needs to be solved.
Disclosure of Invention
Embodiments of the invention aim to provide a question-answer interaction method and device, an electronic device, and a storage medium, so as to solve the technical problem of the high cost of large-model voice interaction.
In a first aspect, an embodiment of the present invention provides a question-answer interaction method, including:
acquiring question information input by an object;
analyzing the question information to generate a first prompt word for the question information;
determining a second prompt word according to the similarity between the question information and first knowledge stored in a first storage pool, wherein the first knowledge comprises historical question information previously input by the object and common information of the object;
and sending a question-answer request to a large language model deployed in the cloud according to the first prompt word, the second prompt word, and the question information, so as to obtain a first question-answer result returned by the large language model.
In a second aspect, an embodiment of the present invention provides a question-answer interaction device, including:
an acquisition module, used to acquire question information input by an object;
a generation module, used to analyze the question information and generate a first prompt word for the question information;
a first determination module, used to determine a second prompt word according to the similarity between the question information and first knowledge stored in a first storage pool, wherein the first knowledge comprises historical question information previously input by the object and common information of the object;
and a question-answering module, used to send a question-answer request to the large language model deployed in the cloud according to the first prompt word, the second prompt word, and the question information, so as to obtain a first question-answer result returned by the large language model.
In a third aspect, an embodiment of the present invention provides an electronic device, where the electronic device includes a processor, a memory, and a computer program stored in the memory and executable on the processor, and the processor implements the steps in the question-answer interaction method described in any one of the foregoing when executing the computer program.
In a fourth aspect, embodiments of the present invention provide a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the question-answer interaction method of any one of the above.
Embodiments of the invention provide a question-answer interaction method and device, an electronic device, and a storage medium. By using first knowledge in the first storage pool that is similar to the question information as a second prompt word, and sending the question-answer request to the large language model with the second prompt word and the first prompt word corresponding to the question information as the overall prompt, the length of the data the large language model must predict can be reduced, achieving the aim of saving cost.
Drawings
FIG. 1 is a schematic diagram of a related art architecture for accessing a large language model on a terminal device;
FIG. 2 is a schematic flow chart of a question-answer interaction method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a voice assistant according to an embodiment of the present invention;
FIG. 4 is a flowchart of a method for using a voice assistant according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of another question-answering interaction device according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention;
fig. 7 is a schematic diagram of another structure of an electronic device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "including" and variations thereof as used herein are intended to be open-ended, i.e., "including, but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Related definitions of other terms will be given in the description below.
In the related art, ChatGPT is based on a large language model (LLM, Large Language Model) and the GPT (Generative Pre-trained Transformer) architecture. An LLM is a language model that learns from large amounts of text data to predict the likelihood of the next word or token. GPT is a technique for realizing a large-scale language model; it trains the model through pre-training and fine-tuning. In the pre-training stage, GPT learns by itself from large amounts of unlabeled text data, acquiring the latent patterns and rules of language; in the fine-tuning stage, GPT is given correct dialogue templates from which to learn and form answers. This technical principle gives ChatGPT strong language understanding and generation capability, better meeting users' needs. Accordingly, connecting a large language model to the terminal device in place of a traditional template-based voice assistant can significantly improve the user experience.
Referring to fig. 1, fig. 1 is a schematic structural diagram of the related art in which a terminal device accesses a large language model. As shown in fig. 1, a user sends a query instruction to the terminal device, which may be text input or voice input. Taking voice input as an example: after receiving the user's voice input, the terminal device converts it into text, splices the prompt words, and sends a request to the large language model; after receiving the request, the large language model generates an answer; and after packaging the large language model's output, the terminal device broadcasts the answer to the user through voice.
To solve the above technical problems in the related art, an embodiment of the present invention provides a question-answer interaction method. Referring to fig. 2, fig. 2 is a schematic flow chart of the question-answer interaction method provided in the embodiment of the present invention; the method includes steps 201 to 204.
step 201, acquiring question information input by an object.
In this embodiment, the object may be a user, that is, a user of the terminal. The question information may be voice question information or text question information: it may be selected by the user from a plurality of preset question entries displayed by the terminal device, or it may be question information the user enters according to need. Voice question information may be voice data collected through a microphone on the terminal device, and text question information may be text data entered manually by the user.
Step 202, analyzing the question information to generate a first prompt word for the question information.
In this embodiment, in order to reduce the length of the data the large language model must predict and thereby save cost, the question information is analyzed to generate a first prompt word corresponding to it, so that when the question-answer request is sent to the large language model with this first prompt word, the computing resources the large language model requires can be reduced. Specifically, the step of analyzing the question information and generating the first prompt word may be: performing key information extraction on the question information to obtain key information; determining, from the question information, the requirement it expresses; and determining the first prompt word from the key information and the requirement information.
If the question information is voice data, it may first be converted into text data, after which keyword extraction is performed on the text data to obtain its keywords. A first prompt word corresponding to the text data is then determined from the user's requirement and the obtained keywords. For example, when the text data corresponding to the question information is "Small cloth, can you help me find the tourist attractions in city A?", keyword extraction yields the keywords "city A" and "tourist attractions". From the question information, the user's requirement can be determined as "obtain the locations of tourist attractions in city A". From the keywords and the requirement, a first prompt word such as "the locations of tourist attractions in city A are ______" can be generated. In this way, the first prompt word can effectively reduce the computing resources the large language model requires, saving cost for the terminal device.
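The keyword-extraction and prompt-assembly step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: a real system would use a proper NLP keyword extractor, whereas here keywords are simply the non-stopword tokens, and the prompt template is hypothetical.

```python
def build_first_prompt(question: str, stopwords: set) -> str:
    """Extract keywords from the question text and fold them into a
    compact first prompt word (illustrative template, not the patent's)."""
    tokens = question.replace("?", "").replace(",", "").split()
    keywords = [t for t in tokens if t.lower() not in stopwords]
    # Pair the extracted keywords with the inferred requirement.
    return "Answer concisely about: " + ", ".join(keywords)

prompt = build_first_prompt(
    "Can you help me find the tourist attractions in city A?",
    {"can", "you", "help", "me", "find", "the", "in"},
)
```

A shorter, keyword-focused prompt like this is what lets the large language model work with less context than the raw utterance.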
Step 203, determining a second prompt word according to the similarity between the question information and the first knowledge stored in the first storage pool.
The first storage pool provided in this embodiment may be any storage area in the terminal device. The first knowledge may include historical question information previously input by the user and the user's common information, i.e., information the user commonly uses on the terminal device, such as positioning information and weather information.
In some embodiments, the step of determining the second prompt word according to the similarity between the question information and the first knowledge stored in the first storage pool may be: calculating the similarity between the question information and the first knowledge stored in the first storage pool to obtain a first similarity value; and, when the first similarity value is greater than a first threshold, determining the first knowledge and the context data of the first knowledge as the second prompt word.
In this way, the first storage pool is searched to determine the similarity between the question information and the first knowledge, so that first knowledge related to the question information can be obtained and used as the second prompt word. This enriches the prompt subsequently input to the large language model, further reducing the computing resources the large language model requires and saving cost for the terminal device.
Specifically, the first threshold provided in this embodiment may be 0.5, or a value such as 0.6 or 0.7; it may be set according to actual application requirements and is not specifically limited herein.
If the question information is "Small cloth, can you help me check the tourist attractions in city A?", and the first knowledge in the first storage pool includes positioning information near the user, who is in city A, then the first knowledge with high similarity to the question information may be nearby positioning information such as "tourist attraction A" and "tourist attraction B". "Tourist attraction A" and "tourist attraction B" are then used as the second prompt word to enrich the prompt subsequently input to the large language model.
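The first-threshold decision above, including the empty-result case discussed later in this description, can be sketched as follows. Names are illustrative, and 0.5 is the example first-threshold value from this embodiment:

```python
def second_prompt_word(similarity: float, knowledge: str, context: str,
                       threshold: float = 0.5) -> str:
    """Return the first-pool knowledge plus its context data as the second
    prompt word when the first similarity value exceeds the threshold."""
    if similarity > threshold:
        return (knowledge + " " + context).strip()
    return ""  # no sufficiently similar knowledge: second prompt word is empty

hit = second_prompt_word(0.8, "tourist attraction A", "near the user")
miss = second_prompt_word(0.3, "weather report", "")
```

The empty-string branch corresponds to the case, handled later, where only the first prompt word is sent with the request.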
To improve the speed of retrieving the first knowledge stored in the first storage pool, the first storage pool provided in this embodiment may be a cache area in the terminal device. This speeds up retrieval and similarity judgment, effectively improving the efficiency of question-answer interaction and the user experience.
As another optional embodiment, to improve retrieval speed further, the step of calculating the similarity between the question information and the first knowledge to obtain the first similarity value may be: converting the question information into text data; vectorizing the text data to obtain vectorized data; and calculating the similarity between the vectorized data and the first knowledge stored in the first storage pool, whose data form is also a vector, to obtain the first similarity value.
By vectorizing the question information and then computing the similarity between the resulting vectorized data and the first knowledge, which is itself stored in vector form, the speed of the similarity calculation can be effectively improved, further improving the efficiency of question-answer interaction and the user experience.
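The description does not fix a particular similarity measure for the vectorized data; a common choice for comparing embedding vectors, shown here purely as an assumption, is cosine similarity:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors;
    returns 0.0 for a zero vector to avoid division by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

The resulting value lies in [-1, 1] for arbitrary vectors and in [0, 1] for the non-negative vectors typical of text embeddings, which is why fixed thresholds such as 0.5 or 0.7 are meaningful.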
Step 204, according to the first prompt word, the second prompt word and the question information, a question-answering request is sent to the large language model deployed in the cloud to obtain a first question-answering result returned by the large language model.
In the related art, the question information input by the user is sent directly to the large language model deployed in the cloud, so that the large language model acts as the voice assistant realizing the terminal device's question-answer interaction function. Each call to the large language model has a call cost; that is, every time a terminal device in the related art performs a question-answer interaction, a certain cost must be spent to realize the voice assistant function.
The charging model of a large language model is generally defined by the computing resources it requires or by the length of the data to be predicted: the more computing the terminal device's call consumes, or the longer the data the large language model must predict, the higher the cost the voice assistant incurs.
Accordingly, to reduce the terminal device's cost of using the voice assistant function, the first prompt word and second prompt word determined in the above embodiments are sent, together with the question information, as the question-answer request to the large language model deployed in the cloud. This provides the large language model with hints that reduce both the computing resources it requires and the length of the data it must predict, achieving the aim of reducing the cost of using the voice assistant function.
In some embodiments, first knowledge related to the question information may not be retrieved, that is, no first similarity value between the question information and the first knowledge in the first storage pool is greater than the first threshold. Therefore, after the step of calculating the similarity and obtaining the first similarity value, the question-answer interaction method provided in this embodiment may further include: when the first similarity value is not greater than the first threshold, determining that the second prompt word is empty.
When no first knowledge related to the question information is retrieved in the first storage pool, that is, the second prompt word is determined to be empty, the step of sending the question-answer request to the large language model deployed in the cloud may be: sending the question-answer request according to the first prompt word alone.
In this case the first prompt word corresponding to the question information is still provided to the large language model, so that, compared with the related-art scheme of sending the question-answer request directly, the computing resources the large language model requires can still be effectively reduced, lowering the terminal device's cost of using the voice assistant function.
In some embodiments, the charging model of the large language model may be defined by the number of calls; that is, the more times the terminal device calls the large language model, the higher the cost of the voice assistant. In this case, to reduce the cost of using the voice assistant function, before the step of analyzing the question information and generating the first prompt word, the question-answer interaction method provided in this embodiment may further include: calculating the similarity between the question information and second knowledge stored in a second storage pool to obtain a second similarity value; and, when the second similarity value is greater than a second threshold, using the second knowledge as a second question-answer result corresponding to the question information and returning it to the object.
The second storage pool provided in this embodiment may be any storage area in the terminal device other than the area storing the first knowledge. The second knowledge may include historical question-answer results returned by the large language model during past question-answer interactions. The second threshold may be a value such as 0.7, 0.8, or 0.85; it may be set according to actual application requirements and is not specifically limited herein.
In this way, the second storage pool is searched before any question-answer request is sent to the large language model. When second knowledge highly similar to the question information is found, it can be returned to the user directly as the question-answer result, so no request need be sent to the large language model at all. This avoids the call cost of invoking the large language model, reduces the number of calls, and lowers the terminal device's cost of using the voice assistant function.
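The second-pool lookup, and the later write-back of new results, can be sketched as a small answer cache. Everything here is an illustrative assumption: the patent computes similarity over vectorized text, whereas this stand-in uses token-overlap (Jaccard) similarity, and 0.7 is the example second-threshold value.

```python
class AnswerCache:
    """Second storage pool: cached question-answer results keyed by the
    question text (illustrative stand-in for the patented structure)."""

    def __init__(self, threshold: float = 0.7):
        self.threshold = threshold
        self.entries = {}  # question text -> cached answer

    @staticmethod
    def _similarity(q1: str, q2: str) -> float:
        a, b = set(q1.lower().split()), set(q2.lower().split())
        return len(a & b) / len(a | b) if a | b else 0.0

    def lookup(self, question: str):
        """Return the best cached answer if its similarity exceeds the
        second threshold; otherwise None (i.e., the LLM must be called)."""
        best_answer, best_sim = None, 0.0
        for cached_q, answer in self.entries.items():
            sim = self._similarity(question, cached_q)
            if sim > best_sim:
                best_answer, best_sim = answer, sim
        return best_answer if best_sim > self.threshold else None

    def store(self, question: str, answer: str):
        # Every answer returned by the large language model is written back,
        # so later similar queries can be served locally without an LLM call.
        self.entries[question] = answer

cache = AnswerCache()
cache.store("tourist attractions in city A", "Attraction A and attraction B")
hit = cache.lookup("tourist attractions in city A")
miss = cache.lookup("what is the weather tomorrow")
```

A `None` from `lookup` corresponds to the miss path in the description, which falls through to step 202.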
To improve the speed of retrieving the second knowledge stored in the second storage pool, the second storage pool provided in this embodiment may likewise be a cache area in the terminal device, speeding up retrieval and similarity judgment and effectively improving the efficiency of question-answer interaction and the user experience.
Furthermore, to improve retrieval speed further still, this embodiment may also vectorize the second knowledge in the second storage pool and use the vectorized question information to search it. This effectively improves the speed of the similarity calculation between the second knowledge and the question information, further improving the efficiency of question-answer interaction and the user experience.
In some embodiments, if the second similarity value is not greater than the second threshold, that is, if no second knowledge related to the question information is retrieved in the second storage pool, the step of analyzing the question information and generating the first prompt word is performed, namely step 202 of the foregoing embodiment.
As an optional embodiment, in order to further reduce the number of subsequent requests for the large language model and reduce the network transmission cost and the call cost of the large language model, after the step of sending the question-answer request to the large language model deployed in the cloud according to the first prompt word, the second prompt word and the question information, so as to obtain the first question-answer result returned by the large language model, the question-answer interaction method provided in this embodiment may further include: the first question and answer result is stored in a second storage pool.
Storing every question-answer result returned by the large language model in the second storage pool of the terminal device allows more results to be provided locally when the user later uses the voice assistant function. This reduces the number of subsequent requests to the large language model as far as possible, lowers the network transmission cost and the call cost of the large language model, and reduces the cost of using the voice assistant function on the terminal device thereafter.
In this way, with the question-answer interaction method provided by this embodiment, a first storage pool storing the first knowledge and a second storage pool storing the second knowledge are set up in the terminal device. Before a question-answer request is sent to the large language model, the second knowledge in the second storage pool is searched first to determine whether a question-answer result related to the question information already exists. If it does, no request needs to be sent to the large language model, which reduces the number of requests as well as the network transmission cost and the call cost. If no related second knowledge is retrieved in the second storage pool, the question information is analyzed and the first knowledge in the first storage pool is retrieved to obtain at least one prompt word (the first prompt word alone, or the first and second prompt words together); a question-answer request is then made to the large language model according to the at least one prompt word and the question information. This reduces both the computing resources and the data length the large language model needs for prediction, thereby reducing the cost of calling it.
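The two-pool flow described above can be sketched roughly as follows; the threshold values, dictionary fields, and helper signatures here are illustrative assumptions, not part of the claimed method:

```python
SECOND_THRESHOLD = 0.8  # example value given later in the description
FIRST_THRESHOLD = 0.6   # assumed value for illustration

def answer(question, second_pool, first_pool, call_llm, similarity):
    # 1) Try the cached answers (second storage pool) first.
    if second_pool:
        best = max(second_pool, key=lambda k: similarity(question, k["q"]))
        if similarity(question, best["q"]) > SECOND_THRESHOLD:
            return best["a"]  # cache hit: no LLM request at all
    # 2) Cache miss: build the first prompt word from the question itself.
    prompt = [f"task: answer the user question: {question}"]
    # 3) Retrieve local knowledge (first storage pool) for a second prompt word.
    for k in first_pool:
        if similarity(question, k) > FIRST_THRESHOLD:
            prompt.append(f"reference: {k}")
    # 4) Only now is the large language model requested.
    result = call_llm("\n".join(prompt))
    second_pool.append({"q": question, "a": result})  # update the cache
    return result
```

Repeating a question then short-circuits at step 1 and never reaches the model, which is exactly where the request-count savings come from.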
In order to better illustrate the functions of the voice assistant provided by the embodiment of the present invention, please refer to fig. 3, fig. 3 is a schematic structural diagram of the voice assistant provided by the embodiment of the present invention. As shown in fig. 3, the voice assistant provided in this embodiment is composed of a terminal control center, a network layer, a prompt engineering layer, and a storage layer.
The terminal control center is responsible for overall control and corresponds to the processor chip in the terminal device. It controls the logic flow of the entire voice assistant, sends control instructions to the network layer, the prompt engineering layer and the storage layer, and manages the state of the voice assistant. The terminal control center is also responsible for processing the question voice input by the user and converting it into text. It can give feedback according to the result of each flow state, instruct the prompt engineering layer to splice prompt words, and instruct the network layer to send a request to the large language model in the cloud. Meanwhile, the terminal control center determines, according to the knowledge stored in the storage layer, whether a request needs to be sent to the cloud, packages the answer returned by the cloud, and finally broadcasts the answer to the user by voice.
The network layer refers to a logical structure part that is not deployed on the terminal device, including a large language model and a large language model proxy. Specifically, the large language model is responsible for processing a question-answer request from the terminal equipment, and generating an answer which better meets the requirements of the user according to the prompt words provided by the terminal equipment. The large language model part provided in this embodiment may include a plurality of large language models, so as to adapt to different requirements of users or terminal equipment manufacturers. The large language model agent is responsible for communication between the terminal equipment and the large language model, forwarding the question-answer request of the terminal equipment to the large language model, and returning the question-answer result of the large language model to the terminal equipment. The large language model agent is also responsible for carrying out interface adaptation on different large language models, and carrying out unified encapsulation on the interfaces of the large language models, so that unified simple abstract interfaces are exposed to the outside. Therefore, communication between the terminal equipment and the cloud server can be realized, and richer and diversified services are provided for the user.
The prompt engineering layer consists of a knowledge integration module and an interaction agent module. The knowledge integration module is responsible, first, for analyzing the user's question information to extract key information and provide more personalized and creative prompt word splicing according to the user's needs. It is also responsible for processing the cached knowledge from the storage layer, splicing prompt words from the relevant, valid parts of that knowledge, and synthesizing the prompt words for the large model.
Specifically, the prompt word template used by the knowledge integration module provided in this embodiment may be as follows:
1) Task statement part: describes the role the large language model needs to play in the task, i.e. the task target of the large model in the task;
2) User demand part: part of the content describes the user's requirements and expectations for the task result, and the rest is filled with the user's question information;
3) Knowledge reference part: describes background knowledge and reference information related to the task, specifically the knowledge of the storage layer in the terminal device, which is preprocessed by the knowledge integration module into a textual description, i.e. prompt words;
4) Task planning part: describes how to plan the task based on user needs and background knowledge. This part is generated entirely by the large language model and is optional: when relevant knowledge is hit in the storage layer, it does not need to be filled in and can be omitted, which saves cost;
5) Task planning result part: filled with the result of task planning. This part is also generated entirely by the large language model and, when local knowledge in the storage layer is hit, can likewise be omitted to reduce cost;
6) Return evaluation part: describes how to let the large language model evaluate whether the current answer is satisfactory, such as requirements for consistency with common sense and for answer compliance;
7) Return format requirement part: describes the textual format requirements of the returned result, such as the original answer, alternative answers, answer reasoning, and the like.
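The seven-part template above might be spliced as in the following sketch. The `splice_prompt` helper and the literal section wording are assumptions for illustration; parts 4) and 5) are left out, as the description permits when local knowledge hits:

```python
def splice_prompt(task, question, knowledge=None, expectations=""):
    # Parts 4) and 5) (task planning and its result) are produced by the
    # large language model itself and are omitted when local knowledge hits.
    parts = [
        f"Task statement: {task}",                             # part 1)
        f"User demand: {expectations}\nQuestion: {question}",  # part 2)
    ]
    if knowledge:  # part 3): knowledge reference from the storage layer
        parts.append("Knowledge reference: " + "; ".join(knowledge))
    parts += [
        "Return evaluation: the answer must be consistent with common sense.",  # part 6)
        "Return format: original answer, alternative answers, reasoning.",      # part 7)
    ]
    return "\n".join(parts)

prompt = splice_prompt("You are a voice assistant.", "How do I save battery?",
                       knowledge=["phone model: X"], expectations="a short answer")
```

Omitting the `knowledge` argument yields a template without part 3), mirroring the case where nothing is hit in the storage layer.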
The interaction agent module is responsible for the communication between the terminal device and the large model agent of the network layer, sending the prompt words and the question information to the large model agent through the set interface, obtaining the returned result of the large model agent, and packaging the returned result, for example, adding fixed prefix text or suffix text information to the returned result so as to improve the use experience of the user.
The storage layer corresponds to the physical storage device on the terminal device and is divided into four parts: a local knowledge agent, a local knowledge store (i.e. the first storage pool provided by this embodiment), a cache agent, and a cache store (i.e. the second storage pool provided by this embodiment). The local knowledge agent manages the data in the local knowledge store and provides fast access to local knowledge. The local knowledge store holds the local knowledge on the terminal device, including the user's historical input, common information, and so on. The cache agent manages the data in the cache store and provides fast access to cached knowledge. The cache store holds the cached knowledge on the terminal device, including historical answers from the large language model, common information, and so on. In addition, persistent storage is responsible for persisting the cached data to disk, so that it can still be accessed quickly after the terminal device restarts. In this way, fast access to local and cached knowledge on the terminal device is achieved, and richer and more diverse services can be provided to users.
The structures that make up the voice assistant have been described in detail above; the use of the voice assistant is described in detail below.
Referring to fig. 4, fig. 4 is a flow chart of a method of using the voice assistant provided by the embodiment of the present invention. As shown in fig. 4, compared with the technical scheme of using a voice assistant in the related art, the user input in this embodiment is not forwarded directly to the large language model with the terminal device acting merely as a relay; instead it goes through a series of caching steps, which reduces the usage cost of the large language model.
Specifically, when the user inputs question information, the terminal device first converts it into text format and then performs vectorization. Vectorization here refers to converting text into a vector representation using a vectorization model, also called an Embedding model, whose principle is to compress and reduce the dimensionality of data by mapping high-dimensional discrete data into a low-dimensional continuous space. The Embedding model can be trained by unsupervised or supervised learning to obtain a better data representation. For example, in natural language processing, the commonly used Word2Vec and BERT models are implemented on the basis of embedding technology. They can convert text data into vector representations, facilitating processing and analysis by a computer. Vectorized text has the property that the larger the vector similarity, the stronger the correlation between the original texts, which makes similarity judgment of texts convenient.
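The property that larger vector similarity indicates stronger correlation can be illustrated with a toy bag-of-words embedding and cosine similarity; a real system would use a trained model such as Word2Vec or BERT, as noted above:

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; missing words simply count as zero.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

s1 = cosine(embed("turn on the lamp"), embed("turn on the light"))  # 0.75
s2 = cosine(embed("turn on the lamp"), embed("play some music"))    # 0.0
print(s1 > s2)  # True: the more related pair scores higher
```

The related pair shares three of four words and scores 0.75; the unrelated pair shares none and scores 0.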
With continued reference to fig. 4, after the text is vectorized, similarity processing is first performed by the history cache management, that is, the second knowledge in the second storage pool is searched to calculate a second similarity value between the vectorized text and that knowledge. The second knowledge in the second storage pool is likewise stored in vectorized form; in other words, the second storage pool is a vector database, whose principle is to convert text into vectors and store the vectors in the database. When a user inputs question information, it is converted into a vector, the most similar vectors and their contexts are searched for in the vector database, and finally the most similar text is returned to the user. This enables fast retrieval and similarity judgment of text data. A hit in the history cache management means that the similarity obtained from the vector inner product of the question information and the second knowledge is greater than the set second threshold, for example 0.8; in that case the cached result in the second storage pool is returned directly to the user. If there is no hit, the flow proceeds to the next step.
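The hit test against the second threshold might look like the following minimal sketch; the `lookup_cache` helper, the two-dimensional toy vectors, and the assumption that stored vectors are already normalized are all illustrative:

```python
def lookup_cache(qvec, vector_db, threshold=0.8):
    # vector_db: list of (unit vector, cached answer); vectors are assumed
    # already normalized, so the inner product is the similarity value.
    best_sim, best_answer = 0.0, None
    for vec, answer in vector_db:
        sim = sum(a * b for a, b in zip(qvec, vec))
        if sim > best_sim:
            best_sim, best_answer = sim, answer
    # A "hit" means the best similarity exceeds the second threshold (e.g. 0.8).
    return best_answer if best_sim > threshold else None

db = [([1.0, 0.0], "cached answer A"), ([0.0, 1.0], "cached answer B")]
print(lookup_cache([0.9, 0.436], db))  # similarity 0.9 > 0.8, a hit
print(lookup_cache([0.6, 0.8], db))    # best similarity 0.8, not > 0.8: None
```

Returning `None` on a miss is what sends the flow on to the local knowledge step below.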
A question information vector that does not hit the second knowledge in the second storage pool continues downward, and the local knowledge cache management performs a similarity calculation, that is, it searches the first knowledge in the first storage pool to calculate a first similarity value between the vectorized text and that knowledge. The first storage pool contains some users' historical input and common information; although this information cannot be used directly as an answer, it can be filled into the first prompt word, which reduces the amount of text sent to the large model and thus part of the cost. The first prompt word here is obtained by the local knowledge cache management analyzing the question information. If there is no hit, the next step sends a request to the large language model in the cloud using only the first prompt word obtained earlier. If there is a hit, the knowledge is integrated after the first prompt word and the request is made to the large model.
After the large language model receives the request, the question and answer result is returned to the user, and the current question and answer result is updated to the second storage pool so that quick retrieval and similarity judgment can be realized in the future. Therefore, the use cost of the large language model can be reduced, richer and diversified services can be provided for users, and meanwhile, the real economic benefit can be brought to terminal equipment manufacturers.
In summary, this embodiment provides a question-answer interaction method that includes: obtaining question information input by an object; analyzing the question information and generating a first prompt word of the question information; determining a second prompt word according to the similarity between the question information and first knowledge stored in a first storage pool, where the first knowledge includes historical question information previously input by the object and common information of the object; and sending a question-answer request to a large language model deployed in the cloud according to the first prompt word, the second prompt word and the question information, so as to obtain a first question-answer result returned by the large language model. With the embodiment of the present invention, the first knowledge similar to the question information in the first storage pool is used as the second prompt word, and the second prompt word together with the first prompt word corresponding to the question information is used as the overall prompt word for the question-answer request, so that the data length the large language model has to predict can be reduced, achieving the aim of saving cost.
According to the method described in the above embodiments, the present embodiment will be further described from the perspective of a question-answer interaction device, which may be implemented as a separate entity, or may be implemented as an integrated electronic device, such as a terminal, where the terminal may include a mobile phone, a tablet computer, and so on.
In some embodiments, the present embodiment provides a question-answer interaction device. Specifically, referring to fig. 5, fig. 5 is a schematic structural diagram of a question-answer interaction device provided by an embodiment of the present invention. As shown in fig. 5, a question-answer interaction device 500 provided by an embodiment of the present invention includes: an acquisition module 501, a generation module 502, a first determination module 503, and a question-answering module 504;
the obtaining module 501 is configured to obtain question information input by an object.
The generating module 502 is configured to analyze the question information and generate a first prompt word of the question information.
The first determining module 503 is configured to determine the second prompt word according to the similarity between the question information and first knowledge stored in the first storage pool, where the first knowledge includes historical question information previously input by the object, and common information of the object.
And the question and answer module 504 is configured to send a question and answer request to the large language model deployed in the cloud according to the first prompt word, the second prompt word and the question information, so as to obtain a first question and answer result returned by the large language model.
In some embodiments, the step of analyzing the question information to generate a first prompt for the question information includes:
extracting key information from the question information;
determining requirement information of the question information according to the question information;
and determining the first prompt word of the question information according to the key information and the requirement information.
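The three steps above can be sketched as follows; the stopword list, the requirement cue table, and the exact output format are assumptions for illustration only:

```python
import re

# Illustrative stopword and requirement-cue tables, not part of the patent.
STOPWORDS = {"please", "the", "a", "me", "i", "to", "for"}
REQUIREMENT_CUES = {"briefly": "a brief answer", "in detail": "a detailed answer"}

def first_prompt_word(question):
    # Step 1: extract key information (here: the content words).
    words = re.findall(r"[a-z']+", question.lower())
    keys = [w for w in words if w not in STOPWORDS]
    # Step 2: determine the requirement information from cue phrases.
    requirement = next((v for k, v in REQUIREMENT_CUES.items()
                        if k in question.lower()), "a direct answer")
    # Step 3: combine both into the first prompt word.
    return f"keywords: {' '.join(keys)}; requirement: {requirement}"

print(first_prompt_word("Please explain briefly how to save battery"))
```

A real implementation would use a proper keyword extractor and intent classifier; the structure of the three steps is the point here.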
In some embodiments, the first determining module 503 provided in this embodiment is specifically configured to: calculating the similarity of the questioning information and the first knowledge stored in the first storage pool to obtain a first similarity value; and under the condition that the first similarity value is larger than a first threshold value, determining the first knowledge and the context data of the first knowledge as second prompt words.
In some embodiments, the first determining module 503 provided in this embodiment is specifically further configured to: converting the question information into text data; vectorizing the text data to obtain vectorized data; and calculating the similarity of the vectorized data and the first knowledge stored in the first storage pool to obtain a first similarity value, wherein the data form of the first knowledge is in a vector form.
In some embodiments, the question-answer interaction device provided in this embodiment further includes: a second determination module;
the second determining module is configured to determine that the second prompt word is null when the first similarity value is not greater than a first threshold value.
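Taken together, the first determining module 503 and the second determining module behave roughly like this sketch, where the threshold value, the entry fields, and the inner-product similarity are illustrative assumptions:

```python
FIRST_THRESHOLD = 0.7  # illustrative; the embodiment only names "a first threshold"

def determine_second_prompt(question_vec, first_pool, similarity):
    # first_pool entries hold a vector plus the knowledge text and its context data.
    for entry in first_pool:
        if similarity(question_vec, entry["vec"]) > FIRST_THRESHOLD:
            # Hit: first knowledge plus its context data become the second prompt word.
            return f"{entry['text']} ({entry['context']})"
    return None  # not greater than the first threshold: second prompt word is null

inner = lambda a, b: sum(x * y for x, y in zip(a, b))
pool = [{"vec": [1.0, 0.0], "text": "user commutes at 8 am",
         "context": "weekday routine"}]
```

A `None` return corresponds to the "second prompt word is null" case, after which the request carries the first prompt word alone.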
In this case, the question and answer module 504 is specifically configured to: send a question-answer request to the large language model deployed in the cloud according to the first prompt word alone.
In some embodiments, the question-answer interaction device provided in this embodiment further includes: a calculation module and a return module;
the computing module is used for computing the similarity between the question information and second knowledge stored in the second storage pool to obtain a second similarity value, wherein the second knowledge comprises a historical question-answer result returned by the large language model;
and the return module is used for taking the second knowledge as a second question and answer result corresponding to the question information and returning the second question and answer result to the object under the condition that the second similarity value is larger than a second threshold value.
In the case that the second similarity value is not greater than the second threshold, the generating module 502 provided in this embodiment is invoked.
In some embodiments, the question-answer interaction device provided in this embodiment further includes: a storage module;
and the storage module is used for storing the first question and answer result in the second storage pool.
In the implementation, each module and/or unit may be implemented as an independent entity, or may be combined arbitrarily and implemented as the same entity or a plurality of entities, where the implementation of each module and/or unit may refer to the foregoing method embodiment, and the specific beneficial effects that may be achieved may refer to the beneficial effects in the foregoing method embodiment, which are not described herein again.
In addition, referring to fig. 6, fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, where the electronic device may be a mobile terminal, such as a smart phone, a tablet computer, or the like. As shown in fig. 6, the electronic device 600 includes a processor 601, a memory 602. The processor 601 is electrically connected to the memory 602.
The processor 601 is a control center of the electronic device 600, connects various parts of the entire electronic device using various interfaces and lines, and performs various functions of the electronic device 600 and processes data by running or loading application programs stored in the memory 602, and calling data stored in the memory 602, thereby performing overall monitoring of the electronic device 600.
In this embodiment, the processor 601 in the electronic device 600 loads instructions corresponding to the processes of one or more application programs into the memory 602 according to the following steps, and the processor 601 executes the application programs stored in the memory 602, so as to implement any step of the question-answer interaction method provided in the foregoing embodiment.
The electronic device 600 may implement the steps in any embodiment of the question-answer interaction method provided by the embodiment of the present invention, so that the beneficial effects that any one of the question-answer interaction method provided by the embodiment of the present invention can implement are described in detail in the previous embodiments, and are not repeated herein.
Referring to fig. 7, fig. 7 is another schematic structural diagram of an electronic device provided in the embodiment of the present invention, and fig. 7 is a specific structural block diagram of the electronic device provided in the embodiment of the present invention, where the electronic device may be used to implement the question-answer interaction method provided in the embodiment. The electronic device 700 may be a mobile terminal such as a smart phone or a notebook computer.
The RF circuit 710 is configured to receive and transmit electromagnetic waves, and to perform mutual conversion between electromagnetic waves and electrical signals, thereby communicating with a communication network or other devices. The RF circuit 710 may include various existing circuit elements for performing these functions, such as an antenna, a radio frequency transceiver, a digital signal processor, an encryption/decryption chip, a Subscriber Identity Module (SIM) card, memory, and so forth. The RF circuit 710 may communicate with various networks such as the internet, intranets, and wireless networks, or with other devices via a wireless network. The wireless network may include a cellular telephone network, a wireless local area network, or a metropolitan area network. The wireless network may use various communication standards, protocols, and technologies, including but not limited to the Global System for Mobile Communication (GSM), Enhanced Data GSM Environment (EDGE), Wideband Code Division Multiple Access (WCDMA), Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Wireless Fidelity (Wi-Fi) (e.g., the IEEE 802.11a, IEEE 802.11b, IEEE 802.11g and/or IEEE 802.11n standards), Voice over Internet Protocol (VoIP), Worldwide Interoperability for Microwave Access (Wi-MAX), other protocols for mail, instant messaging, and short messaging, as well as any other suitable communication protocols, including those not yet developed.
The memory 720 may be used to store software programs and modules, such as program instructions/modules corresponding to the question-answer interaction method in the above embodiments, and the processor 780 executes the software programs and modules stored in the memory 720 to perform various functional applications and the question-answer interaction method.
Memory 720 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, memory 720 may further include memory located remotely from processor 780, which may be connected to electronic device 700 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input unit 730 may be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control. In particular, the input unit 730 may include a touch-sensitive surface 731 and other input devices 732. The touch-sensitive surface 731, also referred to as a touch display screen or touch pad, may collect touch operations by the user on or near it (e.g., operations performed on or near the touch-sensitive surface 731 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connection device according to a preset program. Optionally, the touch-sensitive surface 731 may comprise two parts: a touch detection device and a touch controller. The touch detection device detects the position of the user's touch and the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, and sends them to the processor 780; it can also receive commands from the processor 780 and execute them. In addition, the touch-sensitive surface 731 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch-sensitive surface 731, the input unit 730 may also include other input devices 732, which may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, a joystick, and the like.
The display unit 740 may be used to display information entered by or provided to the user, as well as the various graphical user interfaces of the electronic device 700, which may be composed of graphics, text, icons, video, and any combination thereof. The display unit 740 may include a display panel 741, which may optionally be configured in the form of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode), or the like. Further, the touch-sensitive surface 731 may overlay the display panel 741; when the touch-sensitive surface 731 detects a touch operation on or near it, the operation is passed to the processor 780 to determine the type of touch event, and the processor 780 then provides a corresponding visual output on the display panel 741 based on that type. Although in the figures the touch-sensitive surface 731 and the display panel 741 are implemented as two separate components, in some embodiments they may be integrated to implement the input and output functions.
The electronic device 700 may also include at least one sensor 750, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor, which may adjust the brightness of the display panel 741 according to the brightness of ambient light, and a proximity sensor, which may generate an interrupt when the flip cover is closed or an object comes near. As one kind of motion sensor, the gravity acceleration sensor can detect acceleration in all directions (generally three axes), and can detect the magnitude and direction of gravity when the device is stationary; it can be used for applications that recognize the posture of the mobile phone (such as switching between landscape and portrait, related games, and magnetometer posture calibration) and for vibration-recognition functions (such as a pedometer and tapping). Other sensors that may also be configured in the electronic device 700, such as a gyroscope, barometer, hygrometer, thermometer, and infrared sensor, are not described in detail here.
The audio circuit 760, speaker 761, and microphone 762 may provide an audio interface between the user and the electronic device 700. The audio circuit 760 may transmit the electrical signal converted from received audio data to the speaker 761, which converts it into a sound signal for output; conversely, the microphone 762 converts collected sound signals into electrical signals, which are received by the audio circuit 760, converted into audio data, and output to the processor 780 for processing, after which they may be sent to, for example, another terminal via the RF circuit 710, or output to the memory 720 for further processing. The audio circuit 760 may also include an earphone jack to provide communication between peripheral earphones and the electronic device 700.
The electronic device 700 may facilitate user reception of requests, transmission of information, etc. via a transmission module 770 (e.g., wi-Fi module), which provides wireless broadband internet access to the user. Although the transmission module 770 is shown in the drawings, it is understood that it does not belong to the essential constitution of the electronic device 700, and can be omitted entirely as required within the scope not changing the essence of the invention.
The processor 780 is a control center of the electronic device 700, connects various parts of the entire handset using various interfaces and lines, and performs various functions of the electronic device 700 and processes data by running or executing software programs and/or modules stored in the memory 720 and invoking data stored in the memory 720, thereby performing overall monitoring of the electronic device. Optionally, the processor 780 may include one or more processing cores; in some embodiments, the processor 780 may integrate an application processor that primarily processes operating systems, user interfaces, applications, and the like, with a modem processor that primarily processes wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 780.
The electronic device 700 also includes a power supply 790 (e.g., a battery) that provides power to the various components, and in some embodiments, may be logically coupled to the processor 780 through a power management system to perform functions such as managing charging, discharging, and power consumption by the power management system. Power supply 790 may also include one or more of any components, such as a dc or ac power supply, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.
Although not shown, the electronic device 700 further includes a camera (e.g., front camera, rear camera), a bluetooth module, etc., which will not be described in detail herein. In particular, in this embodiment, the display unit of the electronic device is a touch screen display, and the mobile terminal further includes a memory, and one or more programs, where the one or more programs are stored in the memory, and configured to be executed by the one or more processors to implement any step of the question-answer interaction method provided in the foregoing embodiment.
In implementation, each of the above modules may be implemented as an independent entity, or may be combined arbitrarily and implemented as the same entity or several entities. For the specific implementation of each module, reference may be made to the foregoing method embodiments, which are not described herein again.
Those of ordinary skill in the art will appreciate that all or a portion of the steps of the various methods in the above embodiments may be completed by instructions, or by instructions controlling associated hardware, where the instructions may be stored in a computer-readable storage medium and loaded and executed by a processor. To this end, an embodiment of the present invention provides a storage medium storing a plurality of instructions which, when executed by a processor, implement any of the steps of the question-answer interaction method provided in the above embodiments.
The storage medium may include: read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and the like.
Since the instructions stored in the storage medium can execute the steps in any embodiment of the question-answer interaction method provided by the embodiments of the present invention, they can achieve the beneficial effects of any of the question-answer interaction methods provided by those embodiments; for details, refer to the previous embodiments, which are not repeated here.
The foregoing describes in detail the question-answer interaction method, apparatus, electronic device, and storage medium provided in the embodiments of the present application. Specific examples are used herein to illustrate the principles and implementations of the present application, and the above description of the embodiments is only intended to help understand the method and core idea of the present application. Meanwhile, those skilled in the art may make changes to the specific embodiments and application scope in light of the ideas of the present application, and the contents of this specification should not be construed as limiting the present application. Moreover, it will be apparent to those skilled in the art that various modifications and variations can be made without departing from the principles of the present invention, and such modifications and variations are also considered to be within the scope of the invention.
Claims (10)
1. A question-answer interaction method, characterized by comprising the following steps:
acquiring question information input by an object;
analyzing the question information to generate a first prompt word of the question information;
determining a second prompt word according to the similarity between the question information and first knowledge stored in a first storage pool, wherein the first knowledge comprises historical question information previously input by the object and common information of the object;
and sending a question-answer request to a large language model deployed in the cloud according to the first prompt word, the second prompt word, and the question information, so as to obtain a first question-answer result returned by the large language model.
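The flow of claim 1 can be illustrated with a minimal sketch. All helper names (`analyze`, `best_match`, `call_llm`), the word-overlap similarity, and the threshold value are hypothetical stand-ins for illustration, not details taken from the patent:

```python
def analyze(question: str) -> str:
    # Placeholder analysis: treat the longest words as the key
    # information forming the "first prompt word".
    keywords = sorted(question.lower().split(), key=len, reverse=True)[:3]
    return " ".join(keywords)

def best_match(question: str, pool: list) -> tuple:
    # Crude word-overlap (Jaccard) similarity standing in for the
    # vector similarity described in the later claims.
    q = set(question.lower().split())
    best, score = None, 0.0
    for item in pool:
        k = set(item.lower().split())
        s = len(q & k) / max(len(q | k), 1)
        if s > score:
            best, score = item, s
    return best, score

def answer(question: str, first_pool: list, call_llm, threshold: float = 0.3) -> str:
    first_prompt = analyze(question)                 # first prompt word
    match, sim = best_match(question, first_pool)    # first-knowledge lookup
    # Second prompt word only when similarity clears the threshold.
    second_prompt = match if (match is not None and sim > threshold) else ""
    # Assemble the request from both prompt words plus the raw question.
    return call_llm(first_prompt + "\n" + second_prompt + "\n" + question)
```

In use, `call_llm` would wrap the request to the cloud-deployed large language model; here any callable taking the assembled prompt suffices.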
2. The method of claim 1, wherein the step of analyzing the question information to generate a first prompt word of the question information comprises:
performing key information extraction on the question information to obtain key information;
determining requirement information of the question information according to the question information;
and determining the first prompt word of the question information according to the key information and the requirement information.
3. The method of claim 1, wherein the step of determining a second prompt word according to the similarity between the question information and the first knowledge stored in the first storage pool comprises:
calculating the similarity between the question information and the first knowledge stored in the first storage pool to obtain a first similarity value;
and in the case that the first similarity value is greater than a first threshold, determining the first knowledge and context data of the first knowledge as the second prompt word.
4. The method of claim 3, wherein the step of calculating the similarity between the question information and the first knowledge stored in the first storage pool to obtain a first similarity value comprises:
converting the question information into text data;
vectorizing the text data to obtain vectorized data;
and calculating the similarity between the vectorized data and the first knowledge stored in the first storage pool to obtain the first similarity value, wherein the first knowledge is stored in vector form.
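One way to realize the vectorized similarity of claim 4, sketched with a bag-of-words vector and cosine similarity. A real system would use a text embedding model; the sparse `Counter` vectors and helper names here are illustrative assumptions, not the patent's implementation:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Vectorize text as sparse term counts (a stand-in for an
    # embedding model producing dense vectors).
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def first_similarity(question: str, knowledge_vectors: dict) -> tuple:
    # knowledge_vectors maps knowledge text to its precomputed vector,
    # mirroring claim 4's first knowledge stored in vector form.
    qv = embed(question)
    best = max(knowledge_vectors.items(), key=lambda kv: cosine(qv, kv[1]))
    return best[0], cosine(qv, best[1])
```

The returned pair (best-matching knowledge, first similarity value) is what claim 3 then compares against the first threshold.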
5. The method of claim 3, wherein after the step of calculating the similarity between the question information and the first knowledge stored in the first storage pool to obtain a first similarity value, the method further comprises:
determining that the second prompt word is empty in the case that the first similarity value is not greater than the first threshold;
wherein the step of sending a question-answer request to the large language model deployed in the cloud according to the first prompt word and the second prompt word comprises:
sending the question-answer request to the large language model deployed in the cloud according to the first prompt word.
6. The method of claim 1, wherein before the step of analyzing the question information to generate a first prompt word of the question information, the method further comprises:
calculating the similarity between the question information and second knowledge stored in a second storage pool to obtain a second similarity value, wherein the second knowledge comprises a historical question-answer result returned by the large language model;
in the case that the second similarity value is greater than a second threshold, taking the second knowledge as a second question-answer result corresponding to the question information, and returning the second question-answer result to the object;
and in the case that the second similarity value is not greater than the second threshold, executing the step of analyzing the question information to generate the first prompt word of the question information.
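The answer cache of claim 6 can be sketched as a lookup that reuses a stored historical question-answer result when the new question is similar enough, and otherwise lets the caller fall through to the prompt-building path. The `SequenceMatcher` similarity measure and the threshold value are illustrative assumptions, not details from the patent:

```python
from difflib import SequenceMatcher

def lookup_cached_answer(question: str, second_pool: dict, second_threshold: float = 0.8):
    # second_pool maps historical questions to the answers the large
    # language model returned for them (the "second knowledge").
    best_q, best_sim = None, 0.0
    for past_q in second_pool:
        sim = SequenceMatcher(None, question.lower(), past_q.lower()).ratio()
        if sim > best_sim:
            best_q, best_sim = past_q, sim
    if best_sim > second_threshold:
        return second_pool[best_q]   # second question-answer result, returned directly
    return None                      # caller proceeds to build prompt words
```

This design saves a round trip to the cloud model for near-duplicate questions; claim 7's step of storing each new first question-answer result back into the pool keeps the cache growing.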
7. The method of claim 6, wherein after the step of sending a question-answer request to the large language model deployed in the cloud according to the first prompt word, the second prompt word, and the question information to obtain a first question-answer result returned by the large language model, the method further comprises:
storing the first question-answer result in the second storage pool.
8. A question-answer interaction apparatus, characterized by comprising:
an acquisition module, configured to acquire question information input by an object;
a generation module, configured to analyze the question information and generate a first prompt word of the question information;
a first determination module, configured to determine a second prompt word according to the similarity between the question information and first knowledge stored in a first storage pool, wherein the first knowledge comprises historical question information previously input by the object and common information of the object;
and a question-answer module, configured to send a question-answer request to a large language model deployed in the cloud according to the first prompt word, the second prompt word, and the question information, so as to obtain a first question-answer result returned by the large language model.
9. An electronic device, comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, characterized in that it stores a computer program which, when executed by a processor, implements the steps in the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311427205.3A CN117332067A (en) | 2023-10-30 | 2023-10-30 | Question-answer interaction method and device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117332067A (en) | 2024-01-02
Family
ID=89277319
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311427205.3A Pending CN117332067A (en) | 2023-10-30 | 2023-10-30 | Question-answer interaction method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117332067A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118095267A (en) * | 2024-03-15 | 2024-05-28 | 行至智能(北京)技术有限公司 | Language model answer tracing method and system based on vector matching |
CN118095267B (en) * | 2024-03-15 | 2024-08-13 | 行至智能(北京)技术有限公司 | Language model answer tracing method and system based on vector matching |
CN118034637A (en) * | 2024-04-15 | 2024-05-14 | 青岛国创智能家电研究院有限公司 | Method for processing sensing interaction of universal terminal, control device and storage medium |
CN118034637B (en) * | 2024-04-15 | 2024-07-09 | 青岛国创智能家电研究院有限公司 | Method for processing sensing interaction of universal terminal, control device and storage medium |
CN118939120A (en) * | 2024-07-29 | 2024-11-12 | 北京蜂巢世纪科技有限公司 | Interaction method and device, wearable device, server, and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10956771B2 (en) | Image recognition method, terminal, and storage medium | |
US11416681B2 (en) | Method and apparatus for determining a reply statement to a statement based on a sum of a probability of the reply statement being output in response to the statement and a second probability in which the statement is output in response to the statement and further based on a terminator | |
CN117332067A (en) | Question-answer interaction method and device, electronic equipment and storage medium | |
CN111046227B (en) | Video duplicate checking method and device | |
CN110570840B (en) | Intelligent device awakening method and device based on artificial intelligence | |
TW201843654A (en) | Image description generation method, model training method, devices and storage medium | |
CN108287918B (en) | Music playing method and device based on application page, storage medium and electronic equipment | |
CN111209423B (en) | Image management method and device based on electronic album and storage medium | |
CN110321559B (en) | Answer generation method, device and storage medium for natural language questions | |
CN116933149A (en) | Object intention prediction method and device, electronic equipment and storage medium | |
CN110390102B (en) | Emotion analysis method and related device | |
CN111897916A (en) | Voice instruction recognition method and device, terminal equipment and storage medium | |
CN111723783B (en) | Content identification method and related device | |
CN113836329B (en) | Multimedia content classification method, device, electronic equipment and storage medium | |
WO2023246558A1 (en) | Semantic understanding method and apparatus, and medium and device | |
CN115982110A (en) | File operation method and device, computer equipment and readable storage medium | |
CN116612751A (en) | Intent recognition method, device, electronic device and storage medium | |
CN114676707B (en) | Method and related device for determining multilingual translation model | |
CN113569043B (en) | A text category determination method and related device | |
CN115831120B (en) | Corpus data acquisition method and device, electronic equipment and readable storage medium | |
CN117496967A (en) | Data processing method, device, electronic equipment and storage medium | |
CN115910051A (en) | Audio data processing method and device, electronic equipment and storage medium | |
CN116680576A (en) | Content matching method and device, electronic equipment and storage medium | |
CN116932256A (en) | Defect report distribution method and device, electronic equipment and storage medium | |
CN117094307A (en) | Statement processing method and related device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||