CN116992005B - Intelligent dialogue method, system and equipment based on large model and local knowledge base - Google Patents

Intelligent dialogue method, system and equipment based on large model and local knowledge base

Info

Publication number
CN116992005B
CN116992005B (application CN202311236134.9A)
Authority
CN
China
Prior art keywords
user
question
answer
knowledge base
local knowledge
Prior art date
Legal status
Active
Application number
CN202311236134.9A
Other languages
Chinese (zh)
Other versions
CN116992005A (en)
Inventor
夏博
李世奇
李国东
Current Assignee
Yucang Technology Beijing Co ltd
Original Assignee
Yucang Technology Beijing Co ltd
Priority date
Filing date
Publication date
Application filed by Yucang Technology Beijing Co ltd filed Critical Yucang Technology Beijing Co ltd
Priority to CN202311236134.9A priority Critical patent/CN116992005B/en
Publication of CN116992005A publication Critical patent/CN116992005A/en
Application granted granted Critical
Publication of CN116992005B publication Critical patent/CN116992005B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G - PHYSICS
      • G06 - COMPUTING; CALCULATING OR COUNTING
        • G06F - ELECTRIC DIGITAL DATA PROCESSING
          • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
            • G06F 16/30 - Information retrieval of unstructured textual data
              • G06F 16/33 - Querying
                • G06F 16/332 - Query formulation
                  • G06F 16/3329 - Natural language query formulation or dialogue systems
                • G06F 16/3331 - Query processing
                  • G06F 16/334 - Query execution
                    • G06F 16/3344 - Query execution using natural language analysis
              • G06F 16/36 - Creation of semantic tools, e.g. ontology or thesauri
                • G06F 16/367 - Ontology
          • G06F 18/00 - Pattern recognition
            • G06F 18/20 - Analysing
              • G06F 18/22 - Matching criteria, e.g. proximity measures
              • G06F 18/27 - Regression, e.g. linear or logistic regression
        • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N 3/00 - Computing arrangements based on biological models
            • G06N 3/02 - Neural networks
              • G06N 3/04 - Architecture, e.g. interconnection topology
                • G06N 3/045 - Combinations of networks
                  • G06N 3/0455 - Auto-encoder networks; Encoder-decoder networks
              • G06N 3/08 - Learning methods
                • G06N 3/09 - Supervised learning
          • G06N 5/00 - Computing arrangements using knowledge-based models
            • G06N 5/02 - Knowledge representation; Symbolic representation
              • G06N 5/022 - Knowledge engineering; Knowledge acquisition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an intelligent dialogue method, system and equipment based on a large model and a local knowledge base, belonging to the field of artificial intelligence. The method comprises the following steps: acquiring a local knowledge base and a user question; encoding the user question and the questions in the local knowledge base with the encoder of a pre-trained large model to obtain the corresponding sentence vectors; calculating the semantic similarity between the user question sentence vector and the question sentence vectors in the local knowledge base; if the semantic similarity between the user question sentence vector and the question sentence vector of any question-answer data in the local knowledge base is greater than or equal to a set threshold, taking the answer of that question-answer data as the answer to the user question; otherwise, calculating the similarity between the user question and each article paragraph in the local knowledge base with a pre-trained dense paragraph retrieval model and determining a plurality of candidate article paragraphs; and determining the answer to the user question with the pre-trained large model according to the candidate article paragraphs and the user question. The invention improves the reliability of intelligent dialogue.

Description

Intelligent dialogue method, system and equipment based on large model and local knowledge base
Technical Field
The invention relates to the field of artificial intelligence, in particular to an intelligent dialogue method, system and equipment based on a large model and a local knowledge base.
Background
Recently, ChatGPT, the intelligent dialogue system based on a pre-trained large model launched by the American artificial intelligence company OpenAI, has attracted enormous attention across the internet and has become one of the fastest-growing applications in history in terms of users. In actual use, ChatGPT and similar pre-trained large-model technologies have exposed a number of problems, including difficulty in updating knowledge, uncontrolled answer content, inability to be deployed privately, and data security concerns, and therefore cannot meet the requirements of large and medium-sized enterprises and of users in some knowledge-intensive industries.
Overall, existing intelligent dialogue technology based on pre-trained large models has many limitations, mainly in the following aspects.
(1) Timeliness. The content learned by a large model lags behind the present, so it answers poorly on questions that are time-sensitive or change frequently. The long training time and high training cost of large models are the main causes of this problem.
(2) Professionalism. Existing large models answer general questions well, but cannot guarantee correctness in highly specialized fields and perform poorly in industries with high requirements on reliability and accuracy.
(3) Controllability. A large model will still produce an answer with apparent confidence to questions it is unfamiliar with, and obvious errors in the answer or contradictions with the content of a knowledge base cannot be avoided. In essence this is also due to the large model's inability to integrate with industry or domain knowledge.
In summary, the reliability of the answers provided by existing intelligent dialogue techniques is low.
Disclosure of Invention
The invention aims to provide an intelligent dialogue method, an intelligent dialogue system and intelligent dialogue equipment based on a large model and a local knowledge base, which can improve the reliability of intelligent dialogue.
In order to achieve the above purpose, the present invention provides an intelligent dialogue method based on a large model and a local knowledge base, comprising: acquiring a local knowledge base and a user question; the local knowledge base comprises a plurality of question-answer data and a plurality of article paragraphs; each question-answer data includes a question and an answer.
Encoding the user question and the question of each question-answer data in the local knowledge base with an encoder in a pre-trained large model to obtain a user question sentence vector and a question sentence vector of each question-answer data in the local knowledge base; the large model includes an encoder and a decoder, each of which comprises a plurality of stacked Transformer blocks.
Calculating the semantic similarity between the user question sentence vector and the question sentence vector of each question-answer data in the local knowledge base.
If the semantic similarity between the user question sentence vector and the question sentence vector of any question-answer data in the local knowledge base is greater than or equal to a set threshold, taking the answer of that question-answer data as the answer to the user question.
If the semantic similarity between the user question sentence vector and the question sentence vectors of all question-answer data in the local knowledge base is smaller than the set threshold, calculating the similarity between the user question and each article paragraph in the local knowledge base using a pre-trained dense paragraph retrieval model.
Determining a plurality of candidate article paragraphs according to the similarity between the user question and each article paragraph in the local knowledge base.
Determining the answer to the user question using a pre-trained large model according to the plurality of candidate article paragraphs and the user question; the pre-trained large model is used to output an answer according to the input knowledge information and question.
In order to achieve the above object, the present invention further provides an intelligent dialogue system based on a large model and a local knowledge base, including: a data acquisition module, configured to acquire a local knowledge base and a user question; the local knowledge base comprises a plurality of question-answer data and a plurality of article paragraphs; each question-answer data includes a question and an answer.
An encoding module, connected to the data acquisition module and configured to encode the user question and the question of each question-answer data in the local knowledge base respectively using an encoder in a pre-trained large model, to obtain a user question sentence vector and a question sentence vector of each question-answer data in the local knowledge base; the large model includes an encoder and a decoder, each of which comprises a plurality of stacked Transformer blocks.
A semantic similarity calculation module, connected to the encoding module and configured to calculate the semantic similarity between the user question sentence vector and the question sentence vector of each question-answer data in the local knowledge base.
A first answer determination module, connected to the semantic similarity calculation module and configured to take the answer of a question-answer data as the answer to the user question if the semantic similarity between the user question sentence vector and the question sentence vector of that question-answer data in the local knowledge base is greater than or equal to a set threshold.
A paragraph similarity calculation module, connected to the semantic similarity calculation module and configured to calculate the similarity between the user question and each article paragraph in the local knowledge base using a pre-trained dense paragraph retrieval model if the semantic similarity between the user question sentence vector and the question sentence vectors of all question-answer data in the local knowledge base is smaller than the set threshold.
A candidate paragraph determination module, connected to the paragraph similarity calculation module and configured to determine a plurality of candidate article paragraphs according to the similarity between the user question and each article paragraph in the local knowledge base.
A second answer determination module, connected to the candidate paragraph determination module and configured to determine the answer to the user question using a pre-trained large model according to the plurality of candidate article paragraphs and the user question; the pre-trained large model is used to output an answer according to the input knowledge information and question.
In order to achieve the above object, the present invention further provides an electronic device, including a memory and a processor, where the memory is configured to store a computer program, and the processor is configured to execute the computer program to cause the electronic device to execute the above intelligent dialogue method based on the large model and the local knowledge base.
According to the specific embodiments provided by the invention, the invention discloses the following technical effects. The encoder of a pre-trained large model is used to encode the user question and the question of each question-answer data in the local knowledge base respectively; learning of the user's local structured question-answer knowledge is thereby realized on the basis of sentence vectors. An encoder structure is introduced into the large model on top of the decoder structure, so that the model keeps strong natural language generation capability while its natural language understanding capability is strengthened, and the user's intent is better understood. If the semantic similarity between the user question sentence vector and the question sentence vectors of all question-answer data in the local knowledge base is smaller than the set threshold, a pre-trained dense paragraph retrieval model is used to calculate the similarity between the user question and each article paragraph in the local knowledge base and determine a plurality of candidate article paragraphs; retrieval over the local knowledge base based on the dense paragraph retrieval model realizes learning of the user's local unstructured document knowledge. Finally, a pre-trained large model determines the answer to the user question according to the candidate article paragraphs and the user question, realizing controllable intelligent dialogue based on the user's local knowledge and thereby improving the reliability of intelligent dialogue.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the drawings that are needed in the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of the intelligent dialogue method based on the large model and the local knowledge base.
Fig. 2 is a schematic diagram of a large model.
Fig. 3 is a schematic diagram of an intelligent dialogue system based on a large model and a local knowledge base according to the present invention.
Symbol description: 201-data acquisition module, 202-encoding module, 203-semantic similarity calculation module, 204-first answer determination module, 205-paragraph similarity calculation module, 206-candidate paragraph determination module, 207-second answer determination module.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention aims to provide an intelligent dialogue method, system and equipment based on a large model and a local knowledge base, which integrate large-model technology with user knowledge and address the problems of conventional intelligent question-answering technology, such as insufficient interaction capability, inability to combine user documents and knowledge, and difficulty in accurately understanding user intent.
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
Embodiment one: as shown in fig. 1, the present embodiment provides an intelligent dialogue method based on a large model and a local knowledge base, which includes the following steps.
Step 1: and acquiring a local knowledge base and user problems. The local knowledge base comprises a plurality of question and answer data and a plurality of article paragraphs. Each question-answer data includes a question and an answer.
Step 2: and respectively encoding the user questions and the questions of each question-answer data in the local knowledge base by adopting an encoder in a pre-trained large model to obtain user question sentence vectors and question sentence vectors of each question-answer data in the local knowledge base.
In this embodiment, as shown in fig. 2, the large model includes an encoder and a decoder, each of which includes a plurality of stacked transducers blocks, each of which includes a self-attention module and a feedforward neural network module inside. Wherein the encoder comprises K transducer blocks and the decoder comprises M transducer blocks. The autoregressive learning is realized through the network structure, and the relation between words at different positions in the input character sequence is obtained. The encoder is for mapping the input character sequence into an embedded semantic representation and the decoder is for converting the embedded semantic representation into a target output sequence.
The large model bottom layer adopts an encoding-decoding (encoding-decoding) architecture, and compared with a decoding-only architecture selected by a large model such as ChatGPT, the architecture has better natural language generation capability, strengthens the task capability of natural language understanding, and is better in understanding user demands.
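A minimal sketch of such an encoder-decoder stack, written with standard PyTorch Transformer layers, is shown below; the layer counts K and M, the model width, and the head count are placeholder values chosen for illustration and are not the configuration of the patented model.

```python
import torch
import torch.nn as nn

class EncoderDecoderLM(nn.Module):
    """Toy encoder-decoder language model: K encoder blocks map the input token
    sequence to an embedded semantic representation, and M decoder blocks turn
    that representation into the target output sequence."""
    def __init__(self, vocab_size: int, d_model: int = 512, n_heads: int = 8,
                 K: int = 6, M: int = 6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)  # self-attention + feed-forward
        dec_layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=K)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers=M)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def encode(self, src_ids: torch.Tensor) -> torch.Tensor:
        # Embedded semantic representation of the input character sequence
        return self.encoder(self.embed(src_ids))

    def forward(self, src_ids: torch.Tensor, tgt_ids: torch.Tensor) -> torch.Tensor:
        memory = self.encode(src_ids)
        causal = nn.Transformer.generate_square_subsequent_mask(tgt_ids.size(1))
        out = self.decoder(self.embed(tgt_ids), memory, tgt_mask=causal)
        return self.lm_head(out)  # next-token logits over the vocabulary
```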
The core idea of step 2 is to map a whole sentence into a vector space in which semantically similar sentences are mapped to nearby positions. Using this large-model-based sentence-vector encoding technique, the sentence text is converted by the trained large model into a vector of fixed dimension.
Specifically, the question sentence is converted into tokens and input into the trained large model, and the embedded semantic representation output by the encoder of the large model is taken as the sentence vector of the corresponding question. The sentence vectors of all questions are stored in a vector database, and an index is built for subsequent retrieval, thereby realizing understanding and representation of sentence semantics. For example, the sentence "today is a good day" may be expressed as a sentence vector with the same dimension as the large model encoder's output: (-0.1329041, 0.16528024, …, -0.03694873, 0.42119873).
Compared with traditional methods such as the bag-of-words model and TF-IDF, the large-model-based sentence vector technique has higher accuracy and robustness and captures the semantic information in sentences better. In addition, since the large model is trained in advance, the generation time and computation cost of sentence vectors can be greatly reduced.
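Continuing the sketch above, the following illustrates how questions could be encoded into fixed-dimension sentence vectors and indexed for retrieval; the mean pooling over the encoder output, the in-memory index, and the `model`, `tokenizer`, and `kb` objects are assumptions of this sketch (the patent only states that the encoder's embedded semantic representation is used and stored in a vector database).

```python
import numpy as np
import torch

def sentence_vector(model: "EncoderDecoderLM", tokenizer, text: str) -> np.ndarray:
    """Encode one sentence into a fixed-dimension vector from the encoder output.
    `tokenizer` is assumed to map text to a list of token ids."""
    ids = torch.tensor([tokenizer(text)])           # shape (1, seq_len)
    with torch.no_grad():
        hidden = model.encode(ids)                  # (1, seq_len, d_model)
    return hidden.mean(dim=1).squeeze(0).numpy()    # pooled to (d_model,)

# Simple in-memory "vector database": one row per question in the knowledge base
kb_vectors = np.stack([sentence_vector(model, tokenizer, qa.question)
                       for qa in kb.qa_pairs])
```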
Further, the training process of the large model comprises the following steps.
(1) A training dataset is acquired. The training dataset includes a plurality of character sequences.
Specifically, the training dataset consists of three parts, namely Chinese, English, and code, and is mainly composed of text passages; the minimum unit of training data is the token. The training dataset contains about 400 billion tokens in total. The Chinese data, about 190 billion tokens, come mainly from Chinese internet sources such as WeChat official accounts, Zhihu, Baijiahao, and Weibo; the English data, about 190 billion tokens, come mainly from internet web pages, books, papers, Wikipedia, and the like; the code data, about 20 billion tokens, come mainly from websites such as GitHub. The data are obtained through public-channel collection, paid purchase, and similar means, and are carefully de-duplicated and cleaned before being added to the training dataset.
(2) The large model is iteratively trained on the training dataset to obtain a preliminary large model.
In the pre-training stage, large-scale, high-quality unlabeled text corpora are used to train the parameters of the large model. First, a standard language-model training method is adopted: the character sequence is fed from left to right to pre-train the large model, i.e., given the preceding k tokens, the i-th token is predicted. The objective function is L(U) = Σ_i log P(u_i | u_{i-k}, …, u_{i-1}; Θ), where L(U) is the objective function value, u_i is the i-th token in the input character sequence, k is the context window size, Θ denotes the model parameters, and P(·) is the probability function.
Then, based on this objective function, character sequences from the training dataset are fed in batches and the parameters of the large model are iteratively updated, so that the large model learns the probability distribution of the language in the training dataset. Through this self-supervised learning, the large model acquires the basic knowledge, language expression, and logical reasoning abilities contained in the training dataset.
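As an illustration of this objective, the next-token prediction loss can be written as a cross-entropy over shifted positions, as in the sketch below; this is a generic formulation, not the patent's actual training code.

```python
import torch
import torch.nn.functional as F

def lm_loss(logits: torch.Tensor, token_ids: torch.Tensor) -> torch.Tensor:
    """Left-to-right language-modeling loss: maximizing sum_i log P(u_i | u_{<i}; Theta)
    is equivalent to minimizing cross-entropy on next-token prediction.
    logits: (batch, seq_len, vocab); token_ids: (batch, seq_len)."""
    pred = logits[:, :-1, :]    # predictions made at positions 0 .. n-2
    target = token_ids[:, 1:]   # the token each of those positions should predict
    return F.cross_entropy(pred.reshape(-1, pred.size(-1)), target.reshape(-1))
```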
(3) An instruction fine-tuning dataset is acquired. The instruction fine-tuning dataset includes a plurality of real dialogue data. Each real dialogue datum includes knowledge information, a question, and an answer.
Specifically, source data are first acquired: a large amount of preliminary real dialogue data is collected and organized, including dialogue content, participant roles, sending time, and the like.
An instruction template is then constructed: according to the task goal, an instruction template is designed for training the large model to answer user questions based on relevant content in the knowledge; it contains the necessary question, knowledge information, and answer. The instruction template takes the following form.
{Answer the question according to the knowledge provided below.
Knowledge: ****.
Question: ****.
Answer: ****.}
An instruction fine-tuning dataset is generated from the instruction template and the source data: the dialogue content of the source data is screened, question-answer pairs and related knowledge information are extracted, and the data are placed into the instruction template to construct complete instruction fine-tuning data. An example follows.
{Answer the question according to the knowledge provided below.
Knowledge information: The Implementation Opinions on the Safety Production Management Provisions for the Principal Responsible Persons, Project Leaders and Full-time Safety Production Managers of Construction Enterprises ([2015] No. 206) stipulate that the principal responsible persons of an enterprise include the legal representative, the general manager (president), the deputy general manager (vice president) in charge of work safety, the deputy general manager (vice president) in charge of production and operations, the technical director, the safety director, and so on. The current examination requires that the enterprise's legal representative, general manager, and technical director hold safety production assessment certificates; if there is any change, please follow the latest notice.
Question: Who are the principal responsible persons of a construction enterprise that are required to apply for the safety production assessment?
Answer: The legal representative, the general manager (president), the deputy general manager (vice president) in charge of work safety, the deputy general manager (vice president) in charge of production and operations, the technical director, the safety director, and so on.}
(4) The preliminary large model is iteratively trained on the instruction fine-tuning dataset to obtain the trained large model.
Specifically, the parameters of the preliminary large model are fine-tuned in a supervised manner: the instruction, the knowledge information, and the question part of each datum are used as the input of the preliminary large model, and the answer part is used as its output. The instruction fine-tuning dataset is divided into a training set and a test set; the training set is used for fine-tuning the preliminary large model, and the test set is used to evaluate its performance.
The preliminary large model built in steps (1) and (2) acquires basic language understanding and expression capability; the purpose of steps (3) and (4) is to perform further supervised fine-tuning on it with task-specific instruction data so that it acquires the ability to solve actual tasks, such as classification tasks, writing tasks, and question-answering tasks. The task in the present invention is answering user questions based on a local knowledge base, which is essentially a knowledge-grounded dialogue generation task. Supervised fine-tuning of the parameters of the preliminary large model lets it better fit the features of the task data and align better with human behavior.
Step 3: and calculating the semantic similarity between the user question sentence vector and the question sentence vector of each question-answer data in the local knowledge base.
Specifically, the semantic similarity is cosine similarity or euclidean distance. And comparing the semantic similarity between the problem sentence vector of the user and all the problem sentence vectors in the local knowledge base by adopting a vector similarity calculation algorithm of cosine similarity or Euclidean distance.
Step 4: and if the semantic similarity of the question sentence vector of the user and the question sentence vector of any question-answer data in the local knowledge base is greater than or equal to a set threshold value, taking the answer of the question-answer data as the answer of the user question.
Step 5: if the semantic similarity between the user question sentence vector and the question sentence vectors of all question and answer data in the local knowledge base is smaller than a set threshold value, calculating the similarity between the user question and each article paragraph in the local knowledge base by adopting a pre-trained dense paragraph retrieval model.
The dense paragraph retrieval (Dense Passage Retrieval, DPR) model is a deep neural network retrieval model for open-domain question answering. Traditional retrieval methods rely on sparse features such as bag-of-words or TF-IDF and often struggle to capture the semantic information of text content. The DPR model adopts a dense vector approach, which better understands and captures semantic relations between texts and improves retrieval accuracy.
Further, in step 5, using the pre-trained DPR model to calculate the similarity between the user question and each article paragraph in the local knowledge base specifically includes the following steps.
(51) The pre-trained DPR model is used to encode the user question and each article paragraph in the local knowledge base respectively, obtaining user question encoding data and encoding data for each article paragraph in the local knowledge base.
(52) For any article paragraph, the inner product of the user question encoding data and the encoding data of that article paragraph is calculated to obtain the similarity between the user question and that paragraph. Both step (51) and step (52) are processes of the DPR model.
Specifically, the DPR model adopts a dual-tower structure: the first tower encodes user questions and the second tower encodes article paragraphs. During training, each example requires a question, a positive answer to the question (the correct paragraph), and some negative answers (irrelevant paragraphs), after which the model is optimized with a contrastive loss.
Step 6: and determining a plurality of candidate article paragraphs according to the similarity between the user problem and each article paragraph in the local knowledge base.
Specifically, the article paragraphs in the local knowledge base are ranked according to the similarity with the user problem from large to small, and the first N article paragraphs are selected as candidate article paragraphs, wherein N is greater than 1.
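The following sketch shows the dual-tower scoring of step 5 and the top-N selection of step 6; `q_encoder` and `p_encoder` stand for the two DPR towers and are assumed to be callables that return numpy vectors.

```python
import numpy as np

def dpr_retrieve(question: str, paragraphs: list, q_encoder, p_encoder, top_n: int = 3):
    """Encode the question and every paragraph with separate towers, score each
    paragraph by the inner product of the two vectors (step 5), and keep the
    N highest-scoring paragraphs as candidates (step 6)."""
    q_vec = q_encoder(question)                            # (d,)
    p_vecs = np.stack([p_encoder(p) for p in paragraphs])  # (num_paragraphs, d)
    scores = p_vecs @ q_vec                                # inner-product similarity
    order = np.argsort(-scores)[:top_n]                    # descending order, first N
    return [paragraphs[i] for i in order]
```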
Step 7: and determining answers to the user questions by adopting a pre-trained large model according to the plurality of candidate article paragraphs and the user questions. The pre-trained large model is used for outputting answers according to the input knowledge information and questions.
Specifically, step 7 includes: (71) And determining candidate knowledge information according to the multiple candidate article paragraphs and preset prompt words. (72) And determining a target article paragraph by adopting a pre-trained large model according to the candidate knowledge information and the user problem. (73) And determining answers to the user questions by adopting a pre-trained large model according to the target article paragraphs and the user questions.
Through designing proper prompt words, the user problems and a plurality of article paragraphs are input into a trained large model, so that the trained large model can judge whether information in the article paragraphs can give correct solutions to the user problems, and then the article paragraph which is most suitable for solving the user problems is extracted from the information as a target article paragraph. If neither can answer, the output informs the user that the answer to the question cannot be found.
For a further understanding of the aspects of the present invention, step (72) is described below in connection with the detailed description.
The following candidate knowledge information and user question are input into the trained large model.
{Judge which of the following materials is most suitable for answering the question below, and output the corresponding material number; if none of the materials can answer the question, output 0.
Material 1: A building construction enterprise refers to an independent production and operation unit engaged in housing construction and equipment installation activities. It takes two forms: building installation enterprises and self-operated construction units. The former is an administratively independent organization that performs independent economic accounting.
Material 2: The Implementation Opinions on the Safety Production Management Provisions for the Principal Responsible Persons, Project Leaders and Full-time Safety Production Managers of Construction Enterprises ([2015] No. 206) stipulate that the principal responsible persons of an enterprise include the legal representative, the general manager (president), the deputy general manager (vice president) in charge of work safety, the deputy general manager (vice president) in charge of production and operations, the technical director, the safety director, and so on. The current examination requires that the enterprise's legal representative, general manager, and technical director hold safety production assessment certificates; if there is any change, please follow the latest notice.
Material 3: The responsible person is one of the registration items of a business entity and its branches. The responsible person refers to the person who exercises authority on behalf of the operating unit or branch and presides over its production, operation, or service activities, and is generally appointed by the higher-level legal entity (such as an enterprise legal person, public-institution legal person, or social-organization legal person) with which the operating unit or branch is affiliated.
User question: Who are the principal responsible persons of a construction enterprise that are required to apply for the safety production assessment?}.
The output of the trained large model is: {Answer: 2}.
Here, Material 1, Material 2, and Material 3 correspond to the three candidate article paragraphs.
Because the trained large model has been trained on massive text data, it learns rich semantic information and can understand complex questions and give correct answers. In the invention, by designing appropriate prompts, the user question and the target article paragraph screened out in step (72) are input as knowledge information into the trained large model, and the model's output is the answer to the user question. An example follows.
The following information is entered into the trained large model.
{Answer the question according to the knowledge provided below.
Knowledge information: The Implementation Opinions on the Safety Production Management Provisions for the Principal Responsible Persons, Project Leaders and Full-time Safety Production Managers of Construction Enterprises ([2015] No. 206) stipulate that the principal responsible persons of an enterprise include the legal representative, the general manager (president), the deputy general manager (vice president) in charge of work safety, the deputy general manager (vice president) in charge of production and operations, the technical director, the safety director, and so on. The current examination requires that the enterprise's legal representative, general manager, and technical director hold safety production assessment certificates; if there is any change, please follow the latest notice.
Question: Who are the principal responsible persons of a construction enterprise that are required to apply for the safety production assessment?}.
The output of the trained large model is: {Answer: The legal representative, the general manager (president), the deputy general manager (vice president) in charge of work safety, the deputy general manager (vice president) in charge of production and operations, the technical director, the safety director, and so on.}
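The two prompting stages of step 7 could be wired together as in the sketch below; `llm` is assumed to be a text-in/text-out callable around the trained large model, and the English prompt wording is an illustrative rendering of the patent's examples.

```python
def select_target_paragraph(llm, question: str, candidates: list) -> int:
    """Step (72): ask the trained large model which numbered material best
    answers the question; 0 means none of them can."""
    materials = "\n".join(f"Material {i + 1}: {p}" for i, p in enumerate(candidates))
    prompt = (
        "Judge which of the following materials is most suitable for answering "
        "the question below and output the corresponding material number; "
        "if none of them can answer it, output 0.\n"
        f"{materials}\nUser question: {question}"
    )
    reply = llm(prompt)                                   # e.g. "{Answer: 2}"
    digits = "".join(ch for ch in reply if ch.isdigit())  # extract the material number
    return int(digits) if digits else 0

def generate_answer(llm, question: str, target_paragraph: str) -> str:
    """Step (73): feed the selected paragraph back in as knowledge information
    and let the model produce the final answer."""
    prompt = (
        "Answer the question according to the knowledge provided below.\n"
        f"Knowledge: {target_paragraph}\n"
        f"Question: {question}\nAnswer: "
    )
    return llm(prompt)
```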
In order to improve the reliability of the output answers, the intelligent dialogue method based on the large model and the local knowledge base further comprises the following steps.
Step 8: and judging whether the answer to the user question comprises sensitive information or not by adopting a pre-established rule engine to obtain a first checking result. And the first checking result is yes or no. The rules engine includes a series of sensitive words and text templates with specific patterns (such as specific phrases or specific grammatical structures, etc.), matches the answers to the user questions with the rules, and marks the answers to the user questions as sensitive information if the matching is successful.
Step 9: and classifying the answers of the user questions by adopting an answer classification model to obtain a second checking result. The second audit result is sensitive or non-sensitive. The answer classification model is obtained by training the deep learning model by adopting a training sample set in advance. The training sample set comprises a plurality of character sequence samples and categories of the character sequence samples. The class is sensitive or non-sensitive.
Specifically, first, a training sample set containing sensitive information and non-sensitive information is collected. May be collected from different sources, such as social media, news stories, etc. And then, labeling the character sequence samples in the training sample set, and classifying the character sequence samples into sensitive and non-sensitive categories.
Next, a suitable deep learning model architecture is selected; common choices include recurrent neural networks (Recurrent Neural Network, RNN), convolutional neural networks (Convolutional Neural Networks, CNN), and BERT-based classification models. An appropriate deep learning model is chosen according to the task requirements and the characteristics of the training sample set.
The training sample set is then divided into a training set, a validation set, and a test set. The deep learning model is trained on the training set with a cross-entropy loss function and a gradient-descent optimization algorithm; the validation set is used to tune the model and select suitable hyper-parameters. The trained model is evaluated on the test set by computing metrics such as accuracy, recall, and F1 score, and is further optimized according to the evaluation results, for example by adjusting hyper-parameters or adding regularization.
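A minimal supervised training loop for such a classifier is sketched below; `classifier` stands for any torch module mapping batches of token ids to two-class logits, and the batch size, learning rate, and epoch count are illustrative choices, not values from the patent.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader

def train_answer_classifier(classifier, train_set, val_set, epochs: int = 3, lr: float = 2e-5):
    """Cross-entropy training with gradient-descent updates; the datasets are
    assumed to yield (token_ids, label) pairs with label 0 = non-sensitive, 1 = sensitive."""
    optimizer = torch.optim.AdamW(classifier.parameters(), lr=lr)
    train_loader = DataLoader(train_set, batch_size=16, shuffle=True)
    for _ in range(epochs):
        classifier.train()
        for token_ids, labels in train_loader:
            loss = F.cross_entropy(classifier(token_ids), labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        # Validation accuracy guides hyper-parameter selection
        classifier.eval()
        correct = total = 0
        with torch.no_grad():
            for token_ids, labels in DataLoader(val_set, batch_size=16):
                correct += (classifier(token_ids).argmax(dim=-1) == labels).sum().item()
                total += labels.size(0)
        print(f"validation accuracy: {correct / max(total, 1):.3f}")
```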
Finally, the trained deep learning model is deployed into the actual application, for example as an application programming interface (Application Programming Interface, API) service, to audit new text data for sensitive information.
Step 10: and judging whether the answer of the user question is standard according to the first checking result and the second checking result. If the first checking result is yes or the second checking result is sensitive, the answer of the user question is not standard, otherwise, the answer of the user question is standard. That is, the answer specification is considered only when both the first and second audit results are insensitive.
Step 11: if the answer of the user question is not standard, generating alarm information and outputting an alarm informing the user that the answer of the user question cannot be found.
Step 12: and if the answer of the user question is standard, outputting the answer of the user question.
The invention realizes content security auditing with a text classification approach that combines a rule engine and a deep learning model, combining the advantages of both: the deep learning model can understand the context and meaning of the text and recognizes complex and varied text well, while the rule engine ensures that explicitly sensitive information is captured accurately.
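Steps 8 to 12 could be combined as in the sketch below; the sensitive word list, the regular-expression patterns, and the fallback message are placeholders, and `classify` stands for the trained answer classification model returning "sensitive" or "non-sensitive".

```python
import re

SENSITIVE_WORDS = ["example_sensitive_word"]                     # placeholder rule list
SENSITIVE_PATTERNS = [re.compile(r"example sensitive pattern")]  # placeholder text templates

def rule_engine_flags(answer: str) -> bool:
    """Step 8 (first audit result): True ("yes") if any sensitive word or pattern matches."""
    return any(word in answer for word in SENSITIVE_WORDS) or \
           any(pattern.search(answer) for pattern in SENSITIVE_PATTERNS)

def answer_is_compliant(answer: str, classify) -> bool:
    """Step 10: the answer is compliant only when the rule engine finds nothing
    and the classifier (step 9) says non-sensitive."""
    first_result = rule_engine_flags(answer)
    second_result = classify(answer)
    return (not first_result) and second_result == "non-sensitive"

def respond(answer: str, classify) -> str:
    if answer_is_compliant(answer, classify):
        return answer                                            # step 12: output the answer
    # step 11: generate an alarm and tell the user no answer could be found
    return "Sorry, an answer to your question could not be found."
```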
A trained large model may generate sensitive information while generating high-quality natural language text. Content security auditing is a key link in ensuring that the large-model application is safe and controllable; its function is to check whether the user question and the answer generated in step 7 contain sensitive information, including illegal content, improper values, and the like. If the user question is detected to contain sensitive information, an error is reported and the process exits directly; if the answer is found to contain sensitive information, the output informs the user that an answer to the question cannot be found.
The bottom layer of the invention is based on an independently developed large-scale pre-trained language model (large model for short), which has advantages that previous-generation intelligent dialogue systems based on natural language processing (Natural Language Processing, NLP) technology cannot match. First, the language understanding capability of the large model is used to recognize user intent, accurately understand the questions raised by the user, and grasp the user's real intent. Second, the dialogue generation capability of the large model supports human-like communication with the user in a natural, fluent, and personable way, giving the user a better service experience. In addition, the intelligent dialogue robot answers strictly according to the user's knowledge: by using artificial intelligence (Artificial Intelligence, AI) techniques such as knowledge enhancement, the large model is tightly combined with user knowledge to generate more objective, safe, and controllable answers, avoiding the problem of the large model answering arbitrarily, and building a safe and reliable intelligent dialogue robot based on user knowledge.
In summary, the invention has the following beneficial effects.
1. The intent understanding ability of the large model is improved. In the design of the large model, an encoder-decoder architecture is adopted; by introducing an encoder structure on top of the existing decoder structure, the model keeps strong natural language generation capability while strengthening natural language understanding, so it understands user intent better and has advantages in vertical-domain applications.
2. Controllable intelligent question answering can be performed based on the user's local knowledge. The invention supports learning from the user's local documents and knowledge and automatically realizes intelligent question answering based on that knowledge by combining the understanding and generation capabilities of the large model. Steps 2 and 3 understand user intent based on sentence vectors, realizing learning of the user's local structured question-answer knowledge; steps 5 and 6 retrieve the local knowledge base based on the DPR model, realizing learning of the user's local unstructured document knowledge; step 7 generates answers based on the trained large model, which summarizes and extracts the relevant answer to the user question from the local knowledge base and can give the basis for the answer, realizing controllable intelligent dialogue based on the user's local knowledge.
3. The output answers are safer and more reliable. Steps 8 to 12 perform content auditing based on the rule engine and the deep learning model, realizing automatic auditing of the large model's output and avoiding problems such as illegal content and improper values. This improves the transparency of generative artificial intelligence technology and the safety and reliability of the output content.
4. Cost is reduced, and the efficiency and quality of question answering are improved. Compared with traditional intelligent question-answering methods, the invention communicates with the user in a friendlier way, helps the user find the answer to a question quickly and accurately, and improves the efficiency and quality of customer service.
Embodiment two: in order to execute the corresponding method of the above embodiment to achieve the corresponding functions and technical effects, an intelligent dialogue system based on a large model and a local knowledge base is provided below.
As shown in fig. 3, the intelligent dialogue system based on the large model and the local knowledge base provided in this embodiment includes: the system comprises a data acquisition module 201, an encoding module 202, a semantic similarity calculation module 203, a first answer determination module 204, a paragraph similarity calculation module 205, a candidate paragraph determination module 206 and a second answer determination module 207.
The data acquisition module 201 is configured to acquire a local knowledge base and a user question. The local knowledge base comprises a plurality of question-answer data and a plurality of article paragraphs. Each question-answer data includes a question and an answer.
The encoding module 202 is connected to the data acquisition module 201, and the encoding module 202 is configured to encode the user question and the question of each question-answer data in the local knowledge base using an encoder in a pre-trained large model, so as to obtain a user question sentence vector and a question sentence vector of each question-answer data in the local knowledge base. The large model includes an encoder and a decoder, each of which comprises a plurality of stacked Transformer blocks.
The semantic similarity calculating module 203 is connected to the encoding module 202, and the semantic similarity calculating module 203 is configured to calculate the semantic similarity between the user question sentence vector and the question sentence vector of each question-answer data in the local knowledge base.
The first answer determining module 204 is connected to the semantic similarity calculating module 203, where the first answer determining module 204 is configured to take an answer of the question-answer data as an answer of the user question if the semantic similarity between the user question sentence vector and a question sentence vector of any question-answer data in the local knowledge base is greater than or equal to a set threshold.
The paragraph similarity calculation module 205 is connected to the semantic similarity calculation module 203, where the paragraph similarity calculation module 205 is configured to calculate the similarity between the user question and each article paragraph in the local knowledge base by using a pre-trained DPR model if the semantic similarity between the user question sentence vector and the question sentence vectors of all question-answer data in the local knowledge base is smaller than a set threshold.
The candidate paragraph determining module 206 is connected to the paragraph similarity calculating module 205, and the candidate paragraph determining module 206 is configured to determine a plurality of candidate article paragraphs according to the similarity between the user question and each article paragraph in the local knowledge base.
The second answer determining module 207 is connected to the candidate paragraph determining module 206, and the second answer determining module 207 is configured to determine an answer to the user question according to a plurality of candidate article paragraphs and the user question by using a pre-trained large model. The pre-trained large model is used for outputting answers according to the input knowledge information and questions.
Compared with the prior art, the intelligent dialogue system based on the large model and the local knowledge base provided by the embodiment has the same beneficial effects as the intelligent dialogue method based on the large model and the local knowledge base provided by the first embodiment, and is not described herein.
Embodiment III: the present embodiment provides an electronic device, including a memory and a processor, where the memory is configured to store a computer program, and the processor is configured to run the computer program to cause the electronic device to execute the intelligent dialogue method according to the first embodiment based on the large model and the local knowledge base.
Alternatively, the electronic device may be a server.
In addition, the embodiment of the invention also provides a computer readable storage medium, which stores a computer program, and the computer program realizes the intelligent dialogue method based on the large model and the local knowledge base in the first embodiment when being executed by a processor.
In the present specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, and identical and similar parts between the embodiments are all enough to refer to each other.
The principles and embodiments of the present invention have been described herein with reference to specific examples, the description of which is intended only to assist in understanding the methods of the present invention and the core ideas thereof; also, it is within the scope of the present invention to be modified by those of ordinary skill in the art in light of the present teachings. In view of the foregoing, this description should not be construed as limiting the invention.

Claims (8)

1. An intelligent dialogue method based on a large model and a local knowledge base, characterized by comprising the following steps:
acquiring a local knowledge base and a user question; the local knowledge base comprises a plurality of question-answer data and a plurality of article paragraphs; each question-answer data comprises a question and an answer;
encoding the user question and the question of each question-answer data in the local knowledge base by adopting an encoder in a pre-trained large model to obtain a user question sentence vector and a question sentence vector of each question-answer data in the local knowledge base; the large model comprises an encoder and a decoder, each of which comprises a plurality of stacked Transformer blocks;
calculating the semantic similarity between the user question sentence vector and the question sentence vector of each question-answer data in the local knowledge base;
if the semantic similarity of the question sentence vector of the user and the question sentence vector of any question-answer data in the local knowledge base is greater than or equal to a set threshold value, taking an answer of the question-answer data as an answer of the user question;
if the semantic similarity between the user question sentence vector and the question sentence vectors of all question-answer data in the local knowledge base is smaller than the set threshold value, calculating the similarity between the user question and each article paragraph in the local knowledge base by adopting a pre-trained dense paragraph retrieval model;
determining a plurality of candidate article paragraphs according to the similarity between the user question and each article paragraph in the local knowledge base;
determining the answer to the user question by adopting a pre-trained large model according to the plurality of candidate article paragraphs and the user question, which specifically comprises: determining candidate knowledge information according to the plurality of candidate article paragraphs and preset prompt words; determining a target article paragraph by adopting the pre-trained large model according to the candidate knowledge information and the user question; and determining the answer to the user question by adopting the pre-trained large model according to the target article paragraph and the user question; the pre-trained large model is used for outputting answers according to the input knowledge information and questions.
2. The intelligent dialogue method based on a large model and a local knowledge base according to claim 1, wherein the training process of the large model comprises:
acquiring a training data set; the training dataset comprises a plurality of character sequences;
performing iterative training on the large model by adopting the training data set to obtain a preliminary large model;
acquiring an instruction fine adjustment data set; the instruction fine tuning dataset comprises a plurality of real dialogue data; each real dialogue data comprises knowledge information, questions and answers;
and carrying out iterative training on the preliminary large model by adopting the instruction fine adjustment data set so as to obtain a trained large model.
3. The intelligent dialogue method based on large models and local knowledge base according to claim 1, wherein the semantic similarity is determined according to a vector similarity calculation algorithm of cosine similarity or euclidean distance.
4. The intelligent dialogue method based on a large model and a local knowledge base according to claim 1, wherein the calculating the similarity between the user question and each article paragraph in the local knowledge base by adopting a pre-trained dense paragraph retrieval model specifically comprises:
encoding the user question and each article paragraph in the local knowledge base respectively by adopting a pre-trained dense paragraph retrieval model to obtain user question encoding data and encoding data of each article paragraph in the local knowledge base;
and, for any article paragraph, calculating the inner product of the user question encoding data and the encoding data of that article paragraph so as to obtain the similarity between the user question and the article paragraph.
5. The method of claim 1, wherein determining a plurality of candidate article paragraphs according to the similarity between the user question and each article paragraph in the local knowledge base comprises:
ranking the article paragraphs in the local knowledge base by similarity to the user question in descending order, and selecting the first N article paragraphs as candidate article paragraphs, where N > 1.
6. The intelligent dialogue method based on a large model and a local knowledge base according to claim 1, further comprising:
judging whether the answer to the user question contains sensitive information by adopting a pre-established rule engine to obtain a first check result; the first check result is yes or no;
classifying the answer to the user question by adopting an answer classification model to obtain a second check result; the second check result is sensitive or non-sensitive; the answer classification model is obtained by training a deep learning model in advance with a training sample set; the training sample set comprises a plurality of character sequence samples and the category of each character sequence sample; the category is sensitive or non-sensitive;
judging whether the answer to the user question is compliant according to the first check result and the second check result; if the first check result is yes or the second check result is sensitive, the answer to the user question is not compliant; otherwise, the answer to the user question is compliant;
if the answer to the user question is not compliant, generating alarm information and outputting a notice informing the user that no answer to the question can be found;
and if the answer to the user question is compliant, outputting the answer to the user question.
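A minimal sketch of combining the two checks is shown below; the keyword patterns are assumptions, and classify() stands in for the trained answer classification model, returning "sensitive" or "non-sensitive".

import re

SENSITIVE_PATTERNS = [re.compile(p) for p in (r"\bpassword\b", r"\bid card\b")]

def rule_engine_check(answer: str) -> bool:
    # First check result: True ("yes") if any sensitive pattern matches.
    return any(p.search(answer.lower()) for p in SENSITIVE_PATTERNS)

def review_answer(answer: str, classify) -> str:
    first = rule_engine_check(answer)   # yes / no
    second = classify(answer)           # "sensitive" / "non-sensitive"
    compliant = (not first) and second == "non-sensitive"
    if not compliant:
        # Alarm path: tell the user that no answer could be found.
        return "Sorry, no answer to your question could be found."
    return answer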
7. An intelligent dialogue system based on a large model and a local knowledge base, characterized by comprising:
a data acquisition module, used for acquiring a local knowledge base and a user question; the local knowledge base comprises a plurality of question-answer data and a plurality of article paragraphs; each question-answer data comprises a question and an answer;
an encoding module, connected with the data acquisition module and used for encoding the user question and the question of each question-answer data in the local knowledge base respectively by adopting the encoder of a pre-trained large model to obtain a user question sentence vector and a question sentence vector of each question-answer data in the local knowledge base; the large model comprises an encoder and a decoder, each of which comprises a plurality of stacked Transformer blocks;
a semantic similarity calculation module, connected with the encoding module and used for calculating the semantic similarity between the user question sentence vector and the question sentence vector of each question-answer data in the local knowledge base;
a first answer determining module, connected with the semantic similarity calculation module and used for taking the answer of a question-answer data as the answer to the user question if the semantic similarity between the user question sentence vector and the question sentence vector of that question-answer data in the local knowledge base is greater than or equal to a set threshold;
a paragraph similarity calculation module, connected with the semantic similarity calculation module and used for calculating the similarity between the user question and each article paragraph in the local knowledge base by adopting a pre-trained dense paragraph retrieval model if the semantic similarity between the user question sentence vector and the question sentence vectors of all question-answer data in the local knowledge base is smaller than the set threshold;
a candidate paragraph determining module, connected with the paragraph similarity calculation module and used for determining a plurality of candidate article paragraphs according to the similarity between the user question and each article paragraph in the local knowledge base;
and a second answer determining module, connected with the candidate paragraph determining module and used for determining the answer to the user question by adopting the pre-trained large model according to the plurality of candidate article paragraphs and the user question, which specifically comprises: determining candidate knowledge information according to the plurality of candidate article paragraphs and preset prompt words; determining a target article paragraph by adopting the pre-trained large model according to the candidate knowledge information and the user question; and determining the answer to the user question by adopting the pre-trained large model according to the target article paragraph and the user question; wherein the pre-trained large model is used for outputting an answer according to the input knowledge information and question.
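An end-to-end sketch wiring the modules of claim 7 together is given below; all encoder, retrieval, and generation callables are placeholders for the pre-trained models, and the FAQ threshold value is illustrative.

import numpy as np

def answer_user_question(question, faq, paragraphs,
                         encode_sentence,                    # encoder of the large model
                         encode_question, encode_paragraph,  # dense paragraph retrieval model
                         generate,                           # generation with the large model
                         threshold=0.9, n=3):
    # Semantic similarity between the user question and every FAQ question.
    q_vec = encode_sentence(question)
    best_sim, best_answer = -1.0, None
    for faq_question, faq_answer in faq:
        v = encode_sentence(faq_question)
        sim = float(np.dot(q_vec, v) / (np.linalg.norm(q_vec) * np.linalg.norm(v)))
        if sim > best_sim:
            best_sim, best_answer = sim, faq_answer
    if best_sim >= threshold:
        return best_answer                    # first answer determining module

    # Otherwise fall back to dense paragraph retrieval plus generation.
    scores = [float(np.dot(encode_question(question), encode_paragraph(p)))
              for p in paragraphs]
    order = np.argsort(scores)[::-1][:n]
    candidates = "\n".join(paragraphs[i] for i in order)
    prompt = f"Knowledge:\n{candidates}\nQuestion: {question}\nAnswer:"
    return generate(prompt)                   # second answer determining module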
8. An electronic device, comprising a memory for storing a computer program and a processor that runs the computer program to cause the electronic device to perform the intelligent dialogue method based on a large model and a local knowledge base according to any one of claims 1 to 6.
CN202311236134.9A 2023-09-25 2023-09-25 Intelligent dialogue method, system and equipment based on large model and local knowledge base Active CN116992005B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311236134.9A CN116992005B (en) 2023-09-25 2023-09-25 Intelligent dialogue method, system and equipment based on large model and local knowledge base

Publications (2)

Publication Number Publication Date
CN116992005A CN116992005A (en) 2023-11-03
CN116992005B true CN116992005B (en) 2023-12-01

Family

ID=88528600

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311236134.9A Active CN116992005B (en) 2023-09-25 2023-09-25 Intelligent dialogue method, system and equipment based on large model and local knowledge base

Country Status (1)

Country Link
CN (1) CN116992005B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117609444A (en) * 2023-11-08 2024-02-27 天讯瑞达通信技术有限公司 Searching question-answering method based on large model
CN117556024B (en) * 2024-01-10 2024-04-30 腾讯科技(深圳)有限公司 Knowledge question-answering method and related equipment
CN117725186A (en) * 2024-02-08 2024-03-19 北京健康有益科技有限公司 Method and device for generating dialogue sample of specific disease diet scheme

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108491433A (en) * 2018-02-09 2018-09-04 平安科技(深圳)有限公司 Chat answer method, electronic device and storage medium
CN111813909A (en) * 2020-06-24 2020-10-23 泰康保险集团股份有限公司 Intelligent question answering method and device
WO2023038654A1 (en) * 2021-09-07 2023-03-16 Google Llc Using large language model(s) in generating automated assistant response(s)
CN116702766A (en) * 2023-06-14 2023-09-05 山东浪潮科学研究院有限公司 Automatic construction method and device for domain knowledge base based on large language model
CN116701431A (en) * 2023-05-25 2023-09-05 东云睿连(武汉)计算技术有限公司 Data retrieval method and system based on large language model
CN116737908A (en) * 2023-07-19 2023-09-12 北京百度网讯科技有限公司 Knowledge question-answering method, device, equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9183511B2 (en) * 2012-02-24 2015-11-10 Ming Li System and method for universal translating from natural language questions to structured queries
CN110888966B (en) * 2018-09-06 2024-05-10 微软技术许可有限责任公司 Natural language question and answer

Also Published As

Publication number Publication date
CN116992005A (en) 2023-11-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant