CN117271735A - Method and system for optimizing man-machine conversation based on LLM model - Google Patents

Method and system for optimizing man-machine conversation based on LLM model

Info

Publication number
CN117271735A
Authority
CN
China
Prior art keywords
vector
model
text
llm
library
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311254274.9A
Other languages
Chinese (zh)
Inventor
朱亮 (Zhu Liang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinmao Cloud Technology Service Beijing Co ltd
Original Assignee
Jinmao Cloud Technology Service Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinmao Cloud Technology Service Beijing Co ltd filed Critical Jinmao Cloud Technology Service Beijing Co ltd
Priority to CN202311254274.9A priority Critical patent/CN117271735A/en
Publication of CN117271735A publication Critical patent/CN117271735A/en
Pending legal-status Critical Current

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3347 Query execution using vector based model
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Animal Behavior & Ethology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a method and a system for optimizing man-machine conversation based on an LLM model, comprising the following steps: when a user question is received, the input question is vectorized, and the question vector and text knowledge vector with the highest similarity to the input question are retrieved from a pre-constructed question vector library and text knowledge vector library, respectively. The question vector library is built by controlling an LLM large model to generate a plurality of questions from business-material text information and then converting those questions into vectors; the text knowledge vector library is built by converting the business-material text information into vectors. The retrieved question and text knowledge are spliced to form an answer-content background section; the user's input question and the background section are then combined, input into the LLM large model for semantic rearrangement, and the answer is output by the LLM large model. By this method, the technical effects of accurate handling of complex queries, high query accuracy, and low cost can be achieved simultaneously.

Description

Method and system for optimizing man-machine conversation based on LLM model
Technical Field
The present application relates to the field of human-machine conversation, and in particular to a method and system for optimizing human-machine conversation based on an LLM model.
Background
In the customer-service and marketing fields, an effective customer question-answering system plays a vital role: it provides customers with accurate, fast answers and improves customer satisfaction and service efficiency. However, constructing a high-quality question-answering system requires significant labor cost.
Prior-art question-answering systems rely mainly on a QA manual: the manual is organized into structured knowledge, and common questions, answers, and related information are sorted and classified so that questions can be searched more efficiently; user questions are then answered by extracting the text knowledge corresponding to the customer question from the structured knowledge base. Extracting this text knowledge is typically done in one of the following ways:
1. Keyword-based search: relevance is calculated and search results are returned by analyzing the content and the keywords the user inputs. This approach works well for simple queries but has limitations in handling complex queries and understanding semantics.
2. Vector-search-based methods: documents and keywords are represented with a vector space model (such as bag-of-words, TF-IDF, and the like), and the relevance of a query to a document is measured by computing the similarity between vectors. However, documents are usually segmented into short texts for searching, so what is actually retrieved are short texts; processing long texts into short texts via bag-of-words, TF-IDF, keywords, or word segmentation loses a relatively large amount of information, and the matching effect suffers accordingly.
3. Knowledge-graph-based search: a knowledge graph represents entities, attributes, and relationships, enabling accurate querying and reasoning over entities and relations. However, knowledge graphs are costly to construct and maintain and are unsuitable for scenarios requiring real-time search and dynamic updating.
4. Deep-learning search: a deep neural network learns document and keyword representations and handles tasks such as semantic understanding, entity recognition, and relation extraction well. However, training such a model requires large amounts of data and computing resources, and there is still considerable room for optimization in real-time search and dynamic-update scenarios.
Disclosure of Invention
Based on the above, and aiming at these technical problems, a method for optimizing man-machine conversation based on an LLM model is provided that simultaneously achieves the technical effects of accurate handling of complex queries, high query accuracy, and low cost.
In a first aspect, a method for optimizing man-machine conversation based on an LLM model comprises:
when a question input by a user is received, vectorizing the input question, and retrieving from a pre-constructed question vector library and text knowledge vector library, respectively, the question vector and text knowledge vector with the highest similarity to the input question; the question vector library is constructed by controlling an LLM large model to generate a plurality of questions from business-material text information and then converting the questions into vectors with a pre-trained word vector model; the text knowledge vector library is constructed by converting the business-material text information into vectors with the pre-trained word vector model;
performing text splicing on the question vector and the text knowledge vector to form an answer-content background section;
and combining the user's input question with the answer-content background section, inputting the spliced result into the LLM large model for semantic rearrangement, and outputting the answer with the LLM large model.
In the above solution, optionally, converting the plurality of questions into vectors through the pre-trained word vector model specifically comprises:
segmenting the question into a plurality of words;
vectorizing each word through the pre-trained word vector model to obtain word vectors;
and averaging the question's word vectors to obtain the question's sentence vector.
In the above solution, optionally, constructing the question vector library and the text knowledge vector library further comprises:
performing vector clustering on all vectors;
and establishing an index for each cluster of vectors by category.
In the above solution, optionally, before the plurality of questions are generated from the business-material text information by controlling the LLM large model, the business-material text information is preprocessed, comprising:
slicing the business-material text information into segments of equal word count to obtain a plurality of text segments.
In the above scheme, optionally, after text splicing is performed on the question vector and the text knowledge vector to form the answer-content background section, non-compliant words in the answer-content background section are filtered out.
In a second aspect, a system for optimizing human-machine conversation based on an LLM model comprises:
a question vector library and text knowledge vector library construction module, used for constructing the question vector library by controlling an LLM large model in advance to generate a plurality of questions from business-material text information and then converting the questions into vectors with a pre-trained word vector model, and for constructing the text knowledge vector library by converting the business-material text information into vectors with the pre-trained word vector model in advance;
a user question processing module, used for vectorizing the user's input question when a question input by the user is received;
a retrieval module, used for retrieving, from the pre-constructed question vector library and text knowledge vector library respectively, the question vector and text knowledge vector with the highest similarity to the user's input question;
an information-segment splicing module, used for performing text splicing on the question vector and text knowledge vector output by the retrieval module to form an answer-content background section, and for combining the user's input question with the answer-content background section;
and an LLM large model module, used for inputting the combined user question and answer-content background section into the LLM large model for semantic rearrangement and outputting the answer with the LLM large model.
In the above solution, optionally, the question vector library and text knowledge vector library construction module is specifically configured to:
segment the question or business-material text information into a plurality of words;
vectorize each word through the pre-trained word vector model to obtain word vectors;
and average the word vectors of the question or business-material text information to obtain a sentence vector.
In the above solution, optionally, the question vector library and text knowledge vector library construction module is further configured to:
perform vector clustering on all vectors in a vector library;
and establish an index for each cluster of vectors by category.
In a third aspect, a computer device comprises a memory storing a computer program and a processor that, when executing the computer program, implements the steps of the method for optimizing man-machine conversation based on an LLM model of the first aspect.
In a fourth aspect, a computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the method for optimizing man-machine conversation based on an LLM model of the first aspect.
The application has at least the following beneficial effects:
First, refining questions through the LLM model significantly lowers the threshold for, and raises the efficiency of, QA content construction. Second, the matched vectors are semantically rearranged and deduplicated through vector matching plus the LLM, which remedies the content-matching weaknesses of keyword/vector-model search and lifts short-text input to long-text matching quality, improving matching precision and semantic understanding. In addition, the method requires no complex knowledge graph, so construction and maintenance costs are low.
Drawings
FIG. 1 is a flow chart of a method for optimizing human-machine conversation based on LLM model according to one embodiment of the present application;
FIG. 2 is a flow chart of a method for training LLM small models according to one embodiment of the present application;
FIG. 3 is a block diagram of a module architecture of a system for optimizing human-machine conversations based on the LLM model in one embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
In one embodiment, as shown in fig. 1, a method for optimizing man-machine conversation based on an LLM model is provided, comprising the following steps:
step S101, when a problem input by a user is received, vectorizing the input problem of the user, and searching a problem vector and a text knowledge vector which have the highest similarity with the input problem of the user from a pre-constructed problem vector library and a text knowledge vector library respectively; the construction of the problem vector library is to generate a plurality of problems based on business material text information by controlling a LLM large model, and then perform vector conversion on the problems based on a pre-training word vector model; the text knowledge vector library is constructed by carrying out vector conversion on text information of business materials through a pre-training word vector model;
specifically, the construction of the problem vector library firstly needs to perform data preprocessing on business data text information, wherein the business data text information comprises a product specification, a policy file and a business flow file; the data preprocessing steps are as follows:
a. and processing special character strings for the problems of the specifications, the policy files and the business processes. For example: the html\css label of the webpage file is removed, the picture information is removed, the characters of characters are removed, and the like;
b. the text is segmented, and the property introduction file, the policy file and the like in the property marketing field are generally segmented into 600 words to be one section aiming at knowledge density.
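The fixed-length slicing step above can be sketched as follows. This is a minimal illustration: the 600-character segment size comes from the description, while the `slice_text` function name and the non-overlapping slicing strategy are assumptions.

```python
def slice_text(text: str, seg_len: int = 600) -> list[str]:
    """Slice cleaned business-material text into equal-length segments."""
    return [text[i:i + seg_len] for i in range(0, len(text), seg_len)]

# every segment except possibly the last has exactly seg_len characters
segments = slice_text("some long document " * 200, seg_len=600)
```

For Chinese text, slicing by character count is a reasonable stand-in for the "same word number" rule; a production pipeline might instead cut at sentence boundaries near the 600-word mark.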
Second, after data preprocessing of the business-material text information is complete, QA content is constructed based on the LLM model. Unlike manually constructing QA pairs in question-1-to-answer-1 form to achieve exact matching, the LLM constructs QA in an N-to-1 mode (N questions to 1 shard), i.e., it determines which questions the current shard can answer. At present, either a trillion-parameter LLM or a billion-parameter small model can be used for question extraction. The main modes are as follows:
a: Using a trillion-parameter LLM: in the early stage, a prompt-instruction mode can be adopted, directly instructing the large model to generate question output based on the text segments and iterating toward a better-performing prompt;
b: If data confidentiality is involved and a public large-model API cannot be called, a billion-parameter small model can be deployed for extraction. However, because small models generalize less well and understand and adapt to prompts worse than large models, a certain amount of fine-tuning is needed to approach large-model performance. Experience suggests that roughly 3,000-5,000 high-quality corpus entries achieve an acceptable effect. The steps for training the LLM small model are shown in fig. 2.
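A prompt-instruction call of the kind described in mode (a) might look like the sketch below. The prompt wording and the `build_question_prompt` helper are illustrative assumptions, and the actual large-model API call is left abstract as a placeholder.

```python
def build_question_prompt(segment: str, n_questions: int = 5) -> str:
    """Build an instruction prompt asking the LLM which questions a shard can answer (N-to-1 QA)."""
    return (
        f"Read the following business-material segment and list {n_questions} "
        "questions that this segment can answer, one per line.\n\n"
        f"Segment:\n{segment}"
    )

# questions = call_llm(build_question_prompt(seg))  # call_llm is a hypothetical API wrapper
prompt = build_question_prompt("Phase 2 of the project opens in June; the down payment is 30%.")
```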
Then, the questions are vectorized with a pre-trained word vector model; common pre-trained models include word2vec, BERT, and RoBERTa. The actual steps are as follows:
a: Segment the question into words, for example: "I love Beijing" is cut into "I", "love", and "Beijing";
b: Vectorize each word with the pre-trained model, for example: convert "I" into a vector a = [0.233, 0.142, …];
c: Average the word vectors to obtain the question's sentence vector, i.e., (a + b + c) / 3.
Finally, all question vectors are clustered, with similar vectors grouped together, and question vector library A is built; an index is established for library A by cluster. The number of clusters is a hyperparameter and, when the data volume is large, must be tuned against the number of questions: larger data sets may require more clusters, smaller data sets fewer.
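The clustering-plus-index step can be sketched with a few iterations of k-means in plain numpy. The iteration count, toy data, and function names are assumptions; a production system would typically use a library such as scikit-learn or faiss instead.

```python
import numpy as np

def build_cluster_index(vectors: np.ndarray, k: int, iters: int = 10, seed: int = 0):
    """Cluster question vectors with naive k-means and index vector ids by cluster."""
    rng = np.random.default_rng(seed)
    centroids = vectors[rng.choice(len(vectors), size=k, replace=False)]
    for _ in range(iters):
        # assign each vector to its nearest centroid
        labels = np.argmin(
            np.linalg.norm(vectors[:, None] - centroids[None], axis=2), axis=1)
        for c in range(k):  # recompute each non-empty centroid
            if np.any(labels == c):
                centroids[c] = vectors[labels == c].mean(axis=0)
    index = {c: np.flatnonzero(labels == c).tolist() for c in range(k)}
    return centroids, index

# two well-separated synthetic clusters of question vectors
vecs = np.vstack([np.random.default_rng(1).normal(loc=m, size=(5, 3)) for m in (0.0, 5.0)])
centroids, index = build_cluster_index(vecs, k=2)
```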
The text knowledge vector library is built from the preprocessed business-material text information: the processed text is vectorized with the pre-trained word model to construct text knowledge vector library B, after which a search index is built; the vectorization and index construction use the same methods as for question vector library A.
After question vector library A and text knowledge vector library B have been built, when a question input by the user is received, the input question is vectorized to obtain input question vector q, using the same vectorization as for the generated questions when library A was built. Then, the question vector and knowledge-block vector with the highest similarity to q are queried from libraries A and B, taking top-1 from each to obtain a1 and b1. Generally the single most similar question vector and knowledge-block vector are used; however, if the business-material text used to build the two libraries is sparse, the top few vectors may be selected instead, with the specific number adapted to the actual size of the business-material text.
Specifically, when querying for a1 and b1, the most similar cluster can first be located in library A and library B, after which the similarity of each vector within that cluster is computed, yielding the most similar question vector a1 and knowledge-block vector b1.
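The two-stage lookup just described (nearest cluster first, then per-vector similarity within it) can be sketched as follows. The toy centroids, cluster index, and choice of cosine similarity are assumptions consistent with the description above.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def query_top1(q: np.ndarray, vectors: np.ndarray, centroids: np.ndarray, index: dict):
    """Locate the nearest cluster, then the most similar vector inside that cluster."""
    c = int(np.argmin(np.linalg.norm(centroids - q, axis=1)))
    best = max(index[c], key=lambda i: cosine(q, vectors[i]))
    return best, cosine(q, vectors[best])

# toy library: two clusters of two vectors each, pre-indexed by cluster id
vectors = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
centroids = np.array([[0.95, 0.05], [0.05, 0.95]])
index = {0: [0, 1], 1: [2, 3]}
best_id, sim = query_top1(np.array([1.0, 0.05]), vectors, centroids, index)
```

Restricting the exhaustive similarity computation to one cluster is what makes the index pay off: only the members of the nearest cluster are scored, not the whole library.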
Step S102: performing text splicing on the question vector and the text knowledge vector to form an answer-content background section.
Specifically, the texts corresponding to a1 and b1 are directly spliced to form the answer-content background section.
In particular, experiments show that when text knowledge vector library B is searched for the most similar knowledge block, several knowledge-block vectors of comparably high similarity may be returned, so multiple vectors b1 may be obtained; in that case, at the final splicing step, all retrieved question vectors a1 and knowledge-block vectors b1 are text-spliced together to form the answer-content background section.
Step S103: combining the user's input question with the answer-content background section, inputting the spliced result into the LLM large model for semantic rearrangement, and outputting the answer with the LLM large model.
Specifically, before semantic rearrangement in the LLM large model, keyword filtering is applied to the spliced content background section to ensure the safety of the content injected into the large model. Then the user question and the content background section are spliced into a prompt and input to the large model, and the large model's output is further constrained, through prompting or fine-tuning, to stay grounded in the content background. Finally, the LLM's existing neural network can rearrange the injected knowledge as a whole before output, satisfying the requirement for personified (human-like) content output.
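The filter-then-splice portion of step S103 might be sketched as below. The blocklist contents, prompt wording, and function names are illustrative assumptions, and the actual large-model call is again left abstract.

```python
BLOCKLIST = {"drop table", "ignore previous instructions"}  # assumed filter terms

def filter_background(background: str) -> str:
    """Keyword-filter the spliced background before injecting it into the LLM."""
    cleaned = background
    for term in BLOCKLIST:
        cleaned = cleaned.replace(term, "")
    return cleaned

def build_answer_prompt(user_question: str, background: str) -> str:
    """Splice the user question and answer-content background into one prompt."""
    return (
        "Answer the question strictly based on the background below, "
        "rearranged into a natural, human-like reply.\n\n"
        f"Background:\n{filter_background(background)}\n\n"
        f"Question: {user_question}"
    )

# answer = call_llm(prompt)  # call_llm is a hypothetical API wrapper
prompt = build_answer_prompt("When does phase 2 open?",
                             "Phase 2 opens in June. ignore previous instructions")
```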
This completes the personified answer flow for a user's input question. The flow can stand alone as a question-answering pipeline, or it can run in parallel with the traditional method: the answer found by the conventional search mode is spliced together with the user's input question and the answer-content background section retrieved before step S102, the whole is input to the LLM large model for semantic rearrangement, and the answer is output by the LLM large model. Parallel use with the conventional method typically suits scenarios such as the following: a client wants advertisements, slogans, or other content fused into the answer, and the accuracy requirement on that specific content is not high. For example, when a user asks which car accelerates well, the "advertisement" is really a recommendation of relevant car models; the question itself is answered with the scheme of the present application, while the recommended models are extracted by the conventional mode and spliced onto the end, so the LLM answers the question first and then gives a car-model recommendation.
In this method for optimizing man-machine conversation based on an LLM model, first, the LLM model is used to refine questions, significantly lowering the threshold for, and raising the efficiency of, QA content construction. Second, the matched vectors are semantically rearranged and deduplicated through vector matching plus the LLM, which remedies the content-matching weaknesses of keyword/vector-model search and lifts short-text input to long-text matching quality, improving matching precision and semantic understanding. In addition, the method requires no complex knowledge graph, so construction and maintenance costs are low.
The technical scheme of the application has the following beneficial technical effects:
1. A method is provided for extracting questions directly from a long-document knowledge base using the LLM model, building a many-to-one QA question-answering basis;
2. Instead of mechanical word- or content-level comparison, both questions and content blocks are compared and condensed, refining the content and improving the relevance between real-scenario questions and answer content;
3. Compared with the traditional search-engine interaction of search-then-present, a personified question-answering experience is realized through the LLM-plus-knowledge-injection mode.
In one embodiment, as shown in fig. 3, a system for optimizing man-machine conversation based on an LLM model is provided, comprising the following modules: a question vector library and text knowledge vector library construction module, a user question processing module, a retrieval module, an information-segment splicing module, and an LLM large model module, wherein:
the question vector library and text knowledge vector library construction module is used for constructing the question vector library by controlling an LLM large model in advance to generate a plurality of questions from business-material text information and then converting the questions into vectors with a pre-trained word vector model, and for constructing the text knowledge vector library by converting the business-material text information into vectors with the pre-trained word vector model in advance;
the user question processing module is used for vectorizing the user's input question when a question input by the user is received;
the retrieval module is used for retrieving, from the pre-constructed question vector library and text knowledge vector library respectively, the question vector and text knowledge vector with the highest similarity to the user's input question;
the information-segment splicing module is used for performing text splicing on the question vector and text knowledge vector output by the retrieval module to form an answer-content background section;
and the LLM large model module is used for combining the user's input question with the answer-content background section, inputting the spliced result into the LLM large model for semantic rearrangement, and outputting the answer with the LLM large model.
In one embodiment, the question vector library and text knowledge vector library construction module is specifically configured to:
segment the question or business-material text information into a plurality of words;
vectorize each word through the pre-trained word vector model to obtain word vectors;
and average the word vectors of the question or business-material text information to obtain a sentence vector.
In one embodiment, the question vector library and text knowledge vector library construction module is further configured to:
perform vector clustering on all vectors in a vector library;
and establish an index for each cluster of vectors by category.
For the specific limitations of the system for optimizing man-machine conversation based on the LLM model, see the limitations of the corresponding method above; details are not repeated here. The modules in the system may be implemented wholly or partly in software, hardware, or a combination thereof. The modules may be embedded in hardware in, or independent of, a processor in the computer device, or stored as software in a memory in the computer device, so that the processor can invoke and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by a processor, implements a method for optimizing human-machine conversations based on LLM models as described above.
In an embodiment, a computer-readable storage medium is also provided, on which a computer program is stored that implements all or part of the flow of the methods of the above embodiments.
Those skilled in the art will appreciate that implementing all or part of the above methods may be accomplished by a computer program stored on a non-transitory computer-readable storage medium which, when executed, may comprise the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, and the like. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, any combination containing no contradiction should be considered within the scope of this description.
The above examples represent only a few embodiments of the present application; they are described in some detail, but are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, and such modifications and improvements fall within its scope of protection. Accordingly, the scope of protection of the present application is determined by the appended claims.

Claims (10)

1. A method for optimizing man-machine conversation based on an LLM model, the method comprising:
when a question input by a user is received, vectorizing the input question, and retrieving, from a pre-constructed question vector library and a pre-constructed text knowledge vector library respectively, the question vector and the text knowledge vector with the highest similarity to the input question; wherein the question vector library is constructed by controlling an LLM large model to generate a plurality of questions based on business material text information and then converting those questions into vectors with a pre-trained word vector model, and the text knowledge vector library is constructed by converting the business material text information into vectors with the pre-trained word vector model;
splicing the text corresponding to the question vector and the text knowledge vector to form an answer content background segment;
and combining the user's input question with the answer content background segment, inputting the combined text into the LLM large model for semantic rearrangement, and outputting the answer with the LLM large model.
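The pipeline of claim 1 — embed the question, retrieve the nearest entries from the two vector libraries, splice them into a background segment, and hand the result to the LLM — can be sketched as follows. This is a minimal illustration, not the patented implementation: the three-dimensional word vectors, the `vocab` dictionary, and the library contents are all hypothetical stand-ins for a real pre-trained word vector model and real business material.

```python
import numpy as np

def embed(text, vocab):
    # Toy stand-in for the pre-trained word vector model:
    # average the vectors of the known words in the text.
    vecs = [vocab[w] for w in text.split() if w in vocab]
    return np.mean(vecs, axis=0)

def retrieve(query_vec, library):
    # Return the library entry whose vector is most cosine-similar to the query.
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(library, key=lambda item: cos(query_vec, item["vec"]))

# Hypothetical 3-dimensional word vectors standing in for a trained model.
vocab = {"reset": np.array([1.0, 0.0, 0.0]),
         "password": np.array([0.0, 1.0, 0.0]),
         "billing": np.array([0.0, 0.0, 1.0])}

question_library = [
    {"text": "how to reset password", "vec": embed("reset password", vocab)},
    {"text": "billing question", "vec": embed("billing", vocab)},
]
knowledge_library = [
    {"text": "Passwords can be reset from the account page.",
     "vec": embed("reset password", vocab)},
    {"text": "Invoices are issued monthly.", "vec": embed("billing", vocab)},
]

user_question = "reset password"
q_vec = embed(user_question, vocab)
best_q = retrieve(q_vec, question_library)
best_k = retrieve(q_vec, knowledge_library)

# Splice the retrieved texts into a background segment, then combine it with
# the user question as the prompt handed to the LLM for semantic rearrangement.
background = best_q["text"] + "\n" + best_k["text"]
prompt = f"Background:\n{background}\n\nQuestion: {user_question}"
```

In a real deployment the final `prompt` would be sent to the LLM large model, which rewrites the background into a fluent answer.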
2. The method of claim 1, wherein converting the plurality of questions into vectors with the pre-trained word vector model specifically comprises:
segmenting each question into a plurality of words;
vectorizing the words with the pre-trained word vector model to obtain word vectors;
and averaging the word vectors of the question to obtain a sentence vector for the question.
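The averaging step of claim 2 reduces to a single mean over the word vectors. A minimal sketch, assuming hypothetical 2-dimensional word vectors in place of a real pre-trained model:

```python
import numpy as np

# Hypothetical word vectors (2-dimensional for illustration only).
word_vectors = {"open": np.array([1.0, 0.0]),
                "account": np.array([0.0, 1.0])}

def sentence_vector(words, model):
    # Average the word vectors of the segmented question to get a sentence vector.
    return np.mean([model[w] for w in words if w in model], axis=0)

vec = sentence_vector(["open", "account"], word_vectors)
# → array([0.5, 0.5])
```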
3. The method of claim 1, wherein the construction of the question vector library and the text knowledge vector library further comprises:
clustering all vectors;
and establishing an index for each cluster of vectors by class.
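Clustering the library and indexing by cluster, as in claim 3, lets a query scan only the nearest cluster instead of the whole library. A sketch with a minimal hand-rolled k-means (the sample vectors are hypothetical; a production system would more likely use a library such as FAISS or scikit-learn):

```python
import numpy as np

def kmeans(vectors, k, iters=10, seed=0):
    # Minimal k-means: returns centroids and a cluster label per vector.
    rng = np.random.default_rng(seed)
    centroids = vectors[rng.choice(len(vectors), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(vectors[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = vectors[labels == j].mean(axis=0)
    return centroids, labels

vectors = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
centroids, labels = kmeans(vectors, k=2)

# Build the per-class index: cluster id -> list of vector ids.
index = {}
for vec_id, cluster_id in enumerate(labels):
    index.setdefault(int(cluster_id), []).append(vec_id)
# At query time, only the cluster whose centroid is nearest to the
# query vector needs to be searched exhaustively.
```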
4. The method of claim 1, wherein, before the LLM large model is controlled to generate a plurality of questions based on the business material text information, the business material text information is preprocessed by:
slicing the business material text information into segments of equal length, obtaining a plurality of text segments.
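The fixed-length slicing of claim 4 is a simple stride over the text (the segment size of 4 below is an arbitrary illustrative value; the last segment may be shorter):

```python
def slice_text(text, size):
    # Slice the business material text into fixed-length segments.
    return [text[i:i + size] for i in range(0, len(text), size)]

segments = slice_text("abcdefghij", 4)
# → ["abcd", "efgh", "ij"]
```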
5. The method of claim 1, wherein, after the text corresponding to the question vector and the text knowledge vector is spliced to form the answer content background segment, non-canonical words in the answer content background segment are filtered out.
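One plausible reading of the filtering in claim 5 is a stop-list pass over the background segment. The stop-list entries below are hypothetical; the patent does not enumerate which words count as non-canonical.

```python
import re

STOPLIST = {"umm", "lol"}  # hypothetical non-canonical words to filter

def filter_noncanonical(segment):
    # Drop stop-listed tokens and rejoin with single spaces.
    tokens = [t for t in re.split(r"\s+", segment.strip())
              if t.lower() not in STOPLIST]
    return " ".join(tokens)

out = filter_noncanonical("umm the password lol can be reset")
# → "the password can be reset"
```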
6. A system for optimizing man-machine conversation based on an LLM model, the system comprising:
a question vector library and text knowledge vector library construction module, configured to construct the question vector library in advance by controlling an LLM large model to generate a plurality of questions based on business material text information and then converting the questions into vectors with a pre-trained word vector model, and to construct the text knowledge vector library in advance by converting the business material text information into vectors with the pre-trained word vector model;
a user question processing module, configured to vectorize a question input by a user when the question is received;
a retrieval module, configured to retrieve, from the pre-constructed question vector library and text knowledge vector library respectively, the question vector and the text knowledge vector with the highest similarity to the user's input question;
an information segment splicing module, configured to splice the text corresponding to the question vector and the text knowledge vector output by the retrieval module to form an answer content background segment, and to combine the user's input question with the answer content background segment;
and an LLM model module, configured to input the combined user question and answer content background segment into the LLM model for semantic rearrangement and to output the answer with the LLM model.
7. The system for optimizing man-machine conversation based on an LLM model of claim 6, wherein the question vector library and text knowledge vector library construction module is specifically configured to:
segment the question or business material text information into a plurality of words;
vectorize the words with the pre-trained word vector model to obtain word vectors;
and average the word vectors of the question or business material text information to obtain sentence vectors.
8. The system for optimizing man-machine conversation based on an LLM model of claim 6, wherein the question vector library and text knowledge vector library construction module is further configured to:
cluster all vectors in a vector library;
and establish an index for each cluster of vectors by class.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 5.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 5.
CN202311254274.9A 2023-09-26 2023-09-26 Method and system for optimizing man-machine conversation based on LLM model Pending CN117271735A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311254274.9A CN117271735A (en) 2023-09-26 2023-09-26 Method and system for optimizing man-machine conversation based on LLM model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311254274.9A CN117271735A (en) 2023-09-26 2023-09-26 Method and system for optimizing man-machine conversation based on LLM model

Publications (1)

Publication Number Publication Date
CN117271735A true CN117271735A (en) 2023-12-22

Family

ID=89210093

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311254274.9A Pending CN117271735A (en) 2023-09-26 2023-09-26 Method and system for optimizing man-machine conversation based on LLM model

Country Status (1)

Country Link
CN (1) CN117271735A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117493529A (en) * 2023-12-26 2024-02-02 苏州元脑智能科技有限公司 Anthropomorphic dialogue method and device based on natural language model and electronic equipment
CN117493529B (en) * 2023-12-26 2024-03-15 苏州元脑智能科技有限公司 Anthropomorphic dialogue method and device based on natural language model and electronic equipment
CN117520523A (en) * 2023-12-29 2024-02-06 中邮消费金融有限公司 Data processing method, device, equipment and storage medium
CN117520523B (en) * 2023-12-29 2024-03-29 中邮消费金融有限公司 Data processing method, device, equipment and storage medium
CN117743390A (en) * 2024-02-20 2024-03-22 证通股份有限公司 Query method and system for financial information and storage medium
CN117743390B (en) * 2024-02-20 2024-05-28 证通股份有限公司 Query method and system for financial information and storage medium

Similar Documents

Publication Publication Date Title
CN111753060B (en) Information retrieval method, apparatus, device and computer readable storage medium
CN117271735A (en) Method and system for optimizing man-machine conversation based on LLM model
AU2018349276A1 (en) Methods and system for semantic search in large databases
CN110008308B (en) Method and device for supplementing information for user question
CN113342958B (en) Question-answer matching method, text matching model training method and related equipment
CN113761868B (en) Text processing method, text processing device, electronic equipment and readable storage medium
CN112115252A (en) Intelligent auxiliary writing processing method and device, electronic equipment and storage medium
CN113626571A (en) Answer sentence generating method and device, computer equipment and storage medium
CN116522905B (en) Text error correction method, apparatus, device, readable storage medium, and program product
CN113127617A (en) Knowledge question answering method of general domain knowledge graph, terminal equipment and storage medium
CN109684357B (en) Information processing method and device, storage medium and terminal
CN116628146A (en) FAQ intelligent question-answering method and system in financial field
CN116186219A (en) Man-machine dialogue interaction method, system and storage medium
CN114780700A (en) Intelligent question-answering method, device, equipment and medium based on machine reading understanding
CN113761081A (en) Method and system for carrying out multi-dimensional combined retrieval on enterprise information
CN117851577B (en) Government service question-answering method based on knowledge graph enhanced large language model
CN116992111B (en) Data processing method, device, electronic equipment and computer storage medium
CN117688163B (en) Online intelligent question-answering method and device based on instruction fine tuning and retrieval enhancement generation
CN116467347B (en) Stock questioning and answering method
Feng et al. Intelligent question answering system based on entrepreneurial incubation knowledge graph
US20240095268A1 (en) Productivity improvements in document comprehension
CN117196031A (en) Method and system for constructing customer demand cognition system
CN118093839A (en) Knowledge operation question-answer dialogue processing method and system based on deep learning
CN117951274A (en) RAG knowledge question-answering method and device based on fusion vector and keyword retrieval
CN117932028A (en) Question-answering system based on NL2SQL

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination