CN117493529B - Anthropomorphic dialogue method and device based on natural language model and electronic equipment - Google Patents

Anthropomorphic dialogue method and device based on natural language model and electronic equipment

Info

Publication number
CN117493529B
Authority
CN
China
Prior art keywords
text information
information
dialogue
domain knowledge
natural language
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311801484.5A
Other languages
Chinese (zh)
Other versions
CN117493529A (en)
Inventor
Zhang Hui (张辉)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Metabrain Intelligent Technology Co Ltd
Original Assignee
Suzhou Metabrain Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Metabrain Intelligent Technology Co Ltd filed Critical Suzhou Metabrain Intelligent Technology Co Ltd
Priority to CN202311801484.5A
Publication of CN117493529A
Application granted
Publication of CN117493529B
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 - Querying
    • G06F16/332 - Query formulation
    • G06F16/3329 - Natural language query formulation or dialogue systems
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 - Querying
    • G06F16/3331 - Query processing
    • G06F16/334 - Query execution
    • G06F16/3344 - Query execution using natural language analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 - Querying
    • G06F16/338 - Presentation of query results
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34 - Browsing; Visualisation therefor
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application provides an anthropomorphic dialogue method and device based on a natural language model, and an electronic device, and relates to the field of computer technology. The method comprises the following steps: acquiring first text information and second text information in a dialogue process, and determining third text information related to the technical field of the first text information and fourth text information corresponding to the topic scene of the first text information; combining the first text information, the second text information, the third text information and the fourth text information to obtain target text information, and inputting the target text information into a natural language model to obtain target reply content corresponding to the user's current question. The anthropomorphic dialogue method and device based on the natural language model improve the dialogue generation capability and dialogue quality of a chatbot driven by a natural language model (LLM) on questions in professional domains, while making the chatbot's dialogue more anthropomorphic.

Description

Anthropomorphic dialogue method and device based on natural language model and electronic equipment
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to an anthropomorphic dialogue method and apparatus based on a natural language model, and an electronic device.
Background
With the development of large language model (Large Language Model, LLM) technology, a new generation of chatbots driven by LLMs is developing rapidly. Compared with traditional dialogue robots, their dialogue is more natural and coherent, better matches human ways of thinking and habits of expression, shows a higher level of intelligence, and handles the understanding and answering of various types of questions better.
However, LLM-driven chatbots in the related art are mainly oriented to casual chat and open-domain knowledge question answering. Because most of an LLM's pre-training data comes from the internet, its professional domain knowledge is insufficient, and it may fail to understand domain knowledge and professional terminology. It also lacks dialogue strategies for specific application scenarios, such as job interviews and customer service; since the LLM has no scenario-specific dialogue-strategy capability, unreasonable and un-anthropomorphic dialogue easily occurs.
Disclosure of Invention
The purpose of the application is to provide an anthropomorphic dialogue method and device based on a natural language model, which improve the dialogue generation capability and dialogue quality of a chatbot driven by a natural language model on questions in professional domains, while making the chatbot's dialogue more anthropomorphic.
The application provides a natural language model-based anthropomorphic dialogue method, which comprises the following steps:
acquiring first text information in a dialogue process and second text information in the dialogue process, and determining third text information related to the technical field of the first text information and fourth text information corresponding to the topic scene of the first text information; the first text information includes: a user's current question; the second text information includes: context information of the current problem of the user; the third text information includes: a plurality of domain knowledge; the fourth text information includes: a dialog prompt template; and combining the first text information, the second text information, the third text information and the fourth text information to obtain target text information, and inputting the target text information into a natural language model to obtain target reply content corresponding to the current problem of the user.
Optionally, the acquiring the second text information in the dialogue process includes: acquiring historical dialogue information in a dialogue process; and screening out dialogue information of a plurality of rounds from the historical dialogue information through a long-short-time memory mechanism, and determining the dialogue information of the plurality of rounds as the context information.
Optionally, the screening the dialogue information of multiple rounds from the historical dialogue information through a long-short-time memory mechanism, and determining the dialogue information of the multiple rounds as the context information includes: preprocessing the historical dialogue information to obtain processed historical dialogue information; the preprocessing operation is used for merging a plurality of continuous dialogue information of the same object into one dialogue information; screening dialogue information of a plurality of dialogue turns adjacent to the current problem of the user from the processed historical dialogue information based on the current problem of the user as short-term memory dialogue information, and screening dialogue information of a plurality of dialogue turns related to the current problem of the user from the processed historical dialogue information based on the current problem of the user as long-term memory dialogue information; combining the long-time memory dialogue information and the short-time memory dialogue information to obtain the context information; the dialogue content included in the long-term memory dialogue information is the dialogue content before the dialogue content included in the short-term memory dialogue information.
Optionally, the determining a plurality of domain knowledge related to the technical domain of the current problem of the user includes: converting the first text information into a first semantic vector, screening a plurality of pieces of result information matched with the first semantic vector from the domain knowledge vector database based on the first semantic vector, and determining domain knowledge corresponding to the plurality of pieces of result information as the plurality of pieces of domain knowledge.
Optionally, the converting the first text information into a first semantic vector, and screening a plurality of result information matched with the first semantic vector from the domain knowledge vector database based on the first semantic vector, includes: converting the first text information into the first semantic vector by using a preset conversion algorithm, and calculating the similarity between the first semantic vector and the semantic vector of each domain knowledge in the domain knowledge vector database based on a preset similarity algorithm; determining a plurality of semantic vectors with similarity values meeting preset similarity in the domain knowledge vector database as the plurality of result information; wherein the preset similarity algorithm comprises any one of the following: euclidean distance algorithm and inner product algorithm.
Optionally, the calculating, based on a preset similarity algorithm, the similarity between the first semantic vector and the semantic vector of each domain knowledge in the domain knowledge vector database includes: calculating the similarity of the first semantic vector and the semantic vector of each domain knowledge in the domain knowledge vector database based on the following formula I:
D(Q, V) = ||Q - V||₂ = sqrt( Σᵢ (Qᵢ - Vᵢ)² )   (formula I)
wherein Q is the first semantic vector, and V is the semantic vector of any domain knowledge in the domain knowledge vector database.
Optionally, the calculating, based on a preset similarity algorithm, the similarity between the first semantic vector and the semantic vector of each domain knowledge in the domain knowledge vector database includes: calculating the similarity of the first semantic vector and the semantic vector of each domain knowledge in the domain knowledge vector database based on the following formula II:
S(Q, V) = Q · V = Σᵢ Qᵢ Vᵢ   (formula II)
wherein Q is the first semantic vector, and V is the semantic vector of any domain knowledge in the domain knowledge vector database.
Optionally, the determining fourth text information corresponding to the topic scene of the first text information includes: inputting the first text information into a scene classification model, and determining a topic scene corresponding to the first text information; screening a target prompt template of which the topic scene is matched with the topic scene corresponding to the first text information from a dialogue prompt template library based on the topic scene corresponding to the first text information; one topic scene corresponds to one dialogue prompt template.
Optionally, the merging the first text information, the second text information, the third text information and the fourth text information to obtain target text information includes: combining the first text information, the second text information, the third text information and the fourth text information to obtain text information to be processed; executing a text word segmentation operation on the text information to be processed by using a preset word segmentation algorithm, and calculating the first word segmentation number corresponding to the text information to be processed after word segmentation; in the case that the first word segmentation number of the text information to be processed is larger than the maximum word segmentation number allowed to be input by the natural language model, calculating the difference between the first word segmentation number and the maximum word segmentation number to obtain a word segmentation number difference; and truncating the text information to be processed from the head by the word segmentation number difference to obtain first sub-text information close to the head and second sub-text information close to the tail, and determining the second sub-text information as the target text information.
Optionally, after performing text word segmentation on the text information to be processed by using a preset word segmentation algorithm and calculating the first word segmentation number corresponding to the text information to be processed after word segmentation, the method further includes: and determining the text information to be processed as the target text information under the condition that the first word segmentation number of the text information to be processed is smaller than or equal to the maximum word segmentation number allowed to be input by the natural language model.
Optionally, the merging the first text information, the second text information, the third text information and the fourth text information to obtain target text information includes: splicing and combining the first text information, the second text information, the third text information and the fourth text information based on a target splicing order matched with the plurality of domain knowledge and the dialogue prompt template contained in the fourth text information, so as to obtain the target text information; output results of the natural language model corresponding to target text information obtained with different splicing orders are not identical; the target splicing order is the splicing order, among a plurality of splicing orders, whose corresponding natural language model output receives the highest user satisfaction score; the plurality of splicing orders are splicing orders of the first text information, the second text information, the third text information and the fourth text information.
Optionally, before the first text information is input into the scene classification model and the topic scene corresponding to the first text information is determined, the method further includes: acquiring a history dialogue record between a user and a dialogue robot as a training set to be marked, marking each sample in the training set to be marked based on a preset scene keyword, and obtaining a target training set after marking; and training the pre-training model by using the target training set to obtain the scene classification model.
The application also provides a natural language model-based anthropomorphic dialogue device, which comprises:
the information acquisition module is used for acquiring first text information in a dialogue process and second text information in the dialogue process, and determining third text information related to the technical field of the first text information and fourth text information corresponding to the topic scene of the first text information; the first text information includes: a user's current question; the second text information includes: context information of the current problem of the user; the third text information includes: a plurality of domain knowledge; the fourth text information includes: a dialog prompt template; and the information processing module is used for combining the first text information, the second text information, the third text information and the fourth text information to obtain target text information, inputting the target text information into a natural language model, and obtaining target reply content corresponding to the current problem of the user.
Optionally, the information obtaining module is specifically configured to obtain historical dialogue information in a dialogue process; the information acquisition module is specifically further configured to screen out session information of multiple rounds from the historical session information through a long short-time memory mechanism, and determine the session information of the multiple rounds as the context information.
Optionally, the information obtaining module is specifically configured to perform a preprocessing operation on the historical dialogue information to obtain processed historical dialogue information; the preprocessing operation is used for merging a plurality of continuous dialogue information of the same object into one dialogue information; the information acquisition module is specifically configured to screen dialogue information of a plurality of dialogue rounds adjacent to the user current problem from the processed historical dialogue information based on the user current problem as short-term memory dialogue information, and screen dialogue information of a plurality of dialogue rounds related to the user current problem from the processed historical dialogue information based on the user current problem as long-term memory dialogue information; the information acquisition module is specifically configured to combine the long-term memory session information and the short-term memory session information to obtain the context information; the dialogue content included in the long-term memory dialogue information is the dialogue content before the dialogue content included in the short-term memory dialogue information.
Optionally, the information obtaining module is specifically configured to convert the first text information into a first semantic vector, screen a plurality of result information matched with the first semantic vector from the domain knowledge vector database based on the first semantic vector, and determine domain knowledge corresponding to the plurality of result information as the plurality of domain knowledge.
Optionally, the information obtaining module is specifically configured to convert the first text information into the first semantic vector by using a preset conversion algorithm, and calculate a similarity between the first semantic vector and a semantic vector of each domain knowledge in the domain knowledge vector database based on a preset similarity algorithm; the information acquisition module is specifically configured to determine, as the plurality of result information, a plurality of semantic vectors whose similarity values satisfy a preset similarity in the domain knowledge vector database; wherein the preset similarity algorithm comprises any one of the following: euclidean distance algorithm and inner product algorithm.
Optionally, the information obtaining module is specifically configured to calculate the similarity between the first semantic vector and the semantic vector of each domain knowledge in the domain knowledge vector database based on the following formula:
D(Q, V) = ||Q - V||₂ = sqrt( Σᵢ (Qᵢ - Vᵢ)² )   (formula I)
wherein Q is the first semantic vector, and V is the semantic vector of any domain knowledge in the domain knowledge vector database.
Optionally, the information obtaining module is specifically configured to calculate, based on the following formula two, similarity between the first semantic vector and the semantic vector of each domain knowledge in the domain knowledge vector database:
S(Q, V) = Q · V = Σᵢ Qᵢ Vᵢ   (formula II)
wherein Q is the first semantic vector, and V is the semantic vector of any domain knowledge in the domain knowledge vector database.
Optionally, the information obtaining module is specifically configured to input the first text information into a scene classification model, and determine a topic scene corresponding to the first text information; the information acquisition module is specifically further configured to screen out a target prompt template, in which the topic scene is matched with the topic scene corresponding to the first text information, from a dialogue prompt template library based on the topic scene corresponding to the first text information; one topic scene corresponds to one dialogue prompt template.
Optionally, the information processing module is specifically configured to combine the first text information, the second text information, the third text information and the fourth text information to obtain text information to be processed; the information processing module is specifically configured to execute a text word segmentation operation on the text information to be processed by using a preset word segmentation algorithm, and calculate the first word segmentation number corresponding to the text information to be processed after word segmentation; the information processing module is specifically configured to, in the case that the first word segmentation number of the text information to be processed is larger than the maximum word segmentation number allowed to be input by the natural language model, calculate the difference between the first word segmentation number and the maximum word segmentation number to obtain a word segmentation number difference; and truncate the text information to be processed from the head by the word segmentation number difference to obtain first sub-text information close to the head and second sub-text information close to the tail, and determine the second sub-text information as the target text information.
Optionally, the information processing module is specifically further configured to determine the text information to be processed as the target text information in the case that the first word segmentation number of the text information to be processed is smaller than or equal to the maximum word segmentation number allowed to be input by the natural language model.
Optionally, the information processing module is specifically configured to splice and combine the first text information, the second text information, the third text information and the fourth text information based on a target splicing order matched with the plurality of domain knowledge and the dialogue prompt template contained in the fourth text information, so as to obtain the target text information; output results of the natural language model corresponding to target text information obtained with different splicing orders are not identical; the target splicing order is the splicing order, among a plurality of splicing orders, whose corresponding natural language model output receives the highest user satisfaction score; the plurality of splicing orders are splicing orders of the first text information, the second text information, the third text information and the fourth text information.
Optionally, the apparatus further comprises: a training module; the acquisition module is further used for acquiring a history dialogue record between a user and the dialogue robot as a training set to be marked, marking each sample in the training set to be marked based on a preset scene keyword, and obtaining a target training set after marking; and the training module is used for training the pre-training model by using the target training set to obtain the scene classification model.
The present application also provides a computer program product comprising computer programs/instructions which when executed by a processor implement the steps of the natural language model based anthropomorphic dialog method as described in any of the above.
The application also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the natural language model based anthropomorphic dialog method as described in any of the above when the program is executed.
The present application also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the natural language model based anthropomorphic dialog method as described in any of the above.
According to the anthropomorphic dialogue method and device based on the natural language model provided by the application, first text information and second text information in a dialogue process are acquired, and third text information related to the technical field of the first text information and fourth text information corresponding to the topic scene of the first text information are determined; the first text information includes: the user's current question; the second text information includes: context information; the third text information includes: a plurality of domain knowledge; the fourth text information includes: a dialogue prompt template; then, the first text information, the second text information, the third text information and the fourth text information are combined to obtain target text information; finally, the target text information is input into a natural language model to obtain target reply content corresponding to the user's current question. In this way, the dialogue generation capability and dialogue quality of the chatbot driven by the natural language model LLM on questions in professional domains can be improved, and the chatbot's dialogue can also be made more anthropomorphic.
Drawings
In order to more clearly illustrate the technical solutions of the present application or of the prior art, the drawings used in the embodiments or in the description of the prior art are briefly introduced below. It is apparent that the drawings described below illustrate some embodiments of the present application, and that a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a first flow chart of the anthropomorphic dialogue method based on the natural language model provided by the present application;
FIG. 2 is a second flow chart of the anthropomorphic dialogue method based on the natural language model provided by the present application;
FIG. 3 is a third flow chart of the anthropomorphic dialogue method based on the natural language model provided by the present application;
FIG. 4 is a fourth flow chart of the anthropomorphic dialogue method based on the natural language model provided by the present application;
FIG. 5 is a fifth flow chart of the anthropomorphic dialogue method based on the natural language model provided by the present application;
FIG. 6 is a sixth flow chart of the anthropomorphic dialogue method based on the natural language model provided by the present application;
FIG. 7 is a schematic diagram of the structure of the natural language model-based anthropomorphic dialog device provided by the present application;
Fig. 8 is a schematic structural diagram of an electronic device provided in the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions in the present application will be clearly and completely described below with reference to the drawings in the present application, and it is apparent that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
The terms first, second and the like in the description and in the claims, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged, as appropriate, such that embodiments of the present application may be implemented in sequences other than those illustrated or described herein, and that the objects identified by "first," "second," etc. are generally of a type and not limited to the number of objects, e.g., the first object may be one or more. Furthermore, in the description and claims, "and/or" means at least one of the connected objects, and the character "/", generally means that the associated object is an "or" relationship.
The following description is made with respect to terms of art referred to in the embodiments of the present application:
Large language model (LLM): an artificial intelligence model that can understand and generate human language. An LLM is trained on a large amount of text data and can perform a variety of tasks such as text summarization, translation and sentiment analysis. LLMs are characterized by their large scale, containing billions of parameters, which help them learn complex patterns in language data. LLMs are typically based on deep learning architectures such as the Transformer, which help them achieve impressive performance on various natural language processing tasks.
Customer service is critical to an enterprise: maintaining good customer service can increase customer loyalty and satisfaction, thereby raising business success rates and revenue. Compared with a human agent, a customer service robot can provide stable dialogue service 7×24 hours and reduce an enterprise's operating costs, so it has broad application prospects in enterprises. However, an intelligent customer service robot (Intelligent Customer Service Robot, ICSR) based on traditional natural language processing (Natural Language Processing, NLP) requires various knowledge bases to be built manually, which is costly; it also faces problems such as low adaptability, insufficient processing capability and the inability to learn intelligently, which lead to poor dialogue quality and limit its application in providing high-quality customer service.
The related art mainly focuses on fine-tuning a general-purpose foundation model (FM) on industry domain data to obtain a vertical-domain LLM, or on prompt engineering, introducing domain knowledge through retrieval and similar means to optimize the prompt and guide the LLM's output. However, these two approaches only partially solve the problem of adapting the LLM to an industry domain; they cannot provide a suitable dialogue strategy for a specific scenario such as customer service. For example, suppose a customer has purchased a server and the hard disk is not recognized during installation. The customer consults an ICSR acting as a chatbot, and a dialogue takes place between them; like a real person, the chatbot needs to guide the customer to locate the cause of the problem and give a targeted solution according to that cause, so as to solve the problem the customer faces.
A specific example follows. First, a dialogue between a human customer service agent and a customer:
Customer service: Hello, this is agent 1104, very happy to serve you. The user: Hello, I just called a moment ago. I am installing that server now, and it says it does not recognize the hard disk. Customer service: The hard disk is not recognized, right? The user: Right. Is it missing that RAID driver? Customer service: Did you configure RAID? The user: Yes, from what was written on it, it should be RAID1. Customer service: Which RAID card did you use? Customer service: Or, what shortcut-key prompt did you see at the time? The user: Um, was it CTRL plus A or plus something? It should be the integrated one, the integrated motherboard. Customer service: CTRL plus A is generally not the integrated one. The user: It was plus something; I saw it just when the interface started up. Customer service: OK, restart it and take another look to see which model it is. The user: Then I will go to the official website to find a driver for this RAID card. Customer service: Right, install it.
Second, a dialogue between the chatbot and the customer:
Chat robot: Hello, I am Xiao Yuan, very happy to serve you. The user: Hello, I just called a moment ago. I am installing that server now, and it cannot see the hard disk. Chat robot: May I ask what model of server and hard disk you are using? In addition, have you tried reinserting the hard disk, replacing it with another hard disk, or checking whether the hard disk is properly inserted into the server? The user: It was installed at the factory, so it should not have come unplugged. Is it missing that RAID driver? Chat robot: If so, your server may lack the corresponding RAID driver; you can download and install a RAID driver appropriate for your server model. If you are not sure of the driver for your server model, please provide the specific server model and operating system version so that more detailed guidance and advice can be provided. The user: From what was written on it, it should be RAID1. Chat robot: If you want to configure RAID1, you need to install the RAID1 drivers and RAID management tools provided for the server model you are using. Please confirm that at least two hard disks have been installed in your server and are detected by the system after being physically connected. Next, you need to enter the server's RAID manager to perform the RAID configuration; the specific operations vary from one server brand to another, and you can refer to the corresponding server technical manual or online documentation for more detailed guidance.
As the above dialogues show, an ICSR driven by a large language model LLM in the prior art answers the user's question with long, textbook-style paragraphs; this is not an anthropomorphic dialogue style and does not really solve the customer's problem.
To address these technical problems of LLM-driven chatbots, the related art generally adopts methods such as internet search and external plug-ins to compensate for the LLM's lack of domain knowledge in specific industry scenarios and thereby improve dialogue accuracy; however, there is little research on dialogue strategies for specific application scenarios, so the level of anthropomorphic dialogue is difficult to reach.
Based on the above, an embodiment of the application provides an anthropomorphic dialogue method based on a natural language model, which can improve the capability to generate high-quality dialogue in customer service scenarios and makes it convenient for enterprises to rapidly deploy their own anthropomorphic dialogue systems based on a natural language model.
The anthropomorphic dialogue method based on the natural language model provided by the embodiment of the application is described in detail below through specific embodiments and application scenes thereof with reference to the accompanying drawings.
As shown in fig. 1, a method for anthropomorphic dialogue based on a natural language model according to an embodiment of the present application may include the following steps 101 and 102:
Step 101, acquiring first text information in a dialogue process and second text information in the dialogue process, and determining third text information related to the technical field of the first text information and fourth text information corresponding to the topic scene of the first text information.
Wherein the first text information includes: a user's current question; the second text information includes: context information of the current problem of the user; the third text information includes: a plurality of domain knowledge; the fourth text information includes: a dialog prompt template.
Illustratively, the first text information is text information including the user's current question, namely the latest question input by the user during the conversation with the chatbot.
The plurality of domain knowledge in the third text information is, for example, a plurality of domain knowledge that is obtained after semantic retrieval is performed in a domain knowledge base based on the current problem of the user and is most relevant to the current problem of the user.
For example, fig. 2 is a flow chart of the anthropomorphic dialogue method based on a natural language model according to an embodiment of the application. First, the user's current question (query) and the context information of that question in the dialogue history need to be obtained; semantic retrieval is performed based on the current question, and the K most relevant pieces of domain knowledge are retrieved. Meanwhile, scene classification is performed based on the current question, and the corresponding scene dialogue Prompt template is matched according to the classification label. The obtained information can then be combined.
Specifically, in the step 101, the step of obtaining the second text information in the session may further include the following steps 101a and 101b:
Step 101a, acquiring historical dialogue information in the dialogue process.
Step 101b, screening out dialogue information of a plurality of rounds from the historical dialogue information through a long-short time memory mechanism, and determining the dialogue information of the plurality of rounds as the context information.
Illustratively, the latest n rounds of dialogue can be screened from the dialogue history through the long-short-term memory mechanism, and the m rounds of dialogue from before those n rounds that are most relevant to the current user question are combined with them into the context information.
Specifically, the step 101b may further include the following steps 101b1 to 101b3:
step 101b1, performing a preprocessing operation on the history dialogue information to obtain processed history dialogue information.
Wherein the preprocessing operation is used for merging a plurality of continuous dialogue information of the same object into one dialogue information.
Step 101b2, based on the user current question, screening dialogue information of a plurality of dialogue turns adjacent to the user current question from the processed historical dialogue information as short-time memory dialogue information, and based on the user current question, screening dialogue information of a plurality of dialogue turns related to the user current question from the processed historical dialogue information as long-time memory dialogue information.
Step 101b3, merging the long-term memory dialogue information and the short-term memory dialogue information to obtain the context information.
The dialogue content included in the long-term memory dialogue information is the dialogue content before the dialogue content included in the short-term memory dialogue information.
For example, as shown in fig. 3, the context information of the current query is obtained from the dialogue history using a long-short-term memory mechanism: the most recent n rounds of dialogue are included in the context information as short-term memory, and the m rounds before them whose semantics are most relevant to the current query are included as long-term memory. Generally, n is 1 to 3 and m is 1 to 3. One question and one answer are referred to as one round; several consecutive questions are merged into one question, and several consecutive answers are merged into one answer. For example, if the current user question is round 10, the dialogues of rounds 9 and 8 are the short-term memory; if the round most relevant to round 10 is round 3, round 3 is the long-term memory, and the context information consists of rounds 3, 8 and 9.
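The following is a minimal sketch of this context-selection step, assuming the dialogue history is a list of (speaker, text) messages and that a text-similarity function is supplied from outside; the function and variable names are illustrative and not taken from the patent.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # one dialogue round: (user question, customer-service answer)

def merge_consecutive(messages: List[Tuple[str, str]]) -> List[Turn]:
    """Pre-processing: merge consecutive messages from the same speaker,
    then pair each user question with the following answer as one round."""
    merged: List[Tuple[str, str]] = []
    for speaker, text in messages:
        if merged and merged[-1][0] == speaker:
            merged[-1] = (speaker, merged[-1][1] + " " + text)
        else:
            merged.append((speaker, text))
    rounds, question = [], None
    for speaker, text in merged:
        if speaker == "user":
            question = text
        elif question is not None:
            rounds.append((question, text))
            question = None
    return rounds

def build_context(rounds: List[Turn], current_query: str,
                  similarity: Callable[[str, str], float],
                  n: int = 2, m: int = 1) -> List[Turn]:
    """Short-term memory: the n most recent rounds.
    Long-term memory: the m earlier rounds most similar to the current query."""
    short_term = rounds[-n:]
    earlier = rounds[:-n] if n else rounds
    long_term = sorted(earlier,
                       key=lambda r: similarity(current_query, r[0] + r[1]),
                       reverse=True)[:m]
    # keep chronological order: long-term memory precedes short-term memory
    return [r for r in rounds if r in long_term or r in short_term]
```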
Specifically, the step of determining a plurality of domain knowledge related to the technical domain of the current problem of the user in the step 101 may further include the following step 101c:
Step 101c, converting the first text information into a first semantic vector, screening a plurality of result information matched with the first semantic vector from the domain knowledge vector database based on the first semantic vector, and determining domain knowledge corresponding to the plurality of result information as the plurality of domain knowledge.
The plurality of domain knowledge may be, for example, a plurality of domain knowledge in a domain knowledge base that is most relevant to the current problem of the user. The domain knowledge base is a vector database built offline in advance, and the content comprises industry domain knowledge, enterprise product knowledge, historical dialogue experience and the like.
Specifically, the step 101c may further include the following steps 101c1 and 101c2:
step 101c1, converting the first text information into the first semantic vector by using a preset conversion algorithm, and calculating the similarity between the first semantic vector and the semantic vector of each domain knowledge in the domain knowledge vector database based on a preset similarity algorithm.
For example, the preset conversion algorithm may be a text2vec algorithm, and the commonly used text2vec algorithm includes: text2vec-large-chinese, ernie-3.0-base-zh, etc.
Step 101c2, determining a plurality of semantic vectors with similarity values meeting preset similarity in the domain knowledge vector database as the plurality of result information.
Wherein the preset similarity algorithm comprises any one of the following: euclidean distance algorithm and inner product algorithm.
After obtaining the first semantic vector corresponding to the current problem of the user, a plurality of semantic vectors with similarity meeting the preset similarity with the first semantic vector can be screened out from the domain knowledge vector database based on the first semantic vector, so that a plurality of result information most relevant to the current problem of the user is obtained.
For example, as shown in fig. 4, a text2vec algorithm is first used to convert the query text into a semantic vector, a search is then performed in the vector database, and the K most relevant results are computed, where K is an integer, generally 1 to 5. Vector databases support similarity retrieval with different similarity metrics; for example, the vector database Faiss supports both L2 (Euclidean distance) and inner product, and when the two vectors are normalized the inner product result is the cosine similarity. An internet search using text keywords based on the current query could of course follow the same process, but many enterprises cannot go online because of data security requirements, so the domain knowledge vector database in the embodiment of the application does not integrate an internet search function.
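A sketch of this retrieval path, using a sentence-embedding encoder and the Faiss library named above; the concrete checkpoint and the choice of an inner-product index over normalized vectors are illustrative assumptions rather than the patent's reference implementation.

```python
import numpy as np
import faiss
from sentence_transformers import SentenceTransformer

# Checkpoint is an assumption; the description only names text2vec-style encoders.
encoder = SentenceTransformer("shibing624/text2vec-base-chinese")

def build_index(domain_knowledge: list[str]) -> faiss.Index:
    """Offline step: encode every domain-knowledge entry and index it.
    Vectors are L2-normalized so that inner product equals cosine similarity."""
    vectors = encoder.encode(domain_knowledge, normalize_embeddings=True)
    index = faiss.IndexFlatIP(vectors.shape[1])   # inner-product index
    index.add(np.asarray(vectors, dtype="float32"))
    return index

def retrieve(index: faiss.Index, domain_knowledge: list[str],
             query: str, k: int = 2) -> list[str]:
    """Online step: encode the current user question and return the top-k entries."""
    q = encoder.encode([query], normalize_embeddings=True)
    _, ids = index.search(np.asarray(q, dtype="float32"), k)
    return [domain_knowledge[i] for i in ids[0]]
```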
Specifically, for the scheme of calculating the similarity using the euclidean distance, the step 101c1 may further include the following step 101c11:
step 101c11, calculating the similarity between the first semantic vector and the semantic vector of each domain knowledge in the domain knowledge vector database based on the following formula:
D(Q, V) = ||Q - V||₂ = sqrt( Σᵢ (Qᵢ - Vᵢ)² )   (formula I)
wherein Q is the first semantic vector, and V is the semantic vector of any domain knowledge in the domain knowledge vector database.
Specifically, for the scheme of calculating the similarity by using the inner product, the above step 101c1 may further include the following step 101c12:
step 101c12, calculating the similarity between the first semantic vector and the semantic vector of each domain knowledge in the domain knowledge vector database based on the following formula two:
S(Q, V) = Q · V = Σᵢ Qᵢ Vᵢ   (formula II)
wherein Q is the first semantic vector, and V is the semantic vector of any domain knowledge in the domain knowledge vector database.
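For concreteness, the two similarity measures written out with NumPy (a brief sketch, not part of the original patent text; for the Euclidean distance a smaller value means higher similarity, for the inner product a larger value does):

```python
import numpy as np

def euclidean_distance(q: np.ndarray, v: np.ndarray) -> float:
    """Formula I: Euclidean distance between the query vector and a knowledge vector."""
    return float(np.sqrt(np.sum((q - v) ** 2)))

def inner_product(q: np.ndarray, v: np.ndarray) -> float:
    """Formula II: inner product; equals cosine similarity when q and v are L2-normalized."""
    return float(np.dot(q, v))
```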
Specifically, in the step 101, the step of determining the fourth text information corresponding to the topic scene of the first text information may further include the following steps 101d1 and 101d2:
step 101d1, inputting the first text information into a scene classification model, and determining a topic scene corresponding to the first text information.
Step 101d2, screening out a target prompt template with a topic scene matched with the topic scene corresponding to the first text information from a dialogue prompt template library based on the topic scene corresponding to the first text information.
One topic scene corresponds to one dialogue prompt template.
Illustratively, the information type of the dialogue prompt template in the embodiment of the application is text information.
For example, as shown in fig. 5, the subdivided scene is first classified based on the user query to obtain a class label of the subdivided scene, and the corresponding scene template is then matched according to that label. The template for each scene is preset, and each subdivided scene corresponds to one template (for example, if the query is classified into the 'hard disk not recognized' scene, the template corresponding to the 'hard disk not recognized' scene is selected). The subdivided-scene classification label library is preset and contains the subdivided scenes determined for the enterprise customer service setting; it may contain tens, hundreds or thousands of subdivided scenes.
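A sketch of this classify-then-look-up step, assuming a locally saved fine-tuned BERT-style classifier and a label-to-template dictionary; the path, the labels and the template text are illustrative placeholders.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Path and label set are assumptions for illustration.
tokenizer = AutoTokenizer.from_pretrained("./scene-classifier")
model = AutoModelForSequenceClassification.from_pretrained("./scene-classifier")
id2label = model.config.id2label        # e.g. {0: "hard disk not recognized", ...}

prompt_templates = {
    "hard disk not recognized": (
        "As after-sales engineer customer service for an IT enterprise server product, "
        "first ask whether the user has configured RAID; if not, guide the user through "
        "RAID configuration. Keep a polite tone and use brief, clear, plain language."),
    # ... one template per subdivided scene
}

def match_template(query: str) -> str:
    """Classify the query into a subdivided scene and return that scene's prompt template."""
    inputs = tokenizer(query, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    label = id2label[int(logits.argmax(dim=-1))]
    return prompt_templates[label]
```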
Specifically, before the step 101d1, the anthropomorphic dialogue method based on the natural language model provided in the embodiment of the application may further include the following steps 201 and 202:
Step 201, a history dialogue record between a user and a dialogue robot is obtained as a training set to be marked, and each sample in the training set to be marked is marked based on a preset scene keyword, so that a target training set after marking is obtained.
Step 202, training a pre-training model by using the target training set to obtain the scene classification model.
The scene classification model in the embodiment of the application is obtained by fine-tuning a BERT model. The training data is derived from actual historical dialogues between human customer service agents and customers, labelled automatically by scene-keyword matching and then verified manually.
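A sketch of the keyword-based automatic labelling described above; the keyword table is an invented example, and the resulting labels would still be verified manually, as stated above.

```python
# Hypothetical keyword table: scene label -> keywords that indicate it.
scene_keywords = {
    "hard disk not recognized": ["hard disk", "not recognized", "cannot see the disk"],
    "RAID configuration": ["RAID", "array", "RAID1"],
}

def auto_label(samples: list[str]) -> list[tuple[str, str]]:
    """Assign each historical user utterance the first scene whose keywords it matches;
    unmatched samples are left for manual labelling."""
    labelled = []
    for text in samples:
        for scene, keywords in scene_keywords.items():
            if any(kw.lower() in text.lower() for kw in keywords):
                labelled.append((text, scene))
                break
        else:
            labelled.append((text, "UNLABELLED"))
    return labelled
```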
Step 102, combining the first text information, the second text information, the third text information and the fourth text information to obtain target text information, and inputting the target text information into a natural language model to obtain target reply content corresponding to the current problem of the user.
The context information, the plurality of domain knowledge, the dialogue prompt template and the user's current question are combined and used as the input of the natural language model; the output of the natural language model is the dialogue reply of the current round, and if the reply is a longer text containing multiple sentences, it can be returned in several parts, sentence by sentence.
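An end-to-end sketch tying the earlier sketches together; llm stands for any prompt-in/text-out callable, and assemble is the merge-and-truncate helper sketched after the fig. 6 example below. All helper names are illustrative, not the patent's own API.

```python
import re

def answer(query: str, history_rounds, llm, index, domain_knowledge, similarity) -> list[str]:
    """Produce the reply for the current round from the four kinds of text information."""
    context = build_context(history_rounds, query, similarity)      # second text information
    knowledge = retrieve(index, domain_knowledge, query, k=2)       # third text information
    template = match_template(query)                                # fourth text information
    prompt = assemble(context, knowledge, template, query)          # target text information
    reply = llm(prompt)
    # a long reply can be returned sentence by sentence, as described above
    return [s for s in re.split(r"(?<=[。！？.!?])\s*", reply) if s]
```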
Specifically, the step 102 may include the following steps 102a1 to 102a3:
Step 102a1, merging the first text information, the second text information, the third text information and the fourth text information to obtain the text information to be processed.
Step 102a2, executing text word segmentation operation on the text information to be processed by using a preset word segmentation algorithm, and calculating the number of first word segments corresponding to the text information to be processed after word segmentation.
Step 102a3, in the case that the first word segmentation number of the text information to be processed is larger than the maximum word segmentation number allowed to be input by the natural language model, calculating the difference between the first word segmentation number and the maximum word segmentation number to obtain a word segmentation number difference; and truncating the text information to be processed from the head by the word segmentation number difference to obtain first sub-text information close to the head and second sub-text information close to the tail, and determining the second sub-text information as the target text information.
Illustratively, following the step 102a2, the step 102 may further include the following step 102a4:
Step 102a4, determining the text information to be processed as the target text information when the first word segmentation number of the text information to be processed is smaller than or equal to the maximum word segmentation number allowed to be input by the natural language model.
For example, as shown in fig. 6, the four kinds of content text, namely the context information, the plurality of pieces of related domain knowledge, the prompt template and the user's current question, are combined; if the input exceeds the length supported by the natural language model, the text is truncated from the front and only the rear part is kept. For instance, if the combined text contains 3000 tokens after word segmentation and the LLM supports an input of 2048 tokens, only the 953rd to 3000th tokens of the text are kept as the LLM input. The text segmentation may use the jieba word segmentation algorithm. In the embodiment of the application, the Yuan 1.0 model may be used as the natural language model; Yuan 1.0 is a generative pre-trained language model available at two parameter scales (13B and 245.7B), and the 13-billion-parameter model may be selected in the embodiment of the application.
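A sketch of this merge-and-truncate step using the jieba segmentation named above; the ordering of the four parts and the 2048-token limit follow the example, and the labels inside the prompt are illustrative.

```python
import jieba

MAX_TOKENS = 2048  # maximum number of tokens the natural language model accepts (assumed limit)

def assemble(context, knowledge, template, query, max_tokens: int = MAX_TOKENS) -> str:
    """Merge the four kinds of text; if the jieba token count exceeds the model limit,
    truncate from the front and keep only the trailing tokens."""
    context_text = "\n".join(f"User: {q}\nCustomer service: {a}" for q, a in context)
    merged = "\n".join([
        "Domain knowledge: " + " ".join(knowledge),
        "Dialogue strategy: " + template,
        "Context information: " + context_text,
        "User question: " + query,
    ])
    tokens = jieba.lcut(merged)
    if len(tokens) <= max_tokens:
        return merged
    return "".join(tokens[len(tokens) - max_tokens:])  # e.g. keep tokens 953..3000 of 3000
```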
It should be noted that, in the embodiment of the present application, a large language model LLM is taken as an example of a natural language model to describe the anthropomorphic dialogue method based on the natural language model provided in the embodiment of the present application.
It should be noted that the Yuan 1.0 model is a large language model based on the Transformer network with 245.7 billion parameters, and it can understand and generate Chinese to perform various natural language processing tasks. The structure of the Yuan 1.0 model consists of two parts, an encoder and a decoder, each formed by stacking a number of identical layers, and each layer contains a self-attention sub-layer and a feed-forward neural network sub-layer. The self-attention mechanism allows the model to take into account the contextual relationships of all elements in the input or output sequence, so as to better understand and generate language. The encoder part of the Yuan 1.0 model has 96 layers and the decoder part has 48 layers; each layer has a hidden dimension of 12288, each self-attention head has a dimension of 96, and each layer has 128 self-attention heads.
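For reference, a small configuration object collecting the hyper-parameters stated above (an illustrative sketch; the class and field names are not from the patent):

```python
from dataclasses import dataclass

@dataclass
class Yuan10Config:
    """Layer hyper-parameters of the Yuan 1.0 model as stated in the description."""
    encoder_layers: int = 96
    decoder_layers: int = 48
    hidden_size: int = 12288          # hidden dimension per layer
    attention_heads: int = 128        # self-attention heads per layer
    head_dim: int = 96                # dimension of each self-attention head
    total_parameters: float = 245.7e9
```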
Optionally, in the embodiment of the present application, since the output results of the natural language model obtained with different text splicing orders are not identical, the result generated by each splicing order may be scored for user satisfaction, and the best-scoring order may be recorded in association with the dialogue prompt template and the plurality of domain knowledge concerned.
Specifically, the step 102 may further include the following step 102b:
and 102b, based on a target splicing order matched with the dialog prompt templates contained in the plurality of domain knowledge and the fourth text information, splicing and combining the first text information, the second text information, the third text information and the fourth text information to obtain target text information.
The output results of the corresponding natural language models are not identical based on the target text information obtained by different splicing sequences; the target splicing sequence is the highest user satisfaction degree scoring splicing sequence in the output results of the natural language model corresponding to the splicing sequences; the splicing orders are splicing orders corresponding to the first text information, the second text information, the third text information and the fourth text information.
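A sketch of how the candidate splicing orders could be compared offline; the score callable stands in for the user-satisfaction scoring referred to above and is an assumption, not part of the patent.

```python
from itertools import permutations

def best_splicing_order(parts: dict, llm, score) -> tuple:
    """Generate a reply for every order of the four text parts and keep the order
    whose reply receives the highest user-satisfaction score."""
    best_order, best_score = None, float("-inf")
    for order in permutations(parts):              # 4! = 24 candidate orders
        prompt = "\n".join(parts[name] for name in order)
        s = score(llm(prompt))
        if s > best_score:
            best_order, best_score = order, s
    return best_order

# parts = {"domain knowledge": ..., "dialogue strategy": ..., "context": ..., "user question": ...}
```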
The following illustrates a specific application scenario of the anthropomorphic dialogue method based on the natural language model provided in the embodiment of the application:
1. The user comes for a consultation with the problem encountered and inputs a query, for example: 'I am installing that server now, and the problem is that the hard disk cannot be seen.' Semantic retrieval is performed according to this query; if K is set to 2, the two most relevant pieces of domain knowledge are retrieved from the domain knowledge base, as follows:
Top 1: the RAID array of the server is in a normal state, and the hard disk can be recognized.
Top 2: the hard disk type is a 2.5/3.5-inch SAS/SATA/NVMe hard disk. Hard disk fault alarm indicator: steady red means hard disk failure, steady blue means hard disk locating, steady pink means RAID rebuilding.
2. The query is classified; the subdivided scene label is 'hard disk not recognized', and the template corresponding to this label is the 'hard disk not recognized' scene template: as after-sales engineer customer service for an IT enterprise server product, when answering the customer's 'hard disk not recognized' question, you should first ask whether the user has configured RAID, and if not, guide the user through RAID configuration. You should keep a polite tone at all times and use brief, clear, plain language.
3. After the context information of the user's dialogue history is obtained, the four kinds of text are spliced and combined. The spliced content is shown below; since its length is smaller than the maximum number of tokens supported by the model, it can be used directly as the LLM input. The LLM then generates a result; if the resulting text is long and contains multiple sentences, it can be split into several outputs. The current round of dialogue then ends, the system waits for the next user question, and the next round of dialogue begins. A specific example is as follows:
The user: i now install that server o, the problem that now occurs is that the hard disk is not visible.
Chat robot input: Domain knowledge: The RAID array of the server is in a normal state, and the hard disk can be identified. The hard disk type is a 2.5/3.5-inch SAS/SATA/NVMe hard disk. Hard disk fault alarm indicator: steady red indicates a hard disk failure; steady blue indicates hard disk positioning; steady pink indicates RAID rebuilding. Scene: hard disk not recognized, dialogue policy: as an after-sales engineer customer service for an IT enterprise server product, when answering the customer's "hard disk not recognized" question, you should first ask the user whether the RAID has been configured, and if not, guide the user through the RAID configuration. You should always keep a polite tone and use brief, clear, plain language. User question: I am installing that server now, and the problem is that the hard disk is not visible.
Chat robot output: May I ask whether you have completed the RAID configuration?
The user: done, I look at what was written above should be RAID1.
Chat robot input: Contextual information: The user: I am installing that server now, and the problem is that the hard disk is not visible. Customer service: May I ask whether you have completed the RAID configuration? Domain knowledge: RAID is an acronym for Redundant Array of Independent Disks, a data storage technology that combines multiple hard disks into one logical disk array, providing a faster, more reliable, and safer data access and protection mechanism. Domain knowledge: RAID1 is a disk array level, also known as "disk mirroring" or "disk copying", which creates an identical disk image on at least two hard disks. When RAID1 writes data to one hard disk, it simultaneously writes the same data to another hard disk, which makes the data highly tolerant of hard disk damage, since the data can still be retrieved from the other mirrored hard disk when one disk fails. When the server is started, the RAID configuration tool can be entered by pressing the corresponding key, and different RAID levels can be configured. Scene: RAID configuration, dialogue policy: as an after-sales engineer customer service for an IT enterprise server product, when answering the customer's RAID configuration question, you should first ask the user which shortcut key was used for the configuration, to make it clear whether the user has configured it correctly. You should always keep a polite tone and use brief, clear, plain language. User question: Done; from what is written above, it should be RAID1.
Chat robot output: OK. I would like to know which shortcut key you pressed when you did the configuration.
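For illustration only, a minimal Python sketch of how the chat robot input shown in the example above could be assembled from the four text types; the section labels follow the example, while the function name build_llm_input and its argument layout are assumptions introduced here rather than part of the original disclosure.

def build_llm_input(context, domain_knowledge, scene_label, dialogue_policy, user_question):
    # Mirrors the concatenation illustrated above: optional context information first,
    # then the retrieved domain knowledge, then the scene label with its dialogue policy,
    # and finally the user's current question.
    sections = []
    if context:
        sections.append("Contextual information: " + context)
    for knowledge in domain_knowledge:
        sections.append("Domain knowledge: " + knowledge)
    sections.append("Scene: " + scene_label + ", dialogue policy: " + dialogue_policy)
    sections.append("User question: " + user_question)
    return " ".join(sections)
    # e.g. build_llm_input("", top_k_knowledge, "hard disk not recognized", policy_text, query)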
By means of the anthropomorphic dialogue method based on the natural language model, the natural language model can output anthropomorphic dialogue results in a customer service scene, which facilitates the deployment of the natural language model in industry scenarios, allows it to take on the ICSR role in an enterprise customer service system, and effectively improves the intelligence level of the ICSR.
By way of example, the anthropomorphic dialogue method based on the natural language model provided in the embodiment of the application integrates the capability of semantically searching an enterprise knowledge base, which can effectively compensate for the natural language model's lack of industry-domain knowledge without requiring large-scale fine-tuning training. It is therefore a more economical approach and helps lower the threshold for enterprises to use the natural language model while improving convenience.
By means of the anthropomorphic dialogue method based on the natural language model, the historical dialogue information is formed into the context information based on the long- and short-term memory theory, which can effectively alleviate the forgetting problem in multi-round dialogue, provides a long- and short-term memory function, and improves the accuracy of multi-round dialogue.
By way of example, the anthropomorphic dialogue method based on the natural language model provided in the embodiment of the application integrates the domain knowledge, the context, and the dialogue strategy for the scene through a standardized framework, thereby enabling the natural language model to be deployed in customer service scenarios, achieving an anthropomorphic dialogue effect, and improving the intelligence level of the ICSR.
According to the anthropomorphic dialogue method based on the natural language model provided in the embodiment of the application, first text information in a dialogue process and second text information in the dialogue process are obtained, and third text information related to the technical field of the first text information and fourth text information corresponding to the topic scene of the first text information are determined; the first text information includes: a user's current question; the second text information includes: context information; the third text information includes: a plurality of domain knowledge; the fourth text information includes: a dialogue prompt template. Then, the first text information, the second text information, the third text information and the fourth text information are combined to obtain target text information; finally, the target text information is input into a natural language model to obtain target reply content corresponding to the user's current question. In this way, the dialogue generation capability and dialogue quality of the chat robot driven by the natural language model on professional-domain problems can be improved, and at the same time the chat robot's dialogue becomes more anthropomorphic.
It should be noted that, for the anthropomorphic dialogue method based on the natural language model provided in the embodiments of the present application, the execution subject may be an anthropomorphic dialogue device based on the natural language model, or a control module in that device for executing the anthropomorphic dialogue method based on the natural language model. In the embodiments of the application, the anthropomorphic dialogue device based on the natural language model is described by taking, as an example, the case in which the device executes the anthropomorphic dialogue method based on the natural language model.
In the embodiments of the application, the method is illustrated with reference to the accompanying drawings. The anthropomorphic dialogue method based on the natural language model is described by way of example with the accompanying drawings in combination with the embodiments of the application. In a specific implementation, the anthropomorphic dialogue method based on the natural language model shown in the above method drawings may further be implemented in combination with any other drawing illustrated in the above embodiments that can be combined with it, and the details are not repeated here.
The anthropomorphic dialogue device based on the natural language model provided by the application is described below; the description of the device below and the description of the anthropomorphic dialogue method based on the natural language model above may be referred to in correspondence with each other.
Fig. 7 is a schematic structural diagram of an anthropomorphic dialogue device based on a natural language model according to an embodiment of the application, and as shown in fig. 7, the anthropomorphic dialogue device specifically includes:
an information obtaining module 701, configured to obtain first text information in a dialogue process and second text information in a dialogue process, and determine third text information related to a technical field of the first text information, and fourth text information corresponding to a topic scene of the first text information; the first text information includes: a user's current question; the second text information includes: context information of the current problem of the user; the third text information includes: a plurality of domain knowledge; the fourth text information includes: a dialog prompt template; and the information processing module is used for combining the first text information, the second text information, the third text information and the fourth text information to obtain target text information, inputting the target text information into a natural language model, and obtaining target reply content corresponding to the current problem of the user.
Optionally, the information obtaining module 701 is specifically configured to obtain historical dialogue information in the dialogue process; the information obtaining module 701 is specifically further configured to screen out dialogue information of a plurality of rounds from the historical dialogue information through a long short-time memory mechanism, and to determine the dialogue information of the plurality of rounds as the context information.
Optionally, the information obtaining module 701 is specifically configured to perform a preprocessing operation on the historical dialogue information to obtain processed historical dialogue information, where the preprocessing operation is used for merging a plurality of continuous dialogue information of the same object into one piece of dialogue information; the information obtaining module 701 is specifically further configured to screen, based on the current problem of the user, dialogue information of a plurality of dialogue rounds adjacent to the current problem of the user from the processed historical dialogue information as short-term memory dialogue information, and to screen, based on the current problem of the user, dialogue information of a plurality of dialogue rounds related to the current problem of the user from the processed historical dialogue information as long-term memory dialogue information; the information obtaining module 701 is specifically further configured to combine the long-term memory dialogue information and the short-term memory dialogue information to obtain the context information; the dialogue content included in the long-term memory dialogue information precedes the dialogue content included in the short-term memory dialogue information.
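A minimal sketch, under stated assumptions, of how the context information could be assembled from long-term and short-term memory dialogue information as described above; the relevance test is_related, the round counts, and the (speaker, utterance) data layout are illustrative assumptions rather than part of the original disclosure.

def build_context_information(history, current_question, short_rounds=2, max_long_rounds=2, is_related=None):
    # history: (speaker, utterance) pairs in chronological order, already preprocessed so that
    # consecutive utterances from the same object are merged into a single entry.
    if is_related is None:
        is_related = lambda utterance, question: False   # placeholder relevance test
    # Short-term memory: the dialogue rounds immediately adjacent to the current question.
    short_term = history[-short_rounds:]
    earlier = history[:-short_rounds]
    # Long-term memory: earlier rounds judged related to the current question
    # (e.g. by keyword overlap or embedding similarity in a real system).
    long_term = [turn for turn in earlier if is_related(turn[1], current_question)][-max_long_rounds:]
    # Long-term memory content precedes short-term memory content.
    return " ".join(f"{speaker}: {utterance}" for speaker, utterance in long_term + short_term)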
Optionally, the information obtaining module 701 is specifically configured to convert the first text information into a first semantic vector, screen a plurality of result information matched with the first semantic vector from the domain knowledge vector database based on the first semantic vector, and determine domain knowledge corresponding to the plurality of result information as the plurality of domain knowledge.
Optionally, the information obtaining module 701 is specifically configured to convert the first text information into the first semantic vector by using a preset conversion algorithm, and calculate a similarity between the first semantic vector and a semantic vector of each domain knowledge in the domain knowledge vector database based on a preset similarity algorithm; the information obtaining module 701 is specifically further configured to determine, as the plurality of result information, a plurality of semantic vectors whose similarity values in the domain knowledge vector database satisfy a preset similarity; wherein the preset similarity algorithm comprises any one of the following: euclidean distance algorithm and inner product algorithm.
Optionally, the information obtaining module 701 is specifically configured to calculate the similarity between the first semantic vector and the semantic vector of each domain knowledge in the domain knowledge vector database based on the following formula:
d(A, B) = \sqrt{\sum_{i=1}^{n} (A_i - B_i)^2}  (Formula I)
wherein A is the first semantic vector, B is the semantic vector of any domain knowledge in the domain knowledge vector database, and n is the dimension of the semantic vectors.
Optionally, the information obtaining module 701 is specifically configured to calculate the similarity between the first semantic vector and the semantic vector of each domain knowledge in the domain knowledge vector database based on the following formula two:
s(A, B) = A \cdot B = \sum_{i=1}^{n} A_i B_i  (Formula II)
wherein A is the first semantic vector, B is the semantic vector of any domain knowledge in the domain knowledge vector database, and n is the dimension of the semantic vectors.
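For illustration, a short numpy sketch of the two preset similarity measures and their use for retrieving the k most relevant domain knowledge entries, assuming (as the listed algorithms suggest) that Formula I is the Euclidean distance and Formula II the inner product; the embed callback standing in for the preset conversion algorithm is a hypothetical placeholder.

import numpy as np

def euclidean_distance(a, b):
    # Formula I: a smaller distance indicates a more similar domain knowledge entry.
    return float(np.sqrt(np.sum((a - b) ** 2)))

def inner_product(a, b):
    # Formula II: a larger inner product indicates a more similar domain knowledge entry.
    return float(np.dot(a, b))

def search_domain_knowledge(question, embed, knowledge_vectors, knowledge_texts, k=2, use_inner_product=False):
    # embed() converts the first text information into the first semantic vector.
    query = embed(question)
    if use_inner_product:
        scores = [inner_product(query, v) for v in knowledge_vectors]
    else:
        scores = [-euclidean_distance(query, v) for v in knowledge_vectors]  # negate so higher is better
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return [knowledge_texts[i] for i in top]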
Optionally, the information obtaining module 701 is specifically configured to input the first text information into a scene classification model, and determine a topic scene corresponding to the first text information; the information obtaining module 701 is specifically further configured to screen, based on the topic scene corresponding to the first text information, a target prompt template that matches the topic scene corresponding to the first text information from a dialogue prompt template library; one topic scene corresponds to one dialogue prompt template.
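A minimal sketch of the scene classification and template lookup described above, assuming a hypothetical scene_classifier callable and a template_library dictionary; both names are placeholders, since the patent does not prescribe a concrete interface.

def select_dialogue_prompt_template(question, scene_classifier, template_library):
    # scene_classifier: trained model mapping the user's current question to a topic-scene label,
    # e.g. "hard disk not recognized" or "RAID configuration".
    # template_library: one dialogue prompt template per topic scene.
    scene = scene_classifier(question)
    return scene, template_library[scene]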
Optionally, the data processing module 702 is specifically configured to combine the first text information, the second text information, the third text information, and the fourth text information to obtain text information to be processed; the data processing module 702 is specifically further configured to perform a text word segmentation operation on the text information to be processed using a preset word segmentation algorithm, and to calculate the first word segmentation number corresponding to the text information to be processed after word segmentation; the data processing module 702 is specifically further configured to, when the first word segmentation number of the text information to be processed is greater than the maximum word segmentation number allowed to be input by the natural language model, calculate the difference value between the maximum word segmentation number and the first word segmentation number to obtain a word segmentation number difference, truncate the text information to be processed from the head toward the tail based on the word segmentation number difference to obtain first sub-text information close to the head and second sub-text information close to the tail, and determine the second sub-text information as the target text information.
Optionally, the data processing module 702 is specifically further configured to determine the text information to be processed as the target text information when the first word segmentation number of the text information to be processed is less than or equal to the maximum word segmentation number allowed to be input by the natural language model.
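As an illustrative sketch of the truncation rule described above, assuming hypothetical segment/join_segments callables standing in for the preset word segmentation algorithm and its inverse:

def fit_to_token_limit(text_to_process, segment, join_segments, max_tokens):
    tokens = segment(text_to_process)
    if len(tokens) <= max_tokens:
        return text_to_process                  # within the model's input limit: use as-is
    surplus = len(tokens) - max_tokens          # the word segmentation number difference
    # Truncate from the head toward the tail and keep the second sub-text information
    # (the part close to the tail), so the most recent content is preserved.
    return join_segments(tokens[surplus:])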
Optionally, the data processing module 702 is specifically configured to splice and combine the first text information, the second text information, the third text information, and the fourth text information based on a target splicing order matched with the plurality of domain knowledge and the dialogue prompt template contained in the fourth text information, to obtain the target text information; the output results of the natural language model corresponding to target text information obtained with different splicing orders are not identical; the target splicing order is the splicing order whose corresponding natural language model output receives the highest user satisfaction score among the splicing orders; and the splicing orders are splicing orders of the first text information, the second text information, the third text information, and the fourth text information.
Optionally, the apparatus further comprises: a training module; the acquisition module is further configured to acquire a historical dialogue record between a user and the dialogue robot as a training set to be marked, and to mark each sample in the training set to be marked based on a preset scene keyword, to obtain a marked target training set; and the training module is configured to train a pre-training model using the target training set to obtain the scene classification model.
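A minimal sketch of how the training set could be marked with preset scene keywords before the pre-training model is trained into the scene classification model; the scene_keywords mapping and the keyword-containment rule are assumptions about one possible marking scheme, not the patent's prescribed procedure.

def label_training_set(dialogue_records, scene_keywords):
    # dialogue_records: user utterances taken from historical dialogues with the dialogue robot.
    # scene_keywords: mapping from each topic-scene label to its preset keywords.
    labelled = []
    for utterance in dialogue_records:
        for scene, keywords in scene_keywords.items():
            if any(keyword in utterance for keyword in keywords):
                labelled.append((utterance, scene))   # sample marked with its scene label
                break
    return labelled  # the target training set used to train the scene classification model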
According to the anthropomorphic dialogue device based on the natural language model provided in the embodiment of the application, first text information in a dialogue process and second text information in the dialogue process are obtained, and third text information related to the technical field of the first text information and fourth text information corresponding to the topic scene of the first text information are determined; the first text information includes: a user's current question; the second text information includes: context information; the third text information includes: a plurality of domain knowledge; the fourth text information includes: a dialogue prompt template. Then, the first text information, the second text information, the third text information and the fourth text information are combined to obtain target text information; finally, the target text information is input into a natural language model to obtain target reply content corresponding to the user's current question. In this way, the dialogue generation capability and dialogue quality of the chat robot driven by the natural language model (LLM) on professional-domain problems can be improved, and at the same time the chat robot's dialogue becomes more anthropomorphic.
Fig. 8 illustrates a physical structure diagram of an electronic device, as shown in fig. 8, which may include: processor 810, communication interface (Communications Interface) 820, memory 830, and communication bus 840, wherein processor 810, communication interface 820, memory 830 accomplish communication with each other through communication bus 840. The processor 810 may invoke logic instructions in the memory 830 to perform a natural language model based anthropomorphic dialog method comprising: acquiring first text information in a dialogue process and second text information in the dialogue process, and determining third text information related to the technical field of the first text information and fourth text information corresponding to the topic scene of the first text information; the first text information includes: a user's current question; the second text information includes: context information of the current problem of the user; the third text information includes: a plurality of domain knowledge; the fourth text information includes: a dialog prompt template; and combining the first text information, the second text information, the third text information and the fourth text information to obtain target text information, and inputting the target text information into a natural language model to obtain target reply content corresponding to the current problem of the user.
Further, the logic instructions in the memory 830 described above may be implemented in the form of software functional units and may be stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In another aspect, the present application also provides a computer program product comprising a computer program stored on a computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the natural language model based anthropomorphic dialog method provided by the methods above, the method comprising: acquiring first text information in a dialogue process and second text information in the dialogue process, and determining third text information related to the technical field of the first text information and fourth text information corresponding to the topic scene of the first text information; the first text information includes: a user's current question; the second text information includes: context information of the current problem of the user; the third text information includes: a plurality of domain knowledge; the fourth text information includes: a dialog prompt template; and combining the first text information, the second text information, the third text information and the fourth text information to obtain target text information, and inputting the target text information into a natural language model to obtain target reply content corresponding to the current problem of the user.
In yet another aspect, the present application also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor is implemented to perform the above provided natural language model based anthropomorphic dialog method, the method comprising: acquiring first text information in a dialogue process and second text information in the dialogue process, and determining third text information related to the technical field of the first text information and fourth text information corresponding to the topic scene of the first text information; the first text information includes: a user's current question; the second text information includes: context information of the current problem of the user; the third text information includes: a plurality of domain knowledge; the fourth text information includes: a dialog prompt template; and combining the first text information, the second text information, the third text information and the fourth text information to obtain target text information, and inputting the target text information into a natural language model to obtain target reply content corresponding to the current problem of the user.
The apparatus embodiments described above are merely illustrative; the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, i.e., they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the solution without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting thereof; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims (14)

1. A natural language model-based anthropomorphic dialog method, comprising:
acquiring first text information in a dialogue process and second text information in the dialogue process, and determining third text information related to the technical field of the first text information and fourth text information corresponding to the topic scene of the first text information; the first text information includes: a user's current question; the second text information includes: context information of the current problem of the user; the third text information includes: a plurality of domain knowledge; the fourth text information includes: a dialog prompt template;
the first text information, the second text information, the third text information and the fourth text information are combined to obtain target text information, and the target text information is input into a natural language model to obtain target reply content corresponding to the current problem of the user;
the step of combining the first text information, the second text information, the third text information and the fourth text information to obtain target text information includes:
splicing and combining the first text information, the second text information, the third text information and the fourth text information based on a target splicing order matched with the plurality of domain knowledge and the dialogue prompt template contained in the fourth text information, to obtain the target text information;
wherein output results of the natural language model corresponding to target text information obtained with different splicing orders are not identical; the target splicing order is the splicing order whose corresponding natural language model output receives the highest user satisfaction score among the splicing orders; and the splicing orders are splicing orders of the first text information, the second text information, the third text information and the fourth text information.
2. The method of claim 1, wherein the obtaining the second text information during the conversation comprises:
acquiring historical dialogue information in a dialogue process;
and screening out dialogue information of a plurality of rounds from the historical dialogue information through a long-short-time memory mechanism, and determining the dialogue information of the plurality of rounds as the context information.
3. The method of claim 2, wherein the screening out dialogue information of a plurality of rounds from the historical dialogue information through a long-short-time memory mechanism and determining the dialogue information of the plurality of rounds as the context information comprises:
preprocessing the historical dialogue information to obtain processed historical dialogue information; the preprocessing operation is used for merging a plurality of continuous dialogue information of the same object into one dialogue information;
screening dialogue information of a plurality of dialogue turns adjacent to the current problem of the user from the processed historical dialogue information based on the current problem of the user as short-term memory dialogue information, and screening dialogue information of a plurality of dialogue turns related to the current problem of the user from the processed historical dialogue information based on the current problem of the user as long-term memory dialogue information;
combining the long-time memory dialogue information and the short-time memory dialogue information to obtain the context information;
the dialogue content included in the long-term memory dialogue information is the dialogue content before the dialogue content included in the short-term memory dialogue information.
4. The method of claim 1, wherein the determining a plurality of domain knowledge related to a domain of technology of the user's current question comprises:
converting the first text information into a first semantic vector, screening a plurality of result information matched with the first semantic vector from a domain knowledge vector database based on the first semantic vector, and determining domain knowledge corresponding to the plurality of result information as the plurality of domain knowledge.
5. The method of claim 4, wherein converting the first text information into a first semantic vector and screening a plurality of result information matching the first semantic vector from the domain knowledge vector database based on the first semantic vector comprises:
converting the first text information into the first semantic vector by using a preset conversion algorithm, and calculating the similarity between the first semantic vector and the semantic vector of each domain knowledge in the domain knowledge vector database based on a preset similarity algorithm;
determining a plurality of semantic vectors with similarity values meeting preset similarity in the domain knowledge vector database as the plurality of result information;
Wherein the preset similarity algorithm comprises any one of the following: euclidean distance algorithm and inner product algorithm.
6. The method of claim 5, wherein the calculating the similarity of the first semantic vector to the semantic vector of each domain knowledge in the domain knowledge vector database based on a preset similarity algorithm comprises:
calculating the similarity of the first semantic vector and the semantic vector of each domain knowledge in the domain knowledge vector database based on the following formula I:
d(A, B) = \sqrt{\sum_{i=1}^{n} (A_i - B_i)^2}  (Formula I)
wherein A is the first semantic vector, B is the semantic vector of any domain knowledge in the domain knowledge vector database, and n is the dimension of the semantic vectors.
7. The method of claim 5, wherein the calculating the similarity of the first semantic vector to the semantic vector of each domain knowledge in the domain knowledge vector database based on a preset similarity algorithm comprises:
calculating the similarity of the first semantic vector and the semantic vector of each domain knowledge in the domain knowledge vector database based on the following formula II:
s(A, B) = A \cdot B = \sum_{i=1}^{n} A_i B_i  (Formula II)
wherein A is the first semantic vector, B is the semantic vector of any domain knowledge in the domain knowledge vector database, and n is the dimension of the semantic vectors.
8. The method of claim 1, wherein the determining fourth text information corresponding to a topic scene of the first text information comprises:
inputting the first text information into a scene classification model, and determining a topic scene corresponding to the first text information;
screening a target prompt template of which the topic scene is matched with the topic scene corresponding to the first text information from a dialogue prompt template library based on the topic scene corresponding to the first text information;
one topic scene corresponds to one dialogue prompt template.
9. The method of claim 1, wherein the merging the first text information, the second text information, the third text information, and the fourth text information to obtain the target text information comprises:
the first text information, the second text information, the third text information and the fourth text information are combined to obtain text information to be processed;
executing text word segmentation operation on the text information to be processed by using a preset word segmentation algorithm, and calculating the number of first word segments corresponding to the text information to be processed after word segmentation;
calculating the difference value between the maximum word segmentation number and the first word segmentation number in the case that the first word segmentation number of the text information to be processed is greater than the maximum word segmentation number allowed to be input by the natural language model, so as to obtain a word segmentation number difference; and truncating the text information to be processed from the head toward the tail based on the word segmentation number difference to obtain first sub-text information close to the head and second sub-text information close to the tail, and determining the second sub-text information as the target text information.
10. The method according to claim 9, wherein after performing text word segmentation on the text information to be processed using a preset word segmentation algorithm and calculating a first word segmentation number corresponding to the text information to be processed after word segmentation, the method further comprises:
and determining the text information to be processed as the target text information under the condition that the first word segmentation number of the text information to be processed is smaller than or equal to the maximum word segmentation number allowed to be input by the natural language model.
11. The method of claim 8, wherein, before the inputting the first text information into a scene classification model and determining the topic scene corresponding to the first text information, the method further comprises:
Acquiring a history dialogue record between a user and a dialogue robot as a training set to be marked, marking each sample in the training set to be marked based on a preset scene keyword, and obtaining a target training set after marking;
and training the pre-training model by using the target training set to obtain the scene classification model.
12. An anthropomorphic dialog device based on a natural language model, the device comprising:
the information acquisition module is used for acquiring first text information in a dialogue process and second text information in the dialogue process, and determining third text information related to the technical field of the first text information and fourth text information corresponding to the topic scene of the first text information; the first text information includes: a user's current question; the second text information includes: context information of the current problem of the user; the third text information includes: a plurality of domain knowledge; the fourth text information includes: a dialog prompt template;
the information processing module is used for combining the first text information, the second text information, the third text information and the fourth text information to obtain target text information, inputting the target text information into a natural language model, and obtaining target reply content corresponding to the current problem of the user;
the information processing module is specifically configured to splice and combine the first text information, the second text information, the third text information, and the fourth text information based on a target splicing order matched with the plurality of domain knowledge and the dialogue prompt template contained in the fourth text information, to obtain the target text information;
the output results of the natural language model corresponding to target text information obtained with different splicing orders are not identical; the target splicing order is the splicing order whose corresponding natural language model output receives the highest user satisfaction score among the splicing orders; and the splicing orders are splicing orders of the first text information, the second text information, the third text information and the fourth text information.
13. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the natural language model based anthropomorphic dialog method according to any one of claims 1 to 11 when the program is executed.
14. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, implements the steps of the natural language model based anthropomorphic dialog method as claimed in any one of claims 1 to 11.
CN202311801484.5A 2023-12-26 2023-12-26 Anthropomorphic dialogue method and device based on natural language model and electronic equipment Active CN117493529B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311801484.5A CN117493529B (en) 2023-12-26 2023-12-26 Anthropomorphic dialogue method and device based on natural language model and electronic equipment

Publications (2)

Publication Number Publication Date
CN117493529A CN117493529A (en) 2024-02-02
CN117493529B true CN117493529B (en) 2024-03-15

Family

ID=89683281

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311801484.5A Active CN117493529B (en) 2023-12-26 2023-12-26 Anthropomorphic dialogue method and device based on natural language model and electronic equipment

Country Status (1)

Country Link
CN (1) CN117493529B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018133761A1 (en) * 2017-01-17 2018-07-26 华为技术有限公司 Method and device for man-machine dialogue
CN115455982A (en) * 2022-08-17 2022-12-09 科大讯飞股份有限公司 Dialogue processing method, dialogue processing device, electronic equipment and storage medium
CN117271735A (en) * 2023-09-26 2023-12-22 金茂云科技服务(北京)有限公司 Method and system for optimizing man-machine conversation based on LLM model

Also Published As

Publication number Publication date
CN117493529A (en) 2024-02-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant