CN116910223A - Intelligent question-answering data processing system based on pre-training model - Google Patents

Info

Publication number: CN116910223A
Application number: CN202310995752.5A
Authority: CN (China)
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Prior art keywords: module, state, user, model, training
Other languages: Chinese (zh)
Inventor: 杨桢
Current and original assignee: Beijing Anliantong Technology Co., Ltd.
Application filed by Beijing Anliantong Technology Co., Ltd.
Priority to CN202310995752.5A

Classifications

    • G06F16/3329 (Physics; Computing; Electric digital data processing; Information retrieval): Natural language query formulation or dialogue systems
    • G06F18/214 (Pattern recognition; Analysing; Design or setup of recognition systems): Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/295 (Pattern recognition; Graphical models): Markov models or related models, e.g. semi-Markov models; Markov random fields; networks embedding Markov models
    • G06N7/01 (Computing arrangements based on specific computational models): Probabilistic graphical models, e.g. probabilistic networks
    • Y02T10/40 (Climate change mitigation technologies related to transportation; internal combustion engine based vehicles): Engine management systems

Abstract

The application relates to the technical field of question answering, and in particular to an intelligent question-answering data processing system based on a pre-training model. By training the pre-training model on a specific field, the system improves the model's question-answering performance on specific tasks. It judges the difference between the pre-training data set and the specific-field data set, that is, the adaptability of the pre-training model in the specific field, so that whether the model needs fine-tuning in a given field can be decided in advance; this reduces the consumption of computing resources and time and avoids over-training the model when it is unnecessary. The application also tracks and analyzes user behavior, evaluates the user's risk level, and applies corresponding risk control, providing the intelligent question-answering system with an effective user management mechanism for preventing malicious behavior.

Description

Intelligent question-answering data processing system based on pre-training model
Technical Field
The application relates to the technical field of question answering, and in particular to an intelligent question-answering data processing system based on a pre-training model.
Background
With the rapid development of deep learning, intelligent question-answering systems are increasingly applied in various fields. However, despite the availability of many high-quality pre-trained models such as BERT and GPT, how to effectively use these pre-trained models for field-specific question-answering tasks, and how to manage and control user state, remain major challenges for current intelligent question-answering systems.
In a traditional intelligent question-answering system, a data collection module collects the raw question-answering data, a pre-training model module uses a pre-trained model to understand and process the questions input by users, an inference module predicts answers to new questions through the question-answering model, and a post-processing module processes the generated answers to make them easier to understand and use. A disadvantage of this approach is that pre-trained models tend to perform poorly on field-specific question-answering tasks, because they are typically trained on large text corpora that often do not cover all the characteristics of a specific field. In addition, conventional approaches tend to ignore the management and control of user state, which may lead the system to respond improperly to malicious users.
For example, Chinese patent 202110771766.X discloses an intelligent question-answer data processing system comprising a preset database, a preset model library, a processor and a memory storing a computer program. It solves the cold-start problem of data based on a pre-training model and an unsupervised recall method, and improves recall accuracy with a recall method based on fine-tuning a pre-training model and a text multi-label classification model. However, that system has the following defects: although it fine-tunes the pre-training model and optimizes model accuracy, the training of the question-answering model in the specific field is not itself optimized, which can lead to excessive cost; moreover, the system neither evaluates the risk level of users nor handles malicious user behavior. An intelligent question-answering data processing system based on a pre-training model is therefore needed to solve these problems.
For the problems in the related art, no effective solution has been proposed at present.
Disclosure of Invention
Accordingly, an objective of an embodiment of the present application is to provide an intelligent question-answering data processing system based on a pre-training model, so as to overcome the above technical problems existing in the related art.
For this purpose, the application adopts the following specific technical scheme:
an intelligent question-answering data processing system based on a pre-training model comprises a data collection module, a pre-training model module, a fine-tuning module, a reasoning module, a post-processing module and a user state recognition module.
The data collection module is used for collecting the original data of the intelligent questions and answers and preprocessing the collected original data.
The pre-training model module is used for understanding and processing the questions input by the user by using the pre-training model.
The fine-tuning module is used for judging the adaptability of the pre-training model in a specific field; if the adaptability is poor, the pre-training model is further trained, on the basis of the pre-training model, with a question-answer data set of the specific field to obtain a question-answering model for that field.
The reasoning module is used for predicting answers to new questions through the question-answering model.
The post-processing module is used for processing the answers generated by the reasoning module so that the generated answers are easier to understand and use.
The user state recognition module is used for recognizing the state of the user and performing risk control on the user with the malicious state.
Preferably, the fine-tuning module comprises a tag data acquisition module, a first test module, a second test module and a judging module;
the tag data acquisition module is used for acquiring a pre-training data set and a specific-field data set, and acquiring a quantity of labeled data X and labeled data Y from them respectively;
the first test module is used for substituting the labeled data X back into the pre-training model for testing, obtaining a first test result;
the second test module is used for adding the labeled data X and the labeled data Y to the pre-training data set, training the pre-training model to obtain a preliminary question-answering model, and substituting the labeled data X and the labeled data Y back into the preliminary question-answering model for testing, obtaining a second test result;
the judging module is used for calculating the difference between the second test result and the first test result; if the difference is greater than zero, the pre-training model is judged to adapt poorly to the specific field.
Preferably, the processing of the answers generated by the reasoning module includes text generation, formatting and error checking of the generated answers.
Preferably, the user state recognition module comprises a user data collection module, a model parameter setting module, a parameter learning module, a state inference module, a risk assessment module and a risk control module;
the user data collection module is used for collecting the behavior log of the user and preprocessing the data in the behavior log;
the model parameter setting module is used for randomly initializing parameters of the hidden Markov model of each user, and the parameters of the hidden Markov model comprise a state transition probability matrix, an observation probability matrix and an initial state probability vector;
the parameter learning module is used for optimizing parameters of the hidden Markov model of each user through multiple iterations;
the state inference module is used for calculating the maximum-likelihood path to each state at each time point and, starting from the last time point, backtracking along the maximum-likelihood path to the first time point to obtain the most likely state sequence;
the risk assessment module is used for assessing the risk degree of the user according to the information entropy and the information gain of the most likely state sequence;
the risk control module is used for acquiring the risk degree of the user and controlling the risk of the user.
Preferably, the model parameter setting module comprises a state transition probability module, an observation probability module and an initial state probability module;
the state transition probability module is used for representing the probability of any user transitioning from one state to another, wherein the states of a user comprise normal, suspicious and malicious;
the observation probability module is used for representing the probability of any user exhibiting a certain observed behavior while in a certain state;
the initial state probability module is used for representing the state distribution of any user at the first time point.
Preferably, the parameter learning module comprises an initialization module, an iterative optimization module, a loop module and a stop judgment module;
the initialization module is used for randomly initializing the parameters of each user's hidden Markov model as initial parameters, and giving a tolerance, a parameter set, a termination tolerance, a tolerance attenuation coefficient and a number of cycles;
the iteration optimization module is used for executing a plurality of iterations under each tolerance, randomly changing part of parameters of the hidden Markov model in each iteration, calculating an objective function value of the hidden Markov model after the parameters are changed, and accepting the change of the parameters of the hidden Markov model if the objective function value is improved, otherwise accepting the change of the parameters of the hidden Markov model according to a preset probability;
the loop module is used for reducing the tolerance after the iteration under each tolerance is completed, and returning to the iteration optimization module;
and the stopping judging module is used for stopping iteration when the tolerance value is lower than the ending tolerance value or the change amount of the objective function value is smaller than a preset threshold value, and outputting the current parameter of the hidden Markov model as the optimal parameter.
Preferably, the iterative optimization module comprises a parameter adjustment module and a likelihood function value calculation module;
the parameter adjustment module is used for randomly generating a new parameter set in the neighborhood of the current parameter set;
the likelihood function value calculation module is used for calculating the current objective functions of the parameter set and the new parameter set respectively, obtaining the current objective function value and the new objective function value, and comparing the magnitudes of the current objective function value and the new objective function value.
Preferably, when the tolerance is reduced after the iterations under each tolerance are completed and control returns to the iterative optimization module, the current tolerance is multiplied by the tolerance attenuation coefficient to obtain a new tolerance; whether the new tolerance is smaller than the termination tolerance is then judged, and if so the loop stops, otherwise the new tolerance becomes the current tolerance and control returns to the iterative optimization module.
Preferably, the state inference module comprises a first state module, a recurrence module and a backtracking module;
the first state module is configured to calculate the probability of each state when the first observed behavior occurs at the first time point, according to the initial state probabilities of the hidden Markov model and the observation probability of the given first observed behavior;
the recurrence module is used for calculating, from the second time point onward, the maximum probability of each state under the current observed behavior given the state probabilities at the previous time point;
the backtracking module is used for finding the most probable state at the last time point and backtracking from that state through the most probable state at each time point, obtaining the most likely state sequence.
Preferably, when the risk degree of the user is obtained and the risk control is performed on the user, if the risk degree of the user is higher than a preset risk threshold, the user is manually audited, and the question-answering activities of the user are limited.
Embodiments of the present application include the following beneficial effects:
(1) According to the intelligent question-answering data processing system based on the pre-training model, the question-answering performance of the model on a specific task can be improved by training the pre-training model in a specific field. And by judging the difference between the pre-training data set and the data set in the specific field, namely judging the adaptability of the pre-training model in the specific field, whether the model is to be finely tuned in the specific field can be judged in advance, so that the consumption of calculation resources and time can be reduced, and meanwhile, the model is prevented from being excessively trained under the unnecessary condition.
(2) The application tracks and analyzes the behavior of the user, evaluates the risk degree of the user and performs corresponding risk control. The intelligent question-answering system provides an effective user management mechanism for preventing malicious behaviors.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a functional block diagram of an intelligent question-answering data processing system based on a pre-training model according to an embodiment of the present application.
In the figure:
1. data collection module; 2. pre-training model module; 3. fine-tuning module; 4. reasoning module; 5. post-processing module; 6. user state recognition module.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
For the purpose of further illustrating the various embodiments, the present application provides the accompanying drawings, which are a part of the disclosure of the present application, and which are mainly used to illustrate the embodiments and, together with the description, serve to explain the principles of the embodiments, and with reference to these descriptions, one skilled in the art will recognize other possible implementations and advantages of the present application, wherein elements are not drawn to scale, and like reference numerals are generally used to designate like elements.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the application only and is not intended to be limiting of the application.
According to an embodiment of the application, an intelligent question-answering data processing system based on a pre-training model is provided.
The application will be further described with reference to the accompanying drawings and the specific embodiments. As shown in fig. 1, an intelligent question-answering data processing system based on a pre-training model according to an embodiment of the application includes a data collection module 1, a pre-training model module 2, a fine-tuning module 3, a reasoning module 4, a post-processing module 5 and a user state recognition module 6;
the data collection module 1 is used for collecting original data of intelligent questions and answers and preprocessing the collected original data; such data may be questions entered by the user, online text, social media posts, and the like. The data may need further processing, such as cleaning, formatting or tagging, to make it suitable for input into the model. This also includes text cleansing (removal of irrelevant symbols, punctuation, stop words, etc.), word segmentation, word vector representation, etc.
The pre-training model module 2 is configured to understand and process the questions input by the user by using a pre-trained model (such as BERT, GPT, RoBERTa, etc.); the pre-trained model is trained on text data so as to understand the complexity and context of language. The text data may come from various sources including, but not limited to, publicly available large-scale text data sets, professional-field text data, data captured by web crawlers, etc., or may be data preprocessed by the data collection module 1.
The fine-tuning module 3 is configured to judge the adaptability of the pre-training model in a specific field and, if the adaptability is poor, to further train the pre-training model, on the basis of the pre-training model, with a question-answer data set of the specific field to obtain a question-answering model for that field; fine-tuning can improve the performance of the model on a particular task. The specific-field question-answer data set targets data of the specific field and is used to further train the pre-training model into a field-specific question-answering model.
In a further embodiment, the fine tuning module 3 includes a tag data acquiring module, a first testing module, a second testing module, and a judging module.
The tag data acquisition module is used for acquiring a pre-training data set and a specific-field data set, and acquiring a quantity of labeled data X and labeled data Y from them respectively; the pre-training data set and the specific-field data set stand in a parallel, side-by-side relationship. The labels are the "targets" or "answers" that the model must learn to predict.
The first test module is used for substituting the labeled data X back into the pre-training model for testing, obtaining a first test result;
the second test module is used for adding the labeled data X and the labeled data Y to the pre-training data set, training the pre-training model to obtain a preliminary question-answering model, and substituting the labeled data X and the labeled data Y back into the preliminary question-answering model for testing, obtaining a second test result. The purpose of substituting the labeled data X and Y back into the preliminary question-answering model is to test the adaptability and performance of the fine-tuned preliminary question-answering model in a specific scenario; this test is an application performance test.
The judging module is used for calculating the difference between the second test result and the first test result; if the difference is greater than zero, the pre-training model is judged to adapt poorly to the specific field.
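A minimal sketch of this judgment, assuming the two test results are comparable scalar scores as described; the concrete metric behind the scores is not specified by the application and is left abstract here:

```python
def needs_fine_tuning(first_result: float, second_result: float) -> bool:
    """Judging module: adaptability is poor (so fine-tuning is needed) when
    the difference between the second and first test results exceeds zero."""
    return (second_result - first_result) > 0

# First test: labeled data X substituted back into the pre-trained model.
# Second test: X and Y substituted back into the retrained preliminary model.
print(needs_fine_tuning(first_result=0.12, second_result=0.19))  # True
print(needs_fine_tuning(first_result=0.20, second_result=0.20))  # False
```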
The reasoning module 4 is used for predicting answers to new questions through the question-answering model; this may include selecting the most likely answer, or generating a new answer associated with the question.
In a further embodiment, the processing of the answers generated by the reasoning module 4 includes text generation, formatting and error checking of the generated answers.
The post-processing module 5 is configured to process the answers generated by the reasoning module, and make the generated answers easier to understand and use.
The user state recognition module 6 is configured to recognize a state of a user and perform risk control on the user having a malicious state.
In a further embodiment, the user state recognition module 6 includes a user data collection module, a model parameter setting module, a parameter learning module, a state inference module, a risk assessment module, and a risk control module;
the user data collection module is used for collecting the behavior log of the user and preprocessing the data in the behavior log; the user's behavioral data includes the frequency of account questions, the content of the questions, the quality of the answers, the speed of the answers, the quality of the interactions, etc. And preprocessing the data as required, such as cleaning and word segmentation of text data, normalization of continuous data, and the like. This step is to adapt the data to the subsequent model training.
The model parameter setting module is used for randomly initializing parameters of the hidden Markov model of each user, and the parameters of the hidden Markov model comprise a state transition probability matrix, an observation probability matrix and an initial state probability vector;
the parameter learning module is used for optimizing parameters of the hidden Markov model of each user through multiple iterations; the parameter learning module is used for learning parameters of the hidden Markov model. Once these parameters are learned and optimized, the state inference module uses these parameters to infer a state sequence.
The state inference module is used for calculating the maximum-likelihood path to each state at each time point and, starting from the last time point, backtracking along the maximum-likelihood path to the first time point to obtain the most likely state sequence;
the risk assessment module is used for assessing the risk degree of the user according to the information entropy and the information gain of the most likely state sequence;
it should be noted that, the information entropy is an index for measuring the uncertainty or confusion degree of a system, and users with more dispersed or confused states will have higher information entropy. The information gain is used to measure the information quantity brought by a certain action or event and to judge the change of the uncertainty of the system before and after the event. These two indicators may help us assess the risk level of a user, for example if the status of an account changes frequently, possibly indicating that his risk is high.
The risk control module is used for acquiring the risk degree of the user and controlling the risk of the user.
In this embodiment, the model parameter setting module includes a state transition probability module, an observation probability module, and an initial state probability module;
the state transition probability module is used for representing the probability of any user transitioning from one state to another, wherein the states of a user comprise normal, suspicious and malicious;
the observation probability module is used for representing the probability of any user exhibiting a certain observed behavior while in a certain state;
the initial state probability module is used for representing the state distribution of any user at the first time point.
The state transition probability matrix describes the probability of a user transitioning from one state to another, such as the probability of moving from the normal state to the suspicious state, from the suspicious state to the malicious state, and so on. The observation probability matrix describes the probability of a user exhibiting a given observed behavior in a given state, such as the probability of querying, asking or answering a question in the normal state, the probability of forwarding a question in the suspicious state, and so on. The initial state probability vector describes the state distribution of the account at the first time point: the probability that it is in the normal state, the suspicious state or the malicious state. These parameters build the hidden Markov model, which can be used to predict the future state of the user and to infer the user's historical states.
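These three parameter groups can be initialized randomly per user, as in the following sketch. The three-state and three-behavior alphabets follow the description above; the uniform-random normalization scheme is an assumption:

```python
import random

STATES = ["normal", "suspicious", "malicious"]
BEHAVIORS = ["query question", "answer question", "forward question"]

def random_distribution(n: int, rng: random.Random) -> list[float]:
    """A random probability vector of length n (entries sum to 1)."""
    weights = [rng.random() for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

def init_hmm_params(seed: int) -> dict:
    """Randomly initialize one user's HMM parameters: state transition
    matrix A, observation matrix B, and initial state vector pi."""
    rng = random.Random(seed)
    return {
        "A": [random_distribution(len(STATES), rng) for _ in STATES],
        "B": [random_distribution(len(BEHAVIORS), rng) for _ in STATES],
        "pi": random_distribution(len(STATES), rng),
    }

params = init_hmm_params(seed=42)
print(round(sum(params["pi"]), 10))  # 1.0: a valid probability vector
```

Seeding per user keeps the random initialization reproducible across runs.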
In this embodiment, the parameter learning module includes an initialization module, an iterative optimization module, a loop module, and a stop judgment module;
the initialization module is used for randomly initializing parameters of the hidden Markov model of each user as initial parameters, and giving a tolerance (the tolerance is the initial tolerance and the numerical value is higher), a parameter set (the parameters of the hidden Markov model), a termination tolerance, a tolerance attenuation coefficient and the number of cycles;
the iteration optimization module is used for executing a plurality of iterations under each tolerance, randomly changing part of parameters of the hidden Markov model in each iteration, calculating an objective function value of the hidden Markov model after the parameters are changed, and accepting the change of the parameters of the hidden Markov model if the objective function value is improved, otherwise accepting the change of the parameters of the hidden Markov model according to a preset probability;
the loop module is used for reducing the tolerance after the iteration under each tolerance is completed, and returning to the iteration optimization module;
and the stopping judging module is used for stopping iteration when the tolerance value is lower than the ending tolerance value or the change amount of the objective function value is smaller than a preset threshold value, and outputting the current parameter of the hidden Markov model as the optimal parameter.
In this embodiment, the iterative optimization module includes a parameter adjustment module and a likelihood function value calculation module;
the parameter adjustment module is used for randomly generating a new parameter set in the neighborhood of the current parameter set;
the likelihood function value calculation module is used for calculating the objective function for the current parameter set and for the new parameter set respectively, obtaining the current objective function value and the new objective function value, and comparing the two. The objective function is a likelihood function: if the likelihood function value increases, the fit of the hidden Markov model has improved.
Defining the likelihood function: given a set of observation sequences O and a hidden Markov model HMM, compute the probability P(O | HMM) of generating these observation sequences under the model; this probability is the likelihood function.
Assuming an initial parameter set theta_0, apply the Bayes formula:
P(theta_0 | O) = P(O | theta_0) * P(theta_0) / P(O)
where P(O | theta_0) is the likelihood function, P(theta_0) is the prior distribution of the parameters, and P(O) is the marginal distribution of the observation sequences. The parameter set theta_n is then iteratively updated using an expectation-maximization algorithm such that P(theta_n | O) increases after each iteration until convergence, yielding an optimal hidden Markov model parameter set theta.
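The likelihood P(O | HMM) above can be evaluated with the standard forward algorithm, sketched here over the three user states; the parameter values are illustrative only:

```python
def forward_likelihood(obs, pi, A, B):
    """Forward algorithm: P(O | HMM) for a sequence of observation indices."""
    n = len(pi)
    # alpha[i]: probability of emitting the observations seen so far and
    # ending in state i.
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[j] * A[j][i] for j in range(n)) * B[i][o]
                 for i in range(n)]
    return sum(alpha)

pi = [0.7, 0.2, 0.1]                                        # normal, suspicious, malicious
A = [[0.8, 0.15, 0.05], [0.3, 0.5, 0.2], [0.1, 0.3, 0.6]]   # transitions
B = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]                    # obs 0 = "answer", 1 = "forward"
print(round(forward_likelihood([0], pi, A, B), 10))  # 0.77
print(forward_likelihood([0, 1], pi, A, B))
```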
In this embodiment, after the iterations under each tolerance are completed, the tolerance is reduced before control returns to the iterative optimization module: the current tolerance is multiplied by the tolerance attenuation coefficient to obtain a new tolerance, and whether the new tolerance is smaller than the termination tolerance is judged; if so, the loop stops, otherwise the new tolerance becomes the current tolerance and control returns to the iterative optimization module.
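The tolerance-driven loop above is in effect a simulated-annealing search. The following self-contained sketch uses a toy objective in place of the HMM likelihood, and assumes the common acceptance probability exp(delta / tolerance) for the "preset probability" of accepting a worse change; both are assumptions, not details fixed by the application:

```python
import math
import random

def anneal(objective, params, tolerance=1.0, end_tolerance=1e-3,
           decay=0.9, iters_per_tolerance=50, step=0.1, seed=0):
    """Maximize `objective`: randomly perturb the parameter set, accept
    improvements outright, accept worse changes with probability
    exp(delta / tolerance), and multiply the tolerance by `decay`
    until it falls below `end_tolerance`."""
    rng = random.Random(seed)
    score = objective(params)
    while tolerance >= end_tolerance:
        for _ in range(iters_per_tolerance):
            # Randomly change the parameters within a small neighborhood.
            candidate = [p + rng.uniform(-step, step) for p in params]
            cand_score = objective(candidate)
            delta = cand_score - score
            if delta > 0 or rng.random() < math.exp(delta / tolerance):
                params, score = candidate, cand_score
        tolerance *= decay  # reduce the tolerance and iterate again
    return params, score

# Toy objective standing in for the likelihood; its peak is at (0.3, 0.7).
objective = lambda p: -((p[0] - 0.3) ** 2 + (p[1] - 0.7) ** 2)
best_params, best_score = anneal(objective, [0.0, 0.0])
print(best_score)  # close to 0, the objective's maximum
```

Accepting occasional worse changes while the tolerance is high lets the search escape local optima; as the tolerance decays the loop becomes effectively greedy.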
In this embodiment, the state inference module includes a first state module, a recurrence module, and a backtracking module;
wherein the first state module is configured to calculate the probability of each state when the first observed behavior occurs at the first time point, according to the initial state probabilities of the hidden Markov model and the observation probability of the given first observed behavior (e.g., "answer question");
the recurrence module is used for, starting from the second time point, calculating the maximum probability of each state under the current observation behavior, given the state probabilities at the previous time point; the maximum probability refers to the maximum value of the product of the previous state probability, the current state transition probability, and the current observation probability.
It should be noted that the user has three states: "normal", "suspicious" and "malicious". If the corresponding initial state probabilities are pi = [pi_1, pi_2, pi_3] and the observation probabilities for "answer a question" are B = [b_1, b_2, b_3], then the probability of each state at the first time point can be calculated by the following formulas:
P(normal | answer a question) = pi_1 × b_1;
P(suspicious | answer a question) = pi_2 × b_2;
P(malicious | answer a question) = pi_3 × b_3;
at the second time point, it is assumed that the user's behavior is "forward a question". The probability of each state at this time point must be updated, which depends on the probability of each state at the previous time point, the current state transition probability, and the current observation probability.
For example, to calculate the maximum probability of the "normal" state at the second time point, three possible transition paths from the previous time point need to be considered: "normal" to "normal", "suspicious" to "normal", and "malicious" to "normal". The corresponding probabilities are respectively:
P(normal | answer a question) × P(normal | normal) × P(forward a question | normal);
P(suspicious | answer a question) × P(normal | suspicious) × P(forward a question | normal);
P(malicious | answer a question) × P(normal | malicious) × P(forward a question | normal);
the maximum of these three probabilities is chosen as the probability of the "normal" state at the second time point. Similarly, the probabilities of the "suspicious" and "malicious" states at the second time point can be calculated. During the calculation, the state from which the maximum probability of each state was transferred is also recorded, so that the optimal state sequence can be found in the backtracking step. The goal is to find the most likely path to each state, and thus the most likely state sequence over the whole sequence. This is an implementation of the dynamic programming idea: through recurrence and backtracking, the globally optimal state sequence can be found efficiently, not just the locally optimal state at each time point.
The backtracking module is used for finding the most probable state at the last time point and then, starting from that state, tracing back the most probable state at each earlier time point, thereby obtaining the most probable state sequence.
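Taken together, the first state module, recurrence module, and backtracking module implement the Viterbi algorithm. A minimal Python sketch follows; the transition and observation probabilities are assumed toy values, with "answer a question" encoded as observation 0 and "forward a question" as observation 1:

```python
import numpy as np

STATES = ["normal", "suspicious", "malicious"]

def viterbi(pi, A, B, obs):
    """Most likely state sequence for an observation sequence `obs`."""
    T, S = len(obs), len(pi)
    delta = np.zeros((T, S))            # delta[t, j]: best path probability to j
    back = np.zeros((T, S), dtype=int)  # back[t, j]: best predecessor of j
    delta[0] = pi * B[:, obs[0]]                 # first state module
    for t in range(1, T):                        # recurrence module
        trans = delta[t - 1][:, None] * A        # trans[i, j] = delta[t-1, i] * P(j | i)
        back[t] = trans.argmax(axis=0)           # record where each max came from
        delta[t] = trans.max(axis=0) * B[:, obs[t]]
    path = [int(delta[-1].argmax())]             # backtracking module
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return [STATES[i] for i in reversed(path)]

# Assumed toy parameters for the three user states.
pi = np.array([0.7, 0.2, 0.1])
A = np.array([[0.8, 0.15, 0.05],
              [0.3, 0.5, 0.2],
              [0.1, 0.3, 0.6]])
B = np.array([[0.9, 0.1],
              [0.5, 0.5],
              [0.2, 0.8]])
sequence = viterbi(pi, A, B, [0, 1, 0])  # answer, forward, answer
```

Recording `back[t, j]` during the recurrence is exactly the bookkeeping the description mentions: without it, the backtracking step could not recover which path produced each maximum.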
In this embodiment, when the risk degree of the user is obtained and risk control is performed, if the risk degree is higher than a preset risk threshold, the user is manually audited and the user's question-answering activity is limited. A rule is also adopted: if a user exhibits the malicious state two or more times within a period of time, the user is considered a malicious account and risk control is applied.
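The two risk-control rules above can be sketched as a simple decision function; the threshold value, the malicious-state count limit, and the action labels are assumed for illustration:

```python
def control_decision(risk_degree, state_sequence,
                     risk_threshold=0.8, malicious_limit=2):
    """Apply the two risk-control rules; threshold and limit are assumed."""
    # Rule 1: risk degree above the preset threshold -> manual audit
    # and limitation of the user's question-answering activity.
    if risk_degree > risk_threshold:
        return "manual_audit_and_limit"
    # Rule 2: two or more malicious states in the window -> treat the
    # user as a malicious account and apply risk control.
    if state_sequence.count("malicious") >= malicious_limit:
        return "malicious_account_control"
    return "no_action"
```

The count-based rule acts on the most likely state sequence inferred by the backtracking step, so a single noisy "malicious" inference does not by itself trigger account-level control.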
In summary, in the intelligent question-answering data processing system based on a pre-training model, training the pre-training model on a specific field improves the model's question-answering performance on tasks in that field. By judging the difference between the pre-training data set and the field-specific data set, that is, by judging the adaptability of the pre-training model to the specific field, whether the model needs to be fine-tuned for that field can be decided in advance, which reduces the consumption of computing resources and time and avoids training the model unnecessarily. The application also tracks and analyzes user behavior, evaluates the user's risk degree, and performs corresponding risk control, providing the intelligent question-answering system with an effective user management mechanism against malicious behavior.
Those of ordinary skill in the art will appreciate that all or some of the steps of the methods, and the functional modules/units in the systems and devices disclosed above, may be implemented as software, firmware, hardware, or suitable combinations thereof.
The terms "first," "second," "third," "fourth," and the like in the description of the application and in the above figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The foregoing description of the preferred embodiments of the application is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the application.

Claims (10)

1. An intelligent question-answering data processing system based on a pre-training model, characterized by comprising a data collection module, a pre-training model module, a fine tuning module, a reasoning module, a post-processing module and a user state recognition module;
the data collection module is used for collecting the original data of the intelligent questions and answers and preprocessing the collected original data;
the pre-training model module is used for understanding and processing the problem input by the user by using the pre-training model;
the fine tuning module is used for judging the adaptability of the pre-training model in a specific field, and if the adaptability is poor, training the pre-training model by using a question-answer data set in the specific field on the basis of the pre-training model to obtain a question-answer model in the specific field;
the reasoning module is used for predicting new questions through the question-answering model and generating answers;
the post-processing module is used for processing the answers generated by the reasoning module to make the generated answers easier to understand and use;
the user state recognition module is used for recognizing the state of the user and performing risk control on the user with the malicious state.
2. The intelligent question-answering data processing system based on a pre-training model according to claim 1, wherein the fine tuning module comprises a tag data acquisition module, a first test module, a second test module and a judgment module;
the tag data acquisition module is used for acquiring a pre-training data set and a specific field data set, and respectively acquiring a plurality of tagged data X and tagged data Y from the pre-training data set and the specific field data set;
the first test module is used for substituting the tagged data X back into the pre-training model for testing, so as to obtain a first test result;
the second test module is used for adding the tagged data X and the tagged data Y to the pre-training data set, training the pre-training model to obtain a preliminary question-answering model, and substituting the tagged data X and the tagged data Y back into the preliminary question-answering model for testing, so as to obtain a second test result;
the judging module is used for calculating the difference between the second test result and the first test result, and judging that the pre-training model has poor adaptability in the specific field if the difference is larger than zero.
3. A pre-trained model based intelligent question-answering data processing system according to claim 1 or claim 2, wherein the processing of answers generated by the reasoning module includes text generation, formatting and error checking of the generated answers.
4. The intelligent question-answering data processing system based on a pre-training model according to claim 1, wherein the user state recognition module comprises a user data collection module, a model parameter setting module, a parameter learning module, a state inference module, a risk assessment module and a risk control module;
the user data collection module is used for collecting the behavior log of the user and preprocessing the data in the behavior log;
the model parameter setting module is used for randomly initializing parameters of the hidden Markov model of each user, and the parameters of the hidden Markov model comprise a state transition probability matrix, an observation probability matrix and an initial state probability vector;
the parameter learning module is used for optimizing parameters of the hidden Markov model of each user through multiple iterations;
the state inference module is used for calculating the maximum-likelihood path to each state at each time point and, starting from the last time point, backtracking to the first time point along the maximum-likelihood path to obtain the most likely state sequence;
the risk assessment module is used for assessing the risk degree of the user according to the information entropy and the information gain of the most likely state sequence;
the risk control module is used for acquiring the risk degree of the user and controlling the risk of the user.
5. The intelligent question-answering data processing system based on a pre-training model according to claim 4, wherein the model parameter setting module comprises a state transition probability module, an observation probability module and an initial state probability module;
the state transition probability module is used for representing the probability of any user transitioning from one state to another, wherein the states of a user comprise normal, suspicious and malicious;
the observation probability module is used for representing the probability that any user exhibits a certain observation behavior in a certain state;
the initial state probability module is used for representing the state distribution of any user at the first time point.
6. The intelligent question-answering data processing system based on the pre-training model according to claim 5, wherein the parameter learning module comprises an initialization module, an iterative optimization module, a circulation module and a stop judgment module;
the initialization module is used for randomly initializing parameters of the hidden Markov model of each user as initial parameters, and giving tolerance, parameter set, termination tolerance, tolerance attenuation coefficient and cycle number;
the iteration optimization module is used for executing a plurality of iterations under each tolerance, randomly changing part of parameters of the hidden Markov model in each iteration, calculating an objective function value of the hidden Markov model after the parameters are changed, and accepting the change of the parameters of the hidden Markov model if the objective function value is improved, otherwise accepting the change of the parameters of the hidden Markov model according to a preset probability;
the loop module is used for reducing the tolerance after the iteration under each tolerance is completed, and returning to the iteration optimization module;
and the stopping judging module is used for stopping iteration when the tolerance value is lower than the ending tolerance value or the change amount of the objective function value is smaller than a preset threshold value, and outputting the current parameter of the hidden Markov model as the optimal parameter.
7. The intelligent question-answering data processing system based on the pre-training model according to claim 6, wherein the iterative optimization module comprises a parameter adjustment module and a likelihood function value calculation module;
the parameter adjustment module is used for randomly generating a new parameter set in the current neighborhood of the parameter set;
the likelihood function value calculation module is used for calculating the current objective functions of the parameter set and the new parameter set respectively, obtaining the current objective function value and the new objective function value, and comparing the magnitudes of the current objective function value and the new objective function value.
8. The intelligent question-answering data processing system based on the pre-training model according to claim 7, wherein after the iterations under each tolerance are completed, the tolerance is reduced: on returning to the iterative optimization module, the current tolerance is multiplied by the tolerance attenuation coefficient to obtain a new tolerance, and whether the new tolerance is smaller than the termination tolerance is judged; if so, the loop stops, otherwise the new tolerance is output as the current tolerance and control returns to the iterative optimization module.
9. The intelligent question-answering data processing system based on a pre-training model according to claim 8, wherein the state inference module comprises a first state module, a recurrence module and a backtracking module;
the first state module is configured to calculate a probability of each state when the first observation behavior occurs at a first time point according to an initial state probability of the hidden markov model and an observation probability of a given first observation behavior;
the recurrence module is used for, starting from the second time point, calculating the maximum probability of each state under the current observation behavior, given the state probabilities at the previous time point;
the backtracking module is used for finding the most probable state at the last time point and then, starting from that state, tracing back the most probable state at each earlier time point, thereby obtaining the most probable state sequence.
10. The intelligent question-answering data processing system based on the pre-training model according to claim 1, wherein, when the risk degree of the user is obtained and risk control is performed on the user, if the risk degree of the user is higher than a preset risk threshold, the user is manually audited and the question-answering activities of the user are limited.
CN202310995752.5A 2023-08-09 2023-08-09 Intelligent question-answering data processing system based on pre-training model Pending CN116910223A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310995752.5A CN116910223A (en) 2023-08-09 2023-08-09 Intelligent question-answering data processing system based on pre-training model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310995752.5A CN116910223A (en) 2023-08-09 2023-08-09 Intelligent question-answering data processing system based on pre-training model

Publications (1)

Publication Number Publication Date
CN116910223A true CN116910223A (en) 2023-10-20

Family

ID=88353132

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310995752.5A Pending CN116910223A (en) 2023-08-09 2023-08-09 Intelligent question-answering data processing system based on pre-training model

Country Status (1)

Country Link
CN (1) CN116910223A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117235233A (en) * 2023-10-24 2023-12-15 之江实验室 Automatic financial report question-answering method and device based on large model


Similar Documents

Publication Publication Date Title
CN110431566B (en) Probability-based director
CN111523119B (en) Vulnerability detection method and device, electronic equipment and computer readable storage medium
CN106855873A (en) Network question-answering system, method and computer readable recording medium
CN111027292B (en) Method and system for generating limited sampling text sequence
CN109191276B (en) P2P network lending institution risk assessment method based on reinforcement learning
Yang et al. Margin optimization based pruning for random forest
CN116910223A (en) Intelligent question-answering data processing system based on pre-training model
CN111046178B (en) Text sequence generation method and system
US11043208B1 (en) Systems and methods for mixed setting training for slot filling machine learning tasks in a machine learning task-oriented dialogue system
CN115098789B (en) Multidimensional interest fusion recommendation method and device based on neural network and related equipment
Kasaei et al. Coping with context change in open-ended object recognition without explicit context information
CN111858898A (en) Text processing method and device based on artificial intelligence and electronic equipment
CN114564964A (en) Unknown intention detection method based on k-nearest neighbor comparison learning
CN114611631A (en) Method, system, device and medium for fast training a model from a partial training set
EP4030355A1 (en) Neural reasoning path retrieval for multi-hop text comprehension
Parker et al. Named entity recognition through deep representation learning and weak supervision
CN113705207A (en) Grammar error recognition method and device
Soisoonthorn et al. Spelling Check: A New Cognition-Inspired Sequence Learning Memory
CN116306606A (en) Financial contract term extraction method and system based on incremental learning
CN115422556A (en) Vulnerability exploitation probability prediction method, system, equipment and storage medium
CN115757775A (en) Text implication-based triggerless text event detection method and system
CN116860529A (en) Fault positioning method and device
CN112052386A (en) Information recommendation method and device and storage medium
US11934794B1 (en) Systems and methods for algorithmically orchestrating conversational dialogue transitions within an automated conversational system
CN117573985B (en) Information pushing method and system applied to intelligent online education system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination