WO2020151310A1 - Text generation method and device, computer apparatus, and medium - Google Patents

Text generation method and device, computer apparatus, and medium

Info

Publication number
WO2020151310A1
Authority
WO
WIPO (PCT)
Prior art keywords
text
model
initial
discriminator
generator
Prior art date
Application number
PCT/CN2019/116941
Other languages
French (fr)
Chinese (zh)
Inventor
毕野
黄博
吴振宇
王建明
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020151310A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/332 Query formulation
    • G06F 16/35 Clustering; Classification
    • G06F 17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology

Definitions

  • This application belongs to the field of model construction, and more specifically, relates to a text generation method, device, computer equipment and medium.
  • LSTM Long Short-Term Memory Networks
  • RNN Recurrent/Recursive Neural Network
  • a common way to train an RNN is maximum likelihood estimation: given the first t-1 words, the next word is predicted by maximizing the log likelihood of the t-th word.
  • a disadvantage of using an RNN is that it produces a steadily accumulating bias: a sentence is generated word by word, with each next word conditioned on the previously generated words, so an early error propagates forward and the bias grows as the sequence length increases.
  • an RNN also cannot improve itself. For some RNN applications, a loss function to minimize can be added to improve the model, but for a text generation model the input data are discrete, so there is no directly usable loss function and no suitable way to guide the text generation model to improve itself toward near-real output.
  • the embodiments of the present application provide a text generation method, device, computer equipment, and storage medium to solve the current problem of low efficiency of text generation.
  • a text generation method, including:
  • Generate test text based on the generator model, input the test text into the discriminator model to obtain the reward value of the test text, calculate the gradient of the generator model according to the reward value, and update the generator model according to the gradient;
  • Obtain the text to be recognized, input the text to be recognized into the text generation model, and generate the target text based on the text generation model.
  • a text generating device includes:
  • a text positive sample acquisition module, used to acquire a real text data set and acquire positive text samples from the real text data set;
  • a generator model acquisition module, used to establish an initial generator model, input the positive text samples into the initial generator model for pre-training to obtain a generator model, and generate first negative text samples according to the generator model;
  • a discriminator model acquisition module, used to establish an initial discriminator model, and input the positive text samples and the first negative text samples into the initial discriminator model for pre-training to obtain a discriminator model;
  • a generator model update module, configured to generate test text based on the generator model, input the test text into the discriminator model to obtain the reward value of the test text, calculate the gradient of the generator model according to the reward value, and update the generator model according to the gradient;
  • a discriminator model update module, used to generate second negative text samples according to the updated generator model, input the second negative text samples and the positive text samples into the discriminator model, and update the discriminator model according to minimized cross entropy;
  • a text generation model acquisition module, configured to alternately update the generator model and the discriminator model, and if the output of the discriminator model converges, obtain a text generation model according to the generator model at the time of convergence;
  • a target text generation module, used to obtain the text to be recognized, input the text to be recognized into the text generation model, and generate the target text based on the text generation model.
  • a computer device includes a memory, a processor, and computer readable instructions that are stored in the memory and can run on the processor, and the processor implements the above text generation method when the computer readable instructions are executed.
  • One or more readable storage media storing computer readable instructions, when the computer readable instructions are executed by one or more processors, the one or more processors execute the above text generation method.
  • FIG. 1 is a schematic diagram of an application environment of a text generation method in an embodiment of the present application
  • Figure 2 is a flowchart of a text generation method in an embodiment of the present application
  • FIG. 3 is another flowchart of a text generation method in an embodiment of the present application.
  • FIG. 4 is another flowchart of a text generation method in an embodiment of the present application.
  • FIG. 5 is another flowchart of a text generation method in an embodiment of the present application.
  • FIG. 6 is another flowchart of a text generation method in an embodiment of the present application.
  • FIG. 7 is a functional block diagram of a text generating device in an embodiment of the present application.
  • FIG. 8 is a functional block diagram of the generator model acquisition module in the text generation device in an embodiment of the present application.
  • FIG. 9 is a functional block diagram of the discriminator model acquisition module in the text generation device in an embodiment of the present application.
  • FIG. 10 is a schematic diagram of a computer device in an embodiment of the present application.
  • the text generation method provided in this application can be applied in the application environment shown in FIG. 1, where a client communicates with a server through a network. The server obtains a real text data set through the client and obtains positive text samples from the real text database; it then establishes an initial generator model based on the input of the client, inputs the positive text samples into the initial generator model for pre-training to obtain the generator model, and generates first negative text samples according to the generator model. Next, an initial discriminator model is established according to the input of the client, and the positive text samples and the first negative text samples are input into the initial discriminator model for pre-training to obtain the discriminator model. The server then generates test text based on the generator model, inputs the test text into the discriminator model to obtain the reward value of the test text, calculates the gradient of the generator model according to the reward value, and updates the generator model according to the gradient. The server generates second negative text samples according to the updated generator model, inputs the second negative text samples and the positive text samples into the discriminator model, and updates the discriminator model according to minimized cross entropy. The generator model and the discriminator model are updated alternately; if the output of the discriminator model converges, the text generation model is obtained according to the generator model at the time of convergence. Finally, the text to be recognized is obtained and input into the text generation model, and the target text is generated based on the text generation model and returned to the client.
  • the client can be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
  • the server can be implemented with an independent server or a server cluster composed of multiple servers.
  • a text generation method is provided.
  • the method is applied to the server in FIG. 1 as an example for description, which may specifically include the following steps:
  • S10 Obtain a real text data set, and obtain a positive text sample from the real text data set.
  • the real text data set refers to the original text data set corresponding to the text that is expected to be finally output by the text generation model.
  • for example, if the text generation model is expected to output poems, the real text data set is a data set composed of various poems.
  • the text in this embodiment can be a poem, an answer to a question or a dialogue, etc.
  • This embodiment uses the final output poem as an example for description.
  • the text positive sample refers to multiple samples extracted from the real text data set, for example, multiple poems extracted from the real text data set.
  • Specifically, a large number of poem data sets can be collected in advance and stored in the database of the server as the real text data sets.
  • the server randomly obtains real text data sets from the database and extracts some poems (samples) from the real text data sets as the positive text samples.
  • step S10 may specifically include the following steps:
  • S11 Select N text data from the real text data set, where N is a positive integer.
  • the server selects N samples from the database as positive text samples, where N is a positive integer. It can be understood that the larger N is, the better the training effect generally is.
  • which samples are specifically selected as positive text samples can be obtained through the input of the client, for example, the client inputs the sample number, and then the server selects the corresponding sample from the database according to the sample number input by the client.
  • S12 Convert the N text data into a vector form using the word vector model, and use the N text data converted into the vector form as a positive text sample.
  • the word vector model is the word2vec model
  • the word2vec model includes two neural network structures, namely CBOW and Skip-gram.
  • the server can input the poem (real text data set) into the word2vec model for training.
  • the word2vec model can be used to map each entry of the poem to a vector.
  • the word2vec algorithm can be used to transform term i into a vector x_i.
  • Specifically, the server converts the selected N text data into vector form through the word2vec model, and uses the N vectorized text data as the positive text samples.
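  • As a concrete illustration of steps S11-S12, the following is a minimal Python sketch using the gensim library's Word2Vec implementation; the corpus, tokenization, and vector size are illustrative assumptions rather than details from the application:

```python
# A minimal sketch of step S12, assuming gensim's Word2Vec; the tokenized
# poems, vector_size, and other hyperparameters below are hypothetical.
from gensim.models import Word2Vec

poems = [["床", "前", "明", "月", "光"],
         ["疑", "是", "地", "上", "霜"]]  # stand-in tokenized real text data

w2v = Word2Vec(sentences=poems, vector_size=64, window=5, min_count=1, sg=1)

# Convert the N selected text data into vector form: one vector per term.
positive_samples = [[w2v.wv[token] for token in poem] for poem in poems]
```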
  • S20 Establish an initial generator model, input the positive text samples into the initial generator model for pre-training, obtain the generator model, and generate the first negative text sample according to the generator model.
  • the initial generator model and subsequent initial discriminator models are all models constructed based on neural networks.
  • the initial generator model can be established using a recurrent neural network (RNN); in order to speed up the training of the neural network and reduce the amount of calculation, the initial discriminator model can be established using a convolutional neural network (CNN).
  • the establishment of the initial generator model and the initial discriminator model can also use other neural networks, which are not specifically limited here.
  • the initial generator model is a recurrent neural network and the initial discriminator model is a convolutional neural network.
  • the parameters of the RNN are randomly selected to establish the initial generator model.
  • the positive text samples obtained in step S10 are input into the initial generator model for pre-training, and the generator model is obtained after pre-training.
  • the server can also select additional sample data from the real text data set and input it into the initial generator model for pre-training.
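  • To make the pre-training step concrete, the following is a minimal PyTorch sketch of maximum-likelihood pre-training for a recurrent generator; the GRU architecture, vocabulary size, and random stand-in batch are illustrative assumptions, not the application's exact model:

```python
# A minimal sketch of step S20: pre-train an RNN generator by maximum
# likelihood; all dimensions and data are hypothetical.
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)  # one RNN variant
        self.out = nn.Linear(hidden_dim, vocab_size)              # plays the role of c + V h_t

    def forward(self, x):
        h, _ = self.rnn(self.emb(x))
        return self.out(h)  # logits; a softmax over them gives the next-word distribution

gen = Generator()
opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

batch = torch.randint(0, 1000, (32, 20))     # stand-in for vectorized positive text samples
logits = gen(batch[:, :-1])                  # predict word t from words 1..t-1
loss = loss_fn(logits.reshape(-1, 1000), batch[:, 1:].reshape(-1))
opt.zero_grad(); loss.backward(); opt.step()
```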
  • S30: Establish an initial discriminator model, and input the positive text samples and the first negative text samples into the initial discriminator model for pre-training to obtain the discriminator model.
  • the server randomly selects the parameters of the CNN to establish an initial discriminator model.
  • the obtained positive text samples and the first negative text samples are respectively labeled.
  • the positive sample of the text may be labeled as 1
  • the negative sample of the first text may be labeled as 0.
  • the labeled positive text samples and the first negative text samples are input into the initial discriminator model for pre-training to obtain the discriminator model.
  • a CNN is used to build the discriminator model because an appropriate pooling layer can be set in the CNN; the pooling operation can prevent the discriminator model from overfitting the data, speed up discriminator model training, and reduce the amount of calculation.
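  • The following is a minimal PyTorch sketch of such a CNN discriminator with max pooling; the single kernel size, filter count, and dimensions are illustrative assumptions:

```python
# A minimal sketch of the CNN discriminator of step S30 with max-over-time
# pooling; all hyperparameters are hypothetical.
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=64, num_filters=100):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, num_filters, kernel_size=3)
        self.fc = nn.Linear(num_filters, 1)

    def forward(self, x):
        e = self.emb(x).transpose(1, 2)        # (batch, emb_dim, seq_len)
        c = torch.relu(self.conv(e))           # convolution features c_i
        pooled = c.max(dim=2).values           # max pooling over the features
        return torch.sigmoid(self.fc(pooled))  # probability that the sample is real
```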
  • S40 Generate a test text based on the generator model, input the test text into the discriminator model to obtain the reward value of the test text, calculate the gradient of the generator model according to the reward value, and update the generator model according to the gradient.
  • the reward value of the test text refers to the value output by the discriminator model.
  • the server uses the generator model to generate the test text, then inputs the test text into the discriminator model, and obtains the value output by the discriminator model as the reward value.
  • the generator model here uses the policy gradient from reinforcement learning (RL): when the discriminator model's output value for the test text is relatively high, the probability of the corresponding action of the RNN in the generator model is increased; when the output value is relatively low, that probability is reduced.
  • RL Policy Gradient in Reinforcement Learning
  • whether the output value of the discriminator model is "high" or "low" is a relative concept: the thresholds differ across training stages and can be preset according to experience. For example, at the beginning of training, when the output of the generator model is still relatively poor, a discriminator output higher than 0.3 can be set as a relatively high value and one lower than 0.2 as a relatively low value; in the later stage of training, an output higher than 0.4 can be set as a relatively high value and one lower than 0.3 as a relatively low value.
  • the policy gradient of the generator is calculated according to the reward value of the test text, and finally the generator model is updated with the calculated policy gradient.
  • the policy gradient can be written as
  $\nabla_\theta J(\theta) = \sum_{t} \mathbb{E}_{Y_{1:t-1} \sim G_\theta} \big[ \sum_{y_t} \nabla_\theta G_\theta(y_t \mid Y_{1:t-1}) \cdot Q_{D_\phi}^{G_\theta}(Y_{1:t-1}, y_t) \big]$
  where $J(\theta)$ refers to the objective function of the generator model, $E$ refers to the expected value, $G_\theta$ refers to the generator model, $Y_{1:t-1} \sim G_\theta$ means that the text $Y$ generated by the generator model obeys the probability distribution $G_\theta$, $G_\theta(y_t \mid Y_{1:t-1})$ refers to the probability that $y_t$ appears after $Y_{1:t-1}$ under the generator model, $D_\phi$ refers to the discriminator model, and $Q_{D_\phi}^{G_\theta}$ refers to the reward that text generated by the generator model $G_\theta$ receives from the discriminator model $D_\phi$.
  • the expectation in the above gradient can be approximated by sampling, after which the parameter $\theta$ of the generator model $G_\theta$ is updated as
  $\theta \leftarrow \theta + \alpha_h \nabla_\theta J(\theta)$
  where $\alpha_h$ refers to the learning rate of the hidden layer.
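  • A minimal sketch of this policy-gradient update, reusing the Generator and Discriminator sketches above, is given below; for brevity it rewards the whole sampled sequence with the discriminator output rather than the per-step Monte Carlo reward described later under S41-S44, and the start token and sampling loop are illustrative assumptions:

```python
# A minimal REINFORCE-style sketch of step S40: sample text from G, score it
# with D as the reward, and raise the probability of high-reward actions.
import torch

def policy_gradient_step(gen, disc, opt, batch_size=32, seq_len=20, bos=0):
    tokens = torch.full((batch_size, 1), bos, dtype=torch.long)  # hypothetical start token
    log_probs = []
    for _ in range(seq_len):
        logits = gen(tokens)[:, -1, :]             # next-word distribution G_theta(y_t | Y_1:t-1)
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()                     # the RNN's "action": pick the next word
        log_probs.append(dist.log_prob(action))
        tokens = torch.cat([tokens, action.unsqueeze(1)], dim=1)
    log_probs = torch.stack(log_probs, dim=1)      # (batch, seq_len)
    reward = disc(tokens[:, 1:]).squeeze(1).detach()  # D_phi's output is the reward value
    loss = -(log_probs.sum(dim=1) * reward).mean()    # gradient ascent on expected reward
    opt.zero_grad(); loss.backward(); opt.step()
```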
  • S50 Generate a second negative text sample according to the updated generator model, input the second negative text sample and the positive text sample into the discriminator model, and update the discriminator model according to the minimized cross entropy.
  • Specifically, the server uses the updated generator model to generate some texts as the second negative text samples, then labels the second negative text samples and the positive text samples respectively and inputs them into the discriminator model for training.
  • the second text negative sample is labeled as 0
  • the text positive sample is labeled as 1.
  • the positive text samples here can be the same samples used in the previous training, or other sample data can be extracted from the real text data set and used as the positive text samples.
  • the purpose of training the discriminator model is that when the input is real text data the output value is as close to 1 as possible, and when the input is text generated by the generator the output value is as close to 0 as possible, so that the model outputs an accurate value for an arbitrary sample. Specifically, the discriminator parameters can be obtained by minimizing the following cross entropy:
  $\min_\phi \; -\mathbb{E}_{Y \sim p_{data}}[\log D_\phi(Y)] - \mathbb{E}_{Y \sim G_\theta}[\log(1 - D_\phi(Y))]$
  where the discriminator $D_\phi(Y)$ returns the probability that the sample $Y$ belongs to the real samples, a number in $[0,1]$; $Y \sim p_{data}$ indicates that $Y$ obeys the probability distribution $p_{data}$, the distribution obeyed by the real text data set; $Y \sim G_\theta$ means that $Y$ obeys the probability distribution $G_\theta$; and $E$ denotes the expected value. Minimizing this cross entropy drives $D_\phi$ to assign a probability as large as possible to real data and a probability as small as possible to generated data.
  • In this way, the parameters of the discriminator model, and thus the discriminator model itself, are updated.
  • when the discriminator model is updated, the generator model is held fixed; the discriminator model can be updated multiple times, with the number of updates set according to the actual situation and not specifically limited here.
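  • A minimal sketch of this discriminator update, reusing the Discriminator sketch above, might look as follows; the labels 1 for positive text samples and 0 for the second negative text samples follow the text:

```python
# A minimal sketch of step S50: update D_phi by minimizing cross entropy over
# real (label 1) and generated (label 0) samples, with G held fixed.
import torch
import torch.nn as nn

bce = nn.BCELoss()

def discriminator_step(disc, opt, real_batch, fake_batch):
    real_loss = bce(disc(real_batch), torch.ones(real_batch.size(0), 1))
    fake_loss = bce(disc(fake_batch), torch.zeros(fake_batch.size(0), 1))
    loss = real_loss + fake_loss   # -E[log D(Y)] - E[log(1 - D(G))]
    opt.zero_grad(); loss.backward(); opt.step()
```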
  • S60 alternately update the generator model and the discriminator model. If the output of the discriminator model converges, the text generation model is obtained according to the generator model at the time of convergence.
  • the server alternately updates the generator model and the discriminator model, that is, when the discriminator model does not converge, the generator model and the discriminator model are repeatedly updated, so that the generator model and the discriminator model continue to fight against training.
  • the generator model is updated first, and the discriminator model remains unchanged; then the generator model is kept unchanged, and the discriminator model is updated. That is, let the parameters of the discriminator model be fixed, train the generator model; then let the parameters of the generator model be fixed, train the discriminator model; repeat this process until the output of the discriminator model converges. If the output of the discriminator model converges, the text generation model is obtained according to the generator model at the time of convergence.
  • output convergence means that the value of the discriminator's output for a given sample (positive or negative) is close to 0.5; the discriminator is then considered unable to distinguish positive from negative samples, the server determines that the output of the discriminator has converged, and the final text generation model is obtained from the generator model at the time of convergence.
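  • Putting the pieces together, the alternating schedule of step S60 can be sketched as below; sample_batch is a hypothetical helper that draws token sequences from the generator, and the epoch count and convergence tolerance are illustrative assumptions:

```python
# A minimal sketch of step S60: alternate G and D updates until D's output
# hovers near 0.5, i.e. it can no longer tell positive from negative samples.
import torch

def adversarial_train(gen, disc, gen_opt, disc_opt, real_loader, epochs=50, tol=0.01):
    for _ in range(epochs):
        for real_batch in real_loader:
            policy_gradient_step(gen, disc, gen_opt)            # D fixed, update G
            with torch.no_grad():
                fake_batch = sample_batch(gen, real_batch.size(0))  # hypothetical sampler
            discriminator_step(disc, disc_opt, real_batch, fake_batch)  # G fixed, update D
        if abs(disc(real_batch).mean().item() - 0.5) < tol:     # convergence check
            break
    return gen  # the text generation model is the generator at convergence
```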
  • S70 Obtain the text to be recognized, input the text to be recognized into the text generation model, and generate the target text based on the text generation model.
  • the text to be recognized is the input of the text generation model
  • the target text is the output of the text generation model.
  • the text to be recognized and the target text correspond to the real text data set; that is, if a data set of poetry is used to train the text generation model, the text to be recognized and the target text corresponding to the text generation model are also poems, and if dialogues are used to train the text generation model, the text to be recognized and the target text are also dialogues.
  • the to-be-recognized text and the target text may also be answers to questions, speech scripts, or short essays.
  • the server obtains the to-be-recognized text input by the user through the client, and then inputs the to-be-recognized text into the text generation model, the text generation model generates the target text, and the server then outputs the target text to the client.
  • For example, the server obtains the preceding text of a dialog input by the user through the client, such as "How is the weather today?"; the server then inputs this preceding text into the text generation model, which generates the corresponding following text as the target text, such as "Today's weather is very good!" or "According to the weather forecast, it will rain today.", so as to form a corresponding dialogue; finally, the server outputs the target text to the client.
  • In this embodiment, positive text samples are obtained from the real text data set; an initial generator model is established and pre-trained on the positive text samples to obtain the generator model, and first negative text samples are generated according to the generator model; an initial discriminator model is established, and the positive text samples and the first negative text samples are input into it for pre-training to obtain the discriminator model; test text is generated based on the generator model and input into the discriminator model to obtain its reward value, the gradient of the generator model is calculated according to the reward value, and the generator model is updated according to the gradient; second negative text samples are generated according to the updated generator model, the second negative text samples and the positive text samples are input into the discriminator model, and the discriminator model is updated according to minimized cross entropy; the generator model and the discriminator model are updated alternately, and if the output of the discriminator model converges, the text generation model is obtained according to the generator model at the time of convergence; finally, the text to be recognized is obtained and input into the text generation model, and the target text is generated based on the text generation model. In this way, the text generation model can be constructed quickly and the accuracy of the generated text is high, which improves the construction efficiency of the text generation model.
  • In an embodiment, step S20, in which an initial generator model is established, the positive text samples are input into the initial generator model for pre-training to obtain the generator model, and the first negative text samples are generated according to the generator model, may specifically include the following steps:
  • S21: Input the initial generation parameters into the recurrent neural network to establish the initial generator model.
  • the initial generation parameters may be randomly selected recurrent neural network (RNN) parameters; that is, before pre-training, randomly selected parameters can be input into the RNN to obtain the initial generator model.
  • RNN recurrent neural network
  • S22 Input the positive text sample into the initial generator model for pre-training, and convert it into a probability output according to the probability distribution function to obtain pre-trained parameters.
  • the server inputs the positive text samples into the initial generator model for pre-training.
  • Specifically, suppose the positive text samples are $(x_1, x_2, \ldots, x_T)$. The RNN first recursively maps $(x_1, x_2, \ldots, x_T)$ to hidden states $(h_1, h_2, \ldots, h_T)$, where a hidden state is a parameter in the hidden layer of the recurrent neural network, i.e., the output of a hidden-layer neuron.
  • the hidden states are computed by the following formula:
  $h_t = \sigma(W x_t + U h_{t-1})$
  where $W$ is the weight matrix, $U$ is the transition matrix applied to the previous hidden state $h_{t-1}$, and the activation $\sigma$ can be a sigmoid function or a hyperbolic tangent function (tanh), determined according to specific circumstances.
  • the probability distribution function can be a softmax function, so the output is expressed by the following formula:
  $p(y_t \mid x_1, \ldots, x_T) = z(h_t) = \mathrm{softmax}(c + V h_t)$
  which means that when $(x_1, x_2, \ldots, x_T)$ is known, the distribution of the RNN output $y_t$ is $\mathrm{softmax}(c + V h_t)$; here $z(h_t)$ denotes a function $z$ of $h_t$ that converts the output into the form of a probability, so that the output value belongs to $[0,1]$, and this function $z$ can be taken to be the softmax function.
  • the pre-trained parameters c and V can be obtained.
  • the original initial generation parameters of the initial generator model are updated according to the parameters c and V obtained after the pre-training to obtain the generator model.
  • the generator model can be denoted $G_\theta$, where the parameter $\theta$ of the generator model $G_\theta$ is obtained from the parameters $c$ and $V$.
  • certain sample data can be extracted from the real text data set and input into the generator model G ⁇ to generate the first negative text sample.
  • In this embodiment, the initial generator model is established by inputting the initial generation parameters into the recurrent neural network; the positive text samples are then input into the initial generator model for pre-training, the output is converted into a probability according to the probability distribution function, and the pre-trained parameters are obtained; finally, the parameters of the initial generator model are updated according to the pre-trained parameters to obtain the generator model.
  • Building the generator model with a recurrent neural network suits the discrete nature of text generation, so the final text generation model outputs text more effectively; in addition, the generator model can be pre-trained first, and the pre-trained generator model can then be used to generate some negative samples to achieve pre-training of the discriminator model.
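  • The formulas of steps S21-S23 can be made concrete with a short numpy sketch; all dimensions and the random inputs are illustrative assumptions:

```python
# A worked sketch of h_t = sigma(W x_t + U h_{t-1}) and
# p(y_t) = softmax(c + V h_t); sigma is taken as tanh here.
import numpy as np

rng = np.random.default_rng(0)
emb_dim, hidden_dim, vocab = 8, 16, 50
W = rng.normal(size=(hidden_dim, emb_dim))     # input weight matrix W
U = rng.normal(size=(hidden_dim, hidden_dim))  # transition matrix U for h_{t-1}
V = rng.normal(size=(vocab, hidden_dim))       # output weight matrix V
c = np.zeros(vocab)                            # output bias c

def softmax(z):
    z = z - z.max()                            # for numerical stability
    return np.exp(z) / np.exp(z).sum()

h = np.zeros(hidden_dim)
for x_t in rng.normal(size=(5, emb_dim)):      # a hypothetical embedded sequence x_1..x_5
    h = np.tanh(W @ x_t + U @ h)               # recursive mapping to hidden states
    p_y = softmax(c + V @ h)                   # distribution of the next word y_t
```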
  • In an embodiment, step S30, in which an initial discriminator model is established and the positive text samples and the first negative text samples are input into the initial discriminator model for pre-training to obtain the discriminator model, may specifically include the following steps:
  • S31 Input the initial discriminating parameters into the convolutional neural network to establish an initial discriminator model.
  • the initial discriminating parameter may be a randomly selected convolutional neural network (CNN) parameter, that is, before pre-training, the randomly selected parameter may be input to the CNN to obtain the initial discriminator model.
  • CNN convolutional neural network
  • S32: Input the positive text samples and the first negative text samples into the initial discriminator model for pre-training, convert the output into a probability according to the probability distribution function, and update the initial discriminating parameters of the initial discriminator according to minimized cross entropy to obtain the pre-trained discriminating parameters.
  • the training samples are labeled, that is, the positive text samples are labeled as 1, and the negative text samples are labeled as 0.
  • Specifically, a convolution is applied to the input: the convolution kernel $w \in \mathbb{R}^{l \times k}$ is a real $l \times k$ matrix, $\varepsilon_{i:i+l-1}$ refers to rows $i$ through $i+l-1$ of the text positive sample (also a real $l \times k$ matrix), $b$ is a required real-valued parameter, and $\otimes$ denotes the sum of the products of corresponding elements of two matrices, giving the feature $c_i = w \otimes \varepsilon_{i:i+l-1} + b$.
  • the pooling above refers to taking the maximum of the extracted features $c_i$ of the text positive sample (max pooling); average pooling can also be used here, which is not specifically limited.
  • FC fully connected layer
  • the first negative text samples, labeled 0, are also input into the CNN and, after the same process, the sigmoid function is used to convert the output into a probability.
  • After pre-training, the pre-trained discriminating parameters, namely $w$ and $b$, are obtained.
  • In an embodiment, a highway network layer can also be used when training the discriminator model, calculated by the following formulas:
  $\tau = \sigma(W_T \cdot F + b_T)$
  $\tilde{C} = \tau \odot H(F, W_H) + (1 - \tau) \odot F$
  where $F$ refers to the feature vector extracted from the behavior sequence that generates the text, $W_T$, $b_T$ and $W_H$ are the weights of the highway layer, $H$ is an affine transformation followed by a nonlinear activation function (such as the linear rectification function ReLU, denoted $f$), and $\tau$ is the transform gate. Finally, a sigmoid function is used to convert the result into a probability output:
  $D_\phi(Y) = \sigma(W_0 \cdot \tilde{C} + b_0)$
  where $W_0$ and $b_0$ are the weight and bias of the output layer of the discriminator.
  • S33 Update the parameters of the initial discriminator model according to the discriminant parameters after pre-training to obtain the discriminator model.
  • Specifically, the parameters of the initial discriminator model are updated according to the discriminating parameters $w$ and $b$ obtained after pre-training to obtain the discriminator model.
  • the discriminator model can be denoted $D_\phi$, where the parameter $\phi$ of the discriminator model is obtained from the parameters $w$ and $b$.
  • In this embodiment, the initial discriminator model is established by inputting the initial discriminating parameters into the convolutional neural network; the positive text samples and the first negative text samples are then input into the initial discriminator model for pre-training, the output is converted into a probability according to the probability distribution function, and the initial discriminating parameters are updated according to minimized cross entropy to obtain the pre-trained discriminating parameters; finally, the parameters of the initial discriminator model are updated with the pre-trained discriminating parameters to obtain the discriminator model.
  • the initial discriminator model is trained on the negative samples generated by the generator model together with the positive text samples to obtain the discriminator model; once the discriminator model is obtained, the generator model and the discriminator model can be trained against each other to finally obtain the text generation model.
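  • The highway layer mentioned above can be sketched in a few lines of PyTorch; the layer sizes are illustrative assumptions:

```python
# A minimal sketch of the highway layer: the gate tau mixes the transformed
# feature H(F) with the raw feature F, as in the formulas above.
import torch
import torch.nn as nn

class Highway(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(dim, dim)        # W_T, b_T
        self.transform = nn.Linear(dim, dim)   # W_H

    def forward(self, f):
        tau = torch.sigmoid(self.gate(f))      # transform gate
        h = torch.relu(self.transform(f))      # affine transform plus ReLU, i.e. H
        return tau * h + (1.0 - tau) * f
```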
  • In an embodiment, step S40, in which the test text is generated based on the generator model, the test text is input into the discriminator model to obtain the reward value of the test text, the gradient of the generator model is calculated according to the reward value, and the generator model is updated according to the gradient, may specifically include the following steps:
  • S41: Obtain the test sub-texts generated by the generator model in the process of generating the test text. The generator model produces many intermediate steps while generating the test text; for example, if the final generated text is "moonlight in front of the bed", the generator model successively generates "bed", "in front of the bed", and so on, and the server can obtain the texts from these intermediate steps as the test sub-texts.
  • S42: Use the Monte Carlo search method to generate M hypothetical texts according to the test sub-text. The Monte Carlo search method refers to the use of random numbers (or, more commonly, pseudo-random numbers) to solve calculation problems.
  • Specifically, the server uses the Monte Carlo search method to generate M hypothetical texts according to the test sub-texts, inputs the M hypothetical texts into the discriminator model to obtain their reward values, and uses the average of these reward values as the reward value of the test sub-text.
  • Generating M hypothetical texts by Monte Carlo search can be expressed by the following formula:
  $\{Y_{1:T}^{1}, \ldots, Y_{1:T}^{M}\} = \mathrm{MC}(Y_{1:t}; M)$
  that is, M hypothetical texts are generated by the Monte Carlo search method under the condition of the given test sub-text $Y_{1:t}$.
  • S43: Input the M hypothetical texts into the discriminator model, obtain the average reward of the M hypothetical texts as the reward value of the test sub-text, and input the test text into the discriminator model to obtain the reward value of the test text.
  • Specifically, the discriminator model $D_\phi(Y)$ returns the probability that the test sample $Y$ belongs to the real samples, a number in $[0,1]$. Time $T$ corresponds to the completion of the entire poem, so the reward value at time $T$ can be given directly by the discriminator.
  • the reward values at times 1 to T-1 (that is, for $t$ from 1 to $T-1$) need to be given by Monte Carlo search simulation.
  • Specifically, suppose the test sub-text at time $t$ is $Y_{1:t-1}$; Monte Carlo search is then applied $M$ times to obtain $M$ hypothetical texts $Y_{1:T}$, and the average of the reward values of these $M$ hypothetical texts is used as the reward value at time $t$:
  $Q_{D_\phi}^{G_\theta}(Y_{1:t-1}, y_t) = \frac{1}{M} \sum_{m=1}^{M} D_\phi(Y_{1:T}^{m})$ for $t < T$, and $Q_{D_\phi}^{G_\theta}(Y_{1:T-1}, y_T) = D_\phi(Y_{1:T})$ at $t = T$.
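  • A minimal sketch of this Monte Carlo reward estimate, reusing the Generator and Discriminator sketches above, is given below; the rollout loop, sequence length, and M are illustrative assumptions:

```python
# A minimal sketch of steps S42-S43: complete the prefix Y_{1:t} M times with
# the current generator and average D_phi's scores as the prefix's reward.
import torch

def mc_reward(prefix, gen, disc, M=16, seq_len=20):
    # prefix: LongTensor of shape (1, t) holding the test sub-text Y_{1:t}.
    rewards = []
    with torch.no_grad():
        for _ in range(M):
            tokens = prefix.clone()
            while tokens.size(1) < seq_len:    # roll out a hypothetical text Y_{1:T}
                logits = gen(tokens)[:, -1, :]
                nxt = torch.distributions.Categorical(logits=logits).sample()
                tokens = torch.cat([tokens, nxt.unsqueeze(1)], dim=1)
            rewards.append(disc(tokens).item())
    return sum(rewards) / M                    # average reward value of Y_{1:t}
```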
  • In this way, the generator model can be trained by reinforcement learning (RL).
  • S44 Calculate the gradient of the generator model according to the reward value of the test sub-text and the reward value of the test text, and update the parameters of the generator model according to the gradient to obtain the updated generator model.
  • the policy gradient of the generator model can be calculated with the same formula given above under step S40, $\nabla_\theta J(\theta) = \sum_{t} \mathbb{E}_{Y_{1:t-1} \sim G_\theta} \big[ \sum_{y_t} \nabla_\theta G_\theta(y_t \mid Y_{1:t-1}) \cdot Q_{D_\phi}^{G_\theta}(Y_{1:t-1}, y_t) \big]$, where the expected value $E$ can be approximated by sampling; the parameter $\theta$ of the generator model is then updated as $\theta \leftarrow \theta + \alpha_h \nabla_\theta J(\theta)$.
  • Specifically, the server obtains the updated generator model according to its updated parameters, then uses the updated generator model to update the discriminator model, alternately updating the generator model and the discriminator model until the discriminator model converges, and finally obtains the text generation model according to the generator model at the time of convergence.
  • when the generator model is updated, the update is performed with the discriminator model fixed; the number of times the parameters of the generator model are updated can be set according to the actual situation and is not specifically limited here.
  • In this embodiment, the Monte Carlo search method is first used to generate M hypothetical texts; the M hypothetical texts are then input into the discriminator model, the average reward of the M hypothetical texts is obtained as the reward value of the test sub-text, and the test text is input into the discriminator model to obtain the reward value of the test text; finally, the gradient of the generator model is calculated according to the reward value of the test sub-text and the reward value of the test text, and the parameters of the generator model are updated according to the gradient to obtain the updated generator model.
  • a text generation device is provided, and the text generation device corresponds to the text generation method in the above-mentioned embodiment one-to-one.
  • the text generation device includes a text positive sample acquisition module 10, a generator model acquisition module 20, a discriminator model acquisition module 30, a generator model update module 40, a discriminator model update module 50, a text generation model acquisition module 60, and a target text generation module 70. The detailed description of each functional module is as follows:
  • the text positive sample obtaining module 10 is used to obtain a real text data set, and obtain a text positive sample from the real text data set;
  • the generator model acquisition module 20 is used to establish an initial generator model, input positive text samples into the initial generator model for pre-training, obtain the generator model, and generate the first negative text sample according to the generator model;
  • the discriminator model acquisition module 30 is used to establish an initial discriminator model, and input the positive text samples and the first negative text samples into the initial discriminator model for pre-training to obtain the discriminator model;
  • the generator model update module 40 is used to generate test text based on the generator model, input the test text into the discriminator model to obtain the reward value of the test text, calculate the gradient of the generator model according to the reward value, and update the generator model according to the gradient ;
  • the discriminator model update module 50 is configured to generate a second negative text sample according to the updated generator model, input the second negative text sample and the positive text sample into the discriminator model, and update the discriminator model according to the minimized cross entropy;
  • the text generation model acquisition module 60 is used to alternately update the generator model and the discriminator model. If the output of the discriminator model converges, the text generation model is obtained according to the generator model at the time of convergence;
  • the target text generation module 70 is configured to obtain the text to be recognized, input the text to be recognized into the text generation model, and generate the target text based on the text generation model.
  • Further, the text positive sample acquisition module 10 is also used to: select N text data from the real text data set, where N is a positive integer; and convert the N text data into vector form using the word vector model, using the N vectorized text data as the positive text samples.
  • the generator model acquisition module 20 includes an initial generation model establishment unit 21, an initial generation model pre-training unit 22 and a generator model acquisition unit 23.
  • the initial generation model establishment unit 21 is configured to input initial generation parameters into the recurrent neural network to establish an initial generator model
  • the initial generation model pre-training unit 22 is used to input positive text samples into the initial generator model for pre-training, and convert it into a probability output according to the probability distribution function to obtain pre-trained parameters;
  • the generator model obtaining unit 23 is configured to update the parameters of the initial generator model according to the pre-trained parameters to obtain the generator model.
  • the discriminator model acquisition module 30 includes an initial discriminant model establishment unit 31, an initial discriminant model pre-training unit 32 and a discriminator model acquisition unit 33.
  • the initial discriminant model establishment unit 31 is used to input initial discriminant parameters into the convolutional neural network to establish an initial discriminator model
  • the initial discriminant model pre-training unit 32 is used to input the positive text samples and the first negative text samples into the initial discriminator model for pre-training, convert the output into a probability according to the probability distribution function, and update the initial discriminating parameters of the initial discriminator according to minimized cross entropy to obtain the pre-trained discriminating parameters;
  • the discriminator model acquisition unit 33 is configured to update the parameters of the initial discriminator model according to the discriminant parameters after pre-training to obtain the discriminator model.
  • generator model update module 40 is also used to:
  • the gradient of the generator model is calculated according to the reward value of the test sub-text and the reward value of the test text, and the parameters of the generator model are updated according to the gradient to obtain the updated generator model.
  • Each module in the above-mentioned text generation device can be implemented in whole or in part by software, hardware, and a combination thereof.
  • the foregoing modules may be embedded in the form of hardware or independent of the processor in the computer equipment, or may be stored in the memory of the computer equipment in the form of software, so that the processor can call and execute the operations corresponding to the foregoing modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 10.
  • the computer equipment includes a processor, a memory, a network interface and a database connected by a system bus. Among them, the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer-readable instructions in the non-volatile storage medium.
  • the database of the computer equipment is used to store real text data sets, text positive samples, text negative samples, and word vector models.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer-readable instructions are executed by the processor to realize a text generation method.
  • a computer device including a memory, a processor, and computer-readable instructions stored in the memory and running on the processor, and the processor implements the following steps when the processor executes the computer-readable instructions:
  • Generate test text based on the generator model, input the test text into the discriminator model to obtain the reward value of the test text, calculate the gradient of the generator model according to the reward value, and update the generator model according to the gradient;
  • Obtain the text to be recognized, input the text to be recognized into the text generation model, and generate the target text based on the text generation model.
  • one or more readable storage media storing computer readable instructions are provided, and when the computer readable instructions are executed by one or more processors, the one or more processors execute The following steps:
  • Generate test text based on the generator model, input the test text into the discriminator model to obtain the reward value of the test text, calculate the gradient of the generator model according to the reward value, and update the generator model according to the gradient;
  • Obtain the text to be recognized, input the text to be recognized into the text generation model, and generate the target text based on the text generation model.
  • the readable storage medium includes a non-volatile readable storage medium and a volatile readable storage medium.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • ROM read only memory
  • PROM programmable ROM
  • EPROM electrically programmable ROM
  • EEPROM electrically erasable programmable ROM
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
  • Division into modules means dividing the internal structure of the device into different functional units or modules to complete all or part of the functions described above.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A text generation method and device, an apparatus and a medium in the field of model construction. The method comprises: acquiring a positive text sample from a real text data set; establishing an initial generator model, using the positive text sample to pre-train the initial generator model so as to acquire a generator model, and using the generator model to generate a negative text sample; establishing an initial discriminator model, and using the positive text sample and the negative text sample to perform pre-training so as to acquire a discriminator model; causing the generator model and the discriminator model to continuously confront each other, and updating the parameters of the models; when the discriminator model converges, acquiring a text generation model according to the generator model at the time of convergence; and acquiring text to be identified, inputting the text into the text generation model, and generating target text on the basis of the text generation model. The text generation method improves the efficiency of text generation model construction and the accuracy of text generation.

Description

Text generation method, device, computer equipment and medium
This application is based on the Chinese invention patent application filed on January 24, 2019 with application number 201910067379.0, titled "Text generation method, device, computer equipment and medium", and claims its priority.
Technical field
This application belongs to the field of model construction, and more specifically relates to a text generation method, device, computer equipment and medium.
Background
With the development of science and technology, we hope that computers can write like humans and produce high-quality natural language text; automatic text generation technology is the key technology for achieving this goal.
At present, a commonly used method is to use Long Short-Term Memory networks (LSTM) for text generation; LSTM is a type of recurrent/recursive neural network (RNN). A common way to train an RNN is maximum likelihood estimation: given the first t-1 words, the next word is predicted by maximizing the log likelihood of the t-th word. However, a disadvantage of using an RNN is that it produces a steadily accumulating bias: a sentence is generated word by word, with each next word conditioned on the previously generated words, so an early error propagates forward and the bias grows as the sequence length increases.
In addition, an RNN cannot improve itself. For some RNN applications, a loss function to minimize can be added to improve the model, but for a text generation model the input data are discrete, so there is no directly usable loss function and no suitable way to guide the text generation model to improve itself toward near-real output.
In summary, the models currently used to generate text are inefficient, and a text generation model that can generate text quickly and accurately is urgently needed.
Summary of the invention
The embodiments of the present application provide a text generation method, device, computer equipment, and storage medium to solve the current problem of low efficiency of text generation.
A text generation method, including:
obtaining a real text data set, and obtaining positive text samples from the real text data set;
establishing an initial generator model, inputting the positive text samples into the initial generator model for pre-training to obtain a generator model, and generating first negative text samples according to the generator model;
establishing an initial discriminator model, and inputting the positive text samples and the first negative text samples into the initial discriminator model for pre-training to obtain a discriminator model;
generating test text based on the generator model, inputting the test text into the discriminator model to obtain the reward value of the test text, calculating the gradient of the generator model according to the reward value, and updating the generator model according to the gradient;
generating second negative text samples according to the updated generator model, inputting the second negative text samples and the positive text samples into the discriminator model, and updating the discriminator model according to minimized cross entropy;
alternately updating the generator model and the discriminator model, and if the output of the discriminator model converges, obtaining a text generation model according to the generator model at the time of convergence;
obtaining the text to be recognized, inputting the text to be recognized into the text generation model, and generating the target text based on the text generation model.
A text generation device, including:
a text positive sample acquisition module, used to acquire a real text data set and acquire positive text samples from the real text data set;
a generator model acquisition module, used to establish an initial generator model, input the positive text samples into the initial generator model for pre-training to obtain a generator model, and generate first negative text samples according to the generator model;
a discriminator model acquisition module, used to establish an initial discriminator model, and input the positive text samples and the first negative text samples into the initial discriminator model for pre-training to obtain a discriminator model;
a generator model update module, configured to generate test text based on the generator model, input the test text into the discriminator model to obtain the reward value of the test text, calculate the gradient of the generator model according to the reward value, and update the generator model according to the gradient;
a discriminator model update module, used to generate second negative text samples according to the updated generator model, input the second negative text samples and the positive text samples into the discriminator model, and update the discriminator model according to minimized cross entropy;
a text generation model acquisition module, configured to alternately update the generator model and the discriminator model, and if the output of the discriminator model converges, obtain a text generation model according to the generator model at the time of convergence;
a target text generation module, used to obtain the text to be recognized, input the text to be recognized into the text generation model, and generate the target text based on the text generation model.
A computer device, including a memory, a processor, and computer readable instructions stored in the memory and runnable on the processor, where the processor implements the above text generation method when executing the computer readable instructions.
One or more readable storage media storing computer readable instructions that, when executed by one or more processors, cause the one or more processors to execute the above text generation method.
The details of one or more embodiments of the present application are set forth in the following drawings and description; other features and advantages of the present application will become apparent from the description, drawings and claims.
Description of the drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative labor.
FIG. 1 is a schematic diagram of an application environment of a text generation method in an embodiment of the present application;
FIG. 2 is a flowchart of a text generation method in an embodiment of the present application;
FIG. 3 is another flowchart of a text generation method in an embodiment of the present application;
FIG. 4 is another flowchart of a text generation method in an embodiment of the present application;
FIG. 5 is another flowchart of a text generation method in an embodiment of the present application;
FIG. 6 is another flowchart of a text generation method in an embodiment of the present application;
FIG. 7 is a functional block diagram of a text generation device in an embodiment of the present application;
FIG. 8 is a functional block diagram of the generator model acquisition module in the text generation device in an embodiment of the present application;
FIG. 9 is a functional block diagram of the discriminator model acquisition module in the text generation device in an embodiment of the present application;
FIG. 10 is a schematic diagram of a computer device in an embodiment of the present application.
具体实施方式detailed description
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。The following will clearly and completely describe the technical solutions in the embodiments of the present application in conjunction with the drawings in the embodiments of the present application. Obviously, the described embodiments are part of the embodiments of the present application, not all of the embodiments. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of this application.
The text generation method provided in the present application can be applied in the application environment shown in FIG. 1, where a client communicates with a server through a network. The server obtains a real text data set through the client and obtains positive text samples from the real text data set; it then builds an initial generator model according to input from the client, inputs the positive text samples into the initial generator model for pre-training to obtain a generator model, and generates first negative text samples according to the generator model; next, it builds an initial discriminator model according to input from the client, and inputs the positive text samples and the first negative text samples into the initial discriminator model for pre-training to obtain a discriminator model; the server then generates test text based on the generator model, inputs the test text into the discriminator model to obtain a reward value for the test text, calculates the gradient of the generator model according to the reward value, and updates the generator model according to the gradient; the server generates second negative text samples according to the updated generator model, inputs the second negative text samples and the positive text samples into the discriminator model, and updates the discriminator model by minimizing the cross entropy; the generator model and the discriminator model are updated alternately, and if the output of the discriminator model converges, a text generation model is obtained from the generator model at the time of convergence; finally, the server obtains text to be recognized, inputs it into the text generation model, generates target text based on the text generation model, and returns the target text to the client. The client can be, but is not limited to, a personal computer, a notebook computer, a smartphone, a tablet computer, or a portable wearable device. The server can be implemented as an independent server or as a server cluster composed of multiple servers.
In an embodiment, as shown in FIG. 2, a text generation method is provided. The method is described taking its application to the server in FIG. 1 as an example, and may specifically include the following steps:
S10: Obtain a real text data set, and obtain positive text samples from the real text data set.
Here, the real text data set refers to the original text data set corresponding to the text that the text generation model is ultimately expected to output. For example, if the text generation model is expected to output poems, the real text data set is a data set composed of various poems. The text in this embodiment can be poems, answers to questions, dialogues, and the like; this embodiment is described taking poems as the final output.
A positive text sample refers to one of multiple samples extracted from the real text data set, for example one of multiple poems extracted from the real text data set.
Specifically, a large data set of poems can be collected in advance and stored in a database on the server as the real text data set. When training starts, the server randomly obtains the real text data set from the database and extracts some poems (samples) from it as positive text samples.
In an embodiment, as shown in FIG. 3, in order to train the generator model and the discriminator model better, the real text data set can be converted into vector form; that is, step S10 may specifically include the following steps:
S11: Select N text data from the real text data set, where N is a positive integer.
Specifically, the server selects N samples from the database as positive text samples, where N is a positive integer. It can be understood that the larger N is, the better the training effect. Optionally, which samples are selected as positive text samples can be determined by input from the client: for example, the client inputs sample numbers, and the server selects the corresponding samples from the database according to the sample numbers input by the client.
S12: Convert the N text data into vector form using a word vector model, and use the N text data converted into vector form as the positive text samples.
Here, the word vector model is the word2vec model; the word2vec model includes two neural network structures, CBOW and Skip-gram. Specifically, the server can input the poems (the real text data set) into the word2vec model for training; after training, the word2vec model can be used to map each term of a poem to a vector. For example, if a poem can be represented as {term 1, term 2, ..., term n}, the word2vec algorithm can be invoked to convert term i into a vector x_i, and the poem can then be represented by the vector X = (x_1, x_2, ..., x_T).
Specifically, the server converts the selected N text data into vector form through the word2vec model, and then uses these N text data in vector form as the positive text samples.
In the embodiment corresponding to FIG. 3, N text data are selected from the real text data set, the N text data are converted into vector form using the word vector model, and the N text data in vector form are used as the positive text samples. Converting the text data into vector form captures the correlations between terms in the text better and facilitates the subsequent training of the generator model and the discriminator model.
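As an illustration of steps S11 and S12, the following is a minimal sketch assuming the gensim library (version 4.x); the toy corpus, the parameter values, and the helper name poem_to_vectors are assumptions for illustration only and are not part of the embodiment.

```python
# Minimal sketch of S11-S12, assuming gensim 4.x and a toy corpus of poems;
# all names and parameter values here are illustrative assumptions.
import numpy as np
from gensim.models import Word2Vec

# Hypothetical real text data set: each poem is a list of terms (tokens).
poems = [["床", "前", "明月", "光"], ["疑", "是", "地上", "霜"]]

# Train word2vec; sg=1 selects Skip-gram, sg=0 would select CBOW.
w2v = Word2Vec(sentences=poems, vector_size=64, window=5, min_count=1, sg=1)

def poem_to_vectors(poem):
    """Map a poem {term 1, ..., term T} to the vector form X = (x_1, ..., x_T)."""
    return np.stack([w2v.wv[term] for term in poem])

# The N selected text data, converted into vector form, become positive samples.
positive_samples = [poem_to_vectors(p) for p in poems]
```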
S20: Establish an initial generator model, input the positive text samples into the initial generator model for pre-training to obtain a generator model, and generate first negative text samples according to the generator model.
It should be understood that the initial generator model here and the subsequent initial discriminator model are both models constructed on the basis of neural networks. Optionally, since the input text data are discrete, the initial generator model can be built with a recurrent neural network (RNN); and in order to speed up training and reduce the amount of computation, the initial discriminator model can be built with a convolutional neural network (CNN). Optionally, other neural networks can also be used to build the initial generator model and the initial discriminator model, which is not specifically limited here. This embodiment is described taking an RNN as the initial generator model and a CNN as the initial discriminator model as an example.
Specifically, parameters of the RNN are selected randomly to establish the initial generator model. After the initial generator model is established, the positive text samples obtained in step S10 are input into it for pre-training, and the generator model is obtained after pre-training; some negative samples are then generated by the generator model as the first negative text samples, so that the initial discriminator model can be pre-trained. It should be understood that "initial generator model" and "generator model" merely distinguish the neural network before and after pre-training. Optionally, the server can also select additional sample data from the real text data set and input them into the initial generator model for pre-training.
S30: Establish an initial discriminator model, and input the positive text samples and the first negative text samples into the initial discriminator model for pre-training to obtain a discriminator model.
Specifically, the server randomly selects parameters of the CNN to establish the initial discriminator model. After the initial discriminator model is established, the obtained positive text samples and first negative text samples are labeled separately. For example, the positive text samples can be labeled 1 and the first negative text samples labeled 0. The labeled positive and first negative text samples are then input into the initial discriminator model for pre-training to obtain the discriminator model. There are N positive text samples and N first negative text samples; N can be decided according to the actual situation, and the more samples there are, the higher the discrimination accuracy of the resulting discriminator model. A CNN is used to build the discriminator model because appropriate pooling layers can be set in a CNN; the pooling operation prevents the discriminator model from overfitting the data, speeds up its training, and reduces the amount of computation.
S40: Generate test text based on the generator model, input the test text into the discriminator model to obtain a reward value for the test text, calculate the gradient of the generator model according to the reward value, and update the generator model according to the gradient.
Here, the reward value of the test text refers to the value output by the discriminator model.
Specifically, the server uses the generator model to generate test text, inputs the test text into the discriminator model, and takes the value output by the discriminator model as the reward value. To make the generator model improve continuously, the generator model is trained with the policy gradient method from reinforcement learning (RL): when the discriminator model's output value for the test text is relatively high, the probability of the corresponding action of the RNN in the generator model is increased; when the output value is relatively low, that probability is decreased. It should be understood that "high" and "low" output values are relative concepts that differ across training stages and can be preset according to experience. For example, at the start of training, when the generator's output is still poor, a discriminator output above 0.3 can be regarded as relatively high and one below 0.2 as relatively low; in the later stages of training, an output above 0.4 can be regarded as relatively high and one below 0.3 as relatively low.
Specifically, the policy gradient of the generator is calculated according to the reward value of the test text, and the generator model is then updated with the calculated policy gradient, expressed by the following formula:

\nabla_\theta J(\theta) = \sum_{t=1}^{T} E_{Y_{1:t-1} \sim G_\theta} \Big[ \sum_{y_t} \nabla_\theta G_\theta(y_t \mid Y_{1:t-1}) \cdot Q_{D_\phi}^{G_\theta}(Y_{1:t-1}, y_t) \Big]

where \nabla_\theta denotes the policy gradient, J(\theta) is the objective function of the generator model, E denotes expectation, G_\theta is the generator model, Y_{1:t-1} \sim G_\theta means that the text Y generated by the generator model follows the probability distribution G_\theta, G_\theta(y_t \mid Y_{1:t-1}) is the probability under the generator model that y_t follows Y_{1:t-1}, D_\phi is the discriminator model, and Q_{D_\phi}^{G_\theta} is the reward that the text generated by the generator model G_\theta receives from the discriminator model D_\phi. The expectation in the above gradient can be approximated by sampling, and the parameter \theta of the generator model G_\theta is then updated as:

\theta \leftarrow \theta + \alpha_h \nabla_\theta J(\theta)

where \alpha_h is the learning rate of the hidden layer.
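For illustration, the sampled update can be sketched as follows, assuming PyTorch; the tensors log_probs and rewards (the sampled log-probabilities and the corresponding Q values from the discriminator) and the optimizer are assumed to be available.

```python
# Sketch of the sampled policy-gradient update theta <- theta + alpha_h * grad,
# assuming PyTorch; log_probs holds log G_theta(y_t | Y_{1:t-1}) for the
# sampled tokens and rewards holds the corresponding Q values (both assumed).
import torch

def policy_gradient_step(log_probs, rewards, optimizer):
    # log_probs, rewards: tensors of shape (batch, T).
    # Maximizing J(theta) = E[sum_t log G_theta(y_t|Y_{1:t-1}) * Q] is done
    # by minimizing its negative with a gradient-based optimizer.
    loss = -(log_probs * rewards).sum(dim=1).mean()
    optimizer.zero_grad()
    loss.backward()    # backpropagation computes grad_theta J(theta)
    optimizer.step()   # applies the update with learning rate alpha_h
    return loss.item()
```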
S50: Generate second negative text samples according to the updated generator model, input the second negative text samples and the positive text samples into the discriminator model, and update the discriminator model by minimizing the cross entropy.
Specifically, the server uses the updated generator model to generate some texts as second negative text samples, labels the second negative text samples and the positive text samples separately, and inputs them into the discriminator model for training. The second negative text samples are labeled 0 and the positive text samples are labeled 1. It should be understood that the positive text samples here can be the same samples as the positive text samples used in the earlier training, or other sample data can be extracted from the real text data set as positive text samples.
It should be understood that the goal of training the discriminator model is that when the input is real text data, the output value should be as close to 1 as possible, and when the input is text generated by the generator, the output value should be as close to 0 as possible, so that the model outputs an accurate value for any given sample. Specifically, the pre-trained discriminator parameters can be obtained by minimizing the following cross entropy:

\min_\phi \; -E_{Y \sim p_{data}}[\log D_\phi(Y)] - E_{Y \sim G_\theta}[\log(1 - D_\phi(Y))]

where the discriminator D_\phi(Y) returns the probability that the sample Y is a real sample, a number in [0,1]; Y \sim p_{data} means that Y follows the probability distribution p_{data}, where p_{data} is the probability distribution of the real text data set; Y \sim G_\theta means that Y follows the probability distribution G_\theta; and E denotes expectation. Minimizing this cross entropy makes the first and second terms of the above formula as large as possible, that is, the probability assigned to real data as large as possible and the probability assigned to generated data as small as possible.
The parameters of the discriminator model can be updated according to the minimized cross entropy, thereby updating the discriminator model. When the discriminator model is updated, the generator model is kept fixed; the discriminator model can be updated multiple times, with the number of updates set according to the actual situation, which is not specifically limited here.
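A sketch of this update in code, assuming PyTorch; the discriminator (ending in a sigmoid so that it returns D_φ(Y) in [0,1]) and the batches real_batch and fake_batch are assumptions.

```python
# Sketch of one discriminator update by minimizing the cross entropy above,
# assuming PyTorch; discriminator, real_batch, and fake_batch are assumed.
import torch
import torch.nn as nn

bce = nn.BCELoss()

def discriminator_step(discriminator, real_batch, fake_batch, optimizer):
    d_real = discriminator(real_batch)  # real text, label 1: push towards 1
    d_fake = discriminator(fake_batch)  # generated text, label 0: push towards 0
    # -E_{Y~p_data}[log D(Y)] - E_{Y~G_theta}[log(1 - D(Y))]
    loss = bce(d_real, torch.ones_like(d_real)) + \
           bce(d_fake, torch.zeros_like(d_fake))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```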
S60: Alternately update the generator model and the discriminator model; if the output of the discriminator model converges, obtain the text generation model from the generator model at the time of convergence.
Specifically, the server updates the generator model and the discriminator model alternately: as long as the discriminator model has not converged, the generator model and the discriminator model are updated repeatedly so that they keep training adversarially. In each round, the generator model is updated first while the discriminator model is kept unchanged; then the generator model is kept unchanged and the discriminator model is updated. That is, the parameters of the discriminator model are fixed while the generator model is trained, then the parameters of the generator model are fixed while the discriminator model is trained, and this process is repeated until the output of the discriminator model converges. If the output of the discriminator model converges, the text generation model is obtained from the generator model at the time of convergence. Output convergence means that the value the discriminator outputs for a given sample (positive or negative) is close to 0.5, that is, the discriminator can no longer distinguish positive from negative samples; the server then determines that the discriminator's output has converged, and the final text generation model is obtained from the generator model at that point.
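The alternating schedule might be organized as in the sketch below, which assumes PyTorch, the policy_gradient_step and discriminator_step functions sketched above, assumed helpers sample_from_generator and mc_rewards (a sketch of the latter is given under steps S42 to S44 below), and illustrative step counts and convergence tolerance.

```python
# Sketch of the alternating adversarial training of S60; the helpers, the
# g/d step counts, and the tolerance eps are illustrative assumptions.
import torch

def adversarial_train(generator, discriminator, rollout, real_batch,
                      g_opt, d_opt, max_epochs=200, g_steps=1, d_steps=5,
                      eps=0.02):
    for _ in range(max_epochs):
        for _ in range(g_steps):   # discriminator fixed, generator updated
            tokens, log_probs = sample_from_generator(generator)
            rewards = mc_rewards(discriminator, rollout, tokens)
            policy_gradient_step(log_probs, rewards, g_opt)
        for _ in range(d_steps):   # generator fixed, discriminator updated
            fake_batch = sample_from_generator(generator)[0].detach()
            discriminator_step(discriminator, real_batch, fake_batch, d_opt)
        # Convergence: outputs near 0.5 mean the discriminator can no longer
        # distinguish positive from negative samples.
        d_out = discriminator(torch.cat([real_batch, fake_batch])).mean()
        if abs(d_out.item() - 0.5) < eps:
            break
    return generator  # the generator at convergence is the text generation model
```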
S70: Obtain text to be recognized, input the text to be recognized into the text generation model, and generate target text based on the text generation model.
Here, the text to be recognized is the input of the text generation model, and the target text is its output. It can be understood that the text to be recognized and the target text correspond to the real text data set: if a data set of poems is used to train the text generation model, the text to be recognized and the target text of the model are also poems; if a data set of dialogues is used, the text to be recognized and the target text are also dialogue. Optionally, the text to be recognized and the target text can also be answers to questions, speeches, short essays, and the like.
Specifically, the server obtains the text to be recognized input by the user through the client, inputs it into the text generation model, has the text generation model generate the target text, and then outputs the target text to the client. For example, the server obtains through the client the preceding utterance of a dialogue input by the user, such as "How is the weather today?"; the server inputs this utterance into the text generation model, which generates the corresponding reply as the target text, such as "The weather is fine today!" or "According to the forecast, it will rain today.", thereby forming a dialogue; finally, the server outputs the target text to the client.
In the embodiment corresponding to FIG. 2, a real text data set is obtained and positive text samples are obtained from it; an initial generator model is established, and the positive text samples are input into it for pre-training to obtain the generator model, from which first negative text samples are generated; an initial discriminator model is established, and the positive text samples and the first negative text samples are input into it for pre-training to obtain the discriminator model; test text is generated based on the generator model and input into the discriminator model to obtain its reward value, the gradient of the generator model is calculated according to the reward value, and the generator model is updated according to the gradient; second negative text samples are generated according to the updated generator model and input together with the positive text samples into the discriminator model, which is updated by minimizing the cross entropy; the generator model and the discriminator model are updated alternately, and if the output of the discriminator model converges, the text generation model is obtained from the generator model at the time of convergence; finally, text to be recognized is obtained and input into the text generation model, and target text is generated based on the text generation model. By constructing a generator model and a discriminator model and letting them confront and improve each other continuously, a text generation model can be constructed quickly and the generated text is highly accurate, which improves both the construction efficiency of the text generation model and the precision of the generated text.
In an embodiment, as shown in FIG. 4, step S20, that is, establishing an initial generator model, inputting the positive text samples into the initial generator model for pre-training to obtain the generator model, and generating the first negative text samples according to the generator model, may specifically include the following steps:
S21: Input initial generation parameters into a recurrent neural network to establish the initial generator model.
Optionally, the initial generation parameters can be randomly selected parameters of the recurrent neural network (RNN); that is, before pre-training, parameters can be randomly selected and input into the RNN to obtain the initial generator model.
S22: Input the positive text samples into the initial generator model for pre-training, convert the result into a probability output according to a probability distribution function, and obtain the pre-trained parameters.
Specifically, the server inputs the positive text samples into the initial generator model for pre-training. Suppose a positive text sample is (x_1, x_2, ..., x_T). It is first mapped recursively in the RNN to the hidden states (h_1, h_2, ..., h_T), where a hidden state is an input parameter of the hidden layers of the recurrent neural network and at the same time the output of a neuron, expressed by the following formula:

h_t = g(h_{t-1}, x_t) = \sigma(W x_t + U h_{t-1})

where W is a weight matrix and U is the transition matrix acting on the previous hidden state h_{t-1}. \sigma can be the sigmoid function or the hyperbolic tangent function (tanh), depending on the specific situation.
The hidden states are then converted into output probabilities with a probability distribution function; optionally, the softmax function can be used as the probability distribution function, expressed by the following formula:

P(y_t \mid x_1, x_2, ..., x_t) = z(h_t) = \mathrm{softmax}(c + V h_t)

This formula means that, given (x_1, x_2, ..., x_t), the distribution of the RNN output y_t is \mathrm{softmax}(c + V h_t); z(h_t) indicates that a function z of h_t is needed to convert the output into the form of a probability, whose values lie in [0,1], and this function z can be taken to be the softmax function.
Specifically, after the server inputs the positive text samples into the RNN of the initial generator model for pre-training, the pre-trained parameters c and V are obtained.
S23: Update the parameters of the initial generator model according to the pre-trained parameters to obtain the generator model.
Specifically, the original initial generation parameters of the initial generator model are updated according to the parameters c and V obtained after pre-training, yielding the generator model. It can be understood that the generator model can be denoted G_\theta, and the model parameters \theta of G_\theta can be obtained from the parameters c and V. After the generator model G_\theta is obtained, certain sample data can be extracted from the real text data set and input into G_\theta to generate the first negative text samples.
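As a hedged illustration of steps S21 to S23, the generator could be sketched as follows in PyTorch; nn.RNN holds the weight matrices W and U of the recursion h_t = σ(W x_t + U h_{t-1}) internally (with tanh as σ), and the linear output layer holds V and c. The vocabulary size and layer dimensions are assumptions.

```python
# Sketch of the RNN generator of S21-S23, assuming PyTorch; the vocabulary
# size and layer dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class GeneratorRNN(nn.Module):
    def __init__(self, vocab_size=5000, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # h_t = tanh(W x_t + U h_{t-1}); nn.RNN stores W and U internally.
        self.rnn = nn.RNN(emb_dim, hidden_dim, batch_first=True)
        # P(y_t | x_1..x_t) = softmax(c + V h_t); Linear stores V and c.
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, h0=None):
        h, h_last = self.rnn(self.emb(tokens), h0)
        return torch.softmax(self.out(h), dim=-1), h_last

# Pre-training by maximum likelihood on the positive text samples would fit
# the parameters, including c and V, before adversarial training begins.
gen = GeneratorRNN()
probs, _ = gen(torch.randint(0, 5000, (2, 7)))  # probs: (batch=2, T=7, vocab)
```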
In the embodiment corresponding to FIG. 4, the initial generator model is established by inputting initial generation parameters into a recurrent neural network; the positive text samples are then input into the initial generator model for pre-training and converted into probability outputs according to the probability distribution function, obtaining the pre-trained parameters; finally, the parameters of the initial generator model are updated according to the pre-trained parameters, obtaining the generator model. Building the generator model with a recurrent neural network suits the discrete nature of generated text and makes the final text generation model output text more efficiently; in addition, pre-training the generator model first makes it possible to generate some negative samples with the pre-trained generator model, enabling pre-training of the discriminator model.
In an embodiment, as shown in FIG. 5, step S30, that is, establishing an initial discriminator model and inputting the positive text samples and the first negative text samples into the initial discriminator model for pre-training to obtain the discriminator model, may specifically include the following steps:
S31: Input initial discrimination parameters into a convolutional neural network to establish the initial discriminator model.
Optionally, the initial discrimination parameters can be randomly selected parameters of the convolutional neural network (CNN); that is, before pre-training, parameters can be randomly selected and input into the CNN to obtain the initial discriminator model.
S32: Input the positive text samples and the first negative text samples into the initial discriminator model for pre-training, convert the result into probability outputs according to a probability distribution function, and update the initial discrimination parameters of the initial discriminator by minimizing the cross entropy to obtain the pre-trained discrimination parameters.
Specifically, the training samples are labeled: the positive text samples are labeled 1, and the first negative text samples are labeled 0.
First, a positive text sample, for example (x_1, x_2, ..., x_T) represented as a matrix \varepsilon, is input into the CNN of the initial discriminator model. The CNN applies a convolution kernel \omega \in R^{l \times k} to the positive text sample to obtain its features, expressed by the following formula:

c_i = \omega \otimes \varepsilon_{i:i+l-1} + b

where the convolution kernel \omega \in R^{l \times k} is an l \times k real matrix; \varepsilon_{i:i+l-1} denotes rows i to i+l-1 of the positive text sample, also an l \times k real matrix; b is a parameter to be learned, a real number; and \otimes denotes the sum of the products of the corresponding elements of the two matrices.
Max pooling is then applied:

\tilde{c} = \max\{c_1, c_2, ..., c_{T-l+1}\}

that is, the pooling takes the maximum of the features c_i extracted from the positive text sample. Optionally, average pooling can also be used here, which is not specifically limited.
After a certain number of convolution and pooling operations, the result passes through fully connected layers (FC), that is, the output layer, and is converted into a probability output with the sigmoid function.
Similarly, the first negative text samples labeled 0 are input into the CNN, go through the same process, and are finally converted into probability outputs with the sigmoid function.
Finally, after pre-training on the positive text samples and the first negative text samples, the pre-trained discrimination parameters, namely \omega and b, are obtained.
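For illustration, the feature extraction and pooling can be sketched as follows, assuming PyTorch; the window size l, the dimension k, and the number of kernels are assumptions.

```python
# Sketch of the convolution c_i = omega (x) eps_{i:i+l-1} + b and the max
# pooling of S32, assuming PyTorch; the sizes are illustrative assumptions.
import torch
import torch.nn as nn

T, k, l, n_kernels = 20, 64, 3, 100   # sequence length, vector dim, window
eps_matrix = torch.randn(1, 1, T, k)  # one sample; each row is a term vector

# Each kernel omega in R^{l x k} slides over rows i..i+l-1; Conv2d computes
# the sum of element-wise products plus the bias b, i.e. the feature c_i.
conv = nn.Conv2d(in_channels=1, out_channels=n_kernels, kernel_size=(l, k))
c = conv(eps_matrix).squeeze(3)       # shape (1, n_kernels, T - l + 1)

# Max pooling: c~ = max{c_1, ..., c_{T-l+1}} for each kernel.
c_tilde = c.max(dim=2).values         # shape (1, n_kernels)
```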
Optionally, in order for the discriminator model to achieve a good effect, after max pooling yields \tilde{c}, a highway network can be used to train the discriminator model, computed by the following formulas:

\tau = \sigma(W_T \cdot \tilde{c} + b_T)

\tilde{C} = \tau \cdot H(\tilde{c}, W_H) + (1 - \tau) \cdot \tilde{c}

where \tau is the transform gate of the highway layer; W_T, b_T and W_H are the weights of the highway layer; and H is an affine transformation followed by a nonlinear activation function (for example the rectified linear unit, ReLU). Denoting the rectified linear function by f, H(\tilde{c}, W_H) = f(W_H \cdot \tilde{c}). Finally, the sigmoid function converts the result into a probability output:

D_\phi(Y) = \mathrm{sigmoid}(W_0 \cdot \tilde{C} + b_0)

where W_0 and b_0 are the weight and bias of the output layer of the discriminator.
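Continuing the previous sketch, the highway layer and the output layer might look as follows, assuming PyTorch; the dimensions are assumptions.

```python
# Sketch of the highway layer and sigmoid output of the discriminator,
# assuming PyTorch and the pooled c_tilde from the previous sketch.
import torch
import torch.nn as nn

n_kernels = 100
c_tilde = torch.randn(1, n_kernels)               # stands in for the pooled features

W_T = nn.Linear(n_kernels, n_kernels)             # transform gate weights W_T, b_T
W_H = nn.Linear(n_kernels, n_kernels, bias=False) # highway weights W_H
W_0 = nn.Linear(n_kernels, 1)                     # output layer weights W_0, b_0

tau = torch.sigmoid(W_T(c_tilde))                 # tau = sigma(W_T c~ + b_T)
H = torch.relu(W_H(c_tilde))                      # H(c~, W_H) = f(W_H c~), f = ReLU
C_tilde = tau * H + (1.0 - tau) * c_tilde         # C~ = tau*H + (1 - tau)*c~
d_out = torch.sigmoid(W_0(C_tilde))               # D_phi(Y), a number in [0, 1]
```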
S33: Update the parameters of the initial discriminator model according to the pre-trained discrimination parameters to obtain the discriminator model.
Specifically, the parameters of the initial discriminator model are updated according to the pre-trained discrimination parameters \omega and b, yielding the discriminator model. It can be understood that the discriminator model can be denoted D_\phi, where the parameters \phi of the discriminator model can be obtained from the parameters \omega and b. Once the discriminator model is obtained, adversarial training of the generator model and the discriminator model can be carried out, updating the two alternately until the model converges, which yields the final text generation model.
In the embodiment corresponding to FIG. 5, the initial discriminator model is established by inputting initial discrimination parameters into a convolutional neural network; the positive text samples and the first negative text samples are then input into the initial discriminator model for pre-training and converted into probability outputs according to the probability distribution function, and the initial discrimination parameters of the initial discriminator are updated by minimizing the cross entropy, obtaining the pre-trained discrimination parameters; finally, the parameters of the initial discriminator model are updated with the pre-trained discrimination parameters, obtaining the discriminator model. Training the initial discriminator model with the negative samples generated by the generator model and the positive text samples yields the discriminator model; once the discriminator model is obtained, the generator model and the discriminator model can be trained adversarially, finally producing the text generation model.
In an embodiment, as shown in FIG. 6, step S40, that is, generating test text based on the generator model, inputting the test text into the discriminator model to obtain its reward value, calculating the gradient of the generator model according to the reward value, and updating the generator model according to the gradient, may specifically include the following steps:
S41: Obtain the texts produced in the process of generating the test text as test sub-texts.
It can be understood that the generator model goes through many intermediate steps in generating the test text. For example, if the finally generated text is "床前明月光" ("Moonlight before my bed"), the generator model produces the intermediate texts "床", "床前", "床前明", and so on; the server can obtain these intermediate texts as the test sub-texts.
S42: Generate M hypothetical texts from each test sub-text using Monte Carlo search.
Here, the Monte Carlo method refers to solving computational problems using random numbers (or, more commonly, pseudo-random numbers).
It should be understood that, since the discriminator model can only judge the authenticity of a complete sentence, the reward values of the test sub-texts need to be obtained while the generator model generates the test text, so that the generator model can learn and its gradient can be calculated. Specifically, the server uses Monte Carlo search to generate M hypothetical texts from a test sub-text, inputs the M hypothetical texts into the discriminator model to obtain reward values, and takes the mean of these reward values as the reward value of the test sub-text. Generating the M hypothetical texts by Monte Carlo search can be expressed as:

\{Y_{1:T}^{1}, ..., Y_{1:T}^{M}\} = MC^{G_\beta}(Y_{1:t}; M)

This expression means that, given the test sub-text Y_{1:t}, M hypothetical texts are generated by Monte Carlo search. The Monte Carlo search must follow a probability distribution, namely G_\beta; here G_\beta = G_\theta is taken, so the M hypothetical texts can be generated by Monte Carlo search.
S43: Input the M hypothetical texts into the discriminator model, take the mean of the rewards of the M hypothetical texts as the reward value of the test sub-text, and input the test text into the discriminator model to obtain the reward value of the test text.
Specifically, the reward values of the test sub-texts and the test text can be calculated with the following formula:

Q_{D_\phi}^{G_\theta}(s = Y_{1:t-1}, a = y_t) =
\begin{cases}
\frac{1}{M} \sum_{m=1}^{M} D_\phi(Y_{1:T}^{m}), \; Y_{1:T}^{m} \in MC^{G_\beta}(Y_{1:t}; M), & t < T \\
D_\phi(Y_{1:t}), & t = T
\end{cases}

where the discriminator model D_\phi(Y) returns the probability that the test sample Y is a real sample, a number in [0,1]. Time T means the whole poem has been generated, so the reward value at time T can be given directly by the discriminator. The reward values at times t = 1 to T-1 must be given by Monte Carlo search simulation: at time t the state is the test sub-text Y_{1:t-1} together with the current word y_t, and starting from the prefix Y_{1:t}, Monte Carlo search is performed M times to obtain M hypothetical texts Y_{1:T}; the average of the reward values of these M hypothetical texts is used as the reward value at time t. In this way, since every intermediate step has a defined reward value, the generator model can be trained with reinforcement learning (RL).
S44: Calculate the gradient of the generator model according to the reward values of the test sub-texts and the reward value of the test text, and update the parameters of the generator model according to the gradient to obtain the updated generator model.
Specifically, after the reward values of the test sub-texts and the test text are obtained, the policy gradient of the generator model can be calculated with the following formula:

\nabla_\theta J(\theta) = \sum_{t=1}^{T} E_{Y_{1:t-1} \sim G_\theta} \Big[ \sum_{y_t} \nabla_\theta G_\theta(y_t \mid Y_{1:t-1}) \cdot Q_{D_\phi}^{G_\theta}(Y_{1:t-1}, y_t) \Big]

The expectation E in the above gradient can be approximated by sampling, and the parameter \theta of the generator model is then updated as:

\theta \leftarrow \theta + \alpha_h \nabla_\theta J(\theta)

The server then obtains the updated generator model from the updated parameters, uses the updated generator model to update the discriminator model, and updates the generator model and the discriminator model alternately until the discriminator model converges, finally obtaining the text generation model from the generator model at the time of convergence. When the generator model is updated, the discriminator model is kept fixed; the number of parameter updates of the generator model can be set according to the actual situation and is not specifically limited here.
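For illustration, the rollout rewards of steps S42 to S44 might be computed as in the sketch below, assuming PyTorch. The helper rollout(prefix, T), which completes a prefix to length T by sampling from G_β = G_θ, and a discriminator that maps a batch of token sequences to probabilities are assumptions, not part of the embodiment.

```python
# Sketch of the Monte Carlo rollout rewards Q of S42-S43, assuming PyTorch;
# rollout(prefix, T) and the discriminator interface are assumed helpers.
import torch

def mc_rewards(discriminator, rollout, sequence, M=16):
    T = sequence.size(1)                  # sequence: (batch, T) token ids
    rewards = []
    for t in range(1, T + 1):
        prefix = sequence[:, :t]          # test sub-text Y_{1:t}
        if t < T:
            # Average D_phi over M hypothetical completions of the prefix.
            rollouts = torch.stack([rollout(prefix, T) for _ in range(M)])
            q = discriminator(rollouts.flatten(0, 1)).view(M, -1).mean(dim=0)
        else:
            q = discriminator(prefix)     # full text: D_phi gives the reward
        rewards.append(q)
    return torch.stack(rewards, dim=1)    # (batch, T), one Q value per step
```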
In the embodiment corresponding to FIG. 6, the texts produced in the process of generating the test text are obtained as test sub-texts, and M hypothetical texts are generated from each test sub-text by Monte Carlo search; the M hypothetical texts are then input into the discriminator model, the mean of their rewards is taken as the reward value of the test sub-text, and the test text is input into the discriminator model to obtain its reward value; finally, the gradient of the generator model is calculated according to the reward values of the test sub-texts and the test text, and the parameters of the generator model are updated according to the gradient, obtaining the updated generator model. By using Monte Carlo search, the intermediate texts produced by the generator model receive corresponding reward values, so the generator model can be trained with reinforcement learning, which improves its training efficiency.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
In an embodiment, a text generation device is provided, which corresponds one-to-one to the text generation method in the above embodiments. As shown in FIG. 7, the text generation device includes a positive text sample acquisition module 10, a generator model acquisition module 20, a discriminator model acquisition module 30, a generator model update module 40, a discriminator model update module 50, a text generation model acquisition module 60, and a target text generation module 70. The functional modules are described in detail as follows:
The positive text sample acquisition module 10 is configured to obtain a real text data set and obtain positive text samples from the real text data set.
The generator model acquisition module 20 is configured to establish an initial generator model, input the positive text samples into the initial generator model for pre-training to obtain the generator model, and generate first negative text samples according to the generator model.
The discriminator model acquisition module 30 is configured to establish an initial discriminator model, and input the positive text samples and the first negative text samples into the initial discriminator model for pre-training to obtain the discriminator model.
The generator model update module 40 is configured to generate test text based on the generator model, input the test text into the discriminator model to obtain its reward value, calculate the gradient of the generator model according to the reward value, and update the generator model according to the gradient.
The discriminator model update module 50 is configured to generate second negative text samples according to the updated generator model, input the second negative text samples and the positive text samples into the discriminator model, and update the discriminator model by minimizing the cross entropy.
The text generation model acquisition module 60 is configured to update the generator model and the discriminator model alternately, and, if the output of the discriminator model converges, obtain the text generation model from the generator model at the time of convergence.
The target text generation module 70 is configured to obtain text to be recognized, input the text to be recognized into the text generation model, and generate target text based on the text generation model.
Further, the positive text sample acquisition module 10 is also configured to:
select N text data from the real text data set, where N is a positive integer;
convert the N text data into vector form using the word vector model, and use the N text data converted into vector form as the positive text samples.
Further, as shown in FIG. 8, the generator model acquisition module 20 includes an initial generation model establishing unit 21, an initial generation model pre-training unit 22, and a generator model acquisition unit 23.
The initial generation model establishing unit 21 is configured to input initial generation parameters into a recurrent neural network to establish the initial generator model.
The initial generation model pre-training unit 22 is configured to input the positive text samples into the initial generator model for pre-training, and convert the result into probability outputs according to the probability distribution function to obtain the pre-trained parameters.
The generator model acquisition unit 23 is configured to update the parameters of the initial generator model according to the pre-trained parameters to obtain the generator model.
Further, as shown in FIG. 9, the discriminator model acquisition module 30 includes an initial discriminant model establishing unit 31, an initial discriminant model pre-training unit 32, and a discriminator model acquisition unit 33.
The initial discriminant model establishing unit 31 is configured to input initial discrimination parameters into a convolutional neural network to establish the initial discriminator model.
The initial discriminant model pre-training unit 32 is configured to input the positive text samples and the first negative text samples into the initial discriminator model for pre-training, convert the result into probability outputs according to the probability distribution function, and update the initial discrimination parameters of the initial discriminator by minimizing the cross entropy to obtain the pre-trained discrimination parameters.
The discriminator model acquisition unit 33 is configured to update the parameters of the initial discriminator model according to the pre-trained discrimination parameters to obtain the discriminator model.
Further, the generator model update module 40 is also configured to:
obtain the texts produced in the process of generating the test text as test sub-texts;
generate M hypothetical texts from each test sub-text using Monte Carlo search;
input the M hypothetical texts into the discriminator model, take the mean of the rewards of the M hypothetical texts as the reward value of the test sub-text, and input the test text into the discriminator model to obtain the reward value of the test text;
calculate the gradient of the generator model according to the reward values of the test sub-texts and the reward value of the test text, and update the parameters of the generator model according to the gradient to obtain the updated generator model.
For the specific limitations of the text generation device, reference can be made to the above limitations of the text generation method, which will not be repeated here. Each module in the above text generation device can be implemented in whole or in part by software, hardware, or a combination thereof. The above modules can be embedded in hardware form in, or be independent of, a processor of a computer device, or can be stored in software form in a memory of the computer device, so that the processor can invoke and execute the operations corresponding to the above modules.
In an embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 10. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions in the non-volatile storage medium. The database of the computer device stores real text data sets, positive text samples, negative text samples, word vector models, and the like. The network interface of the computer device communicates with external terminals through a network connection. The computer-readable instructions, when executed by the processor, implement a text generation method.
In an embodiment, a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the processor implements the following steps when executing the computer-readable instructions:
obtaining a real text data set, and obtaining positive text samples from the real text data set;
establishing an initial generator model, inputting the positive text samples into the initial generator model for pre-training to obtain a generator model, and generating first negative text samples according to the generator model;
establishing an initial discriminator model, and inputting the positive text samples and the first negative text samples into the initial discriminator model for pre-training to obtain a discriminator model;
generating test text based on the generator model, inputting the test text into the discriminator model to obtain a reward value of the test text, calculating the gradient of the generator model according to the reward value, and updating the generator model according to the gradient;
generating second negative text samples according to the updated generator model, inputting the second negative text samples and the positive text samples into the discriminator model, and updating the discriminator model by minimizing the cross entropy;
updating the generator model and the discriminator model alternately, and, if the output of the discriminator model converges, obtaining a text generation model from the generator model at the time of convergence;
obtaining text to be recognized, inputting the text to be recognized into the text generation model, and generating target text based on the text generation model.
In an embodiment, one or more readable storage media storing computer-readable instructions are provided, where the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps:
obtaining a real text data set, and obtaining positive text samples from the real text data set;
establishing an initial generator model, inputting the positive text samples into the initial generator model for pre-training to obtain a generator model, and generating first negative text samples according to the generator model;
establishing an initial discriminator model, and inputting the positive text samples and the first negative text samples into the initial discriminator model for pre-training to obtain a discriminator model;
generating test text based on the generator model, inputting the test text into the discriminator model to obtain a reward value of the test text, calculating the gradient of the generator model according to the reward value, and updating the generator model according to the gradient;
generating second negative text samples according to the updated generator model, inputting the second negative text samples and the positive text samples into the discriminator model, and updating the discriminator model by minimizing the cross entropy;
updating the generator model and the discriminator model alternately, and, if the output of the discriminator model converges, obtaining a text generation model from the generator model at the time of convergence;
obtaining text to be recognized, inputting the text to be recognized into the text generation model, and generating target text based on the text generation model.
The readable storage media include non-volatile readable storage media and volatile readable storage media.
A person of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by computer-readable instructions instructing the relevant hardware. The computer-readable instructions may be stored in a non-volatile computer-readable storage medium, and when executed, may include the processes of the above method embodiments. Any reference to memory, storage, a database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
Those skilled in the art can clearly understand that, for convenience and brevity of description, the division into the above functional units and modules is used only as an example. In practical applications, the above functions may be allocated to different functional units or modules as needed; that is, the internal structure of the device may be divided into different functional units or modules to complete all or part of the functions described above.
The above embodiments are only used to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application, and shall all fall within the scope of protection of this application.

Claims (20)

  1. A text generation method, comprising:
    obtaining a real text data set, and obtaining positive text samples from the real text data set;
    establishing an initial generator model, inputting the positive text samples into the initial generator model for pre-training to obtain a generator model, and generating first negative text samples according to the generator model;
    establishing an initial discriminator model, and inputting the positive text samples and the first negative text samples into the initial discriminator model for pre-training to obtain a discriminator model;
    generating test text based on the generator model, inputting the test text into the discriminator model to obtain a reward value of the test text, calculating a gradient of the generator model according to the reward value, and updating the generator model according to the gradient;
    generating second negative text samples according to the updated generator model, inputting the second negative text samples and the positive text samples into the discriminator model, and updating the discriminator model by minimizing cross-entropy;
    alternately updating the generator model and the discriminator model, and if the output of the discriminator model converges, obtaining a text generation model according to the generator model at convergence;
    obtaining text to be recognized, inputting the text to be recognized into the text generation model, and generating target text based on the text generation model.
  2. The text generation method according to claim 1, wherein obtaining positive text samples from the real text data set comprises:
    selecting N pieces of text data from the real text data set, where N is a positive integer;
    converting the N pieces of text data into vector form using a word vector model, and using the N pieces of text data converted into vector form as the positive text samples.
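As a non-limiting illustration of the sample preparation in claim 2, the sketch below selects N text data items and converts them to vector form with a word vector model. It assumes gensim 4 or later (Word2Vec being one possible word vector model), whitespace-tokenized text, and an illustrative vector size.

```python
from gensim.models import Word2Vec

def build_positive_samples(sentences, n, vector_size=128):
    chosen = sentences[:n]                    # select N text data items
    tokenized = [s.split() for s in chosen]   # simple whitespace tokenization
    w2v = Word2Vec(tokenized, vector_size=vector_size, min_count=1)
    # each positive sample is the sequence of word vectors for one text
    return [[w2v.wv[word] for word in tokens] for tokens in tokenized]
```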
  3. The text generation method according to claim 1, wherein establishing an initial generator model, inputting the positive text samples into the initial generator model for pre-training to obtain a generator model, and generating first negative text samples according to the generator model comprises:
    inputting initial generation parameters into a recurrent neural network to establish the initial generator model;
    inputting the positive text samples into the initial generator model for pre-training, and converting the result into a probability output according to a probability distribution function to obtain pre-trained parameters;
    updating the parameters of the initial generator model according to the pre-trained parameters to obtain the generator model.
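A minimal sketch of the generator pre-training in claim 3: a recurrent neural network language model whose hidden states are turned into a probability output by a softmax (here applied inside the cross-entropy loss), trained by maximum likelihood on the positive samples. The GRU cell choice and all sizes are assumptions for illustration, not part of the claim.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)  # softmax applied in the loss

    def forward(self, tokens):                     # tokens: (batch, length) ids
        h, _ = self.rnn(self.emb(tokens))
        return self.out(h)                         # next-token logits

def pretrain_generator(gen, batches, epochs=3, lr=1e-3):
    opt = torch.optim.Adam(gen.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for seq in batches:                        # seq: (batch, length) ids
            logits = gen(seq[:, :-1])              # predict token t from tokens < t
            loss = loss_fn(logits.reshape(-1, logits.size(-1)),
                           seq[:, 1:].reshape(-1))
            opt.zero_grad(); loss.backward(); opt.step()
```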
  4. The text generation method according to claim 1, wherein establishing an initial discriminator model, and inputting the positive text samples and the first negative text samples into the initial discriminator model for pre-training to obtain a discriminator model comprises:
    inputting initial discrimination parameters into a convolutional neural network to establish the initial discriminator model;
    inputting the positive text samples and the first negative text samples into the initial discriminator model for pre-training, converting the result into a probability output according to a probability distribution function, and updating the initial discrimination parameters of the initial discriminator by minimizing cross-entropy to obtain pre-trained discrimination parameters;
    updating the parameters of the initial discriminator model according to the pre-trained discrimination parameters to obtain the discriminator model.
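Likewise, a sketch of the convolutional discriminator of claim 4: a small text CNN over word embeddings producing one real/fake score per sequence; pre-training would minimize cross-entropy between these scores and the positive/negative labels, as in the adversarial loop sketched earlier. Channel and kernel sizes are illustrative assumptions.

```python
import torch.nn as nn
import torch.nn.functional as F

class Discriminator(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, channels=32, kernel=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, channels, kernel)
        self.fc = nn.Linear(channels, 1)

    def forward(self, tokens):                      # tokens: (batch, length) ids
        x = self.emb(tokens).transpose(1, 2)        # (batch, emb_dim, length)
        x = F.relu(self.conv(x)).max(dim=2).values  # max-over-time pooling
        return self.fc(x).squeeze(1)                # one real/fake logit per text
```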
  5. The text generation method according to claim 1, wherein generating test text based on the generator model, inputting the test text into the discriminator model to obtain a reward value of the test text, calculating a gradient of the generator model according to the reward value, and updating the generator model according to the gradient comprises:
    obtaining the text produced during generation of the test text as test sub-texts;
    generating M hypothetical texts for each test sub-text using a Monte Carlo search;
    inputting the M hypothetical texts into the discriminator model, obtaining the mean reward of the M hypothetical texts as the reward value of the test sub-text, and inputting the test text into the discriminator model to obtain the reward value of the test text;
    calculating the gradient of the generator model according to the reward values of the test sub-texts and the reward value of the test text, and updating the parameters of the generator model according to the gradient to obtain the updated generator model.
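The reward computation of claim 5 can be sketched as follows: each test sub-text (prefix of the test text) is completed M times by Monte Carlo rollouts, the discriminator scores the completed hypothetical texts, and the mean score becomes that sub-text's reward; the finished test text is scored directly. In this sketch, `generator.rollout(prefix, length)` is a hypothetical helper that samples a completion of the given prefix up to the target length.

```python
import torch

def rollout_rewards(generator, discriminator, test_text, M=16):
    length = test_text.size(1)
    rewards = []
    for t in range(1, length):                   # each test sub-text (prefix)
        prefix = test_text[:, :t]
        scores = [torch.sigmoid(discriminator(generator.rollout(prefix, length)))
                  for _ in range(M)]             # M hypothetical completions
        rewards.append(torch.stack(scores).mean(dim=0))  # mean reward over M
    # the complete test text is scored by the discriminator directly
    rewards.append(torch.sigmoid(discriminator(test_text)))
    return torch.stack(rewards, dim=1)           # (batch, length) reward values
```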
  6. A text generation device, comprising:
    a positive text sample obtaining module, configured to obtain a real text data set and obtain positive text samples from the real text data set;
    a generator model obtaining module, configured to establish an initial generator model, input the positive text samples into the initial generator model for pre-training to obtain a generator model, and generate first negative text samples according to the generator model;
    a discriminator model obtaining module, configured to establish an initial discriminator model, and input the positive text samples and the first negative text samples into the initial discriminator model for pre-training to obtain a discriminator model;
    a generator model updating module, configured to generate test text based on the generator model, input the test text into the discriminator model to obtain a reward value of the test text, calculate a gradient of the generator model according to the reward value, and update the generator model according to the gradient;
    a discriminator model updating module, configured to generate second negative text samples according to the updated generator model, input the second negative text samples and the positive text samples into the discriminator model, and update the discriminator model by minimizing cross-entropy;
    a text generation model obtaining module, configured to alternately update the generator model and the discriminator model, and if the output of the discriminator model converges, obtain a text generation model according to the generator model at convergence;
    a target text generating module, configured to obtain text to be recognized, input the text to be recognized into the text generation model, and generate target text based on the text generation model.
  7. The text generation device according to claim 6, wherein the generator model obtaining module includes an initial generation model establishing unit, an initial generation model pre-training unit, and a generator model obtaining unit;
    the initial generation model establishing unit is configured to input initial generation parameters into a recurrent neural network to establish the initial generator model;
    the initial generation model pre-training unit is configured to input the positive text samples into the initial generator model for pre-training, and convert the result into a probability output according to a probability distribution function to obtain pre-trained parameters;
    the generator model obtaining unit is configured to update the parameters of the initial generator model according to the pre-trained parameters to obtain the generator model.
  8. The text generation device according to claim 6, wherein the discriminator model obtaining module includes an initial discriminant model establishing unit, an initial discriminant model pre-training unit, and a discriminator model obtaining unit;
    the initial discriminant model establishing unit is configured to input initial discrimination parameters into a convolutional neural network to establish the initial discriminator model;
    the initial discriminant model pre-training unit is configured to input the positive text samples and the first negative text samples into the initial discriminator model for pre-training, convert the result into a probability output according to a probability distribution function, and update the initial discrimination parameters of the initial discriminator by minimizing cross-entropy to obtain pre-trained discrimination parameters;
    the discriminator model obtaining unit is configured to update the parameters of the initial discriminator model according to the pre-trained discrimination parameters to obtain the discriminator model.
  9. The text generation device according to claim 6, wherein the generator model updating module is further configured to: obtain the text produced during generation of the test text as test sub-texts; generate M hypothetical texts for each test sub-text using a Monte Carlo search; input the M hypothetical texts into the discriminator model, obtain the mean reward of the M hypothetical texts as the reward value of the test sub-text, and input the test text into the discriminator model to obtain the reward value of the test text; and calculate the gradient of the generator model according to the reward values of the test sub-texts and the reward value of the test text, and update the parameters of the generator model according to the gradient to obtain the updated generator model.
  10. The text generation device according to claim 6, wherein the positive text sample obtaining module is further configured to: select N pieces of text data from the real text data set, where N is a positive integer; and convert the N pieces of text data into vector form using a word vector model, using the N pieces of text data converted into vector form as the positive text samples.
  11. A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
    obtaining a real text data set, and obtaining positive text samples from the real text data set;
    establishing an initial generator model, inputting the positive text samples into the initial generator model for pre-training to obtain a generator model, and generating first negative text samples according to the generator model;
    establishing an initial discriminator model, and inputting the positive text samples and the first negative text samples into the initial discriminator model for pre-training to obtain a discriminator model;
    generating test text based on the generator model, inputting the test text into the discriminator model to obtain a reward value of the test text, calculating a gradient of the generator model according to the reward value, and updating the generator model according to the gradient;
    generating second negative text samples according to the updated generator model, inputting the second negative text samples and the positive text samples into the discriminator model, and updating the discriminator model by minimizing cross-entropy;
    alternately updating the generator model and the discriminator model, and if the output of the discriminator model converges, obtaining a text generation model according to the generator model at convergence;
    obtaining text to be recognized, inputting the text to be recognized into the text generation model, and generating target text based on the text generation model.
  12. The computer device according to claim 11, wherein obtaining positive text samples from the real text data set comprises:
    selecting N pieces of text data from the real text data set, where N is a positive integer;
    converting the N pieces of text data into vector form using a word vector model, and using the N pieces of text data converted into vector form as the positive text samples.
  13. The computer device according to claim 11, wherein establishing an initial generator model, inputting the positive text samples into the initial generator model for pre-training to obtain a generator model, and generating first negative text samples according to the generator model comprises:
    inputting initial generation parameters into a recurrent neural network to establish the initial generator model;
    inputting the positive text samples into the initial generator model for pre-training, and converting the result into a probability output according to a probability distribution function to obtain pre-trained parameters;
    updating the parameters of the initial generator model according to the pre-trained parameters to obtain the generator model.
  14. The computer device according to claim 11, wherein establishing an initial discriminator model, and inputting the positive text samples and the first negative text samples into the initial discriminator model for pre-training to obtain a discriminator model comprises:
    inputting initial discrimination parameters into a convolutional neural network to establish the initial discriminator model;
    inputting the positive text samples and the first negative text samples into the initial discriminator model for pre-training, converting the result into a probability output according to a probability distribution function, and updating the initial discrimination parameters of the initial discriminator by minimizing cross-entropy to obtain pre-trained discrimination parameters;
    updating the parameters of the initial discriminator model according to the pre-trained discrimination parameters to obtain the discriminator model.
  15. The computer device according to claim 11, wherein generating test text based on the generator model, inputting the test text into the discriminator model to obtain a reward value of the test text, calculating a gradient of the generator model according to the reward value, and updating the generator model according to the gradient comprises:
    obtaining the text produced during generation of the test text as test sub-texts;
    generating M hypothetical texts for each test sub-text using a Monte Carlo search;
    inputting the M hypothetical texts into the discriminator model, obtaining the mean reward of the M hypothetical texts as the reward value of the test sub-text, and inputting the test text into the discriminator model to obtain the reward value of the test text;
    calculating the gradient of the generator model according to the reward values of the test sub-texts and the reward value of the test text, and updating the parameters of the generator model according to the gradient to obtain the updated generator model.
  16. One or more readable storage media storing computer-readable instructions, wherein when the computer-readable instructions are executed by one or more processors, the one or more processors perform the following steps:
    obtaining a real text data set, and obtaining positive text samples from the real text data set;
    establishing an initial generator model, inputting the positive text samples into the initial generator model for pre-training to obtain a generator model, and generating first negative text samples according to the generator model;
    establishing an initial discriminator model, and inputting the positive text samples and the first negative text samples into the initial discriminator model for pre-training to obtain a discriminator model;
    generating test text based on the generator model, inputting the test text into the discriminator model to obtain a reward value of the test text, calculating a gradient of the generator model according to the reward value, and updating the generator model according to the gradient;
    generating second negative text samples according to the updated generator model, inputting the second negative text samples and the positive text samples into the discriminator model, and updating the discriminator model by minimizing cross-entropy;
    alternately updating the generator model and the discriminator model, and if the output of the discriminator model converges, obtaining a text generation model according to the generator model at convergence;
    obtaining text to be recognized, inputting the text to be recognized into the text generation model, and generating target text based on the text generation model.
  17. The readable storage media according to claim 16, wherein obtaining positive text samples from the real text data set comprises:
    selecting N pieces of text data from the real text data set, where N is a positive integer;
    converting the N pieces of text data into vector form using a word vector model, and using the N pieces of text data converted into vector form as the positive text samples.
  18. The readable storage media according to claim 16, wherein establishing an initial generator model, inputting the positive text samples into the initial generator model for pre-training to obtain a generator model, and generating first negative text samples according to the generator model comprises:
    inputting initial generation parameters into a recurrent neural network to establish the initial generator model;
    inputting the positive text samples into the initial generator model for pre-training, and converting the result into a probability output according to a probability distribution function to obtain pre-trained parameters;
    updating the parameters of the initial generator model according to the pre-trained parameters to obtain the generator model.
  19. The readable storage media according to claim 16, wherein establishing an initial discriminator model, and inputting the positive text samples and the first negative text samples into the initial discriminator model for pre-training to obtain a discriminator model comprises:
    inputting initial discrimination parameters into a convolutional neural network to establish the initial discriminator model;
    inputting the positive text samples and the first negative text samples into the initial discriminator model for pre-training, converting the result into a probability output according to a probability distribution function, and updating the initial discrimination parameters of the initial discriminator by minimizing cross-entropy to obtain pre-trained discrimination parameters;
    updating the parameters of the initial discriminator model according to the pre-trained discrimination parameters to obtain the discriminator model.
  20. The readable storage media according to claim 16, wherein generating test text based on the generator model, inputting the test text into the discriminator model to obtain a reward value of the test text, calculating a gradient of the generator model according to the reward value, and updating the generator model according to the gradient comprises:
    obtaining the text produced during generation of the test text as test sub-texts;
    generating M hypothetical texts for each test sub-text using a Monte Carlo search;
    inputting the M hypothetical texts into the discriminator model, obtaining the mean reward of the M hypothetical texts as the reward value of the test sub-text, and inputting the test text into the discriminator model to obtain the reward value of the test text;
    calculating the gradient of the generator model according to the reward values of the test sub-texts and the reward value of the test text, and updating the parameters of the generator model according to the gradient to obtain the updated generator model.
PCT/CN2019/116941 2019-01-24 2019-11-11 Text generation method and device, computer apparatus, and medium WO2020151310A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910067379.0A CN109885667A (en) 2019-01-24 2019-01-24 Document creation method, device, computer equipment and medium
CN201910067379.0 2019-01-24

Publications (1)

Publication Number Publication Date
WO2020151310A1 true WO2020151310A1 (en) 2020-07-30

Family

ID=66926787

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/116941 WO2020151310A1 (en) 2019-01-24 2019-11-11 Text generation method and device, computer apparatus, and medium

Country Status (2)

Country Link
CN (1) CN109885667A (en)
WO (1) WO2020151310A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109885667A (en) * 2019-01-24 2019-06-14 平安科技(深圳)有限公司 Document creation method, device, computer equipment and medium
CN112115257B (en) * 2019-06-20 2023-07-14 百度在线网络技术(北京)有限公司 Method and device for generating information evaluation model
CN111126503B (en) * 2019-12-27 2023-09-26 北京同邦卓益科技有限公司 Training sample generation method and device
CN111339749B (en) * 2020-03-02 2022-05-20 乐山师范学院 Unconditional text generating method, text generating device and storage medium
US11972604B2 (en) 2020-03-11 2024-04-30 Shenzhen Institutes Of Advanced Technology Image feature visualization method, image feature visualization apparatus, and electronic device
CN112036955B (en) * 2020-09-07 2021-09-24 贝壳找房(北京)科技有限公司 User identification method and device, computer readable storage medium and electronic equipment
CN112328750A (en) * 2020-11-26 2021-02-05 上海天旦网络科技发展有限公司 Method and system for training text discrimination model
CN115442324B (en) * 2021-06-04 2023-08-18 中国移动通信集团浙江有限公司 Message generation method, device, message management equipment and storage medium
CN114844767A (en) * 2022-04-27 2022-08-02 中国电子科技集团公司第五十四研究所 Alarm data generation method based on countermeasure generation network

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150254555A1 (en) * 2014-03-04 2015-09-10 SignalSense, Inc. Classifying data with deep learning neural records incrementally refined through expert input
CN108829898A (en) * 2018-06-29 2018-11-16 无码科技(杭州)有限公司 HTML content page issuing time extracting method and system
CN109003678A (en) * 2018-06-12 2018-12-14 清华大学 A kind of generation method and system emulating text case history
CN109885667A (en) * 2019-01-24 2019-06-14 平安科技(深圳)有限公司 Document creation method, device, computer equipment and medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180336439A1 (en) * 2017-05-18 2018-11-22 Intel Corporation Novelty detection using discriminator of generative adversarial network
CN108334497A (en) * 2018-02-06 2018-07-27 北京航空航天大学 The method and apparatus for automatically generating text
US10152970B1 (en) * 2018-02-08 2018-12-11 Capital One Services, Llc Adversarial learning and generation of dialogue responses
CN108923922B (en) * 2018-07-26 2021-04-23 北京工商大学 Text steganography method based on generation of confrontation network
CN109242090B (en) * 2018-08-28 2020-06-26 电子科技大学 Video description and description consistency judgment method based on GAN network

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150254555A1 (en) * 2014-03-04 2015-09-10 SignalSense, Inc. Classifying data with deep learning neural records incrementally refined through expert input
CN109003678A (en) * 2018-06-12 2018-12-14 清华大学 A kind of generation method and system emulating text case history
CN108829898A (en) * 2018-06-29 2018-11-16 无码科技(杭州)有限公司 HTML content page issuing time extracting method and system
CN109885667A (en) * 2019-01-24 2019-06-14 平安科技(深圳)有限公司 Document creation method, device, computer equipment and medium

Also Published As

Publication number Publication date
CN109885667A (en) 2019-06-14

Similar Documents

Publication Publication Date Title
WO2020151310A1 (en) Text generation method and device, computer apparatus, and medium
Goceri Analysis of deep networks with residual blocks and different activation functions: classification of skin diseases
Yu et al. Seqgan: Sequence generative adversarial nets with policy gradient
WO2020232877A1 (en) Question answer selection method and apparatus, computer device, and storage medium
US20220083868A1 (en) Neural network training method and apparatus, and electronic device
WO2021057884A1 (en) Sentence paraphrasing method, and method and apparatus for training sentence paraphrasing model
WO2021184902A1 (en) Image classification method and apparatus, training method and apparatus, device, and medium
US11776269B2 (en) Action classification in video clips using attention-based neural networks
CN113688244B (en) Text classification method, system, equipment and storage medium based on neural network
CN111061847A (en) Dialogue generation and corpus expansion method and device, computer equipment and storage medium
Wang et al. Text generation based on generative adversarial nets with latent variables
CN109523014B (en) News comment automatic generation method and system based on generative confrontation network model
CN109977394B (en) Text model training method, text analysis method, device, equipment and medium
WO2021139344A1 (en) Text generation method and apparatus based on artificial intelligence, computer device, and medium
CN111598213B (en) Network training method, data identification method, device, equipment and medium
CN112000788B (en) Data processing method, device and computer readable storage medium
CN116775843A (en) Question-answer pair evaluation data generation method, question-answer pair evaluation data generation device, computer equipment and storage medium
CN116992942B (en) Natural language model optimization method, device, natural language model, equipment and medium
CN110598210A (en) Entity recognition model training method, entity recognition device, entity recognition equipment and medium
CN117236421A (en) Large model training method based on federal knowledge distillation
CN114861671A (en) Model training method and device, computer equipment and storage medium
Zhang et al. Weight uncertainty in Boltzmann machine
Vargas et al. Relu-based activations: Analysis and experimental study for deep learning
WO2021151324A1 (en) Method and apparatus for medical data processing based on transfer learning, device, and medium
CN111797220A (en) Dialog generation method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19911305

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19911305

Country of ref document: EP

Kind code of ref document: A1