CN116795972B

CN116795972B - Model training method and device, storage medium and electronic equipment

Info

Publication number: CN116795972B
Application number: CN202311010097.XA
Authority: CN
Inventors: 谢冰; 宋伟; 朱世强; 王雨菡; 赵鑫安; 尹越; 袭向明
Original assignee: Zhejiang Lab
Current assignee: Zhejiang Lab
Priority date: 2023-08-11
Filing date: 2023-08-11
Publication date: 2024-01-09
Anticipated expiration: 2043-08-11
Also published as: CN116795972A

Abstract

The specification discloses a model training method, a device, a storage medium and electronic equipment, wherein a plurality of independent expression sentences are input into a generator of the model to obtain a fusion sentence, the fusion sentence is added into a training data set, sentences to be distinguished in the training data set are input into a source discriminator of the model to determine whether the sentences to be distinguished are distinguishing results of original sentences, source distinguishing loss of the source discriminator and source generating loss of the generator are respectively determined according to the distinguishing results, the model is trained according to the source generating loss and the source distinguishing loss, and the generator of the model is used for fusing a plurality of independent expression sentences replied to a user. The method carries out countermeasure training on the generator and the source discriminator in the model so as to enable the generator to generate a fusion sentence close to an original sentence, thereby obtaining a sentence with normal word order and natural and not hard content connection, and the source discriminator can more accurately judge whether the input sentence is the original sentence.

Description

Model training method and device, storage medium and electronic equipment

Technical Field

The present disclosure relates to the field of artificial intelligence, and in particular, to a method and apparatus for model training, a storage medium, and an electronic device.

Background

With the development of artificial intelligence, intelligent robots are applied to various fields, such as chat robots. In order to enable users to have good chat experience, the chat robot can introduce new topics while replying to user problems. Since the content returned to the user and the new topic may have the same human-called pronouns or related content, if only the content returned to the user and the new topic are pieced together, the chat content displayed to the user is hard. For example, the user's questions are: "what are Zhang Sanzhu patents? ", the chat robot can reply: the patent of Zhang three has … … ", in order to introduce new topics, chat robots can also add replies: the "Zhang san engaged in long-distance running activity" is obvious that the chat robot replies to the user with the following contents: the patent of Zhang San has … …, zhang San participates in long running activities, but the reply content is very hard, and the chat experience of the user is affected.

Based on this, the present specification provides a method of model training.

Disclosure of Invention

The present disclosure provides a method, apparatus, storage medium and electronic device for model training, so as to partially solve the foregoing problems in the prior art.

The technical scheme adopted in the specification is as follows:

the present specification provides a method of model training, comprising:

splitting an original sentence into a plurality of independent expression sentences, and adding the original sentence into a training data set;

inputting each independent expression sentence into a generator of the model, so as to fuse each independent expression sentence through the generator to obtain a fused sentence; adding the fusion sentence into the training data set;

inputting sentences to be discriminated in the training data set into a source discriminator of the model to obtain a discrimination result which is output by the source discriminator and used for judging whether the sentences to be discriminated are original sentences;

determining the source discrimination loss of the source discriminator according to the discrimination result of the source discriminator and the source tag of the sentence to be discriminated; determining the source generation loss of the generator according to the judging result of the source judging device, the source label of the sentence to be judged and the original sentence corresponding to the sentence to be judged;

and performing countermeasure training on the model according to the source generation loss and the source discrimination loss, wherein a generator of the model is used for fusing a plurality of independent expression sentences of the replying user.

Optionally, determining the source generation loss of the generator according to the discrimination result of the source discriminator, the source tag of the sentence to be discriminated and the original sentence corresponding to the sentence to be discriminated specifically includes:

if the judging result of the source judging device is inconsistent with the source label of the sentence to be judged, determining the word sequence difference between the sentence to be judged and the original sentence corresponding to the sentence to be judged;

and determining the source generation loss of the generator according to the word order difference.

Optionally, the method further comprises:

inputting the sentence to be discriminated and the independent expression sentence corresponding to the sentence to be discriminated into a semantic discriminator of the model to obtain a discrimination result which is output by the semantic discriminator and judges whether the semantic of the sentence to be discriminated is consistent with the semantic of the independent expression sentence corresponding to the sentence to be discriminated;

and performing countermeasure training on the model according to the judging result output by the semantic judging device.

Optionally, according to the discrimination result output by the semantic discriminator, performing countermeasure training on the model, which specifically includes:

determining semantic discrimination loss of the semantic discriminator according to the discrimination result output by the semantic discriminator and the semantic tag of the sentence to be discriminated; determining semantic generation loss of the generator according to the judging result of the semantic judging device, the semantic label of the sentence to be judged and the independent expression sentence corresponding to the sentence to be judged;

And performing countermeasure training on the model according to the semantic generation loss and the semantic discrimination loss.

Optionally, determining the semantic generation loss of the generator according to the discrimination result of the semantic discriminator, the semantic tag of the sentence to be discriminated and the original sentence corresponding to the sentence to be discriminated specifically includes:

if the judging result of the semantic judging device is inconsistent with the semantic label of the sentence to be judged, determining the semantic difference between the sentence to be judged and the independent expression sentence corresponding to the sentence to be judged;

and determining semantic generation loss of the generator according to the semantic difference.

Optionally, the method further comprises:

and inputting a plurality of independent expression sentences for replying the user into the generator of the model to obtain a fusion sentence output by the generator and displaying the fusion sentence.

Optionally, the generator comprises a bi-directional autoregressive transformer; the source identifier includes a BERT model and a linear classifier.

The present specification provides an apparatus for model training, comprising:

the independent expression sentence acquisition module is used for splitting an original sentence into a plurality of independent expression sentences and adding the original sentence into a training data set;

The fusion sentence acquisition module is used for inputting each independent expression sentence into a generator of the model so as to fuse each independent expression sentence through the generator to obtain a fusion sentence; adding the fusion sentence into the training data set;

the word order judging result module is used for inputting sentences to be judged in the training data set into a source discriminator of the model so as to obtain a judging result which is output by the source discriminator and is used for judging whether the sentences to be judged are original sentences or not;

the loss determination module is used for determining the source discrimination loss of the source discriminator according to the discrimination result of the source discriminator and the source tag of the sentence to be discriminated; determining the source generation loss of the generator according to the judging result of the source judging device, the source label of the sentence to be judged and the original sentence corresponding to the sentence to be judged;

and the model training module is used for performing countermeasure training on the model according to the source generation loss and the source discrimination loss, wherein the generator of the model is used for fusing a plurality of independent expression sentences of the replying user.

Optionally, the loss determining module is specifically configured to determine, if a discrimination result of the source discriminator is inconsistent with a source tag of the sentence to be discriminated, a word order difference between the sentence to be discriminated and an original sentence corresponding to the sentence to be discriminated; and determining the source generation loss of the generator according to the word order difference.

Optionally, the apparatus further comprises:

the semantic training module is used for inputting the sentence to be discriminated and the independent expression sentence corresponding to the sentence to be discriminated into the semantic discriminator of the model so as to obtain a discrimination result which is output by the semantic discriminator and is used for judging whether the semantic meaning of the sentence to be discriminated is consistent with the semantic meaning of the independent expression sentence corresponding to the sentence to be discriminated; and performing countermeasure training on the model according to the judging result output by the semantic judging device.

Optionally, the semantic judgment module is specifically configured to determine a semantic judgment loss of the semantic identifier according to a judgment result output by the semantic identifier and the semantic tag of the sentence to be judged; determining semantic generation loss of the generator according to the judging result of the semantic judging device, the semantic label of the sentence to be judged and the independent expression sentence corresponding to the sentence to be judged; and performing countermeasure training on the model according to the semantic generation loss and the semantic discrimination loss.

Optionally, the semantic judgment module is specifically configured to determine a semantic difference between the sentence to be judged and an independent expression sentence corresponding to the sentence to be judged if the judgment result of the semantic judgment device is inconsistent with the semantic label of the sentence to be judged; and determining semantic generation loss of the generator according to the semantic difference.

Optionally, the apparatus further comprises:

and the application module is used for inputting a plurality of independent expression sentences for replying the user into the generator of the model so as to obtain a fusion sentence output by the generator and display the fusion sentence.

The present specification provides a computer readable storage medium storing a computer program which when executed by a processor implements the method of model training described above.

The present specification provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing a method of model training as described above when executing the program.

The above-mentioned at least one technical scheme that this specification adopted can reach following beneficial effect:

according to the method for training the model, which is provided by the specification, the generator and the source discriminator in the model are subjected to countermeasure training, so that the generator generates a fusion sentence close to an original sentence, sentences with normal word order and natural and not-hard content connection are obtained, and the source discriminator more accurately judges whether the input sentence is the original sentence.

Drawings

The accompanying drawings, which are included to provide a further understanding of the specification, illustrate and explain the exemplary embodiments of the present specification and their description, are not intended to limit the specification unduly. In the drawings:

FIG. 1 is a flow chart of a method of model training provided in the present specification;

FIG. 2 is a schematic diagram of an internal structure for generating an countermeasure network model provided in the present specification;

FIG. 3 is a schematic diagram of the internal structure of another generator provided in the present specification;

FIG. 4 is a schematic diagram of the internal structure of a source identifier provided in the present specification;

FIG. 5 is a schematic diagram of an internal structure for generating an countermeasure network model provided in the present specification;

FIG. 6 is a schematic diagram of an internal structure of a semantic identifier provided in the present specification;

FIG. 7 is a schematic diagram of a model training apparatus provided herein;

fig. 8 is a schematic structural diagram of an electronic device corresponding to fig. 1 provided in the present specification.

Detailed Description

For the purposes of making the objects, technical solutions and advantages of the present specification more apparent, the technical solutions of the present specification will be clearly and completely described below with reference to specific embodiments of the present specification and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present specification. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the present disclosure.

The following describes in detail the technical solutions provided by the embodiments of the present specification with reference to the accompanying drawings.

Fig. 1 is a flow chart of a method for model training provided in the present specification, including the following steps:

s100: splitting an original sentence into a plurality of independent expression sentences, and adding the original sentence into a training data set.

In general, when a chat robot replies to a user's problem and introduces a new topic to the user, only a plurality of associated contents are spliced, and finally the contents presented to the user are relatively hard, for example, the chat robot replies to the user as follows: ' Zhang Sanshen high 188 cm, zhang Sanai basketball. The present specification thus provides a method of model training. The chat robot can fuse sentences related to given pieces of content through the model, and the fused generated content is fluent. The execution subject of the present specification may be a server for training a model, or may be a chat robot or other electronic device for replying to a user. For convenience of explanation, a model training method provided in the present specification will be explained below with only a server as an execution subject.

The present specification trains a generated countermeasure network model to acquire content for replying to a user through a generator in the generated countermeasure network.

Fig. 2 is a schematic diagram of an internal structure of generating an countermeasure network model, as shown in fig. 2.

To train the generator in the generated countermeasure network model, a training sample for training the generator is first acquired, the training sample including independent expression. When the server acquires the independent expression sentences, the original sentences can be split into a plurality of independent expression sentences. That is, the independent expression sentence can be obtained by the original sentence, that is, the name of the expression in the original sentence and the name of the entity referred to by the expression are replaced by the full name of the entity referred to by the expression, and the conjunctive words used for connecting the sentences in the original sentence are replaced by the periods.

For example, the original sentence is: "Mai somewhere: "give me a pair of high-heeled shoes, me can conquer the world … …". A man may not look at the chip, but i believe that she. "by giving me a pair of high-heeled shoes, me can conquer me in the world … …" and me believes that she's ' are replaced with "mai's" to get independent expressions: "Mai somewhere: "give a certain pair of high-heeled shoes, a certain wheat can conquer the world … …". Men may not look at the chip, but I believe that there is something. ". The original sentence includes any sentence in the reference resolved dataset. Note that the original sentence may be a plurality of sentences or one sentence, which is not limited in this specification. Where an reference resolution task is a process of partitioning different references representing the same entity into an equivalent set, any sentence in the reference resolution dataset may be a pending reference resolution task. That is, the original sentence may include a full name, a name, and a short name of the entity, which are referred to as references. Along the above example, maiden is the entity and is the whole name of maiden, western, maiden is the abbreviation of maiden, western, and i'm can conquer me that i'm "in the world … … is the name of maiden.

The server may also add the original sentence to the training dataset in order to facilitate subsequent training of the discriminant generating the challenge network model and to determine labels for a number of independent expression sentences.

S102: inputting each independent expression sentence into a generator of the model, so as to fuse each independent expression sentence through the generator to obtain a fused sentence; and adding the fusion sentence into the training data set.

In one or more embodiments of the present description, the generating the countermeasure network model further includes a source arbiter. In order to determine a source discrimination loss of the source discriminator, to improve accuracy of a discrimination result of the source discriminator by the source discrimination loss, and to determine a source generation loss of the generator, to improve smoothness of contents of a fusion sentence generated by the generator by the source generation loss, the server may input each independent expression sentence into the generator of the model, to fuse each independent expression sentence by the generator, to obtain a fusion sentence, and to add the fusion sentence into the training dataset.

Specifically, fig. 3 is a schematic diagram of the internal structure of a generator provided in the present specification, as shown in fig. 3.

The generator includes a bi-directional autoregressive transformer (BART), and the server inputs each independent expression into the encoder of the generator and obtains a fusion sentence according to the decoder of the generator. When each independent expression sentence is input into the generator, the independent expression sentences are split by periods, question marks or exclamation marks, the split independent expression sentences are spliced in sequence starting with [ CLS ] symbols and separated by [ SEP ], and the [ SEP ] is spliced at the end. Along with the above examples: "Mai somewhere: "give a certain pair of high-heeled shoes, a certain wheat can conquer the world … …". A man may not look at the chip, but i believe that she. ", the input format to the generator is: "[ CLS ] wheat somehow: "give a certain pair of high-heeled shoes, a certain wheat can conquer the world … …". SEP men may not look at the chip, but i believe something. [ SEP ] ". The server may determine an input format in which each independent expression is input to a generator according to a type of the generator, which is not limited in this specification.

S104: and inputting sentences to be discriminated in the training data set into a source discriminator of the model to obtain a discrimination result which is output by the source discriminator and used for judging whether the sentences to be discriminated are original sentences.

In order to determine the source discrimination loss of the source discriminator and train the source discriminator according to the source discrimination loss, the server needs to input the sentence to be discriminated in the training data set into the source discriminator of the model to obtain the discrimination result output by the source discriminator for judging whether the sentence to be discriminated is the original sentence. And determining the source discrimination loss according to the discrimination result.

Specifically, fig. 4 is a schematic diagram of an internal structure of a source identifier provided in the present specification, as shown in fig. 4.

The source arbiter includes a bi-directional encoder representation from the transformer, namely a BERT (Bidirectional Encoder Representation from Transformers, BERT) model and a linear classifier. When the server inputs the sentence to be discriminated into the source discriminator, the sentence to be discriminated is processed by the BERT model, and then the sentence to be discriminated passes through the linear classifier, and finally the discriminating result of the source discriminator is output. The input format of the sentence to be discriminated input to the source discriminator may be the same as the input format of the sentence to be discriminated input to the generator in step S102, and the description is omitted. The output of the source identifier is a classification result, and the meaning corresponding to the discrimination result output by the source identifier can be set according to the requirement, which is not limited in the specification. For example, when the discrimination result output by the source discriminator is 1, it indicates that the sentence to be discriminated is an original sentence, and when the discrimination result output by the source discriminator is 0, it indicates that the sentence to be discriminated is not an original sentence. The method may also be configured to indicate that the sentence to be discriminated is an original sentence when the discrimination result output by the source discriminator is 0, and indicate that the sentence to be discriminated is not an original sentence when the discrimination result output by the source discriminator is 1.

S106: and determining the source discrimination loss of the source discriminator according to the discrimination result of the source discriminator and the source label of the sentence to be discriminated.

In one or more embodiments of the present disclosure, the source tag of the sentence to be discriminated refers to the sentence type of the sentence to be discriminated, for example, if the sentence to be discriminated is an original sentence, the source tag of the sentence to be discriminated is 1, if the sentence to be discriminated is a fusion sentence, the source tag of the sentence to be discriminated is 0, and the source tag can be set according to the requirement, but needs to be consistent with the meaning expressed by the discrimination result of the source discriminator. For example, if the source tag is set to be 1 when the sentence to be discriminated is the original sentence, the source discriminator judges that the discrimination result output when the sentence to be discriminated is the original sentence is also 1, and if 0 is output, the source discriminator discriminates an error. If the judging result of the source judging device is consistent with the source label of the sentence to be judged, the source judging device judges correctly, and when the judging result of the source judging device is inconsistent with the source label of the sentence to be judged, the source judging loss can be determined according to the following formula:

wherein,loss of source discrimination for source discriminator, +. >The generator generates a sample, i.e. fusion sentence,>is a true sample, i.e., an original sentence.

S108: and determining the source generation loss of the generator according to the judging result of the source judging device, the source label of the sentence to be judged and the original sentence corresponding to the sentence to be judged.

Specifically, if the distinguishing result of the source distinguishing device is inconsistent with the source label of the sentence to be distinguished, determining the word sequence difference between the sentence to be distinguished and the original sentence corresponding to the sentence to be distinguished, and determining the source generation loss of the generator according to the word sequence difference. Of course, the source generation loss of the generator may also be determined according to other differences between the sentence to be determined and the original sentence corresponding to the sentence to be determined, which is not limited in this specification.

S110: and performing countermeasure training on the model according to the source generation loss and the source discrimination loss, wherein a generator of the model is used for fusing a plurality of independent expression sentences of the replying user.

Specifically, the source discriminator is trained by taking the reduced source discrimination loss as a source discrimination training target, the source generation loss is reduced as a source generation training target, and the generator is trained. The trained source identifier not only enables the generator to generate more natural fusion sentences, but also can be used for classifying input sentences.

The server can alternatively train the generator and the source identifier, namely train the source identifier once according to the source identification loss, determine the source generation loss of the generator after the source identifier completes one training, and train the generator once according to the source generation loss. After the generator completes one training, the server completes one iteration.

Based on the method for training the model shown in fig. 1, the method carries out countermeasure training on the generator and the source discriminator in the model so as to enable the generator to generate a fusion sentence close to an original sentence, thereby obtaining a sentence with normal word order and natural and not hard content connection, and the source discriminator can more accurately judge whether the input sentence is the original sentence.

The server may also train the generator using maximum likelihood estimation or other methods to make the countermeasure training more stable, and the loss of the generator when training the generator using maximum likelihood estimation is:

wherein m is the sentence length of the fusion sentence, t is the t-th character,for the t-th predicted character, < >>For the t-th tag character, < > >Is the character that has been generated before t.

Of course, the generator may not be trained before step S102 is performed, which is not limited in this specification.

In executing step S102, since the model does not generally support direct recognition of text, when each independent expression is input into the generator, each independent expression is first converted into an integer index or other recognizable format recognizable by the generator.

Similarly, when the sentence to be discriminated is input into the source discriminator, the same processing is also required, and this will not be repeated in this specification. In addition, when the source identifier identifies the integer index list, a situation that the source identifier cannot derive may occur, and the identification result cannot be output. Therefore, the gummel-softmax technique can be used to solve the problem of non-leadership.

Specifically, the generation process of the generator can be expressed as:

wherein,for the input of layer 0, s is an integer index list of the input sentences,to obtain a word embedding matrix operation,/->For obtaining the position coding matrix operation, M is the result matrix of the BART model encoder, +.>Encoder operation for BART model, +.>For the matrix that has been generated before time t, +.>Decoder operation for BART model, +. >For the generated t moment vector, MLP is feed-forward neural network, < >>For the distribution vector of t time on word stockFor a length of vocabulary size, dimension i is +.>G is the vocabulary size, each dimension is independently sampled from the vector of the standard gummel distribution. Using the Stright-Thour technique, the gradient is calculated using the equation only when the gradient is returned.

Fig. 5 is a schematic diagram of an internal structure of another embodiment of generating an countermeasure network model provided in the present specification, as shown in fig. 5.

Although the model is trained by the method, the generator of the model can generate a fusion sentence with normal language sequence, and the content of the fusion sentence can have different expressions and arrangements, the semantics of the fusion sentence may be different from the semantics of the corresponding independent expression sentence. The method comprises the following steps: the method comprises the steps of lacking the semantics of a certain independent expression sentence, tampering the semantics of a certain independent expression sentence and increasing the information which is not expressed in all independent expression sentences. Thus, the model may also contain a semantic discriminant.

Fig. 6 is a schematic diagram of an internal structure of a semantic identifier provided in the present specification, as shown in fig. 6.

In one or more embodiments of the present description, the semantic discriminant includes a spandex bert model and a linear classifier. In order to obtain a fusion sentence with the same meaning as that of the independent expression sentence, the server inputs the sentence to be discriminated and the independent expression sentence corresponding to the sentence to be discriminated into a semantic discriminator of the model to obtain a discrimination result which is output by the semantic discriminator and is used for judging whether the meaning of the sentence to be discriminated is consistent with that of the independent expression sentence corresponding to the sentence to be discriminated or not, and then the model is subjected to countermeasure training according to the discrimination result which is output by the semantic discriminator.

Specifically, the server determines the semantic discrimination loss of the semantic discriminator according to the discrimination result output by the semantic discriminator and the semantic tag of the sentence to be discriminated. The input format of the semantic discriminator can be the same as that of the generator, such as [ CLS ] original sentence [ SEP ] independent expression "," [ CLS ] fusion sentence [ SEP ] independent expression ", or [ CLS ] interference sentence [ SEP ] independent expression. The semantic tag of the sentence to be discriminated refers to whether the semantic of the sentence to be discriminated is consistent with the semantic of the independent expression sentence corresponding to the sentence to be discriminated, if so, the semantic tag is 1, otherwise, 0, and the semantic tag can be set according to the requirement.

The server may determine the semantic discrimination loss by the following formula:

wherein,the semantic judgment loss of the semantic judgment device is that Y is an interference sentence and Z is an independent expression sentence.

The output of the semantic discriminator is a classification result, and the meaning corresponding to the discrimination result output by the discriminator can be set according to the requirement, which is not limited in the specification. It should be noted that the training dataset may also include interfering sentences, i.e. sentences obtained after replacing a pronoun in an original sentence with a word not referred to by the pronoun. For example: let "wheat somehow: "for me to double high-heeled shoes, me can conquer that" me "in world … …" is replaced with Zhang three, lifour, etc.

The server can determine the semantic generation loss of the generator according to the judgment result of the semantic judgment device, the semantic label of the sentence to be judged and the independent expression sentence corresponding to the sentence to be judged at the same time of determining the semantic judgment loss. That is, if the discrimination result of the semantic discriminator is inconsistent with the semantic label of the sentence to be discriminated, determining the semantic difference between the sentence to be discriminated and the independent expression sentence corresponding to the sentence to be discriminated, and determining the semantic generation loss of the generator according to the semantic difference. And finally, training the generated countermeasure network model according to the semantic generation loss and the semantic discrimination loss. Wherein the generating counter network model is to:

wherein,and generating the loss of the countering network model, wherein a and b are weight parameters, and can be set according to the needs.

In addition, the server may train the generator, the source identifier and the semantic identifier of the countermeasure network model at the same time, or train the generator and the source identifier first and train the semantic identifier first, which is not limited in this specification.

If the server trains the generator and the source arbiter before training the semantic arbiter, it needs to be determined that the source arbiter has been trained before training the semantic arbiter.

After the training of the generated countermeasure network model is completed, the server can input a plurality of independent expression sentences for replying the user into a generator of the model so as to obtain a fusion sentence output by the generator and display the fusion sentence. The fusion sentence is not only coherent and natural in content, but also consistent in semantics with the independent expression sentence. In addition, the source discriminator may be used to detect whether sentences are smooth, and the semantic discriminator may be used to determine whether the semantics of two sentences are similar.

The foregoing is a method of one or more implementations of the present disclosure, and based on the same concept, the present disclosure further provides a corresponding apparatus for model training, as shown in fig. 7.

Fig. 7 is a schematic diagram of a model training apparatus provided in the present specification, including:

the independent expression sentence acquisition module 700 is configured to split an original sentence into a plurality of independent expression sentences, and add the original sentence into a training data set;

the fusion sentence acquisition module 702 is configured to input each independent expression sentence into a generator of the model, so that the generator fuses each independent expression sentence to obtain a fusion sentence; adding the fusion sentence into the training data set;

the word order discrimination result module 704 is configured to input a sentence to be discriminated in the training dataset into a source discriminator of the model, so as to obtain a discrimination result output by the source discriminator, where the discrimination result is used for judging whether the sentence to be discriminated is an original sentence;

A loss determining module 706, configured to determine a source discrimination loss of the source discriminator according to a discrimination result of the source discriminator and a source tag of the sentence to be discriminated; determining the source generation loss of the generator according to the judging result of the source judging device, the source label of the sentence to be judged and the original sentence corresponding to the sentence to be judged;

the model training module 708 is configured to perform countermeasure training on the model according to the source generation loss and the source discrimination loss, where the generator of the model is configured to fuse several independent expressions of the replying user.

Optionally, the loss determination module 706 is specifically configured to determine, if the discrimination result of the source discriminator is inconsistent with the source tag of the sentence to be discriminated, a word order difference between the sentence to be discriminated and the original sentence corresponding to the sentence to be discriminated; and determining the source generation loss of the generator according to the word order difference.

Optionally, the apparatus further comprises:

the semantic training module 710 is configured to input the sentence to be discriminated and the independent expression sentence corresponding to the sentence to be discriminated into a semantic discriminator of the model, so as to obtain a discrimination result output by the semantic discriminator for judging whether the semantic meaning of the sentence to be discriminated is consistent with the semantic meaning of the independent expression sentence corresponding to the sentence to be discriminated; and performing countermeasure training on the model according to the judging result output by the semantic judging device.

Optionally, the semantic judging module 710 is specifically configured to determine a semantic judgment loss of the semantic identifier according to a judgment result output by the semantic identifier and a semantic tag of the sentence to be judged; determining semantic generation loss of the generator according to the judging result of the semantic judging device, the semantic label of the sentence to be judged and the independent expression sentence corresponding to the sentence to be judged; and performing countermeasure training on the model according to the semantic generation loss and the semantic discrimination loss.

Optionally, the semantic judging module 710 is specifically configured to determine, if the judging result of the semantic judging device is inconsistent with the semantic label of the sentence to be judged, a semantic difference between the sentence to be judged and an independent expression sentence corresponding to the sentence to be judged; and determining semantic generation loss of the generator according to the semantic difference.

Optionally, the apparatus further comprises:

and the application module 712 is used for inputting a plurality of independent expression sentences for replying to the user into the generator of the model so as to obtain a fusion sentence output by the generator and display the fusion sentence.

The present specification also provides a computer readable storage medium having stored thereon a computer program operable to perform a method of model training as provided in fig. 1 above.

The present specification also provides a schematic structural diagram of the electronic device shown in fig. 8, which corresponds to fig. 1. At the hardware level, as shown in fig. 8, the electronic device includes a processor, an internal bus, a network interface, a memory, and a nonvolatile storage, and may of course include hardware required by other services. The processor reads the corresponding computer program from the non-volatile memory into the memory and then runs to implement a model training method as described above with respect to fig. 1.

Of course, other implementations, such as logic devices or combinations of hardware and software, are not excluded from the present description, that is, the execution subject of the following processing flows is not limited to each logic unit, but may be hardware or logic devices.

In the 90 s of the 20 th century, improvements to one technology could clearly be distinguished as improvements in hardware (e.g., improvements to circuit structures such as diodes, transistors, switches, etc.) or software (improvements to the process flow). However, with the development of technology, many improvements of the current method flows can be regarded as direct improvements of hardware circuit structures. Designers almost always obtain corresponding hardware circuit structures by programming improved method flows into hardware circuits. Therefore, an improvement of a method flow cannot be said to be realized by a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD) (e.g., field programmable gate array (Field Programmable Gate Array, FPGA)) is an integrated circuit whose logic function is determined by the programming of the device by a user. A designer programs to "integrate" a digital system onto a PLD without requiring the chip manufacturer to design and fabricate application-specific integrated circuit chips. Moreover, nowadays, instead of manually manufacturing integrated circuit chips, such programming is mostly implemented by using "logic compiler" software, which is similar to the software compiler used in program development and writing, and the original code before the compiling is also written in a specific programming language, which is called hardware description language (Hardware Description Language, HDL), but not just one of the hdds, but a plurality of kinds, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), lava, lola, myHDL, PALASM, RHDL (Ruby Hardware Description Language), etc., VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently most commonly used. It will also be apparent to those skilled in the art that a hardware circuit implementing the logic method flow can be readily obtained by merely slightly programming the method flow into an integrated circuit using several of the hardware description languages described above.

The controller may be implemented in any suitable manner, for example, the controller may take the form of, for example, a microprocessor or processor and a computer readable medium storing computer readable program code (e.g., software or firmware) executable by the (micro) processor, logic gates, switches, application specific integrated circuits (Application Specific Integrated Circuit, ASIC), programmable logic controllers, and embedded microcontrollers, examples of which include, but are not limited to, the following microcontrollers: ARC 625D, atmel AT91SAM, microchip PIC18F26K20, and Silicone Labs C8051F320, the memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art will also appreciate that, in addition to implementing the controller in a pure computer readable program code, it is well possible to implement the same functionality by logically programming the method steps such that the controller is in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, etc. Such a controller may thus be regarded as a kind of hardware component, and means for performing various functions included therein may also be regarded as structures within the hardware component. Or even means for achieving the various functions may be regarded as either software modules implementing the methods or structures within hardware components.

The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.

For convenience of description, the above devices are described as being functionally divided into various units, respectively. Of course, the functions of each element may be implemented in one or more software and/or hardware elements when implemented in the present specification.

It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.

The present description is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the specification. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

The memory may include volatile memory in a computer-readable medium, random Access Memory (RAM) and/or nonvolatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.

Computer readable media, including both non-transitory and non-transitory, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static Random Access Memory (SRAM), dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), read Only Memory (ROM), electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device. Computer-readable media, as defined herein, does not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.

It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.

It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.

The description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for system embodiments, since they are substantially similar to method embodiments, the description is relatively simple, as relevant to see a section of the description of method embodiments.

The foregoing is merely exemplary of the present disclosure and is not intended to limit the disclosure. Various modifications and alterations to this specification will become apparent to those skilled in the art. Any modifications, equivalent substitutions, improvements, or the like, which are within the spirit and principles of the present description, are intended to be included within the scope of the claims of the present description.

Claims

1. A method of model training, the method comprising:

performing countermeasure training on the model according to the source generation loss and the source discrimination loss, wherein a generator of the model is used for fusing a plurality of independent expression sentences of a replying user;

determining the source generation loss of the generator according to the word order difference;

2. The method of claim 1, wherein the model is countertrained based on the discrimination results output by the semantic discriminator, specifically comprising:

3. The method of claim 2, wherein determining the semantic generation loss of the generator according to the discrimination result of the semantic discriminator, the semantic tag of the sentence to be discriminated, and the independent expression sentence corresponding to the sentence to be discriminated specifically comprises:

4. The method of claim 1, wherein the method further comprises:

5. The method of claim 1, wherein the generator comprises a bi-directional autoregressive transformer; the source identifier includes a BERT model and a linear classifier.

6. An apparatus for model training, the apparatus comprising:

the loss determination module is used for determining the source discrimination loss of the source discriminator according to the discrimination result of the source discriminator and the source tag of the sentence to be discriminated; determining the source generation loss of the generator according to the discrimination result of the source discriminator, the source label of the sentence to be discriminated and the original sentence corresponding to the sentence to be discriminated, and determining the word order difference between the sentence to be discriminated and the original sentence corresponding to the sentence to be discriminated if the discrimination result of the source discriminator is inconsistent with the source label of the sentence to be discriminated; determining the source generation loss of the generator according to the word order difference;

The model training module is used for performing countermeasure training on the model according to the source generation loss and the source discrimination loss, wherein the generator of the model is used for fusing a plurality of independent expression sentences of a replying user;

7. A computer readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method of any of the preceding claims 1-5.

8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any of the preceding claims 1-5 when executing the program.