CN118113862A - Sample processing method, sample processing device, model updating method, medium and electronic equipment - Google Patents
- Publication number: CN118113862A (application CN202211485919.5A)
- Authority: CN (China)
- Legal status: Pending
Classifications
- G (Physics); G06 (Computing; calculating or counting); G06F (Electric digital data processing)
- G06F16/00 Information retrieval; G06F16/30 Information retrieval of unstructured textual data; G06F16/35 Clustering; classification
- G06F40/00 Handling natural language data; G06F40/30 Semantic analysis
Abstract
The application provides a sample processing method, a sample processing device, a model updating method, a model updating device, a medium and electronic equipment, and relates to the field of computer technology. The method comprises the following steps: acquiring a sample text, a class-one label corresponding to the sample text, and a class-two label corresponding to the sample text; generating a first semantic feature from the sample text and the class-one label, and generating a second semantic feature from the sample text and the class-two label; and generating, from the first semantic feature and the second semantic feature, a text loss function for updating model parameters. In this way, the class-one label and the class-two label corresponding to the sample text can be obtained, and two semantic features can be generated from the sample text combined with each of the two labels.
Description
Technical Field
The present application relates to the field of computer technology, and in particular, to a sample processing method, a sample processing apparatus, a model updating method, a model updating apparatus, a computer-readable storage medium, and an electronic device.
Background
In natural language processing, a model can perform various tasks (e.g., generation tasks, recognition tasks, classification tasks) based on input language text. Regardless of the task performed, the model generally needs to encode the language text, and the common approach is to encode each character in the text separately. However, a loss function computed from encodings obtained in this way tends to have low accuracy.
It should be noted that the information disclosed in the background section above is only for enhancing understanding of the background of the application, and may therefore include information that does not constitute prior art already known to a person of ordinary skill in the art.
Disclosure of Invention
The application aims to provide a sample processing method, a sample processing device, a model updating method, a model updating device, a computer-readable storage medium and electronic equipment, whereby a class-one label and a class-two label corresponding to a sample text can be obtained, and two semantic features can be generated from the sample text combined with each of the two labels.
Other features and advantages of the application will be apparent from the following detailed description, or may be learned by the practice of the application.
According to an aspect of the present application, there is provided a sample processing method, the method comprising:
acquiring a sample text, a class-one label corresponding to the sample text, and a class-two label corresponding to the sample text;
generating a first semantic feature from the sample text and the class-one label, and generating a second semantic feature from the sample text and the class-two label;
generating, from the first semantic feature and the second semantic feature, a text loss function for updating model parameters.
In an exemplary embodiment of the present application, the emotion polarity corresponding to the class-one label is opposite to the emotion polarity corresponding to the class-two label.
In an exemplary embodiment of the application, generating a text loss function for updating model parameters from the first semantic feature and the second semantic feature comprises:
triggering a discriminator to generate a discrimination loss function from the first semantic feature and the second semantic feature;
generating, from the discrimination loss function and the first semantic feature, a text loss function for updating model parameters of a generator.
In an exemplary embodiment of the present application, after the discriminator is triggered to generate the discrimination loss function from the first semantic feature and the second semantic feature, the method further comprises:
updating model parameters of the discriminator according to the discrimination loss function.
In an exemplary embodiment of the present application, after generating the text loss function for updating the model parameters of the generator from the discrimination loss function and the first semantic feature, the method further comprises:
updating model parameters of the generator according to the text loss function.
In an exemplary embodiment of the present application, generating the first semantic feature from the sample text and the class-one label, and generating the second semantic feature from the sample text and the class-two label, includes:
triggering an encoder to calculate a text feature corresponding to the sample text;
triggering a decoder to calculate a first semantic feature corresponding to the text feature and the class-one label, and a second semantic feature corresponding to the text feature and the class-two label.
In an exemplary embodiment of the application, before generating the text loss function for updating the model parameters from the first semantic feature and the second semantic feature, the method further comprises:
adjusting the sequence dimensions of the first semantic feature and the second semantic feature to a target sequence dimension, respectively, to obtain a first semantic feature and a second semantic feature in the target sequence dimension.
According to an aspect of the present application, there is provided a sample processing device comprising:
a data acquisition unit, configured to acquire a sample text, a class-one label corresponding to the sample text, and a class-two label corresponding to the sample text;
a feature generation unit, configured to generate a first semantic feature from the sample text and the class-one label, and generate a second semantic feature from the sample text and the class-two label;
a loss function generation unit, configured to generate, from the first semantic feature and the second semantic feature, a text loss function for updating model parameters.
In an exemplary embodiment of the present application, the emotion polarity corresponding to the class-one label is opposite to the emotion polarity corresponding to the class-two label.
In an exemplary embodiment of the present application, the loss function generation unit generating, from the first semantic feature and the second semantic feature, a text loss function for updating model parameters includes:
triggering a discriminator to generate a discrimination loss function from the first semantic feature and the second semantic feature;
generating, from the discrimination loss function and the first semantic feature, a text loss function for updating model parameters of a generator.
In an exemplary embodiment of the present application, the apparatus further includes:
a parameter updating unit, configured to update model parameters of the discriminator according to the discrimination loss function after the loss function generation unit triggers the discriminator to generate the discrimination loss function from the first semantic feature and the second semantic feature.
In an exemplary embodiment of the application, the parameter updating unit is further configured to update the model parameters of the generator according to the text loss function after the loss function generation unit generates the text loss function for updating the model parameters of the generator from the discrimination loss function and the first semantic feature.
In an exemplary embodiment of the present application, the feature generation unit generating the first semantic feature from the sample text and the class-one label, and generating the second semantic feature from the sample text and the class-two label, includes:
triggering an encoder to calculate a text feature corresponding to the sample text;
triggering a decoder to calculate a first semantic feature corresponding to the text feature and the class-one label, and a second semantic feature corresponding to the text feature and the class-two label.
In an exemplary embodiment of the present application, the apparatus further includes:
a feature dimension adjustment unit, configured to adjust the sequence dimensions of the first semantic feature and the second semantic feature to a target sequence dimension, respectively, before the loss function generation unit generates the text loss function for updating the model parameters from the first semantic feature and the second semantic feature, so as to obtain a first semantic feature and a second semantic feature in the target sequence dimension.
According to an aspect of the present application, there is provided a model updating method including:
acquiring a sample text, a class-one label corresponding to the sample text, and a class-two label corresponding to the sample text;
generating a first semantic feature from the sample text and the class-one label, and generating a second semantic feature from the sample text and the class-two label;
generating a text loss function from the first semantic feature and the second semantic feature;
updating model parameters of a generator according to the text loss function, wherein the generator is configured to generate the class-two label.
According to an aspect of the present application, there is provided a model updating apparatus including:
a data acquisition unit, configured to acquire a sample text, a class-one label corresponding to the sample text, and a class-two label corresponding to the sample text;
a feature generation unit, configured to generate a first semantic feature from the sample text and the class-one label, and generate a second semantic feature from the sample text and the class-two label;
a loss function generation unit, configured to generate a text loss function from the first semantic feature and the second semantic feature;
a model parameter updating unit, configured to update model parameters of the generator according to the text loss function, wherein the generator is configured to generate the class-two label.
According to an aspect of the present application, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method of any of the above.
According to an aspect of the present application, there is provided an electronic apparatus including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the method of any of the above via execution of executable instructions.
Exemplary embodiments of the present application may have some or all of the following advantages:
In the sample processing method provided by an example embodiment of the present application, a sample text, a class-one label corresponding to the sample text, and a class-two label corresponding to the sample text may be obtained; a first semantic feature is generated from the sample text and the class-one label, and a second semantic feature is generated from the sample text and the class-two label; and a text loss function for updating model parameters is generated from the first semantic feature and the second semantic feature. In this way, the class-one label and the class-two label corresponding to the sample text can be obtained, and two semantic features can be generated from the sample text combined with each of the two labels.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application. It is evident that the drawings in the following description are only some embodiments of the present application and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art.
FIG. 1 is a schematic diagram of an exemplary system architecture to which a sample processing method and a sample processing apparatus according to embodiments of the present application may be applied;
FIG. 2 schematically illustrates a flow chart of a sample processing method according to one embodiment of the application;
FIG. 3 schematically shows a flow chart of a sample processing method according to another embodiment of the application;
FIG. 4 schematically illustrates an architecture diagram of a sample processing model according to one embodiment of the present application;
FIG. 5 schematically shows a block diagram of a sample processing device in one embodiment according to the application;
FIG. 6 schematically illustrates a flow chart of a model update method according to one embodiment of the application;
FIG. 7 schematically shows a block diagram of a model updating apparatus in one embodiment according to the present application;
fig. 8 schematically shows a structural diagram of a computer system suitable for implementing an electronic device according to an embodiment of the application.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments may be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the application. One skilled in the relevant art will recognize, however, that the application may be practiced without one or more of the specific details, or with other methods, components, devices, steps, etc. In other instances, well-known aspects have not been shown or described in detail to avoid obscuring aspects of the application.
Furthermore, the drawings are merely schematic illustrations of the present application and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in software or in one or more hardware modules or integrated circuits or in different networks and/or processor devices and/or microcontroller devices.
Referring to fig. 1, fig. 1 is a schematic diagram of a system architecture of an exemplary application environment to which a sample processing method and a sample processing apparatus according to an embodiment of the present application may be applied. As shown in fig. 1, the system architecture 100 may include one or more of the terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105.
The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables. The terminal devices 101, 102, 103 may be devices that provide voice and/or data connectivity to a user, handheld devices with wireless connectivity, or other processing devices connected to a wireless modem. A wireless terminal may communicate with one or more core networks via a RAN (radio access network). The wireless terminal may be user equipment (UE), a handheld terminal, a notebook computer, a subscriber unit, a cellular phone, a smart phone, a wireless data card, a personal digital assistant (PDA), a tablet computer, a wireless modem, a handheld device, a laptop computer, a cordless phone, a wireless local loop (WLL) station, a machine type communication (MTC) terminal, or another device that can access a network. The terminal communicates with the access network device using an air interface technology (e.g., a 3GPP access technology or a non-3GPP access technology). It should be understood that the numbers of terminal devices, networks and servers in fig. 1 are merely illustrative; there may be any number of terminal devices, networks and servers as required by the implementation. For example, the server 105 may be a server cluster formed by a plurality of servers.
The sample processing method provided by the embodiment of the present application may be executed by the server 105, and accordingly, the sample processing device is generally disposed in the server 105. However, it will be readily understood by those skilled in the art that the sample processing method provided in the embodiment of the present application may be performed by the terminal device 101, 102 or 103, and accordingly, the sample processing apparatus may be provided in the terminal device 101, 102 or 103, which is not particularly limited in the present exemplary embodiment. For example, in one exemplary embodiment, the server 105 may obtain the sample text, a class label corresponding to the sample text, and a class label corresponding to the sample text; generating first semantic features according to the sample text and the class-one tags, and generating second semantic features according to the sample text and the class-two tags; a text loss function for updating the model parameters is generated from the first semantic features and the second semantic features.
Referring to fig. 2, fig. 2 schematically shows a flow chart of a sample processing method according to an embodiment of the application. As shown in fig. 2, the sample processing method may include: step S210 to step S230.
Step S210: acquiring a sample text, a class-one label corresponding to the sample text, and a class-two label corresponding to the sample text.
Step S220: generating a first semantic feature from the sample text and the class-one label, and generating a second semantic feature from the sample text and the class-two label.
Step S230: generating, from the first semantic feature and the second semantic feature, a text loss function for updating model parameters.
By implementing the method shown in fig. 2, the class-one label and the class-two label corresponding to the sample text can be obtained, and two semantic features can be generated from the sample text combined with each of the two labels.
Next, the above steps of the present exemplary embodiment will be described in more detail.
In step S210, a sample text, a class label corresponding to the sample text, and a class two label corresponding to the sample text are obtained.
Specifically, the sample text may include characters in any language; for example, the sample text may be "The price elsewhere is three times the price here." The class-one label corresponding to the sample text and the class-two label corresponding to the sample text may be the same label or different labels.
Further, the class-one label corresponding to the sample text may be understood as a real label, that is, a label that exists objectively. Taking the sample text "The price elsewhere is three times the price here" as an example, its class-one label may be [price, positive], where "price" is an entity extracted from the sample text and "positive" is the emotion polarity of the sample text; the emotion polarity may be preset by a user. The class-two label corresponding to the sample text may be a label generated by the generator, and it may or may not be consistent with the class-one label. If the two are inconsistent, the generator's output is inaccurate and further model training is needed; if they are consistent, the generator's output is accurate and no further training is needed. For the same sample text "The price elsewhere is three times the price here", the corresponding class-two label may be [price, negative].
In addition, optionally, considering that in the later stage of training the class-two labels generated by the generator are highly likely to be consistent with the class-one labels, and in order to avoid handing the class-one label directly to the discriminator for discrimination, the number of class-two labels corresponding to a sample text may be one or more; that is, multiple class-two labels may be generated for each sample text and provided to the discriminator for discrimination, which improves the accuracy of the generated loss function and thus the precision of model training.
Optionally, obtaining the sample text includes: extracting the sample text from a sample library, or receiving text entered by a user as the sample text. Optionally, obtaining the class-one label corresponding to the sample text includes: obtaining a preset class-one label corresponding to the sample text. And, optionally, obtaining the class-two label corresponding to the sample text includes: inputting the sample text (e.g., x_1, x_2, ..., x_n) into the generator, so as to trigger the encoder in the generator to character-encode the sample text and trigger the decoder in the generator to generate, from the character-encoding result, the class-two label corresponding to the sample text according to the expression max_θ P(y_t | y_1, y_2, ..., y_{t-1}; x_1, x_2, ..., x_n; θ); here x_1, x_2, ..., x_n is the sample text, n is the length of the sample text, and y_1, y_2, ..., y_{t-1}, y_t are the 1st, 2nd, ..., (t-1)-th and t-th tokens of the label text.
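To make this generation step concrete, the following is a minimal sketch assuming the generator is implemented with the Hugging Face transformers T5 model (the patent names T5 but not a specific implementation); the checkpoint name, prompt format, and decoded label are illustrative assumptions, not part of the patent.

```python
# Minimal sketch of class-two label generation with a T5-style generator.
# Assumptions: Hugging Face transformers implementation, "t5-small" checkpoint,
# and a plain-text (entity, polarity) label format; none of these is specified
# by the patent.
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-small")
generator = T5ForConditionalGeneration.from_pretrained("t5-small")

sample_text = "The price elsewhere is three times the price here."
input_ids = tokenizer(sample_text, return_tensors="pt").input_ids

# Autoregressive decoding: each step models P(y_t | y_1..y_{t-1}; x_1..x_n; theta).
label_ids = generator.generate(input_ids, max_new_tokens=8)
class_two_label = tokenizer.decode(label_ids[0], skip_special_tokens=True)
print(class_two_label)  # a generated label such as "price, negative" (hypothetical)
```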
Furthermore, optionally, the method may further include: if the class-one label is inconsistent with the class-two label, executing step S220; this continues until the class-two labels generated by the model are consistent with the class-one labels, at which point the process stops.
As an alternative embodiment, the emotion polarity corresponding to the class-one label and the emotion polarity corresponding to the class-two label are opposite emotion polarities. In this way, different semantic features can be generated from labels with different emotion polarities, and a more accurate loss function can be calculated, improving the training accuracy of the model.
Specifically, opposite emotion polarities may be understood as inconsistent emotion polarities; for example, the emotion polarity corresponding to the class-one label may be "positive" or "moderately positive", while the emotion polarity corresponding to the class-two label may be "negative", "neutral", and so on.
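As a toy illustration of this opposite-polarity relationship (in the patent the class-two label is produced by the generator; the flipping function below is only an assumption used to show what "opposite" means for the polarity field):

```python
# Toy sketch: deriving an opposite emotion polarity for a label's polarity field.
# The polarity vocabulary and the pairing below are illustrative assumptions.
OPPOSITE_POLARITY = {"positive": "negative", "moderately positive": "neutral"}

def opposite_label(class_one_label: tuple[str, str]) -> tuple[str, str]:
    entity, polarity = class_one_label
    return entity, OPPOSITE_POLARITY.get(polarity, "negative")

print(opposite_label(("price", "positive")))  # -> ('price', 'negative')
```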
In step S220, a first semantic feature is generated according to the sample text and the class one tag, and a second semantic feature is generated according to the sample text and the class two tag.
In particular, the first semantic feature and the second semantic feature may be used to characterize different semantics, both of which may be represented as vectors/matrices.
As an alternative embodiment, generating the first semantic feature from the sample text and the class-one label, and generating the second semantic feature from the sample text and the class-two label, includes: triggering an encoder to calculate a text feature corresponding to the sample text; triggering a decoder to calculate a first semantic feature corresponding to the text feature and the class-one label, and a second semantic feature corresponding to the text feature and the class-two label. In this way, both character encoding and semantic encoding of the sample text can be realized, the precision of the extracted features is improved, and model training can be carried out more efficiently.
Specifically, the encoder and decoder may be a network model (e.g., T5) constructed based on the Transformer architecture. Triggering the encoder to calculate the text feature corresponding to the sample text includes: triggering the encoder to calculate the encoding corresponding to each character in the sample text, thereby obtaining the text features corresponding to the characters.
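A minimal sketch of this two-stage feature computation, assuming the Hugging Face T5 implementation and treating the decoder's hidden states over a (text, label) pair as the semantic feature (the patent does not pin down these details):

```python
# Sketch: the encoder computes text features for the sample text; the decoder,
# cross-attending to them while encoding a label, yields a semantic feature
# for that (text, label) combination. Hugging Face T5 internals are assumed.
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

text_ids = tokenizer("The price elsewhere is three times the price here.",
                     return_tensors="pt").input_ids
text_features = model.encoder(input_ids=text_ids).last_hidden_state  # (1, n, d)

def semantic_feature(label: str) -> torch.Tensor:
    label_ids = tokenizer(label, return_tensors="pt").input_ids
    # The decoder cross-attends to the text features while encoding the label.
    return model.decoder(input_ids=label_ids,
                         encoder_hidden_states=text_features).last_hidden_state

first_semantic = semantic_feature("price, positive")   # from the class-one label
second_semantic = semantic_feature("price, negative")  # from the class-two label
```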
In step S230, a text loss function for updating the model parameters is generated from the first semantic features and the second semantic features.
Specifically, a delay() function may be set in the program so that the generator is not updated before the discriminator is updated, and the generator's loss function is then calculated from the discriminator's loss function; this improves the efficiency with which the generator's loss function updates the generator.
As an alternative embodiment, before generating the text loss function for updating the model parameters from the first semantic feature and the second semantic feature, the method further comprises: adjusting the sequence dimensions of the first semantic feature and the second semantic feature to a target sequence dimension, respectively, to obtain a first semantic feature and a second semantic feature in the target sequence dimension. In this way, sentence encodings of different lengths can be adjusted to the same dimension.
Specifically, the target sequence dimension may be used to characterize a text sequence length, and may be expressed as a positive integer (e.g., 10). Adjusting the sequence dimensions of the first and second semantic features to the target sequence dimension includes: adjusting the sequence dimension (e.g., 10) of the first semantic feature to the target sequence dimension (e.g., 1) while preserving its vector length (e.g., 512), and adjusting the sequence dimension (e.g., 10) of the second semantic feature to the target sequence dimension (e.g., 1) while preserving its vector length (e.g., 512).
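A minimal sketch of this adjustment, assuming mean pooling over the sequence dimension (the patent requires only that features of different lengths end up in a common target dimension; the pooling choice is an assumption):

```python
# Sketch: collapse (batch, seq_len, hidden) features of any seq_len to a
# common target sequence dimension of 1. Mean pooling is an assumption.
import torch

def to_target_dim(semantic: torch.Tensor) -> torch.Tensor:
    return semantic.mean(dim=1, keepdim=True)  # (batch, 1, hidden)

first = torch.randn(1, 10, 512)  # sequence dimension 10, vector length 512
second = torch.randn(1, 7, 512)  # a different sequence length
assert to_target_dim(first).shape == to_target_dim(second).shape == (1, 1, 512)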
As an alternative embodiment, generating the text loss function for updating model parameters from the first semantic feature and the second semantic feature includes: triggering the discriminator to generate a discrimination loss function from the first semantic feature and the second semantic feature; and generating, from the discrimination loss function and the first semantic feature, a text loss function for updating model parameters of the generator. In this way, the text loss function is calculated on the basis of the discrimination loss function, which improves the precision of the text loss function and thus helps train a more accurate model.
Specifically, before the discriminator is triggered to generate the discrimination loss function from the first semantic feature and the second semantic feature, the method may further include: triggering the discriminator to generate a discrimination result (e.g., 1) corresponding to the first semantic feature and a discrimination result (e.g., 0) corresponding to the second semantic feature; a discrimination result of 1 indicates that the discriminator considers the label corresponding to the semantic feature to be real, and a discrimination result of 0 indicates that the discriminator considers the label to be model-generated rather than objectively existing.
Triggering the discriminator to generate the discrimination loss function from the first semantic feature and the second semantic feature includes: triggering the discriminator, based on the expression log(D(G(y_1, y_2, ..., y_T; x_1, x_2, ..., x_n))), to generate a discrimination loss function L_D1 from the first semantic feature G_1(y_1, y_2, ..., y_T; x_1, x_2, ..., x_n) and a discrimination loss function L_D2 from the second semantic feature G_2(y_1, y_2, ..., y_T; x_1, x_2, ..., x_n). The discriminator may determine, from L_D1 and L_D2, which semantic feature corresponds to the class-one label.
And generating the text loss function for updating the model parameters of the generator from the discrimination loss function and the first semantic feature includes: generating the text loss function from the discrimination loss function log(D(G(y_1, y_2, ..., y_T; x_1, x_2, ..., x_n))) and the first semantic feature G_1(y_1, y_2, ..., y_T; x_1, x_2, ..., x_n).
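The patent gives only the log(D(G(...))) expressions, so the following sketch fills in a standard binary cross-entropy form as an assumption: the discrimination loss pushes D to score class-one (real) features as 1 and class-two (generated) features as 0, and the text loss pushes the generator the other way.

```python
# Sketch of the two loss functions. The binary cross-entropy formulation and
# the exact combination of terms are assumptions; the patent only specifies
# that both losses are built from log(D(G(...))) terms.
import torch
import torch.nn.functional as F

def discrimination_loss(d_first: torch.Tensor, d_second: torch.Tensor) -> torch.Tensor:
    """L_D1 + L_D2: D(first) -> 1 (real class-one label), D(second) -> 0 (generated)."""
    l_d1 = F.binary_cross_entropy(d_first, torch.ones_like(d_first))
    l_d2 = F.binary_cross_entropy(d_second, torch.zeros_like(d_second))
    return l_d1 + l_d2

def text_loss(d_second: torch.Tensor) -> torch.Tensor:
    """Generator objective: make generated (class-two) labels look real to D."""
    return F.binary_cross_entropy(d_second, torch.ones_like(d_second))
```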
As an alternative embodiment, after the discriminator is triggered to generate the discrimination loss function from the first semantic feature and the second semantic feature, the method further comprises: updating model parameters of the discriminator according to the discrimination loss function. In this way, the discriminator is updated in a timely manner, which improves its discrimination precision.
Specifically, the model parameters of the discriminator may include parameters such as weights and bias terms, which are not limited in the embodiments of the present application.
As an alternative embodiment, after generating the text loss function for updating the model parameters of the generator from the discrimination loss function and the first semantic feature, the method further comprises: updating model parameters of the generator according to the text loss function. In this way, the generator is updated in a timely manner, which improves its generation precision.
Specifically, the model parameters of the generator may include encoder parameters and decoder parameters; the encoder parameters and the decoder parameters may include parameters such as weights and bias terms, which are not limited in the embodiments of the present application.
Furthermore, after updating the model parameters of the generator according to the text loss function, the method may further include: when a text to be discriminated is received, character-encoding the text with the encoder in the generator, triggering the decoder in the generator to generate a label corresponding to the text from the character-encoding result, generating semantic features from the character encoding and the label, and generating, from the semantic features, a label (e.g., [cheap, positive]) corresponding to the text to be discriminated (e.g., "This thing is cheap").
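A usage sketch for this inference step, again assuming the Hugging Face T5 generator from the earlier sketches (checkpoint, input text, and output are hypothetical):

```python
# Inference sketch: after training, the generator labels a new text directly.
# Checkpoint, input text, and expected output are illustrative assumptions.
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-small")
generator = T5ForConditionalGeneration.from_pretrained("t5-small")  # trained weights in practice

ids = tokenizer("This thing is cheap.", return_tensors="pt").input_ids
out = generator.generate(ids, max_new_tokens=8)
print(tokenizer.decode(out[0], skip_special_tokens=True))  # e.g. "cheap, positive"
```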
Referring to fig. 3, fig. 3 schematically shows a flow chart of a sample processing method according to another embodiment of the application. As shown in fig. 3, the sample processing method may include: step S310 to step S380.
Step S310: acquiring a sample text, a class label corresponding to the sample text and a class II label corresponding to the sample text; the emotion polarities corresponding to the first type of labels and the emotion polarities corresponding to the second type of labels are opposite.
Step S320: the trigger encoder calculates text features corresponding to the sample text.
Step S330: the trigger decoder calculates a first semantic feature corresponding to the text feature and the class one tag and a second semantic feature corresponding to the text feature and the class two tag.
Step S340: and respectively adjusting the sequence dimensions corresponding to the first semantic features and the second semantic features into target sequence dimensions to obtain the first semantic features corresponding to the target sequence dimensions and the second semantic features corresponding to the target sequence dimensions.
Step S350: the trigger discriminator generates a discrimination loss function according to the first semantic feature and the second semantic feature.
Step S360: and updating model parameters of the discriminant according to the discriminant loss function.
Step S370: a text penalty function for updating model parameters of the generator is generated from the discriminant penalty function and the first semantic feature.
Step S380: model parameters of the generator are updated according to the text loss function.
It should be noted that, the steps S310 to S380 correspond to the steps and embodiments shown in fig. 2, and for the specific implementation of the steps S310 to S380, please refer to the steps and embodiments shown in fig. 2, and the description thereof is omitted here.
It can be seen that, by implementing the method shown in fig. 3, the class-one label and the class-two label corresponding to the sample text can be obtained, and two semantic features can be generated from the sample text combined with each of the two labels.
Referring to fig. 4, fig. 4 schematically illustrates the architecture of a sample processing model according to one embodiment of the application. As shown in fig. 4, the sample processing model 400 may include a generator (G) 410 and a discriminator 420. The generator (G) 410 may include a T5 encoder 411 and a T5 decoder 412.
The sample processing model 400 may be a generative adversarial network (GAN), which may be used to perform various tasks, such as an information extraction task, a machine translation task, a summary generation task, a title generation task, and a fine-grained emotion analysis task.
In a GAN, the goal of the generator (G) 410 is opposite to that of the discriminator 420. The goal of the generator (G) 410 is to generate labels good enough that the discriminator 420 cannot distinguish whether a current label is real or generated, i.e., whether it is a class-one label or a class-two label; the goal of the discriminator 420 is precisely to distinguish whether a received label is a class-one label or a class-two label.
Specifically, the T5 encoder 411 may calculate a text feature corresponding to the sample text. The T5 decoder 412 may calculate a first semantic feature corresponding to the text feature and the class-one label, and a second semantic feature corresponding to the text feature and the class-two label. The generator (G) 410 adjusts the sequence dimensions of the first and second semantic features to a target sequence dimension, respectively, obtains the first and second semantic features in the target sequence dimension, and inputs them into the discriminator 420, so that the discriminator 420 generates a discrimination loss function from the first semantic feature and the second semantic feature and updates its model parameters according to the discrimination loss function. Further, the generator (G) 410 may generate, from the discrimination loss function and the first semantic feature, a text loss function for updating model parameters of the generator, and update the model parameters of the generator according to the text loss function.
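Putting the pieces together, the following sketch shows one adversarial training step matching fig. 4: the discriminator is updated first (steps S350-S360), then the generator (steps S370-S380). The discriminator architecture, pooling, and loss forms are assumptions carried over from the earlier sketches, not the patent's specification.

```python
# End-to-end training-step sketch for the Fig. 4 model. All module shapes and
# the pooling/loss choices are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Discriminator(nn.Module):
    """Scores a pooled semantic feature as real (class-one) vs generated."""
    def __init__(self, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(hidden, 128), nn.ReLU(),
                                 nn.Linear(128, 1), nn.Sigmoid())

    def forward(self, pooled: torch.Tensor) -> torch.Tensor:
        return self.net(pooled)

disc = Discriminator()
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)
# opt_g would wrap the T5 generator's parameters in the same way.

def train_step(first_sem: torch.Tensor, second_sem: torch.Tensor):
    # Adjust both features to the target sequence dimension (here: pool to 1).
    first = first_sem.mean(dim=1)
    second = second_sem.mean(dim=1)

    # Steps S350-S360: discrimination loss, then update the discriminator.
    d_loss = (F.binary_cross_entropy(disc(first), torch.ones(first.size(0), 1)) +
              F.binary_cross_entropy(disc(second.detach()), torch.zeros(second.size(0), 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Steps S370-S380: text loss for the generator; in a full setup its
    # gradients would flow back through second_sem into the T5 generator.
    g_loss = F.binary_cross_entropy(disc(second), torch.ones(second.size(0), 1))
    # opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

d_val, g_val = train_step(torch.randn(2, 10, 512), torch.randn(2, 7, 512))
```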
It can be seen that, by implementing the architecture shown in fig. 4, the class-one label and the class-two label corresponding to the sample text can be obtained, and two semantic features can be generated from the sample text combined with each of the two labels.
Referring to fig. 5, fig. 5 schematically shows a block diagram of a sample processing device in an embodiment according to the application. The sample processing device 500 corresponds to the method shown in fig. 2, and as shown in fig. 5, the sample processing device 500 includes:
a data acquisition unit 501, configured to acquire a sample text, a class-one label corresponding to the sample text, and a class-two label corresponding to the sample text;
a feature generation unit 502, configured to generate a first semantic feature from the sample text and the class-one label, and generate a second semantic feature from the sample text and the class-two label;
a loss function generation unit 503, configured to generate, from the first semantic feature and the second semantic feature, a text loss function for updating model parameters.
It can be seen that, by implementing the device shown in fig. 5, the class-one label and the class-two label corresponding to the sample text can be obtained, and two semantic features can be generated from the sample text combined with each of the two labels.
In an exemplary embodiment of the present application, the emotion polarity corresponding to the class-one label is opposite to the emotion polarity corresponding to the class-two label.
It can be seen that implementing this alternative embodiment may facilitate generating different semantic features based on labels of different emotion polarities, thereby facilitating calculation of a more accurate loss function to improve training accuracy of the model.
In an exemplary embodiment of the present application, the loss function generation unit 503 generating, from the first semantic feature and the second semantic feature, a text loss function for updating model parameters includes:
triggering a discriminator to generate a discrimination loss function from the first semantic feature and the second semantic feature;
generating, from the discrimination loss function and the first semantic feature, a text loss function for updating model parameters of a generator.
Therefore, by implementing the alternative embodiment, text loss function calculation based on the discrimination loss function can be realized, and the precision of the text loss function can be improved, so that the training of a model with higher precision is facilitated.
In an exemplary embodiment of the present application, the apparatus further includes:
a parameter updating unit, configured to update model parameters of the discriminator according to the discrimination loss function after the loss function generation unit 503 triggers the discriminator to generate the discrimination loss function from the first semantic feature and the second semantic feature.
Therefore, by implementing the alternative embodiment, the timely updating of the discriminator can be realized, and the discrimination precision of the discriminator can be improved.
In an exemplary embodiment of the application, the parameter updating unit is further configured to update the model parameters of the generator according to the text loss function after the loss function generation unit 503 generates the text loss function for updating the model parameters of the generator from the discrimination loss function and the first semantic feature.
It can be seen that implementing this alternative embodiment may enable timely updates to the generator, which may be beneficial to improving the accuracy of the generator's generation.
In an exemplary embodiment of the present application, the feature generation unit 502 generating a first semantic feature from the sample text and the class-one label, and generating a second semantic feature from the sample text and the class-two label, includes:
triggering an encoder to calculate a text feature corresponding to the sample text;
triggering a decoder to calculate a first semantic feature corresponding to the text feature and the class-one label, and a second semantic feature corresponding to the text feature and the class-two label.
Therefore, by implementing the alternative embodiment, the text character coding and the semantic coding of the sample text can be realized, the precision of the extracted features is improved, and the model training can be realized more efficiently.
In an exemplary embodiment of the present application, the apparatus further includes:
a feature dimension adjustment unit, configured to adjust the sequence dimensions of the first semantic feature and the second semantic feature to a target sequence dimension, respectively, before the loss function generation unit 503 generates the text loss function for updating the model parameters from the first semantic feature and the second semantic feature, so as to obtain a first semantic feature and a second semantic feature in the target sequence dimension.
It will be seen that implementing this alternative embodiment, sentence codes of different lengths can be tuned to the same dimension.
Since each functional module of the sample processing device according to the exemplary embodiment of the present application corresponds to a step of the exemplary embodiment of the sample processing method, for details not disclosed in the embodiment of the device according to the present application, please refer to the embodiment of the sample processing method according to the present application.
Referring to fig. 6, fig. 6 schematically shows a flow chart of a model update method according to an embodiment of the application. As shown in fig. 6, the model updating method includes:
Step S610: acquiring a sample text, a class-one label corresponding to the sample text, and a class-two label corresponding to the sample text.
Step S620: generating a first semantic feature from the sample text and the class-one label, and generating a second semantic feature from the sample text and the class-two label.
Step S630: generating a text loss function from the first semantic feature and the second semantic feature.
Step S640: updating model parameters of the generator according to the text loss function, wherein the generator is configured to generate the class-two label.
By implementing the method shown in fig. 6, the class-one label and the class-two label corresponding to the sample text can be obtained, and two semantic features can be generated from the sample text combined with each of the two labels, so that a more accurate text loss function can be calculated. Using this accurate text loss function to update the model parameters of the generator (i.e., to train the generator) enables the generator to produce more accurate class-two labels. Compared with the prior art, the method disclosed by the application attends to the semantic information of the whole text rather than only to individual character encodings, which improves the training precision of the model.
Referring to fig. 7, fig. 7 schematically shows a block diagram of a model updating apparatus in an embodiment according to the present application. As shown in fig. 7, the model updating apparatus 700 includes:
a data acquisition unit 701, configured to acquire a sample text, a class-one label corresponding to the sample text, and a class-two label corresponding to the sample text;
a feature generation unit 702, configured to generate a first semantic feature from the sample text and the class-one label, and generate a second semantic feature from the sample text and the class-two label;
a loss function generation unit 703, configured to generate a text loss function from the first semantic feature and the second semantic feature;
a model parameter updating unit 704, configured to update model parameters of the generator according to the text loss function, wherein the generator is configured to generate the class-two label.
It can be seen that, by implementing the apparatus shown in fig. 7, the class-one label and the class-two label corresponding to the sample text can be obtained, and two semantic features can be generated from the sample text combined with each of the two labels, so that a more accurate text loss function can be calculated; using this accurate text loss function to update the model parameters of the generator (i.e., to train the generator) enables the generator to produce more accurate class-two labels.
Since each functional module of the model updating apparatus according to the exemplary embodiment of the present application corresponds to a step of the exemplary embodiment of the model updating method, for details not disclosed in the apparatus embodiment of the present application, please refer to the embodiment of the model updating method according to the present application.
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functions of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the application. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
Referring to fig. 8, fig. 8 is a schematic diagram of a computer system suitable for implementing an electronic device according to an embodiment of the present application.
It should be noted that, the computer system 800 of the electronic device shown in fig. 8 is only an example, and should not impose any limitation on the functions and the application scope of the embodiments of the present application.
As shown in fig. 8, the computer system 800 includes a Central Processing Unit (CPU) 801 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 802 or a program loaded from a storage section 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data required for system operation are also stored. The CPU 801, ROM 802, and RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
The following components are connected to the I/O interface 805: an input portion 806 including a keyboard, a mouse, and the like; an output portion 807 including a display such as a cathode ray tube (CRT) or a liquid crystal display (LCD), a speaker, and the like; a storage portion 808 including a hard disk and the like; and a communication portion 809 including a network interface card such as a LAN card or a modem. The communication portion 809 performs communication processing via a network such as the Internet. The drive 810 is also connected to the I/O interface 805 as needed. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as needed, so that a computer program read therefrom is installed into the storage portion 808 as needed.
In particular, according to embodiments of the present application, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the methods shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via the communication portion 809, and/or installed from the removable medium 811. When executed by the central processing unit (CPU) 801, the computer program performs the various functions defined in the methods and apparatus of the present application.
As another aspect, the present application also provides a computer-readable medium that may be contained in the electronic device described in the above embodiment; or may exist alone without being incorporated into the electronic device. The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to implement the methods described in the above embodiments.
The computer readable medium shown in the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present application, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present application may be implemented by software, or may be implemented by hardware, and the described units may also be provided in a processor. Wherein the names of the units do not constitute a limitation of the units themselves in some cases.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
Claims (12)
1. A method of sample processing, comprising:
acquiring a sample text, a class-one label corresponding to the sample text, and a class-two label corresponding to the sample text;
generating a first semantic feature according to the sample text and the class-one label, and generating a second semantic feature according to the sample text and the class-two label; and
generating a text loss function for updating model parameters according to the first semantic feature and the second semantic feature.
2. The method of claim 1, wherein the emotion polarity of the class-one label and the emotion polarity of the class-two label are opposite emotion polarities.
3. The method of claim 1, wherein generating the text loss function for updating model parameters according to the first semantic feature and the second semantic feature comprises:
triggering a discriminator to generate a discrimination loss function according to the first semantic feature and the second semantic feature; and
generating a text loss function for updating model parameters of a generator according to the discrimination loss function and the first semantic feature.
4. The method according to claim 3, wherein after triggering the discriminator to generate the discrimination loss function according to the first semantic feature and the second semantic feature, the method further comprises:
updating model parameters of the discriminator according to the discrimination loss function.
5. The method according to claim 3, wherein after generating the text loss function for updating the model parameters of the generator according to the discrimination loss function and the first semantic feature, the method further comprises:
updating the model parameters of the generator according to the text loss function.
6. The method of claim 1, wherein generating the first semantic feature according to the sample text and the class-one label and generating the second semantic feature according to the sample text and the class-two label comprises:
triggering an encoder to calculate a text feature corresponding to the sample text; and
triggering a decoder to calculate the first semantic feature corresponding to the text feature and the class-one label, and the second semantic feature corresponding to the text feature and the class-two label.
7. The method according to any one of claims 1-6, wherein before generating the text loss function for updating model parameters according to the first semantic feature and the second semantic feature, the method further comprises:
adjusting the sequence dimensions of the first semantic feature and the second semantic feature, respectively, to a target sequence dimension, so as to obtain a first semantic feature corresponding to the target sequence dimension and a second semantic feature corresponding to the target sequence dimension.
8. A sample processing device, comprising:
a data acquisition unit, configured to acquire a sample text, a class-one label corresponding to the sample text, and a class-two label corresponding to the sample text;
a feature generation unit, configured to generate a first semantic feature according to the sample text and the class-one label, and to generate a second semantic feature according to the sample text and the class-two label; and
a loss function generation unit, configured to generate a text loss function for updating model parameters according to the first semantic feature and the second semantic feature.
9. A method of updating a model, comprising:
acquiring a sample text, a class-one label corresponding to the sample text, and a class-two label corresponding to the sample text;
generating a first semantic feature according to the sample text and the class-one label, and generating a second semantic feature according to the sample text and the class-two label;
generating a text loss function according to the first semantic feature and the second semantic feature; and
updating model parameters of a generator according to the text loss function, wherein the generator is configured to generate the class-two label.
10. A model updating apparatus, comprising:
a data acquisition unit, configured to acquire a sample text, a class-one label corresponding to the sample text, and a class-two label corresponding to the sample text;
a feature generation unit, configured to generate a first semantic feature according to the sample text and the class-one label, and to generate a second semantic feature according to the sample text and the class-two label;
a loss function generation unit, configured to generate a text loss function according to the first semantic feature and the second semantic feature; and
a model parameter updating unit, configured to update model parameters of a generator according to the text loss function, wherein the generator is configured to generate the class-two label.
11. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method of any one of claims 1-7 and 9.
12. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of any one of claims 1-7 and 9 via execution of the executable instructions.
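The claims above recite the method only at the level of functional steps. As a concrete illustration of claims 1, 6, and 7, the following is a minimal PyTorch sketch of the feature-generation path: an encoder computes a text feature for the sample text, a decoder conditions that feature on the class-one and class-two labels to produce the first and second semantic features, and both features are truncated or padded to a target sequence dimension. Every name, dimension, and architectural choice here (GRU encoder/decoder, additive label embedding, `align_seq_dim`) is an assumption for illustration; the claims do not prescribe a concrete architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureGenerator(nn.Module):
    """Hypothetical encoder/decoder pair illustrating claims 1, 6, and 7."""

    def __init__(self, vocab_size: int = 30000, hidden: int = 256, num_labels: int = 2):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, hidden)
        self.label_emb = nn.Embedding(num_labels, hidden)  # class-one / class-two labels
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)

    def encode(self, token_ids: torch.Tensor) -> torch.Tensor:
        # Claim 6: the encoder calculates a text feature for the sample text.
        embedded = self.token_emb(token_ids)       # (B, T, H)
        text_feature, _ = self.encoder(embedded)   # (B, T, H)
        return text_feature

    def decode(self, text_feature: torch.Tensor, label: torch.Tensor) -> torch.Tensor:
        # Claim 6: the decoder combines the text feature with a label to
        # produce a label-conditioned semantic feature (assumed additive here).
        lab = self.label_emb(label).unsqueeze(1)   # (B, 1, H), broadcast over time
        semantic_feature, _ = self.decoder(text_feature + lab)
        return semantic_feature                     # (B, T, H)

def align_seq_dim(feature: torch.Tensor, target_len: int) -> torch.Tensor:
    # Claim 7: adjust the sequence dimension to a target sequence dimension,
    # here by truncating or zero-padding along the time axis.
    t = feature.size(1)
    if t >= target_len:
        return feature[:, :target_len, :]
    return F.pad(feature, (0, 0, 0, target_len - t))
```

Under these assumptions, the first and second semantic features of claim 1 would be obtained as `gen.decode(tf, class_one)` and `gen.decode(tf, class_two)` with `tf = gen.encode(token_ids)`, each passed through `align_seq_dim` before the loss computation.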
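Claims 3-5 and claim 9 describe an adversarial update: a discriminator scores the two semantic features to produce a discrimination loss, the discriminator is updated with that loss, and the generator is updated with a text loss built from the discrimination loss and the first semantic feature. The sketch below is one plausible reading, assuming a binary real/generated discriminator and decomposing the text loss into an adversarial term plus a similarity term anchored on the first semantic feature; the claims give no concrete loss formulas, so every formula here is an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Discriminator(nn.Module):
    """Hypothetical discriminator: scores a semantic feature as real (class-one) or generated (class-two)."""

    def __init__(self, hidden: int = 256):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, semantic_feature: torch.Tensor) -> torch.Tensor:
        return self.head(semantic_feature.mean(dim=1))  # pool over the sequence dimension

bce = nn.BCEWithLogitsLoss()

def discrimination_loss(disc, first_feat, second_feat):
    # Claim 3: the discriminator generates a discrimination loss from both
    # semantic features (features detached so only the discriminator updates).
    real = disc(first_feat.detach())
    fake = disc(second_feat.detach())
    return bce(real, torch.ones_like(real)) + bce(fake, torch.zeros_like(fake))

def text_loss(disc, first_feat, second_feat):
    # Claims 3 and 9: a text loss for the generator, assumed here to combine an
    # adversarial term (make the second feature look real to the discriminator)
    # with a similarity term against the first semantic feature.
    fake = disc(second_feat)
    adv = bce(fake, torch.ones_like(fake))
    sim = F.mse_loss(second_feat, first_feat.detach())
    return adv + sim

# Claim 4:          d_opt.zero_grad(); discrimination_loss(...).backward(); d_opt.step()
# Claims 5 and 9:   g_opt.zero_grad(); text_loss(...).backward(); g_opt.step()
```

Alternating these two updates each training step would give the model updating loop of claim 9, in which the generator producing the class-two (opposite emotion polarity) branch is refined by the text loss.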
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211485919.5A CN118113862A (en) | 2022-11-24 | 2022-11-24 | Sample processing method, sample processing device, model updating method, medium and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211485919.5A CN118113862A (en) | 2022-11-24 | 2022-11-24 | Sample processing method, sample processing device, model updating method, medium and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118113862A (en) | 2024-05-31
Family
ID=91210723
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211485919.5A Pending CN118113862A (en) | 2022-11-24 | 2022-11-24 | Sample processing method, sample processing device, model updating method, medium and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118113862A (en) |
- 2022-11-24: Application CN202211485919.5A filed in CN; published as CN118113862A (status: Pending)
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |