CN112328750A - Method and system for training text discrimination model - Google Patents
Method and system for training a text discrimination model
- Publication number
- CN112328750A (application CN202011347328.2A)
- Authority
- CN
- China
- Prior art keywords
- model
- sample
- discrimination
- modifiers
- language sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/355—Class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
The invention provides a method and a system for training a text discrimination model, comprising the following steps: extracting a real language sample from a real language library and inputting it into a generative model; the generative model inserting, deleting or replacing modifiers in the extracted real language sample to obtain a first new language sample; the generative model introducing confusion words at the stem words of the extracted real language sample to obtain a second new language sample; inputting the first or second new language sample into a discrimination model, which compares it with the real language sample and judges whether it is a positive or a negative sample; comparing the judgment result of the discrimination model with the expectation of the generative model, and updating the model parameters of the discrimination model according to the comparison result; and the generative model updating its own model parameters according to the updated model parameters of the discrimination model. Compared with conventional adversarial learning, the quality of the positive samples generated by editing modifiers and of the negative samples generated by introducing confusion words is controllable.
Description
Technical Field
The invention relates to the field of data processing, and in particular to a method and a system for training a text discrimination model.
Background
Adversarial learning is currently applied mainly in image recognition: images produced by a generative model are judged by a discrimination model, so that the image generation capability of the generative model is continuously improved.
For example, patent document CN109949317A discloses a semi-supervised image instance segmentation method based on gradual adversarial learning, which retrains an instance segmentation model to obtain a segmentation model with higher accuracy. Existing adversarial learning mainly focuses on the performance of the generative model after training, and neglects the performance that the discrimination model gains during adversarial learning. Few published documents describe improving the robustness of the discrimination model through adversarial learning.
Disclosure of Invention
In view of the above defects in the prior art, the invention aims to provide a method and a system for training a text discrimination model.
The method for training a text discrimination model provided by the invention comprises the following steps:
a sample extraction step: extracting a real language sample from a real language library and inputting it into a generative model;
a sample generation step: the generative model inserts, deletes or replaces modifiers in the extracted real language sample to obtain a first new language sample; the generative model introduces confusion words at the stem words of the extracted real language sample to obtain a second new language sample;
a discrimination step: inputting the first or second new language sample into a discrimination model, which compares it with the real language sample and judges whether it is a positive or a negative sample;
a discrimination model updating step: comparing the judgment result of the discrimination model with the expectation of the generative model, and updating the model parameters of the discrimination model according to the comparison result;
a generative model updating step: the generative model updates its own model parameters according to the updated model parameters of the discrimination model.
Preferably, the sample generation step comprises:
for modifier insertion, the generative model determines positions in the extracted real language sample where a modifier can be inserted, and inserts a modifier at such a position;
for modifier deletion, the generative model locates modifiers in the extracted real language sample, and deletes a modifier at a located position;
for modifier replacement, the generative model locates modifiers in the extracted real language sample, and replaces a modifier at a located position with a new modifier;
wherein the insertion, deletion or replacement of modifiers does not change whether the extracted real language sample itself is classified as a positive or a negative sample.
Preferably, the sample generation step comprises:
for the introduction of confusion words, the generative model determines the positions and categories of the stem words in the extracted real language sample, and inserts confusion words or replaces the stem words with them, thereby changing whether the extracted real language sample is classified as a positive or a negative sample.
Preferably, the comparison in the discrimination step comprises: vectorizing the first or second new language sample and computing the KL divergence; if the resulting distribution difference is smaller than a preset value, the discrimination model judges the sample to be positive, otherwise negative.
Preferably, the discrimination model updating step comprises:
when the judgment result of the discrimination model is consistent with the expectation of the generative model, the discrimination model updates its model parameters by backpropagation so that the probability of a correct discrimination becomes higher;
when the judgment result of the discrimination model is inconsistent with the expectation of the generative model, the discrimination model updates its model parameters by backpropagation so that the probability of a wrong discrimination becomes lower;
and the generative model updating step comprises: based on the updated model parameters of the discrimination model, the generative model computes gradients along the directions in which the previous discrimination model was most easily misled and the updated discrimination model is most likely to be misled, and updates its model parameters respectively.
The system for training a text discrimination model provided by the invention comprises:
a sample extraction module: extracting a real language sample from a real language library and inputting it into a generative model;
a sample generation module: the generative model inserts, deletes or replaces modifiers in the extracted real language sample to obtain a first new language sample; the generative model introduces confusion words at the stem words of the extracted real language sample to obtain a second new language sample;
a discrimination module: inputting the first or second new language sample into a discrimination model, which compares it with the real language sample and judges whether it is a positive or a negative sample;
a discrimination model updating module: comparing the judgment result of the discrimination model with the expectation of the generative model, and updating the model parameters of the discrimination model according to the comparison result;
a generative model updating module: the generative model updates its own model parameters according to the updated model parameters of the discrimination model.
Preferably, the sample generation module comprises:
for modifier insertion, the generative model determines positions in the extracted real language sample where a modifier can be inserted, and inserts a modifier at such a position;
for modifier deletion, the generative model locates modifiers in the extracted real language sample, and deletes a modifier at a located position;
for modifier replacement, the generative model locates modifiers in the extracted real language sample, and replaces a modifier at a located position with a new modifier;
wherein the insertion, deletion or replacement of modifiers does not change whether the extracted real language sample itself is classified as a positive or a negative sample.
Preferably, the sample generation module comprises:
for the introduction of confusion words, the generative model determines the positions and categories of the stem words in the extracted real language sample, and inserts confusion words or replaces the stem words with them, thereby changing whether the extracted real language sample is classified as a positive or a negative sample.
Preferably, the comparison in the discrimination module comprises: vectorizing the first or second new language sample and computing the KL divergence; if the resulting distribution difference is smaller than a preset value, the discrimination model judges the sample to be positive, otherwise negative.
Preferably, the discrimination model updating module comprises:
when the judgment result of the discrimination model is consistent with the expectation of the generative model, the discrimination model updates its model parameters by backpropagation so that the probability of a correct discrimination becomes higher;
when the judgment result of the discrimination model is inconsistent with the expectation of the generative model, the discrimination model updates its model parameters by backpropagation so that the probability of a wrong discrimination becomes lower;
and the generative model updating module comprises: based on the updated model parameters of the discrimination model, the generative model computes gradients along the directions in which the previous discrimination model was most easily misled and the updated discrimination model is most likely to be misled, and updates its model parameters respectively.
Compared with the prior art, the invention has the following beneficial effects:
1) Using generative adversarial learning adapted to the characteristics of text, the patent provides a new generative learning scheme.
2) The generative model in this scheme randomly extracts real language samples from the real data distribution and generates new samples on that basis, so the initial quality of the samples is higher than in previous learning schemes.
3) Compared with conventional learning methods, the quality of the positive samples generated by inserting, deleting and replacing modifiers and of the negative samples generated by introducing confusion words is controllable.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a flow chart of the operation of the present invention.
Detailed Description
The present invention will be described in detail with reference to specific embodiments. The following embodiments will assist those skilled in the art in further understanding the invention, but do not limit it in any way. It should be noted that various changes and modifications can be made by those skilled in the art without departing from the spirit of the invention, all of which fall within the scope of the present invention.
As shown in fig. 1, the method for training a text discriminant model provided by the present invention trains two models:
model 1: a model is generated for generating text in a near real language, denoted G.
Model 2: and the discrimination model is used for discriminating whether the text generated by the G is real or not and is represented by D.
On a server, a generative model G and a discriminant model D are trained so that new language data generated by G cannot make D distinguish whether the new language data is real language data or generated language data.
Referring to the flow chart of fig. 1, the method comprises the following steps:
step 1: samples were drawn randomly. G, randomly extracting real samples from a real language library;
step 2: a positive sample is generated. Positive samples are text that is expected to be in real language. Specifically, the new language data is generated as a positive sample by inserting, deleting, and replacing some modifiers.
The step 2 specifically comprises the following steps: for the inserted modifier, G, after judging the position where the modifier can be inserted, inserting a modifier; for deleting the modifier, G deletes the modifier after judging the position of the modifier; and for the replacement modifier, G replaces the modifier with a new modifier after judging the position of the modifier. This operation does not change the classification of the original text;
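The modifier edits of step 2 can be sketched as token-level operations. The modifier vocabulary and the random position choices below are illustrative assumptions; the patent does not specify how G learns insertable positions or which modifiers it uses.

```python
import random

# Hypothetical modifier vocabulary (an assumption for illustration).
MODIFIERS = ["very", "quite", "extremely", "rather"]

def insert_modifier(tokens, rng=random):
    """Insert a modifier at a random position (positive-sample edit)."""
    pos = rng.randrange(len(tokens) + 1)
    return tokens[:pos] + [rng.choice(MODIFIERS)] + tokens[pos:]

def delete_modifier(tokens):
    """Delete the first modifier found; a no-op if there is none."""
    for i, t in enumerate(tokens):
        if t in MODIFIERS:
            return tokens[:i] + tokens[i + 1:]
    return list(tokens)

def replace_modifier(tokens, rng=random):
    """Replace the first modifier found with a different one."""
    for i, t in enumerate(tokens):
        if t in MODIFIERS:
            alternatives = [m for m in MODIFIERS if m != t]
            return tokens[:i] + [rng.choice(alternatives)] + tokens[i + 1:]
    return list(tokens)
```

None of these edits touches the stem words, so the sample's positive/negative classification is preserved, as the patent requires.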
Step 3: generate a negative sample. A negative sample is text that is expected to be non-real language. Specifically, confusion words are introduced at the stem words to change the classification of the original text;
Step 3 specifically comprises: G determines the position and category of a stem word, and then inserts a confusion word or replaces the original stem word with one. The confusion word changes the category of the original stem word.
Step 4: the positive and negative samples are input into D, which compares them with the real sample distribution. Steps 2 and 3 aim to generate data that D misjudges as often as possible, i.e. data that makes D judge a generated positive sample as negative, or a generated negative sample as positive;
The comparison in step 4 specifically comprises vectorization and KL divergence calculation: if the calculated distribution difference is small, D judges the sample to be positive, otherwise negative.
Step 5: check whether the judgment is correct. Specifically, the discrimination of D is compared with the expectation of G; D computes its gradient from the result and updates its model parameters, with the aim of minimizing the discrimination error;
The comparison in step 5 specifically comprises: if the discrimination of D is consistent with the expectation of G, D back-propagates the gradient and updates its parameters so that correct discriminations become more likely; if the discrimination of D is inconsistent with the expectation of G, D back-propagates the gradient and updates its parameters so that wrong discriminations become less likely.
Step 6: G computes its gradient from the latest model parameters of D and updates its own model parameters, with the aim of maximizing the probability that D judges wrongly;
The calculation in step 6 specifically comprises: based on the latest model parameters of D, G computes gradients along the directions in which the previous D was most easily misled and the updated D is most likely to be misled, and updates the parameters for generating positive samples and negative samples respectively.
The invention also provides a system for training a text discrimination model, comprising:
a sample extraction module: extracting a real language sample from a real language library and inputting it into the generative model;
a sample generation module: the generative model inserts, deletes or replaces modifiers in the extracted real language sample to obtain a first new language sample, and introduces confusion words at the stem words of the extracted real language sample to obtain a second new language sample;
a discrimination module: inputting the first or second new language sample into a discrimination model, which compares it with the real language sample and judges whether it is a positive or a negative sample;
a discrimination model updating module: comparing the judgment result of the discrimination model with the expectation of the generative model, and updating the model parameters of the discrimination model according to the comparison result;
a generative model updating module: the generative model updates its own model parameters according to the updated model parameters of the discrimination model.
Those skilled in the art will appreciate that, in addition to implementing the system and its devices, modules and units as pure computer-readable program code, the method steps can be logically programmed so that the system and its devices, modules and units are realized in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. The system and its devices, modules and units can therefore be regarded as hardware components, and the devices, modules and units for realizing the various functions can be regarded either as structures within a hardware component or as structures within both software modules and hardware components for performing the method.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.
Claims (10)
1. A method for training a text discrimination model, comprising:
a sample extraction step: extracting a real language sample from a real language library and inputting it into a generative model;
a sample generation step: the generative model inserts, deletes or replaces modifiers in the extracted real language sample to obtain a first new language sample; the generative model introduces confusion words at the stem words of the extracted real language sample to obtain a second new language sample;
a discrimination step: inputting the first or second new language sample into a discrimination model, which compares it with the real language sample and judges whether it is a positive or a negative sample;
a discrimination model updating step: comparing the judgment result of the discrimination model with the expectation of the generative model, and updating the model parameters of the discrimination model according to the comparison result;
a generative model updating step: the generative model updates its own model parameters according to the updated model parameters of the discrimination model.
2. The method for training a text discrimination model according to claim 1, wherein the sample generation step comprises:
for modifier insertion, the generative model determines positions in the extracted real language sample where a modifier can be inserted, and inserts a modifier at such a position;
for modifier deletion, the generative model locates modifiers in the extracted real language sample, and deletes a modifier at a located position;
for modifier replacement, the generative model locates modifiers in the extracted real language sample, and replaces a modifier at a located position with a new modifier;
wherein the insertion, deletion or replacement of modifiers does not change whether the extracted real language sample itself is classified as a positive or a negative sample.
3. The method for training a text discrimination model according to claim 1, wherein the sample generation step comprises:
for the introduction of confusion words, the generative model determines the positions and categories of the stem words in the extracted real language sample, and inserts confusion words or replaces the stem words with them, thereby changing whether the extracted real language sample is classified as a positive or a negative sample.
4. The method for training a text discrimination model according to claim 1, wherein the comparison in the discrimination step comprises: vectorizing the first or second new language sample and computing the KL divergence; if the resulting distribution difference is smaller than a preset value, the discrimination model judges the sample to be positive, otherwise negative.
5. The method for training a text discrimination model according to claim 1, wherein the discrimination model updating step comprises:
when the judgment result of the discrimination model is consistent with the expectation of the generative model, the discrimination model updates its model parameters by backpropagation so that the probability of a correct discrimination becomes higher;
when the judgment result of the discrimination model is inconsistent with the expectation of the generative model, the discrimination model updates its model parameters by backpropagation so that the probability of a wrong discrimination becomes lower;
and the generative model updating step comprises: based on the updated model parameters of the discrimination model, the generative model computes gradients along the directions in which the previous discrimination model was most easily misled and the updated discrimination model is most likely to be misled, and updates its model parameters respectively.
6. A system for training a text discrimination model, comprising:
a sample extraction module: extracting a real language sample from a real language library and inputting it into a generative model;
a sample generation module: the generative model inserts, deletes or replaces modifiers in the extracted real language sample to obtain a first new language sample; the generative model introduces confusion words at the stem words of the extracted real language sample to obtain a second new language sample;
a discrimination module: inputting the first or second new language sample into a discrimination model, which compares it with the real language sample and judges whether it is a positive or a negative sample;
a discrimination model updating module: comparing the judgment result of the discrimination model with the expectation of the generative model, and updating the model parameters of the discrimination model according to the comparison result;
a generative model updating module: the generative model updates its own model parameters according to the updated model parameters of the discrimination model.
7. The system for training a text discrimination model according to claim 6, wherein the sample generation module comprises:
for modifier insertion, the generative model determines positions in the extracted real language sample where a modifier can be inserted, and inserts a modifier at such a position;
for modifier deletion, the generative model locates modifiers in the extracted real language sample, and deletes a modifier at a located position;
for modifier replacement, the generative model locates modifiers in the extracted real language sample, and replaces a modifier at a located position with a new modifier;
wherein the insertion, deletion or replacement of modifiers does not change whether the extracted real language sample itself is classified as a positive or a negative sample.
8. The system for training a text discrimination model according to claim 6, wherein the sample generation module comprises:
for the introduction of confusion words, the generative model determines the positions and categories of the stem words in the extracted real language sample, and inserts confusion words or replaces the stem words with them, thereby changing whether the extracted real language sample is classified as a positive or a negative sample.
9. The system for training a text discrimination model according to claim 6, wherein the comparison in the discrimination module comprises: vectorizing the first or second new language sample and computing the KL divergence; if the resulting distribution difference is smaller than a preset value, the discrimination model judges the sample to be positive, otherwise negative.
10. The system for training a text discrimination model according to claim 6, wherein the discrimination model updating module comprises:
when the judgment result of the discrimination model is consistent with the expectation of the generative model, the discrimination model updates its model parameters by backpropagation so that the probability of a correct discrimination becomes higher;
when the judgment result of the discrimination model is inconsistent with the expectation of the generative model, the discrimination model updates its model parameters by backpropagation so that the probability of a wrong discrimination becomes lower;
and the generative model updating module comprises: based on the updated model parameters of the discrimination model, the generative model computes gradients along the directions in which the previous discrimination model was most easily misled and the updated discrimination model is most likely to be misled, and updates its model parameters respectively.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011347328.2A CN112328750A (en) | 2020-11-26 | 2020-11-26 | Method and system for training text discrimination model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011347328.2A CN112328750A (en) | 2020-11-26 | 2020-11-26 | Method and system for training text discrimination model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112328750A true CN112328750A (en) | 2021-02-05 |
Family
ID=74309567
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011347328.2A Pending CN112328750A (en) | 2020-11-26 | 2020-11-26 | Method and system for training text discrimination model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112328750A (en) |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109003678A (en) * | 2018-06-12 | 2018-12-14 | 清华大学 | A kind of generation method and system emulating text case history |
CN109117482A (en) * | 2018-09-17 | 2019-01-01 | 武汉大学 | A kind of confrontation sample generating method towards the detection of Chinese text emotion tendency |
CN109766432A (en) * | 2018-07-12 | 2019-05-17 | 中国科学院信息工程研究所 | A kind of Chinese abstraction generating method and device based on generation confrontation network |
CN109885667A (en) * | 2019-01-24 | 2019-06-14 | 平安科技(深圳)有限公司 | Document creation method, device, computer equipment and medium |
CN110347819A (en) * | 2019-06-21 | 2019-10-18 | 同济大学 | A kind of text snippet generation method based on positive negative sample dual training |
CN110414003A (en) * | 2019-07-29 | 2019-11-05 | 清华大学 | Establish method, apparatus, medium and the calculating equipment of text generation model |
CN110442859A (en) * | 2019-06-28 | 2019-11-12 | 中国人民解放军国防科技大学 | Method, device and equipment for generating labeled corpus and storage medium |
CN111027292A (en) * | 2019-11-29 | 2020-04-17 | 北京邮电大学 | Method and system for generating limited sampling text sequence |
US20200134415A1 (en) * | 2018-10-30 | 2020-04-30 | Huawei Technologies Co., Ltd. | Autoencoder-Based Generative Adversarial Networks for Text Generation |
CN111160043A (en) * | 2019-12-31 | 2020-05-15 | 科大讯飞股份有限公司 | Feature encoding method and device, electronic equipment and readable storage medium |
CN111488422A (en) * | 2019-01-25 | 2020-08-04 | 深信服科技股份有限公司 | Incremental method and device for structured data sample, electronic equipment and medium |
CN111738351A (en) * | 2020-06-30 | 2020-10-02 | 创新奇智(重庆)科技有限公司 | Model training method and device, storage medium and electronic equipment |
CN111881935A (en) * | 2020-06-19 | 2020-11-03 | 北京邮电大学 | Countermeasure sample generation method based on content-aware GAN |
CN111914552A (en) * | 2020-07-31 | 2020-11-10 | 平安科技(深圳)有限公司 | Training method and device of data enhancement model |
- 2020-11-26: CN application CN202011347328.2A filed (publication CN112328750A), status Pending
Non-Patent Citations (3)
Title |
---|
LIANG BIN: "Deep Text Classification Can Be Fooled", IJCAI-18 * |
YAN YOUSAN: "Deep Learning for Face Image Processing: Core Algorithms and Practical Cases", 31 July 2020 * |
CHEN YUNJI: "AI Computing Systems", 29 February 2020 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wei et al. | Supervised deep features for software functional clone detection by exploiting lexical and syntactical information in source code. | |
CN110110327B (en) | Text labeling method and equipment based on counterstudy | |
US10963685B2 (en) | Generating variations of a known shred | |
KR101813683B1 (en) | Method for automatic correction of errors in annotated corpus using kernel Ripple-Down Rules | |
CN107688803B (en) | Method and device for verifying recognition result in character recognition | |
CN108959418A (en) | Character relation extraction method and device, computer device and computer readable storage medium | |
CN112016553B (en) | Optical Character Recognition (OCR) system, automatic OCR correction system, method | |
US20170076152A1 (en) | Determining a text string based on visual features of a shred | |
CN112651238A (en) | Training corpus expansion method and device and intention recognition model training method and device | |
JP2005158010A (en) | Apparatus, method and program for classification evaluation | |
CN110309073B (en) | Method, system and terminal for automatically detecting user interface errors of mobile application program | |
CN113408535B (en) | OCR error correction method based on Chinese character level features and language model | |
CN112966088B (en) | Unknown intention recognition method, device, equipment and storage medium | |
CN111930939A (en) | Text detection method and device | |
CN105512195A (en) | Auxiliary method for analyzing and making decisions of product FMECA report | |
CN107357895A (en) | A kind of processing method of the text representation based on bag of words | |
CN114818643A (en) | Log template extraction method for reserving specific service information | |
CN116911289A (en) | Method, device and storage medium for generating large-model trusted text in government affair field | |
CN111680684A (en) | Method, device and storage medium for recognizing spine text based on deep learning | |
CN110633456A (en) | Language identification method, language identification device, server and storage medium | |
CN112328750A (en) | Method and system for training text discrimination model | |
KR102019208B1 (en) | Deep learning-based error identification method and apparatus | |
CN109284392B (en) | Text classification method, device, terminal and storage medium | |
CN115953123A (en) | Method, device and equipment for generating robot automation flow and storage medium | |
CN112651590B (en) | Instruction processing flow recommending method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 2021-02-05