CN112328750A - Method and system for training text discrimination model - Google Patents

Method and system for training text discrimination model

Info

Publication number
CN112328750A
Authority
CN
China
Prior art keywords
model
sample
discrimination
modifiers
language sample
Prior art date
Legal status
Pending
Application number
CN202011347328.2A
Other languages
Chinese (zh)
Inventor
蔡晓华
Current Assignee
Shanghai Netis Technologies Co ltd
Original Assignee
Shanghai Netis Technologies Co ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Netis Technologies Co ltd filed Critical Shanghai Netis Technologies Co ltd
Priority to CN202011347328.2A
Publication of CN112328750A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Abstract

The invention provides a method and a system for training a text discrimination model. The method comprises the following steps: extracting a real language sample from a real language library and inputting it into a generation model; the generation model inserts, deletes or replaces modifiers in the extracted real language sample to obtain a first new language sample, and introduces confusing words into the stem words of the extracted real language sample to obtain a second new language sample; the first or second new language sample is input into a discrimination model, which compares the input sample with the real language sample and judges whether it is a positive sample or a negative sample; the judgment result of the discrimination model is compared with the expectation of the generation model, and the model parameters of the discrimination model are updated according to the comparison result; the generation model then updates its own model parameters according to the updated model parameters of the discrimination model. Compared with conventional adversarial learning, the quality of the positive samples generated by editing modifiers and of the negative samples generated by introducing confusing words is controllable.

Description

Method and system for training text discrimination model
Technical Field
The invention relates to the field of data processing, in particular to a method and a system for training a text discrimination model.
Background
Adversarial learning is currently applied mainly in the field of image recognition: a discrimination model judges the images produced by a generation model, so that the image generation capability of the generation model is continuously improved.
For example, patent document CN109949317A discloses a semi-supervised image instance segmentation method based on progressive adversarial learning, which retrains an instance segmentation model to obtain a segmentation model with higher accuracy. Existing adversarial learning work focuses mainly on improving the performance of the generation model after training and neglects improving the discrimination model during adversarial learning; few published documents describe ways of improving the robustness of a discrimination model through adversarial learning.
Disclosure of Invention
In view of the defects in the prior art, the invention aims to provide a method and a system for training a text discrimination model.
The method for training the text discrimination model provided by the invention comprises the following steps:
a sample extraction step: extracting a real language sample from a real language library and inputting the real language sample into a generation model;
a sample generation step: the generation model inserts, deletes or replaces modifiers in the extracted real language sample to obtain a first new language sample, and introduces confusing words into the stem words of the extracted real language sample to obtain a second new language sample;
a discrimination step: the first new language sample or the second new language sample is input into a discrimination model, which compares the input sample with the real language sample and judges whether it is a positive sample or a negative sample;
a discrimination model updating step: the judgment result of the discrimination model is compared with the expectation of the generation model, and the model parameters of the discrimination model are updated according to the comparison result;
a generation model updating step: the generation model updates its own model parameters according to the updated model parameters of the discrimination model.
Preferably, the sample generation step comprises:
for insertion of modifiers, the generation model determines insertable positions in the extracted real language sample and inserts modifiers at those positions;
for deletion of modifiers, the generation model determines the positions of existing modifiers in the extracted real language sample and deletes the modifiers at those positions;
for replacement of modifiers, the generation model determines the positions of existing modifiers in the extracted real language sample and replaces the modifiers at those positions with new modifiers;
wherein the insertion, deletion or replacement of modifiers does not change whether the extracted real language sample is classified as a positive sample or a negative sample.
Preferably, the sample generation step comprises:
for introduction of confusing words, the generation model determines the positions and categories of the stem words in the extracted real language sample, and inserts confusing words or replaces the stem words with confusing words, thereby changing whether the extracted real language sample is classified as a positive sample or a negative sample.
Preferably, the comparison in the discrimination step comprises: vectorizing the first new language sample or the second new language sample and calculating the KL divergence against the real language sample; if the resulting distribution difference is smaller than a preset value, the discrimination model judges the sample to be a positive sample, otherwise it judges the sample to be a negative sample.
Preferably, the discrimination model updating step comprises:
when the judgment result of the discrimination model is consistent with the expectation of the generation model, the discrimination model updates its model parameters through back-propagation so that the probability of the discrimination model giving a correct judgment becomes higher;
when the judgment result of the discrimination model is inconsistent with the expectation of the generation model, the discrimination model updates its model parameters through back-propagation so that the probability of the discrimination model giving a wrong judgment becomes lower;
and the generation model updating step comprises: based on the updated model parameters of the discrimination model, the generation model calculates gradients in the directions in which the previous discrimination model was most prone to error and in which the future discrimination model is most likely to err, and updates the model parameters of the generation model for positive-sample generation and for negative-sample generation respectively.
The system for training the text discrimination model provided by the invention comprises the following components:
a sample extraction module: extracting a real language sample from a real language library and inputting the real language sample into a generation model;
a sample generation module: the generation model inserts, deletes or replaces modifiers in the extracted real language sample to obtain a first new language sample, and introduces confusing words into the stem words of the extracted real language sample to obtain a second new language sample;
a discrimination module: the first new language sample or the second new language sample is input into a discrimination model, which compares the input sample with the real language sample and judges whether it is a positive sample or a negative sample;
a discrimination model updating module: the judgment result of the discrimination model is compared with the expectation of the generation model, and the model parameters of the discrimination model are updated according to the comparison result;
a generation model updating module: the generation model updates its own model parameters according to the updated model parameters of the discrimination model.
Preferably, the sample generation module comprises:
for insertion of modifiers, the generation model determines insertable positions in the extracted real language sample and inserts modifiers at those positions;
for deletion of modifiers, the generation model determines the positions of existing modifiers in the extracted real language sample and deletes the modifiers at those positions;
for replacement of modifiers, the generation model determines the positions of existing modifiers in the extracted real language sample and replaces the modifiers at those positions with new modifiers;
wherein the insertion, deletion or replacement of modifiers does not change whether the extracted real language sample is classified as a positive sample or a negative sample.
Preferably, the sample generation module comprises:
for introduction of confusing words, the generation model determines the positions and categories of the stem words in the extracted real language sample, and inserts confusing words or replaces the stem words with confusing words, thereby changing whether the extracted real language sample is classified as a positive sample or a negative sample.
Preferably, the comparison in the discrimination module comprises: vectorizing the first new language sample or the second new language sample and calculating the KL divergence against the real language sample; if the resulting distribution difference is smaller than a preset value, the discrimination model judges the sample to be a positive sample, otherwise it judges the sample to be a negative sample.
Preferably, the discrimination model updating module comprises:
when the judgment result of the discrimination model is consistent with the expectation of the generation model, the discrimination model updates its model parameters through back-propagation so that the probability of the discrimination model giving a correct judgment becomes higher;
when the judgment result of the discrimination model is inconsistent with the expectation of the generation model, the discrimination model updates its model parameters through back-propagation so that the probability of the discrimination model giving a wrong judgment becomes lower;
and the generation model updating module comprises: based on the updated model parameters of the discrimination model, the generation model calculates gradients in the directions in which the previous discrimination model was most prone to error and in which the future discrimination model is most likely to err, and updates the model parameters of the generation model for positive-sample generation and for negative-sample generation respectively.
Compared with the prior art, the invention has the following beneficial effects:
1) According to the characteristics of text, the invention provides a new generative learning scheme that uses generative adversarial learning.
2) The generation model in this scheme randomly extracts real language samples from the real data distribution and generates new samples on that basis, so the initial quality of the samples is higher than in previous adversarial learning approaches.
3) Compared with conventional adversarial learning, the quality of the positive samples generated by inserting, deleting and replacing modifiers and of the negative samples generated by introducing confusing words is controllable.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a flow chart of the operation of the present invention.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that various changes and modifications can be made by those skilled in the art without departing from the spirit of the invention, all of which fall within the scope of the present invention.
As shown in FIG. 1, the method for training a text discrimination model provided by the present invention trains two models:
Model 1: a generation model, denoted G, for generating text that is close to real language.
Model 2: a discrimination model, denoted D, for judging whether the text generated by G is real.
On a server, the generation model G and the discrimination model D are trained so that D cannot distinguish whether the new language data generated by G is real language data or generated language data.
Referring to the flow chart of FIG. 1, the method comprises the following steps:
Step 1: randomly extract samples. G randomly extracts real samples from a real language library;
Step 2: generate a positive sample. A positive sample is text that is expected to be real language. Specifically, new language data is generated as a positive sample by inserting, deleting or replacing some modifiers.
Step 2 specifically comprises: for insertion of a modifier, G determines a position where a modifier can be inserted and then inserts one; for deletion of a modifier, G determines the position of an existing modifier and then deletes it; for replacement of a modifier, G determines the position of an existing modifier and then replaces it with a new modifier. These operations do not change the classification of the original text;
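To make the three modifier operations concrete, the following sketch perturbs a tokenized sentence using a small hand-written modifier lexicon. The lexicon, the random position choice and the function names are illustrative assumptions for this example only; the patent leaves the choice of modifiers and of insertable positions to the generation model G.

```python
import random

# Hypothetical modifier lexicon; the patent does not prescribe a specific word list.
MODIFIERS = {"quickly", "carefully", "very", "slowly", "silently"}

def insert_modifier(tokens):
    """Insert a modifier at a randomly chosen position (step 2, insertion)."""
    pos = random.randint(0, len(tokens))
    return tokens[:pos] + [random.choice(sorted(MODIFIERS))] + tokens[pos:]

def delete_modifier(tokens):
    """Delete one existing modifier, if any (step 2, deletion)."""
    positions = [i for i, t in enumerate(tokens) if t in MODIFIERS]
    if not positions:
        return list(tokens)                       # nothing to delete
    pos = random.choice(positions)
    return tokens[:pos] + tokens[pos + 1:]

def replace_modifier(tokens):
    """Replace one existing modifier with a different one (step 2, replacement)."""
    positions = [i for i, t in enumerate(tokens) if t in MODIFIERS]
    if not positions:
        return list(tokens)
    pos = random.choice(positions)
    new_tokens = list(tokens)
    new_tokens[pos] = random.choice(sorted(MODIFIERS - {tokens[pos]}))
    return new_tokens

def generate_positive_sample(tokens):
    """Apply one random modifier edit; the class label of the text is unchanged."""
    op = random.choice([insert_modifier, delete_modifier, replace_modifier])
    return op(tokens)

if __name__ == "__main__":
    sample = "the transaction completed quickly on the server".split()
    print(generate_positive_sample(sample))
```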
Step 3: generate a negative sample. A negative sample is text that is expected not to be real language. Specifically, some confusing words are introduced into the stem words to change the classification of the original text;
Step 3 specifically comprises: G determines the position and the category of a stem word, and then inserts a confusing word or replaces the original stem word with one. The confusing word changes the category of the original stem word.
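The stem-word confusion operation can be illustrated in the same style. The confusion dictionary below, which maps each stem word to words of a different category, is a hypothetical example; the patent does not specify how confusing words are selected.

```python
import random

# Hypothetical confusion dictionary: each stem word maps to words of a different
# category, so substituting one changes the classification of the sentence.
CONFUSION_WORDS = {
    "payment":   ["weather", "holiday"],
    "timeout":   ["success", "banquet"],
    "completed": ["evaporated", "blossomed"],
}

def generate_negative_sample(tokens):
    """Replace one stem word with a confusing word (step 3); the label flips."""
    positions = [i for i, t in enumerate(tokens) if t in CONFUSION_WORDS]
    if not positions:
        return list(tokens)                        # no known stem word to confuse
    pos = random.choice(positions)
    new_tokens = list(tokens)
    new_tokens[pos] = random.choice(CONFUSION_WORDS[tokens[pos]])
    return new_tokens

if __name__ == "__main__":
    sample = "the payment request completed before the timeout".split()
    print(generate_negative_sample(sample))
```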
Step 4: the positive and negative samples are input into D, which compares them with the real sample distribution. The goal of steps 2 and 3 is to generate data that makes D judge wrongly as often as possible, i.e. to make D mistake a generated positive sample for a negative sample, or mistake a generated negative sample for a positive sample;
The comparison in step 4 specifically comprises vectorization and KL divergence calculation: if the calculated distribution difference is small, D judges the sample to be a positive sample, otherwise D judges it to be a negative sample.
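As an illustration of this comparison, the sketch below vectorizes both samples as smoothed bag-of-words distributions and thresholds the KL divergence. The vectorization scheme, the KL direction, the smoothing constant and the threshold are assumed values chosen for the example; the patent only requires that a distribution difference below a preset value be judged positive.

```python
import math
from collections import Counter

def word_distribution(tokens, vocab, eps=0.01):
    """Vectorize a token list as a smoothed word-probability distribution over vocab."""
    counts = Counter(tokens)
    total = sum(counts.values()) + eps * len(vocab)
    return [(counts[w] + eps) / total for w in vocab]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def judge(sample_tokens, real_tokens, threshold=0.5):
    """D's comparison in step 4: positive when the distribution difference is small."""
    vocab = sorted(set(sample_tokens) | set(real_tokens))
    p_sample = word_distribution(sample_tokens, vocab)
    p_real = word_distribution(real_tokens, vocab)
    return "positive" if kl_divergence(p_real, p_sample) < threshold else "negative"

if __name__ == "__main__":
    real = "the payment request completed before the timeout".split()
    modifier_edit = "the payment request completed quickly before the timeout".split()
    stem_swap = "the weather request completed before the timeout".split()
    print(judge(modifier_edit, real))   # small difference -> positive
    print(judge(stem_swap, real))       # stem word changed -> negative
```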
Step 5: check whether the judgment is correct. Specifically, the judgment of D is compared with the expectation of G; D calculates its gradient according to the result and updates its model parameters, the aim being to minimize the judgment error;
The comparison in step 5 specifically comprises: if the judgment of D is consistent with the expectation of G, D updates its parameters by back-propagating the gradient so that the probability of a correct judgment becomes higher; if the judgment of D is inconsistent with the expectation of G, D updates its parameters by back-propagating the gradient so that the probability of a wrong judgment becomes lower.
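The update of D can be illustrated with a deliberately simple discriminator: a logistic layer over bag-of-words features trained with a binary cross-entropy loss. This stand-in, its features and its learning rate are assumptions made for the example; the patent does not prescribe D's architecture.

```python
import numpy as np

def bow_features(tokens, vocab):
    """Bag-of-words feature vector for a tokenized sample."""
    vec = np.zeros(len(vocab))
    for t in tokens:
        if t in vocab:
            vec[vocab[t]] += 1.0
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def update_discriminator(w, b, x, label, lr=0.1):
    """One back-propagation step on the binary cross-entropy loss.

    label = 1 for a sample that should be judged positive, 0 otherwise; the
    gradient step lowers the probability of a wrong judgment."""
    p = sigmoid(w @ x + b)            # D's current belief that x is positive
    grad_logit = p - label            # d(loss)/d(logit) for cross-entropy
    return w - lr * grad_logit * x, b - lr * grad_logit

if __name__ == "__main__":
    vocab = {w: i for i, w in enumerate(
        "the payment request completed quickly before timeout weather".split())}
    w, b = np.zeros(len(vocab)), 0.0
    x = bow_features("the payment request completed before the timeout".split(), vocab)
    w, b = update_discriminator(w, b, x, label=1)   # D should call this sample positive
    print(sigmoid(w @ x + b))                       # probability moves above 0.5
```

In the patented scheme D would typically be a neural network trained by back-propagation (cf. classification G06N3/084); the logistic layer above only illustrates the direction of the parameter update.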
Step 6: G calculates its gradient according to the latest model parameters of D and updates its own model parameters, the aim being to maximize the probability that D judges wrongly;
The calculation in step 6 specifically comprises: based on the latest model parameters of D, G calculates gradients in the directions in which the previous D was most prone to error and in which the future D is most likely to err, and updates the model parameters for generating positive samples and for generating negative samples respectively.
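The patent does not state how G is parameterized, so the sketch below makes an explicit assumption: G keeps one categorical distribution over edit operations for positive-sample generation and another for negative-sample generation, and nudges each distribution with a score-function (REINFORCE-style) step toward the operations that most recently made D err. The operation names, the reward definition and the learning rate are all illustrative, not taken from the patent.

```python
import numpy as np

POSITIVE_OPS = ["insert_modifier", "delete_modifier", "replace_modifier"]
NEGATIVE_OPS = ["insert_confusing_word", "replace_stem_word"]

def softmax(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()

def update_generator(logits, op_index, d_was_fooled, lr=0.5):
    """Score-function update of G's operation distribution.

    d_was_fooled is True when D's latest judgment of the generated sample was
    wrong; the chosen operation is then made more likely, otherwise less likely."""
    probs = softmax(logits)
    reward = 1.0 if d_was_fooled else -1.0
    grad = -probs
    grad[op_index] += 1.0                 # gradient of log-prob of the chosen op
    return logits + lr * reward * grad

if __name__ == "__main__":
    pos_logits = np.zeros(len(POSITIVE_OPS))   # parameters for positive-sample generation
    neg_logits = np.zeros(len(NEGATIVE_OPS))   # parameters for negative-sample generation
    # Suppose the last positive sample (made by replacing a modifier) fooled D:
    pos_logits = update_generator(pos_logits, POSITIVE_OPS.index("replace_modifier"), True)
    # ...and the last negative sample (a stem-word swap) did not fool D:
    neg_logits = update_generator(neg_logits, NEGATIVE_OPS.index("replace_stem_word"), False)
    print(softmax(pos_logits), softmax(neg_logits))
```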
The invention also provides a system for training the text discrimination model, which comprises:
A sample extraction module: extracts a real language sample from the real language library and inputs it into the generation model.
A sample generation module: the generation model inserts, deletes or replaces modifiers in the extracted real language sample to obtain a first new language sample, and introduces confusing words into the stem words of the extracted real language sample to obtain a second new language sample.
A discrimination module: the first new language sample or the second new language sample is input into the discrimination model, which compares the input sample with the real language sample and judges whether it is a positive sample or a negative sample.
A discrimination model updating module: the judgment result of the discrimination model is compared with the expectation of the generation model, and the model parameters of the discrimination model are updated according to the comparison result.
A generation model updating module: the generation model updates its own model parameters according to the updated model parameters of the discrimination model.
Those skilled in the art will appreciate that, in addition to implementing the system and its various devices, modules, units provided by the present invention as pure computer readable program code, the system and its various devices, modules, units provided by the present invention can be fully implemented by logically programming method steps in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system and various devices, modules and units thereof provided by the invention can be regarded as a hardware component, and the devices, modules and units included in the system for realizing various functions can also be regarded as structures in the hardware component; means, modules, units for performing the various functions may also be regarded as structures within both software modules and hardware components for performing the method.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.

Claims (10)

1. A method for training a text discrimination model, comprising:
a sample extraction step: extracting a real language sample from a real language library and inputting the real language sample into a generation model;
a sample generation step: the generation model inserts, deletes or replaces modifiers in the extracted real language sample to obtain a first new language sample, and introduces confusing words into the stem words of the extracted real language sample to obtain a second new language sample;
a discrimination step: the first new language sample or the second new language sample is input into a discrimination model, which compares the input sample with the real language sample and judges whether it is a positive sample or a negative sample;
a discrimination model updating step: the judgment result of the discrimination model is compared with the expectation of the generation model, and the model parameters of the discrimination model are updated according to the comparison result;
a generation model updating step: the generation model updates its own model parameters according to the updated model parameters of the discrimination model.
2. The method for training a text discrimination model according to claim 1, wherein the sample generation step comprises:
for insertion of modifiers, the generation model determines insertable positions in the extracted real language sample and inserts modifiers at those positions;
for deletion of modifiers, the generation model determines the positions of existing modifiers in the extracted real language sample and deletes the modifiers at those positions;
for replacement of modifiers, the generation model determines the positions of existing modifiers in the extracted real language sample and replaces the modifiers at those positions with new modifiers;
wherein the insertion, deletion or replacement of modifiers does not change whether the extracted real language sample is classified as a positive sample or a negative sample.
3. The method for training a text discrimination model according to claim 1, wherein the sample generation step comprises:
for introduction of confusing words, the generation model determines the positions and categories of the stem words in the extracted real language sample, and inserts confusing words or replaces the stem words with confusing words, thereby changing whether the extracted real language sample is classified as a positive sample or a negative sample.
4. The method for training a text discrimination model according to claim 1, wherein the comparison in the discrimination step comprises: vectorizing the first new language sample or the second new language sample and calculating the KL divergence against the real language sample; if the resulting distribution difference is smaller than a preset value, the discrimination model judges the sample to be a positive sample, otherwise it judges the sample to be a negative sample.
5. The method for training a text discrimination model according to claim 1, wherein the discrimination model updating step comprises:
when the judgment result of the discrimination model is consistent with the expectation of the generation model, the discrimination model updates its model parameters through back-propagation so that the probability of the discrimination model giving a correct judgment becomes higher;
when the judgment result of the discrimination model is inconsistent with the expectation of the generation model, the discrimination model updates its model parameters through back-propagation so that the probability of the discrimination model giving a wrong judgment becomes lower;
and the generation model updating step comprises: based on the updated model parameters of the discrimination model, the generation model calculates gradients in the directions in which the previous discrimination model was most prone to error and in which the future discrimination model is most likely to err, and updates the model parameters of the generation model for positive-sample generation and for negative-sample generation respectively.
6. A system for training a text discrimination model, comprising:
a sample extraction module: extracting a real language sample from a real language library and inputting the real language sample into a generation model;
a sample generation module: the generation model inserts, deletes or replaces modifiers in the extracted real language sample to obtain a first new language sample, and introduces confusing words into the stem words of the extracted real language sample to obtain a second new language sample;
a discrimination module: the first new language sample or the second new language sample is input into a discrimination model, which compares the input sample with the real language sample and judges whether it is a positive sample or a negative sample;
a discrimination model updating module: the judgment result of the discrimination model is compared with the expectation of the generation model, and the model parameters of the discrimination model are updated according to the comparison result;
a generation model updating module: the generation model updates its own model parameters according to the updated model parameters of the discrimination model.
7. The system for training a text discrimination model according to claim 6, wherein the sample generation module comprises:
for insertion of modifiers, the generation model determines insertable positions in the extracted real language sample and inserts modifiers at those positions;
for deletion of modifiers, the generation model determines the positions of existing modifiers in the extracted real language sample and deletes the modifiers at those positions;
for replacement of modifiers, the generation model determines the positions of existing modifiers in the extracted real language sample and replaces the modifiers at those positions with new modifiers;
wherein the insertion, deletion or replacement of modifiers does not change whether the extracted real language sample is classified as a positive sample or a negative sample.
8. The system for training a text discrimination model according to claim 6, wherein the sample generation module comprises:
for introduction of confusing words, the generation model determines the positions and categories of the stem words in the extracted real language sample, and inserts confusing words or replaces the stem words with confusing words, thereby changing whether the extracted real language sample is classified as a positive sample or a negative sample.
9. The system for training a text discrimination model according to claim 6, wherein the comparison in the discrimination module comprises: vectorizing the first new language sample or the second new language sample and calculating the KL divergence against the real language sample; if the resulting distribution difference is smaller than a preset value, the discrimination model judges the sample to be a positive sample, otherwise it judges the sample to be a negative sample.
10. The system for training a text discrimination model according to claim 6, wherein the discrimination model updating module comprises:
when the judgment result of the discrimination model is consistent with the expectation of the generation model, the discrimination model updates its model parameters through back-propagation so that the probability of the discrimination model giving a correct judgment becomes higher;
when the judgment result of the discrimination model is inconsistent with the expectation of the generation model, the discrimination model updates its model parameters through back-propagation so that the probability of the discrimination model giving a wrong judgment becomes lower;
and the generation model updating module comprises: based on the updated model parameters of the discrimination model, the generation model calculates gradients in the directions in which the previous discrimination model was most prone to error and in which the future discrimination model is most likely to err, and updates the model parameters of the generation model for positive-sample generation and for negative-sample generation respectively.
CN202011347328.2A 2020-11-26 2020-11-26 Method and system for training text discrimination model Pending CN112328750A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011347328.2A CN112328750A (en) 2020-11-26 2020-11-26 Method and system for training text discrimination model

Publications (1)

Publication Number Publication Date
CN112328750A true CN112328750A (en) 2021-02-05

Family

ID=74309567

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011347328.2A Pending CN112328750A (en) 2020-11-26 2020-11-26 Method and system for training text discrimination model

Country Status (1)

Country Link
CN (1) CN112328750A (en)

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109003678A (en) * 2018-06-12 2018-12-14 清华大学 A kind of generation method and system emulating text case history
CN109766432A (en) * 2018-07-12 2019-05-17 中国科学院信息工程研究所 A kind of Chinese abstraction generating method and device based on generation confrontation network
CN109117482A (en) * 2018-09-17 2019-01-01 武汉大学 A kind of confrontation sample generating method towards the detection of Chinese text emotion tendency
US20200134415A1 (en) * 2018-10-30 2020-04-30 Huawei Technologies Co., Ltd. Autoencoder-Based Generative Adversarial Networks for Text Generation
CN109885667A (en) * 2019-01-24 2019-06-14 平安科技(深圳)有限公司 Document creation method, device, computer equipment and medium
CN111488422A (en) * 2019-01-25 2020-08-04 深信服科技股份有限公司 Incremental method and device for structured data sample, electronic equipment and medium
CN110347819A (en) * 2019-06-21 2019-10-18 同济大学 A kind of text snippet generation method based on positive negative sample dual training
CN110442859A (en) * 2019-06-28 2019-11-12 中国人民解放军国防科技大学 Method, device and equipment for generating labeled corpus and storage medium
CN110414003A (en) * 2019-07-29 2019-11-05 清华大学 Establish method, apparatus, medium and the calculating equipment of text generation model
CN111027292A (en) * 2019-11-29 2020-04-17 北京邮电大学 Method and system for generating limited sampling text sequence
CN111160043A (en) * 2019-12-31 2020-05-15 科大讯飞股份有限公司 Feature encoding method and device, electronic equipment and readable storage medium
CN111881935A (en) * 2020-06-19 2020-11-03 北京邮电大学 Countermeasure sample generation method based on content-aware GAN
CN111738351A (en) * 2020-06-30 2020-10-02 创新奇智(重庆)科技有限公司 Model training method and device, storage medium and electronic equipment
CN111914552A (en) * 2020-07-31 2020-11-10 平安科技(深圳)有限公司 Training method and device of data enhancement model

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LIANG BIN: "Deep Text Classification Can Be Fooled", IJCAI-18 *
言有三: "Deep Learning for Face Image Processing: Core Algorithms and Practical Cases" (《深度学习之人脸图像处理：核心算法与案例实战》), 31 July 2020 *
陈云霁: "Intelligent Computing Systems" (《智能计算系统》), 29 February 2020 *

Similar Documents

Publication Publication Date Title
Wei et al. Supervised deep features for software functional clone detection by exploiting lexical and syntactical information in source code.
CN110110327B (en) Text labeling method and equipment based on counterstudy
US10963685B2 (en) Generating variations of a known shred
KR101813683B1 (en) Method for automatic correction of errors in annotated corpus using kernel Ripple-Down Rules
CN107688803B (en) Method and device for verifying recognition result in character recognition
CN108959418A (en) Character relation extraction method and device, computer device and computer readable storage medium
CN112016553B (en) Optical Character Recognition (OCR) system, automatic OCR correction system, method
US20170076152A1 (en) Determining a text string based on visual features of a shred
CN112651238A (en) Training corpus expansion method and device and intention recognition model training method and device
JP2005158010A (en) Apparatus, method and program for classification evaluation
CN110309073B (en) Method, system and terminal for automatically detecting user interface errors of mobile application program
CN113408535B (en) OCR error correction method based on Chinese character level features and language model
CN112966088B (en) Unknown intention recognition method, device, equipment and storage medium
CN111930939A (en) Text detection method and device
CN105512195A (en) Auxiliary method for analyzing and making decisions of product FMECA report
CN107357895A (en) A kind of processing method of the text representation based on bag of words
CN114818643A (en) Log template extraction method for reserving specific service information
CN116911289A (en) Method, device and storage medium for generating large-model trusted text in government affair field
CN111680684A (en) Method, device and storage medium for recognizing spine text based on deep learning
CN110633456A (en) Language identification method, language identification device, server and storage medium
CN112328750A (en) Method and system for training text discrimination model
KR102019208B1 (en) Deep learning-based error identification method and apparatus
CN109284392B (en) Text classification method, device, terminal and storage medium
CN115953123A (en) Method, device and equipment for generating robot automation flow and storage medium
CN112651590B (en) Instruction processing flow recommending method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 2021-02-05)