CN112819091A - Cross-language description oriented antagonism data enhancement method, system and storage medium - Google Patents

Cross-language description oriented antagonism data enhancement method, system and storage medium

Info

Publication number
CN112819091A
Authority
CN
China
Prior art keywords
text
image
antagonism
target
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110198513.8A
Other languages
Chinese (zh)
Inventor
肖宇
鲁统伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Institute of Technology
Original Assignee
Wuhan Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-02-22
Filing date
2021-02-22
Publication date
2021-05-18
Application filed by Wuhan Institute of Technology
Priority to CN202110198513.8A
Publication of CN112819091A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Abstract

The invention discloses a cross-language description oriented antagonism data enhancement method, system and storage medium. The method comprises the following steps: obtaining a clean image-text pair data set; generating text adversarial samples using a sequence-to-sequence model; and training an image description generation model, wherein, if the current training stage is an adversarial training stage, image adversarial samples are generated, the image-text pairs are expanded accordingly, the model is trained on the expanded image-text pair data set, and the model is optimized according to a joint loss function; and, if the current training stage is a non-adversarial training stage, the model is trained on the clean image-text pair data set and optimized according to the loss function. A trained image description generation model and an expanded image-text pair data set are thereby obtained. The method expands the data set through an easy-to-operate form of data enhancement and improves the robustness and performance of the image description generation model.

Description

Cross-language description oriented antagonism data enhancement method, system and storage medium
Technical Field
The invention belongs to the technical field of data enhancement, and particularly relates to a cross-language description oriented antagonism data enhancement method, system and storage medium.
Background
Learning algorithms benefit directly from the scale of the data set: a model trained on a large-scale data set usually fits better and is more robust than one trained on a small-scale data set. For an image description task in a low-resource language to reach performance comparable to that of the English image description task, the first challenge is acquiring a large-scale data set.
Manual labeling is the best way to ensure data quality, but it is very time-consuming. To balance model performance against the cost of manual labeling, data enhancement methods are generally used to enlarge the data set; data enhancement is widely applied in the image field and works well there.
In the image description task, it is very challenging to augment image-text pairs while keeping their semantics unchanged. Both geometric transformations and random cropping of an image can affect the accuracy of the generated sentence. When orientation information is involved, for example a person standing on the left side of a table, flipping or cropping the picture may leave the model with less accurate or less complete information, causing errors in the generated text. On the text side, it is likewise difficult to formulate general language transformation rules, and general-purpose data enhancement techniques in Natural Language Processing (NLP) have not been fully explored.
Disclosure of Invention
The invention aims to provide a cross-language description oriented antagonism data enhancement method, system and storage medium that reduce a neural network's dependence on large-scale data sets.
The invention provides a cross-language description oriented antagonism data enhancement method, which comprises the following steps:
S1, acquiring a clean image-text pair data set;
S2, generating text adversarial samples by using a sequence-to-sequence model;
S3, training the image description generation model:
if the current training stage is an adversarial training stage, generating image adversarial samples, expanding the image-text pairs accordingly, training the model on the expanded image-text pair data set, and optimizing the model according to a joint loss function;
if the current training stage is a non-adversarial training stage, training the model on the clean image-text pair data set, and optimizing the model according to the loss function;
S4, obtaining the trained image description generation model and the expanded image-text pair data set.
Further, the image adversarial samples are generated using a gradient attack algorithm.
Further, step S2 specifically includes:
S21, converting the original text into a target text according to formula (1):
Figure BDA0002947126210000021
wherein S represents the original text, S' represents the target text, P represents a probability distribution, w'_t represents the t-th token of the target text, and n represents the number of tokens in the target text;
S22, obtaining the K best target-text translations of the original text and converting them into a sentence-generation probability distribution over the target vocabulary, calculated according to formula (2):
Figure BDA0002947126210000022
wherein K represents the number of target texts obtained for a single original text, ω represents a token in the target vocabulary, C represents an obtained target text, and E represents the original text;
wherein the probability distribution of a sentence is calculated according to formula (3):
Figure BDA0002947126210000023
wherein m represents the number of tokens in the target text.
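The formula images are not reproduced in this text. Under the symbol definitions above, formulas (1) and (3) are consistent with the standard autoregressive factorization used by sequence-to-sequence models. The block below is a hedged reconstruction under that assumption, not the verbatim formulas of the publication; the conditioning term on all previously generated tokens and the symbol c_t for the t-th token of C are notational assumptions.

```latex
% Assumed form of formula (1): probability of target text S' given original text S
P(S' \mid S) = \prod_{t=1}^{n} P\left(w'_t \mid w'_{<t},\, S\right)

% Assumed form of formula (3): probability of a target sentence C given original sentence E
P(C \mid E) = \prod_{t=1}^{m} P\left(c_t \mid c_{<t},\, E\right)
```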
Further, step S2 also includes:
S23, evaluating the semantic similarity between the generated target text and the original text according to formula (4):
Figure BDA0002947126210000024
wherein P(S'|S) represents the probability of the target text S' given the original text S, as defined in formula (3), and P(S|S) is used to normalize the different distributions;
S24, screening the text adversarial samples from the target texts according to the semantic similarity.
Further, generating the image adversarial samples comprises:
generating the image adversarial samples by an iterative gradient attack method, calculated according to formula (5):
Figure BDA0002947126210000025
wherein I_adv represents the adversarial sample of image I, S represents the target text, θ represents the parameters of the image description generation model, L(θ, I, S) represents the loss function of the image description generation model, α represents the perturbation weight, N represents the number of iterations, and the Clip() function replaces overflowed values with boundary values.
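Formula (5) is only available as an image. The symbols defined above match the widely used iterative sign-gradient attack, so one plausible reading is sketched below. This is an assumption rather than the published formula: the initialization with the clean image, the sign function, and the clipping radius ε (a hyper-parameter named in the detailed description) are all part of that assumed form.

```latex
% Assumed form of formula (5): iterative gradient attack with clipping
I_{adv}^{0} = I, \qquad
I_{adv}^{N+1} = \mathrm{Clip}_{I,\epsilon}\!\left\{ I_{adv}^{N}
  + \alpha \,\mathrm{sign}\!\left( \nabla_{I} L\left(\theta, I_{adv}^{N}, S\right) \right) \right\}
```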
Further, expanding the image-text pairs comprises:
obtaining two types of image adversarial samples, one from the text and one from the text adversarial sample, and combining the images and texts pairwise to obtain the expanded image-text pairs.
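The pairwise combination can be illustrated with a Cartesian product. This is a minimal sketch: the function name `expand_pairs` and the string labels are hypothetical placeholders standing in for real image tensors and sentences.

```python
from itertools import product


def expand_pairs(image_adv_samples, texts):
    """Combine every adversarial image with every text (Cartesian product)."""
    return list(product(image_adv_samples, texts))


# Placeholder labels: two adversarial images, the original text and its adversarial text.
pairs = expand_pairs(["I_sc", "I_sadv"], ["S", "S_adv"])
print(pairs)
```

Two image variants and two text variants give four expanded pairs per original image-text pair.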
Further, the joint loss function loss is:
Figure BDA0002947126210000031
wherein L(θ, I, S) represents the loss function on the original data, ω represents the weight controlling the relative contribution of the adversarial samples in the loss function, S represents the original text, S_adv represents the text adversarial sample, I_sc represents the image adversarial sample corresponding to the original text, and I_sadv represents the image adversarial sample corresponding to the text adversarial sample.
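The joint loss formula itself is an image that is not reproduced here. Given that the text describes a loss on the original data plus adversarial-sample terms weighted by ω, one plausible form is shown below. The exact placement of ω and the unweighted sum over the four expanded pairs are assumptions, not the published formula.

```latex
% Assumed form of the joint loss: clean loss plus weighted adversarial-pair losses
\mathrm{loss} = L(\theta, I, S)
  + \omega \left[ L(\theta, I_{sc}, S) + L(\theta, I_{sadv}, S)
  + L(\theta, I_{sc}, S_{adv}) + L(\theta, I_{sadv}, S_{adv}) \right]
```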
The invention also provides a cross-language description oriented antagonism data enhancement system for implementing the method, comprising:
a data input module for reading in a clean image-text pair data set;
an image enhancement module for augmenting images by an iterative gradient attack algorithm;
a text enhancement module for augmenting text by a sequence-to-sequence model;
a network training module for, in the adversarial training stage, augmenting images with the image enhancement module so as to expand the image-text pairs, and training on the expanded image-text pair data set with minimization of the joint loss function as the optimization target; and, in the remaining training periods, training on the clean image-text pair data set with minimization of the loss function as the optimization target.
Further, the iterative gradient attack algorithm generates adversarial samples from attack noise derived from the network gradients; the joint loss function is a weighted combination of the loss functions of the expanded image-text pairs.
The present invention also provides a computer storage medium having stored therein a computer program executable by a computer processor, the computer program performing the cross-language description oriented antagonism data enhancement method according to any one of claims 1-7.
The beneficial effects of the invention are as follows: the cross-language description oriented antagonism data enhancement method, system and storage medium expand the data set through an easy-to-operate form of data enhancement and improve the robustness and performance of the image description generation model.
Drawings
FIG. 1 is a flow chart of the cross-language description oriented antagonism data enhancement method of the present invention.
Fig. 2 is a diagram of a network training framework according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of the cross-language description oriented antagonism data enhancement system of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
The invention discloses a cross-language image description oriented antagonism data enhancement method. First, for images, a gradient attack algorithm generates adversarial samples by adding perturbations that are as small as possible. Second, for text, a sequence-to-sequence approach generates adversarial samples with the same semantics as the original sentence. Finally, the adversarial samples are added to network training as additional samples alongside the clean samples; the text adversarial samples are generated before training, while the image adversarial samples are generated continuously during training. Each image-text pair yields four additional data pairs per generation pass, which effectively increases the scale of the data set and can improve image description performance for models trained on small-scale data sets.
The cross-language description oriented antagonism data enhancement method of the embodiment of the invention, as shown in fig. 1, comprises the following steps:
S1, acquiring a clean image-text pair data set. Taking Flickr8k as an example, the images come from Flickr, Yahoo's photo-sharing website, and the data set contains 8,000 images. Most of the images show people taking part in some activity, and each image is manually annotated with five English sentences.
S2, generating text adversarial samples by using the sequence-to-sequence model.
In the embodiment of the present invention, the step S2 may be implemented by the following steps:
S21, converting English into Chinese according to formula (1):
Figure BDA0002947126210000041
wherein S is the original text, S' is the target text, P represents a probability distribution, w'_t represents the t-th token of the target text, and n represents the number of tokens in the target text;
S22, obtaining the K best translations of the original text and converting the K best Chinese sentences into a sentence-generation probability distribution over the target vocabulary, calculated according to formula (2):
Figure BDA0002947126210000042
wherein K represents the number of target texts obtained for a single original text, C represents an obtained Chinese sentence, E represents the English sentence, and ω represents a token in the target vocabulary;
wherein the probability distribution P(C|E) of a sentence is calculated according to formula (3):
Figure BDA0002947126210000043
wherein m represents the number of tokens in the final Chinese sentence.
S23, evaluating the semantic similarity score between the generated text and the original text according to formula (4):
Figure BDA0002947126210000044
wherein P(S'|S) is the probability of the target text S' given the original text S, as defined in formula (3), and P(S|S) is used to normalize the different distributions.
S24, screening text adversarial samples that meet the requirements from the target texts according to the semantic similarity, and removing Chinese sentences with poor semantic similarity.
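Steps S22 to S24 can be sketched in a few lines. The example assumes that the sentence probability is the product of per-token probabilities and that the similarity score is the ratio P(S'|S) / P(S|S); since formula (4) is not reproduced in this text, that ratio is an assumption. All function names, candidate labels, probabilities and the threshold are hypothetical.

```python
import math


def sentence_log_prob(token_probs):
    """Log-probability of a sentence as the sum of per-token log-probabilities."""
    return sum(math.log(p) for p in token_probs)


def similarity_score(log_p_target, log_p_reference):
    """Assumed score P(S'|S) / P(S|S), computed in log space for stability."""
    return math.exp(log_p_target - log_p_reference)


def filter_candidates(candidates, log_p_reference, threshold):
    """Keep the candidates whose assumed similarity score reaches the threshold."""
    kept = []
    for text, token_probs in candidates:
        score = similarity_score(sentence_log_prob(token_probs), log_p_reference)
        if score >= threshold:
            kept.append((text, score))
    return kept


# Toy data: K = 3 hypothetical candidate translations with per-token probabilities.
candidates = [
    ("candidate_a", [0.9, 0.8, 0.9]),
    ("candidate_b", [0.5, 0.4, 0.3]),
    ("candidate_c", [0.8, 0.9, 0.7]),
]
log_p_ref = sentence_log_prob([0.95, 0.9, 0.95])  # stands in for P(S|S)
kept = filter_candidates(candidates, log_p_ref, threshold=0.5)
print([text for text, _ in kept])
```

Working in log space avoids numerical underflow when sentences are long; the threshold would in practice be tuned on held-out data.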
S3, training the image description generation model:
In this embodiment, training is based on a framework combining a convolutional neural network (CNN) and a long short-term memory network (LSTM).
(1) If the current training stage is an adversarial training stage, image adversarial samples are generated, the image-text pairs are expanded accordingly, the model is trained on the expanded image-text pair data set, and the model is optimized according to a joint loss function. The image adversarial samples are generated by a gradient attack algorithm.
S31, generating the image adversarial samples by an iterative gradient attack method, calculated according to formula (5):
Figure BDA0002947126210000051
I_adv is the adversarial sample of image I, S represents the original text, θ denotes the parameters of the image description generation model, and L(θ, I, S) represents the loss function of the image description generation model. The attack back-propagates the gradient to the input image features to compute
Figure BDA0002947126210000052
and uses it to update the adversarial image, adjusting it in small steps so as to maximize the loss. α is the perturbation weight, which controls the amplitude of the attack noise: the larger its value, the stronger the attack and the more easily the noise can be seen by the naked eye. N represents the number of iterations and is set to 2 here to save training time and computation. The Clip() function replaces overflowed values with boundary values: as the number of iterations increases, some pixel values overflow (exceed the boundary range) and must be replaced with the boundary values so that usable adversarial samples are still produced. The boundary values are set to
Figure BDA0002947126210000053
Both α and ε are hyper-parameters, set to 0.0625 and 0.3, respectively;
S32, from the initial text S and the adversarial text S_adv, two types of enhanced image samples are obtained, denoted I_sc and I_sadv respectively; combining images and texts pairwise yields four types of expanded data pairs: (I_sc, S), (I_sadv, S), (I_sc, S_adv) and (I_sadv, S_adv);
S33, updating the model parameters with the objective of minimizing the joint loss function, which is calculated according to equation (6):
Figure BDA0002947126210000054
wherein L(θ, I, S) is the loss function on the initial data, and ω is the weight controlling the relative contribution of the adversarial samples in the loss function.
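The iterative gradient attack of step S31 can be sketched with NumPy. This is an illustrative sketch, not the patented implementation: the sign-gradient update, the [0, 1] pixel range and the toy quadratic loss standing in for the captioning loss are assumptions, and `iterative_gradient_attack` and `grad_fn` are hypothetical names. The values α = 0.0625, ε = 0.3 and N = 2 are taken from the text.

```python
import numpy as np


def iterative_gradient_attack(image, grad_fn, alpha=0.0625, eps=0.3, n_iter=2,
                              pixel_min=0.0, pixel_max=1.0):
    """Step along the sign of the loss gradient n_iter times, clipping after each step."""
    adv = image.copy()
    for _ in range(n_iter):
        adv = adv + alpha * np.sign(grad_fn(adv))
        # Keep the perturbation inside the eps-ball around the clean image,
        # then inside the valid pixel range (assumed here to be [0, 1]).
        adv = np.clip(adv, image - eps, image + eps)
        adv = np.clip(adv, pixel_min, pixel_max)
    return adv


# Toy stand-in for the captioning loss: L(I) = 0.5 * ||I - t||^2, so dL/dI = I - t.
target = np.array([0.2, 0.5, 0.9])


def grad_fn(img):
    return img - target


clean = np.array([0.1, 0.6, 0.95])
adv = iterative_gradient_attack(clean, grad_fn)
print(adv)
```

In a real pipeline `grad_fn` would be replaced by back-propagation through the captioning model to the input image; the clipping steps correspond to the role the text assigns to Clip().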
(2) If the current training stage is a non-adversarial training stage, the model is trained on the clean image-text pair data set and optimized by minimizing the loss function L(θ, I, S).
S4, obtaining the weights of the trained image description generation model and the expanded image-text pair data set.
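The switch between the two training stages in step S3 amounts to choosing which objective to minimize. The sketch below assumes a joint loss of the form clean loss plus ω times the summed losses of the four expanded pairs; formula (6) is not reproduced in this text, so that form, the value ω = 0.5 and the per-pair loss values are illustrative assumptions, and the function names are hypothetical.

```python
def joint_loss(clean_loss, adversarial_losses, omega):
    """Assumed weighted form: clean loss plus omega times the summed adversarial losses."""
    return clean_loss + omega * sum(adversarial_losses)


def training_loss(is_adversarial_stage, clean_loss, adversarial_losses, omega=0.5):
    """Select the objective for the current training stage."""
    if is_adversarial_stage:
        return joint_loss(clean_loss, adversarial_losses, omega)
    return clean_loss


# Hypothetical losses for (I_sc, S), (I_sadv, S), (I_sc, S_adv), (I_sadv, S_adv).
adv_losses = [2.0, 2.5, 3.0, 3.5]
adversarial_objective = training_loss(True, 1.0, adv_losses)
clean_objective = training_loss(False, 1.0, adv_losses)
print(adversarial_objective, clean_objective)
```

How training alternates between adversarial and non-adversarial stages is not specified in the text, so the stage flag is simply passed in by the caller.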
Test examples: the testing link selects Flickr8k-cn as a training data set. Each test image in Flickr8k-cn is associated with five Chinese texts, which are obtained by manually translating a corresponding number of English texts in Flickr8 k. The text countermeasure sample is obtained with the corresponding english sentence in Flickr8k as input.
The performance indexes widely used in NLP, namely BLEU-4, ROUGE-L and CIDER are adopted. As shown in FIG. 2, for the Chinese image description model, the CNN + LSTM method is followed. To extract image features, we used pre-trained ResNet-152, which obtained the latest results for image classification and detection in both ImageNet and COCO competitions. The image features are 2048-dimensional vectors from the ReLU after the pool5 layer. The extracted features were normalized by L2. The size of the image and word embedding and the hidden size of the LSTM are set to 512. The initial learning rate η is set to 0.001, and the rate is attenuated every ten cycles with an attenuation weight of 0.999.
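Two of the settings above, L2 normalization of the 2048-dimensional feature and the learning-rate decay, can be written down directly. This is a minimal sketch under stated assumptions: a "cycle" is read as an epoch and the decay as a staircase schedule, the random vector is a stand-in for a real ResNet-152 feature, and the function names are hypothetical.

```python
import numpy as np


def l2_normalize(features, eps=1e-12):
    """Scale a feature vector to unit Euclidean norm (eps guards against division by zero)."""
    return features / max(np.linalg.norm(features), eps)


def learning_rate(epoch, base_lr=0.001, decay=0.999, step=10):
    """Staircase decay: multiply the rate by `decay` once every `step` epochs."""
    return base_lr * decay ** (epoch // step)


# Random stand-in for a 2048-dimensional image feature vector.
features = np.random.default_rng(0).random(2048)
unit = l2_normalize(features)
print(np.linalg.norm(unit), learning_rate(0), learning_rate(25))
```

With a decay weight of 0.999 applied every ten epochs, the learning rate changes very slowly, so it stays close to 0.001 throughout a typical training run.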
Experimental results show that the method can markedly improve the performance of the cross-language image description model while effectively expanding the data set. Experimental comparisons are provided below to illustrate the effectiveness and advantages of the method. As shown in Tables 1, 2 and 3, the method of the invention improves all three metrics compared with the other methods, and the improvement is more pronounced on small data sets.
TABLE 1 comparison of results of experiments on Flickr8k-cn data set with different data enhancement methods
Figure BDA0002947126210000061
TABLE 2 comparison of results of experiments on Flickr8k-cn data set using the method for different models
Figure BDA0002947126210000062
Figure BDA0002947126210000071
TABLE 3 comparison of results of experiments on different scale data sets with the process of the invention
Figure BDA0002947126210000072
The present invention also provides a cross-language description oriented antagonism data enhancement system for implementing the cross-language description oriented antagonism data enhancement method, as shown in FIG. 3, comprising:
a data input module 101 for reading in a clean image-text pair data set;
an image enhancement module 102 for augmenting images by an iterative gradient attack algorithm, which generates adversarial samples from attack noise derived from the network gradients;
a text enhancement module 103 for augmenting text by a sequence-to-sequence model;
a network training module 104 for, in the adversarial training stage, augmenting images with the image enhancement module so as to expand the image-text pairs, and training on the expanded image-text pair data set with minimization of the joint loss function as the optimization target; and, in the remaining training periods, training on the clean image-text pair data set with minimization of the loss function as the optimization target. The joint loss function is a weighted combination of the loss functions of the expanded image-text pairs.
Based on the above cross-language image description oriented antagonism data enhancement method, the invention also provides a computer storage medium. The method may be implemented in hardware or firmware, as software or computer code storable in a recording medium such as a CD-ROM, RAM, floppy disk, hard disk or magneto-optical disk, or as computer code originally stored in a remote recording medium or a non-transitory machine-readable medium, downloaded through a network and stored in a local recording medium, so that the method described herein can be carried out by such software stored on a recording medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware such as an ASIC or FPGA. It will be appreciated that the computer, processor, microprocessor controller or programmable hardware includes memory components (e.g., RAM, ROM, flash memory) that can store or receive software or computer code which, when accessed and executed by the computer, processor or hardware, implements the cross-language description oriented antagonism data enhancement method described herein. Further, when a general-purpose computer accesses code for implementing the processes shown herein, execution of the code transforms the general-purpose computer into a special-purpose computer for performing those processes.
It should be noted that, according to the implementation requirement, each step/component described in the present application can be divided into more steps/components, and two or more steps/components or partial operations of the steps/components can be combined into new steps/components to achieve the purpose of the present invention.
It will be understood by those skilled in the art that the foregoing is merely a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included within the scope of the present invention.

Claims (10)

1. A cross-language description oriented antagonism data enhancement method is characterized by comprising the following steps:
S1, acquiring a clean image-text pair data set;
S2, generating text adversarial samples by using a sequence-to-sequence model;
S3, training the image description generation model:
if the current training stage is an adversarial training stage, generating image adversarial samples, expanding the image-text pairs accordingly, training the model on the expanded image-text pair data set, and optimizing the model according to a joint loss function;
if the current training stage is a non-adversarial training stage, training the model on the clean image-text pair data set, and optimizing the model according to the loss function;
S4, obtaining the trained image description generation model and the expanded image-text pair data set.
2. The cross-language description oriented antagonism data enhancement method according to claim 1, wherein the image adversarial samples are generated by a gradient attack algorithm.
3. The cross-language description oriented antagonism data enhancement method according to claim 1, wherein step S2 specifically includes:
S21, converting the original text into a target text according to formula (1):
Figure FDA0002947126200000011
wherein S represents the original text, S' represents the target text, P represents a probability distribution, w'_t represents the t-th token of the target text, and n represents the number of tokens in the target text;
S22, obtaining the K best target-text translations of the original text and converting them into a sentence-generation probability distribution over the target vocabulary, calculated according to formula (2):
Figure FDA0002947126200000012
wherein K represents the number of target texts obtained for a single original text, ω represents a token in the target vocabulary, C represents an obtained target text, and E represents the original text;
wherein the probability distribution of a sentence is calculated according to formula (3):
Figure FDA0002947126200000013
wherein m represents the number of tokens in the target text.
4. The cross-language description oriented antagonism data enhancement method according to claim 3, wherein the step S2 further comprises:
S23, evaluating the semantic similarity between the generated target text and the original text according to formula (4):
Figure FDA0002947126200000021
wherein P(S'|S) represents the probability of the target text S' given the original text S, as defined in formula (3), and P(S|S) is used to normalize the different distributions;
S24, screening the text adversarial samples from the target texts according to the semantic similarity.
5. The cross-language description oriented antagonism data enhancement method of claim 1, wherein generating the image adversarial samples comprises:
generating the image adversarial samples by an iterative gradient attack method, calculated according to formula (5):
Figure FDA0002947126200000022
wherein I_adv represents the adversarial sample of image I, S represents the target text, θ represents the parameters of the image description generation model, L(θ, I, S) represents the loss function of the image description generation model, α represents the perturbation weight, N represents the number of iterations, and the Clip() function replaces overflowed values with boundary values.
6. The cross-language description oriented adversarial data enhancement method of claim 1, characterized in that augmenting image-text pairs comprises:
obtaining two types of image adversarial samples, one from the text and one from the text adversarial sample, and combining the images and texts pairwise to obtain the expanded image-text pairs.
7. The cross-language description-oriented antagonism data enhancement method according to claim 1, wherein the joint loss function loss is:
Figure FDA0002947126200000023
wherein L(θ, I, S) represents the loss function on the original data, ω represents the weight controlling the relative contribution of the adversarial samples in the loss function, S represents the original text, S_adv represents the text adversarial sample, I_sc represents the image adversarial sample corresponding to the original text, and I_sadv represents the image adversarial sample corresponding to the text adversarial sample.
8. A cross-language description oriented antagonism data enhancement system for implementing a cross-language description oriented antagonism data enhancement method, comprising:
a data input module for reading in a clean image-text pair data set;
an image enhancement module for augmenting images by an iterative gradient attack algorithm;
a text enhancement module for augmenting text by a sequence-to-sequence model;
a network training module for, in the adversarial training stage, augmenting images with the image enhancement module so as to expand the image-text pairs, and training on the expanded image-text pair data set with minimization of the joint loss function as the optimization target; and, in the remaining training periods, training on the clean image-text pair data set with minimization of the loss function as the optimization target.
9. The cross-language description oriented adversarial data enhancement system of claim 8, characterized in that the iterative gradient attack algorithm generates adversarial samples from attack noise derived from the network gradients; the joint loss function is a weighted combination of the loss functions of the expanded image-text pairs.
10. A computer storage medium, characterized in that: stored within it is a computer program executable by a computer processor, the computer program performing the cross-language description oriented antagonism data enhancement method according to any one of claims 1-7.
CN202110198513.8A 2021-02-22 2021-02-22 Cross-language description oriented antagonism data enhancement method, system and storage medium Pending CN112819091A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110198513.8A CN112819091A (en) 2021-02-22 2021-02-22 Cross-language description oriented antagonism data enhancement method, system and storage medium

Publications (1)

Publication Number Publication Date
CN112819091A true CN112819091A (en) 2021-05-18

Family

ID=75864766

Family Applications (1)

Application Number Priority Date Filing Date Title
CN202110198513.8A Pending CN112819091A (en) 2021-02-22 2021-02-22 Cross-language description oriented antagonism data enhancement method, system and storage medium

Country Status (1)

Country Link
CN (1) CN112819091A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107480144A (en) * 2017-08-03 2017-12-15 中国人民大学 Possess the image natural language description generation method and device across language learning ability
CN112364138A (en) * 2020-10-12 2021-02-12 上海交通大学 Visual question-answer data enhancement method and device based on anti-attack technology

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YU XIAO et al.: "An Improved Method of Cross-Lingual Image Caption Based on Fluency-Guided", 2020 THE 5TH INTERNATIONAL CONFERENCE ON CONTROL, ROBOTICS AND CYBERNETICS *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113536006A (en) * 2021-06-25 2021-10-22 北京百度网讯科技有限公司 Method, device, equipment, storage medium and computer product for generating pictures
CN113627567A (en) * 2021-08-24 2021-11-09 北京达佳互联信息技术有限公司 Picture processing method, text processing method, related equipment and storage medium
CN113627567B (en) * 2021-08-24 2024-04-02 北京达佳互联信息技术有限公司 Picture processing method, text processing method, related device and storage medium
CN114372537A (en) * 2022-01-17 2022-04-19 浙江大学 Image description system-oriented universal countermeasure patch generation method and system
CN114372537B (en) * 2022-01-17 2022-10-21 浙江大学 Image description system-oriented universal countermeasure patch generation method and system
CN116229442A (en) * 2023-01-03 2023-06-06 武汉工程大学 Text image synthesis and instantiation weight transfer learning method

Similar Documents

Publication Publication Date Title
CN110110585B (en) Intelligent paper reading implementation method and system based on deep learning and computer program
CN112819091A (en) Cross-language description oriented antagonism data enhancement method, system and storage medium
CN108984530B (en) Detection method and detection system for network sensitive content
CN106547735B (en) Construction and use method of context-aware dynamic word or word vector based on deep learning
CN113254599B (en) Multi-label microblog text classification method based on semi-supervised learning
US8433556B2 (en) Semi-supervised training for statistical word alignment
CN109960804B (en) Method and device for generating topic text sentence vector
CN105183720B (en) Machine translation method and device based on RNN model
Chen et al. Zero-resource neural machine translation with multi-agent communication game
CN110033008B (en) Image description generation method based on modal transformation and text induction
CN110188775B (en) Image content description automatic generation method based on joint neural network model
CN108763539B (en) Text classification method and system based on part-of-speech classification
Chen et al. Improving distributed representation of word sense via wordnet gloss composition and context clustering
Wei et al. Uncertainty-aware semantic augmentation for neural machine translation
CN112016271A (en) Language style conversion model training method, text processing method and device
Kišš et al. AT-ST: self-training adaptation strategy for OCR in domains with limited transcriptions
US20200004819A1 (en) Predicting probablity of occurrence of a string using sequence of vectors
Gao et al. Generating natural adversarial examples with universal perturbations for text classification
Maharana et al. Adversarial augmentation policy search for domain and cross-lingual generalization in reading comprehension
JP2016224483A (en) Model learning device, method and program
CN112434686B (en) End-to-end misplaced text classification identifier for OCR (optical character) pictures
CN110610006B (en) Morphological double-channel Chinese word embedding method based on strokes and fonts
CN112836525A (en) Human-computer interaction based machine translation system and automatic optimization method thereof
Sosun et al. Deep sentiment analysis with data augmentation in distance education during the pandemic
Namysl et al. Empirical error modeling improves robustness of noisy neural sequence labeling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination