CN112861179A - Method for desensitizing personal digital spatial data based on text-generated countermeasure network - Google Patents
- Publication number
- CN112861179A
- Authority
- CN
- China
- Prior art keywords
- text
- data
- sequence
- personal digital
- desensitization
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioethics (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Computer Hardware Design (AREA)
- Computer Security & Cryptography (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a method for desensitizing personal digital space data based on a text generative adversarial network, comprising the following steps: S1: acquiring a data file to be desensitized from a personal digital space, and constructing a text generative adversarial network model; S2: parsing the data file to be desensitized to obtain a parsed file containing sensitive information; S3: inputting the parsed file as source data into the text generative adversarial network model for training; S4: judging whether the trained text generative adversarial network model has converged; if so, obtaining desensitized text data with the same statistical characteristics as the source data; if not, returning to step S3. The method addresses the problem that existing data desensitization techniques, when applied in a medical scenario, can change the structured format of the medical source data.
Description
Technical Field
The invention relates to the technical field of data desensitization processing, and in particular to a method for desensitizing personal digital space data based on a text generative adversarial network.
Background
Data desensitization is a data processing technique that reduces or removes the sensitivity of data. Adopting data desensitization reduces the risk and harm of data leakage and effectively protects the privacy of user data. In the Internet healthcare field, users can store, view, and share personal medical and health data through a personal digital space. However, this data is at risk of leaking sensitive medical information during online consultations, online medicine purchases, outpatient appointment booking, and similar processes. User data in the medical industry is highly authentic and highly sensitive, and once a user's sensitive personal information is leaked, it may pose a potential threat to the user's life. With data desensitization, information in the personal digital space can be used for business-related analysis and processing while avoiding leakage of user data.
Existing data desensitization methods usually rely on techniques such as masking or generalization to protect private data while preserving its usability, so that the desensitized data can continue to be used in application scenarios such as development and testing, data mining, and data distribution. Common techniques include: data replacement, which replaces the data in sensitive information with random data; data shuffling, which exchanges values between rows of the source data; numerical transformation, which transforms numerical data such as age and time; data occlusion, which replaces or alters sensitive data with special symbols such as "+, NULL"; data deletion, which deletes and clears sensitive data; and data generalization, which represents data at a coarser granularity than its original one, widening the range it represents and eliminating sensitive information. However, when these existing techniques are applied in a medical scenario, they can change the structured format of the medical source data, and therefore cannot meet the requirements for desensitizing and protecting users' sensitive medical information.
In the prior art, for example, the Chinese patent "A data desensitization method, apparatus, device, and computer readable storage medium" (publication number CN110135193A), published on August 16, 2019, maximizes the degree of data desensitization, ensures that private information is not revealed, and effectively improves the practicability of the desensitized data; however, it does not evaluate complete sequences, and it changes the structured format of the source data.
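As an illustration of the conventional techniques listed in this background, the following is a minimal sketch of occlusion, generalization, and shuffling. The function names, the `*` masking symbol, the ten-year age bucket, and the sample records are all assumptions chosen for illustration; none of them come from the patent.

```python
import random

def mask(value: str, keep_last: int = 4, symbol: str = "*") -> str:
    """Data occlusion: hide all but the last `keep_last` characters."""
    hidden = max(len(value) - keep_last, 0)
    return symbol * hidden + value[hidden:]

def generalize_age(age: int, width: int = 10) -> str:
    """Data generalization: replace an exact age with a coarser range."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def shuffle_column(rows, key, seed=0):
    """Data shuffling: permute one column's values across rows."""
    values = [row[key] for row in rows]
    random.Random(seed).shuffle(values)
    return [{**row, key: value} for row, value in zip(rows, values)]

# Hypothetical records, not taken from the patent.
records = [
    {"name": "patient-a", "phone": "13800001234", "age": 47},
    {"name": "patient-b", "phone": "13900005678", "age": 62},
]
masked = [
    {**r, "phone": mask(r["phone"]), "age": generalize_age(r["age"])}
    for r in records
]
shuffled = shuffle_column(records, "phone")
```

Note that generalization turns the integer `age` field into a string range, which is one small instance of the format change that the background describes as a drawback of these techniques.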
Disclosure of Invention
The invention provides a method for desensitizing personal digital space data based on a text generative adversarial network, aiming to overcome the technical defect that the structured format of medical source data can be changed when existing data desensitization techniques are applied in a medical scenario.
In order to solve the technical problems, the technical scheme of the invention is as follows:
A method for desensitizing personal digital space data based on a text generative adversarial network, comprising the following steps:
S1: acquiring a data file to be desensitized from a personal digital space, and
constructing a text generative adversarial network model;
S2: parsing the data file to be desensitized to obtain a parsed file containing sensitive information;
S3: inputting the parsed file as source data into the text generative adversarial network model for training;
S4: judging whether the trained text generative adversarial network model has converged;
if so, desensitized text data with the same statistical characteristics as the source data is obtained;
if not, the process returns to step S3.
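The loop between steps S3 and S4 can be sketched as follows. The patent does not give a numerical convergence test, so the criterion used here (the discriminator error changing by less than a tolerance between rounds) is an assumption, as are the names `train_until_converged`, `train_step`, `tolerance`, and `max_rounds`.

```python
from typing import Callable

def train_until_converged(train_step: Callable[[], float],
                          tolerance: float = 1e-3,
                          max_rounds: int = 1000) -> int:
    """Repeat S3 until the S4 check passes; return the number of rounds run.

    `train_step` runs one round of training (S3) and returns the
    discriminator error. Convergence (S4) is assumed to mean that the
    error changed by less than `tolerance` since the previous round.
    """
    previous = float("inf")
    for round_index in range(1, max_rounds + 1):
        error = train_step()
        if abs(previous - error) < tolerance:
            return round_index
        previous = error
    return max_rounds

# Stand-in training step that replays a fixed error curve.
errors = iter([1.0, 0.5, 0.25, 0.2501, 0.25])
rounds_run = train_until_converged(lambda: next(errors))
```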
Preferably, the data file to be desensitized is based on semi-structured medical information data in a distributed database.
Preferably, the text generative adversarial network model comprises a generator and a discriminator.
Preferably, the generator generates sequences using a recurrent neural network.
Preferably, the discriminator uses a convolutional neural network to discriminate the sequences generated by the generator.
Preferably, in step S3, the text generative adversarial network model is trained in combination with a Monte Carlo search strategy.
Preferably, the specific steps of training the text generative adversarial network model are as follows:
the vectors obtained by encoding the words of the source data are input into the embedding layer of the recurrent neural network to obtain the embedding layer vectors $x_1, \ldots, x_T$, and the hidden layer vectors $h_1, \ldots, h_T$ are output, giving
$$h_t = R(h_{t-1}, x_t)$$
where $h_{t-1}$ is the hidden layer vector of the previous state, and $h_t$ and $x_t$ are the hidden layer vector and the embedding layer vector of the current state, respectively; $t \in \{1, \ldots, T\}$ is the word vector index, and $R$ is the RNN;
the hidden layer vector is passed through the softmax layer of the recurrent neural network to obtain the distribution probability of $y_t$ in the sequence $Y_{1:t}$ generated up to the current state:
$$p(y_t \mid x_1, \ldots, x_t) = \mathrm{softmax}(b + W h_t)$$
where $b$ is the bias vector, $W$ is the weight matrix, and $y_t$ is the $t$-th element of the sequence $Y_{1:t}$ of length $t$;
the reward $Q$ for the current sentence is denoted as
$$Q = D(Y_{1:t})$$
For an $N$-time Monte Carlo search, it is expressed as
Running the Monte Carlo search strategy yields $N$ output sequences from the current state to the end of the sequence, thus obtaining a more accurate reward $Q$, denoted as
For each sequence, the embedding layer vectors $x_1, \ldots, x_T$ are concatenated to represent the current sequence
The sequence vector $d_{1:T}$ is convolved with the convolution kernel $\omega$
where the product is taken between corresponding positions, $\rho$ is a nonlinear function, and $c_i$ is the output value of the convolutional layer;
after the pooling layer, $c = \max(c_1, \ldots, c_{T-l+1})$ is obtained (with $l$ the window length of the convolution kernel), and the sigmoid function of the fully connected layer outputs the probability that the sequence is judged to be real, namely the reward $Q$;
the parameters of the generator are updated according to the magnitude of the reward $Q$, thereby reducing the loss of the generated sentence; training is repeated cyclically, and the model converges when the error of the discriminator is minimal.
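The generator recurrence, the softmax output, and the Monte Carlo reward estimate can be sketched in plain Python as below. Several things here are assumptions rather than details confirmed by the text: $R$ is taken to be a simple tanh (Elman) cell, the $N$ rollout scores are averaged to form $Q$ (as in the commonly used SeqGAN formulation), and all parameter values and the toy discriminator are hypothetical stand-ins.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(matrix, vector):
    """Plain-Python matrix-vector product."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

def rnn_step(h_prev, x_t, U, V):
    """h_t = R(h_{t-1}, x_t); R is assumed here to be a tanh (Elman) cell."""
    return [math.tanh(a + c) for a, c in zip(matvec(U, h_prev), matvec(V, x_t))]

def make_policy(embeddings, U, V, W, b):
    """Build a function mapping a token prefix to softmax(b + W h_t)."""
    def next_token_probs(prefix):
        h = [0.0] * len(U)
        for token in prefix:
            h = rnn_step(h, embeddings[token], U, V)
        return softmax([bi + zi for bi, zi in zip(b, matvec(W, h))])
    return next_token_probs

def rollout_reward(prefix, seq_len, next_token_probs, discriminator, n_rollouts, rng):
    """Estimate Q: complete the prefix n_rollouts times and average D (assumed)."""
    if len(prefix) >= seq_len:
        return discriminator(list(prefix))
    total = 0.0
    for _ in range(n_rollouts):
        seq = list(prefix)
        while len(seq) < seq_len:
            probs = next_token_probs(seq)
            seq.append(rng.choices(range(len(probs)), weights=probs, k=1)[0])
        total += discriminator(seq)
    return total / n_rollouts

# Toy parameters (hypothetical): 3 tokens, 2-dim embeddings, 2 hidden units.
EMBEDDINGS = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
U = [[0.5, 0.0], [0.0, 0.5]]
V = [[1.0, 0.0], [0.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
B = [0.0, 0.0, 0.0]

policy = make_policy(EMBEDDINGS, U, V, W, B)

def toy_discriminator(seq):
    """Stand-in for D: fraction of positions holding token 0."""
    return sum(1 for tok in seq if tok == 0) / len(seq)

q_estimate = rollout_reward([0, 1], 6, policy, toy_discriminator,
                            n_rollouts=16, rng=random.Random(0))
```

In a policy-gradient update, `q_estimate` would weight the log-probability of the token chosen at the current step; that update is omitted here to keep the sketch short.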
Preferably, the loss of the current sentence is obtained by computing the binary cross entropy based on the output distribution of the discriminator, specifically: let $p$ be the probability of state 1 of the output $P$, $1-p$ the probability of state 0 of the output $P$, $q$ the probability of state 1 of the input $Q$, and $1-q$ the probability of state 0 of the input $Q$; then the cross entropy of $P$ and $Q$ is
$$H(P \mid Q) = -\bigl(p \log q + (1-p)\log(1-q)\bigr)$$
Preferably, for the generated sequence, when the generator generates a false sequence, the cross entropy when the discriminator judges it to be true is
$$\mathrm{loss} = -\bigl(1 \cdot \log D(Y_{1:T}) + 0 \cdot \log(1 - D(Y_{1:T}))\bigr) = -\log D(Y_{1:T})$$
For the discriminated sequence, the discriminator identifies the true source of the sequence; when a sequence is true, the cross entropy when the discriminator judges it to be true is
When a sequence is false, the cross entropy when the discriminator judges it to be false is
The minimum cross entropy is calculated by the following formula:
Preferably, the same statistical characteristics means that the proportion of digits or characters in the text is the same.
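One way to check this criterion is to compare the share of digits and letters in the source and desensitized texts. This is a sketch under an assumed reading of "proportion of digits or characters"; the three character classes, the `tolerance` parameter, and the sample strings are hypothetical.

```python
def char_class_proportions(text: str) -> dict:
    """Return the share of digits, letters, and other characters in `text`."""
    counts = {"digit": 0, "alpha": 0, "other": 0}
    for ch in text:
        if ch.isdigit():
            counts["digit"] += 1
        elif ch.isalpha():
            counts["alpha"] += 1
        else:
            counts["other"] += 1
    total = len(text) or 1  # avoid division by zero on empty input
    return {name: count / total for name, count in counts.items()}

def same_statistics(source: str, desensitized: str, tolerance: float = 0.0) -> bool:
    """True if every character-class share differs by at most `tolerance`."""
    a = char_class_proportions(source)
    b = char_class_proportions(desensitized)
    return all(abs(a[name] - b[name]) <= tolerance for name in a)

# Hypothetical before/after records with identical structure.
source_text = '{"id": "440101", "name": "Li"}'
desensitized_text = '{"id": "982735", "name": "Qu"}'
matches = same_statistics(source_text, desensitized_text)
```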
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
The invention provides a method for desensitizing personal digital space data based on a text generative adversarial network. By training the text generative adversarial network model, desensitized data is generated that has the same statistical characteristics and structure as the parsed file containing sensitive information. This realizes data desensitization of structured text information and achieves a good text desensitization effect without affecting the structure of the data in the personal digital space.
Drawings
FIG. 1 is a flow chart of the steps for implementing the technical solution of the present invention;
FIG. 2 is a flow chart of the desensitization workflow of the text generative adversarial network model in the present invention;
FIG. 3 is a network diagram of the generator in the present invention;
FIG. 4 is a network diagram of the discriminator in the present invention;
FIG. 5 is a schematic structural diagram of the text generative adversarial network model in the present invention;
FIG. 6 is a comparison of data before and after desensitization using Gaussian-distributed random numbers in the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent;
for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product;
it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
Example 1
As shown in FIGS. 1-2, a method for desensitizing personal digital space data based on a text generative adversarial network comprises the following steps:
S1: acquiring a data file to be desensitized from a personal digital space;
more specifically, the data file to be desensitized is based on semi-structured medical information data in a distributed database;
constructing a text generative adversarial network model;
more specifically, the text generative adversarial network model comprises a generator and a discriminator;
more specifically, as shown in FIG. 3, the generator generates sequences using a recurrent neural network;
more specifically, as shown in FIG. 4, the discriminator uses a convolutional neural network to discriminate the sequences generated by the generator;
S2: parsing the data file to be desensitized to obtain a parsed file containing sensitive information; the parsed file is a JSON-format file;
S3: inputting the parsed file as source data into the text generative adversarial network model for training;
more specifically, in step S3, the text generative adversarial network model is trained in combination with a Monte Carlo search strategy;
more specifically, as shown in FIG. 5, the specific steps of training the text generative adversarial network model are as follows:
the vectors obtained by encoding the words of the source data are input into the embedding layer of the recurrent neural network to obtain the embedding layer vectors $x_1, \ldots, x_T$, and the hidden layer vectors $h_1, \ldots, h_T$ are output, giving
$$h_t = R(h_{t-1}, x_t)$$
where $h_{t-1}$ is the hidden layer vector of the previous state, and $h_t$ and $x_t$ are the hidden layer vector and the embedding layer vector of the current state, respectively; $t \in \{1, \ldots, T\}$ is the word vector index, and $R$ is the RNN;
the hidden layer vector is passed through the softmax layer of the recurrent neural network to obtain the distribution probability of $y_t$ in the sequence $Y_{1:t}$ generated up to the current state:
$$p(y_t \mid x_1, \ldots, x_t) = \mathrm{softmax}(b + W h_t)$$
where $b$ is the bias vector, $W$ is the weight matrix, and $y_t$ is the $t$-th element of the sequence $Y_{1:t}$ of length $t$;
the reward $Q$ for the current sentence is denoted as
$$Q = D(Y_{1:t})$$
In order to obtain the discriminator's evaluation of a complete sequence, a Monte Carlo search strategy is adopted to generate the $T - t$ currently unknown words, so that a complete sequence is obtained for evaluation; for an $N$-time Monte Carlo search, it is expressed as
Running the Monte Carlo search strategy yields $N$ output sequences from the current state to the end of the sequence, thus obtaining a more accurate reward $Q$, denoted as
For each sequence, the embedding layer vectors $x_1, \ldots, x_T$ are concatenated to represent the current sequence
The sequence vector $d_{1:T}$ is convolved with the convolution kernel $\omega$
where the product is taken between corresponding positions, $\rho$ is a nonlinear function, and $c_i$ is the output value of the convolutional layer;
after the pooling layer, $c = \max(c_1, \ldots, c_{T-l+1})$ is obtained (with $l$ the window length of the convolution kernel), and the sigmoid function of the fully connected layer outputs the probability that the sequence is judged to be real, namely the reward $Q$;
cyclic training is carried out using the policy gradient method, and the parameters of the generator are updated according to the magnitude of the reward $Q$, thereby reducing the loss of the generated sentences; through cyclic training, the model converges when the error of the discriminator is minimal;
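The discriminator path (convolution over the embedded sequence, max pooling, then a sigmoid output) can be sketched as follows. The nonlinearity is assumed to be ReLU, the bias terms are assumptions, and the kernel, embedded sequence, and fully connected weights are hypothetical toy values; the patent specifies none of these.

```python
import math

def conv_features(embedded, kernel, bias=0.0):
    """Slide a kernel of l rows over T embedded rows, giving T - l + 1 outputs.

    Each output multiplies the kernel with the window position by position,
    sums the products, adds a bias, and applies ReLU (an assumed nonlinearity).
    """
    window_len = len(kernel)
    features = []
    for start in range(len(embedded) - window_len + 1):
        window = embedded[start:start + window_len]
        total = sum(w * x
                    for kernel_row, x_row in zip(kernel, window)
                    for w, x in zip(kernel_row, x_row))
        features.append(max(0.0, total + bias))
    return features

def discriminator_score(embedded, kernels, fc_weights, fc_bias=0.0):
    """Max-pool each kernel's features, then apply a sigmoid output unit."""
    pooled = [max(conv_features(embedded, kernel)) for kernel in kernels]
    logit = sum(w * p for w, p in zip(fc_weights, pooled)) + fc_bias
    return 1.0 / (1.0 + math.exp(-logit))

# Toy input (hypothetical): T = 5 positions, 2-dim embeddings, one kernel with l = 2.
embedded_sequence = [[1, 0], [0, 1], [1, 1], [0, 0], [1, 0]]
kernel = [[1, 0], [0, 1]]
features = conv_features(embedded_sequence, kernel)
score = discriminator_score(embedded_sequence, [kernel], [1.0], fc_bias=-2.0)
```

With these toy values the pooled feature is 2, so the logit is 0 and the score is 0.5, meaning this stand-in discriminator is undecided about the sequence.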
more specifically, the loss of the current sentence is obtained by computing the binary cross entropy based on the output distribution of the discriminator, specifically: let $p$ be the probability of state 1 of the output $P$, $1-p$ the probability of state 0 of the output $P$, $q$ the probability of state 1 of the input $Q$, and $1-q$ the probability of state 0 of the input $Q$; then the cross entropy of $P$ and $Q$ is
$$H(P \mid Q) = -\bigl(p \log q + (1-p)\log(1-q)\bigr)$$
More specifically, for the generated sequence, when the generator generates a false sequence, the cross entropy when the discriminator judges it to be true is
$$\mathrm{loss} = -\bigl(1 \cdot \log D(Y_{1:T}) + 0 \cdot \log(1 - D(Y_{1:T}))\bigr) = -\log D(Y_{1:T})$$
For the discriminated sequence, the discriminator identifies the true source of the sequence; when a sequence is true, the cross entropy when the discriminator judges it to be true is
When a sequence is false, the cross entropy when the discriminator judges it to be false is
The minimum cross entropy is calculated by the following formula:
in practical implementation, the smaller the cross entropy, the better, so that the discriminator identifies sequences accurately;
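The binary cross entropy and the losses derived from it can be sketched as below. The generator loss follows the $-\log D(Y_{1:T})$ form given in the text; the discriminator loss is written as the sum of the true-sequence and false-sequence cross entropies, which is an assumption since that formula is not reproduced in the text. The `eps` clipping is also an added safeguard, not part of the patent.

```python
import math

def binary_cross_entropy(p: float, q: float, eps: float = 1e-12) -> float:
    """H(P|Q) = -(p*log(q) + (1-p)*log(1-q)), with q clipped away from 0 and 1."""
    q = min(max(q, eps), 1.0 - eps)
    return -(p * math.log(q) + (1.0 - p) * math.log(1.0 - q))

def generator_loss(d_fake: float) -> float:
    """Generated sequence scored against label 1, i.e. -log D(Y_{1:T})."""
    return binary_cross_entropy(1.0, d_fake)

def discriminator_loss(d_real: float, d_fake: float) -> float:
    """True sequence against label 1 plus false sequence against label 0 (assumed sum)."""
    return binary_cross_entropy(1.0, d_real) + binary_cross_entropy(0.0, d_fake)

# Discriminator outputs for one real and one generated sequence (hypothetical).
loss_g = generator_loss(0.5)
loss_d = discriminator_loss(0.9, 0.1)
```

A confident, correct discriminator (`d_real` near 1, `d_fake` near 0) drives `loss_d` toward zero, which matches the remark that a smaller cross entropy corresponds to more accurate identification.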
S4: judging whether the trained text generative adversarial network model has converged;
if so, desensitized text data with the same statistical characteristics as the source data is obtained; FIG. 6 shows the comparison before and after desensitization with Gaussian-distributed random numbers;
more specifically, the same statistical characteristics means that the proportion of digits or characters in the text is the same;
if not, the process returns to step S3.
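Since step S2 produces a JSON-format file that step S3 feeds to the model as encoded source data, the preparation of that input can be sketched as below. Character-level tokenization is an assumption made so that structural characters such as braces and quotes stay in the token stream; the patent only says that the words of the source data are encoded. The sample record and function names are hypothetical.

```python
import json

def json_to_tokens(raw: str) -> list:
    """Parse JSON, re-serialize it compactly, and split it into character tokens."""
    document = json.loads(raw)
    compact = json.dumps(document, ensure_ascii=False, separators=(",", ":"))
    return list(compact)

def build_vocab(tokens) -> dict:
    """Map each distinct token to an integer id in sorted order."""
    return {token: index for index, token in enumerate(sorted(set(tokens)))}

def encode(tokens, vocab) -> list:
    """Turn a token list into the integer ids fed to the embedding layer."""
    return [vocab[token] for token in tokens]

# Hypothetical parsed record, not taken from the patent.
raw_record = '{"name": "Li", "age": 47, "diagnosis": "flu"}'
tokens = json_to_tokens(raw_record)
vocab = build_vocab(tokens)
token_ids = encode(tokens, vocab)
```

Because the braces, colons, and quotes are ordinary tokens, a generator trained on such sequences has the chance to reproduce the JSON structure along with the field values.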
Table 1 is a comparison of textual data before and after desensitization by the described method.
TABLE 1
It should be understood that the above embodiments are merely examples given to illustrate the present invention clearly, and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the claims of the present invention.
Claims (10)
1. A method for desensitizing personal digital space data based on a text generative adversarial network, comprising the following steps:
S1: acquiring a data file to be desensitized from a personal digital space, and
constructing a text generative adversarial network model;
S2: parsing the data file to be desensitized to obtain a parsed file containing sensitive information;
S3: inputting the parsed file as source data into the text generative adversarial network model for training;
S4: judging whether the trained text generative adversarial network model has converged;
if so, obtaining desensitized text data with the same statistical characteristics as the source data;
if not, returning to step S3.
2. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 1, wherein the data file to be desensitized is based on semi-structured medical information data in a distributed database.
3. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 1, wherein the text generative adversarial network model comprises a generator and a discriminator.
4. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 3, wherein the generator generates sequences using a recurrent neural network.
5. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 3, wherein the discriminator uses a convolutional neural network to discriminate the sequences generated by the generator.
6. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 3, wherein in step S3, the text generative adversarial network model is trained in combination with a Monte Carlo search strategy.
7. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 6, wherein the specific steps of training the text generative adversarial network model are:
inputting the vectors obtained by encoding the words of the source data into the embedding layer of the recurrent neural network to obtain the embedding layer vectors $x_1, \ldots, x_T$, and outputting the hidden layer vectors $h_1, \ldots, h_T$, giving
$$h_t = R(h_{t-1}, x_t)$$
where $h_{t-1}$ is the hidden layer vector of the previous state, and $h_t$ and $x_t$ are the hidden layer vector and the embedding layer vector of the current state, respectively; $t \in \{1, \ldots, T\}$ is the word vector index, and $R$ is the RNN;
passing the hidden layer vector through the softmax layer of the recurrent neural network to obtain the distribution probability of $y_t$ in the sequence $Y_{1:t}$ generated up to the current state:
$$p(y_t \mid x_1, \ldots, x_t) = \mathrm{softmax}(b + W h_t)$$
where $b$ is the bias vector, $W$ is the weight matrix, and $y_t$ is the $t$-th element of the sequence $Y_{1:t}$ of length $t$;
the reward $Q$ for the current sentence is denoted as
$$Q = D(Y_{1:t})$$
For an $N$-time Monte Carlo search, it is expressed as
Running the Monte Carlo search strategy yields $N$ output sequences from the current state to the end of the sequence, thus obtaining a more accurate reward $Q$, denoted as
For each sequence, the embedding layer vectors $x_1, \ldots, x_T$ are concatenated to represent the current sequence
The sequence vector $d_{1:T}$ is convolved with the convolution kernel $\omega$
where the product is taken between corresponding positions, $\rho$ is a nonlinear function, and $c_i$ is the output value of the convolutional layer;
after the pooling layer, $c = \max(c_1, \ldots, c_{T-l+1})$ is obtained (with $l$ the window length of the convolution kernel), and the sigmoid function of the fully connected layer outputs the probability that the sequence is judged to be real, namely the reward $Q$;
updating the parameters of the generator according to the magnitude of the reward $Q$, thereby reducing the loss of the generated sentence; and repeating the training cyclically, so that the model converges when the error of the discriminator is minimal.
8. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 7, wherein the loss of the current sentence is obtained by computing the binary cross entropy based on the output distribution of the discriminator, specifically: let $p$ be the probability of state 1 of the output $P$, $1-p$ the probability of state 0 of the output $P$, $q$ the probability of state 1 of the input $Q$, and $1-q$ the probability of state 0 of the input $Q$; then the cross entropy of $P$ and $Q$ is
$$H(P \mid Q) = -\bigl(p \log q + (1-p)\log(1-q)\bigr)$$
9. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 8, wherein for the generated sequence, when the generator generates a false sequence, the cross entropy when the discriminator judges it to be true is
$$\mathrm{loss} = -\bigl(1 \cdot \log D(Y_{1:T}) + 0 \cdot \log(1 - D(Y_{1:T}))\bigr) = -\log D(Y_{1:T})$$
For the discriminated sequence, the discriminator identifies the true source of the sequence; when a sequence is true, the cross entropy when the discriminator judges it to be true is
When a sequence is false, the cross entropy when the discriminator judges it to be false is
The minimum cross entropy is calculated by the following formula:
10. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 1, wherein the same statistical characteristics means that the proportion of digits or characters in the text is the same.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110199023.XA CN112861179B (en) | 2021-02-22 | 2021-02-22 | Method for desensitizing personal digital spatial data based on text-generated countermeasure network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110199023.XA CN112861179B (en) | 2021-02-22 | 2021-02-22 | Method for desensitizing personal digital spatial data based on text-generated countermeasure network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112861179A (en) | 2021-05-28 |
CN112861179B (en) | 2023-04-07 |
Family
ID=75988569
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110199023.XA Active CN112861179B (en) | 2021-02-22 | 2021-02-22 | Method for desensitizing personal digital spatial data based on text-generated countermeasure network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112861179B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116910817A (en) * | 2023-09-13 | 2023-10-20 | 北京国药新创科技发展有限公司 | Desensitization processing method and device for medical data and electronic equipment |
CN117272941A (en) * | 2023-09-21 | 2023-12-22 | 北京百度网讯科技有限公司 | Data processing method, apparatus, device, computer readable storage medium and product |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109614611A (en) * | 2018-11-28 | 2019-04-12 | 中山大学 | A kind of fusion generates the sentiment analysis method of non-confrontation network and convolutional neural networks |
US20190258984A1 (en) * | 2018-02-19 | 2019-08-22 | Microsoft Technology Licensing, Llc | Generative adversarial networks in predicting sequential data |
CN111428448A (en) * | 2020-03-02 | 2020-07-17 | 平安科技(深圳)有限公司 | Text generation method and device, computer equipment and readable storage medium |
CN111488911A (en) * | 2020-03-15 | 2020-08-04 | 北京理工大学 | Image entity extraction method based on Mask R-CNN and GAN |
CN111563275A (en) * | 2020-07-14 | 2020-08-21 | 中国人民解放军国防科技大学 | Data desensitization method based on generation countermeasure network |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190258984A1 (en) * | 2018-02-19 | 2019-08-22 | Microsoft Technology Licensing, Llc | Generative adversarial networks in predicting sequential data |
CN109614611A (en) * | 2018-11-28 | 2019-04-12 | 中山大学 | A kind of fusion generates the sentiment analysis method of non-confrontation network and convolutional neural networks |
CN111428448A (en) * | 2020-03-02 | 2020-07-17 | 平安科技(深圳)有限公司 | Text generation method and device, computer equipment and readable storage medium |
CN111488911A (en) * | 2020-03-15 | 2020-08-04 | 北京理工大学 | Image entity extraction method based on Mask R-CNN and GAN |
CN111563275A (en) * | 2020-07-14 | 2020-08-21 | 中国人民解放军国防科技大学 | Data desensitization method based on generation countermeasure network |
Non-Patent Citations (3)
Title |
---|
于胡飞 等: "基于生成对抗网络的医学数据域适应研究", 《大数据》 * |
张煜等: "基于生成对抗网络的文本序列数据集脱敏", 《网络与信息安全学报》 * |
郑旭如: "基于深度学习的数据脱敏研究", 《中国优秀博硕士学位论文全文数据库(硕士)信息科技辑》 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116910817A (en) * | 2023-09-13 | 2023-10-20 | 北京国药新创科技发展有限公司 | Desensitization processing method and device for medical data and electronic equipment |
CN116910817B (en) * | 2023-09-13 | 2023-12-29 | 北京国药新创科技发展有限公司 | Desensitization processing method and device for medical data and electronic equipment |
CN117272941A (en) * | 2023-09-21 | 2023-12-22 | 北京百度网讯科技有限公司 | Data processing method, apparatus, device, computer readable storage medium and product |
Also Published As
Publication number | Publication date |
---|---|
CN112861179B (en) | 2023-04-07 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||