CN112861179A - Method for desensitizing personal digital spatial data based on text-generated countermeasure network - Google Patents


Info

Publication number
CN112861179A
Authority
CN
China
Prior art keywords
text
data
sequence
personal digital
desensitization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110199023.XA
Other languages
Chinese (zh)
Other versions
CN112861179B (en)
Inventor
孙伟
官明哲
张武军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University
Priority to CN202110199023.XA
Publication of CN112861179A
Application granted
Publication of CN112861179B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioethics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method for desensitizing personal digital space data based on a text generative adversarial network, comprising the following steps. S1: acquire a data file to be desensitized from a personal digital space, and construct a text generative adversarial network model. S2: parse the data file to be desensitized to obtain a parsed file containing sensitive information. S3: input the parsed file as source data into the text generative adversarial network model for training. S4: judge whether the trained text generative adversarial network model has converged; if so, obtain desensitized text data with the same statistical characteristics as the source data; if not, return to step S3. The method solves the problem that existing data desensitization techniques change the structured format of medical source data when applied in medical scenarios.

Description

Method for desensitizing personal digital space data based on a text generative adversarial network
Technical Field
The invention relates to the technical field of data desensitization, and in particular to a method for desensitizing personal digital space data based on a text generative adversarial network.
Background
Data desensitization is a data processing technique that reduces or removes the sensitivity of data. It lowers the risk and harm of data leakage and effectively protects the privacy of user data. In Internet-based healthcare, users can store, view, and share personal medical and health data through a personal digital space. However, this data faces the risk of leaking sensitive medical information during online consultation, online medicine purchasing, outpatient appointment booking, and similar processes. User data in the medical industry is highly authentic and highly sensitive; once a user's sensitive personal information is leaked, it may pose a potential threat to the user's life. With data desensitization, information in the personal digital space can be used for business-related analysis and processing while avoiding leakage of user data.
Existing data desensitization usually relies on masking, generalization, and similar techniques, which protect private data while preserving its usability, so that the desensitized data can still be used in application scenarios such as development and testing, data mining, and data publishing. Common techniques include: data replacement, which replaces the data in sensitive information with random data; data shuffling, which exchanges values between rows of the source data; numerical transformation, which transforms numerical data such as age and time; data masking, which replaces or alters sensitive data with special symbols such as "+" or NULL; data deletion, which deletes and clears sensitive data; and data generalization, which represents data at a coarser granularity than the original, widening the range the data represents and eliminating sensitive information. However, when these existing techniques are applied in medical scenarios, they change the structured format of the medical source data and cannot meet the requirements for desensitizing and protecting users' sensitive medical information.
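The conventional techniques listed above can be illustrated with a minimal Python sketch. The function names, the default masking symbol, and the sample values are hypothetical and chosen only for illustration; they are not taken from the patent. The sketch shows why masking and generalization change the format of a field, which is the limitation the invention addresses.

```python
import random

def replace_digits(value, rng):
    """Data replacement: swap each digit for a random digit, keep other characters."""
    return "".join(str(rng.randrange(10)) if ch.isdigit() else ch for ch in value)

def shuffle_column(rows, key, rng):
    """Data shuffling: permute the values of one field across rows."""
    values = [row[key] for row in rows]
    rng.shuffle(values)
    return [{**row, key: v} for row, v in zip(rows, values)]

def mask(value, keep=3, symbol="*"):
    """Data masking: keep the first `keep` characters and hide the rest."""
    return value[:keep] + symbol * max(len(value) - keep, 0)

def generalize_age(age, width=10):
    """Data generalization: report an age bracket instead of the exact age."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    print(replace_digits("ID-2021/02", rng))   # digits change, layout is kept
    print(mask("13800001111"))                 # digits become symbols
    print(generalize_age(37))                  # an integer becomes a range string
```

Masking turns a numeric field into a string of symbols and generalization turns an integer into a range string, so a downstream consumer that expects the original field type may no longer accept the record.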
In the prior art, for example, the Chinese patent published on August 16, 2019, "A data desensitization method, apparatus, device, and computer-readable storage medium" (publication number CN110135193A), maximizes the degree of data desensitization, ensures that private information is not disclosed, and effectively improves the practicality of the desensitized data. However, it does not evaluate complete sequences, and it changes the structured format of the source data.
Disclosure of Invention
The invention provides a method for desensitizing personal digital space data based on a text generative adversarial network, aiming to overcome the technical defect that existing data desensitization techniques change the structured format of medical source data when applied in medical scenarios.
In order to solve the above technical problem, the technical solution of the invention is as follows:
A method for desensitizing personal digital space data based on a text generative adversarial network, comprising the following steps:
S1: acquiring a data file to be desensitized in the personal digital space,
constructing a text generative adversarial network model;
S2: parsing the data file to be desensitized to obtain a parsed file containing sensitive information;
S3: inputting the parsed file as source data into the text generative adversarial network model for training;
S4: judging whether the trained text generative adversarial network model has converged,
if so, obtaining desensitized text data with the same statistical characteristics as the source data;
if not, returning to step S3.
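The control flow of steps S1 to S4 can be sketched in Python as follows. This is only an outline under stated assumptions: the callables `parse`, `build_model`, `train_round`, `has_converged`, and `generate` are hypothetical placeholders supplied by the caller, and the `max_rounds` cap is an added safeguard that the patent does not mention (the patent simply returns to S3 until convergence).

```python
def desensitize(source_file, parse, build_model, train_round,
                has_converged, generate, max_rounds=1000):
    """Run the S1-S4 loop and return desensitized text once training converges."""
    model = build_model()            # S1: construct the text GAN model
    parsed = parse(source_file)      # S2: parse the file acquired in S1
    for _ in range(max_rounds):
        train_round(model, parsed)   # S3: one round of adversarial training
        if has_converged(model):     # S4: convergence check
            return generate(model)   # converged: emit desensitized text
    # Assumption: give up after max_rounds instead of looping forever.
    raise RuntimeError("model did not converge within max_rounds")
```

Passing the stages in as callables keeps the sketch independent of any particular model implementation.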
Preferably, the data file to be desensitized is based on semi-structured medical information data in a distributed database.
Preferably, the text generative adversarial network model comprises a generator and a discriminator.
Preferably, the generator generates sequences using a recurrent neural network.
Preferably, the discriminator uses a convolutional neural network to discriminate the sequences generated by the generator.
Preferably, in step S3, the text generative adversarial network model is trained in combination with a Monte Carlo search strategy.
Preferably, the specific steps of training the text generative adversarial network model are as follows:
the vectors obtained by encoding the words of the source data are input into the embedding layer of the recurrent neural network to obtain the embedding-layer vectors $x_1, \ldots, x_T$, and the hidden-layer vectors $h_1, \ldots, h_T$ are output, giving
$$h_t = R(h_{t-1}, x_t)$$
where $h_{t-1}$ is the hidden-layer vector of the previous state; $h_t$ and $x_t$ are the hidden-layer vector and the embedding-layer vector of the current state, respectively; $t \in T$ indexes the word vectors; and $R$ is the RNN;
the hidden-layer vector is passed through the softmax layer of the recurrent neural network to obtain the probability distribution of $y_t$ in the sequence $Y_{1:t}$ generated up to the current state:
$$p(y_t \mid x_1, \ldots, x_t) = \mathrm{softmax}(b + W h_t)$$
where $b$ is the bias vector, $W$ is the weight matrix, and $Y_{1:t}$ is a sequence of length $t$;
the reward $Q$ for the current sentence is denoted as
$$Q = D(Y_{1:t})$$
An $N$-time Monte Carlo search is expressed as
Figure BDA0002947291720000031
Running the Monte Carlo search strategy yields $N$ output sequences from the current state to the end of the sequence, from which a more accurate reward $Q$ is obtained, denoted as
Figure BDA0002947291720000032
For each sequence, the embedding-layer vectors $x_1, \ldots, x_T$ are concatenated to represent the current sequence
Figure BDA0002947291720000033
where
Figure BDA0002947291720000034
denotes row-wise concatenation;
the convolution kernel $\omega$ performs a convolution operation on the sequence vector $d_{1:T}$
Figure BDA0002947291720000035
where
Figure BDA0002947291720000036
denotes element-wise multiplication, $p$ is a non-linear function, and $c_i$ is the output value of the convolutional layer;
after the pooling layer, the pooled output $c = \max(c_1, \ldots, c_{T-l+1})$ is obtained, and the probability that the sequence is judged to be real, i.e. the reward $Q$, is output through the sigmoid function of the fully connected layer;
the parameters of the generator are updated according to the magnitude of the reward $Q$, thereby reducing the loss of the generated sentences; training is repeated cyclically so that the model converges when the error of the discriminator is minimal.
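The generator recurrence $h_t = R(h_{t-1}, x_t)$ and the softmax output layer described in the training steps can be sketched in plain Python. The patent only states that $R$ is an RNN; the tanh cell with weight matrices `U` and `V` used here is an assumption made for illustration, and all numeric values in the demo are hypothetical.

```python
import math

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def rnn_step(h_prev, x_t, U, V):
    """h_t = R(h_{t-1}, x_t), with R assumed to be tanh(U h_{t-1} + V x_t)."""
    return [math.tanh(a + b) for a, b in zip(matvec(U, h_prev), matvec(V, x_t))]

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def next_token_distribution(h_t, W, b):
    """p(y_t | x_1, ..., x_t) = softmax(b + W h_t)."""
    return softmax([bi + wi for bi, wi in zip(b, matvec(W, h_t))])

if __name__ == "__main__":
    U = [[0.5, 0.0], [0.0, 0.5]]                 # hidden-to-hidden weights (toy values)
    V = [[1.0], [-1.0]]                          # input-to-hidden weights (toy values)
    W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]     # 3-token vocabulary (toy values)
    b = [0.0, 0.0, 0.0]
    h = [0.0, 0.0]
    for x in ([1.0], [0.5]):                     # two embedding-layer vectors
        h = rnn_step(h, x, U, V)
    print(next_token_distribution(h, W, b))
```

Each call to `rnn_step` consumes one embedding-layer vector, and `next_token_distribution` turns the resulting hidden state into a probability for every token in the vocabulary.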
Preferably, the loss of the current sentence is obtained by computing the binary cross entropy based on the output distribution of the discriminator, specifically: let $p$ be the probability that the output $P$ is in state 1 and $1-p$ the probability that it is in state 0, and let $q$ be the probability that the input $Q$ is in state 1 and $1-q$ the probability that it is in state 0; then the cross entropy of $P$ and $Q$ is
$$H(P|Q) = -\bigl(p \log q + (1-p)\log(1-q)\bigr)$$
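The binary cross entropy above can be computed directly, as in the following Python sketch. The clamping constant `eps` is an added assumption that avoids taking the logarithm of zero; it is not part of the formula in the patent.

```python
import math

def binary_cross_entropy(p, q, eps=1e-12):
    """H(P|Q) = -(p*log(q) + (1-p)*log(1-q)) for a two-state distribution."""
    q = min(max(q, eps), 1.0 - eps)  # assumption: clamp q to keep log finite
    return -(p * math.log(q) + (1.0 - p) * math.log(1.0 - q))
```

With $p = 1$ the expression reduces to $-\log q$, and with $p = 0$ it reduces to $-\log(1-q)$, so the value shrinks as $q$ moves toward the target state.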
Preferably, for a generated sequence, when the generator produces a fake sequence, the cross entropy when the discriminator judges it to be real is
$$\mathrm{loss} = -\bigl(1 \cdot \log D(Y_{1:T}) + 0 \cdot \log(1 - D(Y_{1:T}))\bigr) = -\log D(Y_{1:T})$$
For a sequence to be discriminated, the discriminator identifies the true source of the sequence; when the sequence is real, the cross entropy when the discriminator judges it to be real is
Figure BDA0002947291720000037
When the sequence is fake, the cross entropy when the discriminator judges it to be fake is
Figure BDA0002947291720000041
The minimum cross entropy is calculated by the following formula:
Figure BDA0002947291720000042
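The generator-side loss $-\log D(Y_{1:T})$ given in the text can be sketched in Python as below. The discriminator-side expressions appear only as figures in this text, so the `discriminator_loss` function is an assumption: it applies the binary cross entropy $H(P|Q)$ with target 1 for a real sequence and target 0 for a generated one and sums the two terms, which may differ from the patent's exact formula.

```python
import math

def generator_loss(d_fake):
    """-log D(Y_{1:T}) for a generated sequence scored d_fake by the discriminator."""
    return -math.log(d_fake)

def discriminator_loss(d_real, d_fake):
    """Assumed form: cross entropy with target 1 (real) plus target 0 (generated)."""
    return -math.log(d_real) - math.log(1.0 - d_fake)

if __name__ == "__main__":
    # Hypothetical discriminator scores, strictly between 0 and 1.
    print(generator_loss(0.5))
    print(discriminator_loss(0.9, 0.1))
```

Under this assumed form, the generator loss falls as the discriminator is fooled more often, while the discriminator loss falls as real and generated sequences are separated more cleanly.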
Preferably, the same statistical characteristics means that the proportion of digits or characters in the text is the same.
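One way to check this statistical characteristic is sketched below in Python. The three character classes (digit, letter, other), the tolerance parameter, and the sample strings are hypothetical choices for illustration; the patent only states that the proportion of digits or characters in the text is the same.

```python
def class_proportions(text):
    """Return the fraction of digits, letters, and other characters in text."""
    n = len(text)
    if n == 0:
        return {"digit": 0.0, "letter": 0.0, "other": 0.0}
    digits = sum(ch.isdigit() for ch in text)
    letters = sum(ch.isalpha() for ch in text)
    return {
        "digit": digits / n,
        "letter": letters / n,
        "other": (n - digits - letters) / n,
    }

def same_statistics(source, desensitized, tol=0.0):
    """True if every class proportion differs by at most tol."""
    a = class_proportions(source)
    b = class_proportions(desensitized)
    return all(abs(a[k] - b[k]) <= tol for k in a)

if __name__ == "__main__":
    # Made-up values: a date replaced by another date keeps the proportions,
    # while a masked date does not.
    print(same_statistics("2021-02-22", "1987-11-03"))
    print(same_statistics("2021-02-22", "****-**-**"))
```

A masked value fails the check because its digits have been replaced by symbols, whereas a generated value of the same shape passes.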
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
The invention provides a method for desensitizing personal digital space data based on a text generative adversarial network. By training the text generative adversarial network model, desensitized data is generated with the same statistical characteristics and structure as the parsed file containing sensitive information. This realizes desensitization of structured text information and achieves a good text desensitization effect without affecting the structure of the data in the personal digital space.
Drawings
FIG. 1 is a flow chart of the steps for implementing the technical solution of the present invention;
FIG. 2 is a flow chart of the desensitization workflow of the text generative adversarial network model of the present invention;
FIG. 3 is a network diagram of the generator of the present invention;
FIG. 4 is a network diagram of the discriminator of the present invention;
FIG. 5 is a schematic structural diagram of the text generative adversarial network model of the present invention;
FIG. 6 is a before-and-after comparison of the desensitization of Gaussian-distributed random numbers in the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent;
for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product;
it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
Example 1
As shown in FIGS. 1-2, a method for desensitizing personal digital space data based on a text generative adversarial network comprises the following steps:
S1: acquiring a data file to be desensitized in a personal digital space;
more specifically, the data file to be desensitized is based on semi-structured medical information data in a distributed database;
constructing a text generative adversarial network model;
more specifically, the text generative adversarial network model comprises a generator and a discriminator;
more specifically, as shown in FIG. 3, the generator generates sequences using a recurrent neural network;
more specifically, as shown in FIG. 4, the discriminator uses a convolutional neural network to discriminate the sequences generated by the generator;
S2: parsing the data file to be desensitized to obtain a parsed file containing sensitive information; the parsed file is a JSON file;
S3: inputting the parsed file as source data into the text generative adversarial network model for training;
more specifically, in step S3, the text generative adversarial network model is trained in combination with a Monte Carlo search strategy;
more specifically, as shown in FIG. 5, the specific steps of training the text generative adversarial network model are as follows:
the vectors obtained by encoding the words of the source data are input into the embedding layer of the recurrent neural network to obtain the embedding-layer vectors $x_1, \ldots, x_T$, and the hidden-layer vectors $h_1, \ldots, h_T$ are output, giving
$$h_t = R(h_{t-1}, x_t)$$
where $h_{t-1}$ is the hidden-layer vector of the previous state; $h_t$ and $x_t$ are the hidden-layer vector and the embedding-layer vector of the current state, respectively; $t \in T$ indexes the word vectors; and $R$ is the RNN;
the hidden-layer vector is passed through the softmax layer of the recurrent neural network to obtain the probability distribution of $y_t$ in the sequence $Y_{1:t}$ generated up to the current state:
$$p(y_t \mid x_1, \ldots, x_t) = \mathrm{softmax}(b + W h_t)$$
where $b$ is the bias vector, $W$ is the weight matrix, and $Y_{1:t}$ is a sequence of length $t$;
the reward $Q$ for the current sentence is denoted as
$$Q = D(Y_{1:t})$$
In order to obtain the discriminator's evaluation of a complete sequence, a Monte Carlo search strategy is adopted to generate the $T - t$ currently unknown words, so that a complete sequence is obtained for evaluation; an $N$-time Monte Carlo search is expressed as
Figure BDA0002947291720000051
Running the Monte Carlo search strategy yields $N$ output sequences from the current state to the end of the sequence, from which a more accurate reward $Q$ is obtained, denoted as
Figure BDA0002947291720000061
For each sequence, the embedding-layer vectors $x_1, \ldots, x_T$ are concatenated to represent the current sequence
Figure BDA0002947291720000062
where
Figure BDA0002947291720000063
denotes row-wise concatenation;
the convolution kernel $\omega$ performs a convolution operation on the sequence vector $d_{1:T}$
Figure BDA0002947291720000064
where
Figure BDA0002947291720000065
denotes element-wise multiplication, $p$ is a non-linear function, and $c_i$ is the output value of the convolutional layer;
after the pooling layer, the pooled output $c = \max(c_1, \ldots, c_{T-l+1})$ is obtained, and the probability that the sequence is judged to be real, i.e. the reward $Q$, is output through the sigmoid function of the fully connected layer;
cyclic training is carried out using policy gradient, and the parameters of the generator are updated according to the magnitude of the reward $Q$, thereby reducing the loss of the generated sentences; through cyclic training, the model converges when the error of the discriminator is minimal;
more specifically, the loss of the current sentence is obtained by computing the binary cross entropy based on the output distribution of the discriminator, specifically: let $p$ be the probability that the output $P$ is in state 1 and $1-p$ the probability that it is in state 0, and let $q$ be the probability that the input $Q$ is in state 1 and $1-q$ the probability that it is in state 0; then the cross entropy of $P$ and $Q$ is
$$H(P|Q) = -\bigl(p \log q + (1-p)\log(1-q)\bigr);$$
more specifically, for a generated sequence, when the generator produces a fake sequence, the cross entropy when the discriminator judges it to be real is
$$\mathrm{loss} = -\bigl(1 \cdot \log D(Y_{1:T}) + 0 \cdot \log(1 - D(Y_{1:T}))\bigr) = -\log D(Y_{1:T})$$
For a sequence to be discriminated, the discriminator identifies the true source of the sequence; when the sequence is real, the cross entropy when the discriminator judges it to be real is
Figure BDA0002947291720000066
When the sequence is fake, the cross entropy when the discriminator judges it to be fake is
Figure BDA0002947291720000067
The minimum cross entropy is calculated by the following formula:
Figure BDA0002947291720000071
in practical implementation, the smaller the cross entropy the better, so that the discriminator identifies sequences accurately;
S4: judging whether the trained text generative adversarial network model has converged,
if so, obtaining desensitized text data with the same statistical characteristics as the source data, where the before-and-after comparison for the desensitization of Gaussian-distributed random numbers is shown in FIG. 6;
more specifically, the same statistical characteristics means that the proportion of digits or characters in the text is the same;
if not, returning to step S3.
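The Monte Carlo reward estimation used in step S3 can be sketched in Python as follows. The formula for combining the $N$ rollouts appears only as a figure in this text, so averaging the discriminator scores is an assumption here. The callables `sample_next` (standing in for the generator) and `discriminator` are hypothetical placeholders.

```python
import random

def rollout_reward(prefix, total_len, sample_next, discriminator, n_rollouts, rng):
    """Estimate the reward of a partial sequence by completing it n_rollouts times.

    prefix: tokens Y_{1:t} generated so far.
    total_len: full sequence length T.
    sample_next(seq, rng): returns the next token given the tokens so far.
    discriminator(seq): returns the probability that seq is real.
    """
    if len(prefix) >= total_len:
        # The sequence is already complete, so score it directly.
        return discriminator(prefix)
    scores = []
    for _ in range(n_rollouts):
        seq = list(prefix)
        while len(seq) < total_len:       # generate the T - t unknown tokens
            seq.append(sample_next(seq, rng))
        scores.append(discriminator(seq))
    return sum(scores) / n_rollouts       # assumption: average over the rollouts

if __name__ == "__main__":
    rng = random.Random(0)
    vocab = ["0", "1", "a"]
    uniform = lambda seq, r: r.choice(vocab)                        # toy generator
    digit_share = lambda seq: sum(t.isdigit() for t in seq) / len(seq)  # toy scorer
    print(rollout_reward(["1"], 6, uniform, digit_share, 16, rng))
```

Scoring completed sequences rather than prefixes is what lets the discriminator, which judges whole sequences, provide a reward for an intermediate generation step.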
Table 1 is a comparison of textual data before and after desensitization by the described method.
TABLE 1
Figure BDA0002947291720000072
Figure BDA0002947291720000081
It should be understood that the above embodiments are merely examples given to illustrate the present invention clearly and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the claims of the present invention.

Claims (10)

1. A method for desensitizing personal digital space data based on a text generative adversarial network, comprising the following steps:
S1: acquiring a data file to be desensitized in the personal digital space,
constructing a text generative adversarial network model;
S2: parsing the data file to be desensitized to obtain a parsed file containing sensitive information;
S3: inputting the parsed file as source data into the text generative adversarial network model for training;
S4: judging whether the trained text generative adversarial network model has converged,
if so, obtaining desensitized text data with the same statistical characteristics as the source data;
if not, returning to step S3.
2. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 1, wherein the data file to be desensitized is based on semi-structured medical information data in a distributed database.
3. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 1, wherein the text generative adversarial network model comprises a generator and a discriminator.
4. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 3, wherein the generator generates sequences using a recurrent neural network.
5. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 3, wherein the discriminator uses a convolutional neural network to discriminate the sequences generated by the generator.
6. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 3, wherein in step S3, the text generative adversarial network model is trained in combination with a Monte Carlo search strategy.
7. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 6, wherein the specific steps of training the text generative adversarial network model are:
the vectors obtained by encoding the words of the source data are input into the embedding layer of the recurrent neural network to obtain the embedding-layer vectors $x_1, \ldots, x_T$, and the hidden-layer vectors $h_1, \ldots, h_T$ are output, giving
$$h_t = R(h_{t-1}, x_t)$$
where $h_{t-1}$ is the hidden-layer vector of the previous state; $h_t$ and $x_t$ are the hidden-layer vector and the embedding-layer vector of the current state, respectively; $t \in T$ indexes the word vectors; and $R$ is the RNN;
the hidden-layer vector is passed through the softmax layer of the recurrent neural network to obtain the probability distribution of $y_t$ in the sequence $Y_{1:t}$ generated up to the current state:
$$p(y_t \mid x_1, \ldots, x_t) = \mathrm{softmax}(b + W h_t)$$
where $b$ is the bias vector, $W$ is the weight matrix, and $Y_{1:t}$ is a sequence of length $t$;
the reward $Q$ for the current sentence is denoted as
$$Q = D(Y_{1:t})$$
An $N$-time Monte Carlo search is expressed as
Figure FDA0002947291710000021
Running the Monte Carlo search strategy yields $N$ output sequences from the current state to the end of the sequence, from which a more accurate reward $Q$ is obtained, denoted as
Figure FDA0002947291710000022
For each sequence, the embedding-layer vectors $x_1, \ldots, x_T$ are concatenated to represent the current sequence
Figure FDA0002947291710000023
where
Figure FDA0002947291710000024
denotes row-wise concatenation;
the convolution kernel $\omega$ performs a convolution operation on the sequence vector $d_{1:T}$
Figure FDA0002947291710000025
where
Figure FDA0002947291710000026
denotes element-wise multiplication, $p$ is a non-linear function, and $c_i$ is the output value of the convolutional layer;
after the pooling layer, the pooled output $c = \max(c_1, \ldots, c_{T-l+1})$ is obtained, and the probability that the sequence is judged to be real, i.e. the reward $Q$, is output through the sigmoid function of the fully connected layer;
the parameters of the generator are updated according to the magnitude of the reward $Q$, thereby reducing the loss of the generated sentences; training is repeated cyclically so that the model converges when the error of the discriminator is minimal.
8. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 7, wherein the loss of the current sentence is obtained by computing the binary cross entropy based on the output distribution of the discriminator, specifically: let $p$ be the probability that the output $P$ is in state 1 and $1-p$ the probability that it is in state 0, and let $q$ be the probability that the input $Q$ is in state 1 and $1-q$ the probability that it is in state 0; then the cross entropy of $P$ and $Q$ is
$$H(P|Q) = -\bigl(p \log q + (1-p)\log(1-q)\bigr)$$
9. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 8, wherein for a generated sequence, when the generator produces a fake sequence, the cross entropy when the discriminator judges it to be real is
$$\mathrm{loss} = -\bigl(1 \cdot \log D(Y_{1:T}) + 0 \cdot \log(1 - D(Y_{1:T}))\bigr) = -\log D(Y_{1:T})$$
For a sequence to be discriminated, the discriminator identifies the true source of the sequence; when the sequence is real, the cross entropy when the discriminator judges it to be real is
Figure FDA0002947291710000031
When the sequence is fake, the cross entropy when the discriminator judges it to be fake is
Figure FDA0002947291710000032
The minimum cross entropy is calculated by the following formula:
Figure FDA0002947291710000033
10. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 1, wherein the same statistical characteristics means that the proportion of digits or characters in the text is the same.
CN202110199023.XA 2021-02-22 2021-02-22 Method for desensitizing personal digital spatial data based on text-generated countermeasure network Active CN112861179B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110199023.XA CN112861179B (en) 2021-02-22 2021-02-22 Method for desensitizing personal digital spatial data based on text-generated countermeasure network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110199023.XA CN112861179B (en) 2021-02-22 2021-02-22 Method for desensitizing personal digital spatial data based on text-generated countermeasure network

Publications (2)

Publication Number Publication Date
CN112861179A true CN112861179A (en) 2021-05-28
CN112861179B CN112861179B (en) 2023-04-07

Family

ID=75988569

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110199023.XA Active CN112861179B (en) 2021-02-22 2021-02-22 Method for desensitizing personal digital spatial data based on text-generated countermeasure network

Country Status (1)

Country Link
CN (1) CN112861179B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116910817A (en) * 2023-09-13 2023-10-20 北京国药新创科技发展有限公司 Desensitization processing method and device for medical data and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109614611A (en) * 2018-11-28 2019-04-12 中山大学 A kind of fusion generates the sentiment analysis method of non-confrontation network and convolutional neural networks
US20190258984A1 (en) * 2018-02-19 2019-08-22 Microsoft Technology Licensing, Llc Generative adversarial networks in predicting sequential data
CN111428448A (en) * 2020-03-02 2020-07-17 平安科技(深圳)有限公司 Text generation method and device, computer equipment and readable storage medium
CN111488911A (en) * 2020-03-15 2020-08-04 北京理工大学 Image entity extraction method based on Mask R-CNN and GAN
CN111563275A (en) * 2020-07-14 2020-08-21 中国人民解放军国防科技大学 Data desensitization method based on generation countermeasure network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
于胡飞 等: "基于生成对抗网络的医学数据域适应研究", 《大数据》 *
张煜等: "基于生成对抗网络的文本序列数据集脱敏", 《网络与信息安全学报》 *
郑旭如: "基于深度学习的数据脱敏研究", 《中国优秀博硕士学位论文全文数据库(硕士)信息科技辑》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116910817A (en) * 2023-09-13 2023-10-20 北京国药新创科技发展有限公司 Desensitization processing method and device for medical data and electronic equipment
CN116910817B (en) * 2023-09-13 2023-12-29 北京国药新创科技发展有限公司 Desensitization processing method and device for medical data and electronic equipment

Also Published As

Publication number Publication date
CN112861179B (en) 2023-04-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant