CN112861179B - Method for desensitizing personal digital spatial data based on text-generated countermeasure network - Google Patents

Method for desensitizing personal digital spatial data based on text-generated countermeasure network

Info

Publication number
CN112861179B
Authority
CN
China
Prior art keywords
text
data
sequence
personal digital
desensitization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110199023.XA
Other languages
Chinese (zh)
Other versions
CN112861179A (en)
Inventor
孙伟
官明哲
张武军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University
Priority to CN202110199023.XA
Publication of CN112861179A
Application granted
Publication of CN112861179B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a method for desensitizing personal digital space data based on a text generative adversarial network, comprising the following steps: S1: acquiring a data file to be desensitized from a personal digital space, and constructing a text generative adversarial network model; S2: parsing the data file to be desensitized to obtain a parsed file containing sensitive information; S3: inputting the parsed file as source data into the text generative adversarial network model for training; S4: judging whether the trained text generative adversarial network model has converged; if so, obtaining desensitized text data with the same statistical characteristics as the source data; if not, returning to step S3. The method addresses the problem that existing data desensitization techniques, when applied in medical scenarios, can change the structured format of the medical source data.

Description

Method for desensitizing personal digital spatial data based on text-generated countermeasure network
Technical Field
The invention relates to the technical field of data desensitization, and in particular to a method for desensitizing personal digital space data based on a text generative adversarial network.
Background
Data desensitization is a data processing technique that reduces or removes the sensitivity of data. It lowers the risk and harm of data leakage and effectively protects the privacy of user data. In Internet-based healthcare, users can store, view and share personal medical and health data through a personal digital space. However, this data faces the risk of leaking sensitive medical information during online consultations, online medicine purchases, appointment booking and similar processes. User data in the medical industry is highly authentic and highly sensitive, and once a user's sensitive personal information is leaked, it may pose a potential threat to the user's life. With data desensitization, information in the personal digital space can be used for business-related analysis and processing while avoiding leakage of user data.
Existing data desensitization usually relies on masking, generalization and similar techniques, which protect private data while preserving its usability, so that the desensitized data can still be used in application scenarios such as development and testing, data mining and data distribution. Common techniques include: data replacement, which replaces the data in sensitive information with random data; data shuffling, which exchanges values between rows of the source data; numerical conversion, which transforms numerical data such as age and time; data masking, which replaces or alters sensitive data with special symbols such as "+" or "NULL"; data deletion, which deletes and clears sensitive data; and data generalization, which represents data at a coarser level than the original, enlarging the range the data represents and eliminating sensitive information. However, when these existing techniques are applied in medical scenarios, they can change the structured format of the medical source data, and therefore cannot meet the requirements for desensitizing and protecting users' sensitive medical information.
In the prior art, for example, the Chinese patent published on August 16, 2019, "a data desensitization method, apparatus, device, and computer readable storage medium", publication No. CN110135193A, maximizes the degree of data desensitization and ensures that private information is not disclosed while effectively improving the usability of the desensitized data. However, it does not evaluate a complete sequence and may change the structured format of the source data.
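For comparison with the method of the invention, three of the conventional techniques listed above (masking, generalization and shuffling) can be sketched as follows. This is a minimal illustration only: the function names, the record fields and the default masking symbol are hypothetical and do not come from the patent.

```python
import random


def mask(value: str, keep: int = 3, symbol: str = "*") -> str:
    """Data masking: keep the first `keep` characters and hide the rest."""
    return value[:keep] + symbol * max(len(value) - keep, 0)


def generalize_age(age: int, width: int = 10) -> str:
    """Data generalization: report a range instead of the exact value."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def shuffle_column(rows: list[dict], column: str, seed: int = 0) -> list[dict]:
    """Data shuffling: permute the values of one column across rows."""
    values = [row[column] for row in rows]
    random.Random(seed).shuffle(values)
    return [{**row, column: value} for row, value in zip(rows, values)]


# Hypothetical records; the field names are illustrative only.
records = [
    {"name": "Alice", "phone": "13800001111", "age": 34},
    {"name": "Bob", "phone": "13900002222", "age": 58},
    {"name": "Carol", "phone": "13700003333", "age": 41},
]
masked_phone = mask(records[0]["phone"])
age_range = generalize_age(records[0]["age"])
shuffled = shuffle_column(records, "age")
```

Masking and generalization both change the shape of a field (a phone number becomes a string with symbols, an integer age becomes a range string), which illustrates how such techniques can alter the structured format of the source data.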
Disclosure of Invention
To overcome the technical defect that existing data desensitization techniques can change the structured format of medical source data when applied in medical scenarios, the invention provides a method for desensitizing personal digital space data based on a text generative adversarial network.
In order to solve the technical problems, the technical scheme of the invention is as follows:
A method for desensitizing personal digital space data based on a text generative adversarial network, comprising the following steps:
S1: acquiring a data file to be desensitized from a personal digital space,
and constructing a text generative adversarial network model;
S2: parsing the data file to be desensitized to obtain a parsed file containing sensitive information;
S3: inputting the parsed file as source data into the text generative adversarial network model for training;
S4: judging whether the trained text generative adversarial network model has converged;
if so, obtaining desensitized text data with the same statistical characteristics as the source data;
if not, returning to step S3.
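The S3/S4 loop can be sketched as a small driver. The patent does not specify how convergence is tested, so the criterion below (the change in a scalar loss falling under a tolerance) is an assumption, as are the names `train_until_converged`, `train_round`, `tol` and `max_rounds`.

```python
from typing import Callable


def train_until_converged(
    train_round: Callable[[], float],
    tol: float = 1e-3,
    max_rounds: int = 1000,
) -> int:
    """Repeat step S3 until the S4 check passes; return the rounds used."""
    previous = float("inf")
    for round_no in range(1, max_rounds + 1):
        loss = train_round()            # S3: one round of adversarial training
        if abs(previous - loss) < tol:  # S4: hypothetical convergence test
            return round_no
        previous = loss                 # not converged: return to S3
    return max_rounds


# Toy stand-in for a training round: a fixed sequence of loss values.
toy_losses = iter([1.0, 0.5, 0.25, 0.2499, 0.2498])
rounds_used = train_until_converged(lambda: next(toy_losses))
```

In a real setting `train_round` would run one generator and discriminator update and return the discriminator error being monitored.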
Preferably, the data file to be desensitized is based on semi-structured medical information data in a distributed database.
Preferably, the text generative adversarial network model comprises a generator and a discriminator.
Preferably, the generator generates the sequence using a recurrent neural network.
Preferably, the discriminator discriminates the sequence generated by the generator using a convolutional neural network.
Preferably, in step S3, the text generative adversarial network model is trained in combination with a Monte Carlo search strategy.
Preferably, the specific steps of training the text generative adversarial network model are as follows:
inputting the vectors obtained by encoding the words of the source data into the embedding layer of the recurrent neural network to obtain the embedding-layer vectors $x_1, \dots, x_T$, and outputting the hidden-layer vectors $h_1, \dots, h_T$, where
$h_t = R(h_{t-1}, x_t)$
in which $h_{t-1}$ is the hidden-layer vector of the previous state, and $h_t$ and $x_t$ are the hidden-layer vector and the embedding-layer vector of the current state, respectively; $t \in \{1, \dots, T\}$ is the word-vector index, and $R$ is the RNN;
passing the hidden-layer vector through the softmax layer of the recurrent neural network to obtain the probability distribution of $y_t$ in the sequence $Y_{1:t}$ generated up to the current state:
$p(y_t \mid x_1, \dots, x_t) = \mathrm{softmax}(b + W h_t)$
where $b$ is the bias vector, $W$ is the weight matrix, and $y_t$ is a sequence of length $t$;
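The generator recurrence and its softmax output can be sketched in plain Python. The patent leaves $R$ generic, so the tanh cell used here, the matrices `U` and `V`, and all toy sizes and values are assumptions made only for illustration.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def rnn_step(h_prev, x_t, U, V):
    """h_t = R(h_{t-1}, x_t), realised here as a plain tanh cell (assumption)."""
    return [math.tanh(a + b) for a, b in zip(matvec(U, x_t), matvec(V, h_prev))]


def softmax(z):
    """Numerically stable softmax."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_distribution(h_t, W, b):
    """p(y_t | x_1..x_t) = softmax(b + W h_t)."""
    return softmax([bi + wi for bi, wi in zip(b, matvec(W, h_t))])


# Toy sizes: embedding dimension 2, hidden dimension 2, vocabulary of 3 tokens.
U = [[0.5, -0.2], [0.1, 0.3]]
V = [[0.0, 0.4], [-0.3, 0.2]]
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.0]

h = [0.0, 0.0]
for x in ([1.0, 0.0], [0.0, 1.0]):  # embedding vectors x_1, x_2
    h = rnn_step(h, x, U, V)
probs = next_token_distribution(h, W, b)
```

Sampling the next token from `probs` and feeding its embedding back in as the next $x_t$ extends the sequence one word at a time.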
the reward Q for the current sentence is denoted as
$Q = D(Y_{1:t})$
An N-time Monte Carlo search is denoted as
Figure BDA0002947291720000031
Running the Monte Carlo search strategy yields N output sequences from the current state to the end of the sequence, giving a more accurate reward Q, denoted as
Figure BDA0002947291720000032
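The formula images for the Monte Carlo search are not reproduced in this text, so the following is only a sketch of the idea described in the prose: complete the current prefix N times and score each completion with the discriminator. Averaging the N scores is an assumption, following common SeqGAN-style practice. The names `rollout_reward`, `sample_next` and `discriminator`, and the toy sampler and scorer, are hypothetical.

```python
import random


def rollout_reward(prefix, total_len, sample_next, discriminator,
                   n_rollouts=16, seed=0):
    """Estimate the reward of a partial sequence by Monte Carlo rollouts."""
    if len(prefix) >= total_len:
        # The sequence is already complete: score it directly.
        return discriminator(list(prefix))
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rollouts):
        sequence = list(prefix)
        while len(sequence) < total_len:  # fill in the unknown words
            sequence.append(sample_next(sequence, rng))
        scores.append(discriminator(sequence))
    return sum(scores) / n_rollouts  # averaging is an assumption


# Toy stand-ins: a uniform sampler over two tokens and a scorer that
# returns the fraction of "1" tokens in the sequence.
def toy_sampler(sequence, rng):
    return rng.choice(["0", "1"])


def toy_discriminator(sequence):
    return sequence.count("1") / len(sequence)


partial_reward = rollout_reward(["1", "1"], 4, toy_sampler, toy_discriminator)
full_reward = rollout_reward(["1", "0", "1", "1"], 4, toy_sampler, toy_discriminator)
```

In actual training `sample_next` would be the generator's own sampling step, so the rollouts follow the current generator policy.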
For each sequence, the embedding-layer vectors $x_1, \dots, x_T$ are concatenated to represent the current sequence
Figure BDA0002947291720000033
where the operator shown in formula image BDA0002947291720000034 denotes row-wise concatenation;
the sequence vector $d_{1:T}$ is convolved with a convolution kernel $\omega$:
Figure BDA0002947291720000035
where the operator shown in formula image BDA0002947291720000036 denotes element-wise multiplication, p is a non-linear function, and $c_i$ is the output value of the convolutional layer;
after the pooling layer, the vector $c = \max(c_1, \dots, c_{T-l+1})$ is passed through the sigmoid function of the fully connected layer, which outputs the probability that the sequence is judged to be real, namely the reward Q;
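A minimal sketch of the discriminator path described above: slide a kernel over windows of $l$ consecutive embedding vectors, apply a non-linearity, max-pool over the $T - l + 1$ outputs and pass the result through a sigmoid. A single kernel, ReLU as the non-linear function, a scalar bias and a one-weight output layer are all simplifying assumptions; the function names and toy values are hypothetical.

```python
import math


def conv_features(embeddings, kernel, bias=0.0):
    """Convolve windows of l consecutive embeddings with one kernel."""
    l = len(kernel)
    features = []
    for i in range(len(embeddings) - l + 1):
        window = embeddings[i:i + l]
        # Element-wise product of kernel and window, summed to a scalar.
        total = sum(k * x
                    for k_row, x_row in zip(kernel, window)
                    for k, x in zip(k_row, x_row))
        features.append(max(0.0, total + bias))  # ReLU (assumption)
    return features


def discriminator_score(embeddings, kernel, w_out=1.0, b_out=0.0):
    """Max-over-time pooling followed by a sigmoid output unit."""
    pooled = max(conv_features(embeddings, kernel))
    return 1.0 / (1.0 + math.exp(-(w_out * pooled + b_out)))


# Toy input: T = 4 embedding vectors of dimension 2, window length l = 2.
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
toy_kernel = [[1.0, -1.0], [0.5, 0.5]]
features = conv_features(toy_embeddings, toy_kernel)
score = discriminator_score(toy_embeddings, toy_kernel)
```

With $T = 4$ and $l = 2$ the convolution yields $T - l + 1 = 3$ values, matching the index range of the pooling step in the text.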
updating the parameters of the generator according to the magnitude of the reward Q, thereby reducing the loss of the generated sentences; and training iteratively so that the model converges when the error of the discriminator is minimized.
Preferably, the loss of the current sentence is obtained by computing the binary cross entropy based on the output distribution of the discriminator, specifically: let p be the probability that output P is in state 1, 1-p the probability that output P is in state 0, q the probability that input Q is in state 1, and 1-q the probability that input Q is in state 0; then the cross entropy of P and Q is
$H(P \mid Q) = -\left(p \log q + (1-p) \log(1-q)\right)$.
Preferably, for the generated sequence, when the generator generates a fake sequence, the cross entropy when the discriminator judges it to be real is
$\mathrm{loss} = -\left(1 \cdot \log D(Y_{1:T}) + 0 \cdot \log\left(1 - D(Y_{1:T})\right)\right)$
$= -\log D(Y_{1:T})$
For discrimination, the discriminator identifies the true source of each sequence. When a sequence is real, the cross entropy when the discriminator judges it to be real is
Figure BDA0002947291720000037
When a sequence is fake, the cross entropy when the discriminator judges it to be fake is
Figure BDA0002947291720000041
The minimum cross entropy is calculated by the following formula:
Figure BDA0002947291720000042
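The binary cross entropy and its two special cases can be checked numerically. The generator loss below follows the $-\log D(Y_{1:T})$ expression given in the text. The discriminator objective is shown only as a formula image, so combining the real and fake terms as a sum of batch means is an assumption, as are the function names and the clipping constant `eps`.

```python
import math


def binary_cross_entropy(p, q, eps=1e-12):
    """H(P|Q) = -(p*log q + (1-p)*log(1-q)), with q clipped away from 0 and 1."""
    q = min(max(q, eps), 1.0 - eps)
    return -(p * math.log(q) + (1.0 - p) * math.log(1.0 - q))


def generator_loss(d_fake):
    """Target 'real' (p = 1) for a generated sequence: -log D(Y)."""
    return binary_cross_entropy(1.0, d_fake)


def discriminator_loss(d_real_scores, d_fake_scores):
    """Real sequences get target 1 and fake sequences target 0 (assumed form)."""
    real = [binary_cross_entropy(1.0, d) for d in d_real_scores]
    fake = [binary_cross_entropy(0.0, d) for d in d_fake_scores]
    return sum(real) / len(real) + sum(fake) / len(fake)


g_loss = generator_loss(0.5)
confident = discriminator_loss([0.9], [0.1])
uncertain = discriminator_loss([0.6], [0.4])
```

A discriminator that separates real from fake more cleanly (`confident`) has a lower cross entropy than one that hovers near 0.5 (`uncertain`).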
Preferably, having the same statistical characteristics means that the proportion of digits or characters in the text is the same.
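One way to check this property is to compare character-class proportions of the source and desensitized text. The patent does not say how characters are classed or how closely the proportions must match, so the three classes, the tolerance `tol` and the function names below are assumptions.

```python
def char_class_proportions(text):
    """Return the share of digits, letters and other characters in `text`."""
    counts = {"digit": 0, "letter": 0, "other": 0}
    for ch in text:
        if ch.isdigit():
            counts["digit"] += 1
        elif ch.isalpha():
            counts["letter"] += 1
        else:
            counts["other"] += 1
    total = len(text) or 1  # avoid division by zero on empty input
    return {name: count / total for name, count in counts.items()}


def same_statistics(source, desensitized, tol=0.05):
    """Compare class proportions within a tolerance (tolerance is an assumption)."""
    a = char_class_proportions(source)
    b = char_class_proportions(desensitized)
    return all(abs(a[name] - b[name]) <= tol for name in a)


proportions = char_class_proportions("ab12")
matching = same_statistics("A1B2", "x9y8")
differing = same_statistics("1234", "abcd")
```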
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
The invention provides a method for desensitizing personal digital space data based on a text generative adversarial network. By training the text generative adversarial network model, the method generates desensitized data with the same statistical characteristics and structure as the parsed file containing sensitive information, thereby desensitizing structured text information and achieving a good text desensitization effect without affecting the structure of the data in the personal digital space.
Drawings
FIG. 1 is a flow chart of the steps for implementing the technical solution of the present invention;
FIG. 2 is a flow chart of the desensitization operation of the text generative adversarial network model in the present invention;
FIG. 3 is a network diagram of the generator in the present invention;
FIG. 4 is a network diagram of the discriminator in the present invention;
FIG. 5 is a schematic structural diagram of the text generative adversarial network model in the present invention;
FIG. 6 is a comparison of data before and after desensitization using Gaussian-distributed random numbers in the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent;
for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product;
it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical solution of the present invention is further described with reference to the drawings and the embodiments.
Example 1
As shown in FIGS. 1-2, a method for desensitizing personal digital space data based on a text generative adversarial network comprises the following steps:
S1: acquiring a data file to be desensitized from a personal digital space;
more specifically, the data file to be desensitized is based on semi-structured medical information data in a distributed database;
constructing a text generative adversarial network model;
more specifically, the text generative adversarial network model comprises a generator and a discriminator;
more specifically, as shown in FIG. 3, the generator generates the sequence using a recurrent neural network;
more specifically, as shown in FIG. 4, the discriminator uses a convolutional neural network to discriminate the sequence generated by the generator;
S2: parsing the data file to be desensitized to obtain a parsed file containing sensitive information; the parsed file is a JSON file;
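As an illustration of step S2, a parsed JSON file can be flattened into (path, text) pairs that serve as source sequences for training. The patent does not describe the schema of the parsed file, so the record layout, the field names and the helper `collect_text_fields` below are all hypothetical.

```python
import json


def collect_text_fields(node, path=""):
    """Flatten a parsed JSON document into (path, string value) pairs."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from collect_text_fields(value, f"{path}/{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from collect_text_fields(value, f"{path}/{index}")
    elif isinstance(node, str):
        yield path, node


# Hypothetical parsed file; the field names are illustrative only.
raw = '{"patient": {"name": "Zhang San", "visits": [{"diagnosis": "hypertension"}]}}'
fields = list(collect_text_fields(json.loads(raw)))
```

Keeping the path alongside each value makes it possible to write generated text back into the same position, so the JSON structure itself is left unchanged.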
S3: inputting the parsed file as source data into the text generative adversarial network model for training;
more specifically, in step S3, the text generative adversarial network model is trained in combination with a Monte Carlo search strategy;
more specifically, as shown in FIG. 5, the specific steps of training the text generative adversarial network model are as follows:
inputting the vectors obtained by encoding the words of the source data into the embedding layer of the recurrent neural network to obtain the embedding-layer vectors $x_1, \dots, x_T$, and outputting the hidden-layer vectors $h_1, \dots, h_T$, where
$h_t = R(h_{t-1}, x_t)$
in which $h_{t-1}$ is the hidden-layer vector of the previous state, and $h_t$ and $x_t$ are the hidden-layer vector and the embedding-layer vector of the current state, respectively; $t \in \{1, \dots, T\}$ is the word-vector index, and $R$ is the RNN;
passing the hidden-layer vector through the softmax layer of the recurrent neural network to obtain the probability distribution of $y_t$ in the sequence $Y_{1:t}$ generated up to the current state:
$p(y_t \mid x_1, \dots, x_t) = \mathrm{softmax}(b + W h_t)$
where $b$ is the bias vector, $W$ is the weight matrix, and $y_t$ is a sequence of length $t$;
the reward Q for the current sentence is denoted as
$Q = D(Y_{1:t})$
To obtain the discriminator's evaluation of a complete sequence, a Monte Carlo search strategy is used to generate the $T - t$ currently unknown words, so that a complete sequence is obtained for evaluation; an N-time Monte Carlo search is denoted as
Figure BDA0002947291720000051
Running the Monte Carlo search strategy yields N output sequences from the current state to the end of the sequence, giving a more accurate reward Q, denoted as
Figure BDA0002947291720000061
For each sequence, the embedding-layer vectors $x_1, \dots, x_T$ are concatenated to represent the current sequence
Figure BDA0002947291720000062
where the operator shown in formula image BDA0002947291720000063 denotes row-wise concatenation;
the sequence vector $d_{1:T}$ is convolved with a convolution kernel $\omega$:
Figure BDA0002947291720000064
where the operator shown in formula image BDA0002947291720000065 denotes element-wise multiplication, p is a non-linear function, and $c_i$ is the output value of the convolutional layer;
after the pooling layer, the vector $c = \max(c_1, \dots, c_{T-l+1})$ is passed through the sigmoid function of the fully connected layer, which outputs the probability that the sequence is judged to be "real", namely the reward Q;
training iteratively with the policy gradient method, and updating the parameters of the generator according to the magnitude of the reward Q, thereby reducing the loss of the generated sentences; through iterative training, the model converges when the error of the discriminator is minimized;
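A policy gradient update can be sketched with the REINFORCE rule: move the parameters in the direction of the gradient of $\log p(y_t)$, scaled by the reward Q. For brevity this sketch updates only a softmax output layer, whereas a full generator would also update its recurrent weights. The function names, learning rate and toy values are assumptions.

```python
import math


def softmax(z):
    """Numerically stable softmax."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def token_probability(W, b, h, token):
    """Probability of `token` under softmax(b + W h)."""
    logits = [bi + sum(w * x for w, x in zip(row, h)) for row, bi in zip(W, b)]
    return softmax(logits)[token]


def reinforce_update(W, b, h, token, reward, lr=0.1):
    """One REINFORCE step: ascend reward * grad log p(token), in place."""
    logits = [bi + sum(w * x for w, x in zip(row, h)) for row, bi in zip(W, b)]
    probs = softmax(logits)
    for k in range(len(W)):
        # d log p(token) / d logit_k = 1[k == token] - p_k
        grad = (1.0 if k == token else 0.0) - probs[k]
        b[k] += lr * reward * grad
        for j in range(len(h)):
            W[k][j] += lr * reward * grad * h[j]


# Toy output layer: 3 tokens, hidden size 2, all weights starting at zero.
W = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0, 0.0]
h = [1.0, 0.5]
before = token_probability(W, b, h, token=0)
reinforce_update(W, b, h, token=0, reward=0.8)
after = token_probability(W, b, h, token=0)
```

A token that earned a high reward from the discriminator becomes more likely after the step, which is how the reward Q steers the generator.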
more specifically, the loss of the current sentence is obtained by computing the binary cross entropy based on the output distribution of the discriminator, specifically: let p be the probability that output P is in state 1, 1-p the probability that output P is in state 0, q the probability that input Q is in state 1, and 1-q the probability that input Q is in state 0; then the cross entropy of P and Q is
$H(P \mid Q) = -\left(p \log q + (1-p) \log(1-q)\right)$;
More specifically, for the generated sequence, when the generator generates a fake sequence, the cross entropy when the discriminator judges it to be real is
$\mathrm{loss} = -\left(1 \cdot \log D(Y_{1:T}) + 0 \cdot \log\left(1 - D(Y_{1:T})\right)\right)$
$= -\log D(Y_{1:T})$
For discrimination, the discriminator identifies the true source of each sequence. When a sequence is real, the cross entropy when the discriminator judges it to be real is
Figure BDA0002947291720000066
When a sequence is fake, the cross entropy when the discriminator judges it to be fake is
Figure BDA0002947291720000067
The minimum cross entropy is calculated by the following formula:
Figure BDA0002947291720000071
In practical implementation, the smaller the cross entropy, the more accurately the discriminator identifies sequences, so a smaller value is better;
S4: judging whether the trained text generative adversarial network model has converged;
if so, desensitized text data with the same statistical characteristics as the source data is obtained; FIG. 6 compares the data before and after desensitization using Gaussian-distributed random numbers;
more specifically, having the same statistical characteristics means that the proportion of digits or characters in the text is the same;
if not, returning to step S3.
Table 1 is a comparison of textual data before and after desensitization by the described method.
TABLE 1
Figure BDA0002947291720000072
Figure BDA0002947291720000081
It should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the claims of the present invention.

Claims (7)

1. A method for desensitizing personal digital space data based on a text generative adversarial network, comprising the following steps:
S1: acquiring a data file to be desensitized from a personal digital space,
and constructing a text generative adversarial network model; the text generative adversarial network model comprises a generator and a discriminator;
S2: parsing the data file to be desensitized to obtain a parsed file containing sensitive information;
S3: inputting the parsed file as source data into the text generative adversarial network model for training;
in step S3, the text generative adversarial network model is trained in combination with a Monte Carlo search strategy;
the specific steps of training the text generative adversarial network model are as follows:
inputting the vectors obtained by encoding the words of the source data into the embedding layer of the recurrent neural network to obtain the embedding-layer vectors $x_1, \dots, x_T$, and outputting the hidden-layer vectors $h_1, \dots, h_T$, where
$h_t = R(h_{t-1}, x_t)$
in which $h_{t-1}$ is the hidden-layer vector of the previous state, and $h_t$ and $x_t$ are the hidden-layer vector and the embedding-layer vector of the current state, respectively; $t \le T$, where $t$ is the word-vector index, and $R$ is the RNN;
passing the hidden-layer vector through the softmax layer of the recurrent neural network to obtain the probability distribution of $y_t$ in the sequence $Y_{1:t}$ generated up to the current state:
$p(y_t \mid x_1, \dots, x_t) = \mathrm{softmax}(b + W h_t)$
where $b$ is the bias vector, $W$ is the weight matrix, and $y_t$ is a sequence of length $t$;
the reward Q for the current sentence is denoted as $Q = D(Y_{1:t})$
An N-time Monte Carlo search is denoted as
Figure FDA0003886793830000011
Running the Monte Carlo search strategy yields N output sequences from the current state to the end of the sequence, giving a more accurate reward Q, denoted as
Figure FDA0003886793830000012
For each sequence, the embedding-layer vectors $x_1, \dots, x_T$ are concatenated to represent the current sequence
Figure FDA0003886793830000023
where the operator shown in formula image FDA0003886793830000024 denotes row-wise concatenation;
the sequence vector $d_{1:T}$ is convolved with a convolution kernel $\omega$:
Figure FDA0003886793830000021
where the operator shown in formula image FDA0003886793830000022 denotes element-wise multiplication, p is a non-linear function, and $c_i$ is the output value of the convolutional layer;
after the pooling layer, the vector $c = \max(c_1, \dots, c_{T-l+1})$ is passed through the sigmoid function of the fully connected layer, which outputs the probability that the sequence is judged to be real, namely the reward Q;
updating the parameters of the generator according to the magnitude of the reward Q, thereby reducing the loss of the generated sentences; through iterative training, the model converges when the error of the discriminator is minimized;
S4: judging whether the trained text generative adversarial network model has converged;
if so, obtaining desensitized text data with the same statistical characteristics as the source data;
if not, returning to step S3.
2. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 1, wherein the data file to be desensitized is based on semi-structured medical information data in a distributed database.
3. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 1, wherein the generator generates sequences using a recurrent neural network.
4. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 1, wherein the discriminator uses a convolutional neural network to discriminate the sequences generated by the generator.
5. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 1, wherein the loss of the current sentence is obtained by computing the binary cross entropy based on the output distribution of the discriminator, specifically: let p be the probability that output P is in state 1, 1-p the probability that output P is in state 0, q the probability that input Q is in state 1, and 1-q the probability that input Q is in state 0; then the cross entropy of P and Q is
$H(P \mid Q) = -\left(p \log q + (1-p) \log(1-q)\right)$.
6. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 5, wherein, for the generated sequence, when the generator generates a fake sequence, the cross entropy when the discriminator judges it to be real is
$\mathrm{loss} = -\left(1 \cdot \log D(Y_{1:T}) + 0 \cdot \log\left(1 - D(Y_{1:T})\right)\right)$
$= -\log D(Y_{1:T})$
For discrimination, the discriminator identifies the true source of each sequence. When a sequence is real, the cross entropy when the discriminator judges it to be real is
Figure FDA0003886793830000031
When a sequence is fake, the cross entropy when the discriminator judges it to be fake is
Figure FDA0003886793830000032
The minimum cross entropy is calculated by the following formula:
Figure FDA0003886793830000033
7. The method for desensitizing personal digital space data based on a text generative adversarial network according to claim 1, wherein having the same statistical characteristics means that the proportion of digits or characters in the text is the same.
CN202110199023.XA 2021-02-22 2021-02-22 Method for desensitizing personal digital spatial data based on text-generated countermeasure network Active CN112861179B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110199023.XA CN112861179B (en) 2021-02-22 2021-02-22 Method for desensitizing personal digital spatial data based on text-generated countermeasure network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110199023.XA CN112861179B (en) 2021-02-22 2021-02-22 Method for desensitizing personal digital spatial data based on text-generated countermeasure network

Publications (2)

Publication Number Publication Date
CN112861179A CN112861179A (en) 2021-05-28
CN112861179B true CN112861179B (en) 2023-04-07

Family

ID=75988569

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110199023.XA Active CN112861179B (en) 2021-02-22 2021-02-22 Method for desensitizing personal digital spatial data based on text-generated countermeasure network

Country Status (1)

Country Link
CN (1) CN112861179B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116910817B (en) * 2023-09-13 2023-12-29 北京国药新创科技发展有限公司 Desensitization processing method and device for medical data and electronic equipment

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190258984A1 (en) * 2018-02-19 2019-08-22 Microsoft Technology Licensing, Llc Generative adversarial networks in predicting sequential data
CN109614611B (en) * 2018-11-28 2021-09-03 中山大学 Emotion analysis method for fusion generation of non-antagonistic network and convolutional neural network
CN111428448B (en) * 2020-03-02 2024-05-07 平安科技(深圳)有限公司 Text generation method, device, computer equipment and readable storage medium
CN111488911B (en) * 2020-03-15 2022-04-19 北京理工大学 Image entity extraction method based on Mask R-CNN and GAN
CN111563275B (en) * 2020-07-14 2020-10-20 中国人民解放军国防科技大学 Data desensitization method based on generation countermeasure network

Also Published As

Publication number Publication date
CN112861179A (en) 2021-05-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant