CN117056951A - Data security management method for digital platform - Google Patents

Data security management method for digital platform

Info

Publication number
CN117056951A
Authority
CN
China
Prior art keywords
data
model
learning
training
privacy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311004874.XA
Other languages
Chinese (zh)
Inventor
郝慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Haoxin Haoyi Intelligent Technology Co ltd
Original Assignee
Shanghai Haoxin Haoyi Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Haoxin Haoyi Intelligent Technology Co ltd filed Critical Shanghai Haoxin Haoyi Intelligent Technology Co ltd
Priority to CN202311004874.XA priority Critical patent/CN117056951A/en
Publication of CN117056951A publication Critical patent/CN117056951A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/60 Protecting data
    • G06F 21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F 21/55 Detecting local intrusion or implementing counter-measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F 21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F 21/577 Assessing vulnerabilities and evaluating computer system security
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/60 Protecting data
    • G06F 21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F 21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F 21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/098 Distributed learning, e.g. federated learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/0985 Hyperparameter optimisation; Meta-learning; Learning-to-learn

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Bioethics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a data security management method of a digital platform, which relates to the technical field of data security.

Description

Data security management method for digital platform
Technical Field
The application relates to the technical field of data security, in particular to a data security management method of a digital platform.
Background
Data security management methods refer to protecting data from the risk of unauthorized access, corruption, leakage, or tampering by employing a range of measures and policies. The goal of data security management is to ensure confidentiality, integrity, and availability of data, and to comply with applicable legal regulations and industry standards.
The data security management method of a digital platform refers to a series of measures and strategies applied to data on the digital platform to ensure its confidentiality, integrity, and availability, and to prevent unauthorized access, tampering, leakage, or damage. Digital platforms include online services, application programs, cloud services, and web and mobile applications. Common measures include identity authentication and access control, data encryption, data backup and disaster recovery, security audit and monitoring, staff training and awareness, network security protection, updates and vulnerability repair, and data classification and access-permission control, so that the platform's data resources are better protected, data security is ensured, and more reliable and secure services are provided to users. As technology continues to develop, data security management methods for digital platforms are also continuously optimized and updated.
However, conventional data security management methods rely on rule engines and signature detection to identify known attack patterns, and therefore cannot effectively identify unknown external attacks; meanwhile, traditional data sharing methods may share original data directly, which can cause leakage of key data. A data security management method for digital platforms is therefore needed to solve these problems.
Disclosure of Invention
(I) Technical problems solved
Aiming at the defects of the prior art, the application provides a data security management method for a digital platform, which solves the problems that the prior art, using rule engines and signature detection to identify known attack patterns, cannot effectively identify unknown external attacks, and that traditional data sharing may directly share original data and cause key data leakage.
(II) Technical solution
In order to achieve the above object, the present application provides a data security management method for a digital platform, which includes:
building a deep learning model, training it with historical data, continuously monitoring network traffic and user behavior in real time, identifying abnormal activities, automatically detecting and raising alarms for anomalies and potential threats, and providing high-precision intrusion detection;
differential privacy data sharing, which adopts differential privacy techniques to encrypt and add noise to sensitive data;
generative adversarial network defense, which introduces a generative adversarial network (GAN) and generates adversarial samples to test and strengthen the security of a traditional machine learning model;
federated learning, which adopts a federated learning method so that multiple data sources train a model locally and share only model parameters instead of the original data;
safety reinforcement learning, which adopts safety reinforcement learning techniques so that the system interacts with the environment and autonomously learns and adjusts its defense strategy;
edge intelligence, namely deploying edge intelligence technology on terminal devices to realize real-time security monitoring and processing;
interpretable AI, which uses an interpretable artificial intelligence model to interpret and visualize the model's decision process;
automatic vulnerability repair, which uses machine learning techniques to automatically detect vulnerabilities in the system and generate repair strategies in real time.
The application is further arranged to: the specific steps of building the deep learning model and training it are as follows:
collecting historical data of network traffic and user behaviors as a training data set, and performing data cleaning, feature extraction and label marking;
in the intrusion detection task, a deep learning algorithm is selected from among convolutional neural networks (CNN), recurrent neural networks (RNN), long short-term memory networks (LSTM), and Transformers; a model architecture comprising an input layer, hidden layers, and an output layer is constructed, and an activation function, a loss function, and an optimization algorithm are set;
dividing the data set into a training set, a verification set and a test set, wherein the training set is used for model training, the verification set is used for adjusting super parameters and avoiding overfitting, and the test set is used for evaluating model performance;
training the deep learning model with the training set, iteratively optimizing the model parameters to minimize the loss function, and selecting gradient descent or one of its variants, such as Adam or RMSprop, as the optimization algorithm;
according to performance on the verification set, the model's hyperparameters, including the learning rate, regularization coefficient, and number of hidden-layer nodes, are adjusted to optimize the model's performance and generalization capability;
evaluating the trained deep learning model with the test set and calculating performance indicators including accuracy, recall, and F1 score;
deploying the trained deep learning model on a digital platform, continuously monitoring network flow and user behavior, inputting data samples into the model in real time for prediction, identifying abnormal activities and triggering corresponding response measures;
the application is further arranged to: the deep learning model is built by a convolutional neural network:
input: x is X
Hidden layer: h = f(W·X + b)
Output layer: y = g(V·h + c)
where X is the feature vector of a data sample, W and V are weight matrices, b and c are bias vectors, and f and g are activation functions;
Loss function definition:
L(y, y')
where y is the actual label and y' is the predicted label;
Algorithm optimization and parameter update rule:
θ ← θ − α·∇θL(y, y')
where θ denotes the model parameters (W, b, V, c), α is the learning rate, and ∇θL is the gradient vector;
the application is further arranged to: the step of sharing the differential privacy data specifically comprises the following steps:
adding noise to the preprocessed data, selecting Laplacian noise for the noise-adding processing, wherein the specific noise-adding formula is:
noisy_data = data + Lap(sensitivity/ε)
wherein Lap(sensitivity/ε) denotes noise drawn from the Laplace distribution, ε is the privacy budget, and sensitivity is the sensitivity of the query;
setting privacy budget epsilon of differential privacy, and sharing the encrypted and noisy data to authorized data users;
when the inquiry of the data user is received, decrypting and processing the encrypted and noisy data, and then returning a response result;
performing privacy protection analysis, evaluating the effect of the differential privacy technology, and ensuring that the shared data meets the privacy protection requirement;
the application is further arranged to: the privacy preserving analysis step further includes:
for shared sensitive data, calculating the sensitivity thereof;
determining the size of noise according to the setting of privacy budget epsilon;
the mathematical definition of differential privacy is used to evaluate the privacy preserving effect of shared data, specifically:
for any adjacent data sets D and D', and any query Q, the following conditions are satisfied for all possible query results S:
Pr[Q(D)∈S] ≤ exp(ε)·Pr[Q(D')∈S]
where ε represents the privacy budget, Q(D) represents the result of query Q on dataset D, and exp(ε) is the exponential of the privacy budget;
evaluating privacy preserving effects of the shared data using the differential privacy distortion;
after privacy protection processing, performance evaluation is carried out on the shared data, wherein the performance evaluation comprises model accuracy, data availability and query response time;
according to the result of privacy protection analysis, adjusting parameters in the differential privacy technology;
the application is further arranged to: the step of introducing the generated challenge network to test and strengthen the security of the traditional machine learning model specifically comprises the following steps:
adopting a generator network and a discriminator network, and preparing a data set for training the GAN, the data set comprising real data and noise data; training the GAN using the real data and the noise data, with the generator network attempting to generate samples that approximate the real data and the discriminator network attempting to distinguish the real data from the data generated by the generator;
generating adversarial samples using the trained generator network;
testing the traditional machine learning model with the generated adversarial samples: feeding the adversarial samples into the traditional model as input and observing the model's output;
according to the test results of the traditional model, selectively improving the adversarial defense method:
adversarial training: mixing the generated adversarial samples with the original training data, and retraining the traditional model;
the application is further arranged to: the local training steps by adopting the federal learning method specifically comprise:
respectively collecting a plurality of data sources which need to participate in federal learning;
randomly initializing parameters of a federal learning model before federal learning is started;
in each federal learning iteration, the data source sequence is specifically:
each data source locally trains a model using local data;
after the local training is finished, each data source uploads model parameters obtained by the local training to a central server;
the central server aggregates the collected model parameters, and sends the aggregated model parameters back to each data source by the central server to update the respective local model parameters;
repeating the federated learning iterations until the model converges;
the parameter aggregation formula in the federated learning process is:
ω_avg = (1/N)·Σ_{i=1}^{N} ω_i
where ω_avg is the average of the parameters, N is the number of data sources, and ω_i is the local model parameter of the i-th data source;
the application is further arranged to: the safety reinforcement learning step specifically includes:
in safety reinforcement learning, the specific steps by which the system interacts with the environment, autonomously learns, and adjusts the defense strategy are as follows:
modeling a system environment, including abstracting the system operating environment, a network structure and an attacker behavior into a mathematical model;
defining a reward function for evaluating the performance of the system in different states;
adopting a Q-learning algorithm, and connecting the built reinforcement learning model with the defense module of the system, so that the system can interact with the environment;
learning and optimizing according to the reinforcement learning algorithm, and continuously interacting with the environment and learning;
the updating rule formula of the reinforcement learning algorithm in reinforcement learning is as follows:
Q(s,a) = Q(s,a) + α·(r + γ·max_{a'} Q(s',a') - Q(s,a))
where Q(s,a) represents the expected return for performing action a in state s, α is the learning rate, r is the reward obtained after performing action a in state s, γ is the discount factor, s' is the new state after performing action a, and a' is the optimal action selected in the new state s'.
The application also provides a terminal device, comprising a memory, a processor, and a control program of the data security management method of the digital platform stored in the memory and executable on the processor; when executed by the processor, the control program implements the data security management method of the digital platform described above;
the application also provides a storage medium which is applied to a computer, wherein the storage medium is stored with a control program of the data security management method of the digital platform, and the control program of the data security management method of the digital platform realizes the data security management method of the digital platform when being executed by the processor.
(III) Beneficial effects
The application provides a data security management method for a digital platform, with the following beneficial effects:
the data security management method of the digital platform provided by the application uses a deep learning model to perform intrusion detection, gathers historical data of network traffic and user behaviors as a training data set, builds a model framework in an intrusion detection task, comprises an input layer, a hidden layer and an output layer, sets an activation function, a loss function and an optimization algorithm, divides the data set into a training set, a verification set and a test set for training, super-parameter adjustment and evaluation of the model, uses the training set to train the deep learning model, optimizes model parameters through iteration, minimizes the loss function, optimizes by adopting a gradient descent method, and adjusts super-parameters of the model including learning rate, regularization coefficient and hidden layer node number according to the performance of the verification set so as to improve the performance and generalization capability of the model.
For real-time intrusion, the trained deep learning model is deployed on the digital platform, network traffic and user behavior are continuously monitored, data samples are fed into the model in real time for prediction, abnormal activities are identified, and corresponding response measures are triggered.
For private data, differential privacy techniques are used to encrypt and add noise to sensitive data, protecting data privacy while allowing authorized data users to obtain limited, irreversible insights; the encrypted and noise-added data is shared with authorized data users according to the privacy budget.
A generative adversarial network is introduced to generate adversarial samples that test and strengthen the security of a traditional machine learning model, enhancing the model's security through adversarial training and improved defense strategies. A federated learning method performs model training on local devices and shares only model parameters instead of original data, reducing the risk of data leakage and improving data security. Meanwhile, safety reinforcement learning techniques enable the system to interact with the environment and autonomously learn and adjust its defense strategy to adapt to constantly changing security threats.
The method solves the problems that the prior art, which identifies known attack patterns using rule engines and signature detection, cannot effectively identify unknown external attacks, and that traditional data sharing may directly share original data and cause key data leakage.
Drawings
Fig. 1 is a flowchart of a data security management method of a digital platform according to the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Examples
Referring to fig. 1, the present application provides a data security management method for a digital platform, which includes the following steps:
s1, constructing a deep learning model, training by using historical data, continuously monitoring network flow and user behaviors, identifying abnormal activities, monitoring the network flow and the user behaviors in real time, automatically detecting and alarming abnormality and potential threat, and providing high-precision intrusion detection;
the specific steps of building the deep learning model and training it are as follows:
collecting historical data of network traffic and user behaviors as a training data set, and performing data cleaning, feature extraction and label marking;
in the intrusion detection task, a deep learning algorithm is selected from among convolutional neural networks (CNN), recurrent neural networks (RNN), long short-term memory networks (LSTM), and Transformers; a model architecture comprising an input layer, hidden layers, and an output layer is constructed, and an activation function, a loss function, and an optimization algorithm are set;
dividing the data set into a training set, a verification set and a test set, wherein the training set is used for model training, the verification set is used for adjusting super parameters and avoiding overfitting, and the test set is used for evaluating model performance;
training the deep learning model with the training set, iteratively optimizing the model parameters to minimize the loss function, and selecting gradient descent or one of its variants, such as Adam or RMSprop, as the optimization algorithm;
according to performance on the verification set, the model's hyperparameters, including the learning rate, regularization coefficient, and number of hidden-layer nodes, are adjusted to optimize the model's performance and generalization capability;
evaluating the trained deep learning model with the test set and calculating performance indicators including accuracy, recall, and F1 score;
deploying the trained deep learning model on a digital platform, continuously monitoring network flow and user behavior, inputting data samples into the model in real time for prediction, identifying abnormal activities and triggering corresponding response measures;
the specific implementation process is as follows:
the deep learning model is built by a convolutional neural network:
input: x is X
Hidden layer: h = f(W·X + b)
Output layer: y = g(V·h + c)
where X is the feature vector of a data sample, W and V are weight matrices, b and c are bias vectors, and f and g are activation functions;
Loss function definition:
L(y, y')
where y is the actual label and y' is the predicted label;
Algorithm optimization and parameter update rule:
θ ← θ − α·∇θL(y, y')
where θ denotes the model parameters (W, b, V, c), α is the learning rate, and ∇θL is the gradient vector;
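The model, loss, and update rule above can be sketched end to end as follows. This is a minimal illustration only: the toy "traffic feature" data, the layer sizes, the choice of sigmoid for both activation functions f and g, and binary cross-entropy as L(y, y') are all assumptions for the example, not choices fixed by the application.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy stand-in for the training data set: 8 four-dimensional "traffic feature"
# vectors with binary normal/abnormal labels (illustrative, not real traffic).
X = rng.normal(size=(8, 4))
y_true = (X[:, 0] + X[:, 1] > 0).astype(float).reshape(-1, 1)

# Parameters of h = f(W·X + b) and y = g(V·h + c); f = g = sigmoid here.
W = rng.normal(scale=0.5, size=(4, 3)); b = np.zeros(3)
V = rng.normal(scale=0.5, size=(3, 1)); c = np.zeros(1)
alpha = 0.5  # learning rate α

for _ in range(500):
    # Forward pass.
    h = sigmoid(X @ W + b)
    y_pred = sigmoid(h @ V + c)
    # Gradients of the binary cross-entropy loss L(y, y') via backpropagation.
    grad_out = (y_pred - y_true) / len(X)
    grad_V, grad_c = h.T @ grad_out, grad_out.sum(axis=0)
    grad_h = grad_out @ V.T * h * (1.0 - h)
    grad_W, grad_b = X.T @ grad_h, grad_h.sum(axis=0)
    # Gradient-descent update: each parameter moves against its gradient, scaled by α.
    W -= alpha * grad_W; b -= alpha * grad_b
    V -= alpha * grad_V; c -= alpha * grad_c

train_accuracy = ((y_pred > 0.5) == y_true.astype(bool)).mean()
```

In a deployment as described above, the same forward pass would score live traffic samples, with an alarm raised when the predicted probability crosses a threshold.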
s2, differential privacy data sharing, namely encrypting and denoising sensitive data by adopting a differential privacy technology;
the step of sharing the differential privacy data specifically comprises the following steps:
adding noise to the preprocessed data, selecting Laplacian noise for the noise-adding processing, wherein the specific noise-adding formula is:
noisy_data = data + Lap(sensitivity/ε)
wherein Lap(sensitivity/ε) denotes noise drawn from the Laplace distribution, ε is the privacy budget, and sensitivity is the sensitivity of the query;
setting privacy budget epsilon of differential privacy, and sharing the encrypted and noisy data to authorized data users;
when the inquiry of the data user is received, decrypting and processing the encrypted and noisy data, and then returning a response result;
performing privacy protection analysis, evaluating the effect of the differential privacy technology, and ensuring that the shared data meets the privacy protection requirement;
the privacy protection analysis step specifically includes:
for shared sensitive data, calculating the sensitivity thereof;
determining the size of noise according to the setting of privacy budget epsilon;
the mathematical definition of differential privacy is used to evaluate the privacy preserving effect of shared data, specifically:
for any adjacent data sets D and D', and any query Q, the following conditions are satisfied for all possible query results S:
Pr[Q(D)∈S] ≤ exp(ε)·Pr[Q(D')∈S]
where ε represents the privacy budget, Q(D) represents the result of query Q on dataset D, and exp(ε) is the exponential of the privacy budget;
evaluating privacy preserving effects of the shared data using the differential privacy distortion;
after privacy protection processing, performance evaluation is carried out on the shared data, wherein the performance evaluation comprises model accuracy, data availability and query response time;
according to the results of the privacy protection analysis, adjusting the parameters of the differential privacy technique, namely the privacy budget ε and the noise magnitude, to balance privacy protection and data accuracy;
After differential privacy processing, the shared data remains usable and effective while privacy is protected. By setting the privacy budget and noise magnitude and performing performance evaluation and parameter optimization, the shared data achieves a better privacy-protection effect under the differential privacy technique;
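The Laplace noise-adding step described in this section can be sketched as follows. The query (a count), its sensitivity of 1, and the privacy budget ε = 1.0 are illustrative assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(42)

def laplace_mechanism(true_value, sensitivity, epsilon):
    # noisy = value + Lap(sensitivity / ε); a larger ε (weaker privacy
    # guarantee) means a smaller noise scale and a more accurate release.
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Illustrative query: a count whose sensitivity is 1, since adding or
# removing one individual changes the count by at most 1.
true_count = 120
noisy_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=1.0)

# The noise is zero-mean: many hypothetical releases average near the truth,
# while any single released value still hides individual contributions.
mean_of_releases = float(np.mean(
    [laplace_mechanism(true_count, 1.0, 1.0) for _ in range(10_000)]
))
```

Tightening ε (e.g. 0.1 instead of 1.0) increases the noise scale tenfold, which is exactly the privacy/accuracy trade-off that the parameter-adjustment step above tunes.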
s3, generating countermeasure network defense, introducing a generated countermeasure network, and generating a countermeasure sample to test and strengthen the safety of the traditional machine learning model; the method has the advantages that the resistance attack is effectively detected and defended, and the robustness and safety of the model are improved;
the step of introducing the generative adversarial network to test and strengthen the security of the traditional machine learning model specifically comprises:
adopting a generator network and a discriminator network, and preparing a data set for training the GAN, the data set comprising real data and noise data; training the GAN using the real data and the noise data, with the generator network attempting to generate samples that approximate the real data and the discriminator network attempting to distinguish the real data from the data generated by the generator;
generating adversarial samples using the trained generator network; an adversarial sample is a sample obtained by applying a small perturbation to an original input sample;
testing the traditional machine learning model with the generated adversarial samples: feeding the adversarial samples into the traditional model as input and observing the model's output; if the model performs poorly on the adversarial samples, this may indicate that the model is not robust against adversarial attacks;
according to the test results of the traditional model, selectively improving the adversarial defense method:
adversarial training: mixing the generated adversarial samples with the original training data, and retraining the traditional model;
through repeated iterative training, the generator network gradually learns to generate samples close to the real data, and the discriminator network gradually improves its ability to distinguish real data from generated data; by generating adversarial samples with the generative adversarial network, the robustness of the traditional machine learning model is evaluated and enhanced, improving the model's performance against unknown attacks and adversarial samples;
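The adversarial-sample test in this step can be sketched as follows. Training a full GAN is out of scope for a short example, so as a stand-in the perturbed samples come from a small FGSM-style sign perturbation rather than a trained generator network; the linear "traditional model", the synthetic data, and the perturbation budget eps are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "traditional model": a fixed linear classifier predicting 1 iff w·x > 0.
w = np.array([1.0, -1.0, 0.5])

def predict(samples):
    return (samples @ w > 0).astype(int)

# Clean evaluation data, labeled by the same rule, so clean accuracy is 1.0.
X = rng.normal(size=(200, 3))
y = predict(X)
clean_acc = (predict(X) == y).mean()

# Adversarial samples: each input is shifted by a small perturbation
# eps·sign(w) in the direction that lowers the model's score for class-1
# inputs and raises it for class-0 inputs (a sign-gradient attack).
eps = 0.3
direction = np.where(y[:, None] == 1, -1.0, 1.0)
X_adv = X + eps * direction * np.sign(w)
adv_acc = (predict(X_adv) == y).mean()
```

The accuracy drop from `clean_acc` to `adv_acc` is the robustness signal the step describes: a large drop indicates the model needs the adversarial-training remedy above.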
s4, federation learning, namely locally training a model for a plurality of data sources by adopting a federation learning method, wherein only model parameters are shared instead of original data;
model training is carried out under the condition of not centralizing data, so that the risk of data leakage is reduced, and better model generalization capability is realized;
the local training steps of the federated learning method specifically comprise:
collecting the multiple data sources that need to participate in federated learning; each data source locally performs preprocessing, feature extraction, and label marking on its own data, ensuring data consistency and usability;
randomly initializing the parameters of the federated learning model before federated learning starts;
in each federated learning iteration, each data source proceeds as follows:
each data source locally trains a model using local data;
after the local training is finished, each data source uploads model parameters obtained by the local training to a central server;
the central server aggregates the collected model parameters, and sends the aggregated model parameters back to each data source by the central server to update the respective local model parameters;
repeating the federated learning iterations until the model converges;
the parameter aggregation formula in the federated learning process is:
ω_avg = (1/N)·Σ_{i=1}^{N} ω_i
where ω_avg is the average of the parameters, N is the number of data sources, and ω_i is the local model parameter of the i-th data source;
Federated learning allows multiple data sources to train a model locally, sharing only model parameters and not the original data, thereby protecting user privacy and data security. Through federated learning, different data sources can jointly train a global model without concentrating the data in one place.
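The local-training-then-averaging round described above can be sketched as follows. The synthetic per-source data and the closed-form least-squares fit standing in for each source's local training are illustrative assumptions; only the aggregation step ω_avg = (1/N)·Σ ω_i is taken directly from the text.

```python
import numpy as np

rng = np.random.default_rng(7)
true_w = np.array([2.0, -1.0])  # ground truth shared across sources

def local_train(n_samples):
    # One data source: fit parameters on its own private data with least
    # squares; only the fitted parameters ω_i ever leave the device.
    X = rng.normal(size=(n_samples, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=n_samples)
    w_i, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w_i

# N = 5 sources train locally; the central server sees only the parameters.
local_params = [local_train(50) for _ in range(5)]

# Server-side aggregation: ω_avg = (1/N)·Σ ω_i, then ω_avg is sent back
# to every source to update its local model for the next iteration.
w_avg = np.mean(local_params, axis=0)
```

Repeating this round until the parameters stop changing is the convergence loop described in the steps above.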
s5, safety reinforcement learning, namely enabling the system to interact with the environment, and autonomously learning and adjusting a defense strategy by adopting a safety reinforcement learning technology; the adaptability of the system to constantly-changing security threats is improved, and the security defense effect is enhanced;
in safety reinforcement learning, the specific steps by which the system interacts with the environment, autonomously learns, and adjusts the defense strategy are as follows:
modeling a system environment, including abstracting the system operating environment, a network structure and an attacker behavior into a mathematical model;
defining a reward function for evaluating the performance of the system in different states;
adopting a Q-learning algorithm to connect the built reinforcement learning model with a defending part of the system, so that the system can interact with the environment;
learning and optimizing according to the reinforcement learning algorithm, and continuously interacting with the environment and learning;
the updating rule formula of the Q-learning algorithm in reinforcement learning is:
Q(s,a) = Q(s,a) + α*(r + γ*max_{a'} Q(s',a') - Q(s,a))
wherein Q(s,a) represents the expected return of executing action a in state s, α is the learning rate, r is the reward obtained after executing action a in state s, γ is the discount factor, s' is the new state after executing action a, and a' is the action with the highest Q value in the new state s'; this updating rule enables the system to continuously optimize the Q value according to feedback from the environment, thereby finding the optimal defense strategy; safety reinforcement learning autonomously learns and adjusts the defense strategy while constantly interacting with the environment, improving the adaptability of the system to security threats and enhancing system security;
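The update rule above can be exercised on a toy defense problem; the states, actions, rewards and transition model below are illustrative placeholders, not part of the method.

```python
import random

# Toy defense problem: states are threat levels, actions are defense moves.
STATES = ["normal", "under_attack"]
ACTIONS = ["monitor", "block"]
Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}

def q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))"""
    best_next = max(Q[(s_next, a2)] for a2 in ACTIONS)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

def reward(s, a):
    """Illustrative reward: blocking pays off during an attack, costs otherwise."""
    if s == "under_attack":
        return 1.0 if a == "block" else -1.0
    return 0.1 if a == "monitor" else -0.1

random.seed(0)
s = "normal"
for _ in range(500):
    a = random.choice(ACTIONS)       # pure exploration, for brevity
    r = reward(s, a)
    s_next = random.choice(STATES)   # toy environment transition
    q_update(Q, s, a, r, s_next)
    s = s_next
```

After enough interaction, the Q table prefers "block" in the attacked state, which is exactly the sense in which the rule "finds the optimal defense strategy".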
s6, edge intelligence, namely deploying an edge intelligence technology on the terminal equipment to realize real-time safety monitoring and processing;
the dependence on a central server is reduced, and the data security and the instant response are enhanced;
the specific steps of deploying the edge intelligent technology on the terminal equipment comprise:
adopting a lightweight edge intelligence technique based on a deep learning model, collecting the data required for security monitoring on the terminal device, including sensor data and log data, and transmitting the collected data to an edge node;
deploying a lightweight intelligent algorithm based on a deep learning model on the edge node to detect and analyze security events, the algorithm comprising real-time data processing and a security monitoring model;
real-time security monitoring and processing are carried out on the edge nodes, collected data are analyzed and processed, possible security threats are detected, and corresponding response measures are triggered;
when the security threat is detected, the edge node triggers a security response mechanism, sends an alarm and blocks attack flow;
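The edge-node flow above (collect readings, detect a threat locally, trigger a response without contacting the central server) can be sketched minimally; the z-score threshold detector stands in for the lightweight model, and all values are illustrative.

```python
from statistics import mean, stdev

def detect_threat(readings, new_value, k=3.0):
    """Flag new_value as anomalous if it lies more than k sample standard
    deviations from the mean of recent baseline readings."""
    mu, sigma = mean(readings), stdev(readings)
    return sigma > 0 and abs(new_value - mu) > k * sigma

alerts = []
def respond(value):
    """Edge-local response: raise an alert (a real deployment would also
    block the offending traffic), with no round-trip to a central server."""
    alerts.append(f"ALERT: anomalous reading {value}")

baseline = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.3, 10.1]
for sample in [10.0, 10.2, 55.0]:       # 55.0 simulates attack traffic
    if detect_threat(baseline, sample):
        respond(sample)
```

Keeping both detection and response on the edge node is what reduces the dependence on the central server and gives the instant response the text describes.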
s7, interpretable AI, wherein an interpretable artificial intelligence model is used to interpret and visualize the model decision process, helping the security team understand the behavior of the model and rapidly identify abnormal conditions and security events;
the method for interpreting and visualizing the model decision process by using the interpretable artificial intelligence model comprises the following specific steps:
training an interpretability model by using the preprocessed data;
interpreting the decision process of the interpretable model through feature importance analysis, local interpretation and global interpretation;
in the local interpretation, a LIME method is used for constructing a local linear model to interpret the prediction result of the model on a specific sample;
in global interpretation, calculating contribution of features to model prediction results by adopting a SHAP method;
visualizing the interpreted result;
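The local-interpretation (LIME) step above can be illustrated in miniature: perturb the sample, weight the perturbations by proximity, and fit a local linear surrogate whose coefficients serve as feature attributions. The black-box model, kernel width and sample count below are illustrative stand-ins, not the patent's models.

```python
import numpy as np

def black_box(X):
    """Stand-in model to be explained: depends mostly on feature 0."""
    return 3.0 * X[:, 0] + 0.2 * X[:, 1]

def lime_explain(model, x, n_samples=500, width=1.0, seed=0):
    """LIME-style local explanation: fit a proximity-weighted linear
    surrogate around x and return its coefficients as attributions."""
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(scale=0.5, size=(n_samples, x.size))  # perturbations
    y = model(Z)
    d = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(d ** 2) / width ** 2)                       # proximity kernel
    A = np.hstack([Z, np.ones((n_samples, 1))])              # add intercept
    W = np.diag(w)
    coef, *_ = np.linalg.lstsq(W @ A, w * y, rcond=None)     # weighted system
    return coef[:-1]                                         # drop intercept

attrib = lime_explain(black_box, np.array([1.0, 1.0]))
```

Because the stand-in model is linear, the surrogate recovers its coefficients exactly; for a real model the coefficients describe behavior only near the explained sample.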
s8, automatic vulnerability repair, automatically detecting vulnerabilities in the system by using a machine learning technology and generating a repair strategy in real time, thereby improving system security, automatically repairing vulnerabilities and reducing security risk;
adopting a logistic regression model, the specific steps of automatically detecting vulnerabilities in the system by using a machine learning technology and generating a repair strategy in real time are as follows:
dividing the data set into a training set and a testing set for training and evaluating the model;
training the selected machine learning model using the training set;
evaluating the trained model by using a test set, and evaluating the accuracy and performance of the model;
when the system is running, detecting vulnerabilities in the system in real time: predicting on the data collected in real time using the trained machine learning model to judge whether a vulnerability exists, and if a vulnerability is detected, generating a corresponding repair strategy according to the prediction result of the model and the characteristics of the vulnerability;
applying the generated repair strategy to the system to repair the detected vulnerability;
the predictive formula of the logistic regression machine learning model is:
y = 1 / (1 + e^(-z))
where y is the predicted output of the model and z = w·x + b is the linear combination of the input data, with w the weight vector and b the bias learned during training.
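A minimal sketch of the prediction step above; the feature names, weights and decision threshold are illustrative assumptions, since the text does not fix them.

```python
import math

def predict_vulnerability(x, w, b, threshold=0.5):
    """Logistic regression prediction: y = 1 / (1 + e^(-z)), z = w.x + b.
    Returns (probability, flagged) where flagged means y exceeds threshold."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    y = 1.0 / (1.0 + math.exp(-z))
    return y, y > threshold

# Illustrative features: [unpatched_services, failed_logins, open_ports]
w = [1.2, 0.8, 0.5]   # assumed to come from training on the labelled set
b = -2.0
prob, flagged = predict_vulnerability([2.0, 1.0, 1.0], w, b)
```

In the described pipeline, a `flagged` result would trigger generation of the corresponding repair strategy.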
In the present application, in combination with the above:
the data security management method of the digital platform provided by the application uses a deep learning model to perform intrusion detection, gathers historical data of network traffic and user behaviors as a training data set, builds a model framework in an intrusion detection task, comprises an input layer, a hidden layer and an output layer, sets an activation function, a loss function and an optimization algorithm, divides the data set into a training set, a verification set and a test set for training, super-parameter adjustment and evaluation of the model, uses the training set to train the deep learning model, optimizes model parameters through iteration, minimizes the loss function, optimizes by adopting a gradient descent method, and adjusts super-parameters of the model including learning rate, regularization coefficient and hidden layer node number according to the performance of the verification set so as to improve the performance and generalization capability of the model.
For real-time intrusion, the trained deep learning model is deployed on the digital platform; network traffic and user behaviors are continuously monitored, data samples are input into the model in real time for prediction, abnormal activities are identified, and corresponding response measures are triggered.
For private data, the differential privacy technique is used to encrypt and add noise to sensitive data, protecting data privacy while allowing authorized data users to obtain limited, irreversible insight; the encrypted and noise-added data is shared with authorized data users according to the privacy budget.
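The noise-adding step above is the Laplace mechanism: noise with scale sensitivity/ε is added before sharing, so a tighter privacy budget (smaller ε) yields more noise. A minimal sketch, with illustrative sensitivity and budget values:

```python
import numpy as np

def privatize(values, sensitivity, epsilon, seed=None):
    """Laplace mechanism: add Lap(0, sensitivity/epsilon) noise to each
    value; smaller epsilon (tighter privacy budget) means more noise."""
    rng = np.random.default_rng(seed)
    scale = sensitivity / epsilon
    return np.asarray(values, dtype=float) + rng.laplace(0.0, scale, size=len(values))

# Share noisy counts with privacy budget epsilon = 0.5; a counting query
# changes by at most 1 when one record changes, so sensitivity = 1.
shared = privatize([120.0, 87.0, 954.0], sensitivity=1.0, epsilon=0.5, seed=42)
```

The shared values remain useful in aggregate but no longer reveal the exact underlying counts, which is the "limited, irreversible insight" the text refers to.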
The method further introduces a generative adversarial network to generate adversarial samples for testing and strengthening the security of the traditional machine learning model, enhancing model security through adversarial training and improved defense strategies; model training is performed on local devices using the federated learning method, sharing only model parameters rather than the original data so as to reduce the risk of data leakage and improve data security; and a safety reinforcement learning technique is adopted so that the system interacts with the environment, autonomously learning and adjusting the defense strategy to adapt to constantly-changing security threats.
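The adversarial-testing idea above (craft inputs that fool a trained model, then retrain on them) can be illustrated compactly. The patent generates such samples with a GAN; the sketch below substitutes a simpler gradient-sign (FGSM-style) perturbation against a logistic model, so this is an illustration of adversarial sample testing, not the patented GAN construction, and all names and values are assumptions.

```python
import numpy as np

def predict(x, w, b):
    """Probability from a logistic model (the 'traditional' model under test)."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

def adversarial_sample(x, w, b, y_true, eps=0.6):
    """FGSM-style attack: step each feature in the direction that increases
    the loss, x' = x + eps * sign(dL/dx).  For logistic loss,
    dL/dx = (p - y) * w, so only its sign is needed."""
    p = predict(x, w, b)
    return x + eps * np.sign((p - y_true) * w)

w, b = np.array([2.0, -1.0]), 0.0
x = np.array([1.0, 0.5])            # originally classified positive
x_adv = adversarial_sample(x, w, b, y_true=1.0)
```

Mixing such `x_adv` samples back into the training data is the adversarial-training step the text describes for hardening the model.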
In the description of the embodiments of the present application, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the application, the scope of which is defined in the appended claims and their equivalents.

Claims (10)

1. A method for data security management of a digital platform, the method comprising:
building a deep learning model, training it with historical data, continuously monitoring network traffic and user behaviors in real time, identifying abnormal activities, automatically detecting and alarming on anomalies and potential threats, and providing high-precision intrusion detection;
differential privacy data sharing, which adopts differential privacy technology to encrypt and noise sensitive data;
generative adversarial network defense, introducing a generative adversarial network and generating adversarial samples to test and strengthen the security of a traditional machine learning model;
federated learning, which adopts a federated learning method so that a plurality of data sources train a model locally and share only model parameters instead of the original data;
safety reinforcement learning, which adopts a safety reinforcement learning technology to enable a system to interact with the environment, autonomously learn and adjust a defense strategy;
edge intelligence, namely deploying an edge intelligence technology on terminal equipment to realize real-time safety monitoring and processing;
an interpretable AI, which interprets and visualizes the model decision process using an interpretable artificial intelligence model;
automatic bug repair, automatically detecting bugs in the system by using a machine learning technology, and generating a repair strategy in real time.
2. The method for data security management of a digital platform according to claim 1, wherein the specific steps of constructing the deep learning model and training are as follows:
collecting historical data of network traffic and user behaviors as a training data set, and performing data cleaning, feature extraction and label marking;
in the intrusion detection task, the selected deep learning algorithms comprise a convolutional neural network (CNN), a recurrent neural network (RNN), a long short-term memory network (LSTM) and a transformer; a model architecture comprising an input layer, a hidden layer and an output layer is constructed, and an activation function, a loss function and an optimization algorithm are set;
dividing the data set into a training set, a verification set and a test set, wherein the training set is used for model training, the verification set is used for adjusting hyperparameters and avoiding overfitting, and the test set is used for evaluating model performance;
training the deep learning model using the training set, iterating and optimizing the model parameters to minimize the loss function, wherein the optimization algorithm uses gradient descent and variants thereof, including Adam and RMSprop;
adjusting the hyperparameters of the model according to performance on the verification set, including the learning rate, regularization coefficient and number of hidden layer nodes, to optimize the performance and generalization capability of the model;
evaluating the trained deep learning model by using a test set, and calculating performance indexes including accuracy, recall and F1 value;
and deploying the trained deep learning model on a digital platform, continuously monitoring network flow and user behaviors, inputting data samples into the model in real time for prediction, identifying abnormal activities and triggering corresponding response measures.
3. The method for data security management of a digital platform according to claim 2, wherein the deep learning model is built by a convolutional neural network:
input: X
hidden layer: h = f(W·X + b)
output layer: y = g(V·h + c)
wherein X is the feature vector of a data sample, W and V are weight matrices, b and c are bias vectors, f is the activation function of the hidden layer, and g is the activation function of the output layer;
loss function definition:
loss function L(y, y')
wherein y is the actual label and y' is the predicted label;
algorithm optimization and parameter updating rule:
θ = θ - α·∇L(θ)
where θ denotes the model parameters, α is the learning rate, and ∇L(θ) is the gradient vector.
4. The method for data security management of a digital platform according to claim 1, wherein the step of sharing the differential privacy data specifically comprises:
adding noise to the preprocessed data, selecting Laplacian noise for the noise-adding processing, wherein the specific noise-adding formula is:
noisy_data = data + Lap(sensitivity/ε)
wherein Lap(sensitivity/ε) denotes noise drawn from the Laplace distribution with scale sensitivity/ε, ε is the privacy budget, and sensitivity is the sensitivity;
setting privacy budget epsilon of differential privacy, and sharing the encrypted and noisy data to authorized data users;
when the inquiry of the data user is received, decrypting and processing the encrypted and noisy data, and then returning a response result;
and carrying out privacy protection analysis, evaluating the effect of the differential privacy technology, and ensuring that the shared data meets the privacy protection requirement.
5. The method for data security management of a digital platform according to claim 1, wherein the privacy preserving analyzing step further comprises:
for shared sensitive data, calculating the sensitivity thereof;
determining the size of noise according to the setting of privacy budget epsilon;
the mathematical definition of differential privacy is used to evaluate the privacy preserving effect of shared data, specifically:
for any adjacent data sets D and D' and any query Q, the following condition is satisfied for all possible query results S:
Pr[Q(D) ∈ S] ≤ exp(ε) * Pr[Q(D') ∈ S]
where ε represents the privacy budget, Q(D) represents the result of query Q on dataset D, and exp(ε) denotes e raised to the privacy budget;
evaluating privacy preserving effects of the shared data using the differential privacy distortion;
after privacy protection processing, performance evaluation is carried out on the shared data, wherein the performance evaluation comprises model accuracy, data availability and query response time;
and adjusting parameters in the differential privacy technology according to the result of the privacy protection analysis.
6. The method for data security management of a digital platform according to claim 1, wherein the step of introducing a generative adversarial network to test and strengthen the security of a traditional machine learning model specifically comprises:
adopting a generator network and a discriminator network, and preparing a data set for training the GAN, the data set comprising real data and noise data; training the GAN using the real data and the noise data, the generator network attempting to generate samples that approximate the real data, and the discriminator network attempting to distinguish the real data from the data generated by the generator;
generating adversarial samples using the trained generator network;
testing the traditional machine learning model with the generated adversarial samples, inputting the adversarial samples into the traditional model and observing the output results of the model;
according to the test results of the traditional model, selectively improving the adversarial defense method:
adversarial training: mixing the generated adversarial samples with the original training data to retrain the traditional model.
7. The method for data security management of a digital platform according to claim 1, wherein the step of local training using the federated learning method specifically comprises:
respectively collecting a plurality of data sources which need to participate in federated learning;
randomly initializing the parameters of the federated learning model before federated learning starts;
each federated learning iteration specifically comprises:
each data source locally training a model using its local data;
after local training is finished, each data source uploading the model parameters obtained by local training to a central server;
the central server aggregating the collected model parameters and sending the aggregated parameters back to each data source to update the respective local model parameters;
repeating the federated learning iteration until the model converges;
wherein the parameter aggregation formula in the federated learning process is:
ω_avg = (1/N) * Σ_{i=1}^{N} ω_i
where ω_avg is the average of the parameters, N is the number of data sources, and ω_i is the local model parameter of the ith data source.
8. The method for data security management of a digital platform according to claim 1, wherein the security reinforcement learning step specifically comprises:
in security reinforcement learning, the system interacts with the environment, and the specific steps of autonomously learning and adjusting the defense strategy are as follows:
modeling a system environment, including abstracting the system operating environment, a network structure and an attacker behavior into a mathematical model;
defining a reward function for evaluating the performance of the system in different states;
adopting a Q-learning algorithm to connect the built reinforcement learning model with a defending part of the system, so that the system can interact with the environment;
learning and optimizing according to the reinforcement learning algorithm, and continuously interacting with the environment and learning;
the updating rule formula of the Q-learning algorithm in reinforcement learning is:
Q(s,a) = Q(s,a) + α*(r + γ*max_{a'} Q(s',a') - Q(s,a))
where Q(s,a) represents the expected return for performing action a in state s, α is the learning rate, r is the reward obtained after performing action a in state s, γ is the discount factor, s' is the new state after performing action a, and a' is the action with the highest Q value in the new state s'.
9. A terminal device, characterized in that the device comprises: a memory, a processor, and a control program for a data security management method of a digital platform stored on the memory and executable on the processor, the control program for the data security management method of the digital platform implementing the data security management method of the digital platform according to any one of claims 1 to 8 when executed by the processor.
10. A storage medium, characterized in that the medium is applied to a computer, the storage medium storing thereon a control program of a data security management method of a digital platform, the control program of the data security management method of the digital platform implementing the data security management method of the digital platform according to any one of claims 1 to 8 when executed by a processor.
CN202311004874.XA 2023-08-09 2023-08-09 Data security management method for digital platform Pending CN117056951A (en)

Publications (1)

Publication Number Publication Date
CN117056951A true CN117056951A (en) 2023-11-14
