CN116894113A - Data security classification method and data security management system based on deep learning - Google Patents
Data security classification method and data security management system based on deep learning
- Publication number
- CN116894113A (application number CN202310875777.1A)
- Authority
- CN
- China
- Prior art keywords
- data
- model
- neural network
- hidcnn
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/906—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a data security classification method and a data security management system based on deep learning. The method comprises the following steps: S1, acquiring the corresponding service data from a service information system through a big data configuration center; S2, preprocessing the service data and storing the processed service data; S3, constructing a novel convolutional neural network HIDCNN combined model; S4, training the HIDCNN combined model to obtain a trained HIDCNN combined model; S5, classifying the data with the trained HIDCNN combined model. The invention converts the acquired raw data into the required target information; after acquisition is complete, the data is cleaned and converted, which improves data security and avoids data loss.
Description
Technical Field
The invention relates to the technical field of data security classification, in particular to a data security classification method and a data security management system based on deep learning.
Background
Deep learning is a branch of machine learning that learns the features and patterns of data by simulating the structure and function of the neural networks of the human brain, with the aim of achieving artificial intelligence. The learned features are then used for tasks such as classification, regression, and clustering. By constructing deep neural networks that mimic the way the human brain works, deep learning learns the internal regularities and levels of representation of sample data, so that a machine can imitate human activities such as seeing, hearing, and thinking. This has solved many complex pattern recognition problems and greatly advanced artificial intelligence technology.
Machine learning parses data, learns from it with a suitable algorithmic model, and then makes decisions and predictions about real-world events. Unlike conventional hard-coded software written for a specific task, a machine learning system is "trained" on a large amount of data, from which various algorithms learn how to accomplish the task. As a recent branch of machine learning, deep learning also uses supervised and unsupervised learning methods to train deep neural networks. The field has developed rapidly in recent years, and a series of specialized architectures has been proposed (such as convolutional networks, residual networks, and adversarial networks), so that more and more researchers use deep neural networks to learn feature representations for specific domains. A deep neural network contains multiple hidden layers; multi-layer processing gradually converts the initial low-level feature representation into a high-level one, after which a simple model can complete complex learning tasks such as classification.
In the field of data management, a data manager is often required to classify data by application scenario and data content. Manual classification is labor-intensive and inefficient, and it cannot handle real-time classification of massive data with a large number of categories. In addition, the security of the data is not guaranteed during classification, so data is easily leaked. A method that improves the security of data classification is therefore needed.
No effective solution to these problems in the related art has been proposed so far.
Disclosure of Invention
In view of the problems in the related art, the invention provides a data security classification method based on deep learning to overcome the technical problems of the existing related art.
To this end, the invention adopts the following specific technical scheme:
A data security classification method based on deep learning, the method comprising the following steps:
S1, acquiring the corresponding service data from a service information system through a big data configuration center;
S2, preprocessing the service data and storing the processed service data;
S3, constructing a novel convolutional neural network HIDCNN combined model;
S4, training the HIDCNN combined model to obtain a trained HIDCNN combined model;
S5, classifying the data with the trained HIDCNN combined model.
Further, preprocessing the service data and storing the processed service data comprises the following steps:
S21, integrating the service data into a unified data set;
S22, cleaning the data in the data set and converting data in different formats into a unified format;
S23, clustering the data in the data set;
S24, verifying that the processed data is accurate and storing it at the corresponding location to obtain the required target data;
S25, backing up the obtained target data.
Further, clustering the data in the data set comprises the following steps:
S231, computing the similarity between every pair of data points in the original data to obtain a similarity matrix A;
S232, computing a diagonal matrix D whose diagonal elements are the sums of the corresponding columns of the similarity matrix A, letting matrix B = D − A, solving for the first K eigenvalues and eigenvectors of matrix B, and projecting the data points into a K-dimensional space;
S233, clustering the data in the K-dimensional space according to the K-dimensional coordinates of each data point.
Further, clustering the data in the K-dimensional space according to the K-dimensional coordinates of each data point comprises the following steps:
S2331, randomly selecting several center positions and assigning each data point to its nearest center;
S2332, dividing the data points into clusters accordingly, finding the center point of each cluster, and using a minimization function to move each center to the mean position of the data points in its cluster.
Further, backing up the obtained target data comprises the following steps:
S251, copying the target data into a backup directory and putting the table space to be backed up into backup mode;
S252, copying the table space and then ending backup mode for that table space;
S253, performing S251 and S252 for each table space in the database;
S254, obtaining the current data sequence number by executing a command in svrmgrl, and executing a command to force a data switch so that all data can be archived.
Further, constructing the novel convolutional neural network HIDCNN combined model comprises the following steps:
S31, dividing the model framework into a mixed feature input layer and a model body framework layer;
S32, in the mixed feature input layer, using the classification target vectors together with randomly initialized space vectors to convert the data into continuous space vectors that serve as the input vectors of the model;
S33, in the model body framework layer, selecting a text classification model as the model type and introducing an iterated dilated convolutional neural network (IDCNN);
S34, combining the Highway network with the IDCNN to construct the novel convolutional neural network HIDCNN combined model;
S35, stacking dilated convolutional neural network (DCNN) blocks iteratively to form the iterated dilated convolutional neural network;
S36, using the Highway network as the connecting layer between the dilated convolutional neural network and the Softmax classification layer to form an HIDCNN-based hierarchical classification model, while optimizing the features extracted by the convolutional layers;
S37, connecting a Dropout layer and the Softmax classification layer to the HIDCNN-based hierarchical classification model to form the complete classification model.
Further, the calculation formula of the HIDCNN is as follows:
$$Y_i = H(h_{i-1}, W_H) \cdot T(h_{i-1}, W_T) + h_{i-1} \cdot C(h_{i-1}, W_C)$$
where $h_i$ is the output vector of the i-th Highway layer;
H is a nonlinear affine transformation function;
T is the transform gate;
C is the carry gate;
T and C are hyperbolic tangent functions, and C = 1 − T.
Further, training the HIDCNN combined model to obtain the trained HIDCNN combined model comprises the following steps:
S41, defining the targets of data security hierarchical classification;
S42, training the HIDCNN combined model with the targets;
S43, evaluating and adjusting the trained model.
Further, defining the targets of data security hierarchical classification comprises the following steps:
S411, taking m samples from the data label set to form the hierarchical classification target samples, and denoting the set of remaining data samples as N;
S412, solving the optimization problem on the target samples with a quadratic programming (QP) method to obtain the support vectors, which form a group of classification targets;
S413, testing the samples in set N with the classification targets, ending if the samples in N form an empty set, and continuing otherwise;
S414, moving the samples in set N that do not satisfy the optimization condition into the target samples, and moving the same number of samples from the target samples into set N;
S415, repeating step S412 to define multiple groups of classification targets.
Further, training the HIDCNN combined model with the targets comprises the following steps:
S421, using the data set composed of the classification targets as training data to obtain an initial training model;
S422, evaluating the initial training model to obtain the abnormal data set produced during the evaluation;
S423, grouping the abnormal data set to obtain several abnormal data groups;
S424, determining the model training information from the abnormal data groups;
S425, adjusting the parameters of the detection model according to the training results until its training accuracy and loss are optimal, at which point training of the detection model is complete.
The beneficial effects of the invention are as follows:
1. The invention takes the original data sources of the service information systems as the initial data and combines them with a deep learning method, dynamically adjusting and classifying the data as needed. This achieves automatic classification of service information system data, allows the data of different service information systems to be classified and labeled in real time, and improves the working efficiency of data managers.
2. The invention uses the HIDCNN combined model to adjust the discrimination criteria for data of different levels, and can output classified and labeled service data according to the data security requirements of different service information systems. Clustering makes the data easier to understand and analyze, reduces the complexity of data processing, and can improve the accuracy of the hierarchical classification model.
3. The invention converts the collected raw data into the required target information. After collection is complete, the data is cleaned and converted, which makes it easier to extract valuable information, makes the data more accurate, complete, and consistent, and improves the quality and efficiency of data analysis. The processed data is backed up, which improves data security and avoids data loss.
4. After the HIDCNN combined model is constructed, it is trained with training data, validation data is used to adjust the model parameters and prevent overfitting, and test data is used to evaluate model performance. The model is then monitored continuously as required, which ensures classification accuracy and model performance and allows the model to be repaired and updated when necessary.
Drawings
To illustrate the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings required by the embodiments are briefly described below. The drawings described below show only some embodiments of the invention; a person skilled in the art can derive other drawings from them without inventive effort.
Fig. 1 is a flowchart of a data security classification method based on deep learning according to an embodiment of the invention.
Detailed Description
To further illustrate the embodiments, the invention provides accompanying drawings, which form part of the disclosure. They mainly illustrate the embodiments and, together with the description, explain how the embodiments operate. With reference to them, a person skilled in the art will understand other possible embodiments and the advantages of the invention.
According to an embodiment of the invention, a data security classification method based on deep learning is provided.
The invention is further described below with reference to the accompanying drawings and the detailed description. As shown in Fig. 1, a data security classification method based on deep learning according to an embodiment of the invention comprises the following steps:
S1, acquiring the corresponding service data from a service information system through a big data configuration center.
S2, preprocessing the service data and storing the processed service data.
Specifically, preprocessing the service data and storing the processed service data comprises the following steps:
S21, integrating the service data into a unified data set.
Data integration means merging the data of multiple data sets into one data set, typically to combine data from different sources or in different formats for more comprehensive analysis or processing.
S22, cleaning the data in the data set and converting data in different formats into a unified format.
Specifically, data conversion is the process of converting data from one format or type into another. In computer science, it typically involves converting data between programming languages, file formats, database types, network protocols, and the like.
Data conversion may be unidirectional, converting data from one format to another, or bidirectional, converting data back and forth between two formats. Converting a character string into a number and converting a JSON object into XML are both examples of data conversion.
Data conversion is typically done with dedicated tools or libraries. Common options include the built-in type conversion functions of programming languages, third-party libraries such as Pandas and NumPy, and ETL tools such as Talend and Informatica.
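The two conversions named above (string to number, JSON object to XML) can be sketched with the Python standard library. This is only an illustrative sketch, not the method of the embodiment: the function names `to_number` and `json_to_xml`, the `record` root tag, and the handling of thousands separators are assumptions, and the JSON input is assumed to be a flat object.

```python
import json
import xml.etree.ElementTree as ET


def to_number(text):
    """Convert a numeric string to int or float; return None if it is not numeric."""
    cleaned = text.strip().replace(",", "")  # assumed cleaning rule: drop thousands separators
    try:
        return int(cleaned)
    except ValueError:
        try:
            return float(cleaned)
        except ValueError:
            return None


def json_to_xml(json_text, root_tag="record"):
    """Convert a flat JSON object into a one-level XML element string."""
    data = json.loads(json_text)
    root = ET.Element(root_tag)
    for key, value in data.items():
        child = ET.SubElement(root, key)  # one child element per JSON field
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")
```

Returning `None` for non-numeric input lets a later cleaning step decide whether to drop or repair the record instead of failing during conversion.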
S23, clustering the data in the data set.
Specifically, clustering is the process of grouping similar items together and placing dissimilar items in different categories; it is a very important technique in data analysis.
Clustering the data in the data set comprises the following steps:
S231, computing the similarity between every pair of data points in the original data to obtain a similarity matrix A;
S232, computing a diagonal matrix D whose diagonal elements are the sums of the corresponding columns of the similarity matrix A, letting matrix B = D − A, solving for the first K eigenvalues and eigenvectors of matrix B, and projecting the data points into a K-dimensional space.
Specifically, to cluster the data in the data set, the pairwise similarities between the N data points are first given as an N × N similarity matrix A, where A(i, j) is the similarity between data points i and j; the larger the value, the more similar the points. Note that
$$A(i,j) = A(j,i), \qquad A(i,i) = 0.$$
Next, matrix D is computed so that each diagonal element is the sum of the corresponding column (or row) of matrix A and all other elements are 0, that is,
$$D(i,i) = \sum_j A(i,j)$$
where i and j index the data points.
Let B = D − A and compute the first k eigenvalues and eigenvectors of B, which project the data points into a k-dimensional space: the j-th value of the i-th eigenvector is the projection of the j-th data point onto the i-th dimension of that space. In other words, if the k eigenvectors are arranged as the columns of an N × k matrix, each row gives the coordinates of one data point in the k-dimensional space. The clustering algorithm then clusters the data according to these k-dimensional coordinates.
S233, clustering the data in the K-dimensional space according to the K-dimensional coordinates of each data point.
Specifically, this clustering comprises the following steps:
S2331, randomly selecting several center positions and assigning each data point to its nearest center;
S2332, dividing the data points into clusters accordingly, finding the center point of each cluster, and using a minimization function to move each center to the mean position of the data points in its cluster.
Clustering makes the data easier to understand and analyze, reduces the complexity of data processing, and can improve the accuracy of the hierarchical classification model.
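Steps S231 to S233 can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions rather than the implementation of the embodiment: "the first k eigenvalues" is taken to mean the k smallest eigenvalues of B, the similarity matrix is assumed to be symmetric, and the function names, the iteration limit, and the random seed are illustrative choices.

```python
import numpy as np


def spectral_embedding(A, k):
    """Project N points into k dimensions using B = D - A (S232)."""
    D = np.diag(A.sum(axis=1))        # D(i,i) = sum_j A(i,j)
    B = D - A
    _, eigvecs = np.linalg.eigh(B)    # eigh returns eigenvalues in ascending order
    return eigvecs[:, :k]             # row i = k-dimensional coordinates of point i


def kmeans(X, k, n_iter=100, seed=0):
    """Cluster the rows of X around k centers (S2331, S2332)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # random initial centers
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        new_centers = np.array([
            X[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

Given a similarity matrix `A`, the two steps chain as `labels, _ = kmeans(spectral_embedding(A, k), k)`. Moving a center to the cluster mean is the choice that minimizes the sum of squared distances within that cluster, which is one reading of the "minimization function" in S2332.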
S24, verifying that the processed data is accurate and storing it at the corresponding location to obtain the required target data.
S25, backing up the obtained target data.
Backing up the obtained target data comprises the following steps:
S251, copying the target data into a backup directory and putting the table space to be backed up into backup mode;
S252, copying the table space and then ending backup mode for that table space;
S253, performing S251 and S252 for each table space in the database;
S254, obtaining the current data sequence number by executing a command in svrmgrl, and executing a command to force a data switch so that all data can be archived.
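The backup steps resemble an Oracle-style online table space backup, so one way to picture them is as an ordered command plan. The sketch below only builds that plan as a list of strings; the text does not name any commands, so the SQL statements, the `cp` copy step, the ordering within each table space, and the example paths are all assumptions made for illustration.

```python
def hot_backup_plan(tablespaces, backup_dir):
    """Return an ordered command list for backing up every table space.

    `tablespaces` maps a table space name to its data file paths (hypothetical values).
    """
    plan = []
    for name, datafiles in tablespaces.items():          # S253: repeat for each table space
        plan.append(f"ALTER TABLESPACE {name} BEGIN BACKUP;")   # S251: enter backup mode
        for path in datafiles:
            plan.append(f"cp {path} {backup_dir}/")              # copy at operating-system level
        plan.append(f"ALTER TABLESPACE {name} END BACKUP;")     # S252: end backup mode
    plan.append("ARCHIVE LOG LIST")                # S254: assumed way to read the current sequence number
    plan.append("ALTER SYSTEM SWITCH LOGFILE;")    # S254: assumed command that forces the switch
    return plan
```

Building the plan separately from executing it keeps the sketch free of database dependencies and makes the order of operations easy to review before anything is run.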
The collected raw data is converted into the required target information. After collection is complete, the data is cleaned and converted, which makes it easier to extract valuable information, makes the data more accurate, complete, and consistent, and improves the quality and efficiency of data analysis. The processed data is backed up, which improves data security and avoids data loss.
S3, constructing a novel convolutional neural network HIDCNN combined model.
Specifically, constructing the novel convolutional neural network HIDCNN combined model comprises the following steps:
S31, dividing the model framework into a mixed feature input layer and a model body framework layer;
S32, in the mixed feature input layer, using the classification target vectors together with randomly initialized space vectors to convert the data into continuous space vectors that serve as the input vectors of the model;
S33, in the model body framework layer, selecting a text classification model as the model type and introducing an iterated dilated convolutional neural network (IDCNN);
S34, combining the Highway network with the IDCNN to construct the novel convolutional neural network HIDCNN combined model;
S35, stacking dilated convolutional neural network (DCNN) blocks iteratively to form the iterated dilated convolutional neural network;
S36, using the Highway network as the connecting layer between the dilated convolutional neural network and the Softmax classification layer to form an HIDCNN-based hierarchical classification model, while optimizing the features extracted by the convolutional layers;
S37, connecting a Dropout layer and the Softmax classification layer to the HIDCNN-based hierarchical classification model to form the complete classification model.
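The iterative stacking in S35 can be illustrated with a small NumPy sketch of a 1-D dilated convolution applied repeatedly. The text does not give kernel sizes, dilation rates, activation, or the number of iterations, so the odd kernel size, the dilations (1, 1, 2), the ReLU activation, the four iterations, and the reuse of the same weights in every iteration are all assumptions of this sketch.

```python
import numpy as np


def dilated_conv1d(x, w, dilation):
    """Zero-padded 1-D dilated convolution followed by ReLU.

    x: (seq_len, channels_in); w: (kernel_size, channels_in, channels_out), odd kernel_size.
    """
    kernel_size = w.shape[0]
    pad = dilation * (kernel_size - 1) // 2
    x_pad = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros((x.shape[0], w.shape[2]))
    for k in range(kernel_size):
        start = k * dilation                      # taps are spaced `dilation` steps apart
        out += x_pad[start:start + x.shape[0]] @ w[k]
    return np.maximum(out, 0.0)


def idcnn(x, weights, dilations=(1, 1, 2), n_blocks=4):
    """Apply the same block of dilated convolutions n_blocks times."""
    for _ in range(n_blocks):
        for w, d in zip(weights, dilations):
            x = dilated_conv1d(x, w, d)
    return x
```

With a dilation of 2 and a kernel of size 3, each output position combines inputs two steps to the left, at the position itself, and two steps to the right, which widens the receptive field without adding parameters.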
The calculation formula of the HIDCNN is as follows:
$$Y_i = H(h_{i-1}, W_H) \cdot T(h_{i-1}, W_T) + h_{i-1} \cdot C(h_{i-1}, W_C)$$
where $h_i$ is the output vector of the i-th Highway layer;
H is a nonlinear affine transformation function;
T is the transform gate;
C is the carry gate;
T and C are hyperbolic tangent functions, and C = 1 − T.
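A single Highway layer following this formula can be sketched in NumPy. The sketch follows the text in using the hyperbolic tangent for the gate and in setting C = 1 − T; the bias terms `b_H` and `b_T`, the tanh activation inside H, and the function name are assumptions. Highway networks are commonly written with a sigmoid gate, which can be passed through the `gate` parameter instead.

```python
import numpy as np


def highway_layer(h_prev, W_H, b_H, W_T, b_T, gate=np.tanh):
    """One Highway layer: Y = H * T + h_prev * C with C = 1 - T."""
    H = np.tanh(h_prev @ W_H + b_H)   # nonlinear affine transformation (activation assumed)
    T = gate(h_prev @ W_T + b_T)      # transform gate
    C = 1.0 - T                       # carry gate
    return H * T + h_prev * C
```

When the gate output is 0 the layer passes its input through unchanged, and when it is 1 the layer outputs the transformed features only, which is how the Highway layer blends the features extracted by the convolutional layers with their transformed version.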
S4, training the HIDCNN combined model to obtain the trained HIDCNN combined model.
Specifically, training the HIDCNN combined model to obtain the trained HIDCNN combined model comprises the following steps:
S41, defining the targets of data security hierarchical classification.
Defining the targets of data security hierarchical classification comprises the following steps:
S411, taking m samples from the data label set to form the hierarchical classification target samples, and denoting the set of remaining data samples as N;
S412, solving the optimization problem on the target samples with a quadratic programming (QP) method to obtain the support vectors, which form a group of classification targets;
S413, testing the samples in set N with the classification targets, ending if the samples in N form an empty set, and continuing otherwise;
S414, moving the samples in set N that do not satisfy the optimization condition into the target samples, and moving the same number of samples from the target samples into set N;
S415, repeating step S412 to define multiple groups of classification targets.
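The loop in S411 to S415 has the shape of a working-set procedure, which the toy sketch below illustrates on separable one-dimensional data. It is not the QP solver of the embodiment: `fit_1d` is a closed-form stand-in that is only valid for separable 1-D data with labels +1 and −1, the stopping rule is assumed to be "no sample in N violates the optimization condition", the initial working set is assumed to contain both classes, and all names are hypothetical.

```python
def fit_1d(working):
    """Closed-form max-margin threshold for separable 1-D data (stand-in for the QP step)."""
    pos = min(x for x, y in working if y == 1)     # closest positive sample
    neg = max(x for x, y in working if y == -1)    # closest negative sample
    threshold = (pos + neg) / 2.0
    margin = (pos - neg) / 2.0
    support = {(pos, 1), (neg, -1)}                # the two support vectors
    return threshold, margin, support


def chunking(samples, m):
    """Working-set loop: fit on m samples, test the rest, swap violators in."""
    working, rest = list(samples[:m]), list(samples[m:])          # S411
    while True:
        threshold, margin, support = fit_1d(working)              # S412
        violators = [s for s in rest if s[1] * (s[0] - threshold) < margin]  # S413
        if not violators:
            return threshold, support
        # S414: bring violators in and move non-support samples out
        # (fewer are moved out when not enough non-support samples exist).
        removable = [s for s in working if s not in support]
        swapped_out = removable[:len(violators)]
        working = [s for s in working if s not in swapped_out] + violators
        rest = [s for s in rest if s not in violators] + swapped_out
```

Every swap brings in a sample that lies inside the current margin, so the margin shrinks on each pass and the loop stops once the remaining samples all satisfy the condition.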
Specifically, the primary goals of data security hierarchical classification are to ensure the confidentiality, integrity, and availability of data and to simplify the data management process.
Dividing data into levels allows each level to be managed separately and allows stronger security protection to be provided for the data that needs it, which protects confidentiality. Hierarchical classification also helps ensure the integrity of important data and prevents it from being tampered with or damaged. Finally, classified data is easier to manage, which simplifies the data management process.
S42, training the HIDCNN combined model with the targets.
Training the HIDCNN combined model with the targets comprises the following steps:
S421, using the data set composed of the classification targets as training data to obtain an initial training model;
S422, evaluating the initial training model to obtain the abnormal data set produced during the evaluation;
S423, grouping the abnormal data set to obtain several abnormal data groups;
S424, determining the model training information from the abnormal data groups;
S425, adjusting the parameters of the detection model according to the training results until its training accuracy and loss are optimal, at which point training of the detection model is complete.
S43, evaluating and adjusting the trained model.
The performance of the model is evaluated with test data and the model is adjusted as required. The model is then monitored continuously to ensure classification accuracy and model performance, and it can be repaired and updated when necessary.
S5, classifying the data according to the trained novel convolution neural network HIDCNN combination model.
Specifically, the model is monitored by combining the classified structure during use, and the model monitoring means that the classified model is periodically monitored, analyzed and evaluated to ensure good performance and correctness of the classified model in a production environment, and the problem of the model can be timely found and processed by monitoring the model, so that the performance and reliability of the model are improved, and the effectiveness of the classified model in the production environment is ensured.
Monitoring may cover the following aspects:
Data quality monitoring: the quality of the input data is checked, for example whether the data contains missing values, outliers, or duplicate values.
Model performance monitoring: the performance of the model in the production environment is monitored using indicators such as accuracy, precision, and recall.
Real-time prediction monitoring: the real-time prediction results of the model are monitored to detect abnormal behavior or deviation of the model.
Interpretability monitoring: the interpretability of the model is monitored to help ensure that prediction results are accurate and reliable.
Security monitoring: the model is monitored for attacks or abuse, such as adversarial attacks or data leakage.
Adaptive monitoring: the model is monitored adaptively, and feedback is used to update it promptly when a new data distribution or concept drift appears.
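As an illustration of the model performance monitoring item above, the following minimal Python sketch computes overall accuracy and per-class precision and recall from a batch of production predictions. The function name `classification_metrics` and the level labels used in the usage note are hypothetical and are not taken from the patent.

```python
from collections import Counter

def classification_metrics(y_true, y_pred):
    """Return (accuracy, per-class precision/recall) for two label sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    correct = 0
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
            correct += 1
        else:
            fp[p] += 1  # class p was predicted, but the true class was t
            fn[t] += 1  # a sample of class t was missed
    per_class = {}
    for c in sorted(set(y_true) | set(y_pred)):
        predicted_c = tp[c] + fp[c]
        actual_c = tp[c] + fn[c]
        per_class[c] = {
            "precision": tp[c] / predicted_c if predicted_c else 0.0,
            "recall": tp[c] / actual_c if actual_c else 0.0,
        }
    accuracy = correct / len(y_true) if y_true else 0.0
    return accuracy, per_class
```

For example, calling `classification_metrics(["L1", "L1", "L2"], ["L1", "L2", "L2"])` compares three true security levels with three predicted ones; tracking these values over time is one simple way to notice a drop in production performance.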
In summary, with the above technical scheme, the invention takes the data of the original service information systems as initial data and combines it with a deep learning method. Data grading is adjusted dynamically as needed, so that service information system data is classified automatically, data from different service information systems can be classified and labeled in real time, and the work efficiency of data managers is improved. The invention uses the novel convolutional neural network HIDCNN combination model to adjust the discrimination criteria for data of different levels, so that labeled classification data can be output according to the differing data security requirements of different service information systems. Clustering the data makes it easier to understand and analyze, reduces the complexity of data processing, and can improve the accuracy of the classification model. The invention converts the collected raw data into the required target information: after collection, the data is cleaned and converted so that valuable information can be extracted more easily and the data becomes more accurate, complete, and consistent. The data can therefore be better utilized and analyzed, and the quality and efficiency of data analysis are improved. The processed data is also backed up, which improves data security and avoids data loss. After the novel convolutional neural network HIDCNN combination model is constructed, it is trained on training data, validation data is used to adjust the model parameters and prevent overfitting, and test data is used to evaluate the performance of the model. The model is then monitored continuously as needed to ensure classification accuracy and model performance, and it can be repaired and updated when required.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.
Claims (9)
1. A data security classification method based on deep learning, characterized by comprising the following steps:
S1, acquiring the corresponding service data in a service information system through a big data configuration center;
S2, preprocessing the service data and storing the processed service data;
S3, constructing a novel convolutional neural network HIDCNN combination model;
S4, training the novel convolutional neural network HIDCNN combination model to obtain a trained novel convolutional neural network HIDCNN combination model;
S5, classifying the data according to the trained novel convolutional neural network HIDCNN combination model.
2. The deep learning-based data security classification method as claimed in claim 1, wherein preprocessing the service data and storing the processed service data comprises the following steps:
S21, integrating the service data to form a unified data set;
S22, cleaning the data in the data set and converting data in different formats into a unified format;
S23, clustering the data in the data set;
S24, verifying whether the processed data is accurate, and storing the processed data in the corresponding location to obtain the required target data;
S25, backing up the obtained target data.
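The cleaning and format-unification step S22 can be sketched as follows. This is only an illustrative Python example under stated assumptions: the record fields `id` and `created`, the list of accepted date formats, and the rule of keeping the first record per `id` are all hypothetical and do not come from the patent.

```python
from datetime import datetime

# Hypothetical date formats that different source systems might use.
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%d.%m.%Y")

def normalize_date(text):
    """Convert a date string in any known format to ISO 8601, or return None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None

def clean_records(records, required=("id", "created")):
    """Drop incomplete, unparseable, and duplicate records; unify date formats."""
    cleaned, seen_ids = [], set()
    for rec in records:
        # Skip records with a missing or empty required field.
        if any(rec.get(key) in (None, "") for key in required):
            continue
        iso_date = normalize_date(str(rec["created"]))
        if iso_date is None:
            continue  # date is in no known format
        if rec["id"] in seen_ids:
            continue  # duplicate record, keep the first occurrence
        seen_ids.add(rec["id"])
        cleaned.append(dict(rec, created=iso_date))
    return cleaned
```

The verification in S24 could then be as simple as checking that every cleaned record parses with the single target format before it is stored.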
3. The deep learning-based data security classification method as claimed in claim 2, wherein clustering the data in the data set comprises the following steps:
S231, computing the similarity between every pair of data points in the original data to obtain a similarity matrix A;
S232, computing a matrix D whose diagonal elements are the sums of the corresponding columns of the similarity matrix A, letting matrix B = D - A, solving for a number of eigenvalues and eigenvectors of matrix B, and projecting the data points into a K-dimensional space;
S233, clustering the data in the K-dimensional space according to the K-dimensional coordinates of each data point.
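Steps S231 to S233 match the usual form of unnormalized spectral clustering, and can be written out as below for n data points. The claim does not say which eigenvectors are used or that D is zero off the diagonal; taking the eigenvectors of the K smallest eigenvalues and treating D as diagonal are assumptions based on the standard formulation. For a symmetric similarity matrix A, column sums and row sums coincide.

```latex
\begin{align*}
D_{ii} &= \sum_{j=1}^{n} A_{ji}, \qquad D_{ij} = 0 \ \ (i \neq j), \\
B &= D - A, \\
B\,v_k &= \lambda_k v_k, \qquad k = 1, \dots, K, \qquad
  \lambda_1 \le \lambda_2 \le \dots \le \lambda_K, \\
y_i &= \bigl( v_1(i),\, v_2(i),\, \dots,\, v_K(i) \bigr) \in \mathbb{R}^{K},
  \qquad i = 1, \dots, n .
\end{align*}
```

Here $v_k(i)$ is the i-th component of eigenvector $v_k$, so $y_i$ is the K-dimensional coordinate of data point i that step S233 clusters.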
4. The deep learning-based data security classification method as claimed in claim 3, wherein clustering the data in the K-dimensional space according to the K-dimensional coordinates of each data point comprises the following steps:
S2331, randomly selecting a plurality of center positions and assigning each data point to its nearest center;
S2332, dividing the data points into clusters according to these assignments, finding the center point of each cluster, and moving each center to the mean position of the data points in its cluster using a minimization function.
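Steps S2331 and S2332 correspond closely to the k-means procedure. The sketch below is a minimal pure-Python version under stated assumptions: squared Euclidean distance is used as the quantity being minimized, the two steps are repeated until the assignments stop changing, and the names `kmeans`, `max_iters`, and `seed` are hypothetical. The patent itself does not specify the distance measure or a stopping rule.

```python
import random

def kmeans(points, k, max_iters=100, seed=0):
    """Cluster points (a list of equal-length tuples) into k groups."""
    rng = random.Random(seed)
    # S2331: pick k initial centers at random from the data points.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iters):
        # S2331: assign every point to its nearest center.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # S2332: move each center to the mean of the points assigned to it,
        # which minimizes the within-cluster sum of squared distances.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers
```

In the setting of claim 3, `points` would be the K-dimensional coordinates obtained from the eigenvectors of matrix B.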
5. The deep learning-based data security classification method as claimed in claim 2, wherein backing up the obtained target data comprises the following steps:
S251, copying the target data into a backup directory and placing the tablespace to be backed up in backup mode;
S252, copying the tablespace and then ending backup mode for that tablespace;
S253, performing S251 and S252 for each tablespace in the database;
S254, obtaining the current data sequence number by executing a command in svrmgrl, and executing a command to force data switching so that all data can be archived.
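The mention of svrmgrl suggests an Oracle-style hot backup, although the patent does not name the database. Under that assumption, the sketch below builds the ordered list of steps for S251 to S254 using the Oracle statements `ALTER TABLESPACE ... BEGIN BACKUP`, `ALTER TABLESPACE ... END BACKUP`, `ARCHIVE LOG LIST`, and `ALTER SYSTEM SWITCH LOGFILE`. Reading "data sequence number" as the log sequence number, the function name `hot_backup_plan`, and the tablespace names and file paths are all assumptions for illustration; the file copy is left as a comment line rather than a real command.

```python
def hot_backup_plan(tablespaces, backup_dir):
    """Return the ordered steps for a hot backup of the given tablespaces.

    tablespaces: list of (tablespace_name, datafile_path) pairs (hypothetical).
    """
    steps = []
    for name, datafile in tablespaces:  # S253: repeat for every tablespace
        # S251: put the tablespace into backup mode.
        steps.append(f"ALTER TABLESPACE {name} BEGIN BACKUP;")
        # S251/S252: copy the files at the operating-system level (placeholder).
        steps.append(f"-- copy {datafile} to {backup_dir}")
        # S252: take the tablespace out of backup mode.
        steps.append(f"ALTER TABLESPACE {name} END BACKUP;")
    # S254: show the current log sequence number, then force a log switch
    # so that the logs written during the backup are archived.
    steps.append("ARCHIVE LOG LIST")
    steps.append("ALTER SYSTEM SWITCH LOGFILE;")
    return steps
```

The function only produces text; in practice the statements would be run by an administrator with the appropriate privileges, and the exact procedure depends on the database version.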
6. The deep learning-based data security classification method as claimed in claim 1, wherein constructing the novel convolutional neural network HIDCNN combination model comprises the following steps:
S31, dividing the model framework into a mixed feature input layer and a model main framework layer;
S32, in the mixed feature input layer, classifying target vectors and initializing space vectors immediately, and converting the data into continuous space vectors that serve as the input vectors of the model;
S33, in the model main framework layer, selecting a text classification model as the model type and introducing an iterated dilated convolutional neural network (IDCNN);
S34, combining the Highway network with the IDCNN to construct the novel convolutional neural network HIDCNN combination model;
S35, stacking DCNN network blocks iteratively to form the iterated dilated convolutional neural network;
S36, using the Highway network as the connecting layer between the dilated convolutional neural network and the Softmax classification layer to form an HIDCNN-based hierarchical classification model, while optimizing the features extracted by the convolutional layers;
S37, connecting the HIDCNN hierarchical classification model to a Dropout layer and a Softmax classification layer to form the complete classification model.
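The building blocks named in S33 to S37 can be illustrated with a small pure-Python sketch: a 1-D dilated convolution, the receptive field of a stack of such layers, a Highway gate, and a Softmax. This is not the patented model. The kernel size, the dilation schedule, the scalar (element-wise) Highway gate, and all function names are assumptions chosen for illustration, and Dropout is omitted because it is inactive at inference time.

```python
import math

def dilated_conv1d(x, kernel, dilation):
    """1-D dilated convolution with zero padding (output length == input length)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * dilation  # input index reached by kernel tap k
            if 0 <= j < len(x):
                acc += w * x[j]
        out.append(acc)
    return out

def receptive_field(kernel_size, dilations):
    """Input positions seen by one output after stacking the dilated layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

def highway(x, h, gate_logits):
    """Highway connection: y = t * h + (1 - t) * x with t = sigmoid(gate_logit)."""
    gates = [1.0 / (1.0 + math.exp(-g)) for g in gate_logits]
    return [t * hi + (1.0 - t) * xi for t, hi, xi in zip(gates, h, x)]

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [v / total for v in exps]
```

Stacking dilated layers widens the context each output sees without adding pooling, and the Highway gate lets the classifier mix the convolutional features `h` with the unmodified input `x` before the Softmax layer.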
7. The deep learning-based data security classification method as claimed in claim 1, wherein training the novel convolutional neural network HIDCNN combination model to obtain the trained novel convolutional neural network HIDCNN combination model comprises the following steps:
S41, defining the targets of data security classification;
S42, training the novel convolutional neural network HIDCNN combination model using the targets;
S43, evaluating and adjusting the trained model.
8. The deep learning-based data security classification method as claimed in claim 7, wherein defining the data security classification targets comprises the following steps:
S411, taking m samples from the data label set to form the hierarchical classification target samples, and denoting the set of the remaining data samples as N;
S412, solving the optimization problem on the target samples using a quadratic programming (QP) method to obtain support vectors, which form a group of classification targets;
S413, testing the samples in set N with the classification targets; ending if the set of samples in N is empty, and continuing otherwise;
S414, moving the samples in set N that do not satisfy the optimization condition into the target samples, and at the same time moving the same number of samples out of the target samples into set N;
S415, repeating step S412 to define multiple groups of classification targets.
9. The deep learning-based data security classification method as claimed in claim 8, wherein training the novel convolutional neural network HIDCNN combination model using the targets comprises the following steps:
S421, using the data set composed of the classification targets as training data to obtain an initial training model;
S422, evaluating the initial training model to obtain the abnormal data set produced during evaluation;
S423, grouping the obtained abnormal data set to obtain a plurality of abnormal data set groups;
S424, determining model training information according to the obtained abnormal data set groups;
S425, continuously adjusting the parameters of the detection model according to the training results until the training accuracy and loss of the detection model are optimal, at which point the detection model is trained.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310875777.1A CN116894113A (en) | 2023-07-17 | 2023-07-17 | Data security classification method and data security management system based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310875777.1A CN116894113A (en) | 2023-07-17 | 2023-07-17 | Data security classification method and data security management system based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116894113A (en) | 2023-10-17 |
Family
ID=88314639
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310875777.1A Pending CN116894113A (en) | 2023-07-17 | 2023-07-17 | Data security classification method and data security management system based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116894113A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117271679A (en) * | 2023-11-22 | 2023-12-22 | 华信咨询设计研究院有限公司 | Database table classification and classification method and system based on training model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||