CN114726502B

CN114726502B - Security system based on Internet of things and big data

Info

Publication number: CN114726502B
Application number: CN202210235267.3A
Authority: CN
Inventors: 闫正; 缪鹏
Original assignee: Gaozhesai Technology Nantong Co ltd
Current assignee: Gaozhesai Technology Nantong Co ltd
Priority date: 2022-03-10
Filing date: 2022-03-10
Publication date: 2024-06-21
Anticipated expiration: 2042-03-10
Also published as: CN114726502A

Abstract

The invention discloses a security system based on the Internet of things and big data, comprising a data input classification module, a cloud storage module and a multi-stage identity verification module. The data input classification module acquires original data from the electric power Internet of things and performs data input and preprocessing. The cloud storage module classifies the preprocessed data by means of a random forest and then encrypts each class of data separately. The multi-stage identity verification module performs security verification at different levels when a user accesses different data of the electric power Internet of things. The data input classification module is electrically connected with the cloud storage module, and the cloud storage module is electrically connected with the multi-stage identity verification module. The system classifies the data generated by the electric power Internet of things into sensitive and non-sensitive data through a random forest network, encrypts the data with lightweight symmetric encryption algorithms, and has a trust authority verify the user's multi-stage identity verification credentials.

Description

Security system based on Internet of things and big data

Technical Field

The invention relates to the technical field of data security of the Internet of things of electric power, in particular to a security system based on the Internet of things and big data.

Background

The electric power Internet of things is the concrete application, within the electric power industry, of technologies such as collaborative networks and cloud computing. It connects users, power grids, power generation enterprises, suppliers and their devices, that is, the people and things of the power system, and shares the data they generate in order to serve users, power grids, power generators, suppliers, government and society. It makes full use of modern information technologies (cloud computing, the Internet of things, artificial intelligence, mobile internet and blockchain) and advanced communication technologies to realize interconnection and human-machine interaction among all parts of the power system, greatly improving the capabilities of automatic data acquisition and flexible application. As construction of the electric power Internet of things continues, more and more perception-layer terminals are connected, the terminal equipment in every operating link constantly generates large amounts of data, and data security problems of the Internet of things in the cloud computing environment arise on a correspondingly large scale.

Existing solutions effectively improve the security of the Internet of things through corresponding modeling or by encrypting data with a private key and a public key together. However, they either encrypt the data with a single algorithm or provide too few levels of identity authentication. Classical encryption algorithms such as the Advanced Encryption Standard focus on providing a high level of encryption performance and give little consideration to hardware resource overhead, yet the hardware resources of electric power Internet of things equipment are limited, so high-performance, high-energy-consumption encryption algorithms are unsuitable. It is therefore necessary to design a security system based on the Internet of things and big data that uses lightweight symmetric encryption algorithms and multi-level identity authentication.

Disclosure of Invention

The invention aims to provide a security system based on the Internet of things and big data, so as to solve the problems in the background technology.

In order to solve the technical problems, the invention provides the following technical scheme: the security system based on the Internet of things and big data comprises a data input classification module, a cloud storage module and a multi-stage identity verification module, wherein the data input classification module is used for acquiring electric power Internet of things original data to carry out data input and preprocessing, the cloud storage module is used for classifying and encrypting the preprocessed data after classifying the preprocessed data through a random forest, the multi-stage identity verification module is used for carrying out security verification of different levels when a user accesses different data of the electric power Internet of things, the data input classification module is electrically connected with the cloud storage module, and the cloud storage module is electrically connected with the multi-stage identity verification module.

According to the technical scheme, the data input classification module comprises an electric power original data acquisition module, a random forest classification module and a data preprocessing module. The electric power original data acquisition module is used for acquiring original data from the recorded data generated by electric power Internet of things equipment, the random forest classification module is used for inputting the data into a random forest network for classification, and the data preprocessing module is used for preprocessing the data so as to accelerate training. The electric power original data acquisition module is electrically connected with the random forest classification module and with the data preprocessing module.

According to the technical scheme, the cloud storage module comprises a predictor building module, a grid searching and cross verifying module, a characteristic importance sorting module and a cloud storage encryption module, wherein the predictor building module is used for predicting the accuracy of output results of a training set and a testing set, the grid searching and cross verifying module is used for setting optimization classification on parameter data, the characteristic importance sorting module is used for outputting importance sorting results and data type classification results of all parameters, the cloud storage encryption module is used for encrypting and decrypting different types of data by using different grouping algorithms, the predictor building module is electrically connected with the grid searching and cross verifying module, and the characteristic importance sorting module is electrically connected with the cloud storage encryption module;

The cloud storage encryption module comprises a private cloud encryption module and a public cloud encryption module, wherein the private cloud encryption module is used for block encryption of the sensitive data classified by the random forest, the public cloud encryption module is used for block encryption of the non-sensitive data classified by the random forest, and the private cloud encryption module is electrically connected with the public cloud encryption module.

According to the technical scheme, the multi-stage identity verification module comprises a first-stage identity verification module, a second-stage identity verification module and a third-stage identity verification module, wherein the first-stage identity verification module is used for first-stage credential verification performed when a user requests access to data in public cloud, the second-stage identity verification module is used for second-stage credential verification performed when the user requests access to and downloads data in public cloud, the third-stage identity verification module is used for third-time user credential verification performed when the user accesses to and downloads the internet of things power data in private cloud, and the first-stage identity verification module, the second-stage identity verification module and the third-stage identity verification module are electrically connected.

According to the technical scheme, the electric power data security verification method of the security system based on the Internet of things and big data comprises the following steps:

Step S1: reading the various original data sets generated by electric power Internet of things equipment and taking them as the input of a random forest network, wherein a random forest is a classifier that trains on and predicts samples using a plurality of decision trees, and its output class is the mode of the classes output by the individual decision trees;

Step S2: preprocessing the data by standardization to obtain standardized values, establishing a random forest predictor on the standardized data, setting candidate values for the number of decision trees and for the selectable tree depth, adding cross-validation, and rotating the validation set drawn from the resulting training set before outputting the classification result;

Step S3: calculating information gain of each feature by using a random forest, outputting a feature importance sorting result, classifying to obtain output data of sensitive and non-sensitive equipment types, and encrypting the two types of data by using different algorithms;

Step S4: the user supplies the credential information they hold, and the trust authority performs three levels of authentication before granting secure access to the stored data.

According to the above technical solution, the step S1 further includes the following steps:

Step S11: the input data set is divided into a training set and a test set at a ratio of 3:1, and corresponding labels are assigned; one part is used for training the model and the other for testing it;

Step S12: the labelled part of the data set is input into each decision tree for training, and each decision tree outputs a training result according to the parameters of the input data;

Step S13: after training, the test data for the test model is input as a test data set without labels, and the decision tree outputs the classification result by utilizing the importance of each parameter of the input data.
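
The split and voting described in steps S11 to S13 can be illustrated with a minimal Python sketch. It is not the patented implementation: the record layout, the shuffling seed and the helper names `split_3_to_1` and `majority_vote` are assumptions made for illustration, and the decision trees themselves are replaced by a hand-written list of per-tree outputs.

```python
import random
from collections import Counter

def split_3_to_1(samples, seed=0):
    """Shuffle labelled samples and split them 3:1 into training and test sets."""
    rng = random.Random(seed)          # fixed seed is an assumption, for repeatability
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = (3 * len(shuffled)) // 4
    return shuffled[:cut], shuffled[cut:]

def majority_vote(tree_outputs):
    """Return the mode of the classes output by the individual decision trees."""
    return Counter(tree_outputs).most_common(1)[0][0]

# Hypothetical labelled records: (feature vector, label).
records = [([i, i % 5], "sensitive" if i % 2 else "non-sensitive") for i in range(20)]
train_set, test_set = split_3_to_1(records)

# Stand-in for the outputs of three trained trees on one test sample.
forest_output = majority_vote(["sensitive", "non-sensitive", "sensitive"])
```

With 20 records the split yields 15 training and 5 test samples, and the vote returns the class that most trees agree on.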

According to the above technical solution, the step S2 further includes the following steps:

Step S21: the data are preprocessed by standardization, and the standardized value Z_ij is calculated from the mathematical expectation E_Xi and the standard deviation S_i of each parameter, the calculation formula being as follows:

$$Z_{ij} = \frac{X_{ij} - E_{X_i}}{S_i}$$

In the formula, X_ij is the j-th value of the i-th parameter. Because the parameters of the data set differ in magnitude, the data needs to be preprocessed in order to improve the accuracy of the model and speed up training;

Step S22: establishing a random forest predictor, inputting the training set and the test set into the predictor, and outputting the prediction result and its accuracy;

Step S23: the number of decision trees in the random forest is set to N₁, N₂, N₃, N₄ and N₅ in turn, the selectable depth of the decision trees is set to H₁, H₂, H₃, H₄ and H₅ in turn, the existing training set is divided into a training set and a validation set, and 10-fold cross-validation is added;

Step S24: the data are divided into 10 parts, 1 of which serves as the validation set; 10 tests are then run, each with a different part as the validation set, yielding the results of 10 models, and their average is taken as the final result.
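
Steps S21 and S24 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the text does not say whether the standard deviation is the population or the sample form (the population form is used here), the interleaved fold assignment is one possible choice, and `score_fold` is a hypothetical callback standing in for training and evaluating a random forest on one fold.

```python
from statistics import mean, pstdev

def standardize(values):
    """Standardize one parameter: subtract its mean, divide by its standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)             # population form; an assumption
    return [(x - mu) / sigma for x in values]

def k_fold_indices(n_samples, k=10):
    """Yield (training, validation) index lists; each sample is validated exactly once."""
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    for held_out in range(k):
        validation = folds[held_out]
        training = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield training, validation

def cross_validate(score_fold, n_samples, k=10):
    """Average a caller-supplied score over the k folds (the final result of S24)."""
    return mean(score_fold(tr, va) for tr, va in k_fold_indices(n_samples, k))

voltages = [218.0, 220.0, 222.0]       # hypothetical readings of one parameter
z_scores = standardize(voltages)
folds = list(k_fold_indices(100))
```

The same `cross_validate` call would be repeated for each candidate pair of tree count and depth from step S23, keeping the pair with the best averaged score.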

According to the above technical solution, the step S3 further includes the following steps:

Step S31: according to the information gain of each parameter on each decision tree of the random forest, outputting the importance ranking of the parameters, counting the votes received by each prediction result, and taking the prediction result with the most votes as the final prediction output of the random forest;

Step S32: the decision trees obtain the classification of the output data into sensitive and non-sensitive data of the electric power Internet of things according to the mode of the classification results for the parameters of the input data;

Step S33: three lightweight symmetric encryption algorithms with low hardware resource requirements are used to encrypt the data: sensitive data is encrypted with the RC6 and Feistel encryption algorithms, and non-sensitive data is encrypted with the SM4 algorithm.
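
The information-gain ranking of step S31 can be illustrated with a small Python sketch. It assumes discrete feature values and computes the gain of a single split over the whole data set, whereas a random forest would accumulate gains over many trees; the feature names and records are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def rank_features(table, labels):
    """Return (feature name, gain) pairs sorted from most to least informative."""
    gains = {name: information_gain(column, labels) for name, column in table.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)

# Hypothetical records: one column per parameter, one label per record.
labels = ["sensitive", "sensitive", "non-sensitive", "non-sensitive"]
table = {
    "contains_ip_list": [1, 1, 0, 0],
    "shift": ["day", "night", "day", "night"],
}
ranking = rank_features(table, labels)
```

In this toy table the first parameter separates the two classes perfectly and is ranked first, while the second carries no information about the label.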

According to the above technical solution, the step S33 further includes the following steps:

Step S331: the RC6 algorithm is used to encrypt one part of the sensitive data. The data is loaded into the four w-bit RC6 registers A, B, C and D, and intermediate values are held in separate variables; registers B and D undergo pre-whitening before the inner loop is executed, the four registers undergo left-rotation, right-rotation and addition operations, and the ciphertext converted from the plaintext is output and stored in the private cloud;

Step S332: the Feistel algorithm is used to encrypt the other part of the sensitive data. The original data input to the Feistel structure is divided into two equal halves K₀ and K₁, the round function F is applied with the multi-round subkeys x₀, x₁, x₂, x₃, and this part of the ciphertext is output through the function calculation;

Step S333: the SM4 algorithm is used to encrypt the non-sensitive data. Plaintext blocks and ciphertext blocks are both 128 bits long, and each block is divided into four equal parts; the encryption key is 128 bits long and is used to generate the round keys, the round-key order in decryption is the reverse of that in encryption, multiple rounds of nonlinear iteration are performed, and the output is stored in the public cloud.
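
The Feistel structure referred to in the steps above, with a block split into two equal halves and a round function applied under successive subkeys, can be sketched as below. This is a toy illustration and not a secure cipher: the round function built from SHA-256, the four literal subkeys and the sample plaintext are all assumptions. It also shows the property mentioned for SM4, namely that decryption runs the same rounds with the subkeys in reverse order.

```python
import hashlib

def round_function(half: bytes, subkey: bytes) -> bytes:
    """Toy round function F (an assumption): hash of subkey and half, truncated."""
    return hashlib.sha256(subkey + half).digest()[:len(half)]

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def feistel(block: bytes, subkeys) -> bytes:
    """Run one pass of a balanced Feistel network over an even-length block."""
    if len(block) % 2:
        raise ValueError("block length must be even")
    half = len(block) // 2
    left, right = block[:half], block[half:]        # the two equal halves
    for subkey in subkeys:
        left, right = right, xor_bytes(left, round_function(right, subkey))
    return right + left                             # final swap makes the pass self-inverse

subkeys = [b"x0", b"x1", b"x2", b"x3"]              # hypothetical round subkeys
plaintext = b"grid-load:42.7kW"                     # 16 bytes of hypothetical data
ciphertext = feistel(plaintext, subkeys)
recovered = feistel(ciphertext, list(reversed(subkeys)))
```

Because each round only XORs one half with a function of the other, the round function never needs to be invertible, which keeps hardware and code size small.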

According to the above technical solution, the step S4 further includes the following steps:

Step S41: credentials are provided to the trust authority at each level by way of stepwise authentication in order to securely access the data stored in the hybrid cloud; the level of authentication required depends on the type of file access the user wishes to perform;

Step S42: reading a data file from the public cloud requires first-level authentication. The user sends the trust authority a request to read the data file from the public cloud together with the user's ID and password; the trust authority checks whether the registered credentials match the credentials provided by the user and, if they match, grants permission to read the file in the public cloud and gives the user the key for decrypting the data;

Step S43: downloading a data file from the public cloud requires second-level authentication. The user sends the trust authority a request to download the data file from the public cloud together with the user's biometric credential; the trust authority verifies the received credential against the registered credential and, when the two match, allows the user to download the requested file from the public cloud and sends the key required to decrypt the file;

Step S44: after the first-level and second-level authentication have been completed successfully, the user may proceed to third-level authentication. The user sends a private cloud request together with the user's credentials; the trust authority receives the user ID, password and biometric credential and, if they match the registered credentials, grants permission to read and download files from the private cloud; otherwise the user's request is refused.
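
The stepwise authentication described above can be sketched in Python as a small trust authority that releases a decryption key only when the required level has been passed. The class and method names, the salted PBKDF2 storage of credentials and the exact-match comparison of biometric data are assumptions made for illustration; a real biometric check would use tolerance-based matching, and the text does not specify how credentials are stored.

```python
import hashlib
import hmac
import secrets

def _digest(secret: bytes, salt: bytes) -> bytes:
    """Salted, deliberately slow hash so raw credentials are never stored."""
    return hashlib.pbkdf2_hmac("sha256", secret, salt, 50_000)

class TrustAuthority:
    """Toy trust authority (hypothetical API) that hands out cloud decryption keys."""

    def __init__(self, public_cloud_key: bytes, private_cloud_key: bytes):
        self._keys = {"public": public_cloud_key, "private": private_cloud_key}
        self._registered = {}   # user_id -> (salt, password digest, biometric digest)
        self._level = {}        # user_id -> highest authentication level passed

    def register(self, user_id: str, password: str, biometric: bytes) -> None:
        salt = secrets.token_bytes(16)
        self._registered[user_id] = (
            salt, _digest(password.encode(), salt), _digest(biometric, salt))

    def _check(self, user_id, password=None, biometric=None) -> bool:
        record = self._registered.get(user_id)
        if record is None:
            return False
        salt, pw_digest, bio_digest = record
        ok = True
        if password is not None:
            ok &= hmac.compare_digest(pw_digest, _digest(password.encode(), salt))
        if biometric is not None:
            ok &= hmac.compare_digest(bio_digest, _digest(biometric, salt))
        return ok

    def read_public(self, user_id, password):
        """Level 1: user ID and password allow reading from the public cloud."""
        if not self._check(user_id, password=password):
            return None
        self._level[user_id] = max(self._level.get(user_id, 0), 1)
        return self._keys["public"]

    def download_public(self, user_id, biometric):
        """Level 2: biometric credential, accepted only after level 1."""
        if self._level.get(user_id, 0) < 1 or not self._check(user_id, biometric=biometric):
            return None
        self._level[user_id] = 2
        return self._keys["public"]

    def access_private(self, user_id, password, biometric):
        """Level 3: ID, password and biometric, accepted only after levels 1 and 2."""
        if self._level.get(user_id, 0) < 2:
            return None
        if not self._check(user_id, password=password, biometric=biometric):
            return None
        return self._keys["private"]

authority = TrustAuthority(b"P" * 16, b"S" * 16)
authority.register("operator-7", "correct horse", b"fingerprint-template")
```

Each method returns the relevant key on success and `None` on refusal, so a caller cannot reach the private cloud key without first passing the two public cloud levels.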

Compared with the prior art, the invention has the following beneficial effects: by providing the data input classification module, the cloud storage module and the multi-level identity verification module, the invention classifies the data generated by electric power Internet of things equipment into sensitive and non-sensitive data through a random forest network, and encrypts the data with three lightweight symmetric encryption algorithms that place low demands on hardware resources, the sensitive data being encrypted with the RC6 and Feistel encryption algorithms and the non-sensitive data with the SM4 algorithm; at the same time, to protect the data stored in the cloud from damage by malicious users, the trust authority verifies users through their multi-level identity verification credentials.

Drawings

The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:

FIG. 1 is a schematic diagram of the system module composition of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Referring to fig. 1, the present invention provides the following technical solutions: the security system based on the Internet of things and big data comprises a data input classification module, a cloud storage module and a multi-level identity verification module, wherein the data input classification module is used for acquiring electric power Internet of things original data to carry out data input and preprocessing, the cloud storage module is used for classifying and encrypting the preprocessed data after classifying the preprocessed data through a random forest, the multi-level identity verification module is used for carrying out security verification of different levels when a user accesses different data of the electric power Internet of things, the data input classification module is electrically connected with the cloud storage module, and the cloud storage module is electrically connected with the multi-level identity verification module.

The data input classification module comprises an electric power original data acquisition module, a random forest classification module and a data preprocessing module. The electric power original data acquisition module is used for acquiring original data from the recorded data generated by electric power Internet of things equipment, the random forest classification module is used for inputting the data into a random forest network for classification, and the data preprocessing module is used for preprocessing the data so as to accelerate training. The electric power original data acquisition module is electrically connected with the random forest classification module and with the data preprocessing module.

The cloud storage module comprises a predictor establishing module, a grid searching and cross verifying module, a characteristic importance sorting module and a cloud storage encryption module, wherein the predictor establishing module is used for predicting the output result accuracy of a training set and a testing set, the grid searching and cross verifying module is used for setting, optimizing and classifying parameter data, the characteristic importance sorting module is used for outputting the importance sorting result and the data type sorting result of each parameter, the cloud storage encryption module is used for encrypting and decrypting different types of data by using different grouping algorithms, and the predictor establishing module is electrically connected with the grid searching and cross verifying module;

The multi-level identity verification module comprises a first-level identity verification module, a second-level identity verification module and a third-level identity verification module, wherein the first-level identity verification module is used for first-level credential verification performed when a user requests to access data in public cloud, the second-level identity verification module is used for second-level credential verification performed when the user requests to access and downloads data in public cloud, the third-level identity verification module is used for third-time user credential verification performed when the user accesses and downloads internet of things power data in private cloud, and the first-level identity verification module, the second-level identity verification module and the third-level identity verification module are electrically connected.

The electric power data security verification method of the security system based on the Internet of things and big data comprises the following steps:

Step S1: various original data sets generated by the electric power internet of things equipment are read, the data sets are used as input of a random forest network, the random forest is a classifier which trains and predicts samples by utilizing a plurality of decision trees, the output types of the classifier are determined by the mode of the output types of individual decision trees, the data generated by the electric power internet of things equipment can be effectively classified by using the random forest, and then different types of data are encrypted by using different encryption algorithms, so that the efficiency of encrypting the electric power internet of things data is effectively improved;

Step S3: calculating the information gain of each feature with the random forest, outputting the feature importance ranking, classifying the output data into sensitive and non-sensitive types, and encrypting the two types of data with different algorithms. Sensitive data refers to data that an electric power Internet of things company should not disclose and that bears on economic interests and network security, including the network structure, the company's IP address list, and the temperature, voltage and similar readings recorded while the power grid is running;

Step S1 further comprises the steps of:

Step S2 further comprises the steps of:

Step S23: the number of decision trees in the random forest is set to N₁, N₂, N₃, N₄ and N₅ in turn, the selectable depth of the decision trees is set to H₁, H₂, H₃, H₄ and H₅ in turn, the existing training set is divided into a training set and a validation set, and 10-fold cross-validation is added. 10-fold cross-validation means dividing the data set into ten parts and, in turn, training on 9 parts and validating on the remaining 1; the average of the 10 results is used as the estimate of the algorithm's accuracy. In general, 10-fold cross-validation is repeated several times, for example 10 runs of 10-fold cross-validation, and the results are averaged to ensure accuracy;

Step S24: the data are divided into 10 parts, 1 of which serves as the validation set; 10 tests are then run, each with a different part as the validation set, yielding the results of 10 models, and their average is taken as the final result. In each run, most of the given modeling samples are used to build the model and the remaining small portion is predicted with the newly built model; the prediction errors of that portion are computed and their sum of squares is recorded. The process continues until every sample has been predicted exactly once, and the squared prediction errors of all samples are then summed.

Step S3 further comprises the steps of:

Step S33: three lightweight symmetric encryption algorithms with low hardware resource requirements are used to encrypt the data: sensitive data is encrypted with the RC6 and Feistel encryption algorithms, and non-sensitive data is encrypted with the SM4 algorithm. RC6, the Feistel encryption algorithm and SM4 are all block ciphers.
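
Of the block ciphers named above, RC6 is compact enough to sketch in full. The Python below follows the published RC6-32/20 description (four 32-bit registers, 20 rounds, pre-whitening of B and D, data-dependent rotations) and is offered as an illustrative sketch rather than the implementation used by the system; the parameter choice w = 32, r = 20 and the sample key and block are assumptions.

```python
import struct

R = 20                                   # number of rounds (assumed: RC6-32/20)
MASK = 0xFFFFFFFF                        # 32-bit word mask
P32, Q32 = 0xB7E15163, 0x9E3779B9        # RC6 constants for 32-bit words

def rotl(x, n):
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK

def rotr(x, n):
    n &= 31
    return ((x >> n) | (x << (32 - n))) & MASK

def key_schedule(key: bytes):
    """Expand the user key into 2R + 4 round-key words."""
    c = max(1, (len(key) + 3) // 4)
    L = list(struct.unpack(f"<{c}I", key.ljust(4 * c, b"\0")))
    S = [(P32 + i * Q32) & MASK for i in range(2 * R + 4)]
    A = B = i = j = 0
    for _ in range(3 * max(c, 2 * R + 4)):
        A = S[i] = rotl((S[i] + A + B) & MASK, 3)
        B = L[j] = rotl((L[j] + A + B) & MASK, A + B)
        i = (i + 1) % (2 * R + 4)
        j = (j + 1) % c
    return S

def encrypt_block(block: bytes, S) -> bytes:
    A, B, C, D = struct.unpack("<4I", block)
    B = (B + S[0]) & MASK                # pre-whitening of B and D
    D = (D + S[1]) & MASK
    for i in range(1, R + 1):
        t = rotl((B * (2 * B + 1)) & MASK, 5)
        u = rotl((D * (2 * D + 1)) & MASK, 5)
        A = (rotl(A ^ t, u) + S[2 * i]) & MASK
        C = (rotl(C ^ u, t) + S[2 * i + 1]) & MASK
        A, B, C, D = B, C, D, A
    A = (A + S[2 * R + 2]) & MASK        # post-whitening of A and C
    C = (C + S[2 * R + 3]) & MASK
    return struct.pack("<4I", A, B, C, D)

def decrypt_block(block: bytes, S) -> bytes:
    A, B, C, D = struct.unpack("<4I", block)
    C = (C - S[2 * R + 3]) & MASK
    A = (A - S[2 * R + 2]) & MASK
    for i in range(R, 0, -1):
        A, B, C, D = D, A, B, C
        u = rotl((D * (2 * D + 1)) & MASK, 5)
        t = rotl((B * (2 * B + 1)) & MASK, 5)
        C = rotr((C - S[2 * i + 1]) & MASK, t) ^ u
        A = rotr((A - S[2 * i]) & MASK, u) ^ t
    D = (D - S[1]) & MASK
    B = (B - S[0]) & MASK
    return struct.pack("<4I", A, B, C, D)

round_keys = key_schedule(bytes(range(16)))          # hypothetical 128-bit key
block = b"ip=10.0.0.7;v=5\n"                         # one 16-byte block of sample data
sealed = encrypt_block(block, round_keys)
opened = decrypt_block(sealed, round_keys)
```

Encryption uses only left rotations and additions; the right rotations and subtractions appear in decryption, which undoes each round in reverse order.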

Step S33 further includes the steps of:

Step S333: the SM4 algorithm is used to encrypt the non-sensitive data. Plaintext blocks and ciphertext blocks are both 128 bits long, and each block is divided into four equal parts; the encryption key is 128 bits long and is used to generate the round keys, the round-key order in decryption is the reverse of that in encryption, multiple rounds of nonlinear iteration are performed, and the output is stored in the public cloud. To balance data transmission efficiency, the non-sensitive data is encrypted with the SM4 symmetric encryption algorithm, in which the same key is used for encryption and decryption and both the encryption algorithm and the key expansion algorithm adopt a multi-round nonlinear iterative structure. Because SM4 uses a symmetric key, the security of the information depends on how well the key is protected, so a dynamic key update strategy is adopted. Under this strategy each key is valid only once; an attacker who has not obtained the complete key must mount on the order of 2^n attacks to recover the original data. On top of the security that SM4 already provides, the dynamic update mechanism further improves security, so that an attacker who cracks a single key still cannot obtain the content of the next encryption.
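
The text does not specify how the dynamic key update is carried out. One possible reading, sketched below in Python, is that a fresh 128-bit key is drawn for every record and handed out for decryption at most once; the class `OneTimeKeyStore`, its method names and the record identifiers are hypothetical, and the SM4 encryption itself is omitted.

```python
import secrets

class OneTimeKeyStore:
    """Hypothetical store that issues a fresh 128-bit key per record, valid once."""

    def __init__(self):
        self._pending = {}               # record_id -> key not yet redeemed

    def issue(self, record_id: str) -> bytes:
        """Draw a new random key for this record; earlier keys are never reused."""
        key = secrets.token_bytes(16)    # 16 bytes = 128 bits, the SM4 key length
        self._pending[record_id] = key
        return key

    def redeem(self, record_id: str):
        """Hand the key out for decryption and invalidate it in the same step."""
        return self._pending.pop(record_id, None)

store = OneTimeKeyStore()
first_key = store.issue("meter-0001/2022-03-10")
second_key = store.issue("meter-0001/2022-03-11")
```

Under this reading, recovering `first_key` tells an attacker nothing about `second_key`, because the two are drawn independently.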

Step S4 further comprises the steps of:

It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.

Finally, it should be noted that the foregoing is only a preferred embodiment of the present invention and does not limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in those embodiments or make equivalent replacements of some of their technical features. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within its scope of protection.

Claims

1. A security system based on the Internet of things and big data, comprising a data input classification module, a cloud storage module and a multi-level identity verification module, characterized in that: the data input classification module is used for acquiring the original data of the electric power Internet of things and for inputting and preprocessing the data, the cloud storage module is used for classifying the preprocessed data through a random forest and encrypting the classified data, the multi-level identity verification module is used for carrying out security verification at different levels when a user accesses different data of the electric power Internet of things, the data input classification module is electrically connected with the cloud storage module, and the cloud storage module is electrically connected with the multi-level identity verification module;

The cloud storage module comprises a cloud storage encryption module, the cloud storage encryption module comprises a private cloud encryption module and a public cloud encryption module, the private cloud encryption module is used for block encryption of the sensitive data classified by the random forest, the public cloud encryption module is used for block encryption of the non-sensitive data classified by the random forest, and the private cloud encryption module is electrically connected with the public cloud encryption module;

the method for encrypting the sensitive data and the non-sensitive data comprises the following steps:

Step a: the RC6 algorithm is used to encrypt one part of the sensitive data; the data is loaded into the four w-bit RC6 registers A, B, C and D, and intermediate values are held in separate variables, wherein registers B and D undergo pre-whitening before the inner loop is executed, the four registers undergo left-rotation, right-rotation and addition operations, and the ciphertext converted from the plaintext is output and stored in the private cloud;

Step b: the Feistel algorithm is used to encrypt the other part of the sensitive data; the original data input to the Feistel structure is divided into two equal halves K₀ and K₁, the round function F is applied with the multi-round subkeys x₀, x₁, x₂, x₃, and this part of the ciphertext is output through the function calculation;

Step c: the SM4 algorithm is used to encrypt the non-sensitive data; plaintext blocks and ciphertext blocks are both 128 bits long, and each block is divided into four equal parts, the encryption key is 128 bits long and is used to generate the round keys, the round-key order in decryption is the reverse of that in encryption, multiple rounds of nonlinear iteration are performed, and the output is stored in the public cloud.

2. The internet of things and big data based security system of claim 1, wherein: the data input classification module comprises an electric power original data acquisition module, a random forest classification module and a data preprocessing module, wherein the electric power original data acquisition module is used for acquiring original data from the recorded data generated by electric power Internet of things equipment, the random forest classification module is used for inputting the data into a random forest network for classification, the data preprocessing module is used for preprocessing the data so as to accelerate training, and the electric power original data acquisition module is electrically connected with the random forest classification module and with the data preprocessing module.

3. The internet of things and big data based security system of claim 2, wherein: the cloud storage module comprises a predictor building module, a grid searching and cross verifying module and a characteristic importance sorting module, wherein the predictor building module is used for predicting the accuracy of output results of a training set and a testing set, the grid searching and cross verifying module is used for setting, optimizing and sorting parameter data, the characteristic importance sorting module is used for outputting importance sorting results and data type sorting results of all parameters, the cloud storage encryption module is used for encrypting and decrypting different types of data by using different grouping algorithms, the predictor building module is electrically connected with the grid searching and cross verifying module, and the characteristic importance sorting module is electrically connected with the cloud storage encryption module.

4. A security system based on internet of things and big data according to claim 3, characterized in that: the multistage identity verification module comprises a first-stage identity verification module, a second-stage identity verification module and a third-stage identity verification module, wherein the first-stage identity verification module is used for first-stage credential verification performed when a user requests access to data in public cloud, the second-stage identity verification module is used for second-stage credential verification performed when the user requests access to and downloads data in public cloud, the third-stage identity verification module is used for third-time user credential verification performed when the user accesses to and downloads internet of things power data in private cloud, and the first-stage identity verification module, the second-stage identity verification module and the third-stage identity verification module are electrically connected.

5. A power data security verification method based on the internet of things and big data based security system of any one of claims 1-4, the method comprising the steps of:

step S3: calculating information gain of each feature by using a random forest, outputting a feature importance sorting result, classifying to obtain sensitive and non-sensitive output data, and encrypting the two types of data by using different algorithms;

6. The power data security verification method according to claim 5, wherein: the step S1 further comprises the steps of:

7. The power data security verification method according to claim 6, wherein: the step S2 further comprises the steps of:

Step S21: preprocessing the data by standardization, and calculating the standardized value Z_ij from the mathematical expectation E_Xi and the standard deviation S_i of each parameter, the calculation formula being as follows:

$$Z_{ij} = \frac{X_{ij} - E_{X_i}}{S_i}$$

where X_ij is the j-th value of the i-th parameter; because the parameters of the data set differ in magnitude, the data needs to be preprocessed in order to improve the accuracy of the model and speed up training;

Step S23: the number of decision trees in the random forest is set to N₁, N₂, N₃, N₄ and N₅ respectively, the selectable depths of the decision trees are set to H₁, H₂, H₃, H₄ and H₅ respectively, the existing training set is divided into a training set and a validation set, and 10-fold cross-validation is added;

8. The power data security verification method according to claim 7, wherein: the step S3 further includes the steps of:

9. The power data security verification method according to claim 8, wherein: the step S4 further includes the steps of: