CN111291046B - Computer big data storage control system and method - Google Patents

Publication number
CN111291046B
CN111291046B CN202010046920.2A CN202010046920A CN111291046B CN 111291046 B CN111291046 B CN 111291046B CN 202010046920 A CN202010046920 A CN 202010046920A CN 111291046 B CN111291046 B CN 111291046B
Authority
CN
China
Prior art keywords
data
module
information
file
storage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010046920.2A
Other languages
Chinese (zh)
Other versions
CN111291046A (en
Inventor
付媛媛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan City University
Original Assignee
Hunan City University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan City University
Priority to CN202010046920.2A
Publication of CN111291046A
Application granted
Publication of CN111291046B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2219 Large Object storage; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/04 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/12 Applying verification of the received information
    • H04L63/123 Applying verification of the received information received data contents, e.g. message integrity
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08 Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0861 Generation of secret information including derivation or calculation of cryptographic keys or passwords

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computer Hardware Design (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Bioethics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of computer applications and discloses a computer big data storage control system and method. A key parameter generation module generates an encryption key through a key algorithm according to input security parameters and generates tag information for the stored file. A data storage module selects a storage mode according to the read frequency or the size of the data file. An integrity verification module sends a detection request to the server; the server computes detection information from the tag information and the request information, and checks the storage result using the detection information and the tag information. The data storage module uses a redundancy-first, encode-later storage strategy, which improves data processing capacity and is intended to keep pace with the current explosive growth in data generation. The integrity verification module verifies whether the stored data is complete, which guards against stored data being deleted, lost, or tampered with.

Description

Computer big data storage control system and method
Technical Field
The invention belongs to the technical field of computer application, and particularly relates to a computer big data storage control system and method.
Background
The closest prior art is as follows. With information technology evolving rapidly, people's daily activities produce large amounts of data. As this data accumulates day after day, its volume grows and its types multiply, forming what is known as big data. Traditional computer information processing technology can handle only a limited amount of data and cannot keep up with the current explosive growth in data generation.
Computer information processing technology therefore needs to be innovated and made more effective so that data can be better collected, organized, processed, and applied. To keep such technology efficient, data needs to be classified in advance. At the same time, the security of information processing needs to be improved, and this depends on the processing technology itself being advanced. Relationships also exist among massive volumes of data and among different types of data. Ensuring that information is stored securely during data processing is a difficult problem. Only by continually analyzing and upgrading information security technology can data be put to full use while its integrity and stability are ensured.
In summary, the problems of the prior art are:
(1) Existing computer information processing technology can handle only a limited amount of data and cannot keep up with the current explosive growth in data generation.
(2) Existing computer information processing technology cannot guarantee the security of data processing.
Disclosure of Invention
To address the problems in the prior art, the invention provides a computer big data storage control system and method.
The invention is realized as follows. A computer big data storage control system comprises:
a data acquisition module, connected to the central main control module, which acquires data uploaded by a user through a computer network terminal and transmits the data to the data encoding module. The module builds a sample from the acquired data, preprocesses the data in the sample, and then compresses and transmits it. During preprocessing, missing data is handled as follows: samples with missing values are deleted using a deletion function; after deletion, missing values are replaced by assigned values; the randomly simulated completed data set is stored in imp, and linear regression is performed on imp;
a data encoding module, connected to the central main control module, which computes an encoded array from the acquired data according to the parameters of the configuration center and stores the data according to that array. A suitable neural network model is selected and data features are extracted from the input data; a multi-layer neural network is built from the extracted features; the whole deep neural network is trained layer by layer, each layer with its own training method; the parameters of the first layer are computed and the output of its hidden layer is used as the input of the next layer; this process is repeated so that the parameters of each layer are trained in turn, which realizes the encoding of the data;
a key parameter generation module, connected to the central main control module, which generates an encryption key through a key algorithm according to the input security parameters and generates tag information for the stored file. The user inputs the security parameters and generates a key pair and an encryption key; the public key is published and is used to generate tag information for the files of the storage result, and the private key is kept by the user. When the encryption key is generated through the Keyall algorithm, two strong primes p and q are generated first, and M = pq and F(M) = (p-1)(q-1) are computed; an odd positive integer a is then generated such that G(a, F(M)) = 1; p and q serve as the private key, and a and M serve as the public key;
a data storage module, connected to the central main control module, which selects a storage mode according to the read frequency or the size of the data file. The module stores data files under a redundancy-first, encode-later strategy: small files and frequently used large files are stored as redundant backups, and large files that have not been used for a long time are stored with RS coding;
an integrity verification module, connected to the central main control module, which sends a detection request to the server; the server computes detection information from the tag information and the request information, and checks the storage result using the detection information and the tag information. The user sends key information to the server as a detection request; the server performs the computation over the stored data according to the request; after receiving the information returned by the server, the user decrypts it and verifies whether it is complete. If it is complete, verification succeeds and the big data storage result is correct; otherwise the storage result is in error, and the storage results need to be verified one by one to locate the erroneous one;
a data recovery module, connected to the central main control module, which recovers data according to the feedback of the integrity verification module, the failed-node information, the data recovery parameters, and the position of the failed node in the array obtained by analysis. The module monitors the time taken to transmit data between the master node and the slave nodes and judges a node to have failed if no reply is received from it within the set time; it reads the data recovery parameters from the configuration center, determines the position of the lost node or block in the coding array from the node or block failure information and those parameters, and sends the position to a decoding unit; it reads the load-balancing parameters from the configuration center and selects a new node list according to those parameters and the load state of each node; it selects a decoding scheme according to the position of the lost node or block in the coding array and reads the required surviving block data from the server; it then decodes the surviving block data to obtain the lost block and stores the recovered block on a new node in the server according to the selected node list;
a configuration module, connected to the central main control module, which presets the system parameters and extracts the corresponding configuration information according to control instructions;
a data management module, connected to the central main control module, which adds, deletes, modifies, and backs up the stored data content;
a data classification module, connected to the central main control module, which classifies the stored content using a data classification method;
a query module, connected to the central main control module, which searches for content through voice or keyboard input;
the central main control module, connected to the data acquisition module, the data encoding module, the key parameter generation module, the data storage module, the integrity verification module, the data recovery module, the configuration module, the data management module, the data classification module, the query module, the wireless signal transceiver module, and the cloud server, which coordinates the normal operation of each module;
a wireless signal transceiver module, connected to the central main control module, which connects to the cloud server through a wireless signal transceiver to transmit data;
a cloud server, connected to the central main control module, whose host service configuration and service scale can be configured according to user needs and which is used for data sharing.
Further, the data management module includes:
an add/delete module, through which the user inputs an add or delete instruction as required, and the data management system adds or deletes the corresponding content;
a modification module, through which the user inputs a modification command as required, and the data management system modifies the corresponding content;
a backup module, which monitors and tracks updates to the data uploaded by the user and to the important target files being tracked, and transmits the update log to the backup system over the network in real time; the backup system updates the disk according to the log.
Another object of the present invention is to provide a computer big data storage control method for the above computer big data storage control system, the method comprising:
step one: the data acquisition module acquires data uploaded by a user through a computer network terminal, and the data classification module classifies the stored content using a data classification method;
step two: after classification, the data is transmitted to the data encoding module, which computes an encoded array from the acquired data according to the parameters of the configuration center and encodes the data according to that array; after encoding, the data storage module selects a storage mode according to the read frequency or the size of the data file and stores the file;
step three: the integrity verification module sends a detection request to the server; the server computes detection information from the tag information and the request information, and checks the storage result using the detection information and the tag information; the data recovery module recovers data according to the feedback of the integrity verification module, the failed-node information, the data recovery parameters, and the position of the failed node in the array obtained by analysis;
step four: after data recovery is complete, the key parameter generation module generates an encryption key through a key algorithm according to the input security parameters and generates tag information for the stored file;
step five: during storage, the configuration module presets the system parameters and extracts the corresponding configuration information according to control instructions; the data management module adds, deletes, modifies, and backs up the stored data content; the query module searches for content through voice or keyboard input;
step six: the wireless signal transceiver module connects to the cloud server through a wireless signal transceiver to transmit data; the host service configuration and service scale of the cloud server can be configured according to user needs, and the cloud server is used for data sharing.
Further, in step one, the data acquisition module processes the acquired data as follows:
a sample is built from the acquired data and the data in the sample is preprocessed; after preprocessing, the data is compressed and transmitted;
during preprocessing, missing data is handled as follows:
samples with missing values are deleted using a deletion function; after deletion, missing values are replaced by assigned values; the randomly simulated completed data set is stored in imp, and linear regression is performed on imp.
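The missing-data steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say which deletion function or which assigned value is used, so listwise deletion and column-mean imputation are stand-ins, and the helper names `drop_missing`, `impute_mean`, and `fit_line` are hypothetical. The name `imp` follows the text.

```python
from statistics import mean


def drop_missing(rows):
    """Listwise deletion: keep only rows that contain no missing (None) value."""
    return [r for r in rows if None not in r]


def impute_mean(rows):
    """Replace each None with the mean of the observed values in its column.

    Mean imputation is an assumption; the source only says a value is assigned.
    """
    columns = list(zip(*rows))
    col_means = [mean(v for v in col if v is not None) for col in columns]
    return [tuple(col_means[j] if v is None else v for j, v in enumerate(r))
            for r in rows]


def fit_line(rows):
    """Ordinary least squares for y = slope * x + intercept over (x, y) rows."""
    xs, ys = zip(*rows)
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


samples = [(1.0, 2.0), (2.0, None), (3.0, 6.0), (4.0, 8.0)]
complete = drop_missing(samples)   # rows with no missing value
imp = impute_mean(samples)         # completed data set
slope, intercept = fit_line(imp)   # linear regression on imp
```

Keeping `complete` and `imp` side by side makes it easy to compare a regression on the reduced sample with one on the completed data set.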
Further, in step two, the data encoding module encodes the data as follows:
a suitable neural network model is selected, and data features are extracted from the input data;
a multi-layer neural network is built from the extracted features; the whole deep neural network is trained layer by layer, each layer with its own training method;
the parameters of the first layer are computed, and the output of its hidden layer is used as the input of the next layer;
this process is repeated so that the parameters of each layer are trained in turn, which realizes the encoding of the data.
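The layer-by-layer procedure above resembles greedy layer-wise pretraining of stacked autoencoders. The text does not name the network type or the training method, so the sigmoid autoencoder, squared-error loss, learning rate, epoch count, and layer sizes in this sketch are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(W, b, x):
    """One dense sigmoid layer: returns sigmoid(W x + b)."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]


def train_autoencoder(data, n_hidden, epochs=200, lr=0.5, seed=0):
    """Train one autoencoder with plain SGD; return its encoder weights (W1, b1)."""
    rng = random.Random(seed)
    n_in = len(data[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    b2 = [0.0] * n_in
    for _ in range(epochs):
        for x in data:
            h = forward(W1, b1, x)          # hidden code
            y = forward(W2, b2, h)          # reconstruction of x
            # Error signal at the output for squared error with sigmoid units.
            d_out = [(yi - xi) * yi * (1 - yi) for yi, xi in zip(y, x)]
            # Back-propagate to the hidden layer before W2 is changed.
            d_hid = [hj * (1 - hj) * sum(d_out[i] * W2[i][j] for i in range(n_in))
                     for j, hj in enumerate(h)]
            for i in range(n_in):
                for j in range(n_hidden):
                    W2[i][j] -= lr * d_out[i] * h[j]
                b2[i] -= lr * d_out[i]
            for j in range(n_hidden):
                for k in range(n_in):
                    W1[j][k] -= lr * d_hid[j] * x[k]
                b1[j] -= lr * d_hid[j]
    return W1, b1


def pretrain_stack(data, layer_sizes):
    """Train each layer on the hidden output of the layer before it."""
    encoders, current = [], data
    for n_hidden in layer_sizes:
        W, b = train_autoencoder(current, n_hidden)
        encoders.append((W, b))
        current = [forward(W, b, x) for x in current]   # input of the next layer
    return encoders, current


data = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
encoders, codes = pretrain_stack(data, layer_sizes=[3, 2])
```

Here `codes` holds the output of the last hidden layer for each input, which plays the role of the encoded representation in this sketch.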
Further, in step two, the data storage module stores data files under a redundancy-first, encode-later strategy: small files and frequently used large files are stored as redundant backups, and large files that have not been used for a long time are stored with RS coding.
Further, the redundancy-first, encode-later storage strategy specifically includes:
when a file is uploaded to the server, it is stored as a redundant backup, and a last-read time is added to the file meta-information and set to the current timestamp;
the server checks the 'file size' and 'last read time' in each file's meta-information and skips files already stored with RS coding and files smaller than 100 MB; for a file larger than 100 MB, if it was last read within the past 3 days it is treated as hot data and skipped; otherwise it is judged not to have been used for a long time, is stored with RS coding, and its previous redundant backup is deleted;
when a file that is read is stored as a redundant backup, its last-read time is updated;
when a file that is read is stored with RS coding, nothing is done if the file is intact; if the file is damaged, RS decoding is performed on its remaining data blocks to obtain the source data, and the recovered source data is stored again as a redundant backup.
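The decision logic of this strategy can be sketched as a small state machine over file metadata. The 100 MB and 3 day thresholds come from the text; everything else is an assumption made for illustration: the `FileMeta` class, the function names, reading 100 MB as 100 MiB, treating a file of exactly the threshold size as large, and resetting the last-read time when a damaged file is re-replicated. The actual RS encoding and decoding are reduced to a mode flag.

```python
from dataclasses import dataclass

SIZE_THRESHOLD = 100 * 1024 * 1024  # "100MB" from the text, read here as MiB (assumption)
HOT_WINDOW = 3 * 24 * 3600          # "3 days" from the text, in seconds


@dataclass
class FileMeta:
    name: str
    size: int              # bytes
    last_read: float       # seconds on some shared clock
    mode: str = "replica"  # "replica" (redundant backup) or "rs" (RS-coded)


def on_upload(name, size, now):
    """New files always start as redundant backups with last_read = now."""
    return FileMeta(name, size, last_read=now)


def scan(files, now):
    """Periodic server pass: re-store large, cold, replicated files with RS coding."""
    for f in files:
        if f.mode == "rs" or f.size < SIZE_THRESHOLD:
            continue                      # already coded, or a small file
        if now - f.last_read <= HOT_WINDOW:
            continue                      # hot data, keep the replicas
        f.mode = "rs"                     # real system: encode, then delete replicas


def on_read(f, now, damaged=False):
    """Update metadata on read; a damaged RS file is decoded and re-replicated."""
    if f.mode == "replica":
        f.last_read = now
    elif damaged:
        f.mode = "replica"                # real system: RS-decode surviving blocks first
        f.last_read = now                 # assumption: treated like a fresh upload


DAY = 24 * 3600
small = on_upload("small.log", 10 * 1024, now=0)
big_hot = on_upload("hot.bin", 200 * 1024 * 1024, now=0)
big_cold = on_upload("cold.bin", 200 * 1024 * 1024, now=0)
on_read(big_hot, now=4 * DAY)             # read one day before the scan
scan([small, big_hot, big_cold], now=5 * DAY)
```

After the scan, only `big_cold` has moved to RS coding: `small` is under the size threshold and `big_hot` was read within the hot window.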
Further, in step three, the detection steps used by the integrity verification module are as follows:
first, the user sends key information to the server as a detection request;
second, the server performs the computation over the stored data according to the user's detection request;
third, after receiving the information returned by the server, the user decrypts it;
fourth, the user verifies whether the returned information is complete; if it is complete, verification succeeds and the big data storage result is correct; otherwise the storage result is in error, and the storage results need to be verified one by one to locate the erroneous one;
the algorithm is as follows: a data file H to be checked is input, file blocks $m_i$ ($1<i<n$) of H are selected, and a random number $r$ is chosen for the check; then $a_r = a^r \bmod M$ is computed, and detection data 1 is computed:
[equation image BDA0002369750450000061, not reproduced in the text]
second, the tag information $T_i$ corresponding to the file blocks $m_i$ ($1<i<n$) is selected, and the following is computed:
[equation image BDA0002369750450000071, not reproduced in the text]
finally, detection data 2 is computed: $R' = s^r \bmod M$, and it is verified whether $R$ and $R'$ are equal; 'T' is returned if they are equal, and 'F' otherwise.
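Because the two formulas in this passage survive only as image references, the following sketch fills them with one plausible reading and should not be taken as the patent's actual equations. It assumes tags of the form T_i = a^{m_i} mod M, detection data 1 of the form R = a_r^{sum of m_i} mod M, and s equal to the product of the T_i mod M, so that R' = s^r mod M equals R when the stored blocks are intact. The function names, the toy key, and the placement of the final check on the user side are also assumptions.

```python
import secrets


def make_tags(blocks, a, M):
    """Assumed tag form: T_i = a^{m_i} mod M, built with the public key (a, M)."""
    return [pow(a, m, M) for m in blocks]


def make_challenge(a, M):
    """Pick a random r and derive a_r = a^r mod M for the detection request."""
    r = secrets.randbelow(M - 2) + 1
    return r, pow(a, r, M)


def server_response(stored_blocks, a_r, M):
    """Assumed detection data 1: R = a_r^(sum of m_i) mod M over the stored blocks."""
    return pow(a_r, sum(stored_blocks), M)


def verify(tags, r, R, M):
    """Assumed s = product of T_i mod M; detection data 2 is R' = s^r mod M."""
    s = 1
    for t in tags:
        s = (s * t) % M
    return "T" if pow(s, r, M) == R else "F"


# Toy public key for illustration only (p = 61, q = 53); far too small for real use.
a, M = 17, 61 * 53
blocks = [12, 34, 56]                  # file blocks m_i encoded as integers
tags = make_tags(blocks, a, M)

r, a_r = make_challenge(a, M)
result = verify(tags, r, server_response(blocks, a_r, M), M)   # intact copy
```

Under these assumptions the check works because s^r = a^{r * sum(m_i)} = a_r^{sum(m_i)} (mod M), so a response computed over altered blocks will generally fail to match.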
Further, in step three, the data recovery process used by the data recovery module specifically includes:
monitoring the time taken to transmit data between the master node and the slave nodes, and judging a node to have failed if no reply is received from it within the set time;
reading the data recovery parameters from the configuration center, determining the position of the lost node or block in the coding array from the node or block failure information and those parameters, and sending the position to a decoding unit;
reading the load-balancing parameters from the configuration center, and selecting a new node list according to those parameters and the load state of each node;
selecting a decoding scheme according to the position of the lost node or block in the coding array, and reading the required surviving block data from the server;
decoding the surviving block data to obtain the lost block, and storing the recovered block on a new node in the server according to the selected node list.
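The recovery flow above (timeout-based failure detection, locating the lost block, choosing a new node by load, decoding, and re-storing) can be sketched as follows. To keep the example short, a single XOR parity block stands in for the RS code, which tolerates only one lost block; the node names, load values, timeout, and all function names are hypothetical.

```python
from functools import reduce


def failed_nodes(last_seen, now, timeout):
    """A node is judged failed if it has not replied within `timeout` seconds."""
    return sorted(n for n, t in last_seen.items() if now - t > timeout)


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks (stand-in for RS encode/decode)."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def pick_new_node(load, exclude):
    """Choose the least-loaded node that is not already part of the stripe."""
    candidates = {n: l for n, l in load.items() if n not in exclude}
    return min(candidates, key=candidates.get)


def recover(placement, store, lost_node, load):
    """Rebuild the block held by `lost_node` and store it on a new node."""
    lost_pos = placement.index(lost_node)                  # position in the coding array
    survivors = [store[n] for n in placement if n != lost_node]
    block = xor_blocks(survivors)                          # decode from surviving blocks
    target = pick_new_node(load, exclude=set(placement))
    store[target] = block
    placement[lost_pos] = target
    return lost_pos, target


d0, d1 = b"abcd", b"wxyz"
store = {"n1": d0, "n2": d1, "n3": xor_blocks([d0, d1])}   # two data blocks plus parity
placement = ["n1", "n2", "n3"]
load = {"n1": 0.2, "n2": 0.9, "n3": 0.5, "n4": 0.1, "n5": 0.7}

down = failed_nodes({"n1": 100, "n2": 40, "n3": 98}, now=105, timeout=30)
lost_pos, target = recover(placement, store, down[0], load)
```

With a real RS code the decoding step would solve for several lost blocks at once, but the surrounding bookkeeping would look much the same.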
Further, in step four, the key parameter generation module specifically operates as follows:
the user inputs the security parameters and generates a key pair and an encryption key; the public key is published and is used to generate tag information for the files of the storage result, and the private key is kept by the user;
when the encryption key is generated through the Keyall algorithm, two strong primes p and q are generated first, and M = pq and F(M) = (p-1)(q-1) are computed; an odd positive integer a is then generated such that G(a, F(M)) = 1; p and q serve as the private key, and a and M serve as the public key.
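The key generation step can be illustrated with a toy-sized sketch. It reads G as the greatest common divisor, which is an assumption, and it uses small random primes found by trial division rather than the strong primes the text calls for, so it is suitable only for illustration. The function names and the 16-bit default are hypothetical.

```python
import math
import secrets


def is_prime(n):
    """Trial division; adequate only for the toy key sizes used here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    return all(n % d for d in range(3, math.isqrt(n) + 1, 2))


def random_prime(bits):
    """Random prime with the top bit set (not checked for strong-prime properties)."""
    while True:
        candidate = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        if is_prime(candidate):
            return candidate


def keygen(bits=16):
    """Return ((p, q), (a, M)): private primes and the public pair."""
    p = random_prime(bits)
    q = random_prime(bits)
    while q == p:
        q = random_prime(bits)
    M = p * q
    F = (p - 1) * (q - 1)                        # F(M) in the text
    while True:
        a = (secrets.randbelow(F - 3) + 3) | 1   # odd candidate in [3, F - 1]
        if math.gcd(a, F) == 1:                  # G(a, F(M)) = 1, read as gcd
            return (p, q), (a, M)


private_key, public_key = keygen()
```

A production implementation would use a vetted cryptographic library and much larger primes instead of this hand-rolled search.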
In summary, the advantages and positive effects of the invention are as follows. The data storage module selects a storage mode according to the read frequency or the size of the data file; its redundancy-first, encode-later storage strategy improves data processing capacity and is intended to keep pace with the current explosive growth in data generation. The integrity verification module verifies whether the stored data is complete, which guards against stored data being deleted, lost, or tampered with and helps ensure the security of the stored data.
In the invention, the data encoding module, which computes an encoded array from the acquired data according to the parameters of the configuration center and encodes and stores the data according to that array, performs neural network coding, which gives the data good fault tolerance, self-organization, and adaptability. During data compression, the neural network can complete image coding and compression autonomously according to the characteristics of the information, without relying on a predetermined data coding algorithm. In addition, the way missing data is handled during preprocessing helps ensure the authenticity and reliability of the data.
Drawings
FIG. 1 is a schematic diagram of a computer big data storage control system according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of a data management module according to an embodiment of the present invention.
FIG. 3 is a flowchart of a computer big data storage control method according to an embodiment of the present invention.
FIG. 4 is a schematic diagram comparing the detection execution times of the integrity verification module according to an embodiment of the present invention.
In the figures: 1, data acquisition module; 2, data encoding module; 3, key parameter generation module; 4, data storage module; 5, integrity verification module; 6, data recovery module; 7, configuration module; 8, data management module; 9, data classification module; 10, query module; 11, central main control module; 12, wireless signal transceiver module; 13, cloud server.
Detailed Description
The present invention will be described in further detail with reference to the following examples in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
As shown in fig. 1, a computer big data storage control system provided by an embodiment of the present invention includes:
The data acquisition module 1 is connected with the central main control module 11, acquires data uploaded by a user through a computer network terminal, and transmits the data to the data encoding module.
The data coding module 2 is connected with the central main control module 11, and the obtained data information calculates a coded array according to the parameters of the configuration center and stores the coded array according to the array.
The key parameter generating module 3 is connected with the central main control module 11, generates an encryption key through a key algorithm according to the input security parameters, and generates tag information for the storage file.
The data storage module 4 is connected with the central main control module 11 and selects a corresponding storage mode according to the reading frequency or the file size of the data file.
The integrity verification module 5 is connected with the central main control module 11, sends out a detection request to the server, and calculates the detection information through the label information and the request information to obtain detection information; the server detects the stored result by the detection information and the tag information.
The data recovery module 6 is connected with the central main control module 11, and performs data recovery according to the feedback information of the integrity verification module, the failure node information, the data recovery related parameters and the corresponding position information of the analysis failure node in the array.
The configuration module 7 is connected with the central main control module 11 and is used for carrying out preset configuration on various parameters in the system and extracting corresponding configuration information according to the control instruction.
The data management module 8 is connected with the central main control module 11 and used for adding, deleting, modifying and backing up the stored data content.
The data classification module 9 is connected with the central main control module 11 and classifies the stored content by utilizing a data classification method.
The inquiry module 10 is connected with the central main control module 11, and searches corresponding contents through voice input or keyboard input.
The central main control module 11 is respectively connected with the data acquisition module 1, the data encoding module 2, the key parameter generation module 3, the data storage module 4, the integrity verification module 5, the data recovery module 6, the configuration module 7, the data management module 8, the data classification module 9, the query module 10, the wireless signal receiving and transmitting module 12 and the cloud server 13, and is used for coordinating the normal operation of each module.
The wireless signal transceiver module 12 is connected with the central main control module 11, and is connected with the cloud server through a wireless signal transceiver to realize data transmission.
The cloud server 13 is connected with the central main control module 11; its host service configuration and service scale can be configured according to the needs of users, and it is used for realizing data sharing.
As shown in fig. 2, the data management module provided in an embodiment of the present invention includes:
an adding and deleting module, through which the user inputs a delete or add instruction as required, and the data management system deletes or adds the corresponding content;
a modification module, through which the user inputs a modification instruction as required, and the data management system modifies the corresponding content;
a backup module, which monitors and tracks the data uploaded by the user and updates to the important target files being tracked, transmits the update log to the backup system in real time over the network, and lets the backup system update the disk according to the log.
The data coding module 2 is connected with the central main control module 11; for the obtained data information it calculates a coding array according to the parameters of the configuration center, and encodes and stores the data according to that array. In the embodiment of the invention, the encoding comprises the following steps:
selecting a suitable neural network model, and extracting corresponding data features from the input data information;
establishing a corresponding multi-layer neural network according to the extracted data feature information; the whole deep neural network is trained layer by layer, each layer with its corresponding training method;
calculating the parameters of the first-layer neural network, and taking the output of its hidden layer as the input of the next layer;
repeating the above process so that the network parameters of each layer are trained in sequence, thereby realizing the coding of the data.
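The layer-by-layer procedure above can be illustrated with a minimal sketch of greedy training for a stack of small autoencoders. The source does not name the network model or the training method, so the tied-weight sigmoid autoencoder, plain stochastic gradient descent, and all names (`AutoencoderLayer`, `train_stack`, `encode`) are assumptions made for illustration only.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class AutoencoderLayer:
    """One hidden layer trained to reconstruct its own input (tied weights).

    Assumption: the source does not specify the layer type; this is a stand-in.
    """

    def __init__(self, n_in, n_hidden, rng):
        self.w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                  for _ in range(n_hidden)]
        self.b_hidden = [0.0] * n_hidden
        self.b_out = [0.0] * n_in

    def encode(self, x):
        return [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
                for row, b in zip(self.w, self.b_hidden)]

    def decode(self, h):
        return [sigmoid(sum(self.w[j][i] * h[j] for j in range(len(h))) + b)
                for i, b in enumerate(self.b_out)]

    def train(self, samples, epochs=100, lr=0.5):
        for _ in range(epochs):
            for x in samples:
                h = self.encode(x)
                y = self.decode(h)
                # Gradients of the squared reconstruction error.
                d_out = [(yi - xi) * yi * (1 - yi) for yi, xi in zip(y, x)]
                d_hid = [sum(self.w[j][i] * d_out[i] for i in range(len(x)))
                         * h[j] * (1 - h[j]) for j in range(len(h))]
                for j in range(len(h)):
                    for i in range(len(x)):
                        self.w[j][i] -= lr * (d_out[i] * h[j] + d_hid[j] * x[i])
                    self.b_hidden[j] -= lr * d_hid[j]
                for i in range(len(x)):
                    self.b_out[i] -= lr * d_out[i]


def train_stack(samples, layer_sizes, seed=0):
    """Train each layer in turn; its hidden output is the next layer's input."""
    rng = random.Random(seed)
    layers, current = [], samples
    for n_hidden in layer_sizes:
        layer = AutoencoderLayer(len(current[0]), n_hidden, rng)
        layer.train(current)
        layers.append(layer)
        current = [layer.encode(x) for x in current]
    return layers


def encode(layers, x):
    for layer in layers:
        x = layer.encode(x)
    return x


samples = [[0.0, 0.0, 1.0, 1.0], [1.0, 1.0, 0.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
layers = train_stack(samples, [3, 2])
code = encode(layers, samples[0])  # 2-value code for a 4-value input
```

The output of the last hidden layer plays the role of the encoded data here; how the patent maps such a code onto the coding array is not described in the source.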
The data acquisition module 1 is connected with the central main control module 11 and acquires data uploaded by a user through a computer network terminal. It processes the acquired data as follows:
building corresponding samples from the acquired data, and preprocessing the data in the samples; after the preprocessing is completed, compressing and transmitting the corresponding data.
In the data preprocessing process, missing data is handled as follows:
deleting the samples with missing data by using a corresponding deletion function; after the deletion is completed, replacing the remaining missing values with assigned values; the completed data sets are generated by random simulation and stored in imp, and linear regression is performed on imp.
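A minimal sketch of the missing-data handling described above follows. The source gives no function names or imputation rule, so the helpers (`drop_incomplete`, `impute_mean`, `fit_line`) are hypothetical, and column-mean imputation is used here as a simple stand-in for the random simulation that the source stores in imp.

```python
def drop_incomplete(rows, max_missing=1):
    """Delete samples that have more than max_missing missing (None) fields."""
    return [r for r in rows if sum(v is None for v in r) <= max_missing]


def impute_mean(rows):
    """Replace each remaining missing value with the mean of its column.

    Assumption: stands in for the random simulation mentioned in the source.
    """
    means = []
    for c in range(len(rows[0])):
        present = [r[c] for r in rows if r[c] is not None]
        means.append(sum(present) / len(present))
    return [[means[c] if v is None else v for c, v in enumerate(r)]
            for r in rows]


def fit_line(rows):
    """Ordinary least squares for y = intercept + slope * x on (x, y) rows."""
    n = len(rows)
    mean_x = sum(r[0] for r in rows) / n
    mean_y = sum(r[1] for r in rows) / n
    sxx = sum((r[0] - mean_x) ** 2 for r in rows)
    sxy = sum((r[0] - mean_x) * (r[1] - mean_y) for r in rows)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


raw = [[1.0, 2.0], [2.0, 4.0], [None, None], [3.0, None], [4.0, 8.0]]
imp = impute_mean(drop_incomplete(raw))   # completed data set
intercept, slope = fit_line(imp)          # linear regression on imp
```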
The key parameter generating module 3 provided in the embodiment of the present invention operates as follows:
the user inputs the security parameters, and a key pair and an encryption key are generated; the generated public key is made public and is used for generating tag information for the files of the storage result, and the private key is kept by the user;
when the encryption key is generated through the Keyall algorithm, two strong prime numbers $p$ and $q$ are first generated, and then $M = pq$ and $F(M) = (p-1)(q-1)$ are calculated; an odd positive integer $a$ is then generated such that $\gcd(a, F(M)) = 1$; $p$ and $q$ are taken as the private key, and $a$ and $M$ as the public key.
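The key generation step can be sketched as below. The function name `generate_keys` is hypothetical, and the toy primes 61 and 53 are far too small to be secure and are not strong primes; generating real strong primes is outside this sketch.

```python
import math
import random


def generate_keys(p, q, rng=None):
    """Return ((a, M), (p, q)): public key (a, M) and private key (p, q)."""
    rng = rng or random.Random()
    modulus = p * q                     # M = pq
    f_m = (p - 1) * (q - 1)             # F(M) = (p-1)(q-1)
    while True:
        a = rng.randrange(3, f_m, 2)    # odd candidate
        if math.gcd(a, f_m) == 1:       # require gcd(a, F(M)) = 1
            return (a, modulus), (p, q)


public_key, private_key = generate_keys(61, 53, random.Random(1))
a, M = public_key
```

Drawing `a` at random and retrying until the gcd condition holds is one simple way to satisfy the stated constraint; the source does not say how `a` is chosen.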
The data storage module 4 provided by the embodiment of the invention adopts a storage strategy of redundancy first and encoding later to store data files, adopts a redundancy backup mode to store small files and frequently used large files, and adopts an RS encoding mode to store large files which are not used for a long time.
The storage strategy of redundancy-first and encoding-later specifically comprises:
when a certain file is uploaded to a server, storing the file in a redundant backup mode, and adding the latest reading time into the file meta-information and setting the latest reading time as a current time stamp;
the server checks the 'file size' and 'last read time' in each file's meta-information, and skips files already stored with RS coding and files smaller than 100MB; for a file larger than 100MB, if it was last read within the past 3 days, it is treated as hot data and skipped; otherwise the file is judged not to have been used for a long time, is stored with RS coding, and its earlier redundant backup is deleted;
When the read file is stored in a redundant backup mode, updating the latest reading time;
when the read file is stored in an RS coding mode, if the file is intact, no operation is performed. If the file is damaged, RS decoding is carried out on the residual data blocks of the file to obtain source data, and the restored source data is stored again in a redundancy backup mode.
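The redundancy-first, encoding-later policy above can be sketched as a small state machine over file metadata. The names `FileMeta`, `on_upload`, `scan` and `on_read` are hypothetical, the actual RS encoding and decoding are left out, and treating a file of exactly 100MB as large is an assumption because the source does not say.

```python
from dataclasses import dataclass

SIZE_THRESHOLD = 100 * 1024 * 1024   # 100MB
HOT_WINDOW = 3 * 24 * 3600           # 3 days, in seconds


@dataclass
class FileMeta:
    name: str
    size: int
    last_read: float
    mode: str = "replica"            # "replica" (redundant backup) or "rs"


def on_upload(name, size, now):
    """New files always start as redundant backups with a fresh timestamp."""
    return FileMeta(name, size, last_read=now)


def scan(files, now):
    """Server pass: move large, cold files from replication to RS coding."""
    converted = []
    for f in files:
        if f.mode == "rs" or f.size < SIZE_THRESHOLD:
            continue                 # already encoded, or a small file
        if now - f.last_read <= HOT_WINDOW:
            continue                 # hot data stays replicated
        f.mode = "rs"                # RS-encode, then delete replicas (not shown)
        converted.append(f.name)
    return converted


def on_read(f, now, damaged=False):
    if f.mode == "replica":
        f.last_read = now            # refresh the last read time
    elif damaged:
        # RS-decode the surviving blocks (not shown), then store as replicas.
        f.mode = "replica"


now = 10 * 86400
small = on_upload("small.bin", 1024, 0)
cold = on_upload("cold.bin", 200 * 1024 * 1024, 0)
hot = on_upload("hot.bin", 200 * 1024 * 1024, now - 3600)
converted = scan([small, cold, hot], now)
```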
The specific detection steps adopted by the integrity verification module 5 provided by the embodiment of the invention include:
firstly, a user sends key information to a server to make a detection request;
secondly, the server calculates the stored data according to the detection request of the user;
thirdly, after receiving the information returned by the server, the user decrypts the information;
fourthly, the user verifies whether the returned information is complete; if it is, the verification succeeds, indicating that the big data storage result is correct; otherwise the big data storage result is in error, and the storage results must be verified one by one to find the erroneous one.
The algorithm is as follows: input the data file H to be checked (the data file is complete), select the file blocks $m_i$ ($1<i<n$) in it, and generate a random number $r$ for the check; then compute $a_r = a^r \bmod M$ and compute detection value 1:
[Formula image BDA0002369750450000121, not reproduced in the text]
Second, select the tag information $T_i$ corresponding to the file blocks $m_i$ ($1<i<n$) and compute:
[Formula image BDA0002369750450000122, not reproduced in the text]
Finally, compute detection value 2: $R' = s^r \bmod M$, and verify whether $R$ and $R'$ are equal. If they are equal, "T" is returned (the data is complete); otherwise "F" is returned (the data is incomplete). With this algorithm, the big data storage result can be checked on the basis of integrity verification.
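The two intermediate formulas of the algorithm appear only as images in the source and cannot be recovered from the text. The sketch below therefore rests on explicit assumptions: each tag is taken to be $T_i = a^{m_i} \bmod M$, detection value 1 is taken to be $R = a_r^{\sum m_i} \bmod M$, and $s$ is taken to be the product of the tags modulo $M$. Under these assumptions $R = R'$ holds for intact data, which matches the final comparison in the text, but the patent's actual formulas may differ. All function names are hypothetical.

```python
def make_tags(blocks, a, modulus):
    """Assumed tag per block: T_i = a^(m_i) mod M."""
    return [pow(a, m, modulus) for m in blocks]


def server_response(blocks, a_r, modulus):
    """Assumed detection value 1: R = a_r^(sum of m_i) mod M, from stored data."""
    return pow(a_r, sum(blocks), modulus)


def verify(tags, r, response, modulus):
    """Detection value 2: R' = s^r mod M, with s assumed to be the tag product."""
    s = 1
    for t in tags:
        s = (s * t) % modulus
    return "T" if pow(s, r, modulus) == response else "F"


modulus, a = 3233, 7            # toy public key (M = 61 * 53), not secure
blocks = [1234, 5678, 91011]    # file blocks m_i as integers
tags = make_tags(blocks, a, modulus)

r = 5                           # fixed for reproducibility; the source draws r at random
a_r = pow(a, r, modulus)        # a_r = a^r mod M

intact = verify(tags, r, server_response(blocks, a_r, modulus), modulus)
tampered_blocks = [1234, 5679, 91011]
tampered = verify(tags, r, server_response(tampered_blocks, a_r, modulus), modulus)
```

With these toy values `intact` is "T" and `tampered` is "F"; the decryption step mentioned in the detection steps is omitted.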
The data recovery process adopted by the data recovery module 6 provided by the embodiment of the invention specifically comprises the following steps:
monitoring the time for transmitting data information between a master node and a slave node, and judging that the node fails if no return information of the node is received within the set time;
reading parameters related to data recovery in the configuration center, analyzing the corresponding positions of lost nodes or blocks in the coding array according to node failure information or block failure information and the parameters related to data recovery, and sending the positions to a decoding unit;
reading relevant parameters related to load balancing in the configuration center, and selecting a new node list according to the parameters and the load states of all nodes;
selecting a decoding scheme according to the position of the lost node or block in the coding array, and reading the required remaining block data from the server;
and carrying out decoding calculation on the remaining block data to obtain the data of the lost block, and storing the restored block on a new node in the server according to the selected new node list.
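The recovery flow above (timeout-based failure detection, choice of a new node by load, and decoding of the lost block from the surviving ones) can be sketched as follows. To keep the example short, a single XOR parity block is used in place of the RS code named in the source, so only one lost block per stripe can be rebuilt; all function names and the node layout are hypothetical.

```python
def failed_nodes(last_reply, now, timeout):
    """Nodes whose last reply to the master is older than the timeout."""
    return [n for n, t in last_reply.items() if now - t > timeout]


def pick_new_node(load, exclude):
    """Least-loaded node that is not in the failed set."""
    candidates = {n: l for n, l in load.items() if n not in exclude}
    return min(candidates, key=candidates.get)


def xor_blocks(blocks):
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def recover(stripe, lost_index):
    """Rebuild one lost block of [d0, d1, ..., parity] from the survivors.

    XOR parity stands in for RS decoding here (assumption).
    """
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


data = [b"abcd", b"efgh", b"ijkl"]
stripe = data + [xor_blocks(data)]              # last block is the parity
layout = {"n1": 0, "n2": 1, "n3": 2, "n4": 3}   # node -> block position

last_reply = {"n1": 100, "n2": 40, "n3": 99, "n4": 98}
failed = failed_nodes(last_reply, now=101, timeout=30)
restored = recover(stripe, layout[failed[0]])
target = pick_new_node({"n1": 5, "n2": 0, "n3": 2, "n4": 7, "n5": 1}, failed)
```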
To evaluate the performance of the integrity verification module 5, a comparison experiment on detection efficiency was carried out between the detection scheme of the integrity verification module 5 and a traditional detection system. The experiment was based on the PBC library and run on a udubm4.16 system with an i7-4600M CPU at 2.50 GHz and 8 GB of memory; the programming language was C, and the big data storage results ranged from 1 to 5 GB in size. Fig. 4 is a schematic diagram of the comparison of detection execution time for the integrity verification module provided by the embodiment of the present invention.
As shown in fig. 3, the computer big data storage control method provided by an embodiment of the present invention includes:
S101: the data acquisition module acquires data uploaded by a user through a computer network terminal; the data classification module classifies the stored content by using a data classification method.
S102: after the data classification is completed, the data is transmitted to the data coding module; for the obtained data information, the data coding module calculates a coding array according to the parameters of the configuration center and encodes the data according to that array; after the encoding is completed, the data storage module selects a corresponding storage mode according to the reading frequency or the file size of the data file and stores the data file.
S103: the integrity verification module sends a detection request to the server, and the server computes detection information from the tag information and the request information; the server then checks the storage result using the detection information and the tag information; the data recovery module performs data recovery according to the feedback information of the integrity verification module, the failed node information, the data recovery related parameters, and the position of the failed node in the coding array as determined by analysis.
S104: after the data recovery is completed, the key parameter generating module generates an encryption key through a key algorithm according to the input security parameters, and generates tag information for the storage file.
S105: in the storage process, the configuration module performs preset configuration on various parameters in the system, and extracts corresponding configuration information according to the control instruction; the data management module performs addition and deletion, modification and backup on the stored data content; and simultaneously, searching corresponding contents by utilizing voice input or keyboard input through a query module.
S106: the wireless signal transceiver module is connected with the cloud server through the wireless signal transceiver to realize data transmission; the host service configuration and the service scale in the cloud server can be configured according to the needs of users and are used for realizing data sharing.
The working principle of the invention is as follows: the data acquisition module 1 acquires data uploaded by a user through a computer network terminal, and the data classification module 9 classifies the stored content by using a data classification method. After the data classification is completed, the data is transmitted to the data coding module 2, which calculates a coding array for the obtained data information according to the parameters of the configuration center and encodes the data according to that array; after the encoding is completed, the data storage module 4 selects a corresponding storage mode according to the reading frequency or the file size of the data file and stores the data file. The integrity verification module 5 sends a detection request to the server, and the server computes detection information from the tag information and the request information and then checks the storage result using the detection information and the tag information; the data recovery module 6 performs data recovery according to the feedback information of the integrity verification module, the failed node information, the data recovery related parameters, and the position of the failed node in the coding array as determined by analysis. After the data recovery is completed, the key parameter generating module 3 generates an encryption key through a key algorithm according to the input security parameters, and generates tag information for the stored file.
In the storage process, the configuration module 7 performs preset configuration on various parameters in the system, and extracts corresponding configuration information according to the control instruction; the data management module 8 performs addition and deletion, modification and backup on the stored data content; and at the same time, through the query module 10, corresponding contents are searched for using voice input or keyboard input. The wireless signal transceiver module 12 is connected with the cloud server through a wireless signal transceiver to realize data transmission; the host service configuration and the service scale in the cloud server 13 can be configured according to the needs of the user, and are used for realizing the sharing of data.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.

Claims (10)

1. A computer big data storage control system, the computer big data storage control system comprising:
the data acquisition module is connected with the central main control module, acquires data uploaded by a user through the computer network terminal, and transmits the data to the data encoding module; establishing a corresponding sample of the acquired data, and preprocessing the data in the sample; after the preprocessing is completed, compressing and transmitting the corresponding data; in the data preprocessing process, the processing process of missing data is as follows: deleting the missing samples by using a corresponding deletion function; after the deletion is completed, the missing value is replaced by assigning a value to the missing value; the data set which is randomly simulated is stored in the imp, and linear regression is carried out on the imp;
The data coding module is connected with the central main control module, and the obtained data information calculates a coded array according to the parameters of the configuration center and stores the coded array according to the array; selecting a proper neural network model, and extracting corresponding data characteristics from the input data information; according to the extracted data characteristic information, a corresponding multi-layer neural network is established; each layer of neural network trains the whole deep neural network by using a corresponding training method; calculating parameters of the neural network of the first layer, and taking the output of the hidden layer of the neural network of the first layer as the input of the next layer; the process is repeated continuously, and the network parameters of each layer are trained in sequence, so that the coding of data is realized;
the key parameter generation module is connected with the central main control module, generates an encryption key through a key algorithm according to the input security parameters, and generates tag information for the storage file; the user inputs the security parameters, and a key pair and an encryption key are generated; the generated public key is made public and is used for generating tag information for the files of the storage result, and the private key is kept by the user; when the encryption key is generated through the Keyall algorithm, two strong prime numbers $p$ and $q$ are first generated, and then $M = pq$ and $F(M) = (p-1)(q-1)$ are calculated; an odd positive integer $a$ is then generated such that $\gcd(a, F(M)) = 1$; $p$ and $q$ are taken as the private key, and $a$ and $M$ as the public key;
The data storage module is connected with the central main control module and selects a corresponding storage mode according to the reading frequency or the file size of the data file; the data storage module adopts a storage strategy of redundancy first and encoding later to store data files, and adopts a redundancy backup mode for storing small files and frequently used large files, and adopts an RS encoding mode for storing large files which are not used for a long time;
the integrity verification module is connected with the central main control module and sends a detection request to the server, and the server computes detection information from the tag information and the request information; the server checks the storage result using the detection information and the tag information; the user sends the key information to the server to make a detection request; the server performs the calculation on the stored data according to the detection request of the user; after receiving the information returned by the server, the user decrypts the information; the user verifies whether the returned information is complete; if it is, the verification succeeds, indicating that the big data storage result is correct; otherwise the big data storage result is in error, and the storage results must be verified one by one to find the erroneous one;
The data recovery module is connected with the central main control module and is used for carrying out data recovery according to the feedback information of the integrity verification module, the failure node information, the data recovery related parameters and the corresponding position information of the analysis failure node in the array; monitoring the time for transmitting data information between a master node and a slave node, and judging that the node fails if no return information of the node is received within the set time; reading parameters related to data recovery in the configuration center, analyzing the corresponding positions of lost nodes or blocks in the coding array according to node failure information or block failure information and the parameters related to data recovery, and sending the positions to a decoding unit; reading relevant parameters related to load balancing in the configuration center, and selecting a new node list according to the parameters and the load states of all nodes; selecting a decoding scheme according to the corresponding position of the lost node or the block in the coding array, and reading the residual block data required by the server; according to the rest of the block data, decoding calculation is carried out to obtain the data of the lost block, and the restored block is stored in a new node in a server according to a selected new node list;
The configuration module is connected with the central main control module and is used for carrying out preset configuration on various parameters in the system and extracting corresponding configuration information according to the control instruction;
the data management module is connected with the central main control module and used for adding, deleting, modifying and backing up the stored data content;
the data classification module is connected with the central main control module and classifies the stored content by utilizing a data classification method;
the query module is connected with the central main control module and searches corresponding content through voice input or keyboard input;
the central main control module is respectively connected with the data acquisition module, the data encoding module, the key parameter generation module, the data storage module, the integrity verification module, the data recovery module, the configuration module, the data management module, the data classification module, the query module, the wireless signal transceiver module and the cloud server and used for coordinating the normal operation of each module;
the wireless signal transceiver module is connected with the central main control module and is connected with the cloud server through the wireless signal transceiver to realize data transmission;
the cloud server is connected with the central main control module, and host service configuration and service scale can be configured according to the needs of users and are used for realizing data sharing.
2. The computer big data storage control system of claim 1, wherein the data management module comprises:
the adding and deleting module inputs a corresponding deleting or adding instruction according to the user requirement, and the data management system deletes or adds corresponding content;
the modification module inputs a corresponding modification command according to the user demand, and the data management system modifies corresponding content;
and the backup module is used for monitoring and tracking the data uploaded by the user and the update of the important target file to be tracked, transmitting the update log to the backup system in real time through a network, and updating the disk according to the log by the backup system.
3. A computer big data storage control method of the computer big data storage control system according to any one of claims 1 to 2, characterized by comprising:
step one, a data acquisition module acquires data uploaded by a user through a computer network terminal; classifying the stored content by a data classification module by using a data classification method;
step two, after the data classification is completed, the data is transmitted to a data coding module; the data information obtained by the data coding module is used for calculating a coded array according to the parameters of the configuration center and coding according to the array; after the coding is finished, the data storage module selects a corresponding storage mode according to the reading frequency or the file size of the data file to store;
Step three, the integrity verification module sends out a detection request to the server, and the server calculates through the label information and the request information to obtain detection information; the server detects the storage result through the detection information and the label information; the data recovery module performs data recovery according to the feedback information of the integrity verification module, the failure node information, the data recovery related parameters and the corresponding position information of the analysis failure node in the array;
step four, after the data recovery is completed, the key parameter generation module generates an encryption key through a key algorithm according to the input security parameters, and generates tag information for the storage file;
fifthly, in the storage process, the configuration module performs preset configuration on various parameters in the system, and extracts corresponding configuration information according to the control instruction; the data management module performs addition and deletion, modification and backup on the stored data content; meanwhile, through a query module, searching corresponding contents by utilizing voice input or keyboard input;
step six, the wireless signal transceiver module is connected with the cloud server through the wireless signal transceiver to realize data transmission; the host service configuration and the service scale in the cloud server can be configured according to the needs of users and are used for realizing data sharing.
4. The method of claim 3, wherein in the first step, the process of processing the acquired data by the data acquisition module is as follows:
establishing a corresponding sample of the acquired data, and preprocessing the data in the sample; after the preprocessing is completed, compressing and transmitting the corresponding data;
in the data preprocessing process, the processing process of missing data is as follows:
deleting the samples with missing data by using a corresponding deletion function; after the deletion is completed, replacing the remaining missing values with assigned values; the completed data sets are generated by random simulation and stored in imp, and linear regression is performed on imp.
5. The method for controlling storage of big data in computer according to claim 3, wherein in the second step, the process of encoding the data by the data encoding module is:
selecting a proper neural network model, and extracting corresponding data characteristics from the input data information;
according to the extracted data characteristic information, a corresponding multi-layer neural network is established; each layer of neural network trains the whole deep neural network by using a corresponding training method;
calculating parameters of the neural network of the first layer, and taking the output of the hidden layer of the neural network of the first layer as the input of the next layer;
The above process is repeated continuously, and the network parameters of each layer are trained in sequence, so that the coding of the data is realized.
6. The method of claim 3, wherein in the second step, the data storage module uses a redundancy-first-and-encoding storage strategy to store the data files, and uses a redundancy backup mode for storing small files and frequently used large files, and uses an RS encoding mode for storing large files which are not used for a long time.
7. The method for controlling storage of big data in computer according to claim 6, wherein the redundancy-first-and-encoding-later storage strategy specifically comprises:
when a certain file is uploaded to a server, storing the file in a redundant backup mode, and adding the latest reading time into the file meta-information and setting the latest reading time as a current time stamp;
the server checks the 'file size' and 'last read time' in each file's meta-information, and skips files already stored with RS coding and files smaller than 100MB; for a file larger than 100MB, if it was last read within the past 3 days, it is treated as hot data and skipped; otherwise the file is judged not to have been used for a long time, is stored with RS coding, and its earlier redundant backup is deleted;
When the read file is stored in a redundant backup mode, updating the latest reading time;
when the read file is stored in an RS coding mode, if the file is intact, no operation is performed; if the file is damaged, RS decoding is carried out on the residual data blocks of the file to obtain source data, and the restored source data is stored again in a redundancy backup mode.
8. The method for controlling storage of big data in computer according to claim 3, wherein in the third step, the specific detecting step adopted by the integrity verification module includes:
firstly, a user sends key information to a server to make a detection request;
secondly, the server calculates the stored data according to the detection request of the user;
thirdly, after receiving the information returned by the server, the user decrypts the information;
fourthly, the user verifies the returned information, whether the returned information is complete or not is verified, if the returned information is complete, the verification is successful, and the fact that the big data storage result is correct is indicated; otherwise, the error of the big data storage result is indicated, and the storage result needs to be verified one by one to find out the error big data storage result;
the algorithm is as follows: inputting a data file H to be detected, selecting the file blocks $m_i$ ($1<i<n$) in the data file H for detection, and generating a random number $r$; then $a_r = a^r \bmod M$ is calculated, and detection value 1 is calculated; secondly, the tag information $T_i$ corresponding to the file blocks $m_i$ ($1<i<n$) is selected and used in a calculation; finally, detection value 2 is calculated: $R' = s^r \bmod M$, and whether $R$ and $R'$ are equal is verified; 'T' is returned when the two are equal, otherwise 'F' is returned.
9. The method for controlling storage of big data in computer according to claim 3, wherein in the third step, the data recovery process adopted by the data recovery module specifically includes:
monitoring the time for transmitting data information between a master node and a slave node, and judging that the node fails if no return information of the node is received within the set time;
reading parameters related to data recovery in the configuration center, analyzing the corresponding positions of lost nodes or blocks in the coding array according to node failure information or block failure information and the parameters related to data recovery, and sending the positions to a decoding unit;
reading relevant parameters related to load balancing in the configuration center, and selecting a new node list according to the parameters and the load states of all nodes;
selecting a decoding scheme according to the corresponding position of the lost node or the block in the coding array, and reading the residual block data required by the server;
and carrying out decoding calculation according to the rest of the block data to obtain the data of the lost block, and storing the restored block in a new node in the server according to the selected new node list.
10. The method for controlling storage of big data in computer according to claim 3, wherein in the fourth step, the key parameter generating module specifically comprises:
The user inputs the security parameters, generates a key pair and an encryption key, the generated public key is public, is used for generating tag information for the file storing the result, and the private key is saved by the user;
when the encryption key is generated through the Keyall algorithm, two strong prime numbers $p$ and $q$ are first generated, and then $M = pq$ and $F(M) = (p-1)(q-1)$ are calculated; an odd positive integer $a$ is then generated such that $\gcd(a, F(M)) = 1$; $p$ and $q$ are taken as the private key, and $a$ and $M$ as the public key.
CN202010046920.2A 2020-01-16 2020-01-16 Computer big data storage control system and method Active CN111291046B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010046920.2A CN111291046B (en) 2020-01-16 2020-01-16 Computer big data storage control system and method

Publications (2)

Publication Number Publication Date
CN111291046A CN111291046A (en) 2020-06-16
CN111291046B true CN111291046B (en) 2023-07-14

Family

ID=71023092

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010046920.2A Active CN111291046B (en) 2020-01-16 2020-01-16 Computer big data storage control system and method

Country Status (1)

Country Link
CN (1) CN111291046B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114416728A (en) * 2021-12-27 2022-04-29 炫彩互动网络科技有限公司 Server archiving and file reading method
CN114629709B (en) * 2022-03-18 2022-11-25 乖乖数字科技(苏州)有限公司 Computer network safety system based on distributed big data information interaction
CN116418580B (en) * 2023-04-10 2023-11-24 广东粤密技术服务有限公司 Data integrity protection detection method and device for local area network and electronic equipment
CN117193886B (en) * 2023-11-06 2024-01-05 成都科江科技有限公司 Dynamic loading method for configuration file of industrial control system

Citations (4)

Publication number Priority date Publication date Assignee Title
CN103425941A (en) * 2013-07-31 2013-12-04 广东数字证书认证中心有限公司 Cloud storage data integrity verification method, equipment and server
CN105320899A (en) * 2014-07-22 2016-02-10 北京大学 User-oriented cloud storage data integrity protection method
CN106611135A (en) * 2016-06-21 2017-05-03 四川用联信息技术有限公司 Storage data integrity verification and recovery method
RU2017115539A3 (en) * 2017-05-02 2018-11-07

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US9996413B2 (en) * 2007-10-09 2018-06-12 International Business Machines Corporation Ensuring data integrity on a dispersed storage grid

Also Published As

Publication number Publication date
CN111291046A (en) 2020-06-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant