CN116841460A

CN116841460A - Distributed storage method based on block chain

Info

Publication number: CN116841460A
Application number: CN202310590657.7A
Authority: CN
Inventors: 郭晨俐
Original assignee: Inner Mongolia Jinmeng Technology Co ltd
Current assignee: Inner Mongolia Jinmeng Technology Co ltd
Priority date: 2023-05-24
Filing date: 2023-05-24
Publication date: 2023-10-03

Abstract

The invention discloses a blockchain-based distributed storage method and relates to the technical field of storage. The method performs feature model analysis, serialization analysis, unified processing and dimension analysis. Feature model analysis and serialization analysis are applied to the data information in the to-be-detected set, which improves the response rate of the system and enlarges the storage capacity. The data in the to-be-detected set is also processed in a unified way, so that the data transmission condition of the storage area can be judged in time and storage performance is improved. In addition, the quality of the storage area is obtained through dimension analysis in order to judge the capacity of the storage area and back it up, which further expands the storage space and improves the computing capacity of the server, and thereby the reliability of data storage.

Description

Distributed storage method based on block chain

Technical Field

The invention relates to the technical field of storage, in particular to a distributed storage method based on a blockchain.

Background

Distributed storage stores data across a plurality of storage servers and integrates these distributed storage resources into a virtual storage device; in effect, the data is spread over the servers. Traditional network storage systems use a single storage server to store all data, which creates a performance bottleneck. Distributed storage shares the storage load among multiple storage servers, which improves the storage and read efficiency of the storage system.

Blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms, and is the underlying technology of Bitcoin. Conventional information storage and communication go through a server, i.e., a database, in which all information can be found or modified.

Existing blockchain-based distributed storage technology is not yet mature enough to be applied to campus information. When the real-time data information of i schools is collected, the characteristic information of the two types of information cannot be judged, so file name formats cannot be unified during preprocessing; this affects subsequent data analysis and reduces distributed storage efficiency. After the manuscript format is unified, feature model analysis cannot be carried out on the manuscripts because the coincidence degree and deviation degree of their format and content cannot be calculated, so the corresponding superposition degree cannot be obtained, the whole distributed storage process stalls, and work progress is affected.

In view of the above technical drawbacks, a solution is now proposed.

Disclosure of Invention

The object of the invention is as follows: to overcome the above technical defects, the invention provides a blockchain-based distributed storage method. The real-time data information of i campuses is marked, temporarily stored and then sent to a server in the form of to-be-detected sets, which helps collect a large amount of data information, shortens collection time and reduces data collection cost. The method then performs feature model analysis, serialization analysis, unified processing and dimension analysis. Feature model analysis and serialization analysis are applied to the data information in the to-be-detected set, which improves the response rate of the system and enlarges the storage capacity. The data in the to-be-detected set is also processed in a unified way, so that the data transmission condition of the storage area can be judged in time and storage performance is improved. In addition, the quality of the storage area is obtained through dimension analysis in order to judge the capacity of the storage area and back it up, which further expands the storage space and improves the computing capacity of the server, and thereby the reliability of data storage.

In order to achieve the above purpose, the present invention adopts the following technical scheme:

a distributed storage method based on a blockchain is applied to a city campus, and comprises the following specific steps:

step one: marking a plurality of schools with sequence numbers i, and collecting and temporarily storing the real-time data information of each school; when the data amount of a school's temporarily stored real-time data information is larger than a preset value, compressing and packaging it to generate an information packet to be detected; acquiring the information packets to be detected within a preset period to form a to-be-detected set; and when the number of to-be-detected sets equals a preset number, sending the to-be-detected sets to a server;

step two: after receiving the to-be-detected sets, the server sorts the school real-time data information in them in ascending order of the marked sequence number, decompresses the sorted information in sequence and performs feature model analysis on it in sequence to generate the corresponding superposition degree; a similar storage file set and an irrelevant storage file set are generated through superposition judgment;

step three: carrying out serialization analysis on the multiple groups of similar storage file sets and irrelevant storage file sets to dynamically generate newly built storage areas and irrelevant storage areas; acquiring the multiple groups of newly built storage areas and irrelevant storage areas of each school, and automatically processing them to generate similar storage areas and irrelevant storage areas;

step four: within a preset time, collecting the data transmission rate fluctuation, the residual quantity of data in the to-be-detected set and the delay time of the data in the to-be-detected set while the set is being sent to the server, and processing them in a unified manner to obtain the data transmission quality; judging the storage condition during the data service process from the data transmission quality and carrying out corresponding processing;

step five: when the data transmission quality is within a preset quality interval, dividing the storage area into n sub-areas, storing test information into each sub-area, and acquiring each sub-area's data transmission rate fluctuation, residual quantity of test information and delay time of test information; processing these in a unified manner to obtain the transmission quality of each sub-area; when a sub-area's transmission quality is smaller than a preset sub-area value, performing no processing, and otherwise isolating the sub-area to safeguard the data transmission quality;

step six: acquiring, within a preset period, the number of isolated sub-areas in the storage area, the total data transmission quality and the number of non-isolated sub-areas, and performing dimension analysis to obtain the storage area quality; when the storage area quality is larger than a preset storage value, generating an operation signal and replacing the hardware corresponding to the storage area.
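The buffering in step one and the ordering in step two can be illustrated with a minimal Python sketch. It is not the claimed implementation: the class name SchoolBuffer, the byte-based threshold, the newline-joined payload and the use of zlib are all assumptions made for illustration, since the text does not name a compression scheme or a packet layout.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class SchoolBuffer:
    """Temporary store for one school's real-time records (step one)."""
    school_id: int        # the mark i assigned to the school
    size_threshold: int   # hypothetical preset data amount, in bytes
    records: list = field(default_factory=list)

    def add(self, record: bytes):
        """Buffer a record; return a compressed packet once the preset amount is exceeded."""
        self.records.append(record)
        if sum(len(r) for r in self.records) > self.size_threshold:
            payload = b"\n".join(self.records)
            self.records = []  # the buffer is emptied after packaging
            return (self.school_id, zlib.compress(payload))
        return None


def order_and_unpack(pending_set):
    """Step two: sort packets by school mark (ascending), then decompress in that order."""
    ordered = sorted(pending_set, key=lambda packet: packet[0])
    return [(school_id, zlib.decompress(blob)) for school_id, blob in ordered]
```

A caller would gather the packets returned by add over the preset period into a to-be-detected set and pass that set to order_and_unpack on the server side.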

Further, the specific steps of the characteristic model analysis process are as follows:

the real-time data information sent by the i schools comprises document information and picture information; the sorted school real-time data information is decompressed in sequence, and the document information and picture information are processed separately: the file names of the document information are unified according to the format; the picture information is then text-scanned, the scanned text content is placed into a document, the original picture is inserted, the information is compared, and the file names are unified again according to the format to generate a determined manuscript, so that the formats of the two types of data information are unified;

any two groups of determined manuscripts are extracted, and feature model analysis is performed on their formats and contents to generate the corresponding superposition degree; a similar storage file set and an irrelevant storage file set are generated through superposition judgment, wherein the file name format is sequence number-text type-time.
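As an illustration of the "sequence number-text type-time" file name format, the following Python sketch builds such a name. The function name unified_name, the zero padding, the lower-casing of the text type and the timestamp layout are assumptions; the text fixes only the order of the three fields.

```python
from datetime import datetime


def unified_name(sequence_number: int, text_type: str, timestamp: datetime) -> str:
    """Build a file name in the 'sequence number-text type-time' order."""
    # Padding, lower-casing and the timestamp layout are illustrative choices only.
    # Hyphens inside the text type are replaced so the three fields stay separable.
    kind = text_type.strip().lower().replace("-", "_")
    return f"{sequence_number:04d}-{kind}-{timestamp:%Y%m%dT%H%M%S}"
```

For example, unified_name(7, "Notice", datetime(2023, 5, 24, 9, 30, 0)) yields "0007-notice-20230524T093000".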

Further, the specific process of generating the superposition degree through the feature model analysis is as follows:

comparing the formats and contents of the two groups of determined manuscripts respectively to generate the corresponding coincidence degrees and deviation degrees, marked S1, S2, P1 and P2 respectively; obtaining the superposition degree D through a normalization formula, wherein e1 and e2 are weight factors that make the simulated calculation of the superposition degree D more realistic, e1 + e2 = 1, and e1 is smaller than e2.

Further, the specific steps of the process for judging and generating the similar storage file set and the irrelevant storage file set are as follows:

acquiring a preset superposition interval [Qa, Qb] in data storage;

(1) extracting the files with D > Qbmax and constructing a plurality of similar storage file sets A, wherein Qbmax is the maximum value of the preset superposition interval; the files in a similar storage file set A have a same-class relation;

(2) extracting the files with Qamin < D ≤ Qbmax and constructing an irrelevant storage file set B, i.e. files that have no same-class relation and are temporarily stored together, wherein Qamin is the minimum value of the preset superposition interval.
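The two threshold rules above can be sketched in Python as follows. The function name classify_pair and the "unassigned" label are assumptions: the text does not state what happens when D is at or below the lower bound, so the sketch only reports that case rather than deciding it.

```python
def classify_pair(d: float, qa: float, qb: float) -> str:
    """Return 'similar', 'irrelevant' or 'unassigned' for a superposition degree d."""
    if qa > qb:
        raise ValueError("interval bounds must satisfy qa <= qb")
    if d > qb:
        return "similar"      # rule (1): goes to a similar storage file set A
    if qa < d <= qb:
        return "irrelevant"   # rule (2): goes to the irrelevant storage file set B
    return "unassigned"       # d <= qa: outcome not specified in the text
```

Here qa and qb stand for the minimum and maximum of the preset superposition interval.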

Further, the process of automatic processing and generating of the same type of storage area and irrelevant storage areas is as follows:

sequentially marking multiple groups of similar storage file sets of each school as A0, A1 and A2 … …, and sequentially marking multiple groups of irrelevant storage file sets of each school as B0, B1 and B2 … …;

a similar storage area is constructed from the similar storage file set A0; the similar storage file sets A1, A2 … … are then compared with A0: a standard similar storage file is extracted from A0, a standard similar storage file is randomly extracted from each of A1, A2 … …, and the superposition degree of the two files is acquired; when the superposition degree is larger than a preset value, the sets are merged and the similar storage area is built automatically;

the files in the multiple groups of irrelevant storage file sets are pooled and the process of step three is repeated to generate new similar storage file sets and irrelevant storage file sets, from which similar storage areas and irrelevant storage areas are respectively built automatically.
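The merging of A1, A2 … … into the area seeded by A0 can be sketched as below. This is an illustration under stated assumptions: the function name build_similar_area is hypothetical, the "standard" file of A0 is taken to be its first element, and the superposition degree is supplied by the caller as an overlap function because its formula is not reproduced in this text.

```python
import random


def build_similar_area(file_sets, overlap, threshold, rng=None):
    """Merge sets A1, A2, ... into the area seeded by A0 when a sampled pair overlaps enough."""
    rng = rng or random.Random()
    if not file_sets or not file_sets[0]:
        return [], list(file_sets)
    area = list(file_sets[0])
    reference = file_sets[0][0]  # assumed choice of the "standard" file from A0
    unmerged = []
    for candidate_set in file_sets[1:]:
        if not candidate_set:
            continue
        sample = rng.choice(candidate_set)  # random extraction from A1, A2, ...
        if overlap(reference, sample) > threshold:
            area.extend(candidate_set)      # merged into the similar storage area
        else:
            unmerged.append(candidate_set)  # left for a later pass
    return area, unmerged
```

The sets returned as unmerged could be pooled with the irrelevant storage file sets and passed through the analysis again, in the spirit of the repeated step three described above.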

Further, the specific process of obtaining the data transmission quality through unified processing is as follows:

acquiring data transmission rate fluctuation C, residual quantity L of data in a to-be-detected set and delay time T of the data in the to-be-detected set, and obtaining data transmission quality Z in a unified manner by establishing a transmission quality model;

the data transmission rate fluctuation C, the residual quantity L of the data in the to-be-detected set and the delay time T of the data in the to-be-detected set are positively correlated with Z: the larger the data transmission rate fluctuation, the greater the residual quantity of data in the to-be-detected set and the longer the delay time of that data, the worse the data transmission quality; conversely, the better the data transmission quality. The storage condition in the data transmission process is judged according to the data transmission quality, and corresponding processing is carried out.

Further, the specific process of unifying the processing and respectively obtaining the sub-area transmission quality is as follows:

when the data transmission quality is within a preset quality interval, the storage area is divided into n sub-areas and test information is stored into each sub-area; the data transmission rate fluctuation c of each sub-area, the residual quantity l of the test information and the delay time t of the test information are acquired and processed in a unified manner by the transmission quality model to obtain the transmission quality z of each sub-area; when a sub-area's transmission quality is smaller than a preset sub-area value, no processing is performed, and otherwise the sub-area is isolated to safeguard the data transmission quality.
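The isolation rule can be sketched in a few lines of Python. The function name partition_subregions is an assumption, and the per-sub-area values z are taken as inputs because the transmission quality model itself is not reproduced in this text. Note that, as the rule is worded, a larger z marks a worse sub-area.

```python
def partition_subregions(qualities, preset):
    """Split sub-area indices into (kept, isolated) using the preset sub-area value."""
    kept, isolated = [], []
    for index, z in enumerate(qualities):
        # z smaller than the preset value: no processing; otherwise isolate.
        (kept if z < preset else isolated).append(index)
    return kept, isolated
```

For instance, partition_subregions([0.2, 0.9, 0.5], 0.5) keeps sub-area 0 and isolates sub-areas 1 and 2.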

Further, the specific steps for obtaining the quality of the storage area by dimension analysis are as follows:

acquiring the number m of isolated sub-areas in the storage area, the total data transmission quality Z and the number n of non-isolated sub-areas in a preset period, and analyzing them by establishing a dimension model to obtain the storage area quality Q;

the number of non-isolated sub-areas is negatively correlated with the number of isolated sub-areas and with the data transmission quality; the larger the number of isolated sub-areas and the higher the data transmission quality, the larger the storage area quality Q; when the storage area quality Q is larger than a preset storage value, an operation signal is generated, i.e. the hardware corresponding to the storage area is replaced.

The beneficial effects of the invention are as follows:

(1) The real-time data information of i campuses is temporarily stored and then sent to the server in the form of to-be-detected sets, which helps collect a large amount of data information, shortens collection time and reduces data collection cost. Feature model analysis, serialization analysis, unified processing and dimension analysis improve the accuracy and precision of distributed storage, and thereby storage efficiency.

(2) The school real-time data information in the to-be-detected set undergoes feature model analysis and serialization analysis, which divide it into similar storage file sets and irrelevant storage file sets; the file data is thus effectively partitioned and stored in newly built storage areas and irrelevant storage areas respectively, which improves the response rate of the system and enlarges the storage capacity. The data transmission quality and sub-area transmission quality are obtained through unified processing of the measurements collected for the to-be-detected set, so that the data transmission condition of the storage area can be judged in time and storage performance is improved. In addition, the quality of the storage area is obtained through dimension analysis in order to judge the capacity of the storage area and back it up, which further expands the storage space, improves the computing capacity of the server, and gives a research institution reliable and secure data storage.

Drawings

The invention is further described below with reference to the accompanying drawings.

Fig. 1 shows a flow block diagram of the method of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Example 1:

Referring to Fig. 1, the present invention is a blockchain-based distributed storage method comprising the following steps:

The specific feature model analysis process is as follows:

The real-time data information sent by the i schools comprises document information and picture information. The sorted school real-time data information is decompressed in sequence, and the document information and picture information are processed separately: the file names of the document information are unified according to the format; the picture information is then text-scanned, the scanned text content is placed into a document, the original picture is inserted, the information is compared, and the file names are unified again according to the format to generate a determined manuscript, so that the formats of the two types of data information are unified. If the comparison reveals an error, an early-warning signal is raised to indicate that the file contents are inconsistent, and a program then automatically rescans the picture and performs fixed-point matching against it to generate the correct information.

The specific generation process of the superposition degree D is as follows:

Any two groups of determined manuscripts are extracted, and their formats and contents are compared respectively to generate the corresponding coincidence degrees and deviation degrees, marked S1, S2, P1 and P2 respectively. The superposition degree D is obtained through a normalization formula, wherein e1 and e2 are weight factors that make the simulated calculation of the superposition degree D more realistic, e1 + e2 = 1, and e1 is smaller than e2. The larger the coincidence degree, the higher the similarity of the formats and contents of the two groups of determined manuscripts, and the more readily the two files form a similar storage file set; conversely, the lower the similarity of the formats and contents, the more readily the two files form an irrelevant storage file set. Likewise, the larger the deviation degree, the lower the similarity of the formats and contents of the two groups of determined manuscripts, and the more readily the two files form an irrelevant storage file set; the smaller the deviation degree, the higher the similarity, and the more readily the two files form a similar storage file set;

the specific process for judging and generating the similar storage file set and the irrelevant storage file set is as follows:

acquiring a preset superposition interval [Qa, Qb] in data storage;

(2) extracting the files with Qamin < D ≤ Qbmax and constructing an irrelevant storage file set B, i.e. files that have no same-class relation and are temporarily stored together, wherein Qamin is the minimum value of the preset superposition interval;

the process of the automatic processing and generation of the specific similar storage areas and the irrelevant storage areas is as follows:

the files in the multiple groups of irrelevant storage file sets are pooled and the process of step three is repeated to generate new similar storage file sets and irrelevant storage file sets, from which similar storage areas and irrelevant storage areas are respectively built automatically;

step four: collecting data transmission rate fluctuation, residual quantity of data in the set to be detected and delay time of the data in the set to be detected in the process of sending the set to be detected to the server within preset time, and then carrying out unified processing to obtain data transmission quality; judging the storage condition in the data service process through the data transmission quality and carrying out corresponding processing; the specific process of obtaining the data transmission quality through unified processing is as follows:

transmission quality model:

The data transmission rate fluctuation C, the residual quantity L of the data in the to-be-detected set and the delay time T of the data in the to-be-detected set are positively correlated with the data transmission quality value Z: the larger the data transmission rate fluctuation, the greater the residual quantity of data in the to-be-detected set and the longer the delay time of that data, the worse the data transmission quality; the larger the data volume, the more the server performance needs to be improved in order to raise transmission efficiency. Conversely, the data transmission quality is better. The storage condition in the data transmission process is judged according to the data transmission quality and corresponding processing is carried out. Here k1, k2 and k3 are all dimension-removing coefficients, used to keep the units consistent, with k1 + k2 + k3 = 2.57 and k1 > k2 > k3.
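The stated constraints on k1, k2 and k3 can be checked mechanically, as in the Python sketch below. The transmission quality model itself is not reproduced in this text, so the weighted sum in transmission_score is purely an assumed stand-in chosen because it grows with C, L and T; it should not be read as the formula of the invention. The default coefficient values (1.2, 0.8, 0.57) are likewise hypothetical and were picked only to satisfy the constraints.

```python
import math


def check_coefficients(k1, k2, k3, total=2.57):
    """Verify the stated constraints: k1 + k2 + k3 = 2.57 and k1 > k2 > k3."""
    return math.isclose(k1 + k2 + k3, total, abs_tol=1e-9) and k1 > k2 > k3


def transmission_score(c, l, t, k=(1.2, 0.8, 0.57)):
    """ASSUMED linear stand-in, not the patent's formula: increases with C, L and T."""
    if not check_coefficients(*k):
        raise ValueError("coefficients violate the stated constraints")
    k1, k2, k3 = k
    return k1 * c + k2 * l + k3 * t
```

Under this stand-in a larger score corresponds to worse transmission, which matches the direction of the isolation rule in step five.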

Step five: when the data transmission quality is within a preset quality interval, dividing the storage area into n sub-areas, storing test information into each sub-area, and acquiring each sub-area's data transmission rate fluctuation, residual quantity of test information and delay time of test information; processing these in a unified manner to obtain the transmission quality of each sub-area; when a sub-area's transmission quality is smaller than a preset sub-area value, performing no processing, and otherwise isolating the sub-area to safeguard the data transmission quality. The specific process of obtaining the sub-area transmission quality through unified processing is as follows:

Step six: and acquiring the number of the isolated subareas in the storage area, the total data transmission quality and the number of the non-isolated subareas in a preset period, performing dimension analysis to obtain the quality of the storage area, generating an operation signal when the quality of the storage area is larger than a preset storage value, and replacing hardware corresponding to the storage area. The specific steps for obtaining the quality of the storage area by dimension analysis are as follows:

dimension model:

k4, k5 and k6 are all dimension-removing coefficients, used to keep the units consistent, with k4 + k5 + k6 = 2.57 and k4 > k6 > k5;

In this process, the real-time data information of i campuses is temporarily stored and then sent to the server in the form of to-be-detected sets, which helps collect a large amount of data information, shortens collection time and reduces data collection cost. Feature model analysis, serialization analysis, unified processing and dimension analysis improve the accuracy and precision of distributed storage, and thereby storage efficiency. The school real-time data information in the to-be-detected set undergoes feature model analysis and serialization analysis, which divide it into similar storage file sets and irrelevant storage file sets; the file data is thus effectively partitioned and stored in newly built storage areas and irrelevant storage areas respectively, which improves the response rate of the system and enlarges the storage capacity. The data transmission quality and sub-area transmission quality are obtained through unified processing of the measurements collected for the to-be-detected set, so that the data transmission condition of the storage area can be judged in time and storage performance is improved. In addition, the quality of the storage area is obtained through dimension analysis in order to judge the capacity of the storage area and back it up, which further expands the storage space, improves the computing capacity of the server, and gives a research institution reliable and secure data storage.

The above formulas were all obtained by collecting a large amount of data for software simulation and selecting the formula closest to the true values; the coefficients in the formulas are set by a person skilled in the art according to the actual situation. The above is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited thereto; any equivalent substitution or modification made by a person skilled in the art within the technical scope disclosed by the present invention, according to its technical scheme and inventive concept, shall be covered by the protection scope of the present invention.

Claims

1. A distributed storage method based on a blockchain is applied to a city campus, and is characterized by comprising the following specific steps:

2. The distributed storage method based on the blockchain as in claim 1, wherein the feature model analysis process specifically comprises the following steps:

3. The blockchain-based distributed storage method of claim 2, wherein the specific process of generating the overlapping degree through feature model analysis is as follows:

comparing the formats and contents of the two groups of determined manuscripts respectively to generate the corresponding coincidence degrees and deviation degrees, marked S1, S2, P1 and P2 respectively; obtaining the superposition degree D through a normalization formula, wherein e1 and e2 are weight factors that make the simulated calculation of the superposition degree D more realistic, e1 + e2 = 1, and e1 is smaller than e2.

4. A blockchain-based distributed storage method as in claim 3, wherein the process of determining and generating the homogeneous set of storage files and the unrelated set of storage files is as follows:

acquiring a preset superposition interval [Qa, Qb] in data storage;

(2) extracting the files with Qamin < D ≤ Qbmax and constructing an irrelevant storage file set B, i.e. files that have no same-class relation and are temporarily stored together, wherein Qamin is the minimum value of the preset superposition interval.

5. A blockchain-based distributed storage method as in claim 3, wherein the process of automated processing generation of homogeneous storage areas and irrelevant storage areas is as follows:

6. The distributed storage method based on blockchain as in claim 3, wherein the specific process of unifying the data transmission quality is as follows:

7. The distributed storage method based on the blockchain as in claim 4, wherein the specific process of unifying the sub-area transmission quality obtained respectively is as follows:

8. The distributed storage method based on the blockchain as in claim 5, wherein the specific steps of obtaining the quality of the storage area by dimension analysis are as follows: