CN111447275B

CN111447275B - Storage system and storage device

Info

Publication number: CN111447275B
Application number: CN202010225029.5A
Authority: CN
Inventors: 高进福; 陈恭祥
Original assignee: Shenzhen Zhongsheng Ruida Technology Co ltd
Current assignee: Shenzhen Zhongsheng Ruida Technology Co ltd
Priority date: 2020-03-26
Filing date: 2020-03-26
Publication date: 2021-01-01
Anticipated expiration: 2040-03-26
Also published as: CN111447275A

Abstract

The present invention provides a storage system comprising: a receiving module for receiving storage data sent by a client; an establishing module for establishing, based on a distributed file system, storage blocks in one-to-one correspondence with the clients to store the storage data, each storage block being provided with a metadata server in one-to-one correspondence; the metadata server, for caching and synchronously distributing metadata and establishing an association relation between the metadata and the corresponding storage data; and a cluster server for receiving an access request sent by the client, synchronizing the access request to each storage block, and establishing a qualified access cluster for each storage block. The cluster server is also used for performing domain storage on the storage data in the storage block according to the established association relation and for realizing effective data distribution according to the qualified access cluster. Based on the distributed file system, the established association relation and the qualified access cluster, the storage blocks store the storage data effectively, and the security and reliability of the corresponding storage data are ensured.

Description

Storage system and storage device

Technical Field

The present invention relates to the field of data storage technologies, and in particular, to a storage system and a storage device.

Background

A distributed file system is composed of a metadata server, a data server and a client. Generally, a file operation first performs a metadata operation and then a file data operation. Ceph is an open-source unified distributed storage system and one of the most mainstream open-source storage projects at present. Ceph has outstanding advantages: for example, it provides three storage access modes (object, block and file system) and so meets various application requirements, and it supports data storage at the PB level and above, multiple backups, a decentralized structure with no single point of failure, and good scalability. However, when a distributed file system is used and data is stored based on storage blocks, the storage blocks may store the data in a disorderly manner, for example with overlapping storage, and the security and reliability of the stored data may be low.

Disclosure of Invention

The invention provides a storage system in which, based on a distributed file system, an established association relation and a qualified access cluster, storage blocks store the storage data effectively and the security and reliability of the corresponding storage data are ensured.

An embodiment of the present invention provides a storage system, including:

the receiving module is used for receiving the storage data which is sent by the client and needs to be stored;

the establishing module is used for establishing storage blocks which are in one-to-one correspondence with the client based on a distributed file system to store the storage data, and each storage block is provided with a metadata server in one-to-one correspondence;

the metadata server is used for caching and synchronously distributing metadata and establishing an association relation between the metadata and the corresponding storage data;

the cluster server is used for receiving the access request sent by the client, synchronizing the access request to each storage block and establishing a qualified access cluster of each storage block;

the cluster server is also used for performing domain storage on the storage data in the storage block according to the established association relation and realizing effective data distribution according to the qualified access cluster.

In one possible implementation manner, the system further includes:

the first recording module is used for recording the meta time of the metadata server for caching and synchronously distributing the metadata and the size of the meta file corresponding to the meta time and obtaining the meta speed;

the second recording module is used for recording the writing time of writing the data stream corresponding to the metadata into the cluster server and the size of the written file to obtain the writing speed;

the judging module is used for sending a first alarm when the meta speed recorded by the first recording module is greater than the writing speed recorded by the second recording module and the absolute value of the difference between the meta speed and the writing speed is greater than a preset difference;

and when the meta speed recorded by the first recording module is less than the writing speed recorded by the second recording module and the absolute value of the difference between the meta speed and the writing speed is less than a preset difference, sending a second alarm.

In one possible implementation manner,

the cluster server is further used for performing correct and incorrect classification processing on the storage data before the storage data is stored in the storage block, and performing attribute classification processing on the classified storage data according to the data attributes of the storage data;

the cluster server is further configured to temporarily store the storage data after the attribute classification processing into storage nodes based on a distributed file system, and simultaneously send a target request instruction to a corresponding storage block according to the node attribute of the storage node to store the storage data into the corresponding storage block.

In one possible implementation manner,

the cluster server is further used for acquiring, based on a set time axis, network data of a target network corresponding to the storage data before the storage data is subjected to the correct and incorrect classification processing;

the cluster server is further configured to mark the storage data at a first position of a mark axis, mark the acquired network data at a second position of the mark axis, determine whether an overlapping area exists between the first position and the second position based on a pre-established mark set, and if not, determine whether the first position is in a target area of the mark axis;

if so, keeping the first position unchanged;

otherwise, moving the first position and the storage data corresponding to the first position to an idle sub-area of the target area;

simultaneously, determining whether the second position is in a network area of the mark axis;

if so, keeping the second position unchanged;

otherwise, moving the second position and the network data corresponding to the second position to an idle sub-area of the network area;

the cluster server is further configured to, when the first position and the second position overlap in a first overlapping area and the first overlapping area is in the target area, perform data separation processing on the overlapping storage data and the overlapping network data in the first overlapping area, and determine all first network attributes of the overlapping network data after the data separation processing;

obtaining a second network attribute by prioritizing all the first network attributes, and marking the overlapping network data corresponding to the second network attribute in an idle sub-area of the network area of the mark axis;

the cluster server is further configured to, when the first position and the second position overlap in a second overlapping area and the second overlapping area is in the network area, perform data separation processing on the overlapping storage data and the overlapping network data in the second overlapping area, and determine all first target attributes of the overlapping storage data after the data separation processing;

and obtaining a second target attribute by prioritizing all the first target attributes, and marking the overlapping network data corresponding to the second target attribute in an idle sub-area of the network area of the mark axis to obtain a final mark axis.

In one possible implementation manner, the system further includes:

the first diagnosis module is used for carrying out pre-diagnosis on the temporary storage node in the distributed file system;

the cluster server is used for judging whether the temporary storage node is qualified or not according to the diagnosis processing result of the first diagnosis module;

if so, reserving the temporary storage node and continuing to store and use;

otherwise, carrying out preset cutting processing on the storage mapping area corresponding to the temporary storage node, and verifying the storage performance of each sub-cutting area after the cutting processing;

if the storage performance of the sub-cutting area meets a preset standard, reserving the sub-cutting area, and carrying out first marking;

if the storage performance of the sub-cutting area does not meet the preset standard, carrying out second marking on the sub-cutting area;

the cluster server is further configured to determine the total storage capacity of all sub-cutting areas in the temporary storage node that carry the first mark, and determine whether the total storage capacity is smaller than the target capacity of the corresponding storage data to be stored;

if yes, searching, among the remaining storage nodes in the distributed file system, for a first storage node whose correlation value with the temporary storage node is greater than a preset value and whose capacity value is greater than the target capacity of the corresponding storage data to be stored;

if the first storage node exists, establishing a storage relation between the first storage node and the corresponding storage data to be stored, and storing the data;

otherwise, searching for a second storage node whose correlation value with the temporary storage node is greater than the preset value, and storing the corresponding storage data to be stored into the temporary storage node and the second storage node.

In one possible implementation manner, the system further includes:

a second diagnosis module for diagnosing virus data in the stored data received by the receiving module;

the cluster server is further configured to determine a virus type of the virus data when the second diagnostic module diagnoses that the virus data exists, clear the corresponding virus data according to the virus type based on a pre-stored virus cleaning database, and perform subsequent operations on the stored data after virus clearing;

and when the second diagnosis module does not diagnose the virus data, performing subsequent operation on the stored data.

In one possible implementation manner, the system further includes:

the positioning module is used for positioning the storage position of the virus data when the cluster server clears the corresponding virus data according to the virus types;

the processing module is used for classifying, by virus type, the virus data at the storage position located by the positioning module, and applying a significance mark to virus data of the same type;

and selecting, according to pre-stored virus attribute values, one type of significance-marked virus data for virus cleaning and, after that cleaning is finished, selecting the next type of significance-marked virus data for virus cleaning, until all cleaning is finished.

In one possible implementation manner,

while the cluster server receives the access request sent by the client, synchronizes the access request to each storage block and establishes the qualified access cluster of each storage block, the security of the storage blocks that synchronously receive the access request needs to be verified in order to ensure the security of the established qualified access cluster, wherein the verification steps comprise:

step 1: determining the access level of the access request, simultaneously extracting an encryption algorithm set randomly from a mixed algorithm database according to the access level to encrypt the access request, and simultaneously setting a corresponding decryption algorithm set in each storage block based on a random extraction result;

step 2: establishing transmission links between the client sending the access request and the storage block receiving the access request, and determining a security value F1 of each transmission link;

wherein P represents the attacked probability of the i-th transmission link; q represents the quality factor of the i-th transmission link; l_i represents the link length of the i-th transmission link, where i = 1, 2, 3, ..., n; max{Δl} represents the maximum link length difference between the i-th transmission link and the other transmission links; χ_i represents the link length correction factor of the i-th transmission link;

determining the link level of the transmission link corresponding to the security value F1 based on the link security level table;

step 3: determining, based on the decryption algorithm set in step 1, the decryption efficiency of each storage block for the received encrypted access request;

step 4: determining the security level F2 of each storage block that synchronously receives the access request according to the link level and the corresponding decryption efficiency;

wherein D_i represents the link level of the i-th transmission link; v_i represents the decryption efficiency of the i-th storage block; β_i represents the threatened factor of the i-th storage block; q_i represents the quality factor of the i-th storage block;

and counting the access requests corresponding to the storage blocks with the security level F2 being greater than or equal to the preset level to obtain the qualified access cluster corresponding to each storage block.
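The final selection step can be sketched as a simple filter. This is a minimal illustration only: the security level F2 of each storage block is taken as an already computed input, and the function name, block identifiers and request identifiers are hypothetical.

```python
from typing import Dict, List


def qualified_access_clusters(
    requests_by_block: Dict[str, List[str]],
    security_level: Dict[str, float],
    preset_level: float,
) -> Dict[str, List[str]]:
    """Keep the access requests of every block whose level F2 >= preset level.

    `security_level` maps a storage block id to its F2 value, assumed to be
    computed elsewhere from the link level and the decryption efficiency.
    """
    return {
        block: list(requests)
        for block, requests in requests_by_block.items()
        if security_level[block] >= preset_level
    }
```

Blocks whose security level falls below the preset level contribute no requests to any qualified access cluster in this sketch.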

An embodiment of the present invention provides a storage apparatus, including:

a storage medium and a storage system as claimed in any preceding claim.

Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.

The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:

FIG. 1 is a block diagram of a storage system in an embodiment of the invention;

FIG. 2 is a block diagram of a mark set in an embodiment of the invention.

Detailed Description

The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.

An embodiment of the present invention provides a storage system, as shown in fig. 1, including:

The client may be a smartphone, a computer or the like, and the corresponding storage data may be, for example, commercially confidential information;

the association relation between the metadata and the corresponding storage data is established because the metadata server carries the control flow while the storage blocks and the like carry the data flow, so a relation between the control flow and the data flow needs to be established; for example, a segment of data-flow data can be controlled through a control flow a;

the access request is synchronized to each storage block, first, to determine the secure connection relationship between each storage block and the access request and, second, to determine whether the access request can be included in the qualified access cluster of the corresponding storage block;

for example, each client has its own corresponding storage block. The storage data is first stored in that storage block and, according to the association relation, is stored in the corresponding sub-domain of the block. The data in the sub-domain is then effectively distributed according to the qualified access cluster, for example according to a comprehensive access security level: the higher the comprehensive security level, the higher the security level of the data that can be accessed, and the more efficiently the effective data in the sub-domain is randomly distributed. The qualified access cluster is used for accessing data with high security; access requests outside the qualified access cluster can be regarded as general access requests, and the data they access is public data.

The beneficial effects of the above technical scheme are: based on the distributed file system and the established association relation and qualified access cluster, the storage blocks can effectively store the stored data, and the safety and reliability of the corresponding stored data are ensured.

An embodiment of the present invention provides a storage system, which is characterized by further including:

The meta speed may be obtained as the size of the cached and synchronously distributed meta file divided by the meta time.

Similarly, the writing speed may be obtained as the size of the written file divided by the writing time.

The absolute value of the difference is obtained by subtracting one speed from the other, and the preset difference is determined according to the standard of the distributed file system.

The first alarm may be: the writing speed of the cluster server is abnormal;

the second alarm may be: the meta speed of the metadata server is abnormal.
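The speed comparison performed by the judging module can be sketched as follows. This is a minimal illustration that applies the two alarm conditions exactly as they are worded in this description; the function names, alarm strings and units (bytes and seconds) are assumptions made for the example.

```python
from typing import Optional

FIRST_ALARM = "cluster server writing speed abnormal"
SECOND_ALARM = "metadata server meta speed abnormal"


def speed(size_bytes: float, seconds: float) -> float:
    """Speed computed as file size divided by elapsed time."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return size_bytes / seconds


def judge(meta_speed: float, write_speed: float, preset_diff: float) -> Optional[str]:
    """Return the alarm to send, or None when neither condition holds."""
    gap = abs(meta_speed - write_speed)
    # First condition: meta speed is higher and the gap exceeds the preset difference.
    if meta_speed > write_speed and gap > preset_diff:
        return FIRST_ALARM
    # Second condition, as stated in the text: meta speed is lower and the gap
    # is below the preset difference.
    if meta_speed < write_speed and gap < preset_diff:
        return SECOND_ALARM
    return None
```

For instance, a meta file of 4000 bytes distributed in 2 s gives a meta speed of 2000 bytes/s; against a writing speed of 500 bytes/s and a preset difference of 1000 bytes/s, the first alarm is returned.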

The beneficial effects of the above technical scheme are: comparing the speed of the metadata server with that of the cluster server allows timely alarms, helps the association relation and the qualified access cluster to be established efficiently, and improves both storage security and storage efficiency.

The embodiment of the invention provides a storage system, wherein the cluster server is further used for performing correct and incorrect classification processing on the storage data before the storage data is stored in a storage block, and performing attribute classification processing on the storage data subjected to the correct and incorrect classification processing according to the data attribute of the storage data;

The correct and incorrect classification processing judges whether the storage data is correct and classifies the storage data into correct data and incorrect data;

meanwhile, attribute classification processing is carried out on the correct data and on the incorrect data respectively, where the attribute classification is based on the field the user accesses, such as the financial field or the education field;

after attribute classification, the correct data yields, for example, attribute data 1, attribute data 2 and attribute data 3, which are stored in the corresponding storage node 1, storage node 2 and storage node 3 according to the node attribute (for example, according to the access field).

A target request instruction is sent to the corresponding storage block according to the corresponding node attribute; for example, if only correct data is to be transmitted, the correct data is transmitted to the storage block for storage.

Because one client corresponds to one storage block, after attribute classification processing is carried out on the storage data transmitted by the same client, the resulting correct data is labeled by field and transmitted together to the storage block corresponding to that field label.
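The two-stage classification can be sketched as follows. This is a minimal illustration: the record layout (a dictionary with `field` and `payload` keys), the correctness check passed in as a callable, and the function name are all assumptions, since the description does not specify how correctness is judged.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, str]
Grouped = Dict[str, List[Record]]


def classify(
    records: Iterable[Record],
    is_correct: Callable[[Record], bool],
) -> Tuple[Grouped, Grouped]:
    """Split records into correct and incorrect data, each grouped by access field."""
    correct: Grouped = defaultdict(list)
    incorrect: Grouped = defaultdict(list)
    for record in records:
        bucket = correct if is_correct(record) else incorrect
        bucket[record["field"]].append(record)
    return dict(correct), dict(incorrect)
```

Each key of the first returned dictionary then plays the role of a field label: all correct records under one label would be staged on the storage node with the matching node attribute before being sent on to the storage block.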

The beneficial effects of the above technical scheme are: attribute classification processing determines the field to which the storage data belongs, and the correct and incorrect judgment ensures the accuracy of the data to be stored by the storage block, so that the storage block stores the storage data effectively and a foundation is laid for subsequent access to the effectively stored data.

Embodiments of the present invention provide a storage system,

if so, keeping the first position unchanged;

if so, keeping the second position unchanged;

The network data is acquired based on the time axis to ensure that the storage data is obtained reliably, and also to avoid errors in the correct and incorrect judgment of the storage data caused by problems in the network data;

the mark axis is similar to a storage axis: the storage data and the network data are temporarily held on the mark axis, and the first position of the storage data and the second position of the network data are determined, so that the transmitted storage data can be conveniently determined according to the network data;

as shown in fig. 2, the mark set is an axis a that includes a marked region a1 and a marked region a2, where region a1 is the region in which network data should be temporarily held and region a2 is the region in which storage data should be temporarily held;

the corresponding first position and second position are each a part of the axis a;

when the first position and the second position have no overlapping area, they lie in independent areas; the areas in which they are located can then be quickly found based on the independent areas, and it is determined whether each position is in its correct area, which improves data processing efficiency;

if the first position or the second position is not in its correct area, the position and its data are moved to the corresponding idle sub-area; for example, network data located in region a2 is moved to an idle sub-area in region a1;

when the first position and the second position have an overlapping area, the data at the two positions cover each other; separating the covered data and marking it in the corresponding area further improves the ability to analyze the data and facilitates subsequent data storage;

when the first overlapping area a3 is in the target area, that is, region a2, data separation processing is performed on the overlapping storage data and the overlapping network data in the first overlapping area; that is, the storage data keeps its position and the network data is separated out. All first network attributes of the separated network data, such as network throughput, storage speed and the storage data being transmitted, are determined and prioritized to obtain the second network attribute, so that representative network data is obtained and moved to region a1; when the second overlapping area a4 is in the network area, the principle is similar and is not described again.

The final mark axis is used for performing the correct and incorrect classification processing and the attribute classification processing on the storage data present on the final mark axis.
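The non-overlapping case of the position check can be sketched with intervals on an integer axis. This is a minimal illustration under stated assumptions: positions and regions are modeled as half-open intervals, overlap is plain interval intersection rather than a lookup in the pre-established mark set, and the idle sub-area is found by a first-fit scan. The names `Span`, `find_free` and `place` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Span:
    start: int
    end: int  # exclusive

    @property
    def length(self) -> int:
        return self.end - self.start

    def overlaps(self, other: "Span") -> bool:
        return self.start < other.end and other.start < self.end

    def within(self, region: "Span") -> bool:
        return region.start <= self.start and self.end <= region.end


def find_free(region: Span, occupied: List[Span], length: int) -> Optional[Span]:
    """First-fit search for an idle sub-area of `region` with the given length."""
    cursor = region.start
    used_here = sorted(
        (s for s in occupied if s.overlaps(region)), key=lambda s: s.start
    )
    for used in used_here:
        if used.start - cursor >= length:
            return Span(cursor, cursor + length)
        cursor = max(cursor, used.end)
    if region.end - cursor >= length:
        return Span(cursor, cursor + length)
    return None


def place(position: Span, home: Span, occupied: List[Span]) -> Optional[Span]:
    """Keep a position already inside its home region; otherwise relocate it."""
    if position.within(home):
        return position
    return find_free(home, occupied, position.length)
```

With region a1 as `Span(0, 100)` and region a2 as `Span(100, 200)`, network data marked at `Span(150, 170)` lies in a2 and would be relocated into the first idle stretch of a1 that is long enough.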

The beneficial effects of the above technical scheme are: the first position and the second position of the storage data and the network data are marked, and the first position and the second position are adjusted, so that the storage data are managed conveniently, convenience is provided for subsequent data storage, and the accuracy of the storage data is ensured.

The embodiment of the invention provides a storage system, which further comprises:

if so, reserving the temporary storage node and continuing to store and use;

The pre-diagnosing of the temporary storage node generally determines whether the temporary storage node has the capability of identifying data, if so, the temporary storage node is qualified, otherwise, the temporary storage node is unqualified;

performing preset cutting processing on the storage mapping region of the temporary storage node, for example, after performing preset cutting processing on the storage node b, obtaining a region b1, a region b2, a region b3, and the like;

the preset cutting processing is performed according to the set region bytes of the storage mapping area, so that damage to the storage node caused by arbitrary cutting is avoided;

the storage performance may be whether the storage capability of the sub-cutting area is damaged, that is, whether the area is still able to store data; the corresponding preset standard is a data storage capability index, for example, storing 100,000 bytes of data within 1 s;

the first storage node can replace a temporary storage node, and is convenient for storing the stored data at one time, so that the integrity of the data is ensured, and the data calling efficiency is improved;

when the first storage node does not exist, the second storage node is searched to store the storage data into the temporary storage node and the second storage node, so that the calling efficiency of the same attribute is ensured.

The correlation value is determined according to the data attribute, and the storage relationship means that the target data can be stored in the first storage node, for example.
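The node selection logic can be sketched as follows. This is a minimal illustration: the class and function names are hypothetical, storing in the temporary storage node alone when its first-marked capacity suffices is an assumption (the description only covers the insufficient case), and returning an empty list when no related node exists is likewise an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SubArea:
    capacity: int
    meets_standard: bool  # True corresponds to the first mark, False to the second


@dataclass
class Node:
    name: str
    capacity: int
    correlation: float  # correlation value with the temporary storage node


def usable_capacity(sub_areas: List[SubArea]) -> int:
    """Total capacity of the sub-cutting areas that carry the first mark."""
    return sum(s.capacity for s in sub_areas if s.meets_standard)


def choose_targets(
    temp_name: str,
    sub_areas: List[SubArea],
    others: List[Node],
    target_capacity: int,
    preset_correlation: float,
) -> List[str]:
    """Return the names of the nodes that should hold the data to be stored."""
    if usable_capacity(sub_areas) >= target_capacity:
        return [temp_name]  # assumption: the temporary node alone is enough
    related = [n for n in others if n.correlation > preset_correlation]
    for node in related:
        if node.capacity > target_capacity:
            return [node.name]  # first storage node replaces the temporary node
    if related:
        return [temp_name, related[0].name]  # split with a second storage node
    return []  # assumption: no suitable node was found
```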

The beneficial effects of the above technical scheme are: the first mark and the second mark divide the sub-cutting areas of the temporary storage node accurately; the preset cutting processing makes it convenient to cut the storage mapping area and avoids arbitrary cutting that would damage the temporary storage node and introduce errors when the storage data is subsequently stored to the storage block; and determining the total storage capacity of the first-marked sub-cutting areas and searching for the first storage node and the second storage node make it convenient to store the data effectively.

The virus data is generally Trojan horse virus, network virus, computer virus and the like.

The beneficial effects of the above technical scheme are: by determining the virus data and cleaning the virus data, convenience is provided for subsequent storage of the stored data, and the risk of data storage failure is reduced.

In one possible implementation manner, the system further includes:

The virus data is generally stored in some disks;

the significance labeling is performed to facilitate the determination of the same type of virus;

the virus attribute value is used for determining the cleaning order of the different types: the more harmful the virus, the earlier it is cleaned.
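The cleaning order can be sketched as grouping by type followed by a sort on the attribute value. This is a minimal illustration: the input layout (pairs of storage location and virus type), the default attribute value of 0.0 for unknown types, and the function name are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def cleaning_order(
    findings: Iterable[Tuple[str, str]],
    attribute_values: Dict[str, float],
) -> List[Tuple[str, List[str]]]:
    """Group located virus data by type, highest attribute value first.

    `findings` holds (storage location, virus type) pairs; every location of
    one type is cleaned together before the next type is started.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for location, virus_type in findings:
        groups[virus_type].append(location)
    ordered_types = sorted(
        groups, key=lambda t: attribute_values.get(t, 0.0), reverse=True
    )
    return [(t, groups[t]) for t in ordered_types]
```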

The beneficial effects of the above technical scheme are: by locating the virus data, cleaning one type of virus completely and only then handling the other types, switching between virus handling mechanisms is reduced, the cleaning efficiency is improved, and the loss of storage data caused by virus data is effectively reduced.

The embodiment of the present invention provides a storage system, where the cluster server receives an access request sent by the client, synchronizes the access request to each storage block, and meanwhile, in a process of establishing a qualified access cluster for each storage block, in order to ensure the security of the established qualified access cluster, it is necessary to verify the security of the storage block that synchronously receives the access request, and the verification step includes:

wherein P represents the attacked probability of the i-th transmission link; q represents the quality factor of the i-th transmission link; l_i represents the link length of the i-th transmission link, where i = 1, 2, 3, ..., n; max{Δl} represents the maximum link length difference between the i-th transmission link and the other transmission links; χ_i represents the link length correction factor of the i-th transmission link;

The beneficial effects of the above technical scheme are: setting random encryption and decryption algorithms for the access requests and the corresponding storage blocks improves the flexibility of encryption and decryption and the security of the access requests; determining the link security levels of the different transmission links and the decryption efficiency of the different storage blocks gives the security level of each access request; qualified access requests are determined from these security levels and the qualified access clusters are constructed, which improves the security and reliability of access to the corresponding storage blocks.

An embodiment of the present invention provides a storage device, including:

a storage medium and a storage system as claimed in any preceding claim.

The storage medium refers to a carrier for storing data, such as a floppy disk, an optical disk, a DVD, a hard disk, a flash memory, a USB flash drive, a CF card, an SD card, an MMC card, an SM card, a Memory Stick, an xD card, etc.

The beneficial effects of the above technical scheme are: the position accuracy of the stored data is improved, and the efficiency of calling the stored data is improved.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims

1. A storage system, comprising:

the cluster server is also used for performing domain storage on the storage data in the storage block according to the established association relation and realizing effective data distribution according to the qualified access cluster;

wherein D_i represents the link level of the i-th transmission link; v_i represents the decryption efficiency of the i-th storage block; β_i represents the threatened factor of the i-th storage block; q_i represents the quality factor of the i-th storage block;

2. The storage system of claim 1, further comprising:

3. The storage system of claim 1,

4. The storage system of claim 3,

if so, keeping the first position unchanged;

if so, keeping the second position unchanged;

5. The storage system of claim 3, further comprising:

if so, reserving the temporary storage node and continuing to store and use;

6. The storage system of claim 1, further comprising:

7. The storage system of claim 6, further comprising:

8. A storage device, comprising:

a storage medium and a storage system as claimed in any one of the claims 1-7.