CN110636122A - Distributed storage method, server, system, electronic device, and storage medium - Google Patents
- Publication number
- CN110636122A (application CN201910857800.8A)
- Authority
- CN
- China
- Prior art keywords
- cluster
- storage
- file
- weight coefficient
- writing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/06—Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/104—Peer-to-peer [P2P] networks
- H04L67/1044—Group management mechanisms
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/02—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Theoretical Computer Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the invention relate to the technical field of computers and disclose a distributed storage method, a server, a system, an electronic device, and a storage medium. In the invention, a write request for a file is received; a weight coefficient of each cluster in a storage cluster array is acquired, the weight coefficient being generated according to the current storage resources of the cluster; a target cluster for writing the file is determined by a hash algorithm according to the weight coefficient of each cluster; and the file is written into the target cluster. As a result, when storage capacity needs to be expanded, the drop in read-write performance that data redistribution would cause in a storage cluster is avoided without affecting file upload and download efficiency, which improves the user's experience of reading and writing cloud files. Because the overall architecture expands capacity in units of clusters, it can also improve the efficiency with which operation and maintenance personnel later maintain the cloud storage system.
Description
Technical Field
Embodiments of the invention relate to the technical field of computers, and in particular to a distributed storage method, a server, a system, an electronic device, and a storage medium.
Background
Cloud storage is a concept extended and developed from cloud computing. It refers to a system in which a large number of storage devices of different types in a network work together through application software, by means of cluster applications, network technology, or distributed file systems, to jointly provide data storage and service access functions, thereby ensuring data security and saving storage space. Common distributed storage systems at present include the Google File System (GFS), the Lustre parallel distributed file system, the Ceph distributed file system, and the GlusterFS network file system. Open-source Ceph is a reliable, scalable, unified distributed storage solution and, driven in particular by the OpenStack open-source cloud computing management platform, has been widely adopted by internet companies since it entered the industry.
The resource scheduling methods for Ceph distributed clusters provided by the prior art are mainly based on a single Ceph distributed cluster, and schedule resources by trading space for time, for example by merging writes to hard disk drives (HDD) or adding solid state drive (SSD) caches. For example, the read-write performance of a single Ceph cluster is optimized by adjusting the strong-consistency scheme for reads and writes in the cluster, or by adjusting the number of copies and the node selection algorithm.
However, in the course of implementing the present invention, the inventors found that the prior art addresses only the user's read-write performance experience on a single Ceph distributed cluster, and does not consider how to expand resources while preserving that experience when the storage resources of the single Ceph distributed cluster run low.
Disclosure of Invention
Embodiments of the invention aim to provide a distributed storage method, a server, a system, an electronic device, and a storage medium, so that when storage capacity needs to be expanded, the drop in read-write performance caused by expanding cluster resources is avoided without affecting file upload and download efficiency, which improves the user's experience of reading and writing cloud files. Because the overall architecture expands capacity in units of clusters, it can also improve the efficiency with which operation and maintenance personnel later maintain the cloud storage system.
To solve the above technical problem, an embodiment of the present invention provides a distributed storage method, including: receiving a write request for a file; acquiring a weight coefficient of each cluster in a storage cluster array, wherein the weight coefficient is generated according to the current storage resources of the cluster; determining a target cluster for writing the file according to the weight coefficient of each cluster; and writing the file into the target cluster.
An embodiment of the present invention further provides a server, including: a request receiving module, configured to receive a read or write request for a file; a computing module, configured to acquire a weight coefficient of each cluster in the storage cluster array and to determine a target cluster for writing the file by using a hash algorithm according to the weight coefficient; and a writing module, configured to write the file into the target cluster.
Compared with the prior art, embodiments of the invention combine a plurality of distributed storage clusters into a storage cluster array and set a weight coefficient for each cluster according to its current storage resources, so that files are distributed to the clusters at random with probabilities given by the weight coefficients. Storage resources can therefore be expanded or scheduled without affecting the user's read-write performance experience, and the efficiency with which operation and maintenance personnel later maintain the whole distributed storage system can be improved.
In addition, the weight coefficient includes a default weight coefficient. The default weight coefficient is generated by a database management system in real time according to the current remaining storage capacity of the cluster or the network speed of the machine room where the cluster is located, and it is positively correlated with the remaining storage capacity or the network speed. In this way, the weight coefficient can be generated from the current remaining storage capacity of the cluster, and the file write strategy can also be adjusted to account for the effect of the machine room's network speed on the read-write speed of cloud storage files, which gives the user a better experience.
In addition, the weight coefficient further includes a user weight coefficient. The user weight coefficient is set by a user according to actual conditions and is stored in the database management system. When the weight coefficient of each cluster in the storage cluster array is acquired, the user weight coefficient is acquired preferentially. In this way, operation and maintenance personnel can customize the distribution of the files currently to be stored according to actual conditions, so that the storage space is used more flexibly and more scenarios can be handled.
In addition, the storage cluster array includes a main cluster array and a standby cluster array, and the clusters in the main cluster array correspond one-to-one to the clusters in the standby cluster array. The target cluster is a cluster in the main cluster array. When the target cluster fails, the file is written into the cluster in the standby cluster array that corresponds to the target cluster. In this way, the reliability of the whole storage system is enhanced, and the system can operate stably and reliably when accidents such as a machine room power failure occur.
In addition, index information consisting of the identification of the file and the identification of the target cluster is generated and stored in an index cluster, wherein the index cluster is built from high-speed storage media. Building the index cluster from high-speed media greatly improves the server's response speed when a user reads a cloud storage file, and significantly improves the user experience.
Drawings
One or more embodiments are illustrated by the corresponding figures in the drawings, which are not meant to be limiting.
FIG. 1 is a flowchart of a distributed storage method according to a first embodiment of the present invention;
FIG. 2 is a flowchart of a distributed storage method according to a second embodiment of the present invention;
FIG. 3 is a flowchart of a distributed storage method according to a third embodiment of the present invention;
FIG. 4 is a block diagram of a storage cluster array according to a third embodiment of the present invention;
FIG. 5 is a block diagram of a server according to a fourth embodiment of the present invention;
FIG. 6 is a structural diagram of a distributed storage system according to a fifth embodiment of the present invention;
FIG. 7 is a block diagram of an electronic device according to a sixth embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the embodiments are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will appreciate that numerous technical details are set forth in the embodiments to help the reader better understand the present application; the technical solution claimed in the present application can nevertheless be implemented without these technical details, and with various changes and modifications based on the following embodiments. The division into embodiments below is for convenience of description and should not limit the specific implementation of the present invention; the embodiments may be combined with and refer to one another where they do not contradict.
The first embodiment of the invention relates to a distributed storage method applied to a server. In this embodiment, a write request for a file is received; a weight coefficient of each cluster in a storage cluster array is acquired, the weight coefficient being generated according to the current storage resources of the cluster; a target cluster for writing the file is determined according to the weight coefficient of each cluster; and the file is written into the target cluster. The distributed storage clusters form a storage cluster array, and a weight coefficient is set for each cluster according to its current storage resources, so that files are distributed to the clusters at random with probabilities given by the weight coefficients. Storage resources can therefore be expanded or scheduled without affecting the user's read-write performance experience, and the efficiency with which operation and maintenance personnel later maintain the whole distributed storage system can be improved.
The following describes implementation details of the distributed storage method of the present embodiment in detail, and the following is provided only for easy understanding and is not necessary for implementing the present embodiment.
As shown in fig. 1, the distributed storage method in this embodiment specifically includes:
Specifically, the user first sends a file upload request from a client to the cluster's proxy server, which is based on the hypertext transfer protocol (HTTP), and establishes a communication connection with the proxy server. The request received by the proxy server includes the ID of the file. The client may be a native Ceph client in any of various programming languages.
In this embodiment, the cluster is a cluster built based on a Ceph distributed file system, and the storage cluster array is a cluster array composed of a plurality of Ceph distributed file system clusters. When the storage space of the cluster array needs to be expanded, the expansion is performed in units of one cluster.
Specifically, for the purpose of reasonably writing files into the distributed storage clusters and fully and uniformly utilizing cluster storage resources, each cluster has a weight coefficient generated according to the specific situation of the current storage resource. The storage resources include: the current remaining storage space of the cluster and the current network speed of the server room in which the cluster is located.
In specific application, the current storage resource information of each cluster is stored in a small relational database management system MySQL, the system can dynamically update data in real time, and the system generates the current weight coefficient of each cluster according to the current storage resource information of each cluster. After receiving the file request, the proxy server immediately requests the weight coefficient of each current cluster from MySQL and caches the weight coefficient in the server.
An example is illustrated:
Cluster identification | Current remaining storage capacity β | Weight coefficient
1 | 20% | 0.4
2 | 20% | 0.4
3 | 10% | 0.2
As shown in the table above, a storage cluster array includes three Ceph clusters with equal total storage space but unequal current remaining storage capacity. When the operation and maintenance staff want the files subsequently written into the storage cluster array to even out the space utilization of the clusters, a simple formula can be set for the process by which MySQL generates the current weight coefficient of each cluster:
Based on the above example, when the storage space of the cluster array needs to be expanded, a new cluster, "cluster 4", is added. The remaining storage capacity of this cluster is 100%, and therefore the weight coefficient is given
In another example, when the total storage space of the clusters in a storage cluster array is not equal and the operation and maintenance personnel still want the written files to be distributed uniformly across the clusters, the calculation can be based on the absolute remaining capacity of each cluster rather than on the ratio of remaining to total storage capacity used in the example above.
In another example, the plurality of Ceph clusters in this embodiment are deployed in different machine rooms, which enhances the reliability of the entire storage system and avoids a sudden drop in file read-write performance due to physical factors. For example, if network congestion occurs in the machine room where a certain cluster is located, so that the data read-write performance of that cluster drops, MySQL can also reduce the weight coefficient of the cluster according to other rules or formulas set by the user. Users thus keep a good file read-write experience, and the maintenance burden on operation and maintenance personnel is reduced.
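As a minimal sketch of how such weight coefficients might be generated, the following assumes that each cluster's weight is its share of the array's total remaining capacity. This rule reproduces the values in the table above, but it is an assumption made for illustration and not the formula stated by the patent; the function name `default_weights` is hypothetical.

```python
def default_weights(remaining):
    """Map each cluster ID to its share of the total remaining capacity.

    ASSUMPTION: proportional normalization; the source does not state the rule.
    `remaining` may hold percentages (equal-sized clusters) or absolute
    capacities (unequal-sized clusters), as long as one unit is used throughout.
    """
    total = sum(remaining.values())
    if total <= 0:
        raise ValueError("no remaining capacity in any cluster")
    return {cluster_id: cap / total for cluster_id, cap in remaining.items()}


# Remaining capacity of clusters 1-3 from the table, in percent.
weights = default_weights({1: 20, 2: 20, 3: 10})
```

Passing absolute capacities instead of percentages covers the case of clusters with unequal total storage space.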
Step 103: determining a target cluster for writing the file by using a random distribution algorithm according to the weight coefficient of each cluster.
Step 104: writing the file into the target cluster.
In this embodiment, the random distribution algorithm is a hash algorithm. First, a hash value table is preset; it contains two kinds of data, hash values and nodes, and each hash value corresponds to one node. Based on the foregoing example, each cluster corresponds to a number of nodes proportional to its weight coefficient. Assuming a scaling factor of 100, cluster 1 corresponds to 40 nodes, cluster 2 to 40 nodes, and cluster 3 to 20 nodes. If a new cluster is added, 100 nodes corresponding to the new cluster are added directly. The algorithm takes a file ID as its input parameter to obtain a hash value, looks up the cluster number in the table, and outputs that cluster number. Combining the random distribution algorithm with the weighting in this way ensures that, on the whole, the distribution of written files across the storage space matches the weight coefficients of the clusters.
Specifically, the file ID in the file write request is input into the hash algorithm, which returns an address, namely the number of the cluster into which the file is to be written. A data channel between the client and that Ceph cluster is then established, and the file is transmitted and written into the cluster. Because the hash algorithm is a random algorithm, MySQL must monitor the storage capacity of each cluster in real time and dynamically generate a weight coefficient for each cluster, both to maintain performance when files are written to the storage media and to ensure that each cluster has enough capacity to accommodate the files.
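The table lookup described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patent's implementation: SHA-256 stands in for the unspecified hash function, the node table is a plain list, and the names `build_node_table` and `pick_cluster` are hypothetical.

```python
import hashlib


def build_node_table(weights, scale=100):
    """Give each cluster a number of nodes proportional to its weight."""
    table = []
    for cluster_id in sorted(weights):
        table.extend([cluster_id] * round(weights[cluster_id] * scale))
    if not table:
        raise ValueError("all weight coefficients are zero")
    return table


def pick_cluster(file_id, table):
    """Hash the file ID to a slot in the node table and return its cluster.

    ASSUMPTION: SHA-256 is used here; the source does not name the hash.
    """
    digest = hashlib.sha256(file_id.encode("utf-8")).digest()
    slot = int.from_bytes(digest[:8], "big") % len(table)
    return table[slot]


table = build_node_table({1: 0.4, 2: 0.4, 3: 0.2})
target = pick_cluster("file-0001", table)
```

Because the lookup depends only on the file ID and the table, the same file ID maps to the same cluster for as long as the weights are unchanged; once the weights change, the index information recorded at write time is what locates the file.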
The above examples in the present embodiment are for convenience of understanding, and do not limit the technical aspects of the present invention.
Compared with the prior art, this embodiment integrates a plurality of Ceph distributed storage clusters into a Ceph distributed storage cluster array, which solves the problem that storage resources cannot be expanded when a single Ceph distributed storage cluster runs short of them. Because no single Ceph distributed storage cluster needs to be expanded, no data rebalancing has to be performed inside any single cluster, so storage expansion is transparent to the user and the user experience is improved. Moreover, because a weight coefficient is set for each Ceph distributed storage cluster in advance, writing a file requires only a hash calculation based on the file ID and the weight coefficients of the clusters to obtain the specific write address of the file, which further improves the response speed of the distributed storage system and the balance of the file distribution across the clusters.
A second embodiment of the present invention relates to a distributed storage method, and a flow is shown in fig. 2, where the method includes:
In this embodiment, the operation and maintenance staff can set the weight coefficient of each cluster according to the actual needs and store the weight coefficient in MySQL. When the weight coefficient generated by the MySQL and the user weight coefficient set by the user exist in the MySQL at the same time, the proxy server obtains the user weight coefficient as the weight coefficient used by the algorithm.
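The precedence rule above can be sketched as a small merge function. The per-cluster override is an assumption made for illustration (the text only says the user weight coefficient is obtained preferentially), and `effective_weights` is a hypothetical name.

```python
def effective_weights(default, user=None):
    """Use the user-set weight for a cluster when one exists, else the default.

    ASSUMPTION: precedence is applied per cluster; the source may instead
    intend the user scheme to replace the generated one as a whole.
    """
    user = user or {}
    return {cluster_id: user.get(cluster_id, w) for cluster_id, w in default.items()}


generated = {1: 0.4, 2: 0.4, 3: 0.2}
merged = effective_weights(generated, user={3: 0.0})
```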
In practical applications, the sizes of files uploaded to the cloud by users vary greatly, and a large file may be hundreds of times the size of a small one. Because each cluster is built from storage media of small capacity, the remaining storage capacity of a cluster may be insufficient to store a large file, so a suitable strategy is needed to ensure that the storage space is used reasonably. In this case, the user can set several weight coefficient schemes to suit files of various sizes. For example, when the remaining capacity of cluster 1 among clusters 1, 2, and 3 is low and a write request for a large file is received, the weight coefficient scheme for large files is adopted.
Specifically, the cluster weight coefficient scheme for large files is as follows: the weight coefficient of any cluster that cannot store the large file is set to 0, and the weight coefficients are then distributed according to the conventional rule among the clusters with enough capacity to store the file. The weight coefficient scheme for small files is as follows: the weight coefficients are assigned to all clusters in the normal way, according to the storage resources of each cluster.
In practical applications, the files can be divided into N classes according to the file volumes, where N is a natural number greater than 2, so as to make the use of the storage capacity more reasonable. When a file writing request is received by the proxy server, the volume size of the file is firstly obtained, and a proper weight coefficient scheme is obtained according to the volume size.
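A size-aware scheme of this kind might be sketched as below. The sketch assumes the "conventional rule" is a proportional split of remaining capacity among eligible clusters, which the source does not specify; `weights_for_file` is a hypothetical name.

```python
def weights_for_file(file_size, remaining_bytes):
    """Zero the weight of clusters too full for the file, then split the rest.

    ASSUMPTION: eligible clusters share the weight in proportion to their
    remaining capacity.
    """
    eligible = {c: cap for c, cap in remaining_bytes.items() if cap >= file_size}
    total = sum(eligible.values())
    if total == 0:
        raise RuntimeError("no cluster can hold a file of this size")
    return {c: eligible.get(c, 0) / total for c in remaining_bytes}


# Cluster 1 has too little space left for a 50-unit file.
large_file_weights = weights_for_file(50, {1: 10, 2: 100, 3: 100})
small_file_weights = weights_for_file(5, {1: 10, 2: 100, 3: 100})
```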
In another example, the read-write rate of the storage medium is decreased due to the increase of the space utilization rate, so that the operation and maintenance personnel can set a space utilization rate threshold value for each storage cluster according to the actual situation, and when the space utilization rate of a certain cluster reaches the threshold value, the weight coefficient of the cluster is automatically set to 0, so as to ensure that the read-write rate of each cluster is kept above the read-write rate level acceptable to a user.
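The threshold rule can be sketched as follows, assuming utilization and thresholds are expressed as fractions between 0 and 1; the function name and the data layout are hypothetical.

```python
def apply_utilization_threshold(weights, utilization, thresholds):
    """Set a cluster's weight to 0 once its space utilization reaches its threshold."""
    return {
        cluster_id: 0.0 if utilization[cluster_id] >= thresholds[cluster_id] else w
        for cluster_id, w in weights.items()
    }


limited = apply_utilization_threshold(
    weights={1: 0.4, 2: 0.4, 3: 0.2},
    utilization={1: 0.80, 2: 0.80, 3: 0.90},
    thresholds={1: 0.85, 2: 0.85, 3: 0.85},
)
```

A cluster whose weight becomes 0 receives no nodes in the hash table, so no new files are routed to it until its weight is raised again.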
In another example, when the storage cluster array adds a new cluster due to capacity expansion, the operation and maintenance personnel may set all the weighting coefficients of the old cluster to 0, and then equally distribute the weighting coefficients to the new clusters, so as to quickly achieve the purpose of storing data in the clusters in a balanced manner, thereby making the storage space in the storage cluster array more reasonable in utilization.
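As a brief illustration of this expansion strategy, the sketch below zeroes the weights of the old clusters and splits the weight equally among the new ones. The function name `expansion_weights` is hypothetical.

```python
def expansion_weights(old_ids, new_ids):
    """Route all new writes to the newly added clusters, in equal shares."""
    if not new_ids:
        raise ValueError("at least one new cluster is required")
    share = 1.0 / len(new_ids)
    weights = {cluster_id: 0.0 for cluster_id in old_ids}
    weights.update({cluster_id: share for cluster_id in new_ids})
    return weights


after_expansion = expansion_weights(old_ids=[1, 2, 3], new_ids=[4, 5])
```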
Step 204: determining a target cluster for writing the file by using a random distribution algorithm according to the weight coefficient of each cluster. This step is similar to step 103 in the first embodiment of the present invention and is not described again here.
The steps of the above methods are divided for clarity of description. In implementation, steps may be combined into one step, or a step may be split into multiple steps; as long as the same logical relationship is included, all such variants are within the protection scope of this patent. Adding insignificant modifications to the algorithms or processes, or introducing insignificant design changes, without changing the core design of the algorithms and processes is also within the protection scope of this patent.
Compared with the prior art, the operation and maintenance personnel can set the weight coefficients of different schemes according to actual conditions and perform custom setting on the distribution of the files needing to be stored currently, so that the storage space is more flexibly utilized, more scenes can be dealt with, and the user performance experience is improved.
A third embodiment of the present invention relates to a distributed storage method, and a flow is shown in fig. 3, where the method includes:
Specifically, as shown in FIG. 4, the entire storage cluster array includes a main cluster array and a standby cluster array, and each cluster in the main cluster array has a corresponding standby cluster in the standby cluster array. After the proxy server determines the cluster identifier of the target cluster from the ID of the file to be written by using the algorithm, it first sends a write request to the cluster in the main cluster array corresponding to that cluster identifier, and then waits for the cluster state information returned by the cluster array management server. If the cluster is currently operating normally, the file uploaded by the user is written into it. If the cluster is currently in an abnormal state, a file write request is sent to its corresponding standby cluster, and the file uploaded by the user is written into that standby cluster.
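The failover write path can be sketched as follows. This is a simplified in-memory model: the `healthy` flag stands in for the state information returned by the cluster array management server, and the `Cluster` class and `write_with_failover` function are hypothetical.

```python
class Cluster:
    """In-memory stand-in for one storage cluster."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy  # ASSUMPTION: reported by the management server
        self.files = {}

    def write(self, file_id, data):
        self.files[file_id] = data


def write_with_failover(file_id, data, cluster_id, primary, standby):
    """Write to the main cluster, or to its standby when the main one is abnormal."""
    target = primary[cluster_id]
    if not target.healthy:
        target = standby[cluster_id]
    target.write(file_id, data)
    return target.name


primary = {1: Cluster("main-1"), 2: Cluster("main-2", healthy=False)}
standby = {1: Cluster("standby-1"), 2: Cluster("standby-2")}
written_to_a = write_with_failover("a", b"x", 1, primary, standby)
written_to_b = write_with_failover("b", b"y", 2, primary, standby)
```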
In practical application, the backup cluster array can be used as an emergency backup when a file is written in, and can also be used as a mirror image of the main cluster array when a user needs to read a cloud file.
Specifically, after the file is written into the target cluster, when each cluster is in an idle state, that is, when the data throughput of each cluster is at a low level, the cluster array management server copies the file into one mirror image and stores the mirror image in the standby cluster corresponding to the target cluster. Therefore, the reliability of the cloud file is improved under the condition that the reading and writing experience of a user is not influenced. When a certain cluster fails and cannot be read, a user can read the file images stored in the standby cluster.
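The deferred mirroring described above might look like the following sketch. It assumes the idle state is detected by comparing each cluster's throughput with a limit, and that written files are tracked in a pending list; all names and the data layout are hypothetical.

```python
def mirror_pending(pending, primary_files, standby_files, throughput, idle_limit):
    """Copy pending files to their standby clusters, but only when all clusters are idle.

    `pending` is a list of (cluster_id, file_id) pairs awaiting mirroring.
    Returns the number of files copied.
    """
    if any(rate >= idle_limit for rate in throughput.values()):
        return 0  # some cluster is busy; try again later
    copied = 0
    while pending:
        cluster_id, file_id = pending.pop()
        standby_files[cluster_id][file_id] = primary_files[cluster_id][file_id]
        copied += 1
    return copied


primary_files = {1: {"a": b"x"}, 2: {"b": b"y"}}
standby_files = {1: {}, 2: {}}
pending = [(1, "a"), (2, "b")]
busy = mirror_pending(pending, primary_files, standby_files, {1: 900, 2: 10}, idle_limit=100)
idle = mirror_pending(pending, primary_files, standby_files, {1: 20, 2: 10}, idle_limit=100)
```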
Specifically, the storage system is provided with an index cluster whose storage media are all high-speed storage media, including solid state drives (SSD) and dynamic random access memory (DRAM). When the proxy server receives a user's request to read a cloud file, it first queries the index cluster for the corresponding cluster identifier according to the file ID, and then transmits the file data to the user's client. Because the storage media in the index cluster are all high-speed storage media, the query completes in a very short time, which gives the user a better reading experience. In addition, when operation and maintenance personnel maintain the distributed storage system, they can obtain the specific storage address of each file more quickly, which improves operation and maintenance efficiency.
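The read path through the index can be sketched with a dictionary standing in for the index cluster. This is an illustration only; a real index cluster would be a separate store on SSD or DRAM, and the function names here are hypothetical.

```python
def record_index(index, file_id, cluster_id):
    """Store the (file ID, cluster ID) pair written at upload time."""
    index[file_id] = cluster_id


def read_file(index, clusters, file_id):
    """Look up the cluster that holds the file, then fetch the data from it."""
    cluster_id = index[file_id]  # raises KeyError for an unknown file ID
    return clusters[cluster_id][file_id]


index = {}  # stand-in for the index cluster
clusters = {1: {}, 2: {"report.pdf": b"%PDF"}}
record_index(index, "report.pdf", 2)
data = read_file(index, clusters, "report.pdf")
```

Recording the cluster at write time means a read never has to recompute the hash, so later changes to the weight coefficients do not affect where existing files are found.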
Compared with the prior art, in the embodiment, the backup cluster array and the index cluster are arranged in the distributed storage system, so that the reliability of long-term storage of the files can be improved, the performance experience of users in reading the files can be improved, and the operation and maintenance efficiency of operation and maintenance personnel can be improved.
A fourth embodiment of the present invention relates to a server, which is configured as shown in fig. 5, and includes:
a request receiving module 501, configured to receive a read or write request of a file;
a calculating module 502, configured to obtain a weight coefficient of each cluster in the storage cluster array, cache the weight coefficient, and determine, according to the weight coefficient and a hash algorithm, a target cluster for writing a file;
In one embodiment, after determining the target cluster, the computing module first sends a write request to the target cluster, and then determines whether the target cluster is operating normally according to feedback from the management server of the storage cluster array. When the target cluster is abnormal, a write request is sent to the standby cluster in the standby cluster array that corresponds to the target cluster.
A writing module 503, configured to write the file uploaded by the user into the target cluster determined by the algorithm.
In one embodiment, after the writing module writes the user file into the target cluster, the writing module generates index information of the file according to the file ID and the cluster number of the target cluster, and stores the index information into the index cluster.
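How the three modules might fit together is sketched below as a single class. This is a simplified in-memory model under stated assumptions: SHA-256 stands in for the unspecified hash, dictionaries stand in for the clusters and the index cluster, and the class and method names are hypothetical.

```python
import hashlib


class ProxyServer:
    """Toy model of the request receiving, computing, and writing modules."""

    def __init__(self, weights, clusters, index, scale=100):
        self.clusters = clusters  # cluster ID -> {file ID: data}
        self.index = index        # file ID -> cluster ID
        # Cached node table: each cluster gets nodes in proportion to its weight.
        self.table = [
            cid for cid in sorted(weights) for _ in range(round(weights[cid] * scale))
        ]

    def compute_target(self, file_id):
        """Computing module: hash the file ID and look up the target cluster."""
        digest = hashlib.sha256(file_id.encode("utf-8")).digest()
        return self.table[int.from_bytes(digest[:8], "big") % len(self.table)]

    def handle_write(self, file_id, data):
        """Receive a write request, write the file, and record its index entry."""
        target = self.compute_target(file_id)
        self.clusters[target][file_id] = data
        self.index[file_id] = target
        return target


server = ProxyServer({1: 0.4, 2: 0.4, 3: 0.2}, {1: {}, 2: {}, 3: {}}, {})
chosen = server.handle_write("photo.jpg", b"\xff\xd8")
```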
Compared with the prior art, the server in the embodiment determines the specific storage address of the file by combining the hash algorithm with the weight coefficient, so that the overall distribution of the file conforms to the distribution of the weight coefficient, the space capacity of the distributed storage system is fully and reasonably utilized, the reduction of the read-write performance of the storage cluster caused by the redistribution of the data is avoided, and the performance experience of the user on the read-write of the cloud file is improved.
It should be noted that each module referred to in this embodiment is a logical module, and in practical applications, one logical unit may be one physical unit, may be a part of one physical unit, and may be implemented by a combination of multiple physical units. In addition, in order to highlight the innovative part of the present invention, elements that are not so closely related to solving the technical problems proposed by the present invention are not introduced in the present embodiment, but this does not indicate that other elements are not present in the present embodiment.
A fifth embodiment of the present invention relates to a distributed storage system, whose structure is shown in FIG. 6, including:
the server 601 according to the fourth embodiment of the present invention;
the storage cluster array 602 built by N storage clusters is used for storing files uploaded by users, where N is a natural number greater than 1.
In a specific example, the storage cluster array may include a main cluster array and a standby cluster array, which contain the same number of clusters in one-to-one correspondence. After the server uses the algorithm to determine the cluster identifier of the target cluster from the ID of the file to be written, it first sends a write request to the cluster in the main cluster array that corresponds to the cluster identifier and then waits for the cluster state information returned by the cluster array management server. If the cluster is currently operating normally, the file uploaded by the user is written into that cluster. If the cluster is currently in an abnormal state, the server sends a file write request to the standby cluster corresponding to that cluster and writes the file uploaded by the user into that standby cluster.
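The write path with failover can be sketched as follows. The sketch assumes that the state information returned by the cluster array management server can be modeled as a simple callable `is_healthy(cluster_id)`; that callable, the function name `write_with_failover`, and the dictionaries standing in for clusters are hypothetical.

```python
def write_with_failover(file_id, data, cluster_id, main, standby, is_healthy):
    """Write to the main cluster, or to its standby if the main one is abnormal.

    Returns a (array_name, cluster_id) tuple naming where the file was written.
    """
    if is_healthy(cluster_id):
        main[cluster_id][file_id] = data
        return ("main", cluster_id)
    # Main and standby clusters correspond one to one, so the same ID is reused.
    standby[cluster_id][file_id] = data
    return ("standby", cluster_id)


main = {0: {}, 1: {}}
standby = {0: {}, 1: {}}
status = {0: True, 1: False}   # cluster 1 is reported as abnormal
placed = write_with_failover("file-001", b"hello", 1, main, standby, status.get)
```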
In a specific example, the standby cluster array serves as an emergency backup when a file is written, and also as a mirror of the main cluster array when a user needs to read a cloud file. After a file has been written into the target cluster, and once every cluster is idle, that is, when the data throughput of every cluster is at a low level, the cluster array management server copies the file as a mirror image into the standby cluster corresponding to the target cluster. This improves the reliability of cloud files without affecting the user's read/write experience. When a cluster fails and cannot be read, the user can read the file's mirror image stored in the corresponding standby cluster.
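The deferred mirroring described above can be sketched as a small scheduler. The patent does not say how idleness is measured; the sketch assumes a single throughput threshold, and the class `MirrorScheduler`, its methods, and the threshold value are hypothetical.

```python
from collections import deque


class MirrorScheduler:
    """Queue written files and mirror them only while every cluster is idle."""

    def __init__(self, idle_threshold):
        self.idle_threshold = idle_threshold   # assumed throughput limit
        self.pending = deque()                 # (cluster_id, file_id) pairs

    def record_write(self, cluster_id, file_id):
        self.pending.append((cluster_id, file_id))

    def run_if_idle(self, throughput, main, standby):
        """Copy pending files to their standby clusters; return the count."""
        if any(t > self.idle_threshold for t in throughput.values()):
            return 0                           # some cluster is busy, try later
        copied = 0
        while self.pending:
            cluster_id, file_id = self.pending.popleft()
            standby[cluster_id][file_id] = main[cluster_id][file_id]
            copied += 1
        return copied


main = {0: {"file-001": b"hello"}, 1: {}}
standby = {0: {}, 1: {}}
scheduler = MirrorScheduler(idle_threshold=10)
scheduler.record_write(0, "file-001")
busy = scheduler.run_if_idle({0: 50, 1: 2}, main, standby)   # cluster 0 busy
idle = scheduler.run_if_idle({0: 3, 1: 2}, main, standby)    # all clusters idle
```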
an index cluster 603 used for storing index information of the files.
Specifically, the storage media of the index cluster are all high-speed storage media, which include solid-state drives (SSD) and dynamic random access memory (DRAM). When the proxy server receives a user's request to read a cloud file, it first queries the index cluster for the corresponding cluster identifier according to the file ID and then transmits the file data to the user's client. Because the storage media in the index cluster are all high-speed storage media, the query completes in a very short time, which gives the user a better reading experience. In addition, operation and maintenance personnel maintaining the distributed storage system can obtain the storage location of each file more quickly, which improves operation and maintenance efficiency.
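The read path can be sketched as a lookup followed by a fetch. The sketch assumes the index maps a file ID directly to a cluster identifier and models a failed main cluster as `None`; the function name `read_file` and the dictionaries are hypothetical stand-ins for the proxy server logic and the clusters.

```python
def read_file(file_id, index_cluster, main, standby):
    """Resolve the cluster through the index, then fetch the file data."""
    cluster_id = index_cluster[file_id]   # lookup by file ID in the index
    cluster = main.get(cluster_id)        # None models a failed cluster
    if cluster is not None and file_id in cluster:
        return cluster[file_id]
    return standby[cluster_id][file_id]   # fall back to the mirror image


index_cluster = {"file-001": 1}
main = {0: {}, 1: {"file-001": b"hello"}}
standby = {0: {}, 1: {"file-001": b"hello"}}
data = read_file("file-001", index_cluster, main, standby)
```

Only one index lookup is needed per read, so the speed of the index cluster bounds the time spent locating a file.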
a database management system 604 used for storing and managing the capacity utilization information of each cluster in the storage cluster array and for dynamically calculating and storing default weight coefficients from that information.
In one embodiment, the database management system is MySQL. Operation and maintenance staff can set the weight coefficient of each cluster as the actual situation requires and store it in MySQL. When MySQL holds both the default weight coefficient it generated and a user weight coefficient set by a user, the server preferentially obtains the user weight coefficient as the weight coefficient used by the algorithm.
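The precedence between the two kinds of weight coefficient can be sketched as follows. The formula for the default weight is an assumption (proportional to the share of capacity still free); the patent only states that the default weight is derived from capacity utilization. The MySQL storage itself is omitted, and the names `default_weight` and `effective_weights` are hypothetical.

```python
def default_weight(remaining_bytes, total_bytes, scale=100):
    """Assumed formula: the weight grows with the share of free capacity."""
    if total_bytes <= 0:
        raise ValueError("total capacity must be positive")
    return round(scale * remaining_bytes / total_bytes)


def effective_weights(capacity, user_weights):
    """Prefer user-set weights; otherwise fall back to generated defaults.

    capacity maps cluster_id -> (remaining_bytes, total_bytes).
    user_weights maps cluster_id -> weight, and is empty when none are set.
    """
    if user_weights:
        return dict(user_weights)
    return {cid: default_weight(rem, tot) for cid, (rem, tot) in capacity.items()}


capacity = {0: (300, 1000), 1: (800, 1000)}
generated = effective_weights(capacity, {})               # defaults only
overridden = effective_weights(capacity, {0: 1, 1: 9})    # user weights win
```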
Compared with the prior art, this embodiment integrates multiple Ceph distributed storage clusters into a Ceph distributed storage cluster array, which solves the problem that storage cannot be expanded when a single Ceph distributed storage cluster runs short of storage resources. Because no single Ceph cluster has to be expanded, no data rebalancing is needed inside any single cluster, so storage expansion is imperceptible to users. In addition, when the current storage usage of the clusters is unbalanced, different weight coefficients are set for the clusters so that file data becomes evenly distributed across the clusters as subsequent files are written.
It should be noted that this embodiment is a system embodiment corresponding to the first, second, and third embodiments, and may be implemented in cooperation with them. The related technical details mentioned in the first, second, and third embodiments remain valid in this embodiment and are not repeated here. Accordingly, the related technical details mentioned in this embodiment can also be applied to the first, second, and third embodiments.
It should be noted that each module referred to in this embodiment is a logical module. In practical applications, one logical unit may be one physical unit, a part of one physical unit, or a combination of multiple physical units. In addition, to highlight the innovative part of the present invention, units that are not closely related to solving the technical problem addressed by the present invention are not introduced in this embodiment; this does not mean that no other units exist in this embodiment.
A sixth embodiment of the present invention relates to an electronic device, as shown in fig. 7, including at least one processor 701; and, a memory 702 communicatively coupled to the at least one processor 701; the memory 702 stores instructions executable by the at least one processor 701, and the instructions are executed by the at least one processor 701 to enable the at least one processor 701 to execute the distributed storage method according to the first, second, or third embodiment.
The memory 702 and the processor 701 are connected by a bus. The bus may comprise any number of interconnected buses and bridges that link together various circuits of the one or more processors 701 and the memory 702. The bus may also connect various other circuits, such as peripherals, voltage regulators, and power management circuits, which are well known in the art and are therefore not described further here. A bus interface provides an interface between the bus and a transceiver. The transceiver may be one element or multiple elements, such as multiple receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. Data processed by the processor 701 is transmitted over a wireless medium via an antenna; the antenna also receives data and passes it to the processor 701. The processor 701 is responsible for managing the bus and for general processing, and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. The memory 702 may be used to store data that the processor 701 uses when performing operations.
A seventh embodiment of the present invention relates to a computer-readable storage medium storing a computer program. The computer program realizes the above-described method embodiments when executed by a processor.
That is, as those skilled in the art can understand, all or part of the steps of the methods in the above embodiments may be implemented by a program instructing the relevant hardware. The program is stored in a storage medium and includes several instructions that cause a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
Those of ordinary skill in the art will understand that the foregoing embodiments are specific examples for carrying out the invention, and that in practical applications various changes in form and detail may be made to them without departing from the spirit and scope of the invention.
Claims (10)
1. A distributed storage method, applied to a server, comprising:
receiving a write request of a file;
acquiring a weight coefficient of each cluster in a storage cluster array; the weight coefficient is generated according to the current storage resource of the cluster;
determining a target cluster for writing the file according to the weight coefficient of each cluster;
and writing the file into the target cluster.
2. The distributed storage method of claim 1,
the weight coefficients comprise default weight coefficients;
the default weight coefficient is generated by a database management system in real time according to the current residual storage capacity of the cluster and the network speed of a machine room where the cluster is located;
the default weight coefficient is positively correlated with the remaining storage capacity or the network speed.
3. The distributed storage method of claim 2,
the weight coefficients further comprise user weight coefficients; the user weight coefficient is set by a user according to actual conditions and is stored in the database management system;
when the weight coefficient of each cluster in the storage cluster array is obtained, judging whether the user weight coefficient exists or not;
and if so, acquiring the user weight coefficient.
4. The distributed storage method of claim 1,
the storage cluster array comprises a main cluster array and a standby cluster array;
each cluster in the main cluster array corresponds to each cluster in the standby cluster array;
the target cluster is a cluster in the main cluster array;
and when the target cluster fails, writing the file into a cluster in the standby cluster array corresponding to the target cluster.
5. The distributed storage method according to any one of claims 1 to 4, comprising, after said writing the file in the target cluster:
generating index information consisting of the identification of the file and the identification of the target cluster;
storing the index information into an index cluster; wherein the index cluster is built by a high-speed storage medium.
6. A server, comprising:
a request receiving module, configured to receive a write request for a file;
a computing module, configured to acquire a weight coefficient of each cluster in a storage cluster array and to determine, according to the weight coefficients, a target cluster for writing the file;
and a writing module, configured to write the file into the target cluster.
7. A distributed storage system, comprising:
a storage cluster array comprising N clusters, wherein N is a natural number greater than 1;
the server of claim 6, configured to store files uploaded by users in the clusters of the storage cluster array.
8. The distributed storage system according to claim 7, further comprising:
an index cluster, configured to store index information of the files in the storage cluster array;
and a database management system, configured to store and manage current storage resource information of each cluster in the storage cluster array and to dynamically calculate and store the default weight coefficient according to the storage resource information.
9. An electronic device, comprising:
at least one processor; and,
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to perform the distributed storage method of any of claims 1 to 5.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the distributed storage method of any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910857800.8A CN110636122A (en) | 2019-09-11 | 2019-09-11 | Distributed storage method, server, system, electronic device, and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110636122A true CN110636122A (en) | 2019-12-31 |
Family
ID=68971036
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910857800.8A Pending CN110636122A (en) | 2019-09-11 | 2019-09-11 | Distributed storage method, server, system, electronic device, and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110636122A (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100023621A1 (en) * | 2008-07-24 | 2010-01-28 | Netapp, Inc. | Load-derived probability-based domain name service in a network storage cluster |
CN101997884A (en) * | 2009-08-18 | 2011-03-30 | 升东网络科技发展(上海)有限公司 | Distributed storage system and method |
CN106527981A (en) * | 2016-10-31 | 2017-03-22 | 华中科技大学 | Configuration-based data fragmentation method for adaptive distributed storage system |
CN108011929A (en) * | 2017-11-14 | 2018-05-08 | 平安科技(深圳)有限公司 | Data request processing method, apparatus, computer equipment and storage medium |
CN108600316A (en) * | 2018-03-23 | 2018-09-28 | 深圳市网心科技有限公司 | Data managing method, system and the equipment of cloud storage service |
CN108614837A (en) * | 2016-12-13 | 2018-10-02 | 杭州海康威视数字技术股份有限公司 | File stores and the method and device of retrieval |
CN108763436A (en) * | 2018-05-25 | 2018-11-06 | 福州大学 | A kind of distributed data-storage system based on ElasticSearch and HBase |
CN108875035A (en) * | 2018-06-25 | 2018-11-23 | 郑州云海信息技术有限公司 | The date storage method and relevant device of distributed file system |
CN109343801A (en) * | 2018-10-23 | 2019-02-15 | 深圳前海微众银行股份有限公司 | Date storage method, equipment and computer readable storage medium |
CN109597567A (en) * | 2017-09-30 | 2019-04-09 | 网宿科技股份有限公司 | A kind of data processing method and device |
CN110109886A (en) * | 2018-02-01 | 2019-08-09 | 中兴通讯股份有限公司 | The file memory method and distributed file system of distributed file system |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111131457A (en) * | 2019-12-25 | 2020-05-08 | 上海交通大学 | Capacity and bandwidth compromise method and system for heterogeneous distributed storage |
CN113110796A (en) * | 2020-01-13 | 2021-07-13 | 顺丰科技有限公司 | Data management method, device, server and storage medium |
CN111562884A (en) * | 2020-04-28 | 2020-08-21 | 北京奇艺世纪科技有限公司 | Data storage method and device and electronic equipment |
CN111562884B (en) * | 2020-04-28 | 2023-10-27 | 北京奇艺世纪科技有限公司 | Data storage method and device and electronic equipment |
CN111736762A (en) * | 2020-05-21 | 2020-10-02 | 平安国际智慧城市科技股份有限公司 | Synchronous updating method, device, equipment and storage medium of data storage network |
CN111736762B (en) * | 2020-05-21 | 2023-04-07 | 平安国际智慧城市科技股份有限公司 | Synchronous updating method, device, equipment and storage medium of data storage network |
CN111767250A (en) * | 2020-06-10 | 2020-10-13 | 钛星投资(深圳)有限公司 | Decentralized storage method, downloading method and storage system |
CN112637327A (en) * | 2020-12-21 | 2021-04-09 | 北京奇艺世纪科技有限公司 | Data processing method, device and system |
CN113721855A (en) * | 2021-09-01 | 2021-11-30 | 中国建设银行股份有限公司 | Storage method and device of storage resources, electronic equipment and computer storage medium |
CN114089917A (en) * | 2021-11-19 | 2022-02-25 | 中国电信集团系统集成有限责任公司 | Distributed object storage cluster, capacity expansion method and device thereof, and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20191231 |