WO2020063763A1 - Data storage method, apparatus, system, server, control node, and medium - Google Patents

Data storage method, apparatus, system, server, control node, and medium

Info

Publication number
WO2020063763A1
Authority
WO
WIPO (PCT)
Prior art keywords
split
data
shard
storage
server
Prior art date
Application number
PCT/CN2019/108186
Other languages
English (en)
French (fr)
Inventor
李坚 (Li Jian)
王文姝 (Wang Wenshu)
Original Assignee
北京金山云网络技术有限公司 (Beijing Kingsoft Cloud Network Technology Co., Ltd.)
北京金山云科技有限公司 (Beijing Kingsoft Cloud Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京金山云网络技术有限公司 and 北京金山云科技有限公司
Priority to US17/281,466 (US11385830B2)
Publication of WO2020063763A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 - Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 - Interfaces specially adapted for storage systems
    • G06F 3/0602 - Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/0604 - Improving or facilitating administration, e.g. storage management
    • G06F 3/0614 - Improving the reliability of storage systems
    • G06F 3/0617 - Improving the reliability of storage systems in relation to availability
    • G06F 3/0628 - Interfaces specially adapted for storage systems making use of a particular technique
    • G06F 3/0638 - Organizing or formatting or addressing of data
    • G06F 3/064 - Management of blocks
    • G06F 3/0644 - Management of space entities, e.g. partitions, extents, pools
    • G06F 3/0655 - Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F 3/0668 - Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F 3/067 - Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Definitions

  • the present application relates to the field of data storage technologies, and in particular, to a data storage method, device, system, server, control node, and medium.
  • Whether data to be stored falls within a shard's data storage range can be determined directly from the shard's data storage range; a consistent hashing algorithm can also be used to determine the shard within whose data storage range the data to be stored falls.
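As an illustration of the two routing approaches just mentioned, here is a minimal Python sketch; the key ranges, ring points, and shard identifiers are all hypothetical:

```python
import bisect
import hashlib

def shard_for_key_by_range(key, ranges):
    """Range routing: pick the shard whose [start_key, next_start) covers `key`.
    `ranges` is a list of (start_key, shard_id) pairs sorted by start_key."""
    starts = [start for start, _ in ranges]
    idx = bisect.bisect_right(starts, key) - 1
    return ranges[idx][1]

def shard_for_key_by_hash(key, ring):
    """Consistent hashing: walk clockwise from hash(key) to the next shard.
    `ring` is a list of (point, shard_id) pairs sorted by point."""
    point = int(hashlib.sha1(key.encode()).hexdigest(), 16) % (2 ** 32)
    points = [p for p, _ in ring]
    idx = bisect.bisect_right(points, point) % len(ring)
    return ring[idx][1]
```

With range routing, a split only changes the boundaries of two entries; with consistent hashing, a new shard takes over one arc of the ring.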
  • the server can include a master server and one or more slave servers.
  • a server can be the master server of one shard and, at the same time, a slave server of another shard.
  • an increase in the amount of data and/or in the request pressure on a shard in a distributed storage system may leave the original shard unable to provide sufficient service capability; when this happens, the data in the shard needs to be stored into multiple sub-shards created for that shard.
  • In a known data storage method, the control node in the distributed storage system determines the target shard that needs to be split and stored; changes the state of the target shard from writable to non-writable; splits the data in the target shard to obtain multiple pieces of split data; determines the server that will store each piece of split data; and sends split instructions to the determined servers. The determined servers create sub-shards of the target shard, determine the data storage range of each sub-shard, and store the pieces of split data into the corresponding sub-shards according to those ranges.
  • the above method will be performed once for each target shard.
  • In this method, the state of the target shard is set to a non-writable state, so the target shard cannot accept write operations during the data storage process and write requests for it are rejected, which reduces the availability of the distributed storage system.
  • This application provides a data storage method, device, system, server, control node, and medium to improve the availability of the storage system. Specific technical solutions are as follows:
  • an embodiment of the present application provides a data storage method, which is applied to a master server to which a slice to be split belongs, and the method includes:
  • after receiving the split storage instruction, sending the split storage instruction to each of the slave servers, so that each slave server splits and stores the fragment to be split according to the data split point to obtain a target fragment and, after obtaining the target fragment, sends a first message to the master server, where the first message is a message that split storage is completed;
  • when the number of received first messages is greater than a second number threshold, splitting and storing the fragment to be split according to the data split point to obtain the target fragment.
  • the method further includes: when the data stored in the target fragment is the same as the data stored in the fragment to be split, obtaining the association relationship between each fragment and its data storage range, and determining, according to the obtained relationships, whether any fragment stored by the server itself contains data that is not within that fragment's data storage range; if so, deleting the data that is not within the fragment's data storage range.
  • the method further includes:
  • obtaining a data storage request for the fragment to be split, where the data storage request includes data to be stored; when the split storage instruction has been received, determining whether the first message has been sent to the control node; if so, sending the data storage request to the slave servers, so that each slave server determines, from the target fragment and the fragment to be split after split storage, the fragment that is to store the data to be stored, stores the data to be stored in the determined fragment, and sends a second message to the master server; when the number of received second messages is greater than a second number threshold, determining, from the target fragment and the fragment to be split after split storage, the fragment that is to store the data to be stored, and storing the data to be stored in the determined fragment; if not, executing the step of sending the data storage request to the slave servers after sending the first message to the control node.
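The write path described above can be sketched as follows; the class, flag, and shard names are illustrative assumptions, not the actual implementation:

```python
class StubSlave:
    """Stand-in replica: every write succeeds, i.e. returns a 'second message'."""
    def store(self, shard, key, value):
        return True

class Master:
    """Sketch of the master server's write path around a split."""
    def __init__(self, slaves, second_number_threshold):
        self.slaves = slaves
        self.threshold = second_number_threshold
        self.split_received = False      # split storage instruction received?
        self.first_message_sent = False  # completion reported to control node?
        self.log = []

    def handle_store(self, key, value):
        if self.split_received and not self.first_message_sent:
            # Report split completion to the control node before serving writes.
            self.log.append("first_message_to_control_node")
            self.first_message_sent = True
        # Before the split, the write goes to the shard being split; after it,
        # to whichever of {target shard, remaining shard} covers the key.
        shard = "covering_shard" if self.split_received else "shard_to_split"
        acks = sum(s.store(shard, key, value) for s in self.slaves)
        if acks > self.threshold:  # enough "second messages" from slaves
            self.log.append((key, shard))
```

The key point illustrated is that writes are never rejected: they are merely rerouted once the split storage instruction arrives.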
  • an embodiment of the present application further provides a data storage method, which is applied to a control node.
  • the method includes:
  • sending a preparation-for-splitting instruction for the fragment to be split to the master server to which it belongs; after receiving a data split point, sending a split storage instruction to the master server, so that the master server, after receiving the split storage instruction, sends it to each of the slave servers and, when the number of received first messages is greater than a second number threshold, splits and stores the fragment to be split according to the data split point to obtain the target fragment.
  • the method further includes:
  • after receiving the first message sent by the master server, determining, according to the data split point and the data storage range of the fragment to be split before split storage, the data storage range of the target fragment and the target data storage range; updating the pre-recorded data storage range of the fragment to be split to the target data storage range; and recording the association between the target fragment and the data storage range determined for it, where the target data storage range is the data storage range of the fragment to be split after split storage.
  • the method further includes:
  • sending a collection instruction for the data storage range of each shard to each server, so that each server, after receiving the collection instruction, obtains the data storage range of each shard it stores and sends the obtained data storage ranges to the control node; where the servers include the master server and the slave servers;
  • an embodiment of the present application provides a data storage device, which is applied to a master server to which a slice to be split belongs, and the device includes:
  • a receiving module configured to receive a preparation split instruction for the fragment to be split sent by a control node
  • the first sending module is configured to send the preparation-for-splitting instruction to the slave servers to which the fragment to be split belongs, so that each slave server, after receiving the instruction, obtains a data split point of the fragment to be split and sends the data split point to the master server;
  • a second sending module configured to send the data split point to the control node when the number of received data split points is greater than a first number threshold, so that the control node sends a split storage instruction to the master server;
  • a third sending module configured to send the split storage instruction to each of the slave servers after it is received, so that each slave server splits and stores the fragment to be split according to the data split point to obtain a target fragment and, after obtaining the target fragment, sends a first message to the master server, where the first message is a message that split storage is completed;
  • a splitting module configured to split and store the fragment to be split according to the data split point to obtain the target fragment when the number of received first messages is greater than a second number threshold.
  • the device further includes:
  • a fourth sending module is configured to send the first message to the control node, so that the control node, after receiving the first message, determines the data storage range of the target shard and the target data storage range according to the data split point and the data storage range of the shard to be split before split storage; updates the pre-recorded data storage range of the shard to be split to the target data storage range; and records the association between the target shard and the data storage range determined for it, where the target data storage range is the data storage range of the shard to be split after split storage.
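The range bookkeeping these modules perform can be made concrete with a small helper: the data split point divides the shard's original range into the target shard's range and the target data storage range. Which half is kept by the split shard is an assumption made for illustration:

```python
def split_ranges(original_range, split_point):
    """Derive the two ranges produced by a split.
    Returns (target_data_storage_range, target_shard_range); assigning the
    lower half to the split shard is an illustrative assumption."""
    lo, hi = original_range
    if not (lo < split_point < hi):
        raise ValueError("split point must fall strictly inside the range")
    return (lo, split_point), (split_point, hi)
```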
  • the device further includes:
  • a first obtaining module configured to obtain an association relationship between each fragment and a data storage range when the data stored in the target fragment is the same as the data stored in the fragment to be split;
  • a first determining module configured to determine, according to the obtained association relationship, whether each slice stored by itself stores data that is not within a data storage range of the slice;
  • the deleting module is configured to delete the data that is not within a shard's data storage range when the determination result of the first determining module is yes.
  • the device further includes:
  • a second obtaining module configured to obtain a data storage request for the shard to be split, wherein the data storage request includes data to be stored;
  • a fifth sending module is configured to send the data storage request to the slave servers when the split storage instruction has not been received, so that each slave server stores the data to be stored in the fragment to be split and sends a second message to the master server, where the second message is a message that data storage is completed;
  • a first storage module configured to store the data to be stored in the fragment to be split when the number of received second messages is greater than a second number threshold
  • a second determining module configured to determine, when the split storage instruction is received, whether the first message has been sent to the control node;
  • the sixth sending module is configured to send the data storage request to the slave servers when the determination result of the second determining module is yes, so that each slave server determines, from the target shard and the shard to be split after split storage, the shard that is to store the data to be stored, stores the data to be stored in the determined shard, and sends the second message to the master server;
  • a second storage module configured to determine, when the number of received second messages is greater than a second number threshold, from the target shard and the shard to be split after split storage, the shard that is to store the data to be stored, and to store the data to be stored in the determined shard;
  • the sixth sending module is further configured to, when the determination result of the second determining module is no, send the data storage request to the slave servers after the first message is sent to the control node.
  • an embodiment of the present application provides a data storage device, which is applied to a control node.
  • the device includes:
  • a seventh sending module is configured to send a preparation-for-splitting instruction for the shard to be split to the master server to which the shard belongs, so that the master server, after receiving the instruction, sends it to the slave servers to which the shard belongs and, when the number of received data split points is greater than a first number threshold, sends a data split point to the control node;
  • an eighth sending module is configured to send a split storage instruction to the master server after receiving the data split point, so that the master server, after receiving the split storage instruction, sends it to each of the slave servers and, when the number of received first messages is greater than a second number threshold, splits and stores the shard to be split according to the data split point to obtain the target shard.
  • the device further includes:
  • the determining module is configured to, after receiving the first message sent by the master server, determine the data storage range corresponding to the target shard and the target data storage range according to the data split point and the data storage range of the shard to be split before split storage;
  • the first modification module is configured to update the pre-recorded data storage range of the shard to be split to the target data storage range, and to record the association between the target shard and the data storage range determined for it, where the target data storage range is the data storage range of the shard to be split after split storage.
  • the device further includes:
  • the ninth sending module is configured to send a collection instruction for the data storage range of each fragment to each server, so that each server, after receiving the collection instruction, obtains the data storage range of each fragment it stores and sends the obtained data storage ranges to the control node; where the servers include the master server and the slave servers;
  • the third judgment module is configured to, after receiving the data storage range sent by each server, determine, for each slice, whether the state recorded in advance for the slice is a split storage state;
  • a fourth judgment module configured to determine, if the judgment result of the third judgment module is yes, whether the data storage range of the segment recorded in advance is the same as the data storage range of the obtained segment;
  • the tenth sending module is configured to send a split storage instruction to the master server to which the fragment belongs when the determination result of the fourth judgment module is that the ranges are the same;
  • the second modification module is configured to change the pre-recorded state of the fragment to a split-storage-completed state when the determination result of the fourth judgment module is that the ranges are different.
  • an embodiment of the present application provides a data storage system.
  • the system includes a control node, a master server to which a shard to be split belongs, and a slave server to which a shard to be split belongs.
  • the control node is configured to send a split preparation instruction for the fragment to be split to the main server;
  • the master server is configured to send the preparation for splitting instruction to the slave server to which the fragment to be split belongs after receiving the preparation for splitting instruction;
  • the slave server is configured to, after receiving the preparation for splitting instruction, obtain a data splitting point of the fragment to be split, and send the data splitting point to the master server;
  • the main server is further configured to send the data splitting points to the control node when the number of received data splitting points is greater than a first number threshold;
  • the control node is further configured to send a split storage instruction to the main server after receiving the data split point sent by the main server;
  • the master server is further configured to send the split storage instruction to each of the slave servers after receiving the split storage instruction;
  • the slave server is further configured to split and store the fragment to be split according to the data split point to obtain a target fragment and, after obtaining the target fragment, send a first message to the master server;
  • the main server is further configured to, when the number of received first messages is greater than a second number threshold, split and store the fragment to be split according to the data split point to obtain the target fragment.
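The message flow of the system just described can be sketched end to end; every class and method name below is a hypothetical stand-in for the instructions and messages in the claims, with the first and second number thresholds passed in as parameters:

```python
class SlaveStub:
    """A slave server: proposes a data split point, then performs split storage."""
    def compute_split_point(self):
        return "m"                      # e.g. the shard's median key
    def split_and_store(self, point):
        return True                     # success = the "first message"

class MasterStub:
    def __init__(self):
        self.split_done = False
    def split_and_store(self, point):
        self.split_done = True

class ControlStub:
    def __init__(self):
        self.split_point = None
    def receive_split_point(self, point):
        self.split_point = point

def run_split(control, master, slaves, first_threshold, second_threshold):
    # 1. Control node -> master -> slaves: prepare to split; each slave
    #    replies with a data split point.
    points = [s.compute_split_point() for s in slaves]
    if len(points) <= first_threshold:
        return  # not enough split points reported yet
    # 2. Master forwards a split point to the control node, which answers
    #    with the split storage instruction (collapsed into one step here).
    control.receive_split_point(points[0])
    # 3. Master -> slaves: perform split storage; each success is a
    #    "first message" back to the master.
    first_messages = sum(s.split_and_store(points[0]) for s in slaves)
    # 4. Master splits its own copy once enough slaves have finished.
    if first_messages > second_threshold:
        master.split_and_store(points[0])
```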
  • the main server is further configured to send the first message to the control node;
  • the control node is further configured to, after receiving the first message, determine the data storage range of the target fragment and the target data storage range according to the data split point and the data storage range of the fragment to be split before split storage; update the pre-recorded data storage range of the fragment to be split to the target data storage range; and record the association between the target fragment and the data storage range determined for it, where the target data storage range is the data storage range of the fragment to be split after split storage.
  • the main server is further configured to, when the data stored in the target shard is the same as the data stored in the shard to be split, obtain the association relationship between each shard and its data storage range; determine, according to the obtained relationships, whether any shard it stores contains data that is not within that shard's data storage range; and if so, delete the data that is not within the shard's data storage range.
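The cleanup step above amounts to filtering each stored shard by its data storage range; a minimal sketch, assuming string keys and half-open (lo, hi) ranges:

```python
def drop_out_of_range(shard_data, data_range):
    """Delete entries whose keys fall outside the shard's data storage range.
    `shard_data` maps key -> value; `data_range` is a half-open (lo, hi)."""
    lo, hi = data_range
    return {k: v for k, v in shard_data.items() if lo <= k < hi}
```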
  • the main server is further configured to obtain a data storage request for the shard to be split, where the data storage request includes data to be stored, and to send the data storage request to the slave servers when the split storage instruction has not been received;
  • the slave server is further configured to store the data to be stored to the shard to be split; and send a second message to the master server; wherein the second message is a message that data storage is completed;
  • the main server is further configured to store the data to be stored in the shard to be split when the number of received second messages is greater than a second number threshold;
  • the main server is further configured to determine, when the split storage instruction is received, whether the first message has been sent to the control node, and if so, to send the data storage request to the slave servers;
  • the slave server is further configured to determine, from the target shard and the shard to be split after the split storage, the shard to store the data to be stored, and store the data to be stored in the determined shard. Slice, and send the second message to the master server;
  • the main server is further configured to, when the number of received second messages is greater than the second number threshold, determine, from the target fragment and the fragment to be split after split storage, the fragment that is to store the data to be stored, and store the data to be stored in the determined fragment; if not, execute the step of sending the data storage request to the slave servers after sending the first message to the control node.
  • the control node is further configured to, after receiving the first message sent by the master server, determine the data storage range corresponding to the target shard and the target data storage range according to the data split point and the data storage range of the shard to be split before split storage; update the pre-recorded data storage range of the shard to be split to the target data storage range; and record the association between the target shard and the data storage range determined for it, where the target data storage range is the data storage range of the shard to be split after split storage.
  • the control node is further configured to send a collection instruction for the data storage range of each fragment to all servers, where the servers include the master server and the slave servers;
  • the server is configured to, after receiving the collection instruction, obtain the data storage range of the fragment stored by itself, and send the obtained data storage range to the control node;
  • the control node is further configured to, after receiving the data storage ranges sent by all servers, determine, for each shard, whether the pre-recorded state of the shard is the split storage state; if it is, determine whether the pre-recorded data storage range of the shard is the same as the obtained data storage range; if they are the same, send a split storage instruction to the master server to which the shard belongs; if they are not the same, change the pre-recorded state of the shard to the split-storage-completed state.
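This reconciliation loop can be expressed as a pure function over the control node's recorded state and the ranges collected from the servers; the state strings and action names are illustrative:

```python
def reconcile(recorded, collected):
    """recorded: shard_id -> (state, data_range) as remembered by the control
    node; collected: shard_id -> data_range reported by the servers.
    Returns the actions the control node should take."""
    actions = []
    for shard_id, (state, recorded_range) in recorded.items():
        if state != "split_storage":
            continue  # only shards mid-split need reconciliation
        if collected.get(shard_id) == recorded_range:
            # Range unchanged: the split never took effect; resend it.
            actions.append(("resend_split_instruction", shard_id))
        else:
            # Range changed: the split completed; record completion.
            actions.append(("mark_split_complete", shard_id))
    return actions
```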
  • an embodiment of the present application further provides a main server, including a processor and a memory, where:
  • the memory is configured to store a computer program;
  • the processor is configured to implement the steps of the data storage method applied to the main server provided in the embodiment of the present application when the program stored in the memory is executed.
  • an embodiment of the present application further provides a control node, including a processor and a memory, where:
  • the memory is configured to store a computer program;
  • the processor is configured to implement the steps of the data storage method applied to the control node provided in the embodiment of the present application when the program stored in the memory is executed.
  • an embodiment of the present application further provides a computer-readable storage medium.
  • a computer program is stored in the computer-readable storage medium.
  • when the computer program is executed by a processor, the steps of the data storage method applied to the master server provided in the embodiments of the present application are implemented.
  • an embodiment of the present application further provides a computer-readable storage medium.
  • a computer program is stored in the computer-readable storage medium.
  • when the computer program is executed by a processor, the steps of the data storage method applied to the control node provided in the embodiments of the present application are implemented.
  • an embodiment of the present application provides a computer program product containing instructions, which when executed on a computer, causes the computer to execute a data storage method applied to a main server.
  • an embodiment of the present application provides a computer program product containing instructions, which when executed on a computer, causes the computer to execute a data storage method applied to a control node.
  • an embodiment of the present application provides a computer program that, when run on a computer, causes the computer to execute a data storage method applied to a main server.
  • an embodiment of the present application provides a computer program that, when run on a computer, causes the computer to execute a data storage method applied to a control node.
  • In the data storage method, device, system, server, control node, and computer-readable storage medium provided in the embodiments of the present application, the control node sends a preparation-for-splitting instruction to the master server to which the shard to be split belongs, and the master server forwards it to the slave servers to which the shard belongs; each slave server obtains a data split point of the shard to be split and sends it to the master server. When the number of data split points received by the master server is greater than the first number threshold, the master server sends a data split point to the control node; after receiving the data split point, the control node sends a split storage instruction, and the master server and the slave servers split and store the shard to be split according to the data split point. Each slave server, after obtaining the target shard, sends a first message to the master server, and when the number of received first messages is greater than the second number threshold, the master server also splits and stores the shard according to the data split point. Because the shard to be split is not set to a non-writable state during this process, the availability of the distributed storage system is improved.
  • FIG. 1 is a schematic flowchart of a data storage method according to an embodiment of the present application
  • FIG. 2 is another schematic flowchart of a data storage method according to an embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of a data storage device according to an embodiment of the present application.
  • FIG. 4 is another schematic structural diagram of a data storage device according to an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a data storage system according to an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a master server according to an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a control node according to an embodiment of the present application.
  • The embodiments of the present application provide a data storage method, device, system, server, control node, and computer-readable storage medium. The data storage method applied to the master server to which the shard to be split belongs is described first.
  • The data storage method provided in the embodiments of the present application is applied to the master server to which the shard to be split belongs. The shard to be split is the shard on which split storage is to be performed, and a server is a master server only relative to a particular shard; the same server may be a slave server for another shard.
  • the main server to which the target shard belongs may be a server in KTS (Kingsoft Table Service).
  • KTS is a fully managed NoSQL (Not Only SQL) database service that provides storage of, and real-time access to, massive structured and semi-structured data.
  • the master server to which the shard to be split belongs may be elected, using the Raft algorithm, from among the servers storing that shard.
  • The Raft algorithm is a consensus (consistency) algorithm designed to be easy to understand.
  • FIG. 1 is a first schematic flowchart of a data storage method according to an embodiment of the present application; the method includes:
  • S101: Receive a preparation-for-splitting instruction, sent by the control node, for the fragment to be split.
  • Each shard corresponds to a data storage range, and different shards have different data storage ranges; that is, different shards store different data. Users can access the data stored in each shard, and may access different shards at the same or at different frequencies; that is, each shard corresponds to an access frequency.
  • a fragment with an access frequency greater than a preset access frequency may be determined as a fragment to be split.
  • the size of the preset access frequency may be determined according to the actual situation, and the size of the preset access frequency is not specifically limited in the embodiment of the present application.
  • after receiving a split storage request sent by the master server to which the shard to be split belongs, the control node can generate a preparation for splitting instruction; the control node can also generate a preparation for splitting instruction after receiving a manually sent split storage request. This is explained in detail below.
  • the control node may generate a split preparation instruction.
  • the master server to which a shard belongs can detect whether the amount of data stored by the shard is greater than a preset storage threshold; when it is, the server to which the shard belongs can send a split storage request to the control node. After receiving the split storage request sent by the server to which the shard belongs, the control node generates a preparation for splitting instruction.
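A minimal sketch of this trigger, with assumed names and an illustrative threshold value (none of which appear in the original), might look like:

```python
# Hedged sketch: a master server checks each shard it leads and sends a
# split storage request to the control node when a shard's stored data
# exceeds a preset storage threshold. Names and values are illustrative.

PRESET_STORAGE_THRESHOLD_MB = 10 * 1024  # assumed threshold: 10 GB in MB

def check_and_request_split(shard_sizes_mb, send_split_request):
    """shard_sizes_mb: mapping of shard id -> stored data size in MB.
    send_split_request: callable that notifies the control node (e.g. an RPC).
    Returns the list of shard ids for which a request was sent."""
    requested = []
    for shard_id, size_mb in shard_sizes_mb.items():
        if size_mb > PRESET_STORAGE_THRESHOLD_MB:
            send_split_request(shard_id)  # notify the control node
            requested.append(shard_id)
    return requested
```

For example, with shard sizes `{"shard-1": 15000, "shard-2": 800}` only `shard-1` exceeds the assumed 10240 MB threshold and triggers a request.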
  • the control node may generate a preparation for splitting instruction after receiving a manually sent split storage request. Specifically, when a storage system worker finds that a shard whose access frequency is greater than the preset access frequency exists, the worker can send a split storage request to the control node; after receiving the split storage request, the control node generates a preparation for splitting instruction.
  • the split storage request may include identification information of the fragment to be split, and the identification information may be a name of the fragment to be split, or an ID (Identification, identification number) of the fragment to be split.
  • after the control node generates the preparation for splitting instruction, it can send the instruction to the master server to which the shard to be split belongs, so that the master server can perform corresponding operations according to the instruction.
  • the main server may be elected by using the Raft algorithm.
  • Servers other than the master server that store the shards to be split serve as slave servers.
  • the master server can learn the identification information of other servers that store the shards to be split.
  • the identification information can be the name of the server or the IP (Internet Protocol) address.
  • the master server determines the slave server according to the previously obtained identification information of other servers storing the shards to be split, and sends the preparation for splitting instruction to the slave server.
  • S102: Send the preparation for splitting instruction to the slave servers to which the shard to be split belongs, so that each slave server, after receiving the instruction, obtains the data split point of the shard to be split and sends the data split point to the master server.
  • after the master server receives the preparation for splitting instruction, it can send the instruction to the slave servers of the shard to be split; after a slave server receives the instruction, it can obtain the data split point of the shard to be split.
  • the data splitting point may be carried in the preparation splitting instruction, and after receiving the preparation splitting instruction from the server, the data splitting point may be obtained from the preparation splitting instruction.
  • the slave server may determine a data split point according to a preset data splitting rule. For example, the slave server can obtain the data storage range of the shard to be split and the amount of data stored for each primary key in that range, and then determine the data split point from the data storage range and the per-primary-key data amounts according to the preset data splitting rule.
  • the data storage range of the shard to be split may be obtained directly from the preparation for splitting instruction; when the data storage range is not included in the instruction, the shard to be split can be scanned to obtain the data storage range.
  • the data storage range of the shards to be split can also be obtained by other methods.
  • the manner in which the server obtains the data storage range of the shard to be split is not specifically limited.
  • each piece of data in the shard to be split corresponds to a primary key, and each piece of data has a fixed size; therefore, the amount of data stored for each primary key can be determined by scanning the data stored in the shard to be split.
  • the data splitting rule may be: the amount of data in the data storage range of the shard obtained by the split is the same (or approximately the same) as the amount of data in the data storage range retained by the shard to be split.
  • for example, suppose the shard to be split is shard 1, the amount of data stored in shard 1 is 10G, and the primary key range of shard 1, which is its data storage range, is A-F. If the amount of data corresponding to the primary key interval A-C is 4.95G and the amount corresponding to the primary key interval D-F is 5.05G, then: if the data split point is preset to be the right endpoint of the primary key interval of one of the resulting shards, the data split point is primary key C; if it is preset to be the left endpoint of the primary key interval of the other resulting shard, the data split point is primary key D. In either case, the data storage range of the shard obtained by the split is D-F, and the data storage range of the shard to be split after the split is A-C.
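The balanced-split rule in the example above can be sketched as follows; the function name and the per-key sizes are assumptions for illustration, using the "right endpoint of the first interval" convention:

```python
# Hedged sketch of the preset data splitting rule: pick the split point so
# the data is divided into two roughly equal halves. Returns the last key of
# the largest prefix whose cumulative size does not exceed half of the total.

def find_split_point(key_sizes):
    """key_sizes: list of (primary_key, stored_size) sorted by primary key."""
    total = sum(size for _, size in key_sizes)
    running = 0
    split_key = key_sizes[0][0]
    for key, size in key_sizes:
        if running + size > total / 2:
            break  # adding this key would cross the midpoint
        running += size
        split_key = key
    return split_key
```

With per-key sizes in MB such that A-C holds 4950 and D-F holds 5050 (mirroring the 4.95G / 5.05G example), the split point comes out as primary key C.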
  • as another example, suppose the shard to be split is shard 2, the amount of data stored in shard 2 is 15G, and the primary key interval of shard 2 is H-N. If the primary key interval H-J corresponds to 4.95G, the interval K-L corresponds to 5.05G, and the interval M-N corresponds to 5G, then the data split points can be primary key J and primary key L.
  • a KeyValue type of storage is usually used; when data is stored, it is naturally sorted according to the lexicographic order of the Key.
  • Key is the primary key mentioned above.
  • the primary key is a candidate key chosen to uniquely identify each row of the table.
  • the primary key can consist of a single field or multiple fields, called a single-field primary key or a multi-field primary key, respectively.
  • a shard stores all data for the corresponding primary key range.
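As a small illustration of the lexicographic Key ordering described above (the sample keys are invented, not from the original):

```python
# Keys in a KeyValue store are kept in lexicographic (dictionary) order,
# which is why a shard can hold a contiguous primary-key interval.
records = {"banana": b"v2", "cherry": b"v3", "apple": b"v1"}

ordered_keys = sorted(records)  # lexicographic order of the Key
print(ordered_keys)             # ['apple', 'banana', 'cherry']
```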
  • the data split point carried in the preparation for splitting instruction may be a data split point in an instruction the control node received from a worker; the worker may determine the data split point according to the access pressure on the data corresponding to each primary key in the shard.
  • the obtained data split point needs to be sent to the master server, so that the master server performs the next operation after receiving the data split point.
  • the data split points obtained by the slave servers are all the same. When the number of data split points received by the master server is greater than the first number threshold, the master server can determine that a majority of the slave servers have sent their data split point to the master server, and in this case the subsequent steps are performed. This ensures the consistency of the data after the shard is split and stored on different servers.
  • the embodiment of the present application does not specifically limit the size of the first quantity threshold.
  • the first number threshold may be half the number of slave servers of the master server. For example, if the number of slave servers is five, the first number threshold may be three.
  • if the master server receives four data split points, this indicates that four slave servers have currently sent data split points to the master server, and the subsequent operations are performed.
  • if the number of data split points received by the master server is not greater than the first number threshold, the master server continues to wait until it times out.
  • S103: When the number of received data split points is greater than the first number threshold, the master server sends the data split point to the control node.
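The master's quorum check in this step can be sketched as follows (function and parameter names are assumptions; the threshold is supplied by the caller, e.g. three when there are five slave servers):

```python
# Hedged sketch: the subsequent steps run only when more than the first
# number threshold of identical data split points have been received.

def quorum_reached(received_points, first_number_threshold):
    """received_points: data split points received so far from slave servers.
    All received points are expected to be identical."""
    assert len(set(received_points)) <= 1, "slaves disagree on the split point"
    return len(received_points) > first_number_threshold
```

With five slaves and a threshold of three, four received split points reach quorum while a single one does not.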
  • one purpose of the master server sending the data split point to the control node is to notify the control node that the preparation for splitting has been completed, and to trigger the control node to send a split storage instruction to the master server.
  • when the control node receives the data split point, it means that the master server has responded to the preparation for splitting instruction; at this time, the control node needs to send the split storage instruction to the master server to trigger the master server to perform the subsequent steps.
  • S104: After receiving the split storage instruction, send the split storage instruction to each slave server, so that each slave server splits and stores the shard to be split according to the data split point to obtain the target shard and, after obtaining the target shard, sends a first message to the master server, where the first message is a message indicating that the split storage is completed.
  • after receiving the split storage instruction, the master server sends it to each slave server. After receiving the split storage instruction, a slave server splits and stores the shard to be split according to the data split point to obtain the target shard. The target shard mentioned here is the shard obtained after the shard to be split is split and stored. For example, hard-link copying can be used to perform the split storage, which can reduce the time required by the split storage process.
  • a hard link is one or more additional file names for a file; through hard links, multiple file names, in the same directory or in different directories, can be linked to one file. After the file is modified, all file names hard-linked to it reflect the modification. Splitting the shard by hard-link copying means that the data stored in the resulting shard is the same as the data stored in the shard to be split before the split storage; alternatively, the data may be stored in the resulting shard by data migration.
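A minimal sketch of hard-link copying using `os.link` (file names and contents are illustrative; this assumes a filesystem that supports hard links):

```python
# Hedged sketch: the target shard's file is created as a hard link to the
# shard-to-split's data file, so "copying" moves no data and both names
# refer to the same underlying bytes (same inode).
import os
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "shard_to_split.data")
dst = os.path.join(workdir, "target_shard.data")

with open(src, "wb") as f:
    f.write(b"shard contents")

os.link(src, dst)  # hard-link copy: instant, no data movement

# Both names point at the same file.
assert os.stat(src).st_ino == os.stat(dst).st_ino
```

Because no bytes are moved, the split completes quickly; the redundant data outside each shard's range is cleaned up afterwards, as described later.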
  • the data stored in the obtained shard can be the same as the data stored before the shard to be split and stored, or it can be the data stored before the shard to be split. Part of the data may be determined by a split storage method, which is not limited in the embodiment of the present application.
  • the slave server records the relationship between the data storage range and the data storage location of each shard it stores, that is, the relationship between a primary key interval and the storage location of the data corresponding to that interval. Based on this recorded relationship, the server determines from which location to read data and to which location to store data. After the split, one data storage range is also split into multiple ranges, and one data storage location can only correspond to one data storage range.
  • therefore, the relationship previously recorded for the shard to be split needs to be modified: the data storage range in the recorded relationship can be modified into one of the split data storage ranges, and for the remaining split data storage range, a data storage location is determined and the relationship between the determined location and that range is recorded.
  • for example, suppose the pre-recorded relationship is that the primary key interval A-C is stored at location C1, and after the split storage the intervals are primary key A and the primary key interval B-C. After the server modifies the relationship between interval A-C and location C1, it obtains the relationship between primary key A and location C1; if the data storage location of interval B-C is determined to be C2, the relationship between interval B-C and location C2 is recorded. This enables subsequent reads from, and writes to, the correct location.
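The bookkeeping in the A-C / C1 example can be sketched like this (the function name, tuple-based intervals, and location strings are all assumptions for illustration):

```python
# Hedged sketch of updating the recorded mapping from primary-key interval
# to storage location after split storage: the shard to be split keeps its
# old location with a narrowed interval; the remaining interval is recorded
# at a newly determined location.

def record_split(range_to_location, old_range, kept_range, new_range,
                 new_location):
    location = range_to_location.pop(old_range)   # drop the pre-split record
    range_to_location[kept_range] = location      # narrowed interval stays put
    range_to_location[new_range] = new_location   # remaining interval -> C2
    return range_to_location
```

Following the example above, interval ("A", "C") at "C1" becomes ("A", "A") at "C1" plus ("B", "C") at "C2".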
  • after the slave server obtains the target shard, completes the modification of the data storage ranges and data storage locations, and records the relationships between the data storage ranges and data storage locations obtained after the split, it sends a first message to the master server to notify the master server that its own shard split storage has been completed.
  • after each slave server completes the split storage, it sends the split-storage-completed message, that is, the first message, to the master server.
  • when the number of first messages received by the master server is greater than the second number threshold, the master server performs on the shard to be split that it stores itself the same split storage operation as the slave servers performed.
  • the second quantity threshold may be the same as or different from the first quantity threshold, and the size of the second quantity threshold is not specifically limited in the embodiment of the present application.
  • in summary, the control node sends a preparation for splitting instruction to the master server to which the shard to be split belongs, and the master server forwards the instruction to the slave servers to which the shard to be split belongs. Each slave server obtains the data split point and sends it to the master server; when the number of received data split points is greater than the first number threshold, the master server sends the data split point to the control node, and the control node sends a split storage instruction. The master server and the slave servers then split and store the shard to be split according to the data split point, each slave server sending a first message to the master server after completing the split storage, and the master server splitting and storing the shard to be split according to the data split point to obtain the target shard. Because each slave server obtains the data split point of the shard to be split and sends it to the master server, the master server and the slave servers split the shard based on the same data split point and obtain the same target shard. Moreover, the master server and the slave servers perform the split storage by themselves after obtaining the split storage instruction, without synchronous intervention by the control node; this reduces the dependence on the control node, avoids as far as possible downtime caused by performance bottlenecks of the control node, and can improve the reliability of the distributed storage system.
  • in an implementation of the present application, after the step of splitting and storing the shard to be split according to the data split point when the number of received first messages is greater than the second number threshold, and obtaining the target shard, the method may further include:
  • the target data storage range is the data storage range of the shard to be split after the split storage.
  • after the master server completes the split storage of the shard to be split, it can send the first message to the control node. After the control node receives the first message, it can determine that the master server and the slave servers have completed the split storage of the shard to be split. The control node then determines the data storage range of the target shard according to the data split point and the data storage range of the shard to be split before the split storage. For example, the control node first divides the pre-split data storage range of the shard to be split at the data split point to obtain the split data storage ranges; it can then select one of them as the data storage range of the shard to be split after the split storage, that is, the target data storage range.
  • the control node may update the data storage range in the previous association record for the shard to be split, that is, update the data storage range recorded for the shard to be split to the target data storage range; from the remaining split data storage ranges, the data storage range of the target shard is selected, and the association between the selected range and the target shard is recorded.
  • the method for selecting a data storage range from the split data storage range may be a random selection or a selection according to a preset selection rule.
  • for example, suppose the shard to be split is shard 3, the data storage range of shard 3 is the primary key interval P-T, and the obtained data split point is primary key R. After the split, one data storage range is the primary key interval P-R, and the other is the primary key interval S-T. One of these two ranges is the data storage range of the shard to be split after the split storage; which one is determined according to a preset rule. If the preset rule is that the range with the smaller right endpoint belongs to the shard to be split after the split storage, the range P-R is the data storage range of shard 3 after the split storage; if the preset rule is that the range with the larger right endpoint belongs to the shard to be split, the range S-T is the data storage range of shard 3 after the split storage.
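The shard 3 example can be sketched as follows; single-character primary keys and the function name are assumptions purely for illustration:

```python
# Hedged sketch: split a data storage range at the split point and apply a
# preset rule to decide which half stays with the shard to be split.

def split_ranges(storage_range, split_point, keep_smaller_right_end=True):
    """Returns (range kept by the shard to be split, range of target shard).
    Single-character keys assumed so the next key is chr(ord(k) + 1)."""
    low, high = storage_range
    left = (low, split_point)                  # e.g. P-R
    right = (chr(ord(split_point) + 1), high)  # e.g. S-T
    if keep_smaller_right_end:
        return left, right
    return right, left
```

With range ("P", "T") and split point "R", the "smaller right endpoint" rule keeps P-R for shard 3 and assigns S-T to the target shard; the opposite rule swaps the two.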
  • the data storage range of the shard to be split after the split storage is contained in, and smaller than, the data storage range of the shard to be split before the split storage.
  • updating the data storage range of the shard to be split after the split storage, and establishing the association between the target shard and the data storage range determined for the target shard, ensures that subsequent data can be correctly stored to the corresponding shard, and also ensures that the correct data can later be read according to a data read request.
  • the method may further include:
  • determining whether each shard stored by itself stores data that is not in the data storage range of the shard; if so, deleting the data that is not in the data storage range of the shard.
  • after the split storage, the data stored in the target shard is the same as the data stored in the shard to be split, indicating that redundant data exists in the target shard; likewise, the data stored in the shard to be split after the split storage is unchanged, indicating that redundant data is also stored in the shard to be split after the split.
  • the deletion method can be: first obtain the association relationship between shards and data storage ranges, where the control node may store these associations; then, according to the obtained associations, determine for each shard stored by itself whether it stores data not in that shard's data storage range, that is, whether redundant data is stored. For example, it can be determined whether the primary key of each piece of data stored in the shard is within the shard's data storage range.
  • the association relationship between shards and data storage ranges may be obtained periodically, or may be obtained after the split storage is completed. Deleting redundant data in a shard can save storage resources, reduce the possibility of splitting and storing the shard again because redundant data makes it too large, and reduce the frequency of split storage operations performed by the servers to which the shard belongs.
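A sketch of this pruning step, with an assumed in-memory representation of a shard (a dict of primary key to value):

```python
# Hedged sketch: scan a shard and delete every key that falls outside the
# shard's recorded data storage range (redundant data left by the split).

def prune_redundant(shard_data, storage_range):
    """shard_data: dict mapping primary key -> value.
    storage_range: (low, high) inclusive primary-key interval.
    Deletes out-of-range keys in place; returns the deleted keys."""
    low, high = storage_range
    redundant = [key for key in shard_data if not (low <= key <= high)]
    for key in redundant:
        del shard_data[key]
    return redundant
```

For instance, a shard whose recorded range is A-B but which still holds key D after a hard-link split would have D removed.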
  • the method may further include:
  • sending a data storage request to the slave servers, so that each slave server stores the data to be stored to the shard to be split and sends a second message to the master server, where the second message is a message indicating that data storage is completed; when the number of received second messages is greater than the second number threshold, storing the data to be stored in the shard to be split;
  • before the step of sending the data storage request to the slave servers, determining whether the first message has currently been sent to the control node; if so, sending the data storage request to the slave servers, so that each slave server determines, from the target shard and the shard to be split after the split storage, the shard in which to store the data to be stored, stores the data to be stored in the determined shard, and sends the second message to the master server; if not, performing the step of sending the data storage request to the slave servers so that each slave server stores the data to the shard to be split.
  • the storage of data and the split storage of shards proceed in parallel; the split storage of a shard does not affect the reception of data storage requests. If a data storage request is received before the split storage begins, the request may be sent to the slave servers at this time; each slave server stores the data to be stored in the shard to be split and, after the storage is completed, sends the second message, that is, the data-storage-completed message, to the master server. The master server likewise stores the data to be stored in its own copy of the shard to be split.
  • if the master server receives a split storage instruction while data is being stored, that is, while a write operation is being performed, the master server records the split storage instruction; for example, the split storage instruction can be written to a meta database. After the write operation is completed, the split storage instruction is sent to each slave server.
  • if the master server receives a data storage request after receiving the split storage instruction, the shard to be split is either in the split storage state or in the split storage completed state. When it is in the split storage state, data storage for the shard to be split cannot be performed at this time; when the split storage is completed, data storage for the resulting shards can be performed. Therefore, it is necessary to determine whether the master server has sent the first message to the control node: if the first message has been sent, the master server and the slave servers have completed the split storage of the shard to be split.
  • in this case, the data storage request may be sent to the slave servers, and each slave server determines, from the target shard and the shard to be split after the split storage, the shard in which to store the data to be stored.
  • the specific determination method may be: obtain the primary key of the data to be stored and match it against the data storage range of the shard to be split after the split storage; if the match succeeds, the data to be stored can be stored in the shard to be split after the split storage; if the match fails and there is only one target shard, the data can be stored in the target shard; if there is more than one target shard, the data can be stored in the target shard whose data storage range matches the obtained primary key. After the shard is determined, the data to be stored is stored in the determined shard.
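The routing just described can be sketched as follows; the function name, the shard labels, and the tuple-based ranges are assumptions for illustration:

```python
# Hedged sketch: route a write by matching the primary key of the data to
# be stored against the post-split range of the shard to be split first,
# then against the target shards' ranges.

def route_write(primary_key, post_split_range, target_ranges):
    """post_split_range: (low, high) kept by the shard to be split.
    target_ranges: dict mapping target shard name -> (low, high)."""
    low, high = post_split_range
    if low <= primary_key <= high:
        return "shard_to_be_split"
    for shard_name, (lo, hi) in target_ranges.items():
        if lo <= primary_key <= hi:
            return shard_name
    return None  # no recorded range matches
```

Continuing the shard 3 example, a write with primary key Q lands in the shard to be split (range P-R), while primary key T lands in the target shard (range S-T).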
  • after the slave server stores the data to be stored in the determined shard, the slave server sends a data-storage-completed message, that is, a second message, to the master server. When the number of received second messages is greater than the second number threshold, the master server performs its own data storage operation.
  • the data storage operation performed by the master server is the same as that performed by the slave servers. From the time the master server receives the preparation for splitting instruction to the time it receives the split storage instruction, the execution of data storage requests is not affected. During the split, the data to be stored only needs to be stored in the target shard or in the shard to be split after the split storage, slightly sacrificing performance during the split without setting the shard to be split to a non-writable state; read and write operations are still supported, which can improve the availability of the distributed storage system.
  • FIG. 2 is another schematic flowchart of a data storage method according to an embodiment of the present application.
  • the data storage method applied to a control node includes:
  • S201: Send a preparation for splitting instruction for the shard to be split to the master server to which the shard to be split belongs, so that after receiving the instruction, the master server sends the preparation for splitting instruction to the slave servers to which the shard to be split belongs, and sends the data split point to the control node when the number of data split points it receives is greater than the first number threshold.
  • the control node may generate a preparation for splitting instruction after receiving a split storage request sent by the master server to which any shard belongs, that shard being the shard to be split; the control node may also generate a preparation for splitting instruction after receiving a manually sent split storage request.
  • the control node sends the preparation split instruction to the master server to which the shard to be split belongs.
  • the master server sends the preparation split instruction to the slave server to which the shard is to be split.
  • the slave server receives the preparation for splitting instruction, obtains the data split point, and sends the obtained data split point to the master server.
  • the main server sends the data splitting points to the control node.
  • S202: After receiving the data split point, send a split storage instruction to the master server, so that after receiving the split storage instruction, the master server sends the split storage instruction to each slave server and, when the number of received first messages is greater than the second number threshold, splits and stores the shard to be split according to the data split point to obtain the target shard.
  • after receiving the data split point, the control node can determine that the master server and the slave servers can start the split storage of the shard, and then sends the split storage instruction to the master server.
  • the master server sends the split storage instruction to the slave server.
  • the slave server splits and stores the shard to be split to obtain the target shard and, after obtaining the target shard, sends a first message to the master server.
  • when the number of received first messages is greater than the second number threshold, the master server performs the split storage operation on its own copy of the shard, splitting and storing the shard to be split to obtain the target shard.
  • it can be seen that, in the solution provided by this embodiment, the control node sends a preparation for splitting instruction to the master server to which the shard to be split belongs, and the master server sends the instruction to the slave servers to which the shard to be split belongs; each slave server obtains the data split point and sends it to the master server; when the number of received data split points is greater than the first number threshold, the data split point is sent to the control node, and the control node sends a split storage instruction; the master server and the slave servers then split and store the shard to be split according to the data split point, each slave server sending a first message to the master server after completing the split storage, and the shard to be split is split and stored to obtain the target shard. Because each slave server obtains the data split point of the shard to be split and sends it to the master server, the master server and the slave servers can split and store the shard according to the same data split point and obtain the same target shard. During the split storage process, the shard to be split can still be read and written, rather than being set to a non-writable state as in the existing technology, thereby improving the availability of the storage system. Moreover, the master server and the slave servers perform the split storage by themselves after obtaining the split storage instruction, without synchronous intervention by the control node; this reduces the dependence on the control node, avoids as far as possible downtime caused by performance bottlenecks of the control node, and improves the reliability of the distributed storage system.
  • the method may further include:
  • after receiving the first message, the control node may determine that the master server and the slave servers have completed the split storage of the shard to be split. At this time, the control node determines the data storage range of the target shard according to the data split point and the pre-split data storage range of the shard to be split. For example, one of the split data storage ranges may be selected as the data storage range of the shard to be split after the split storage; the selected range is the target data storage range. The control node can then update the data storage range in the previous association record for the shard to be split, that is, update the recorded range to the target data storage range, select the data storage range of the target shard from the remaining split ranges, and record the association between the selected range and the target shard. Updating the data storage range of the shard to be split after the split storage, and establishing the association between the target shard and the range determined for it, ensures that subsequent data can be correctly stored to the corresponding shard and that the correct data can later be read according to a data read request.
  • the method may further include:
  • when the control node sends the split storage instruction to the master server, it changes the state of the shard to be split from the split completed state to the split storage state. After receiving the first message sent by the master server, the control node changes the state of the shard to be split after the split storage from the split storage state back to the split completed state, and records the state of the target shard obtained by the split as the split completed state.
  • after the control node fails and is repaired, it sends, to all servers, a collection instruction for the data storage ranges of the shards; the collection instruction is used to collect the data storage range of the shards on each server.
  • After receiving the data storage ranges, the control node judges, for each shard, whether the state recorded in advance for that shard is the split storage state; if it is, the control node further judges, before the next operation, whether the pre-recorded data storage range of the shard is the same as the obtained data storage range of the shard.
  • If the recorded state is the split completion state, the shard currently does not require split storage, or the split storage of the shard has already been completed; in either case, there is no need to modify the data storage range of the shard recorded by the control node. If the data storage ranges are the same and the recorded state of the shard is the split storage state, it means that the master server and the slave servers have not yet completed the split storage of the shard; to make the state recorded by the control node consistent with the actual state of the shard, a split storage instruction needs to be sent to the master server to which the shard belongs, so that the split storage of the shard is performed.
  • If the data storage ranges are not the same, the master server and/or the slave servers have already completed the split storage of the shard; to make the recorded state of the shard consistent with its actual state, the state recorded by the control node needs to be changed to the split completion state.
  • In this way, the control node records the state of each shard, collects the data storage ranges of the shards stored on the servers once the fault repair is completed, determines the instruction that needs to be sent according to the recorded state, the recorded data storage range of the shard, and the collected data storage range, and sends that instruction to the master server to which the shard belongs to continue the data storage operation, so that the distributed storage system recovers losslessly after an abnormal exit.
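  • The recovery decision described above can be sketched as follows (a hypothetical illustration; state names and return values are assumptions, not part of the application):

```python
# Hypothetical sketch of the control node's recovery check: for each shard it
# compares the recorded state and range against the range collected from the
# servers, and decides what, if anything, to send.

SPLIT_STORAGE, SPLIT_COMPLETE = "split_storage", "split_complete"

def recovery_action(recorded_state, recorded_range, collected_range):
    if recorded_state == SPLIT_COMPLETE:
        return "none"                    # split not needed or already finished
    # state is split_storage: the failure happened mid-split
    if recorded_range == collected_range:
        return "resend_split_storage"    # servers never finished the split
    return "mark_split_complete"         # servers finished; fix the record
```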
  • An embodiment of the present application further provides a data storage device, which is applied to a master server to which a shard to be split belongs.
  • As shown in FIG. 3, the device includes:
  • the receiving module 301 is configured to receive a preparation split instruction for a fragment to be split sent by a control node
  • the first sending module 302 is configured to send a preparation split instruction to a slave server to which the fragment to be split belongs, so that each slave server obtains a data split point of the fragment to be split after receiving the preparation split instruction, And send the data split point to the master server;
  • the second sending module 303 is configured to send the data splitting point to the control node when the number of received data splitting points is greater than the first number threshold, so that the control node sends a split storage instruction to the main server;
  • The third sending module 304 is configured to send the split storage instruction to each slave server after receiving the split storage instruction, so that each slave server splits and stores the fragment to be split according to the data split point to obtain a target fragment, and after obtaining the target fragment, sends a first message to the main server, where the first message is a message that the split storage is completed;
  • the splitting module 305 is configured to, when the number of received first messages is greater than a second number threshold, split and store the split to be split according to the data split point to obtain a target split.
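  • The threshold checks performed by the master server (modules 302, 303, and 305 above) can be sketched as follows (illustrative only; the threshold value and variable names are assumptions):

```python
# Hypothetical sketch of the master server's quorum counting: it collects data
# split points (or first messages) from the slave servers and proceeds only
# once strictly more than the configured threshold have arrived.

def quorum_reached(received, threshold):
    """True once strictly more than `threshold` replies have arrived, matching
    the 'greater than the first/second number threshold' wording above."""
    return received > threshold

# e.g. 3 slave servers, first number threshold of 1 (a majority of replicas)
split_points_received = 0
for reply in ["m", "m", "m"]:        # split points sent by each slave server
    split_points_received += 1
    if quorum_reached(split_points_received, 1):
        break                        # report the split point to the control node
```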
  • With this solution, the control node sends a split preparation instruction to the master server to which the shard to be split belongs, and the master server sends the split preparation instruction to the slave servers to which the shard to be split belongs; each slave server obtains the data split point of the shard to be split and sends it to the master server; when the number of received data split points is greater than the first number threshold, the master server sends the data split point to the control node, and the control node sends a split storage instruction to the master server; the master server sends the split storage instruction to each slave server, each slave server splits and stores the shard to be split according to the data split point and sends a first message to the master server, and when the number of received first messages is greater than the second number threshold, the master server splits and stores the shard to be split according to the data split point to obtain the target shard.
  • In this way, the data split point of the shard to be split is obtained from the slave servers and sent to the master server, so that the master server and the slave servers can split and store the shard to be split based on the same data split point and obtain the same target shard. In addition, the master server and the slave servers perform the split storage by themselves after receiving the split storage instruction, which does not require the synchronous intervention of the control node, reduces the dependence on the control node, avoids as much as possible the downtime caused by performance bottlenecks on the control node, and improves the reliability of the distributed storage system.
  • the device may further include:
  • The fourth sending module is configured to send the first message to the control node, so that after receiving the first message, the control node determines the data storage range of the target shard and the target data storage range according to the data split point and the data storage range of the shard to be split before the split storage, updates the pre-recorded data storage range of the shard to be split to the target data storage range, and records the association relationship between the target shard and the data storage range determined for the target shard, where the target data storage range is the data storage range of the shard to be split after the split storage.
  • the device may further include:
  • the first obtaining module is configured to obtain an association relationship between each slice and a data storage range when the data stored in the target slice is the same as the data stored in the slice to be split;
  • a first judgment module configured to judge, according to the obtained association relationship, whether each shard stored by the server itself stores data that is not within the data storage range of that shard;
  • the deleting module is configured to delete data that is not within a data storage range of the slice when the judgment result of the first judgment module is yes.
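  • The cleanup performed by the first obtaining, first judgment, and deleting modules can be sketched as follows (a hypothetical illustration; key types and names are assumptions, not part of the application):

```python
# Hypothetical sketch of the cleanup step: once the target shard holds the same
# data as the shard that was split, each server drops the records that no
# longer fall inside a shard's recorded data storage range.

def prune(shard_data, storage_range):
    """Keep only keys inside the half-open range [start, end)."""
    start, end = storage_range
    return {k: v for k, v in shard_data.items() if start <= k < end}

ranges = {"shard-1": ("a", "m"), "shard-2": ("m", "z")}
shards = {
    "shard-1": {"apple": 1, "pear": 2},   # "pear" is outside ["a", "m")
    "shard-2": {"pear": 2},
}
shards = {sid: prune(data, ranges[sid]) for sid, data in shards.items()}
```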
  • the device may further include:
  • the second obtaining module is configured to obtain a data storage request for the shard to be split, where the data storage request includes data to be stored;
  • The fifth sending module is configured to, if the split storage instruction has not been received, send the data storage request to the slave servers, so that each slave server stores the data to be stored to the shard to be split and sends a second message to the main server, where the second message is a message that the data storage is completed;
  • a first storage module configured to store data to be stored in a shard to be split when the number of received second messages is greater than a second number threshold
  • a second judgment module configured to determine whether a first message is currently sent to the control node when a split storage instruction is received
  • The sixth sending module is configured to, if the judgment result of the second judgment module is yes, send the data storage request to the slave servers, so that each slave server determines, from the target shard and the shard to be split after the split storage, the shard to store the data to be stored, stores the data to be stored in the determined shard, and sends a second message to the main server;
  • a second storage module configured to, when the number of received second messages is greater than the second number threshold, determine, from the target shard and the shard to be split after the split storage, the shard to store the data to be stored, and store the data to be stored to the determined shard;
  • the sixth sending module is further configured to send a data storage request to the slave server after sending the first message to the control node if the determination result of the second determination module is no.
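  • The routing of a data storage request after the split storage (the sixth sending module and second storage module above) can be sketched as follows (illustrative only; the key-comparison convention and names are assumptions):

```python
# Hypothetical sketch of how a data storage request is routed once the split
# storage instruction has been applied: the shard holding the key is chosen by
# comparing the key against the data split point.

def route(key, split_point, split_shard="shard-1", target_shard="shard-2"):
    """Keys below the split point stay in the shard that was split;
    keys at or above it belong to the target shard."""
    return split_shard if key < split_point else target_shard
```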
  • An embodiment of the present application further provides a data storage device, which is applied to a control node. As shown in FIG. 4, the device includes:
  • The seventh sending module 401 is configured to send a split preparation instruction for the shard to be split to the master server to which the shard to be split belongs, so that after receiving the split preparation instruction, the master server sends the split preparation instruction to the slave servers to which the shard to be split belongs, and sends the data split point to the control node when the number of received data split points is greater than the first number threshold;
  • An eighth sending module 402 is configured to send a split storage instruction to the master server after receiving the data splitting point, so that the master server sends the split storage instruction to each slave server after receiving the split storage instruction. And when the number of the first messages received is greater than the second number threshold, the fragment to be split is split and stored according to the data split point to obtain the target fragment.
  • With this solution, the control node sends a split preparation instruction to the master server to which the shard to be split belongs, and the master server sends the split preparation instruction to the slave servers to which the shard to be split belongs; each slave server obtains the data split point of the shard to be split and sends it to the master server; when the number of received data split points is greater than the first number threshold, the master server sends the data split point to the control node, and the control node sends a split storage instruction to the master server; the master server sends the split storage instruction to each slave server, each slave server splits and stores the shard to be split according to the data split point and sends a first message to the master server, and when the number of received first messages is greater than the second number threshold, the master server splits and stores the shard to be split according to the data split point to obtain the target shard.
  • In this way, the data split point of the shard to be split is obtained from the slave servers and sent to the master server, so that the master server and the slave servers can split and store the shard to be split based on the same data split point and obtain the same target shard. In addition, the master server and the slave servers perform the split storage by themselves after receiving the split storage instruction, which does not require the synchronous intervention of the control node, reduces the dependence on the control node, avoids as much as possible the downtime caused by performance bottlenecks on the control node, and improves the reliability of the distributed storage system.
  • the device may further include:
  • The determining module is configured to, after receiving the first message sent by the main server, determine the data storage range of the target shard and the target data storage range according to the data split point and the data storage range corresponding to the shard to be split before the split storage;
  • The first modification module is configured to update the pre-recorded data storage range of the shard to be split to the target data storage range, and record the association relationship between the target shard and the data storage range determined for the target shard, where the target data storage range is the data storage range of the shard to be split after the split storage.
  • the device may further include:
  • the ninth sending module is configured to send a collection instruction for the data storage range of each shard to each server, so that after receiving the collection instruction, each server obtains the data storage range of each shard stored by itself, Sending the obtained data storage range to the control node; wherein the server includes a master server and a slave server;
  • the third judgment module is configured to, after receiving the data storage range sent by each server, determine, for each slice, whether the state recorded in advance for the slice is a split storage state;
  • a fourth judgment module configured to determine, if the judgment result of the third judgment module is yes, whether the data storage range of the segment recorded in advance is the same as the data storage range of the obtained segment;
  • the tenth sending module is configured to send a split storage instruction to the master server to which the segment belongs if the determination result of the fourth determination module is the same;
  • the second modification module is configured to modify the state recorded in advance for the shard to the split completion state if the judgment result of the fourth judgment module is that the ranges are different.
  • An embodiment of the present application further provides a data storage system.
  • As shown in FIG. 5, the system includes a control node 501, a master server 502 to which a shard to be split belongs, and a slave server 503 to which the shard to be split belongs.
  • the control node 501 is configured to send a preparation instruction for splitting the fragment to be split to the main server 502;
  • The master server 502 is configured to send the split preparation instruction to the slave server 503 to which the shard to be split belongs after receiving the split preparation instruction;
  • After receiving the split preparation instruction, the slave server 503 obtains the data split point of the shard to be split, and sends the data split point to the master server 502;
  • the main server 502 is further configured to send the data splitting points to the control node 501 when the number of received data splitting points is greater than the first number threshold;
  • the control node 501 is configured to send a split storage instruction to the main server 502 after receiving a data split point sent by the main server 502;
  • The master server 502 is configured to send the split storage instruction to each slave server 503 after receiving the split storage instruction;
  • The slave server 503 is configured to split and store the shard to be split according to the data split point to obtain the target shard, and after obtaining the target shard, send a first message to the main server 502, where the first message is a message that the split storage is completed;
  • the main server 502 is further configured to, when the number of the first messages received is greater than the second number threshold, divide and store the fragment to be split according to the data split point to obtain the target fragment.
  • With this solution, the control node sends a split preparation instruction to the master server to which the shard to be split belongs, and the master server sends the split preparation instruction to the slave servers to which the shard to be split belongs; each slave server obtains the data split point of the shard to be split and sends it to the master server; when the number of received data split points is greater than the first number threshold, the master server sends the data split point to the control node, and the control node sends a split storage instruction to the master server; the master server sends the split storage instruction to each slave server, each slave server splits and stores the shard to be split according to the data split point and sends a first message to the master server, and when the number of received first messages is greater than the second number threshold, the master server splits and stores the shard to be split according to the data split point to obtain the target shard.
  • In this way, the data split point of the shard to be split is obtained from the slave servers and sent to the master server, so that the master server and the slave servers can split and store the shard to be split based on the same data split point and obtain the same target shard. In addition, the master server and the slave servers perform the split storage by themselves after receiving the split storage instruction, which does not require the synchronous intervention of the control node, reduces the dependence on the control node, avoids as much as possible the downtime caused by performance bottlenecks on the control node, and improves the reliability of the distributed storage system.
  • the main server is further configured to send a first message to the control node
  • The control node is further configured to, after receiving the first message, determine the data storage range of the target shard and the target data storage range according to the data split point and the data storage range of the shard to be split before the split storage; update the recorded data storage range of the shard to be split to the target data storage range, and record the association relationship between the target shard and the data storage range determined for the target shard, where the target data storage range is the data storage range of the shard to be split after the split storage.
  • The main server is further configured to, when the data stored in the target shard is the same as the data stored in the shard to be split, obtain the association relationship between each shard and the data storage range; judge, according to the obtained association relationship, whether each shard stored by itself stores data not in the data storage range of the shard; and if so, delete the data not in the data storage range of the shard.
  • The main server is further configured to obtain a data storage request for the shard to be split, where the data storage request includes the data to be stored; if no split storage instruction is received, the main server sends the data storage request to the slave servers;
  • the slave server is further configured to store the data to be stored to the shard to be split; and send a second message to the master server; wherein the second message is a message that the data storage is completed;
  • The main server is further configured to store the data to be stored in the shard to be split when the number of the received second messages is greater than the second number threshold;
  • the master server is further configured to determine whether the first message is currently sent to the control node when a split storage instruction is received; if so, send a data storage request to the slave server;
  • The slave server is further configured to determine, from the target shard and the shard to be split after the split storage, the shard to store the data to be stored, store the data to be stored in the determined shard, and send a second message to the master server;
  • the main server is further configured to, when the number of received second messages is greater than the second number threshold, determine, from the target shard and the shard to be split after the split storage, the shard that stores the data to be stored, and The data to be stored is stored in the determined shard; if not, after sending the first message to the control node, the step of sending a data storage request to the slave server is performed.
  • The control node is further configured to, after receiving the first message sent by the main server, determine the data storage range of the target shard and the target data storage range according to the data split point and the data storage range corresponding to the shard to be split before the split storage; update the pre-recorded data storage range of the shard to be split to the target data storage range, and record the association relationship between the target shard and the data storage range determined for the target shard, where the target data storage range is the data storage range of the shard to be split after the split storage.
  • control node is further configured to send a collection instruction for the data storage range of each shard to each server; wherein the server includes a master server and a slave server;
  • the server is configured to obtain the data storage range of each slice stored by itself after receiving the collection instruction, and send the obtained data storage range to the control node;
  • The control node is further configured to, after receiving the data storage range sent by each server, judge, for each shard, whether the state recorded for the shard is the split storage state; if it is the split storage state, judge whether the recorded data storage range of the shard is the same as the obtained data storage range of the shard; if they are the same, send a split storage instruction to the master server to which the shard belongs; if they are not the same, change the state recorded in advance for the shard to the split completion state.
  • An embodiment of the present application further provides a main server, as shown in FIG. 6, including a processor 601 and a memory 602.
  • a memory 602 configured to store a computer program
  • After receiving the split storage instruction, the main server sends the split storage instruction to each of the slave servers, so that each of the slave servers splits and stores the shard to be split according to the data split point to obtain the target shard and, after obtaining the target shard, sends a first message to the main server, where the first message is a message that the split storage is completed;
  • When the number of received first messages is greater than the second number threshold, the shard to be split is split and stored according to the data split point to obtain the target shard.
  • the memory may include random access memory (Random Access Memory, RAM), and may also include non-volatile memory (Non-Volatile Memory, NVM), such as at least one disk memory.
  • the memory may also be at least one storage device located far from the foregoing processor.
  • The aforementioned processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (Digital Signal Processing, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
  • With this solution, the control node sends a split preparation instruction to the master server to which the shard to be split belongs, and the master server sends the split preparation instruction to the slave servers to which the shard to be split belongs; each slave server obtains the data split point of the shard to be split and sends it to the master server; when the number of received data split points is greater than the first number threshold, the master server sends the data split point to the control node, and the control node sends a split storage instruction to the master server; the master server sends the split storage instruction to each slave server, each slave server splits and stores the shard to be split according to the data split point and sends a first message to the master server, and when the number of received first messages is greater than the second number threshold, the master server splits and stores the shard to be split according to the data split point to obtain the target shard.
  • In this way, the data split point of the shard to be split is obtained from the slave servers and sent to the master server, so that the master server and the slave servers can split and store the shard to be split based on the same data split point and obtain the same target shard. In addition, the master server and the slave servers perform the split storage by themselves after receiving the split storage instruction, which does not require the synchronous intervention of the control node, reduces the dependence on the control node, avoids as much as possible the downtime caused by performance bottlenecks on the control node, and improves the reliability of the distributed storage system.
  • An embodiment of the present application further provides a control node.
  • As shown in FIG. 7, the control node includes a processor 701 and a memory 702.
  • the memory 702 is configured to store a computer program
  • After receiving the data split point, the control node sends a split storage instruction to the master server, so that the main server sends the split storage instruction to each of the slave servers after receiving the split storage instruction,
  • and, when the number of the received first messages is greater than the second number threshold, the main server splits and stores the shard to be split according to the data split point to obtain the target shard.
  • With this solution, the control node sends a split preparation instruction to the master server to which the shard to be split belongs, and the master server sends the split preparation instruction to the slave servers to which the shard to be split belongs; each slave server obtains the data split point of the shard to be split and sends it to the master server; when the number of received data split points is greater than the first number threshold, the master server sends the data split point to the control node, and the control node sends a split storage instruction to the master server; the master server sends the split storage instruction to each slave server, each slave server splits and stores the shard to be split according to the data split point and sends a first message to the master server, and when the number of received first messages is greater than the second number threshold, the master server splits and stores the shard to be split according to the data split point to obtain the target shard.
  • In this way, the data split point of the shard to be split is obtained from the slave servers and sent to the main server, so that the master server and the slave servers can split and store the shard to be split according to the same data split point and obtain the same target shard. During the split, the shard to be split can continue to be read and written, instead of being set to a non-writable state as in the prior art, thereby improving the availability of the storage system. In addition, the master server and the slave servers perform the split storage by themselves after receiving the split storage instruction, which does not require the synchronous intervention of the control node, reduces the dependence on the control node, avoids as much as possible the downtime caused by performance bottlenecks on the control node, and improves the reliability of the distributed storage system.
  • An embodiment of the present application further provides a computer-readable storage medium storing a computer program. When the computer program is executed by a processor, any data storage method applied to the main server in the foregoing embodiments is implemented.
  • With this solution, the control node sends a split preparation instruction to the master server to which the shard to be split belongs, and the master server sends the split preparation instruction to the slave servers to which the shard to be split belongs; each slave server obtains the data split point of the shard to be split and sends it to the master server; when the number of received data split points is greater than the first number threshold, the master server sends the data split point to the control node, and the control node sends a split storage instruction to the master server; the master server sends the split storage instruction to each slave server, each slave server splits and stores the shard to be split according to the data split point and sends a first message to the master server, and when the number of received first messages is greater than the second number threshold, the master server splits and stores the shard to be split according to the data split point to obtain the target shard.
  • In this way, the data split point of the shard to be split is obtained from the slave servers and sent to the main server, so that the master server and the slave servers can split and store the shard to be split according to the same data split point and obtain the same target shard. During the split, the shard to be split can continue to be read and written, instead of being set to a non-writable state as in the prior art, thereby improving the availability of the storage system. In addition, the master server and the slave servers perform the split storage by themselves after receiving the split storage instruction, which does not require the synchronous intervention of the control node, reduces the dependence on the control node, avoids as much as possible the downtime caused by performance bottlenecks on the control node, and improves the reliability of the distributed storage system.
  • An embodiment of the present application further provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, any data storage method applied to the control node in the foregoing embodiments is implemented.
  • a data storage method is also provided.
  • With this solution, the control node sends a split preparation instruction to the master server to which the shard to be split belongs, and the master server sends the split preparation instruction to the slave servers to which the shard to be split belongs; each slave server obtains the data split point of the shard to be split and sends it to the master server; when the number of received data split points is greater than the first number threshold, the master server sends the data split point to the control node, and the control node sends a split storage instruction to the master server; the master server sends the split storage instruction to each slave server, each slave server splits and stores the shard to be split according to the data split point and sends a first message to the master server, and when the number of received first messages is greater than the second number threshold, the master server splits and stores the shard to be split according to the data split point to obtain the target shard.
  • In this way, the data split point of the shard to be split is obtained from the slave servers and sent to the main server, so that the master server and the slave servers can split and store the shard to be split according to the same data split point and obtain the same target shard. During the split, the shard to be split can continue to be read and written, instead of being set to a non-writable state as in the prior art, thereby improving the availability of the storage system. In addition, the master server and the slave servers perform the split storage by themselves after receiving the split storage instruction, which does not require the synchronous intervention of the control node, reduces the dependence on the control node, avoids as much as possible the downtime caused by performance bottlenecks on the control node, and improves the reliability of the distributed storage system.
  • a computer program product containing instructions is also provided.
  • When the computer program product runs on a computer, the computer is caused to execute any data storage method applied to the main server in the foregoing embodiments.
  • With this solution, the control node sends a split preparation instruction to the master server to which the shard to be split belongs, and the master server sends the split preparation instruction to the slave servers to which the shard to be split belongs; each slave server obtains the data split point of the shard to be split and sends it to the master server; when the number of received data split points is greater than the first number threshold, the master server sends the data split point to the control node, and the control node sends a split storage instruction to the master server; the master server sends the split storage instruction to each slave server, each slave server splits and stores the shard to be split according to the data split point and sends a first message to the master server, and when the number of received first messages is greater than the second number threshold, the master server splits and stores the shard to be split according to the data split point to obtain the target shard.
  • In this way, the data split point of the shard to be split is obtained from the slave servers and sent to the main server, so that the master server and the slave servers can split and store the shard to be split according to the same data split point and obtain the same target shard. During the split, the shard to be split can continue to be read and written, instead of being set to a non-writable state as in the prior art, thereby improving the availability of the storage system. In addition, the master server and the slave servers perform the split storage by themselves after receiving the split storage instruction, which does not require the synchronous intervention of the control node, reduces the dependence on the control node, avoids as much as possible the downtime caused by performance bottlenecks on the control node, and improves the reliability of the distributed storage system.
  • a computer program product containing instructions is also provided.
  • the computer program product is run on a computer, the computer is caused to execute any data storage method applied to a control node in the foregoing embodiment.
  • the control node sends a split preparation instruction to the master server to which the shard to be split belongs, and the master server sends the split preparation instruction to the slave server to which the shard to be split belongs.
  • the data splitting point and send the data splitting point to the main server; when the number of received data splitting points is greater than the first number threshold, the data splitting point is sent to the control node, and the control node
  • a split storage instruction is sent, the master server and the slave server split and store the split shards, and each slave server splits and stores the split shards according to the data split point.
  • the first message is sent to the main server.
  • the main server splits and stores the fragment to be split according to the data split point to obtain the target fragment.
  • the data split point of the fragment to be split is obtained from the server, and the data split point is sent to the main server.
  • the master server and the slave server can split and store the split shards according to the same data split point, and can obtain the same target shards.
  • the split shards can continue to be read and written. , Instead of setting the state of the shard to be split to a non-writable state as in the prior art, thereby improving the availability of the storage system.
  • the master server and the slave server perform the split storage by themselves after obtaining the split storage instruction, which does not require the synchronous intervention of the control node, reduces the dependence on the control node, and avoids as much as possible the dependence on the control node. It causes performance bottlenecks in control nodes and causes downtime, improving the reliability of distributed storage systems.
  • a computer program is also provided, which, when run on a computer, causes the computer to execute any one of the data storage methods applied to the main server in the above embodiments.
  • the control node sends a split preparation instruction to the master server to which the shard to be split belongs, and the master server sends the split preparation instruction to the slave server to which the shard to be split belongs.
  • the data splitting point and send the data splitting point to the main server; when the number of received data splitting points is greater than the first number threshold, the data splitting point is sent to the control node, and the control node
  • a split storage instruction is sent, the master server and the slave server split and store the split shards, and each slave server splits and stores the split shards according to the data split point.
  • the first message is sent to the main server.
  • the main server splits and stores the fragment to be split according to the data split point to obtain the target fragment.
  • the data split point of the fragment to be split is obtained from the server, and the data split point is sent to the main server.
  • the master server and the slave server can split and store the split shards according to the same data split point, and can obtain the same target shards.
  • the split shards can continue to be read and written. , Instead of setting the state of the shard to be split to a non-writable state as in the prior art, thereby improving the availability of the storage system.
  • the master server and the slave server perform the split storage by themselves after obtaining the split storage instruction, which does not require the synchronous intervention of the control node, reduces the dependence on the control node, and avoids as much as possible the dependence on the control node. It causes performance bottlenecks in control nodes and causes downtime, improving the reliability of distributed storage systems.
  • a computer program is also provided, which, when run on a computer, causes the computer to execute any data storage method applied to the control node in the foregoing embodiment.
  • the control node sends a split preparation instruction to the master server to which the shard to be split belongs, and the master server sends the split preparation instruction to the slave server to which the shard to be split belongs.
  • the data splitting point and send the data splitting point to the main server; when the number of received data splitting points is greater than the first number threshold, the data splitting point is sent to the control node, and the control node
  • a split storage instruction is sent, the master server and the slave server split and store the split shards, and each slave server splits and stores the split shards according to the data split point.
  • the first message is sent to the main server.
  • the main server splits and stores the fragment to be split according to the data split point to obtain the target fragment.
  • the data split point of the fragment to be split is obtained from the server, and the data split point is sent to the main server.
  • the master server and the slave server can split and store the split shards according to the same data split point, and can obtain the same target shards.
  • the split shards can continue to be read. Write instead of setting the state of the shard to be split to a non-writable state as in the prior art, thereby improving the availability of the storage system.
  • the master server and the slave server perform the split storage by themselves after obtaining the split storage instruction, which does not require the synchronous intervention of the control node, reduces the dependence on the control node, and avoids as much as possible the dependence on the control node. It causes performance bottlenecks in control nodes and causes downtime, improving the reliability of distributed storage systems.
  • the description is relatively simple, and the relevant part may refer to the description of the method embodiment.
  • the data split point of the fragment to be split is obtained from the server, and the data split point is sent to the master server, so that the master server and the slave server can be treated according to the same data split point
  • the split shard can be split and stored to obtain the same target shard.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A data storage method, apparatus, system, server, control node, and medium. The method may be applied to a master server and includes: receiving a split preparation instruction for a shard to be split sent by a control node (S101); sending the split preparation instruction to the slave servers to which the shard to be split belongs, so that after receiving the split preparation instruction, each slave server obtains a data split point of the shard to be split and sends the data split point to the master server (S102); when the number of received data split points is greater than a first quantity threshold, sending the data split point to the control node, so that the control node sends a split storage instruction to the master server (S103); after receiving the split storage instruction, sending the split storage instruction to each slave server, so that each slave server splits and stores the shard to be split according to the data split point to obtain a target shard and, after obtaining the target shard, sends a first message to the master server, where the first message is a message indicating that the split storage is completed (S104); when the number of received first messages is greater than a second quantity threshold, splitting and storing the shard to be split according to the data split point to obtain the target shard (S105). The data storage method improves the availability of the storage system.

Description

Data storage method, apparatus, system, server, control node, and medium
This application claims priority to Chinese Patent Application No. 201811159536.2, filed with the China National Intellectual Property Administration on September 30, 2018 and entitled "Data storage method, apparatus, system, server, control node and medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of data storage technologies, and in particular to a data storage method, apparatus, system, server, control node, and medium.
Background
With the rapid development of information technology, the amount of data in data storage systems keeps growing. To meet the storage demand for large volumes of data, distributed storage systems running on multiple servers have been widely adopted. To increase the total storage capacity of a distributed storage system, sharded storage is used: data is stored into individual shards, and the system can be scaled out by adding shards, thereby increasing its total storage capacity.
When data is to be stored, since each shard corresponds to a different data storage range, it is first necessary to determine which shard's data storage range the data to be stored falls into, and thus which shard it belongs to. This can be done by directly checking which shard's data storage range contains the data to be stored, or by using a consistent hashing algorithm.
To ensure high availability of data and services, a distributed storage system usually needs a fault tolerance mechanism that redundantly backs up each shard. Configuring the same shard on different servers avoids the data loss and storage service unavailability caused by a single server becoming unavailable. For one shard, the corresponding servers may include one master server and one or more slave servers; a given server may be the master server of one shard and at the same time a slave server of another shard.
In practice, as the data volume of a shard and/or the request pressure on it grows, the original shard can no longer provide sufficient service capacity, and the data in the shard needs to be stored into multiple child shards created for it. The specific prior-art data storage method is as follows: the control node of the distributed storage system determines the target shard that needs split storage; changes the state of the target shard from writable to non-writable; splits the data in the target shard into multiple split data sets; determines the servers that store each split data set; and sends a split instruction to the determined servers, which create child shards of the target shard, determine the data storage range of each child shard, and store the split data sets into the corresponding child shards according to those ranges.
When multiple servers store the same target shard, the above method is executed once for each copy of the target shard. To keep the data stored in the child shards consistent across servers, the state of the target shard is set to non-writable, which means that during the above storage process no write operation can be performed on the target shard and write requests for it are rejected, reducing the availability of the distributed storage system.
Summary
This application provides a data storage method, apparatus, system, server, control node, and medium to improve the availability of a storage system. The specific technical solutions are as follows.
In a first aspect, an embodiment of this application provides a data storage method applied to a master server to which a shard to be split belongs, the method including:
receiving a split preparation instruction for the shard to be split sent by a control node;
sending the split preparation instruction to the slave servers to which the shard to be split belongs, so that after receiving the split preparation instruction, each slave server obtains a data split point of the shard to be split and sends the data split point to the master server;
when the number of received data split points is greater than a first quantity threshold, sending the data split point to the control node, so that the control node sends a split storage instruction to the master server;
after receiving the split storage instruction, sending the split storage instruction to each slave server, so that each slave server splits and stores the shard to be split according to the data split point to obtain a target shard and, after obtaining the target shard, sends a first message to the master server, where the first message is a message indicating that the split storage is completed;
when the number of received first messages is greater than a second quantity threshold, splitting and storing the shard to be split according to the data split point to obtain the target shard.
Optionally, after splitting and storing the shard to be split according to the data split point to obtain the target shard when the number of received first messages is greater than the second quantity threshold, the method further includes:
sending the first message to the control node, so that after receiving the first message, the control node determines, according to the data split point and the data storage range of the shard to be split before the split storage, the data storage range of the target shard and a target data storage range; updates the pre-recorded data storage range corresponding to the shard to be split to the target data storage range; and records an association between the target shard and the data storage range determined for the target shard, where the target data storage range is the data storage range of the shard to be split after the split storage.
Optionally, in a case where the data stored in the target shard is the same as the data stored in the shard to be split, the method further includes:
obtaining the association between each shard and its data storage range;
judging, according to the obtained associations, whether any shard stored by the master server itself stores data that is not within the data storage range of that shard, and if so, deleting the data that is not within the data storage range of that shard.
Optionally, the method further includes:
obtaining a data storage request for the shard to be split, where the data storage request contains data to be stored;
if the split storage instruction has not been received, sending the data storage request to the slave servers, so that each slave server stores the data to be stored into the shard to be split and sends a second message to the master server, where the second message is a message indicating that the data storage is completed; and when the number of received second messages is greater than the second quantity threshold, storing the data to be stored into the shard to be split;
if the split storage instruction has been received, judging whether the first message has already been sent to the control node; if so, sending the data storage request to the slave servers, so that each slave server determines, from the target shard and the shard to be split after the split storage, the shard that stores the data to be stored, stores the data to be stored into the determined shard, and sends the second message to the master server; and when the number of received second messages is greater than the second quantity threshold, determining, from the target shard and the shard to be split after the split storage, the shard that stores the data to be stored, and storing the data to be stored into the determined shard; if not, performing the step of sending the data storage request to the slave servers after sending the first message to the control node.
In a second aspect, an embodiment of this application further provides a data storage method applied to a control node, the method including:
sending a split preparation instruction for a shard to be split to the master server to which the shard to be split belongs, so that after receiving the split preparation instruction, the master server sends the split preparation instruction to the slave servers to which the shard to be split belongs and, when the number of received data split points is greater than a first quantity threshold, sends the data split point to the control node;
after receiving the data split point, sending a split storage instruction to the master server, so that after receiving the split storage instruction, the master server sends the split storage instruction to each slave server and, when the number of received first messages is greater than a second quantity threshold, splits and stores the shard to be split according to the data split point to obtain the target shard.
Optionally, after sending the split storage instruction to the master server upon receiving the data split point, the method further includes:
after receiving the first message sent by the master server, determining, according to the data split point and the data storage range corresponding to the shard to be split before the split storage, the data storage range of the target shard and a target data storage range;
updating the pre-recorded data storage range corresponding to the shard to be split to the target data storage range, and recording an association between the target shard and the data storage range determined for the target shard, where the target data storage range is the data storage range of the shard to be split after the split storage.
Optionally, the method further includes:
sending, to each server, a collection instruction for the data storage range of each shard, so that after receiving the collection instruction, each server obtains the data storage range of each shard it stores and sends the obtained data storage ranges to the control node, where the servers include the master server and the slave servers;
after receiving the data storage ranges sent by the servers, judging, for each shard, whether the state pre-recorded for the shard is a split-storage-in-progress state;
if it is the split-storage-in-progress state, judging whether the pre-recorded data storage range of the shard is the same as the obtained data storage range of the shard;
if they are the same, sending a split storage instruction to the master server to which the shard belongs;
if they are different, changing the state pre-recorded for the shard to a split-storage-completed state.
In a third aspect, an embodiment of this application provides a data storage apparatus applied to a master server to which a shard to be split belongs, the apparatus including:
a receiving module configured to receive a split preparation instruction for the shard to be split sent by a control node;
a first sending module configured to send the split preparation instruction to the slave servers to which the shard to be split belongs, so that after receiving the split preparation instruction, each slave server obtains a data split point of the shard to be split and sends the data split point to the master server;
a second sending module configured to, when the number of received data split points is greater than a first quantity threshold, send the data split point to the control node, so that the control node sends a split storage instruction to the master server;
a third sending module configured to, after receiving the split storage instruction, send the split storage instruction to each slave server, so that each slave server splits and stores the shard to be split according to the data split point to obtain a target shard and, after obtaining the target shard, sends a first message to the master server, where the first message is a message indicating that the split storage is completed;
a splitting module configured to, when the number of received first messages is greater than a second quantity threshold, split and store the shard to be split according to the data split point to obtain the target shard.
Optionally, the apparatus further includes:
a fourth sending module configured to send the first message to the control node, so that after receiving the first message, the control node determines, according to the data split point and the data storage range of the shard to be split before the split storage, the data storage range of the target shard and a target data storage range; updates the pre-recorded data storage range corresponding to the shard to be split to the target data storage range; and records an association between the target shard and the data storage range determined for the target shard, where the target data storage range is the data storage range of the shard to be split after the split storage.
Optionally, the apparatus further includes:
a first obtaining module configured to, when the data stored in the target shard is the same as the data stored in the shard to be split, obtain the association between each shard and its data storage range;
a first judging module configured to judge, according to the obtained associations, whether any locally stored shard stores data that is not within the data storage range of that shard;
a deletion module configured to, when the judgment result of the first judging module is yes, delete the data that is not within the data storage range of that shard.
Optionally, the apparatus further includes:
a second obtaining module configured to obtain a data storage request for the shard to be split, where the data storage request contains data to be stored;
a fifth sending module configured to, if the split storage instruction has not been received, send the data storage request to the slave servers, so that each slave server stores the data to be stored into the shard to be split and sends a second message to the master server, where the second message is a message indicating that the data storage is completed;
a first storage module configured to, when the number of received second messages is greater than the second quantity threshold, store the data to be stored into the shard to be split;
a second judging module configured to, if the split storage instruction has been received, judge whether the first message has already been sent to the control node;
a sixth sending module configured to, when the judgment result of the second judging module is yes, send the data storage request to the slave servers, so that each slave server determines, from the target shard and the shard to be split after the split storage, the shard that stores the data to be stored, stores the data to be stored into the determined shard, and sends the second message to the master server;
a second storage module configured to, when the number of received second messages is greater than the second quantity threshold, determine, from the target shard and the shard to be split after the split storage, the shard that stores the data to be stored, and store the data to be stored into the determined shard;
the sixth sending module being further configured to, when the judgment result of the second judging module is no, send the data storage request to the slave servers after the first message is sent to the control node.
In a fourth aspect, an embodiment of this application provides a data storage apparatus applied to a control node, the apparatus including:
a seventh sending module configured to send a split preparation instruction for a shard to be split to the master server to which the shard to be split belongs, so that after receiving the split preparation instruction, the master server sends the split preparation instruction to the slave servers to which the shard to be split belongs and, when the number of received data split points is greater than a first quantity threshold, sends the data split point to the control node;
an eighth sending module configured to, after receiving the data split point, send a split storage instruction to the master server, so that after receiving the split storage instruction, the master server sends the split storage instruction to each slave server and, when the number of received first messages is greater than a second quantity threshold, splits and stores the shard to be split according to the data split point to obtain the target shard.
Optionally, the apparatus further includes:
a determining module configured to, after the first message sent by the master server is received, determine, according to the data split point and the data storage range of the shard to be split before the split storage, the data storage range corresponding to the target shard and a target data storage range;
a first modification module configured to update the pre-recorded data storage range corresponding to the shard to be split to the target data storage range, and record an association between the target shard and the data storage range determined for the target shard, where the target data storage range is the data storage range of the shard to be split after the split storage.
Optionally, the apparatus further includes:
a ninth sending module configured to send, to each server, a collection instruction for the data storage range of each shard, so that after receiving the collection instruction, each server obtains the data storage range of each shard it stores and sends the obtained data storage ranges to the control node, where the servers include the master server and the slave servers;
a third judging module configured to, after the data storage ranges sent by the servers are received, judge, for each shard, whether the state pre-recorded for the shard is a split-storage-in-progress state;
a fourth judging module configured to, when the judgment result of the third judging module is yes, judge whether the pre-recorded data storage range of the shard is the same as the obtained data storage range of the shard;
a tenth sending module configured to, when the judgment result of the fourth judging module is that they are the same, send a split storage instruction to the master server to which the shard belongs;
a second modification module configured to, when the judgment result of the fourth judging module is that they are different, change the state pre-recorded for the shard to a split-storage-completed state.
In a fifth aspect, an embodiment of this application provides a data storage system including a control node, a master server to which a shard to be split belongs, and slave servers to which the shard to be split belongs, where:
the control node is configured to send a split preparation instruction for the shard to be split to the master server;
the master server is configured to, after receiving the split preparation instruction, send the split preparation instruction to the slave servers to which the shard to be split belongs;
the slave servers are configured to, after receiving the split preparation instruction, obtain a data split point of the shard to be split and send the data split point to the master server;
the master server is further configured to, when the number of received data split points is greater than a first quantity threshold, send the data split point to the control node;
the control node is further configured to, after receiving the data split point sent by the master server, send a split storage instruction to the master server;
the master server is further configured to, after receiving the split storage instruction, send the split storage instruction to each slave server;
the slave servers are further configured to split and store the shard to be split according to the data split point to obtain a target shard and, after obtaining the target shard, send a first message to the master server, where the first message is a message indicating that the split storage is completed;
the master server is further configured to, when the number of received first messages is greater than a second quantity threshold, split and store the shard to be split according to the data split point to obtain the target shard.
Optionally, the master server is further configured to send the first message to the control node;
the control node is further configured to, after receiving the first message, determine, according to the data split point and the data storage range of the shard to be split before the split storage, the data storage range of the target shard and a target data storage range; update the pre-recorded data storage range corresponding to the shard to be split to the target data storage range; and record an association between the target shard and the data storage range determined for the target shard, where the target data storage range is the data storage range of the shard to be split after the split storage.
Optionally, the master server is further configured to, when the data stored in the target shard is the same as the data stored in the shard to be split, obtain the association between each shard and its data storage range; judge, according to the obtained associations, whether any shard it stores contains data that is not within the data storage range of that shard; and if so, delete the data that is not within the data storage range of that shard.
Optionally, the master server is further configured to obtain a data storage request for the shard to be split, where the data storage request contains data to be stored, and, if the split storage instruction has not been received, send the data storage request to the slave servers;
the slave servers are further configured to store the data to be stored into the shard to be split and send a second message to the master server, where the second message is a message indicating that the data storage is completed;
the master server is further configured to, when the number of received second messages satisfies the second quantity threshold, store the data to be stored into the shard to be split;
the master server is further configured to, if the split storage instruction has been received, judge whether the first message has already been sent to the control node, and if so, send the data storage request to the slave servers;
the slave servers are further configured to determine, from the target shard and the shard to be split after the split storage, the shard that stores the data to be stored, store the data to be stored into the determined shard, and send the second message to the master server;
the master server is further configured to, when the number of received second messages is greater than the second quantity threshold, determine, from the target shard and the shard to be split after the split storage, the shard that stores the data to be stored, and store the data to be stored into the determined shard; and, if the first message has not been sent, perform the step of sending the data storage request to the slave servers after sending the first message to the control node.
Optionally, the control node is further configured to, after receiving the first message sent by the master server, determine, according to the data split point and the data storage range corresponding to the shard to be split before the split storage, the data storage range corresponding to the target shard and a target data storage range; update the pre-recorded data storage range corresponding to the shard to be split to the target data storage range; and record an association between the target shard and the data storage range determined for the target shard, where the target data storage range is the data storage range of the shard to be split after the split storage.
Optionally, the control node is further configured to send, to all servers, a collection instruction for the data storage ranges of shards, where the servers include the master server and the slave servers;
the servers are configured to, after receiving the collection instruction, obtain the data storage ranges of the shards they store and send the obtained data storage ranges to the control node;
the control node is further configured to, after receiving the data storage ranges sent by all the servers, judge, for each shard, whether the state pre-recorded for the shard is a split-storage-in-progress state; if it is, judge whether the pre-recorded data storage range of the shard is the same as the obtained data storage range of the shard; if they are the same, send a split storage instruction to the master server to which the shard belongs; if they are different, change the state pre-recorded for the shard to a split-storage-completed state.
In a sixth aspect, an embodiment of this application further provides a master server, including a processor and a memory, where:
the memory is configured to store a computer program;
the processor is configured to, when executing the program stored in the memory, implement the steps of the data storage method applied to the master server provided by the embodiments of this application.
In a seventh aspect, an embodiment of this application further provides a control node, including a processor and a memory, where:
the memory is configured to store a computer program;
the processor is configured to, when executing the program stored in the memory, implement the steps of the data storage method applied to the control node provided by the embodiments of this application.
In an eighth aspect, an embodiment of this application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the data storage method applied to the master server provided by the embodiments of this application.
In a ninth aspect, an embodiment of this application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the data storage method applied to the control node provided by the embodiments of this application.
In a tenth aspect, an embodiment of this application provides a computer program product containing instructions which, when run on a computer, causes the computer to execute the data storage method applied to the master server.
In an eleventh aspect, an embodiment of this application provides a computer program product containing instructions which, when run on a computer, causes the computer to execute the data storage method applied to the control node.
In a twelfth aspect, an embodiment of this application provides a computer program which, when run on a computer, causes the computer to execute the data storage method applied to the master server.
In a thirteenth aspect, an embodiment of this application provides a computer program which, when run on a computer, causes the computer to execute the data storage method applied to the control node.
In the data storage method, apparatus, system, server, control node, and computer-readable storage medium provided by the embodiments of this application, the control node sends a split preparation instruction to the master server to which the shard to be split belongs; the master server sends the split preparation instruction to the slave servers to which the shard belongs; each slave server obtains a data split point of the shard to be split and sends it to the master server. When the number of received data split points is greater than the first quantity threshold, the master server sends the data split point to the control node; after receiving the data split point, the control node sends a split storage instruction, and the master server and the slave servers split and store the shard to be split. Each slave server splits and stores the shard according to the data split point to obtain a target shard and then sends a first message to the master server; when the number of first messages is greater than the second quantity threshold, the master server splits and stores the shard according to the data split point to obtain the target shard. As can be seen, in the technical solutions provided by the embodiments of this application, the slave servers obtain the data split point of the shard to be split and send it to the master server, so that the master server and the slave servers can split and store the shard according to the same data split point and obtain the same target shards. During the split storage, the shard to be split can still be read and written, instead of being set to a non-writable state as in the prior art, which improves the availability of the storage system.
Brief Description of the Drawings
To describe the technical solutions of the embodiments of this application and of the prior art more clearly, the accompanying drawings needed in the embodiments and the prior art are briefly introduced below. Apparently, the drawings described below show only some embodiments of this application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a data storage method provided by an embodiment of this application;
Fig. 2 is another schematic flowchart of a data storage method provided by an embodiment of this application;
Fig. 3 is a schematic structural diagram of a data storage apparatus provided by an embodiment of this application;
Fig. 4 is another schematic structural diagram of a data storage apparatus provided by an embodiment of this application;
Fig. 5 is a schematic structural diagram of a data storage system provided by an embodiment of this application;
Fig. 6 is a schematic structural diagram of a master server provided by an embodiment of this application;
Fig. 7 is a schematic structural diagram of a control node provided by an embodiment of this application.
Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, this application is described in further detail below with reference to the drawings and embodiments. Apparently, the described embodiments are only some rather than all of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
The embodiments of this application provide a data storage method, apparatus, system, server, control node, and computer-readable storage medium. The data storage method applied to the master server to which a shard to be split belongs is described first.
It should be noted that the data storage method provided by the embodiments of this application is preferably applied to the master server to which the shard to be split belongs. The shard to be split is a shard awaiting split storage, and "master server" here is relative to the shard to be split: the same server may be a slave server of another shard. In the embodiments of this application, the master server to which the target shard belongs may be a server in KTS (Kingsoft Table Service), a fully managed NoSQL (Not Only SQL) database service that provides storage and real-time access for massive amounts of structured and semi-structured data. The master server to which the shard to be split belongs may be elected, using the Raft algorithm, from the servers that store the target shard. Raft is a consensus algorithm designed to be easier to understand.
Fig. 1 is a first schematic flowchart of the data storage method provided by an embodiment of this application; the method includes:
S101: receiving a split preparation instruction for the shard to be split sent by a control node.
Each shard corresponds to a data storage range, and different shards have different data storage ranges, i.e., different shards store different data. Users can access the data stored in each shard; the access frequencies of different shards may be the same or different, i.e., each shard corresponds to an access frequency. A shard whose access frequency is greater than a preset access frequency may be determined to be a shard to be split. The preset access frequency may be set according to actual conditions; the embodiments of this application do not specifically limit its value.
The control node may generate the split preparation instruction after receiving a split storage request sent by the master server to which the shard to be split belongs, or after receiving a manually sent split storage request, as detailed below.
In one implementation, the control node generates the split preparation instruction after receiving a split storage request sent by the master server to which the shard to be split belongs. Specifically, the master server of a shard may detect whether the amount of data stored in the shard is greater than a preset storage threshold and, if so, send a split storage request to the control node; the control node generates the split preparation instruction after receiving the request.
In another implementation, the control node generates the split preparation instruction after receiving a manually sent split storage request. Specifically, when an operator of the storage system finds a shard whose access frequency is greater than the preset access frequency, the operator may send a split storage request to the control node, and the control node generates the split preparation instruction upon receiving it.
The split storage request may contain identification information of the shard to be split, such as the shard's name or its ID (Identification).
After generating the split preparation instruction, the control node may send it to the master server to which the shard to be split belongs, so that the master server can perform the corresponding operations according to the instruction.
In the embodiments of this application, the master server may be elected by the Raft algorithm, and the servers other than the master server that store the shard to be split serve as slave servers. The master server can learn the identification information of the other servers that store the shard, such as their names or IP (Internet Protocol) addresses. After receiving the split preparation instruction, the master server determines the slave servers according to the previously obtained identification information and sends the split preparation instruction to them.
S102: sending the split preparation instruction to the slave servers to which the shard to be split belongs, so that after receiving the split preparation instruction, each slave server obtains a data split point of the shard to be split and sends the data split point to the master server.
After receiving the split preparation instruction, the master server may send it to the slave servers of the shard to be split; after receiving the instruction, a slave server can obtain the data split point of the shard.
In one implementation, the split preparation instruction carries the data split point, and a slave server extracts the data split point from the instruction after receiving it.
In another implementation, a slave server determines the data split point according to a preset data split rule. For example, the slave server may obtain the data storage range of the shard to be split and the amount of data stored for each primary key within that range, and then determine the data split point from these according to the preset data split rule.
A slave server can obtain the data storage range of the shard to be split in multiple ways: if the split preparation instruction contains the range, the range can be read directly from the instruction; if not, the slave server can scan the shard to obtain it. Other ways are of course also possible; the embodiments of this application do not specifically limit how the range is obtained. Furthermore, each piece of data in the shard corresponds to a primary key and has a fixed data size, so the amount of data for each primary key can be determined by scanning the data stored in the shard.
For example, the data split rule may be that the data storage ranges of the resulting shards correspond to equal amounts of data. Illustratively, suppose the split produces only two shards and the shard to be split is shard 1, which stores 10 GB of data with primary-key interval A-F (the key interval being shard 1's data storage range). If interval A-C corresponds to 4.95 GB and interval D-F to 5.05 GB, then with the split point defined as the right endpoint of one shard's key interval, the data split point is key C; with it defined as the left endpoint of the other shard's interval, the data split point is key D. Whichever split point is used, the data storage range of the new shard obtained by the split is D-F, and the post-split data storage range of the shard to be split is A-C. If the split produces three shards and the shard to be split is shard 2, which stores 15 GB with key interval H-N, where interval H-J corresponds to 4.95 GB, interval K-L to 5.05 GB, and interval M-N to 5 GB, the data split points may be considered to be keys J and L.
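The balanced two-way split described in the example above can be sketched as follows. This is a minimal illustrative sketch, not the patented implementation: the function name, the `(key, size)` list representation, and the "closest to half the total" tie-breaking rule are all assumptions introduced here.

```python
def choose_split_point(key_sizes):
    """Return the primary key that ends the lower half of a two-way split,
    chosen so that the lower half's data volume is as close as possible to
    half of the shard's total. `key_sizes` is a list of (key, bytes) pairs
    in ascending key order, covering the shard's data storage range."""
    total = sum(size for _, size in key_sizes)
    best_key, best_diff, acc = None, float("inf"), 0
    for key, size in key_sizes:
        acc += size
        diff = abs(acc - total / 2)  # imbalance if we cut right after this key
        if diff < best_diff:
            best_key, best_diff = key, diff
    return best_key
```

With the shard 1 figures from the example (A-C holding 4.95 GB, D-F holding 5.05 GB), this picks key C as the right endpoint of the lower interval, matching the text.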
Most distributed storage systems use key-value storage, in which data is naturally sorted in the lexicographic order of the keys. The key here is the primary key mentioned above: the candidate key chosen as the unique identifier of a table row. A primary key may consist of one field or of multiple fields, called a single-field or multi-field primary key respectively. A shard stores all the data of its corresponding primary-key interval.
The data split point contained in the split preparation instruction may be the split point in an instruction that the control node received from an operator, who may determine the split point according to the access pressure on the data of each primary key in the shard.
After obtaining the data split point, a slave server sends it to the master server, so that the master server can perform the subsequent operations after receiving it. In the embodiments of this application, for the same shard to be split, the data split points obtained by all the slave servers are the same.
S103: when the number of received data split points is greater than a first quantity threshold, sending the data split point to the control node, so that the control node sends a split storage instruction to the master server.
To keep the shard data obtained by the master server and the slave servers after split storage consistent, the master server performs the subsequent steps only when the number of received data split points is greater than the first quantity threshold, i.e., only when it can conclude that most of the slave servers have already sent their data split points. This guarantees, as far as possible, the consistency of each shard's data after split storage on the different servers.
The embodiments of this application do not specifically limit the first quantity threshold. For example, assuming each slave server obtains only one data split point, the first quantity threshold may be half the number of the master server's slave servers. Illustratively, with 5 slave servers the first quantity threshold may be 3: when the master server has received data split points from 4 slave servers, it performs the subsequent operations; after receiving a data split point from only 1 slave server, it keeps waiting until a timeout.
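The majority-style threshold just described can be sketched as a one-line check. This is a hedged sketch: the text leaves the exact threshold open, so "strictly more than half of the slave servers" is an assumption borrowed from the Raft-style majority the document alludes to.

```python
def quorum_reached(received_points, num_slaves):
    """Assumed rule: the master proceeds only once strictly more than half
    of its slave servers have reported a data split point."""
    return received_points > num_slaves / 2
```

Under this rule, with 5 slave servers the master proceeds once 3 or more split points have arrived and keeps waiting at 2 or fewer, consistent with the example's behavior.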
When the number of received data split points is greater than the first quantity threshold, the master server sends the data split point to the control node. One purpose of doing so is to notify the control node that the split preparation is complete and to trigger it to send the split storage instruction to the master server. Receiving the data split point means that the master server has responded to the split preparation instruction; at this point the control node needs to send the split storage instruction to the master server to trigger the subsequent steps.
S104: after receiving the split storage instruction, sending the split storage instruction to each slave server, so that each slave server splits and stores the shard to be split according to the data split point to obtain a target shard and, after obtaining the target shard, sends a first message to the master server, where the first message is a message indicating that the split storage is completed.
After receiving the split storage instruction, the master server sends it to each slave server; after receiving the instruction, a slave server splits and stores the shard to be split according to the data split point to obtain the target shard. The target shard here is the shard obtained by split-storing the shard to be split. For example, hard-link copying may be used for the split storage, which reduces the time the split storage consumes.
A hard link is one or more additional file names for one file. Through hard links, multiple file names, whether in the same directory or in different directories, can link to one file; after one of them is modified, all the files hard-linked to the modified file are modified as well. When a shard is split by hard-link copying, the resulting shard stores the same data as the shard to be split stored before the split storage. When a shard is split by data migration, the resulting shard may store either the same data that the shard to be split stored before the split storage or only part of it, depending on the split storage method; the embodiments of this application do not limit this.
A slave server records the relationship between the data storage range and the data storage location of each shard it stores, i.e., between each primary-key interval and the storage location of that interval's data. When a data read request or a data storage request arrives, the recorded range-location relationships determine which location to read from or store to. Once the split storage of the shard to be split is completed, one data storage range has been split into several, and one data storage location can correspond to only one data storage range. The range-location relationship previously recorded for the shard to be split must therefore be modified: the range in that record can be changed to one of the post-split ranges, and for each remaining post-split range a data storage location is determined and the relationship between the determined location and the range is recorded.
Illustratively, suppose the recorded relationship maps key interval A-C to location C1, and the post-split intervals are key A and interval B-C. After the A-C/C1 relationship is modified, key A maps to location C1; the storage location of interval B-C is determined to be C2, and the relationship between interval B-C and location C2 is recorded. This ensures that data can subsequently be read from the correct location and stored to the correct location. After a slave server has obtained the target shard, finished modifying the range-location records, and recorded the relationships between the post-split ranges and their locations, it sends the first message to the master server to notify it that the slave server's own split storage of the shard is complete.
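The record rewrite in the example above can be sketched as a small dictionary update. A minimal sketch under stated assumptions: ranges are closed intervals of single-character keys (so the successor of the split point is simply the next character), and the `range_map` dict, the function name, and the location labels are all introduced here for illustration.

```python
def update_range_map(range_map, old_range, split_point, new_location):
    """Rewrite the recorded range -> location entry after a split: the old
    entry keeps its location but is narrowed to end at the split point,
    and the remaining keys are mapped to the newly determined location."""
    lo, hi = old_range
    location = range_map.pop(old_range)
    range_map[(lo, split_point)] = location      # narrowed original entry
    next_key = chr(ord(split_point) + 1)         # first key after the split point
    range_map[(next_key, hi)] = new_location     # new entry for the upper range
    return range_map
```

Replaying the A-C example: splitting the A-C/C1 entry at key A with new location C2 yields key A at C1 and interval B-C at C2, as in the text.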
S105: when the number of received first messages is greater than a second quantity threshold, splitting and storing the shard to be split according to the data split point to obtain the target shard.
After completing its split storage, each slave server sends the message indicating completion of the split storage to the master server. When the number of received first messages is greater than the second quantity threshold, the master server performs the split storage operation on its own copy of the shard according to the data split point, using the same split storage method as the slave servers.
The second quantity threshold may be the same as or different from the first quantity threshold; the embodiments of this application do not specifically limit its value.
In the embodiments of this application, the control node sends a split preparation instruction to the master server to which the shard to be split belongs; the master server forwards it to the slave servers to which the shard belongs; each slave server obtains a data split point of the shard and sends it to the master server. When the number of received data split points is greater than the first quantity threshold, the master server sends the data split point to the control node; after receiving it, the control node sends a split storage instruction, and the master server and the slave servers split and store the shard. Each slave server splits and stores the shard according to the data split point to obtain the target shard and then sends a first message to the master server; when the number of first messages is greater than the second quantity threshold, the master server splits and stores the shard according to the data split point to obtain the target shard. As can be seen, in the technical solutions provided by the embodiments of this application, the slave servers obtain the data split point and send it to the master server, so that the master server and the slave servers split and store the shard according to the same data split point and obtain the same target shards. During the split storage, the shard to be split can still be read and written, instead of being set to a non-writable state as in the prior art, which improves the availability of the storage system. In addition, after receiving the split storage instruction, the master server and the slave servers perform the split storage by themselves without synchronous intervention by the control node, which reduces dependence on the control node, avoids as far as possible the downtime caused by performance bottlenecks at an over-relied-upon control node, and thereby improves the reliability of the distributed storage system.
In one embodiment of this application, after the shard to be split is split and stored according to the data split point to obtain the target shard when the number of received first messages is greater than the second quantity threshold, the method may further include:
sending the first message to the control node, so that after receiving the first message, the control node determines, according to the data split point and the data storage range of the shard to be split before the split storage, the data storage range of the target shard and a target data storage range; updates the pre-recorded data storage range corresponding to the shard to be split to the target data storage range; and records an association between the target shard and the data storage range determined for the target shard, where the target data storage range is the data storage range of the shard to be split after the split storage.
That is, after completing the split storage of the shard, the master server may send the first message to the control node, which, upon receiving it, can determine that the master server and the slave servers have completed the split storage of the shard. The control node then determines the target shard's data storage range according to the data split point and the pre-split range of the shard. For example, the control node first splits the pre-split range at the data split point to obtain the post-split ranges, then selects one of them as the post-split range of the shard to be split; the selected range is the data storage range of the shard to be split after the split storage, i.e., the target data storage range. The control node can then update the range in the association previously recorded for the shard, i.e., update the pre-recorded range corresponding to the shard to be split to the target data storage range, and at the same time select the target shard's data storage range from the remaining post-split ranges and record the association between the selected range and the target shard.
The selection of one range from the post-split ranges may be random or may follow a preset selection rule. Illustratively, the shard to be split is shard 3, whose data storage range is keys P-T, and the obtained data split point is key R; one post-split range is keys P-R and the other is keys S-T, and one of the two is the post-split range of the shard to be split. Which one it is must be determined by a preset rule: if the preset rule is that the range with the smaller right endpoint is the post-split range of the shard to be split, then keys P-R form the post-split range of shard 3; if the preset rule is that the range with the larger right endpoint is, then keys S-T form it.
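The range derivation in the shard 3 example can be sketched as follows. This is an illustrative sketch under stated assumptions: the split point is taken as the right endpoint of the lower interval, keys are single characters (so the upper interval starts at the next character), and the "smaller right endpoint stays with the shard to be split" rule is one of the two preset rules the text allows.

```python
def derive_ranges(pre_split_range, split_point):
    """Split the pre-split range at the split point and return
    (target_data_range, target_shard_range): under the assumed rule, the
    shard to be split keeps the lower interval (smaller right endpoint)
    and the newly created target shard receives the upper interval."""
    lo, hi = pre_split_range
    lower = (lo, split_point)                 # post-split range of the old shard
    upper = (chr(ord(split_point) + 1), hi)   # range recorded for the target shard
    return lower, upper
```

For shard 3 with range P-T and split point R, this returns P-R as the target data storage range and S-T as the target shard's range, matching the example.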
In the embodiments of this application, the post-split data storage range of the shard to be split is contained within its pre-split data storage range and is smaller than that pre-split range.
Updating the post-split data storage range of the shard to be split and establishing the association between the target shard and the range determined for it ensure that subsequent data can be stored into the correct shard and that correct data can be read in response to subsequent data read requests.
In one embodiment of this application, when the data stored in the target shard is the same as the data stored in the shard to be split, the method may further include:
obtaining the association between each shard and its data storage range;
judging, according to the obtained associations, whether any locally stored shard stores data that is not within that shard's data storage range, and if so, deleting the data that is not within the data storage range of that shard.
In one case, the target shard stores the same data as the shard to be split, meaning the target shard contains redundant data; in another case, the shard to be split stores the same data after the split storage as before it, meaning the post-split shard contains redundant data.
To save storage resources, and also to reduce the possibility that an excessive amount of redundant data causes the shard to be split and stored again, the redundant data in each shard must be deleted. This may be done as follows: first obtain the associations between shards and data storage ranges (the control node may store these associations); then, according to the obtained associations, judge whether each locally stored shard contains data that is not within that shard's range, i.e., whether redundant data is stored locally. For example, check one by one whether each primary key stored in a shard lies within the shard's data storage range. If some keys do not, the shard contains redundant data, the redundant data being the data of the primary keys outside the shard's range, and that data can be deleted, i.e., the data not within the shard's data storage range is deleted. If all keys lie within the range, the shard contains no redundant data.
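The redundant-data cleanup just described can be sketched as a filter over the shard's keys. A minimal sketch: representing a shard as a dict of primary key to value, and its range as a closed interval of comparable keys, are assumptions made here for illustration.

```python
def prune_shard(shard_data, key_range):
    """Delete entries whose primary key falls outside the shard's data
    storage range. After a hard-link split both resulting shards initially
    hold identical data, so each must drop the keys the other now owns."""
    lo, hi = key_range
    redundant = [k for k in shard_data if not (lo <= k <= hi)]
    for k in redundant:
        del shard_data[k]
    return shard_data
```

For example, a post-split shard owning range A-B that still holds a key D entry keeps A and B and drops D.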
In the embodiments of this application, the shard-to-range associations may be obtained periodically or after the split storage is completed. Deleting a shard's redundant data saves storage resources, reduces the possibility of the shard being split and stored again because of an excessive amount of redundant data, and also reduces the number of split storage operations performed by the servers to which the shard belongs.
In one embodiment of this application, the method may further include:
obtaining a data storage request for the shard to be split, where the data storage request contains data to be stored;
if the split storage instruction has not been received, sending the data storage request to the slave servers, so that each slave server stores the data to be stored into the shard to be split and sends a second message to the master server, where the second message is a message indicating that the data storage is completed; and, when the number of received second messages is greater than the second quantity threshold, storing the data to be stored into the shard to be split;
if the split storage instruction has been received, judging whether the first message has already been sent to the control node; if so, sending the data storage request to the slave servers, so that each slave server determines, from the target shard and the post-split shard to be split, the shard that stores the data to be stored, stores the data into the determined shard, and sends the second message to the master server; and, when the number of received second messages is greater than the second quantity threshold, determining, from the target shard and the post-split shard to be split, the shard that stores the data to be stored and storing the data into the determined shard; if not, performing the step of sending the data storage request to the slave servers after sending the first message to the control node.
In the embodiments of this application, data storage and shard split storage proceed in parallel; the split storage of a shard does not affect the reception of data storage requests. If the master server has not yet received the split storage instruction when a data storage request arrives, i.e., the shard's split storage has not yet begun, it may send the data storage request to the slave servers. After receiving the request, a slave server stores the data to be stored into the shard to be split and, after completing the storage, sends the second message, i.e., the message indicating that the data storage is completed, to the master server. When the number of received second messages is greater than the second quantity threshold, the master server stores the data into its own copy of the shard to be split. If the master server receives the split storage instruction during this storage process, then, since a write operation is in progress, the master server records the split storage instruction, for example by writing it into a metadata store, and sends the split storage instruction to each slave server only after the write operation completes.
If the master server receives a data storage request after receiving the split storage instruction, the shard to be split is either in the split-storage-in-progress state, in which data storage into it cannot proceed, or in the split-storage-completed state, in which it can. It is therefore necessary to judge whether the master server has sent the first message to the control node: if the first message has been sent, the master server and the slave servers have completed the split storage of the shard. The data storage request can then be sent to the slave servers, and each slave server determines, from the target shard and the post-split shard, the shard that should store the data. The determination can be made as follows: obtain the primary key of the data to be stored and match it against the post-split range of the shard to be split; if the match succeeds, store the data into the post-split shard; if it fails and there is only one target shard, store the data into the target shard; if there is more than one target shard, store the data into the target shard whose data storage range matches the obtained key. After the shard is determined, the data to be stored is stored into it.
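The routing rule just described, match the write's primary key against the shard ranges, can be sketched as follows. A minimal illustrative sketch: the `shard_ranges` mapping of shard id to closed key interval, and the function name, are assumptions introduced here.

```python
def route_write(key, shard_ranges):
    """Pick the shard whose data storage range contains the write's
    primary key; `shard_ranges` maps shard id -> (lo, hi) closed interval.
    Returns None if no recorded range contains the key."""
    for shard_id, (lo, hi) in shard_ranges.items():
        if lo <= key <= hi:
            return shard_id
    return None
```

For example, after a split leaving the old shard with range A-C and the target shard with range D-F, a write with key E is routed to the target shard and a write with key B to the post-split old shard.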
After storing the data to be stored into the determined shard, a slave server sends the message indicating that the data storage is completed, i.e., the second message, to the master server. When the number of received second messages is greater than the second quantity threshold, the master server performs the data storage operation, following the same principle as the data storage operation performed by the slave servers. Between the master server's receipt of the split preparation instruction and its receipt of the split storage instruction, the execution of data storage requests is not affected.
With the embodiments of this application, when both the target shard and the post-split shard to be split exist, the data to be stored must be routed to the target shard or to the post-split shard. This trades a small amount of performance for not having to set the shard being split to a non-writable state: read and write operations remain supported during the split storage, which improves the availability of the distributed storage system.
Fig. 2 is another schematic flowchart of the data storage method provided by an embodiment of this application. This data storage method is applied to a control node and includes:
S201: sending a split preparation instruction for the shard to be split to the master server to which the shard belongs, so that after receiving the split preparation instruction, the master server sends it to the slave servers to which the shard belongs and, when the number of received data split points is greater than a first quantity threshold, sends the data split point to the control node.
The control node may generate the split preparation instruction after receiving a split storage request sent by the master server of any shard, that shard being the shard to be split; the control node may also generate the split preparation instruction after receiving an operator's instruction for the split storage of a shard.
The control node sends the split preparation instruction to the master server to which the shard to be split belongs, and the master server sends it to the slave servers to which the shard belongs; upon receiving the split preparation instruction, a slave server obtains the data split point and sends it to the master server. When the number of received data split points is greater than the first quantity threshold, the master server sends the data split point to the control node. This process has been described in detail in S101-S103 and is not repeated here.
S202: after receiving the data split point, sending a split storage instruction to the master server, so that after receiving the split storage instruction, the master server sends it to each slave server and, when the number of received first messages is greater than a second quantity threshold, splits and stores the shard to be split according to the data split point to obtain the target shard.
After receiving the data split point, the control node can determine that the master server and the slave servers are ready to begin the split storage of the shard, and it sends the split storage instruction to the master server. After receiving it, the master server sends the split storage instruction to the slave servers. Upon receiving the instruction, a slave server splits and stores the shard to be split to obtain the target shard and, after obtaining the target shard, sends the first message to the master server. When the number of received first messages is greater than the second quantity threshold, the master server performs the split storage operation and splits and stores the shard to obtain the target shard. This process has been described in detail in S103-S104 and is not repeated here.
In the embodiments of this application, as in the method described above, the slave servers obtain the data split point of the shard to be split and send it to the master server, so that the master server and the slave servers split and store the shard according to the same data split point and obtain the same target shards; during the split storage, the shard can still be read and written instead of being set to a non-writable state as in the prior art, which improves the availability of the storage system. In addition, after receiving the split storage instruction, the master server and the slave servers perform the split storage by themselves without synchronous intervention by the control node, which reduces dependence on the control node, avoids as far as possible the downtime caused by control-node performance bottlenecks, and improves the reliability of the distributed storage system.
In one embodiment of this application, after the split storage instruction is sent to the master server upon receiving the data split point, the method may further include:
after receiving the first message sent by the master server, determining, according to the data split point and the data storage range corresponding to the shard to be split before the split storage, the data storage range of the target shard and a target data storage range;
updating the pre-recorded data storage range corresponding to the shard to be split to the target data storage range, and recording an association between the target shard and the data storage range determined for the target shard, where the target data storage range is the data storage range of the shard to be split after the split storage.
After receiving the first message, the control node can determine that the master server and the slave servers have completed the split storage of the shard. It then determines the target shard's data storage range according to the data split point and the pre-split range of the shard. For example, it may select one of the post-split ranges as the post-split range of the shard to be split; the selected range is the shard's data storage range after the split storage, i.e., the target data storage range. The control node can then update the range in the association previously recorded for the shard, i.e., update the pre-recorded range corresponding to the shard to be split to the target data storage range, and at the same time select the target shard's data storage range from the remaining post-split ranges and record the association between the selected range and the target shard.
In the embodiments of this application, updating the post-split data storage range of the shard to be split and establishing the association between the target shard and the range determined for it ensure that subsequent data can be stored into the correct shard and that correct data can be read in response to subsequent data read requests.
In one embodiment of this application, the method may further include:
sending, to each server, a collection instruction for the data storage range of each shard, so that after receiving the collection instruction, each server obtains the data storage range of each shard it stores and sends the obtained data storage ranges to the control node;
after receiving the data storage ranges sent by the servers, judging, for each shard, whether the state pre-recorded for the shard is the split-storage-in-progress state;
if it is the split-storage-in-progress state, judging whether the pre-recorded data storage range of the shard is the same as the obtained data storage range of the shard;
if they are the same, sending a split storage instruction to the master server to which the shard belongs;
if they are different, changing the state pre-recorded for the shard to the split-storage-completed state.
In the embodiments of this application, when sending the split instruction to the master server, the control node changes the state of the shard to be split from the split-completed state to the split-storage-in-progress state. After receiving the first message sent by the master server, the control node changes the state of the post-split shard from the split-storage-in-progress state back to the split-completed state, and records the state of the target shard obtained by the split as the split-completed state.
After a failure occurs and is repaired, the control node sends all servers a collection instruction for the data storage ranges of their shards. The collection instruction is used to collect the data storage range of each shard on each server. After receiving the range of each shard reported by each server, the control node judges, for each shard, whether the state pre-recorded for the shard is the split-storage-in-progress state. If it is, a further judgment is needed before the next step, namely whether the pre-recorded range of the shard is the same as the obtained range. If the recorded state is the split-completed state, the shard either currently needs no split storage or its split storage has already completed; in either case, the range recorded by the control node for the shard need not be modified. If the ranges are the same and the state recorded for the shard is the split-storage-in-progress state, the master server and the slave servers have not yet completed the shard's split storage; to ensure that the state recorded by the control node matches the shard's actual state, a split storage instruction must be sent to the master server to which the shard belongs so as to carry out the shard's split storage.
If the state the control node recorded for the shard is the split-storage-in-progress state but the shard's recorded range differs from its actual range, the master server and/or the slave servers have already completed the shard's split storage; to make the recorded state of the shard match its actual state, the state recorded by the control node must be changed to the split-completed state.
With the embodiments of this application, the control node can record shard states and, when recovery from a failure completes, collect the data storage ranges of the shards stored on the servers. Based on the recorded states, the recorded shard ranges, and the collected shard ranges, it determines the instruction that needs to be sent, sends the instruction to the master server to which the shard belongs, and continues the data storage operation, allowing the distributed storage system to recover losslessly after an abnormal exit.
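The recovery decision described above can be sketched as a small pure function. An illustrative sketch only: the state names and action strings are labels introduced here, not identifiers from the patent.

```python
def recovery_action(recorded_state, recorded_range, collected_range):
    """Control-node recovery rule: only shards recorded as splitting need
    attention. If the collected range still equals the recorded one, the
    split never happened, so the split storage instruction is resent;
    otherwise the split already completed and the state is updated."""
    if recorded_state != "splitting":
        return "none"
    if collected_range == recorded_range:
        return "resend_split_instruction"
    return "mark_split_complete"
```

For a shard recorded as splitting with range A-C: if the servers still report A-C, the instruction is resent; if they report the narrowed range A-B, the recorded state is simply marked complete.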
An embodiment of this application further provides a data storage apparatus applied to a master server to which a shard to be split belongs. As shown in Fig. 3, the apparatus includes:
a receiving module 301 configured to receive a split preparation instruction for the shard to be split sent by a control node;
a first sending module 302 configured to send the split preparation instruction to the slave servers to which the shard belongs, so that after receiving the split preparation instruction, each slave server obtains a data split point of the shard and sends the data split point to the master server;
a second sending module 303 configured to, when the number of received data split points is greater than a first quantity threshold, send the data split point to the control node, so that the control node sends a split storage instruction to the master server;
a third sending module 304 configured to, after receiving the split storage instruction, send the split storage instruction to each slave server, so that each slave server splits and stores the shard according to the data split point to obtain a target shard and, after obtaining the target shard, sends a first message to the master server, where the first message is a message indicating that the split storage is completed;
a splitting module 305 configured to, when the number of received first messages is greater than a second quantity threshold, split and store the shard according to the data split point to obtain the target shard.
In the embodiments of this application, as in the method described above, the slave servers obtain the data split point of the shard to be split and send it to the master server, so that the master server and the slave servers split and store the shard according to the same data split point and obtain the same target shards; during the split storage, the shard can still be read and written instead of being set to a non-writable state as in the prior art, which improves the availability of the storage system. In addition, after receiving the split storage instruction, the master server and the slave servers perform the split storage by themselves without synchronous intervention by the control node, which reduces dependence on the control node, avoids as far as possible the downtime caused by control-node performance bottlenecks, and improves the reliability of the distributed storage system.
In one embodiment of this application, the apparatus may further include:
a fourth sending module configured to send the first message to the control node, so that after receiving the first message, the control node determines, according to the data split point and the data storage range of the shard before the split storage, the data storage range of the target shard and a target data storage range; updates the pre-recorded data storage range corresponding to the shard to the target data storage range; and records an association between the target shard and the data storage range determined for the target shard, where the target data storage range is the data storage range of the shard after the split storage.
In one embodiment of this application, the apparatus may further include:
a first obtaining module configured to, when the data stored in the target shard is the same as the data stored in the shard to be split, obtain the association between each shard and its data storage range;
a first judging module configured to judge, according to the obtained associations, whether any locally stored shard stores data that is not within that shard's data storage range;
a deletion module configured to, when the judgment result of the first judging module is yes, delete the data that is not within the data storage range of that shard.
In one embodiment of this application, the apparatus may further include:
a second obtaining module configured to obtain a data storage request for the shard to be split, where the data storage request contains data to be stored;
a fifth sending module configured to, if the split storage instruction has not been received, send the data storage request to the slave servers, so that each slave server stores the data into the shard to be split and sends a second message to the master server, where the second message is a message indicating that the data storage is completed;
a first storage module configured to, when the number of received second messages is greater than the second quantity threshold, store the data into the shard to be split;
a second judging module configured to, if the split storage instruction has been received, judge whether the first message has already been sent to the control node;
a sixth sending module configured to, when the judgment result of the second judging module is yes, send the data storage request to the slave servers, so that each slave server determines, from the target shard and the post-split shard, the shard that stores the data, stores the data into the determined shard, and sends the second message to the master server;
a second storage module configured to, when the number of received second messages is greater than the second quantity threshold, determine, from the target shard and the post-split shard, the shard that stores the data, and store the data into the determined shard;
the sixth sending module being further configured to, when the judgment result of the second judging module is no, send the data storage request to the slave servers after the first message is sent to the control node.
本申请实施例还提供了一种数据存储装置,该数据存储装置应用于控制节点,参见图4所示,装置包括:
第七发送模块401,设置为向待分裂分片所属的主服务器发送针对待分裂分片的准备拆分指令,以使主服务器在接收到准备拆分指令后,将准备拆分指令发送给待分裂分片所属的从服务器,并在接收到数据拆分点的数量大于第一数量阈值时,将数据拆分点发送给控制节点;
第八发送模块402,设置为在接收到数据拆分点后,向主服务器发送拆分存储指令,以使主服务器在接收到拆分存储指令后,将拆分存储指令发送给每一从服务器,并在将接收到所述第一消息的数量大于第二数量阈值时,根据数据拆分点,将待分裂分片进行拆分存储,得到目标分片。
在本申请实施例中,控制节点向待分裂分片所属的主服务器发送准备拆分指令,主服务器将准备拆分指令发送给待分裂分片所属的从服务器,从服务器获得待分裂分片的数据拆分点,并将该数据拆分点发送给主服务器;主服务器在接收到的数据拆分点的数量大于第一数量阈值时,将该数据拆分点发送给控制节点,控制节点在接收到数据拆分点之后,发送拆分存储指令,主服务器和从服务器对待分裂分片进行拆分存储,并且,每一从服务器根据数据拆分点,将待分裂分片进行拆分存储,得到目标分片后,向主服务器发 送第一消息,在第一消息的数量大于第二数量阈值时,主服务器根据数据拆分点,将待分裂分片进行拆分存储,得到目标分片。可见,本申请实施例提供的技术方案中,从服务器获得待分裂分片的数据拆分点,并将数据拆分点发送给主服务器,这样,主服务器和从服务器可以根据相同的数据拆分点对待分裂分片进行拆分存储,可以得到相同的目标分片,在拆分存储过程中,可以继续对待分裂分片进行读写,而不像现有技术那样将待分裂分片的状态设置为不可写状态,从而提高了存储系统的可用性。另外,主服务器和从服务器在获得拆分存储指令之后自行进行分片的拆分存储,不需要控制节点的同步介入,减少了对控制节点的依赖,尽可能避免因对控制节点依赖过大而造成控制节点出现性能瓶颈而产生宕机,提高分布式存储系统的可靠性。
在本申请的一个实施方式中,该装置还可以包括:
确定模块,设置为在接收到主服务器发送的第一消息后,根据数据拆分点和拆分存储之前的待分裂分片对应的数据存储范围,确定目标分片的数据存储范围和目标数据存储范围;
第一修改模块,设置为将预先记录的待分裂分片对应的数据存储范围更新为所述目标数据存储范围,并记录目标分片与针对目标分片所确定的数据存储范围之间的关联关系,其中,目标数据存储范围为拆分存储之后的待分裂分片的数据存储范围。
在本申请的一个实施方式中,该装置还可以包括:
第九发送模块,设置为向各个服务器发送针对每一分片的数据存储范围的收集指令,以使每一服务器在接收到收集指令后,获得自身所存储的每一分片的数据存储范围,向控制节点发送所获得的数据存储范围;其中,服务器包括主服务器和从服务器;
第三判断模块,设置为接收到各个服务器发送的数据存储范围后,针对每一分片,判断预先针对该分片记录的状态是否为拆分存储中状态;
第四判断模块,设置为在第三判断模块的判断结果为是的情况下,判断预先记录的该分片的数据存储范围是否与获得的该分片的数据存储范围相同;
第十发送模块,设置为在第四判断模块的判断结果为相同的情况下,则向该分片所属的主服务器发送拆分存储指令;
第二修改模块，设置为在第四判断模块的判断结果为不相同的情况下，将预先针对该分片记录的状态修改为拆分存储完成状态。
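第九发送模块至第二修改模块所描述的巡检逻辑——收集各分片的实际数据存储范围，与预先记录的范围比对，据此重发拆分存储指令或更新分片状态——可以用如下示意代码表示（状态名与返回的动作标识均为假设）：

```python
SPLITTING, DONE = "splitting", "done"   # 拆分存储中状态 / 拆分存储完成状态

def check_shard(state, recorded_range, reported_range):
    """返回 (新状态, 需要执行的动作或 None)。

    范围未变化，说明拆分尚未真正执行，需重发拆分存储指令；
    范围已变化，说明拆分已完成，将状态改为拆分存储完成状态。"""
    if state != SPLITTING:
        return state, None
    if reported_range == recorded_range:
        return SPLITTING, "resend_split_instruction"
    return DONE, None
```

该巡检使控制节点无需同步参与拆分过程，也能最终收敛各分片的状态记录。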
本申请实施例还提供了一种数据存储系统，参见图5所示，系统包括控制节点501、待分裂分片所属的主服务器502和待分裂分片所属的从服务器503，其中，
控制节点501,设置为向主服务器502发送针对待分裂分片的准备拆分指令;
主服务器502，设置为在接收到准备拆分指令后，将所述准备拆分指令发送给所述待分裂分片所属的从服务器503；
从服务器503，设置为在接收到准备拆分指令后，获得待分裂分片的数据拆分点，并将数据拆分点发送给主服务器502；
主服务器502,还设置为在接收到数据拆分点的数量大于第一数量阈值时,将数据拆分点发送给控制节点501;
控制节点501，还设置为在接收到主服务器502发送的数据拆分点后，向主服务器502发送拆分存储指令；
主服务器502，还设置为在接收到拆分存储指令后，将拆分存储指令发送给每一从服务器503；
从服务器503，还设置为根据数据拆分点，将待分裂分片进行拆分存储，得到目标分片，在得到目标分片后，向主服务器502发送第一消息，其中，第一消息为拆分存储完成的消息；
主服务器502,还设置为在接收到的所述第一消息的数量大于第二数量阈值时,根据数据拆分点,将待分裂分片进行拆分存储,得到目标分片。
本申请实施例的有益效果与前述实施例相同，此处不再赘述。
在本申请的一个实施方式中,主服务器,还设置为向控制节点发送第一消息;
控制节点,还设置为在接收到第一消息后,根据数据拆分点和拆分存储之前的待分裂分片的数据存储范围,确定目标分片的数据存储范围和目标数据存储范围;将预先记录的待分裂分片对应的数据存储范围更新为所述目标数据存储范围,并记录目标分片与针对目标分片所确定的数据存储范围之间的关联关系,其中,目标数据存储范围为拆分存储之后的待分裂分片的数据存储范围。
在本申请的一个实施方式中,主服务器,还设置为在目标分片中所存储的数据与待分裂分片所存储的数据相同的情况下,获得每一分片与数据存储范围之间的关联关系;根据所获得的关联关系,判断自身所存储的每一分片是否存储了不在该分片数据存储范围内的数据;如果是,删除未在该分片的数据存储范围内的数据。
在本申请的一个实施方式中，主服务器，还设置为获得针对待分裂分片的数据存储请求，其中，数据存储请求中包含待存储数据；在未接收到拆分存储指令的情况下，将数据存储请求发送给从服务器；
从服务器,还设置为将待存储数据存储至待分裂分片;并将第二消息发送给主服务器;其中,第二消息为数据存储完成的消息;
主服务器，还设置为在接收到的所述第二消息的数量大于第二数量阈值时，将待存储数据存储至待分裂分片；
主服务器,还设置为在接收到拆分存储指令的情况下,判断当前是否已经向控制节点发送第一消息;如果是,将数据存储请求发送给从服务器;
从服务器,还设置为从目标分片和拆分存储之后的待分裂分片中,确定存储待存储数据的分片,并将待存储数据存储至所确定的分片,并将第二消息发送给主服务器;
主服务器,还设置为在接收到所述第二消息的数量大于第二数量阈值时,从目标分片和拆分存储之后的待分裂分片中,确定存储待存储数据的分片,并将待存储数据存储至所确定的分片;如果否,在向控制节点发送第一消息后,执行将数据存储请求发送给从服务器的步骤。
在本申请的一个实施方式中，控制节点，还设置为在接收到主服务器发送的第一消息后，根据数据拆分点和拆分存储之前的待分裂分片对应的数据存储范围，确定目标分片的数据存储范围和目标数据存储范围；将预先记录的待分裂分片对应的数据存储范围更新为所述目标数据存储范围，并记录目标分片与针对目标分片所确定的数据存储范围之间的关联关系，其中，目标数据存储范围为拆分存储之后的待分裂分片的数据存储范围。
在本申请的一个实施方式中,控制节点,还设置为向各个服务器发送针对每一分片的数据存储范围的收集指令;其中,服务器包括主服务器和从服务器;
服务器,设置为在接收到收集指令后,获得自身所存储的每一分片的数据存储范围,向控制节点发送所获得的数据存储范围;
控制节点,还设置为接收到各个服务器发送的数据存储范围后,针对每一分片,判断预先针对该分片记录的状态是否为拆分存储中状态;如果为拆分存储中状态,判断预先记录的该分片的数据存储范围是否与获得的该分片的数据存储范围相同;如果相同,则向该分片所属的主服务器发送拆分存储指令;如果不相同,将预先针对该分片记录的状态修改为拆分存储完成状态。
本申请实施例还提供了一种主服务器,如图6所示,包括处理器601和存储器602,其中,
存储器602,设置为存放计算机程序;
处理器601,设置为执行存储器602上所存放的程序时,实现如下步骤:
接收控制节点发送的针对待分裂分片的准备拆分指令;
将所述准备拆分指令发送给所述待分裂分片所属的从服务器,以使每一所述从服务器在接收到所述准备拆分指令后,获得所述待分裂分片的数据拆分点,并将所述数据拆分点发送给所述主服务器;
在接收到数据拆分点的数量大于第一数量阈值时,将所述数据拆分点发送给所述控制节点,以使所述控制节点向所述主服务器发送拆分存储指令;
在接收到所述拆分存储指令后,将所述拆分存储指令发送给每一所述从服务器,以使每一所述从服务器根据数据拆分点,将所述待分裂分片进行拆分存储,得到目标分片,在得到目标分片后,向所述主服务器发送第一消息,其中,所述第一消息为拆分存储完成的消息;
在接收到的所述第一消息的数量大于第二数量阈值时,根据数据拆分点,将所述待分裂分片进行拆分存储,得到所述目标分片。
存储器可以包括随机存取存储器(Random Access Memory,RAM),也可以包括非易失性存储器(Non-Volatile Memory,NVM),例如至少一个磁盘存储器。可选的,存储器还可以是至少一个位于远离前述处理器的存储装置。
上述的处理器可以是通用处理器，包括中央处理器（Central Processing Unit，CPU）、网络处理器（Network Processor，NP）等；还可以是数字信号处理器（Digital Signal Processor，DSP）、专用集成电路（Application Specific Integrated Circuit，ASIC）、现场可编程门阵列（Field-Programmable Gate Array，FPGA）或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。
本申请实施例的有益效果与前述实施例相同，此处不再赘述。
本申请实施例还提供了一种控制节点,如图7所示,包括处理器701和存储器702,其中,
存储器702,设置为存放计算机程序;
处理器701,设置为执行存储器702上所存放的程序时,实现如下步骤:
向待分裂分片所属的主服务器发送针对待分裂分片的准备拆分指令,以使所述主服务器在接收到准备拆分指令后,将所述准备拆分指令发送给所述待分裂分片所属的从服务器,并在接收到数据拆分点的数量大于第一数量阈值时,将所述数据拆分点发送给所述控制节点;
在接收到所述数据拆分点后,向所述主服务器发送拆分存储指令,以使所述主服务器在接收到所述拆分存储指令后,将所述拆分存储指令发送给每一所述从服务器,并在接收到所述第一消息的数量大于第二数量阈值时,根据数据拆分点,将所述待分裂分片进行拆分存储,得到所述目标分片。
本申请实施例的有益效果与前述实施例相同，此处不再赘述。
在本申请的又一实施例中,还提供了一种计算机可读存储介质,计算机可读存储介质内存储有计算机程序,计算机程序被处理器执行时实现上述实施例中应用于主服务器的任一的数据存储方法。
本申请实施例的有益效果与前述实施例相同，此处不再赘述。
在本申请的又一实施例中,还提供了一种计算机可读存储介质,计算机可读存储介质内存储有计算机程序,计算机程序被处理器执行时实现上述实施例中应用于控制节点的任一的数据存储方法。
本申请实施例的有益效果与前述实施例相同，此处不再赘述。
在本申请的又一实施例中,还提供了一种包含指令的计算机程序产品,当其在计算机上运行时,使得计算机执行上述实施例中应用于主服务器的任一数据存储方法。
本申请实施例的有益效果与前述实施例相同，此处不再赘述。
在本申请的又一实施例中,还提供了一种包含指令的计算机程序产品,当其在计算机上运行时,使得计算机执行上述实施例中应用于控制节点的任一数据存储方法。
本申请实施例的有益效果与前述实施例相同，此处不再赘述。
在本申请的又一实施例中,还提供了一种计算机程序,当其在计算机上运行时,使得计算机执行上述实施例中应用于主服务器的任一数据存储方法。
本申请实施例的有益效果与前述实施例相同，此处不再赘述。
在本申请的又一实施例中,还提供了一种计算机程序,当其在计算机上运行时,使得计算机执行上述实施例中应用于控制节点的任一数据存储方法。
本申请实施例的有益效果与前述实施例相同，此处不再赘述。
对于装置/系统/主服务器/控制节点/计算机可读存储介质实施例而言,由于其基本相似于方法实施例,所以描述的比较简单,相关之处参见方法实施例的部分说明即可。
需要说明的是,在本文中,诸如第一和第二等之类的关系术语仅仅用来将一个实体或者操作与另一个实体或操作区分开来,而不一定要求或者暗示这些实体或操作之间存在任何这种实际的关系或者顺序。而且,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者设备不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者设备所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括要素的过程、方法、物品或者设备中还存在另外的相同要素。
本说明书中的各个实施例均采用相关的方式描述,各个实施例之间相同相似的部分互相参见即可,每个实施例重点说明的都是与其他实施例的不同之处。尤其,对于系统实施例而言,由于其基本相似于方法实施例,所以描述的比较简单,相关之处参见方法实施例的部分说明即可。
以上仅为本申请的较佳实施例而已，并非用于限定本申请的保护范围。凡在本申请的精神和原则之内所作的任何修改、等同替换、改进等，均包含在本申请的保护范围内。
工业实用性
基于本申请实施例提供的上述技术,从服务器获得待分裂分片的数据拆分点,并将数据拆分点发送给主服务器,这样,主服务器和从服务器可以根据相同的数据拆分点对待分裂分片进行拆分存储,可以得到相同的目标分片,在拆分存储过程中,可以继续对待分裂分片进行读写,而不像现有技术那样将待分裂分片的状态设置为不可写状态,从而提高了存储系统的可用性。

Claims (16)

  1. 一种数据存储方法,应用于待分裂分片所属的主服务器,所述方法包括:
    接收控制节点发送的针对所述待分裂分片的准备拆分指令;
    将所述准备拆分指令发送给所述待分裂分片所属的从服务器,以使每一所述从服务器在接收到所述准备拆分指令后,获得所述待分裂分片的数据拆分点,并将所述数据拆分点发送给所述主服务器;
    在接收到数据拆分点的数量大于第一数量阈值时,将所述数据拆分点发送给所述控制节点,以使所述控制节点向所述主服务器发送拆分存储指令;
    在接收到所述拆分存储指令后,将所述拆分存储指令发送给每一所述从服务器,以使每一所述从服务器根据所述数据拆分点,将所述待分裂分片进行拆分存储,得到目标分片,在得到所述目标分片后,向所述主服务器发送第一消息,其中,所述第一消息为拆分存储完成的消息;
    在接收到的所述第一消息的数量大于第二数量阈值时,根据所述数据拆分点,将所述待分裂分片进行拆分存储,得到所述目标分片。
  2. 根据权利要求1所述的方法,其中,在所述在接收到的所述第一消息的数量大于第二数量阈值时,根据所述数据拆分点,将所述待分裂分片进行拆分存储,得到所述目标分片之后,所述方法还包括:
    向所述控制节点发送所述第一消息,以使所述控制节点在接收到所述第一消息后,根据所述数据拆分点和拆分存储之前的所述待分裂分片的数据存储范围,确定所述目标分片的数据存储范围和目标数据存储范围;将预先记录的待分裂分片对应的数据存储范围更新为所述目标数据存储范围,并记录所述目标分片与针对所述目标分片所确定的数据存储范围之间的关联关系,其中,所述目标数据存储范围为拆分存储之后的所述待分裂分片的数据存储范围。
  3. 根据权利要求2所述的方法,其中,在所述目标分片中所存储的数据与所述待分裂分片所存储的数据相同的情况下,所述方法还包括:
    获得每一分片与数据存储范围之间的关联关系;
    根据所获得的关联关系，判断自身所存储的每一分片是否存储了不在该分片数据存储范围内的数据；如果是，删除未在该分片的数据存储范围内的数据。
  4. 根据权利要求1所述的方法,其中,所述方法还包括:
    获得针对所述待分裂分片的数据存储请求,其中,所述数据存储请求中包含待存储数据;
    在未接收到所述拆分存储指令的情况下,将所述数据存储请求发送给所述从服务器,以使得每一所述从服务器将所述待存储数据存储至所述待分裂分片,并将第二消息发送给所述主服务器;其中,所述第二消息为数据存储完成的消息;在接收到所述第二消息的数量大于第二数量阈值时,将所述待存储数据存储至所述待分裂分片;
    在接收到所述拆分存储指令的情况下,判断当前是否已经向所述控制节点发送所述第一消息;如果是,将所述数据存储请求发送给所述从服务器,以使得每一所述从服务器从所述目标分片和拆分存储之后的待分裂分片中,确定存储所述待存储数据的分片,并将所述待存储数据存储至所确定的分片,并将所述第二消息发送给所述主服务器;在接收到所述第二消息的数量大于第二数量阈值时,从所述目标分片和拆分存储之后的待分裂分片中,确定存储所述待存储数据的分片,并将所述待存储数据存储至所确定的分片;如果否,在向所述控制节点发送所述第一消息后,执行将所述数据存储请求发送给所述从服务器的步骤。
  5. 一种数据存储方法,应用于控制节点,所述方法包括:
    向待分裂分片所属的主服务器发送针对待分裂分片的准备拆分指令,以使所述主服务器在接收到所述准备拆分指令后,将所述准备拆分指令发送给所述待分裂分片所属的从服务器,并在接收到数据拆分点的数量大于第一数量阈值时,将所述数据拆分点发送给所述控制节点;
    在接收到所述数据拆分点后,向所述主服务器发送拆分存储指令,以使所述主服务器在接收到所述拆分存储指令后,将所述拆分存储指令发送给每一所述从服务器,并在接收到所述第一消息的数量大于第二数量阈值时,根据所述数据拆分点,将所述待分裂分片进行拆分存储,得到所述目标分片。
  6. 根据权利要求5所述的方法，其中，在所述接收到所述数据拆分点后，向所述主服务器发送拆分存储指令之后，所述方法还包括：
    在接收到所述主服务器发送的所述第一消息后,根据所述数据拆分点和拆分存储之前的所述待分裂分片对应的数据存储范围,确定所述目标分片的数据存储范围和目标数据存储范围;
    将预先记录的待分裂分片对应的数据存储范围更新为所述目标数据存储范围,并记录所述目标分片与针对所述目标分片所确定的数据存储范围之间的关联关系,其中,所述目标数据存储范围为拆分存储之后的所述待分裂分片的数据存储范围。
  7. 根据权利要求5或6所述的方法,其中,所述方法还包括:
    向各个服务器发送针对每一分片的数据存储范围的收集指令,以使每一服务器在接收到所述收集指令后,获得自身所存储的每一分片的数据存储范围,向所述控制节点发送所获得的数据存储范围;其中,所述服务器包括所述主服务器和所述从服务器;
    接收到各个服务器发送的数据存储范围后,针对每一分片,判断预先针对该分片记录的状态是否为拆分存储中状态;
    如果为拆分存储中状态,判断预先记录的该分片的数据存储范围是否与获得的该分片的数据存储范围相同;
    如果相同,则向该分片所属的主服务器发送拆分存储指令;
    如果不相同,将预先针对该分片记录的状态修改为拆分存储完成状态。
  8. 一种数据存储装置,应用于待分裂分片所属的主服务器,所述装置包括:
    接收模块,设置为接收控制节点发送的针对所述待分裂分片的准备拆分指令;
    第一发送模块,设置为将所述准备拆分指令发送给所述待分裂分片所属的从服务器,以使每一所述从服务器在接收到所述准备拆分指令后,获得所述待分裂分片的数据拆分点,并将所述数据拆分点发送给所述主服务器;
    第二发送模块,设置为在接收到数据拆分点的数量大于第一数量阈值时,将所述数据拆分点发送给所述控制节点,以使所述控制节点向所述主服务器发送拆分存储指令;
    第三发送模块,设置为在接收到所述拆分存储指令后,将所述拆分存储指令发送给每一所述从服务器,以使每一所述从服务器根据所述数据拆分点,将所述待分裂分片进行拆分存储,得到目标分片,在得到所述目标分片后,向所述主服务器发送第一消息,其中,所述第一消息为拆分存储完成的消息;
    拆分模块,设置为在接收到的所述第一消息的数量大于第二数量阈值时,根据所述数据拆分点,将所述待分裂分片进行拆分存储,得到所述目标分片。
  9. 一种数据存储装置,应用于控制节点,所述装置包括:
    第七发送模块,设置为向待分裂分片所属的主服务器发送针对待分裂分片的准备拆分指令,以使所述主服务器在接收到所述准备拆分指令后,将所述准备拆分指令发送给所述待分裂分片所属的从服务器,并在接收到数据拆分点的数量大于第一数量阈值时,将所述数据拆分点发送给所述控制节点;
    第八发送模块，设置为在接收到所述数据拆分点后，向所述主服务器发送拆分存储指令，以使所述主服务器在接收到所述拆分存储指令后，将所述拆分存储指令发送给每一所述从服务器，并在接收到所述第一消息的数量大于第二数量阈值时，根据所述数据拆分点，将所述待分裂分片进行拆分存储，得到所述目标分片。
  10. 一种数据存储系统,其中,所述系统包括控制节点、待分裂分片所属的主服务器和待分裂分片所属的从服务器,其中,
    所述控制节点,设置为向所述主服务器发送针对所述待分裂分片的准备拆分指令;
    所述主服务器,设置为在接收到所述准备拆分指令后,将所述准备拆分指令发送给所述待分裂分片所属的从服务器;
    所述从服务器,设置为在接收到所述准备拆分指令后,获得所述待分裂分片的数据拆分点,并将所述数据拆分点发送给所述主服务器;
    所述主服务器,还设置为在接收到数据拆分点的数量大于第一数量阈值时,将所述数据拆分点发送给所述控制节点;
    所述控制节点,还设置为在接收到所述主服务器发送的所述数据拆分点后,向所述主服务器发送拆分存储指令;
    所述主服务器，还设置为在接收到所述拆分存储指令后，将所述拆分存储指令发送给每一所述从服务器；
    所述从服务器,还设置为根据所述数据拆分点,将所述待分裂分片进行拆分存储,得到目标分片,在得到所述目标分片后,向所述主服务器发送第一消息,其中,所述第一消息为拆分存储完成的消息;
    所述主服务器,还设置为在接收到的所述第一消息的数量大于第二数量阈值时,根据所述数据拆分点,将所述待分裂分片进行拆分存储,得到所述目标分片。
  11. 一种主服务器,包括处理器和存储器,其中,
    所述存储器,设置为存放计算机程序;
    所述处理器,设置为执行存储器上所存放的程序时,实现权利要求1-4任一所述的方法步骤。
  12. 一种控制节点,包括处理器和存储器,其中,
    所述存储器,设置为存放计算机程序;
    所述处理器,设置为执行存储器上所存放的程序时,实现权利要求5-7任一所述的方法步骤。
  13. 一种计算机可读存储介质,所述计算机可读存储介质内存储有计算机程序,所述计算机程序被处理器执行时实现权利要求1-4任一所述的方法步骤。
  14. 一种计算机可读存储介质,所述计算机可读存储介质内存储有计算机程序,所述计算机程序被处理器执行时实现权利要求5-7任一所述的方法步骤。
  15. 一种包含指令的计算机程序产品,当其在计算机上运行时,使得计算机执行权利要求1-4任一所述的方法步骤。
  16. 一种包含指令的计算机程序产品,当其在计算机上运行时,使得计算机执行权利要求5-7任一所述的方法步骤。
PCT/CN2019/108186 2018-09-30 2019-09-26 数据存储方法、装置、系统、服务器、控制节点及介质 WO2020063763A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/281,466 US11385830B2 (en) 2018-09-30 2019-09-26 Data storage method, apparatus and system, and server, control node and medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811159536.2 2018-09-30
CN201811159536.2A CN109284073B (zh) 2018-09-30 2018-09-30 数据存储方法、装置、系统、服务器、控制节点及介质

Publications (1)

Publication Number Publication Date
WO2020063763A1 true WO2020063763A1 (zh) 2020-04-02

Family

ID=65182247

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/108186 WO2020063763A1 (zh) 2018-09-30 2019-09-26 数据存储方法、装置、系统、服务器、控制节点及介质

Country Status (3)

Country Link
US (1) US11385830B2 (zh)
CN (1) CN109284073B (zh)
WO (1) WO2020063763A1 (zh)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109284073B (zh) 2018-09-30 2020-03-06 北京金山云网络技术有限公司 数据存储方法、装置、系统、服务器、控制节点及介质
CN109960469B (zh) * 2019-03-25 2022-05-31 新华三技术有限公司 数据处理方法和装置
CN109992209B (zh) * 2019-03-29 2023-02-03 新华三技术有限公司成都分公司 数据处理方法、装置及分布式存储系统
CN112084141A (zh) * 2019-06-14 2020-12-15 北京京东尚科信息技术有限公司 一种全文检索系统扩容方法、装置、设备及介质
CN110737663B (zh) * 2019-10-15 2024-06-11 腾讯科技(深圳)有限公司 一种数据存储方法、装置、设备及存储介质
CN110750484B (zh) * 2019-10-22 2022-11-25 西安因联信息科技有限公司 一种转速和多路振动通道数据同步采集系统及采集方法
CN112835885B (zh) * 2019-11-22 2023-09-01 北京金山云网络技术有限公司 一种分布式表格存储的处理方法、装置及系统
CN111124291B (zh) * 2019-12-09 2023-05-30 北京金山云网络技术有限公司 分布式存储系统的数据存储处理方法、装置、电子设备
CN113934682A (zh) * 2020-06-29 2022-01-14 北京金山云网络技术有限公司 分布式表格系统的分片分裂方法、装置、服务器及介质
CN111897626A (zh) * 2020-07-07 2020-11-06 烽火通信科技股份有限公司 一种面向云计算场景的虚拟机高可靠系统和实现方法
CN112486876B (zh) * 2020-11-16 2024-08-06 中国人寿保险股份有限公司 一种分布式总线架构方法、装置和电子设备

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102467570A (zh) * 2010-11-17 2012-05-23 日电(中国)有限公司 用于分布式数据仓库的连接查询系统和方法
CN105930498A (zh) * 2016-05-06 2016-09-07 中国银联股份有限公司 一种分布式数据库的管理方法及系统
US9442671B1 (en) * 2010-12-23 2016-09-13 Emc Corporation Distributed consumer cloud storage system
CN106843745A (zh) * 2015-12-03 2017-06-13 南京中兴新软件有限责任公司 容量扩展方法及装置
CN107169009A (zh) * 2017-03-31 2017-09-15 北京奇艺世纪科技有限公司 一种分布式存储系统的数据分裂方法及装置
CN109284073A (zh) * 2018-09-30 2019-01-29 北京金山云网络技术有限公司 数据存储方法、装置、系统、服务器、控制节点及介质

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013147785A1 (en) * 2012-03-29 2013-10-03 Hitachi Data Systems Corporation Highly available search index with storage node addition and removal
CN102841931A (zh) * 2012-08-03 2012-12-26 中兴通讯股份有限公司 分布式文件系统的存储方法及装置
CN103078927B (zh) * 2012-12-28 2015-07-22 合一网络技术(北京)有限公司 一种key-value数据分布式缓存系统及其方法
WO2014172654A1 (en) 2013-04-19 2014-10-23 Huawei Technologies Co., Ltd. Media quality information signaling in dynamic adaptive video streaming over hypertext transfer protocol
CN106453665B (zh) * 2016-12-16 2019-06-07 东软集团股份有限公司 基于分布式缓存系统的数据缓存方法、服务器和系统
CN108572793B (zh) * 2017-10-18 2021-09-10 北京金山云网络技术有限公司 数据写入和数据恢复方法、装置、电子设备及存储介质
CN108052622A (zh) * 2017-12-15 2018-05-18 郑州云海信息技术有限公司 一种基于非关系型数据库的存储方法、装置以及设备


Also Published As

Publication number Publication date
CN109284073A (zh) 2019-01-29
US11385830B2 (en) 2022-07-12
CN109284073B (zh) 2020-03-06
US20220004334A1 (en) 2022-01-06


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19867448

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19867448

Country of ref document: EP

Kind code of ref document: A1