CN110798492A - Data storage method and device and data processing system - Google Patents
- Publication number
- CN110798492A CN201810872737.0A
- Authority
- CN
- China
- Prior art keywords
- storage
- data
- fragment
- memory
- available
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H: ELECTRICITY
- H04: ELECTRIC COMMUNICATION TECHNIQUE
- H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00: Network arrangements or protocols for supporting network services or applications
- H04L67/01: Protocols
- H04L67/10: Protocols in which an application is distributed across nodes in the network
- H04L67/1097: Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
- H04L67/1001: Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
- H04L67/1004: Server selection for load balancing
- H04L67/1029: Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers using data related to the state of servers by a load balancer
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application discloses a data storage method and device and a data processing system, and belongs to the technical field of storage. The method comprises the following steps: determining a target storage fragment for storing data to be stored; acquiring available state information of the target storage fragment, wherein the available state information is used for indicating whether the storage fragment is available; and when the available state information indicates that the target storage fragment is unavailable, storing the data to be stored in another available storage fragment. The application improves the stability of data storage and is used for storing data.
Description
Technical Field
The present application relates to the field of storage technologies, and in particular, to a data storage method and apparatus, and a data processing system.
Background
The Kafka server cluster (Kafka) is a high-throughput distributed publish-subscribe messaging system, and the Elasticsearch server cluster (Elasticsearch) is a distributed full-text search engine system based on Lucene. Typically, after Kafka obtains data collected by a data acquisition system, it sends the data to Elasticsearch. Elasticsearch stores the data and, upon receiving an index request, searches the stored data to provide retrieval results.
In the related art, after receiving data sent by Kafka, Elasticsearch determines a storage fragment for the data according to a preset rule and stores the data directly in that storage fragment. However, the stability of this data storage method is poor.
Disclosure of Invention
The application provides a data storage method and device and a data processing system, which can solve the problem of poor stability of data storage methods in the related art. The technical solutions are as follows:
in a first aspect, a data storage method is provided, where the method is applied to a first storage node in a data storage system, and the data storage system includes: a plurality of storage nodes, the first storage node being any one of the plurality of storage nodes, the method comprising:
determining a target storage fragment for storing data to be stored;
acquiring available state information of the target storage fragment, wherein the available state information is used for indicating whether the storage fragment is available;
and when the available state information indicates that the target storage fragment is unavailable, storing the data to be stored in another available storage fragment.
Optionally, the storing the data to be stored in another available storage fragment when the available state information indicates that the target storage fragment is unavailable includes:
when the available state information indicates that the target storage fragment is unavailable, obtaining index information, wherein the index information is used for indicating a plurality of storage fragments, and the plurality of storage fragments comprise the target storage fragment;
determining the other storage fragments of the plurality of storage fragments except the target storage fragment;
determining another target storage fragment that is available based on the available state information of each of the other storage fragments;
and storing the data to be stored in the other target storage fragment.
Optionally, the acquiring the available state information of the target storage fragment includes:
querying the available state information of the target storage fragment in a fragment information set stored by the first storage node, wherein the fragment information set records the available state information of a plurality of storage fragments managed by the first storage node.
Optionally, the plurality of storage nodes comprises: a master storage node and a slave storage node, and when the first storage node is the master storage node, the method further comprises:
acquiring available state information of each storage fragment in the data storage system;
and sending the available state information of each storage fragment to each slave storage node, so that each storage node establishes and stores a fragment information set, wherein the fragment information set records the available state information of a plurality of storage fragments managed by the corresponding storage node.
Optionally, after the acquiring of the available state information of the target storage fragment, the method further includes:
when the available state information indicates that the target storage fragment is available, storing the data to be stored in the target storage fragment.
Optionally, the determining a target storage fragment for storing data to be stored includes:
acquiring an identifier (ID) of the data to be stored;
calculating a hash value of the data to be stored based on the ID;
and determining the storage fragment indicated by the hash value as the target storage fragment.
In a second aspect, there is provided a data storage device comprising:
a determining module, configured to determine a target storage fragment for storing data to be stored;
a first obtaining module, configured to obtain available state information of the target storage fragment, where the available state information is used to indicate whether the storage fragment is available;
and a storage module, configured to store the data to be stored in another available storage fragment when the available state information indicates that the target storage fragment is unavailable.
Optionally, the storage module is configured to:
when the available state information indicates that the target storage fragment is unavailable, obtain index information, wherein the index information is used for indicating a plurality of storage fragments, and the plurality of storage fragments comprise the target storage fragment;
determine the other storage fragments of the plurality of storage fragments except the target storage fragment;
determine another target storage fragment that is available based on the available state information of each of the other storage fragments;
and store the data to be stored in the other target storage fragment.
Optionally, the first obtaining module is configured to:
query the available state information of the target storage fragment in a fragment information set stored by the first storage node, wherein the fragment information set records the available state information of a plurality of storage fragments managed by the first storage node.
Optionally, the plurality of storage nodes comprises: a master storage node and a slave storage node, and when the first storage node is the master storage node, the apparatus further comprises:
a second obtaining module, configured to obtain available state information of each storage fragment in the data storage system;
and a sending module, configured to send the available state information of each storage fragment to each slave storage node, so that each storage node establishes and stores a fragment information set, where the fragment information set records the available state information of a plurality of storage fragments managed by the corresponding storage node.
Optionally, the storage module is further configured to:
when the available state information indicates that the target storage fragment is available, store the data to be stored in the target storage fragment.
Optionally, the determining module is configured to:
acquire an identifier (ID) of the data to be stored;
calculate a hash value of the data to be stored based on the ID;
and determine the storage fragment indicated by the hash value as the target storage fragment.
In a third aspect, a data processing system is provided, the data processing system comprising: a data storage system;
the data storage system includes: a plurality of storage nodes, each storage node comprising a data storage device according to any implementation of the second aspect.
Optionally, the plurality of storage nodes comprises: a master storage node and a slave storage node;
the master storage node is further configured to: acquire available state information of each storage fragment in the data storage system, and send the available state information of each storage fragment to each slave storage node, wherein the available state information is used for indicating whether the storage fragment is available;
each of the storage nodes is further configured to: establish and store a fragment information set based on the available state information of each storage fragment, wherein the fragment information set records the available state information of a plurality of storage fragments managed by the storage node.
Optionally, the data storage system comprises: an Elasticsearch server cluster.
Optionally, the data processing system further comprises: a data distribution system;
the data distribution system is used for receiving data of the data acquisition system and distributing the data to each storage node in the data storage system.
Optionally, the data distribution system includes: a Kafka server cluster.
In a fourth aspect, there is provided a server comprising a processor and a memory,
wherein,
the memory is used for storing a computer program;
the processor is configured to execute the program stored in the memory to implement the data storage method according to any implementation of the first aspect.
In a fifth aspect, a storage medium is provided, in which a computer program is stored, and the computer program, when executed by a processor, implements the data storage method according to any implementation of the first aspect.
The beneficial effects of the technical solutions provided by this application are as follows:
According to the data storage method and device and the data processing system provided by the embodiments of the present invention, after the target storage fragment for storing the data to be stored is determined, it is determined whether the target storage fragment is available, and the data to be stored is stored in another available storage fragment when the target storage fragment is unavailable. Compared with the related art, this ensures that the data to be stored can be stored, avoids the data accumulation caused when data cannot be stored, and effectively improves the stability of data storage.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a block diagram of a data processing system according to an embodiment of the present invention;
FIG. 2 is a block diagram of another data processing system according to an embodiment of the present invention;
FIG. 3 is a flowchart of a data storage method according to an embodiment of the present invention;
FIG. 4 is a flow chart of another data storage method according to an embodiment of the present invention;
FIG. 5 is a flowchart of a method for determining a target storage fragment for storing data to be stored according to an embodiment of the present invention;
FIG. 6 is a flowchart of a method for storing data to be stored in another available storage fragment according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a data storage device according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of another data storage device according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a server according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
In the related art, Elasticsearch includes a plurality of storage nodes, and information is generally exchanged between Elasticsearch and Kafka through these storage nodes. During data storage, after a storage node receives data sent to it by Kafka, it may determine, according to a preset rule, a target storage fragment for the data among the storage fragments it manages, and store the data directly in the target storage fragment.
However, the target storage fragment may be in an unavailable state, and data cannot be stored in a storage fragment while it is unavailable. In that case, the storage node corresponding to the storage fragment may return the data to Kafka. The returned data occupies resources that Kafka uses for processing data, which reduces the efficiency with which Kafka receives and distributes data, so that data accumulates in Kafka. According to Kafka's data processing policy, when data accumulates in Kafka, Kafka discards data, which causes data loss. Therefore, the stability of data storage in the related art is low.
Therefore, embodiments of the present invention provide a data storage method in which, after a target storage fragment for storing data to be stored is determined, it is determined whether the target storage fragment is available, and when the target storage fragment is unavailable, the data to be stored is stored in another available storage fragment.
FIG. 1 is a schematic structural diagram of a data processing system to which the data storage method provided in an embodiment of the present invention applies. As shown in FIG. 1, the data processing system 10 may include a data storage system 101, and the data storage system 101 may include a plurality of storage nodes 1011. The storage nodes 1011 may be connected to one another through a wired network or a wireless network.
Each storage node 1011 is configured to manage a plurality of storage fragments, and each storage fragment is configured to store data. Each storage node 1011 may receive data to be stored, determine a target storage fragment for storing the data, obtain available state information of the target storage fragment, and store the data in another available storage fragment when the available state information indicates that the target storage fragment is unavailable. The available state information of each storage fragment indicates whether that storage fragment is available.
In addition, the plurality of storage nodes 1011 may include a master storage node and slave storage nodes. In this case, the master storage node is further configured to acquire the available state information of each storage fragment in the data storage system and send it to each slave storage node. Each storage node 1011 is further configured to establish and store, based on the available state information of each storage fragment, a fragment information set that records the available state information of the storage fragments managed by that storage node 1011.
Generally, a storage node 1011 may be deployed on a server, and FIG. 1 shows the structure when the storage nodes 1011 are deployed on servers. Each storage node may be implemented by one server, by a server cluster composed of a plurality of servers, or by a cloud computing service center, which is not specifically limited in the embodiment of the present invention.
In one implementation, the data storage system 101 may include an Elasticsearch server cluster (hereinafter referred to as Elasticsearch), which is used for storing data and, upon receiving an index request, searching the data stored in Elasticsearch to provide a retrieval result.
Optionally, referring to fig. 2, the data processing system 10 may further include: a data distribution system 102. The data distribution system 102 may include: a plurality of distribution nodes 1021. The data distribution system 102 and the data storage system 101 may establish a connection through a wired network or a wireless network. The data distribution system 102 is configured to receive data of the data acquisition system, and distribute the data to a storage node 1011 in the data storage system 101 through a distribution node 1021.
The data distribution system 102 may also be implemented by a server, and fig. 2 is a schematic structural diagram of the data distribution system 102 when it is implemented by a server. The server for implementing the data distribution system 102 may be a server, a server cluster composed of a plurality of servers, or a cloud computing service center, which is not specifically limited in the embodiment of the present invention.
In one implementation, the data distribution system 102 includes: the Kafka server cluster (hereinafter, abbreviated as Kafka) is configured to acquire data acquired by a data acquisition system, and send the data to an Elasticsearch according to a preset distribution policy, so that the Elasticsearch stores the data.
Fig. 3 is a flowchart of a data storage method according to an embodiment of the present invention, where the method may be applied to a first storage node in the data storage system shown in fig. 1 or fig. 2, where the first storage node is any one of a plurality of storage nodes. As shown in fig. 3, the method may include:
Step 301, determining a target storage fragment for storing data to be stored.
Step 302, obtaining the available state information of the target storage fragment.
The available state information is used to indicate whether the storage fragment is available.
Step 303, when the available state information indicates that the target storage fragment is unavailable, storing the data to be stored in another available storage fragment.
In summary, in the data storage method provided in the embodiments of the present invention, after the target storage fragment for storing the data to be stored is determined, it is determined whether the target storage fragment is available, and the data is stored in another available storage fragment when the target storage fragment is unavailable. Compared with the related art, this ensures that the data to be stored can be stored, avoids the data accumulation caused when data cannot be stored, and effectively improves the stability of data storage.
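The flow of steps 301 to 303 can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation described by the application: the function name `store_with_fallback`, the dictionaries standing in for the storage fragments and for the available state information, and the rule of taking the first available alternative are all hypothetical.

```python
from typing import Dict, List, Optional


def store_with_fallback(
    data: str,
    target: str,
    available: Dict[str, bool],
    fragments: Dict[str, List[str]],
) -> Optional[str]:
    """Store data in the target fragment, or in another available one.

    Returns the name of the fragment that received the data, or None
    if no fragment is available. All names here are illustrative.
    """
    # Step 302: look up the available state information of the target.
    if available.get(target, False):
        chosen: Optional[str] = target
    else:
        # Step 303: the target is unavailable, so pick another fragment
        # whose available state information marks it as available.
        chosen = next(
            (name for name in fragments
             if name != target and available.get(name, False)),
            None,
        )
    if chosen is not None:
        fragments[chosen].append(data)
    return chosen


available = {"A1": True, "A2": True, "A3": True, "A4": False}
fragments = {"A1": [], "A2": [], "A3": [], "A4": []}
print(store_with_fallback("doc-1", "A4", available, fragments))  # A1
```

Returning `None` when nothing is available is a choice of this sketch; the text above does not say what happens in that case.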
FIG. 4 is a flowchart of another data storage method according to an embodiment of the present invention. The data storage method is applicable to a first storage node in the data storage system shown in FIG. 1 or FIG. 2, where the first storage node is any one of a plurality of storage nodes. The data storage system may be any system comprising a plurality of storage nodes, for example, an Elasticsearch server cluster. For ease of understanding, the data storage method provided by the embodiment of the present invention is described by taking as an example a data storage system that is an Elasticsearch server cluster whose storage nodes include a master storage node and a plurality of slave storage nodes. As shown in FIG. 4, the method may include:
Among the storage nodes included in the data storage system, the master storage node may be the storage node that manages the storage nodes. Each storage node may send node state information to the master storage node, where the node state information indicates whether the corresponding storage node is available. Each storage fragment may also send available state information to the master storage node, where the available state information indicates whether the corresponding storage fragment is available.
In one implementation, the master storage node may send test information to each storage node and each storage fragment, determine whether a storage node is available according to its response to the test information, and determine whether a storage fragment is available according to its response to the test information. In another implementation, each storage node and each storage fragment may send a status report to the master storage node, and the master storage node may determine from the status report whether the corresponding storage node or storage fragment is available, thereby obtaining the available state information of each storage fragment and the node state information of each storage node. For example, when the storage node managing a certain storage fragment goes offline or becomes unavailable for other reasons, the storage fragment may enter an unavailable state (also referred to as an unallocated state) and trigger a cluster state change event; the master storage node may capture the cluster state change event and derive from it available state information indicating that the storage fragment is unavailable.
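The bookkeeping on the master storage node might look like the sketch below. It is only an illustration of the status-report and cluster-state-change paths described above; the class `MasterState`, its method names, and the assumption that everything starts out available are hypothetical and do not correspond to any Elasticsearch API.

```python
from typing import Dict, Iterable, List


class MasterState:
    """Hypothetical bookkeeping kept by the master storage node."""

    def __init__(self, fragments_by_node: Dict[str, Iterable[str]]) -> None:
        self.fragments_by_node: Dict[str, List[str]] = {
            node: list(frags) for node, frags in fragments_by_node.items()
        }
        # Assumption: everything is available until a report says otherwise.
        self.node_available: Dict[str, bool] = {
            node: True for node in self.fragments_by_node
        }
        self.fragment_available: Dict[str, bool] = {
            frag: True
            for frags in self.fragments_by_node.values()
            for frag in frags
        }

    def on_status_report(self, fragment: str, is_available: bool) -> None:
        """Record a status report sent for a single storage fragment."""
        self.fragment_available[fragment] = is_available

    def on_node_offline(self, node: str) -> None:
        """Handle a cluster state change event: a storage node went offline."""
        self.node_available[node] = False
        # Every fragment managed by the offline node becomes unavailable.
        for fragment in self.fragments_by_node[node]:
            self.fragment_available[fragment] = False


state = MasterState({"node-A": ["A1", "A2"], "node-B": ["B1"]})
state.on_node_offline("node-B")
print(state.fragment_available)  # {'A1': True, 'A2': True, 'B1': False}
```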
Optionally, the master storage node may broadcast the available state information of the storage fragments so that each slave storage node can obtain it. After receiving the available state information, each storage node may establish and store a fragment information set according to it, where the fragment information set records the available state information of the storage fragments managed by that storage node.
For example, suppose that a slave storage node A manages storage fragment A1, storage fragment A2, storage fragment A3, and storage fragment A4, and that the available state information sent to it by the master storage node indicates that storage fragments A1, A2, and A3 are available and storage fragment A4 is unavailable. The slave storage node A may then establish and store a fragment information set indicating that storage fragment A1 is available, storage fragment A2 is available, storage fragment A3 is available, and storage fragment A4 is unavailable.
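As a rough sketch of how a storage node could derive its fragment information set from the broadcast, the following Python keeps only the entries for the fragments that the node manages. The function name `build_fragment_info_set` and the extra fragment `B1` are hypothetical, and treating a fragment that is missing from the broadcast as unavailable is an assumption of this example.

```python
from typing import Dict, Iterable


def build_fragment_info_set(
    broadcast: Dict[str, bool], managed: Iterable[str]
) -> Dict[str, bool]:
    """Keep only the entries for the fragments this node manages.

    A managed fragment that is missing from the broadcast is treated
    as unavailable, a conservative assumption of this sketch.
    """
    return {name: broadcast.get(name, False) for name in managed}


# Available state information broadcast by the master storage node.
broadcast = {"A1": True, "A2": True, "A3": True, "A4": False, "B1": True}
info_set = build_fragment_info_set(broadcast, ["A1", "A2", "A3", "A4"])
print(info_set)  # {'A1': True, 'A2': True, 'A3': True, 'A4': False}
```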
Step 403, the first storage node determines a target storage fragment for storing data to be stored.
Optionally, as shown in FIG. 5, an implementation of step 403 may include:
Step 4031, obtaining the identifier (ID) of the data to be stored.
The data that Kafka sends to a storage node includes, in addition to the content of the data to be stored, related information about the data, such as its data size, data format, and data identifier (ID). Therefore, after the first storage node receives the data to be stored that Kafka distributes to it, it can obtain the ID of the data to be stored from the data sent by Kafka.
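To make step 4031 concrete, the sketch below extracts the ID from a received message. The text does not specify a wire format, so the JSON encoding and the field names `id`, `size`, `format`, and `content` are assumptions made purely for illustration, as are the names `DataToStore` and `parse_message`.

```python
import json
from dataclasses import dataclass


@dataclass
class DataToStore:
    """Content plus related information, as carried in one message."""

    doc_id: str
    size: int
    fmt: str
    content: str


def parse_message(raw: bytes) -> DataToStore:
    """Decode a message; the JSON layout is an assumed, hypothetical format."""
    message = json.loads(raw.decode("utf-8"))
    return DataToStore(
        doc_id=str(message["id"]),
        size=int(message["size"]),
        fmt=str(message["format"]),
        content=str(message["content"]),
    )


raw = json.dumps(
    {"id": "doc-42", "size": 5, "format": "text", "content": "hello"}
).encode("utf-8")
print(parse_message(raw).doc_id)  # doc-42
```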
Step 4032, based on the ID, a hash value of the data to be stored is calculated.
The hash algorithm may map a binary value of arbitrary length to a binary value of fixed length, referred to as a hash value. When the step 4032 is executed, the first storage node may use the ID of the data to be stored as a key of a hash algorithm, and obtain a hash value corresponding to the ID through the hash algorithm.
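A minimal Python sketch of step 4032 follows. The text does not name a hash algorithm, so the use of SHA-256 and the reduction to a 10-bit binary string (chosen to match the width of the hash values in the example that follows) are assumptions, and `hash_of_id` is a hypothetical name.

```python
import hashlib


def hash_of_id(doc_id: str, bits: int = 10) -> str:
    """Map an ID of any length to a fixed-length binary string.

    SHA-256 and the 10-bit width are assumptions of this sketch.
    """
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    # Reduce the 256-bit digest to the requested number of bits.
    value = int.from_bytes(digest, "big") % (1 << bits)
    # Zero-pad so that every hash value has the same length.
    return format(value, f"0{bits}b")


# Inputs of different lengths yield hash values of the same length.
print(len(hash_of_id("a")), len(hash_of_id("a" * 1000)))  # 10 10
```

The same ID always yields the same hash value, which is what lets the hash value identify a storage fragment deterministically.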
Step 4033, the storage fragment indicated by the hash value is determined as the target storage fragment.
In the process of initializing the data storage system, a fragment number may be set for each storage fragment, and the fragment number may be represented by a preset hash value; that is, each hash value may indicate one storage fragment. After the hash value corresponding to the data to be stored is obtained, it may be compared with the fragment numbers of the storage fragments, and the storage fragment whose fragment number equals the hash value is determined as the target storage fragment for storing the data to be stored.
For example, assume that during initialization of the data storage system, storage fragment A1 is set to correspond to the hash value 0111110010, storage fragment A2 to the hash value 0111110001, storage fragment A3 to the hash value 0111110000, and storage fragment A4 to the hash value 0111110011. If the hash value calculated from the ID of the data to be stored in step 4032 is 0111110011, then according to the correspondence between storage fragments and hash values, the target storage fragment for storing the data to be stored is determined to be storage fragment A4.
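The lookup in this example can be written as a small table in Python. The table values are taken from the example; the names `FRAGMENT_NUMBERS` and `target_fragment_for`, and raising `LookupError` for an unknown hash value, are choices of this sketch.

```python
from typing import Dict

# Fragment numbers set during initialization (values from the example).
FRAGMENT_NUMBERS: Dict[str, str] = {
    "0111110010": "A1",
    "0111110001": "A2",
    "0111110000": "A3",
    "0111110011": "A4",
}


def target_fragment_for(hash_value: str) -> str:
    """Return the fragment whose fragment number equals the hash value."""
    if hash_value not in FRAGMENT_NUMBERS:
        raise LookupError(f"no storage fragment has number {hash_value}")
    return FRAGMENT_NUMBERS[hash_value]


print(target_fragment_for("0111110011"))  # A4
```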
Optionally, after the target storage fragment is determined, its available state information may be obtained by querying the fragment information set stored in the first storage node. When the available state information indicates that the target storage fragment is available, the data to be stored may be stored directly in the target storage fragment, that is, step 406 is performed. When the available state information indicates that the target storage fragment is unavailable, the data to be stored may be stored in another available storage fragment, that is, step 405 is performed.
For example, assume that the fragment information set stored in the first storage node indicates that storage fragment A1 is available, storage fragment A2 is available, storage fragment A3 is available, and storage fragment A4 is unavailable. If the target storage fragment determined in step 403 is storage fragment A4, querying the fragment information set shows that storage fragment A4 is unavailable, so it is determined that step 405 needs to be performed.
Optionally, as shown in FIG. 6, an implementation of step 405 may include:
Step 4051, when the available state information indicates that the target storage fragment is unavailable, obtaining index information corresponding to the target storage fragment.
In the process of initializing the data storage system, index information corresponding to each storage node may be generated according to a preset setting instruction. The index information corresponding to a storage node indicates the storage fragments managed by that storage node. For example, the index information corresponding to the target storage fragment indicates the plurality of storage fragments managed by the storage node to which the target storage fragment belongs, and these storage fragments include the target storage fragment. After the index information is generated, the correspondence among the storage node, the storage fragments, and the index information may be stored at a preset location for subsequent use. Therefore, after the target storage fragment is determined, the correspondence may be queried according to the target storage fragment to determine the index information corresponding to it.
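One way to picture the stored correspondence is as a pair of mappings plus a reverse lookup from fragment to index information, as sketched below. The layout, the label `"B"` for the index information, the node name `"node-A"`, and the function name `build_fragment_to_index` are all hypothetical; the text does not prescribe how the correspondence is stored.

```python
from typing import Dict, List

# Correspondence generated at initialization (illustrative values).
INDEX_TO_NODE: Dict[str, str] = {"B": "node-A"}
INDEX_TO_FRAGMENTS: Dict[str, List[str]] = {"B": ["A1", "A2", "A3", "A4"]}


def build_fragment_to_index(
    index_to_fragments: Dict[str, List[str]]
) -> Dict[str, str]:
    """Invert the mapping so a fragment can be used as the query key."""
    return {
        fragment: index_info
        for index_info, frags in index_to_fragments.items()
        for fragment in frags
    }


fragment_to_index = build_fragment_to_index(INDEX_TO_FRAGMENTS)
index_info = fragment_to_index["A4"]
print(index_info, INDEX_TO_NODE[index_info])  # B node-A
```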
Step 4052, determining the storage fragments, among the plurality of storage fragments, other than the target storage fragment.
After the index information corresponding to the target storage fragment is determined, the plurality of storage fragments managed by the first storage node may be determined according to the index information, and the storage fragments other than the target storage fragment may be identified among them. For example, the correspondence among the storage node, the storage fragments, and the index information may be queried according to the index information to determine the plurality of storage fragments managed by the first storage node, and every storage fragment in that plurality other than the target storage fragment is determined to be one of the other storage fragments.
Step 4053, determining another available target storage fragment based on the availability status information of each of the other storage fragments, and storing the data to be stored in that other target storage fragment.
After the other storage fragments are determined, the availability status information of each of them may be obtained and used to decide whether the corresponding storage fragment is available. One available storage fragment among the other storage fragments is then determined to be the other target storage fragment, and the data to be stored is stored in it.
For example, assume that the target storage fragment is storage fragment A4, that the index information corresponding to storage fragment A4 is index information B, and that the storage fragments managed by the first storage node corresponding to storage fragment A4 are storage fragment A1, storage fragment A2, storage fragment A3, and storage fragment A4. When storage fragment A4 is determined to be unavailable according to its availability status information, index information B is determined to be the index information corresponding to storage fragment A4. According to index information B, the storage fragments other than the target storage fragment among the plurality of storage fragments indicated by index information B are storage fragment A1, storage fragment A2, and storage fragment A3. According to the availability status information of each of these three storage fragments, it may be determined that storage fragment A1, storage fragment A2, and storage fragment A3 are all available. Any one of them may then be determined to be the other target storage fragment, and the data to be stored is stored in it.
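Steps 4051 to 4053 can be sketched as follows, using the A1 to A4 example. This is a hedged Python illustration rather than the embodiment's actual code: the lookup tables, the function names, the in-memory `store`, and the choice of the first available fragment are assumptions made for the sketch (the text allows any available fragment to be chosen).

```python
from typing import Optional

# Hypothetical correspondence stored at a preset location:
# fragment -> index information, and index information -> fragments.
FRAGMENT_TO_INDEX = {"A1": "B", "A2": "B", "A3": "B", "A4": "B"}
INDEX_TO_FRAGMENTS = {"B": ["A1", "A2", "A3", "A4"]}

# Hypothetical availability status information: True means available.
AVAILABILITY = {"A1": True, "A2": True, "A3": True, "A4": False}

# Hypothetical in-memory stand-in for the contents of each fragment.
store = {}


def pick_other_target(target: str) -> Optional[str]:
    """Pick another available fragment when the target is unavailable."""
    # Step 4051: obtain the index information for the target fragment.
    index_info = FRAGMENT_TO_INDEX[target]
    # Step 4052: the other fragments indicated by that index information.
    others = [f for f in INDEX_TO_FRAGMENTS[index_info] if f != target]
    # Step 4053: choose an available one. Taking the first is an
    # arbitrary choice for this sketch.
    for fragment in others:
        if AVAILABILITY.get(fragment, False):
            return fragment
    return None  # no fragment is available; this case is not covered here


def store_data(target: str, data: str) -> Optional[str]:
    """Store data in the target fragment, or in another available one."""
    if AVAILABILITY.get(target, False):
        chosen = target
    else:
        chosen = pick_other_target(target)
    if chosen is not None:
        store.setdefault(chosen, []).append(data)
    return chosen


print(store_data("A4", "record-1"))  # A1
```

A real system would likely balance load across the available fragments instead of always taking the first one; the sketch only shows the order of the three sub-steps.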
In summary, in the data storage method provided in the embodiments of the present invention, after the target storage fragment for the data to be stored is determined, whether the target storage fragment is available is checked, and the data to be stored is stored in another available storage fragment when the target storage fragment is unavailable. Compared with the related art, this ensures that the data to be stored can be stored, avoids data accumulation caused by data that cannot be stored, and effectively improves the stability of data storage.
It should be noted that the order of the steps of the data storage method provided in the embodiments of the present invention may be adjusted as appropriate, and steps may be added or removed as circumstances require. Any variation readily conceivable by a person skilled in the art within the technical scope disclosed in the present invention falls within the protection scope of the present invention and is therefore not described further.
Fig. 7 is a schematic structural diagram of a data storage device according to an embodiment of the present invention, and as shown in fig. 7, the data storage device 700 may include:
a determining module 701, configured to determine a target storage fragment for storing data to be stored.
A first obtaining module 702, configured to obtain availability status information of the target storage fragment, where the availability status information indicates whether a storage fragment is available.
A storage module 703, configured to store the data to be stored in another available storage fragment when the availability status information indicates that the target storage fragment is unavailable.
In summary, in the data storage device provided in the embodiment of the present invention, after the determining module determines the target storage fragment for storing the data to be stored, whether the target storage fragment is available is checked, and when it is unavailable, the storage module stores the data to be stored in another available storage fragment.
Optionally, the storage module 703 is configured to:
When the availability status information indicates that the target storage fragment is unavailable, obtain index information, where the index information indicates a plurality of storage fragments and the plurality of storage fragments include the target storage fragment.
Determine the storage fragments, among the plurality of storage fragments, other than the target storage fragment.
Determine another available target storage fragment based on the availability status information of each of the other storage fragments.
Store the data to be stored in the other target storage fragment.
Optionally, the first obtaining module 702 is configured to query the availability status information of the target storage fragment in the fragment information set stored by the first storage node, where the fragment information set records the availability status information of the plurality of storage fragments managed by the first storage node.
Optionally, the plurality of storage nodes include a master storage node and slave storage nodes, and when the first storage node is the master storage node, as shown in fig. 8, the apparatus 700 may further include:
a second obtaining module 704, configured to obtain availability status information of each storage fragment in the data storage system.
A sending module 705, configured to send the availability status information of each storage fragment to each slave storage node, so that each storage node establishes and stores a fragment information set, where the fragment information set records the availability status information of the plurality of storage fragments managed by the corresponding storage node.
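The interplay of the second obtaining module and the sending module can be illustrated with a small Python sketch. Everything named here is hypothetical: the `StorageNode` class, the `distribute_states` function, and the split of fragments A1 to A4 across two nodes are assumptions chosen to keep the example short, and network transport is replaced by a direct method call.

```python
class StorageNode:
    """Hypothetical storage node that manages a fixed list of fragments."""

    def __init__(self, name, managed_fragments):
        self.name = name
        self.managed_fragments = list(managed_fragments)
        self.fragment_info_set = {}  # fragment -> True (available) / False

    def receive_states(self, all_states):
        # Establish the fragment information set from the states of the
        # fragments this node manages; other fragments are ignored.
        self.fragment_info_set = {
            fragment: all_states[fragment]
            for fragment in self.managed_fragments
            if fragment in all_states
        }


def distribute_states(master, slaves, all_states):
    """Send every fragment's state to each slave; each node builds its set."""
    master.receive_states(all_states)  # the master keeps its own set as well
    for slave in slaves:
        slave.receive_states(all_states)  # stands in for a network send


master = StorageNode("master", ["A1", "A2"])
slave = StorageNode("slave-1", ["A3", "A4"])
states = {"A1": True, "A2": True, "A3": True, "A4": False}
distribute_states(master, [slave], states)
print(slave.fragment_info_set)  # {'A3': True, 'A4': False}
```

Keeping a local fragment information set on every node means the availability check for a target storage fragment needs no round trip to the master at write time.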
Optionally, the storage module 703 is further configured to store the data to be stored in the target storage fragment when the availability status information indicates that the target storage fragment is available.
Optionally, the determining module 701 is configured to:
Obtain the identifier (ID) of the data to be stored.
Calculate a hash value of the data to be stored based on the ID.
Determine the storage fragment indicated by the hash value as the target storage fragment.
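The three operations of the determining module can be sketched in Python as below. The text here does not specify the hash function or how a hash value indicates a storage fragment, so SHA-256 and a modulo over the fragment count are assumptions made for illustration; `FRAGMENTS` and `target_fragment` are hypothetical names.

```python
import hashlib

FRAGMENTS = ["A1", "A2", "A3", "A4"]  # hypothetical fragment names


def target_fragment(data_id, fragments=FRAGMENTS):
    """Map the ID of the data to be stored to a target storage fragment."""
    # Calculate a hash value from the ID (assumed: SHA-256, first 8 bytes).
    digest = hashlib.sha256(data_id.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest[:8], "big")
    # Assumed rule: the hash value modulo the fragment count selects the
    # fragment that the hash value "indicates".
    return fragments[hash_value % len(fragments)]


# The same ID always maps to the same fragment.
print(target_fragment("doc-42") == target_fragment("doc-42"))  # True
```

`hashlib` is used instead of Python's built-in `hash()` because string hashes from `hash()` are randomized per process by default, which would send the same ID to different fragments after a restart.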
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and modules may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
An embodiment of the present invention provides a server, which may be the data storage device described above. As shown in fig. 9, the server 01 includes a processor 12 and a memory 16, wherein:
the memory 16 is configured to store a computer program;
the processor 12 is configured to execute the program stored in the memory 16 to implement the data storage method according to the foregoing embodiments. For example, the method may include:
Determining a target storage fragment for storing data to be stored.
Obtaining availability status information of the target storage fragment, where the availability status information indicates whether a storage fragment is available.
When the availability status information indicates that the target storage fragment is unavailable, storing the data to be stored in another available storage fragment.
Specifically, the processor 12 includes one or more processing cores. The processor 12 executes various functional applications and data processing by running the computer programs stored in the memory 16.
The computer programs stored in the memory 16 include software programs and units. Specifically, the memory 16 may store an operating system 162 and an application unit 164 required for at least one function. The operating system 162 may be an operating system such as Real Time eXecutive (RTX), LINUX, UNIX, WINDOWS, or OS X. The application unit 164 may include a determining unit 164a, a first obtaining unit 164b, and a storage unit 164c.
The determining unit 164a has the same or similar functions as the determining module 701.
The first obtaining unit 164b has the same or similar functions as the first obtaining module 702.
The storage unit 164c has the same or similar functions as the storage module 703.
The embodiment of the invention provides a storage medium which can be a nonvolatile computer readable storage medium, wherein a computer program is stored in the storage medium, and when the computer program is executed by a processor, the computer program realizes the data storage method provided by the embodiment of the method.
Embodiments of the present invention further provide a computer program product containing instructions, which when run on a computer, cause the computer to execute the data storage method provided by the above method embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.
Claims (17)
1. A data storage method applied to a first storage node in a data storage system, the data storage system comprising: a plurality of storage nodes, the first storage node being any one of the plurality of storage nodes, the method comprising:
determining a target memory fragment for storing data to be stored;
acquiring available state information of the target memory fragment, wherein the available state information is used for indicating whether the memory fragment is available;
and when the availability state information indicates that the target memory fragment is unavailable, storing the data to be stored in other available memory fragments.
2. The method according to claim 1, wherein the storing the data to be stored in other available memory fragments when the availability status information indicates that the target memory fragment is not available comprises:
when the availability status information indicates that the target storage slice is unavailable, obtaining index information, wherein the index information is used for indicating a plurality of storage slices, and the plurality of storage slices comprise the target storage slice;
determining other memory slices of the plurality of memory slices except the target memory slice;
determining other target memory slices that are available based on the availability status information of each other memory slice;
and storing the data to be stored in the other target memory fragments.
3. The method of claim 1, wherein the obtaining the availability status information of the target memory slice comprises:
and querying the available state information of the target storage fragment in a fragment information set stored by the first storage node, wherein the fragment information set records the available state information of a plurality of storage fragments managed by the first storage node.
4. The method of any of claims 1 to 3, wherein the plurality of storage nodes comprises: a master storage node and a slave storage node, when the first storage node is the master storage node, the method further comprising:
acquiring available state information of each memory fragment in the data storage system;
and sending the available state information of each storage fragment to each slave storage node, so that each storage node establishes and stores a fragment information set, wherein the fragment information set records the available state information of a plurality of storage fragments managed by the corresponding storage node.
5. The method according to any of claims 1 to 3, wherein after the obtaining the availability status information of the target memory slice, the method further comprises:
and when the available state information indicates that the target memory fragment is available, storing the data to be stored to the target memory fragment.
6. The method according to any one of claims 1 to 3, wherein the determining a target memory slice for storing data to be stored comprises:
acquiring an identification ID of the data to be stored;
calculating a hash value of the data to be stored based on the ID;
and determining the storage slice indicated by the hash value as the target storage slice.
7. A data storage device, characterized in that the device comprises:
the determining module is used for determining a target memory fragment for storing data to be stored;
a first obtaining module, configured to obtain available state information of the target memory slice, where the available state information is used to indicate whether a memory slice is available;
and the storage module is used for storing the data to be stored in other available memory fragments when the availability state information indicates that the target memory fragment is unavailable.
8. The apparatus of claim 7, wherein the storage module is configured to:
when the availability status information indicates that the target storage slice is unavailable, obtaining index information, wherein the index information is used for indicating a plurality of storage slices, and the plurality of storage slices comprise the target storage slice;
determining other memory slices of the plurality of memory slices except the target memory slice;
determining other target memory slices that are available based on the availability status information of each other memory slice;
and storing the data to be stored in the other target memory fragments.
9. The apparatus of claim 7, wherein the first obtaining module is configured to:
and querying the available state information of the target storage fragment in a fragment information set stored by the first storage node, wherein the fragment information set records the available state information of a plurality of storage fragments managed by the first storage node.
10. The apparatus of any of claims 7 to 9, wherein the plurality of storage nodes comprise: a master storage node and a slave storage node, when the first storage node is the master storage node, the apparatus further comprising:
the second acquisition module is used for acquiring available state information of each memory fragment in the data storage system;
and the sending module is used for sending the available state information of each storage fragment to each slave storage node, so that each storage node establishes and stores a fragment information set, and the fragment information set records the available state information of a plurality of storage fragments managed by the corresponding storage node.
11. The apparatus of any of claims 7 to 9, wherein the storage module is further configured to:
and when the available state information indicates that the target memory fragment is available, storing the data to be stored to the target memory fragment.
12. The apparatus of any one of claims 7 to 9, wherein the determining module is configured to:
acquiring an identification ID of the data to be stored;
calculating a hash value of the data to be stored based on the ID;
and determining the storage slice indicated by the hash value as the target storage slice.
13. A data processing system, characterized in that the data processing system comprises: a data storage system;
the data storage system includes: a plurality of storage nodes, each storage node comprising a data storage device as claimed in any of claims 7 to 12.
14. The system of claim 13, wherein the plurality of storage nodes comprises: a master storage node and a slave storage node;
the master storage node is further to: acquiring available state information of each memory fragment in the data storage system, and sending the available state information of each memory fragment to each slave storage node, wherein the available state information is used for indicating whether the memory fragment is available;
each of the storage nodes is further configured to: and establishing and storing a fragment information set based on the available state information of each storage fragment, wherein the fragment information set records the available state information of a plurality of storage fragments managed by the storage node.
15. The system of claim 13, wherein the data storage system comprises: an Elasticsearch server cluster.
16. The system of any of claims 13 to 15, wherein the data processing system further comprises: a data distribution system;
the data distribution system is used for receiving data of the data acquisition system and distributing the data to each storage node in the data storage system.
17. The system of claim 16, wherein the data distribution system comprises: a cluster of Kafka servers.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810872737.0A CN110798492B (en) | 2018-08-02 | 2018-08-02 | Data storage method and device and data processing system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110798492A (en) | 2020-02-14 |
CN110798492B (en) | 2022-08-09 |
Family
ID=69425941
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810872737.0A Active CN110798492B (en) | 2018-08-02 | 2018-08-02 | Data storage method and device and data processing system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110798492B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6421687B1 (en) * | 1997-01-20 | 2002-07-16 | Telefonaktiebolaget Lm Ericsson (Publ) | Data partitioning and duplication in a distributed data processing system |
CN105100146A (en) * | 2014-05-07 | 2015-11-25 | 腾讯科技(深圳)有限公司 | Data storage method, device and system |
CN105573680A (en) * | 2015-12-25 | 2016-05-11 | 北京奇虎科技有限公司 | Storage method and device for replicated data |
CN106302702A (en) * | 2016-08-10 | 2017-01-04 | 华为技术有限公司 | Burst storage method, the Apparatus and system of data |
US20170060687A1 (en) * | 2015-03-31 | 2017-03-02 | Amazon Technologies, Inc. | Precomputed redundancy code matrices for high-availability data storage |
CN106527981A (en) * | 2016-10-31 | 2017-03-22 | 华中科技大学 | Configuration-based data fragmentation method for adaptive distributed storage system |
CN107528724A (en) * | 2017-07-20 | 2017-12-29 | 北京奇安信科技有限公司 | A kind of optimized treatment method and device of node cluster |
EP3327991A1 (en) * | 2016-11-29 | 2018-05-30 | Alcatel Lucent | Storage of coverage-related information of a telecommunication network |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112711382A (en) * | 2020-12-31 | 2021-04-27 | 百果园技术(新加坡)有限公司 | Data storage method and device based on distributed system and storage node |
CN112711382B (en) * | 2020-12-31 | 2024-04-26 | 百果园技术(新加坡)有限公司 | Data storage method and device based on distributed system and storage node |
CN115190085A (en) * | 2022-05-26 | 2022-10-14 | 中科驭数(北京)科技有限公司 | Data sharing method and device based on SMB transmission and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110798492B (en) | 2022-08-09 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||