CN110798492B - Data storage method and device and data processing system - Google Patents


Info

Publication number
CN110798492B
Authority
CN
China
Prior art keywords
storage
data
fragment
slice
memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810872737.0A
Other languages
Chinese (zh)
Other versions
CN110798492A (en)
Inventor
张敢
邓长春
何永健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN201810872737.0A
Publication of CN110798492A
Application granted
Publication of CN110798492B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1004 Server selection for load balancing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1029 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers using data related to the state of servers by a load balancer

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a data storage method and device and a data processing system, and belongs to the technical field of storage. The method comprises the following steps: determining a target memory fragment for storing data to be stored; acquiring available state information of the target memory fragment, wherein the available state information is used for indicating whether the memory fragment is available; and when the available state information indicates that the target memory fragment is unavailable, storing the data to be stored in another available memory fragment. The invention improves the stability of data storage. The invention is used for storing data.

Description

Data storage method and device and data processing system
Technical Field
The present application relates to the field of storage technologies, and in particular, to a data storage method and apparatus, and a data processing system.
Background
The Kafka server cluster (Kafka) is a high-throughput distributed publish-subscribe messaging system, and the Elasticsearch server cluster (Elasticsearch) is a distributed full-text search engine system based on Lucene. Typically, after Kafka obtains the data collected by a data acquisition system, it sends the data to the Elasticsearch. The Elasticsearch stores the data and, upon receiving an index request, retrieves the data stored in the Elasticsearch to provide information for retrieval.
In the related art, after receiving data sent by Kafka, the Elasticsearch may determine a memory fragment for storing the data according to a preset rule, and directly store the data in the memory fragment. However, the stability of this data storage method is poor.
Disclosure of Invention
The application provides a data storage method and device and a data processing system, which can solve the problem of poor stability of a data storage method in the related technology. The technical scheme is as follows:
in a first aspect, a data storage method is provided, where the method is applied to a first storage node in a data storage system, and the data storage system includes: a plurality of storage nodes, the first storage node being any one of the plurality of storage nodes, the method comprising:
determining a target memory fragment for storing data to be stored;
acquiring available state information of the target memory fragment, wherein the available state information is used for indicating whether the memory fragment is available;
and when the availability state information indicates that the target memory fragment is unavailable, storing the data to be stored in other available memory fragments.
Optionally, when the availability status information indicates that the target memory slice is unavailable, storing the data to be stored in other available memory slices includes:
when the availability status information indicates that the target storage slice is unavailable, obtaining index information, wherein the index information is used for indicating a plurality of storage slices, and the plurality of storage slices comprise the target storage slice;
determining other memory slices of the plurality of memory slices except the target memory slice;
determining other target memory slices that are available based on the availability status information of each other memory slice;
and storing the data to be stored in the other target memory fragments.
Optionally, the acquiring the available state information of the target memory slice includes:
and querying the available state information of the target memory slice in a slice information set stored by the first storage node, wherein the slice information set records the available state information of a plurality of memory slices managed by the first storage node.
Optionally, the plurality of storage nodes comprises: a master storage node and a slave storage node, when the first storage node is the master storage node, the method further comprising:
acquiring available state information of each memory fragment in the data storage system;
and sending the available state information of each storage fragment to each slave storage node, so that each storage node establishes and stores a fragment information set, wherein the fragment information set records the available state information of a plurality of storage fragments managed by the corresponding storage node.
Optionally, after the obtaining the available state information of the target memory slice, the method further includes:
and when the available state information indicates that the target memory fragment is available, storing the data to be stored to the target memory fragment.
Optionally, the determining a target storage slice for storing data to be stored includes:
acquiring an identification ID of the data to be stored;
calculating a hash value of the data to be stored based on the ID;
and determining the storage slice indicated by the hash value as the target storage slice.
In a second aspect, there is provided a data storage device comprising:
the determining module is used for determining a target memory fragment for storing data to be stored;
a first obtaining module, configured to obtain available state information of the target memory slice, where the available state information is used to indicate whether a memory slice is available;
and the storage module is used for storing the data to be stored in other available memory fragments when the availability state information indicates that the target memory fragment is unavailable.
Optionally, the storage module is configured to:
when the availability status information indicates that the target storage slice is unavailable, obtaining index information, wherein the index information is used for indicating a plurality of storage slices, and the plurality of storage slices comprise the target storage slice;
determining other memory slices in the plurality of memory slices except the target memory slice;
determining other target memory slices that are available based on the availability status information of each other memory slice;
and storing the data to be stored in the other target memory fragments.
Optionally, the first obtaining module is configured to:
and querying the available state information of the target storage fragment in a fragment information set stored by the first storage node, wherein the fragment information set records the available state information of a plurality of storage fragments managed by the first storage node.
Optionally, the plurality of storage nodes comprises: a master storage node and a slave storage node, when the first storage node is the master storage node, the apparatus further comprising:
the second acquisition module is used for acquiring available state information of each memory fragment in the data storage system;
and the sending module is used for sending the available state information of each storage fragment to each slave storage node, so that each storage node establishes and stores a fragment information set, and the fragment information set records the available state information of a plurality of storage fragments managed by the corresponding storage node.
Optionally, the storage module is further configured to:
and when the available state information indicates that the target memory fragment is available, storing the data to be stored to the target memory fragment.
Optionally, the determining module is configured to:
acquiring an identification ID of the data to be stored;
calculating a hash value of the data to be stored based on the ID;
and determining the storage slice indicated by the hash value as the target storage slice.
In a third aspect, a data processing system is provided, the data processing system comprising: a data storage system;
the data storage system includes: a plurality of storage nodes, each storage node comprising a data storage device according to any of the second aspects.
Optionally, the plurality of storage nodes comprises: a master storage node and a slave storage node;
the master storage node is further to: acquiring available state information of each memory fragment in the data storage system, and sending the available state information of each memory fragment to each slave storage node, wherein the available state information is used for indicating whether the memory fragment is available;
each of the storage nodes is further configured to: and establishing and storing a fragment information set based on the available state information of each storage fragment, wherein the fragment information set records the available state information of a plurality of storage fragments managed by the storage node.
Optionally, the data storage system comprises: an Elasticsearch server cluster.
Optionally, the data processing system further comprises: a data distribution system;
the data distribution system is used for receiving data of the data acquisition system and distributing the data to each storage node in the data storage system.
Optionally, the data distribution system includes: a Kafka server cluster.
In a fourth aspect, there is provided a server comprising a processor and a memory, wherein:
the memory is used for storing a computer program;
the processor is configured to execute the program stored in the memory to implement the data storage method according to any one of the first aspect.
In a fifth aspect, a storage medium is provided, in which a computer program is stored, and the computer program realizes the data storage method of any one of the first aspect when executed by a processor.
The beneficial effects of the technical solutions provided in this application are as follows:
According to the data storage method and device and the data processing system provided by the embodiments of the present invention, after the target storage fragment for storing the data to be stored is determined, whether the target storage fragment is available is checked, and when it is unavailable the data to be stored is stored in another available storage fragment. Compared with the related art, this ensures that the data to be stored can still be stored, avoids the data accumulation that occurs when data cannot be stored, and effectively improves the stability of data storage.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic structural diagram of a data processing system according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of another data processing system according to an embodiment of the present invention;
FIG. 3 is a flowchart of a data storage method according to an embodiment of the present invention;
FIG. 4 is a flowchart of another data storage method according to an embodiment of the present invention;
FIG. 5 is a flowchart of a method for determining a target memory slice for storing data to be stored according to an embodiment of the present invention;
FIG. 6 is a flowchart of a method for storing data to be stored in other available memory slices according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a data storage device according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of another data storage device according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a server according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
In the related art, the Elasticsearch includes a plurality of storage nodes, and the information interaction between the Elasticsearch and Kafka is generally realized through the storage nodes. During data storage, after the storage node receives data sent by Kafka to the storage node, a target storage fragment for storing the data may be determined according to a preset rule in a plurality of storage fragments managed by the storage node, and the data is directly stored in the target storage fragment.
However, the target memory slice may be in an unavailable state, and data cannot be stored to a memory slice while it is unavailable. In that case, the storage node corresponding to the memory slice may return the data to Kafka, and the returned data occupies the resources that Kafka uses for processing data. This reduces the efficiency with which Kafka receives and distributes data, so that data accumulates in Kafka. According to the data processing strategy of Kafka, when data accumulates in Kafka, Kafka discards data, which causes data loss. Therefore, the stability of data storage in the related art is low.
Therefore, embodiments of the present invention provide a data storage method, where after a target storage partition for storing data to be stored is determined, whether the target storage partition is available is determined, and when the target storage partition is unavailable, the data to be stored is stored in other available storage partitions.
FIG. 1 is a schematic structural diagram of a data processing system involved in a data storage method provided in an embodiment of the present invention. As shown in FIG. 1, the data processing system 10 may include: a data storage system 101. The data storage system 101 may include: a plurality of storage nodes 1011. The plurality of storage nodes 1011 may be connected to one another through a wired network or a wireless network.
Wherein each storage node 1011 is configured to manage a plurality of memory slices, each memory slice configured to store data. Each storage node 1011 may receive data to be stored, determine a target storage segment for storing the data to be stored, obtain available state information of the target storage segment, and store the data to be stored in other available storage segments when the available state information indicates that the target storage segment is unavailable. Wherein the availability status information of each memory slice is used to indicate whether the memory slice is available.
Also, the plurality of storage nodes 1011 may further include: a master storage node and a slave storage node. At this time, the master storage node is further configured to: and acquiring the available state information of each memory fragment in the data storage system, and sending the available state information of each memory fragment to each slave storage node. Each storage node 1011 is also used to: based on the available state information of each memory slice, a slice information set is established and stored, where the slice information set is recorded with available state information of a plurality of memory slices managed by the corresponding storage node 1011.
Generally, each storage node 1011 may be deployed on a server; FIG. 1 shows the structure when the storage nodes 1011 are deployed on servers. The server implementing each node may be a single server, a server cluster composed of a plurality of servers, or a cloud computing service center, which is not specifically limited in the embodiments of the present invention.
In one implementation, the data storage system 101 may include an Elasticsearch server cluster (hereinafter referred to as Elasticsearch), which is used to store data and, upon receiving an index request, to retrieve the data stored in the Elasticsearch so as to provide an index result.
Optionally, referring to fig. 2, the data processing system 10 may further include: a data distribution system 102. The data distribution system 102 may include: a plurality of distribution nodes 1021. The data distribution system 102 and the data storage system 101 may establish a connection through a wired network or a wireless network. The data distribution system 102 is configured to receive data of the data acquisition system, and distribute the data to the storage nodes 1011 in the data storage system 101 through the distribution node 1021.
The data distribution system 102 may also be implemented by a server, and fig. 2 is a schematic structural diagram of the data distribution system 102 when it is implemented by a server. The server for implementing the data distribution system 102 may be a server, a server cluster composed of a plurality of servers, or a cloud computing service center, which is not specifically limited in the embodiment of the present invention.
In one implementation, the data distribution system 102 includes: the Kafka server cluster (hereinafter, abbreviated as Kafka) is configured to acquire data acquired by a data acquisition system, and send the data to an Elasticsearch according to a preset distribution policy, so that the Elasticsearch stores the data.
Fig. 3 is a flowchart of a data storage method according to an embodiment of the present invention, where the method may be applied to a first storage node in the data storage system shown in fig. 1 or fig. 2, where the first storage node is any one of a plurality of storage nodes. As shown in fig. 3, the method may include:
Step 301, determining a target memory slice for storing data to be stored.
Step 302, obtaining the available state information of the target memory slice.
The available state information is used to indicate whether the memory slice is available.
Step 303, when the available state information indicates that the target memory slice is unavailable, storing the data to be stored in another available memory slice.
In summary, according to the data storage method provided in the embodiments of the present invention, after the target storage segment for storing the data to be stored is determined, whether the target storage segment is available is checked, and when it is unavailable the data to be stored is stored in another available storage segment. Compared with the related art, this ensures that the data to be stored can still be stored, avoids the data accumulation that occurs when data cannot be stored, and effectively improves the stability of data storage.
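Steps 301 to 303 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `store`, the `choose_target` callback, and the in-memory `storage` dictionary are hypothetical, and a slice with no recorded state is assumed to be unavailable.

```python
from typing import Callable, Dict, List, Optional


def store(data_id: str,
          payload: str,
          slices: List[str],
          available: Dict[str, bool],
          choose_target: Callable[[str, List[str]], str],
          storage: Dict[str, List[str]]) -> Optional[str]:
    """Store payload and return the slice that received it, or None."""
    # Step 301: determine the target slice for the data to be stored.
    target = choose_target(data_id, slices)

    # Step 302: obtain the available state information of the target slice.
    # Assumption: a slice with no recorded state is treated as unavailable.
    if available.get(target, False):
        chosen: Optional[str] = target
    else:
        # Step 303: fall back to another slice that is marked available.
        chosen = next(
            (s for s in slices if s != target and available.get(s, False)),
            None,
        )

    if chosen is None:
        return None  # no slice can take the data; the caller decides what to do
    storage.setdefault(chosen, []).append(payload)
    return chosen
```

Returning `None` when no slice is available is a choice of this sketch; the text above does not say what happens in that case.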
FIG. 4 is a flowchart of another data storage method according to an embodiment of the present invention. The data storage method is applicable to a first storage node in the data storage system shown in FIG. 1 or FIG. 2, where the first storage node is any one of a plurality of storage nodes. The data storage system may be any system comprising a plurality of storage nodes; for example, it may be an Elasticsearch server cluster. For ease of understanding, the data storage method provided by the embodiment of the present invention is described by taking as an example the case in which the data storage system is an Elasticsearch server cluster and the plurality of storage nodes include a master storage node and a plurality of slave storage nodes. As shown in FIG. 4, the method may include:
Step 401, the master storage node obtains available state information of each storage slice in the data storage system.
Among the storage nodes included in the data storage system, the master storage node may be the storage node that manages the other storage nodes. Each storage node may send node status information to the master storage node, the node status information indicating whether the corresponding storage node is available. Each storage slice may also send available state information to the master storage node, the available state information indicating whether the corresponding storage slice is available.
In one implementation, the master storage node may send test information to each storage node and each storage slice, determine whether a storage node is available according to that storage node's response to the test information, and determine whether a storage slice is available according to that storage slice's response to the test information. In another implementation, each storage node and each storage slice may send a status report to the master storage node, and the master storage node may determine from the status report whether the corresponding storage node or storage slice is available, thereby obtaining the available state information of each storage slice and the node status information of each storage node. For example, when a storage node managing a certain storage slice goes offline or becomes unavailable for another reason, the storage slice enters an unavailable state (also referred to as an unallocated state) and triggers a cluster state change event; the master storage node may acquire the cluster state change event and derive from it available state information indicating that the storage slice is unavailable.
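The effect of a node going offline on the available state information held by the master storage node can be sketched as follows. This is an illustrative sketch only: the function `apply_node_offline` and the `slices_by_node` mapping are hypothetical names, and the sketch assumes that every slice managed by an offline node becomes unavailable.

```python
from typing import Dict, List


def apply_node_offline(status: Dict[str, bool],
                       slices_by_node: Dict[str, List[str]],
                       offline_node: str) -> Dict[str, bool]:
    """Return a copy of the slice status table after a node goes offline."""
    updated = dict(status)  # leave the caller's table untouched
    # Assumption: all slices managed by the offline node become unavailable.
    for slice_name in slices_by_node.get(offline_node, []):
        updated[slice_name] = False
    return updated
```

For instance, if node `"B"` manages only slice `"B1"` and goes offline, the returned table marks `"B1"` as unavailable and leaves every other slice unchanged.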
Step 402, the master storage node sends the available state information of each storage fragment to each slave storage node, so that each storage node establishes and stores the fragment information set.
Optionally, the master storage node may broadcast the availability status information of the memory slice so that each slave storage node can obtain the availability status information. After receiving the available state information, each storage node may establish and store a fragmentation information set according to the available state information, where the fragmentation information set records available state information of multiple storage fragments managed by the corresponding storage node.
For example, suppose that the storage slices managed by a certain slave storage node A are storage slice A1, storage slice A2, storage slice A3, and storage slice A4, and that the available state information sent to slave storage node A by the master storage node indicates: storage slice A1 is available, storage slice A2 is available, storage slice A3 is available, and storage slice A4 is unavailable. Slave storage node A may then establish and store a slice information set indicating: storage slice A1 is available, storage slice A2 is available, storage slice A3 is available, and storage slice A4 is unavailable.
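A storage node's slice information set can be sketched as a mapping from slice name to availability, built from the state information broadcast by the master storage node. The function `build_slice_info_set` and the dictionary representation are assumptions made for illustration; the sketch also assumes that a managed slice missing from the broadcast is recorded as unavailable.

```python
from typing import Dict, Iterable


def build_slice_info_set(managed_slices: Iterable[str],
                         broadcast_status: Dict[str, bool]) -> Dict[str, bool]:
    """Keep only the entries for the slices this node manages."""
    # Assumption: a managed slice absent from the broadcast is unavailable.
    return {s: broadcast_status.get(s, False) for s in managed_slices}


# Slave storage node A from the example manages A1 to A4; the broadcast
# may also carry entries for slices managed by other nodes.
broadcast = {"A1": True, "A2": True, "A3": True, "A4": False, "B1": True}
info_set = build_slice_info_set(["A1", "A2", "A3", "A4"], broadcast)
```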
In step 403, the first storage node determines a target storage slice for storing data to be stored.
Optionally, as shown in fig. 5, an implementation manner of this step 403 may include:
Step 4031, obtain the identification (ID) of the data to be stored.
The data sent by Kafka to the storage node includes, in addition to the content of the data to be stored, related information about the data to be stored, which may include its data size, data format, data identification (ID), and the like. Therefore, after the first storage node receives the data to be stored that Kafka distributes to it, the first storage node can obtain the ID of the data to be stored from the data sent by Kafka.
Step 4032, based on the ID, a hash value of the data to be stored is calculated.
The hash algorithm may map a binary value of arbitrary length to a binary value of fixed length, referred to as a hash value. When the step 4032 is executed, the first storage node may use the ID of the data to be stored as a key of a hash algorithm, and obtain a hash value corresponding to the ID through the hash algorithm.
Step 4033, the memory slice indicated by the hash value is determined as the target memory slice.
In the process of initializing the data storage system, a corresponding slice number may be set for each storage slice, and the slice number may be represented by a preset hash value, that is, each hash value may indicate one storage slice. After the hash value corresponding to the data to be stored is obtained, the hash value may be compared with the slice numbers of the storage slices, and the storage slice whose slice number is the same as the hash value is determined as the target storage slice for storing the data to be stored.
For example, assume that, in the process of initializing the data storage system, storage slice A1 is set to correspond to the hash value 0111110010, storage slice A2 to the hash value 0111110001, storage slice A3 to the hash value 0111110000, and storage slice A4 to the hash value 0111110011, and that the hash value calculated from the ID of the data to be stored in step 4032 is 0111110011. According to the correspondence between storage slices and hash values, it may be determined that the target storage slice for storing the data to be stored is storage slice A4.
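Steps 4031 to 4033 can be sketched as follows. The text does not name a hash algorithm or say how a hash value is confined to the set of slice numbers, so the use of SHA-256 and the reduction modulo the number of slices are assumptions of this sketch, as are the names `hash_of_id` and `target_slice`.

```python
import hashlib
from typing import List


def hash_of_id(data_id: str, num_slices: int) -> int:
    """Map an ID of any length to a slice number in [0, num_slices)."""
    digest = hashlib.sha256(data_id.encode("utf-8")).digest()
    # Assumption: take the first 8 bytes and reduce modulo the slice count
    # so that every hash value indicates exactly one slice.
    return int.from_bytes(digest[:8], "big") % num_slices


def target_slice(data_id: str, slices: List[str]) -> str:
    """Return the slice whose position equals the hash of the ID."""
    return slices[hash_of_id(data_id, len(slices))]


slices = ["A1", "A2", "A3", "A4"]
chosen = target_slice("record-0001", slices)  # same ID always yields the same slice
```

Because the mapping depends only on the ID and the slice count, repeated calls with the same ID select the same target slice, which is what allows the availability check in step 404 to be made against a fixed target.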
Step 404, the first storage node obtains available state information of the target storage slice.
Optionally, after determining the target storage segment, the availability status information of the target storage segment may be queried in the segment information set stored in the first storage node to obtain the availability status information of the target storage segment. When the availability status information indicates that the target memory slice is available, the data to be stored may be directly stored to the target memory slice, i.e., step 406 is executed. When the availability status information indicates that the target memory slice is not available, the data to be stored may be stored in other available memory slices, i.e., step 405 is performed.
By way of example, assume that the slice information set stored in the first storage node indicates: storage slice A1 is available, storage slice A2 is available, storage slice A3 is available, and storage slice A4 is unavailable. Since the target storage slice determined in step 403 is storage slice A4, querying the slice information set yields the available state information of storage slice A4, namely that storage slice A4 is unavailable, at which point it may be determined that step 405 needs to be performed.
Step 405, when the available state information indicates that the target memory slice is not available, the first storage node stores the data to be stored in other available memory slices.
Optionally, as shown in fig. 6, an implementation of this step 405 may include:
Step 4051, when the available state information indicates that the target storage segment is unavailable, obtaining index information corresponding to the target storage segment.
In the process of initializing the data storage system, index information corresponding to each storage node can be generated according to a preset setting instruction. The index information corresponding to each storage node may be used to indicate a plurality of storage slices managed by the corresponding storage node. For example: the index information corresponding to the target storage slice is used to indicate a plurality of storage slices managed by the storage node corresponding to the target storage slice, and the plurality of storage slices include the target storage slice. And, after the index information is generated, the corresponding relationship of the storage node, the storage segment and the index information may be stored at a preset position for subsequent use. Therefore, after the target memory slice is determined, the corresponding relationship may be queried according to the target memory slice to determine the index information corresponding to the target memory slice.
Step 4052, determining the other storage slices, among the plurality of storage slices, except the target storage slice.
After the index information corresponding to the target storage slice is determined, the plurality of storage slices managed by the first storage node may be determined according to the index information, and the storage slices other than the target storage slice among them are determined. For example, the correspondence among the storage node, the storage slices and the index information may be queried according to the index information to determine the plurality of storage slices managed by the first storage node, and the storage slices other than the target storage slice are determined as the other storage slices.
Step 4053, determining another available target storage slice based on the availability status information of each other storage slice, and storing the data to be stored in that other target storage slice.
After the other storage slices are determined, the availability status information of each of them may be obtained to determine whether the corresponding storage slice is available. One available storage slice among the other storage slices is then determined as the other target storage slice, and the data to be stored is stored in it.
For example, assume that the target storage slice is storage slice A4, that the index information corresponding to storage slice A4 is index information B, and that the storage slices managed by the first storage node corresponding to storage slice A4 are storage slice A1, storage slice A2, storage slice A3, and storage slice A4. When storage slice A4 is determined to be unavailable according to the availability status information, it may be determined that the index information corresponding to storage slice A4 is index information B. According to index information B, the storage slices other than the target storage slice among the plurality of storage slices indicated by index information B are storage slice A1, storage slice A2, and storage slice A3. According to the availability status information corresponding to each of them, it may be determined that storage slice A1, storage slice A2, and storage slice A3 are all available. Any one of the three may then be determined as the other target storage slice, and the data to be stored is stored in it.
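Steps 4051 to 4053 and the example above can be sketched as follows. This is an illustration under stated assumptions, not the embodiment's implementation: the function `pick_other_target_slice`, the dictionaries, and the `store` stand-in are hypothetical, and choosing the first available slice is an assumed policy, since the text allows any available one.

```python
from typing import Dict, List, Optional


def pick_other_target_slice(
    target: str,
    index_to_slices: Dict[str, List[str]],
    availability: Dict[str, bool],
) -> Optional[str]:
    # Step 4051: obtain the index information whose slice list contains the target.
    index_id = next(
        (idx for idx, slices in index_to_slices.items() if target in slices), None
    )
    if index_id is None:
        return None
    # Step 4052: the other slices indicated by that index information.
    others = [s for s in index_to_slices[index_id] if s != target]
    # Step 4053: take the first available one (assumed policy; any would do).
    return next((s for s in others if availability.get(s, False)), None)


index_to_slices = {"B": ["A1", "A2", "A3", "A4"]}
availability = {"A1": True, "A2": True, "A3": True, "A4": False}
store: Dict[str, List[str]] = {}  # hypothetical stand-in for slice contents

chosen = pick_other_target_slice("A4", index_to_slices, availability)
if chosen is not None:
    store.setdefault(chosen, []).append("data to be stored")
print(chosen)  # A1
```

If no other slice is available the function returns `None`; how that case is handled is not described in the text and is left to the caller here.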
Step 406, when the availability status information indicates that the target storage slice is available, the first storage node stores the data to be stored in the target storage slice.
In summary, in the data storage method provided in the embodiments of the present invention, after the target storage slice for storing the data to be stored is determined, whether the target storage slice is available is checked, and the data to be stored is stored in another available storage slice when the target storage slice is unavailable. Compared with the related art, this ensures that the data to be stored can be stored, avoids data accumulation caused by data that cannot be stored, and effectively improves the stability of data storage.
It should be noted that, the sequence of the steps of the data storage method provided in the embodiment of the present invention may be appropriately adjusted, and the steps may also be increased or decreased according to the circumstances, and any method that can be easily conceived by a person skilled in the art within the technical scope disclosed in the present invention should be included in the protection scope of the present invention, and therefore, the details are not described again.
Fig. 7 is a schematic structural diagram of a data storage device according to an embodiment of the present invention. As shown in fig. 7, the data storage device 700 may include:
A determining module 701, configured to determine a target storage slice for storing data to be stored.
A first obtaining module 702, configured to obtain availability status information of the target storage slice, where the availability status information is used to indicate whether a storage slice is available.
A storage module 703, configured to store the data to be stored in another available storage slice when the availability status information indicates that the target storage slice is unavailable.
In summary, in the data storage device provided in the embodiment of the present invention, after the determining module determines the target storage slice for storing the data to be stored, whether the target storage slice is available is determined, and when it is unavailable the storage module stores the data to be stored in another available storage slice. As with the method embodiment, this ensures that the data can be stored and avoids data accumulation caused by data that cannot be stored.
Optionally, the storage module 703 is configured to:
When the availability status information indicates that the target storage slice is unavailable, obtain index information, where the index information is used to indicate a plurality of storage slices, and the plurality of storage slices include the target storage slice.
Determine the other storage slices, among the plurality of storage slices, except the target storage slice.
Determine another available target storage slice based on the availability status information of each other storage slice.
Store the data to be stored in the other target storage slice.
Optionally, the first obtaining module 702 is configured to: query the availability status information of the target storage slice in the slice information set stored by the first storage node, where the slice information set records the availability status information of the plurality of storage slices managed by the first storage node.
Optionally, the plurality of storage nodes comprises: a master storage node and a slave storage node, and when the first storage node is the master storage node, as shown in fig. 8, the apparatus 700 may further include:
A second obtaining module 704, configured to obtain the availability status information of each storage slice in the data storage system.
A sending module 705, configured to send the availability status information of each storage slice to each slave storage node, so that each storage node establishes and stores a slice information set, where the slice information set records the availability status information of the plurality of storage slices managed by the corresponding storage node.
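The distribution of availability status from the master storage node can be sketched as follows. The class `StorageNode`, the function `broadcast_status`, and the slice and node names are hypothetical, and a direct method call stands in for whatever network transmission the system actually uses.

```python
from typing import Dict, List


class StorageNode:
    """Hypothetical storage node that keeps a slice information set."""

    def __init__(self, name: str, managed_slices: List[str]) -> None:
        self.name = name
        self.managed_slices = managed_slices
        self.slice_info_set: Dict[str, bool] = {}

    def receive_status(self, status: Dict[str, bool]) -> None:
        # Establish and store the set, recording only the slices this node manages.
        self.slice_info_set = {
            s: status[s] for s in self.managed_slices if s in status
        }


def broadcast_status(
    master: StorageNode, slaves: List[StorageNode], status: Dict[str, bool]
) -> None:
    # The master builds its own set and sends the status to every slave node.
    for node in [master, *slaves]:
        node.receive_status(status)


master = StorageNode("master", ["A1", "A2"])
slave = StorageNode("slave", ["A3", "A4"])
system_status = {"A1": True, "A2": True, "A3": True, "A4": False}
broadcast_status(master, [slave], system_status)
print(slave.slice_info_set)  # {'A3': True, 'A4': False}
```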
Optionally, the storage module 703 is further configured to: when the availability status information indicates that the target storage slice is available, store the data to be stored in the target storage slice.
Optionally, the determining module 701 is configured to:
Obtain the identifier (ID) of the data to be stored.
Calculate a hash value of the data to be stored based on the ID.
Determine the storage slice indicated by the hash value as the target storage slice.
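The hash-based selection performed by the determining module can be sketched as follows. The text does not name a hash function or say how a hash value indicates a slice, so the use of SHA-256 and of a modulo over the slice list are assumptions made purely for illustration, as are the function name `target_slice` and the sample ID.

```python
import hashlib
from typing import List


def target_slice(data_id: str, slices: List[str]) -> str:
    """Map the ID of the data to be stored onto one storage slice."""
    # Hash the ID (SHA-256 is an assumed choice of hash function).
    digest = hashlib.sha256(data_id.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest[:8], "big")
    # Assumed mapping: the hash value modulo the slice count selects the slice.
    return slices[hash_value % len(slices)]


slices = ["A1", "A2", "A3", "A4"]
chosen_target = target_slice("record-0001", slices)
print(chosen_target)
```

The same ID always maps to the same slice, which is why a fallback is needed when that slice is unavailable.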
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and modules may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
An embodiment of the present invention provides a server, which may be the data storage device described above. As shown in fig. 9, the server 01 includes a processor 12 and a memory 16, wherein:
the memory 16 is configured to store a computer program;
the processor 12 is configured to execute the program stored in the memory 16 to implement the data storage method according to the foregoing embodiments. For example, the method may include:
Determining a target storage slice for storing data to be stored.
Obtaining availability status information of the target storage slice, where the availability status information is used to indicate whether a storage slice is available.
When the availability status information indicates that the target storage slice is unavailable, storing the data to be stored in another available storage slice.
In particular, processor 12 includes one or more processing cores. The processor 12 executes various functional applications and data processing by running a computer program stored in the memory 16, which includes software programs and units.
The computer programs stored by the memory 16 include software programs and units. In particular, the memory 16 may store an operating system 162 and an application program unit 164 required for at least one function. The operating system 162 may be an operating system such as Real Time eXecutive (RTX), LINUX, UNIX, WINDOWS, or OS X. The application program unit 164 may include a determining unit 164a, a first obtaining unit 164b, and a storage unit 164c.
The determining unit 164a has the same or similar functions as the determining module 701.
The first obtaining unit 164b has the same or similar functions as the first obtaining module 702.
The storage unit 164c has the same or similar functions as the storage module 703.
The embodiment of the invention provides a storage medium which can be a nonvolatile computer readable storage medium, wherein a computer program is stored in the storage medium, and when the computer program is executed by a processor, the computer program realizes the data storage method provided by the embodiment of the method.
Embodiments of the present invention further provide a computer program product containing instructions, which when run on a computer, cause the computer to execute the data storage method provided by the above method embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (15)

1. A data storage method applied to a first storage node in a data storage system, the data storage system comprising a plurality of storage nodes, the first storage node being any one of the plurality of storage nodes, the method comprising:
determining a target memory fragment for storing data to be stored;
acquiring available state information of the target memory fragment, wherein the available state information is used for indicating whether the memory fragment is available;
when the availability status information indicates that the target storage slice is unavailable, obtaining index information, wherein the index information is used for indicating a plurality of storage slices, and the plurality of storage slices comprise the target storage slice;
determining other memory slices of the plurality of memory slices except the target memory slice;
determining other target memory slices that are available based on the availability status information of each other memory slice;
and storing the data to be stored in the other target memory fragments.
2. The method of claim 1, wherein the obtaining the availability status information of the target memory slice comprises:
and querying the available state information of the target storage fragment in a fragment information set stored by the first storage node, wherein the fragment information set records the available state information of a plurality of storage fragments managed by the first storage node.
3. The method of claim 1 or 2, wherein the plurality of storage nodes comprises: a master storage node and a slave storage node, when the first storage node is the master storage node, the method further comprising:
acquiring available state information of each memory fragment in the data storage system;
and sending the available state information of each storage fragment to each slave storage node, so that each storage node establishes and stores a fragment information set, wherein the fragment information set records the available state information of a plurality of storage fragments managed by the corresponding storage node.
4. The method according to claim 1 or 2, wherein after the obtaining the availability status information of the target memory slice, the method further comprises:
when the available state information indicates that the target memory slice is available, storing the data to be stored to the target memory slice.
5. The method according to claim 1 or 2, wherein the determining a target memory slice for storing data to be stored comprises:
acquiring an identification ID of the data to be stored;
calculating a hash value of the data to be stored based on the ID;
and determining the storage slice indicated by the hash value as the target storage slice.
6. A data storage apparatus, applied to a first storage node in a data storage system, the data storage system including a plurality of storage nodes, the first storage node being any one of the plurality of storage nodes, the apparatus comprising:
the determining module is used for determining a target memory fragment for storing data to be stored;
a first obtaining module, configured to obtain available state information of the target memory slice, where the available state information is used to indicate whether a memory slice is available;
a storage module, configured to obtain index information when the availability status information indicates that the target storage partition is unavailable, where the index information is used to indicate multiple storage partitions, and the multiple storage partitions include the target storage partition;
determining other memory slices of the plurality of memory slices except the target memory slice;
determining other target memory slices that are available based on the availability status information of each other memory slice;
and storing the data to be stored in the other target memory fragments.
7. The apparatus of claim 6, wherein the first obtaining module is configured to:
and querying the available state information of the target storage fragment in a fragment information set stored by the first storage node, wherein the fragment information set records the available state information of a plurality of storage fragments managed by the first storage node.
8. The apparatus of claim 6 or 7, wherein the plurality of storage nodes comprise: a master storage node and a slave storage node, when the first storage node is the master storage node, the apparatus further comprising:
the second acquisition module is used for acquiring available state information of each memory fragment in the data storage system;
and the sending module is used for sending the available state information of each storage fragment to each slave storage node, so that each storage node establishes and stores a fragment information set, and the fragment information set records the available state information of a plurality of storage fragments managed by the corresponding storage node.
9. The apparatus of claim 6 or 7, wherein the storage module is further configured to:
and when the available state information indicates that the target memory fragment is available, storing the data to be stored to the target memory fragment.
10. The apparatus of claim 6 or 7, wherein the determining module is configured to:
acquiring an identification ID of the data to be stored;
calculating a hash value of the data to be stored based on the ID;
and determining the storage slice indicated by the hash value as the target storage slice.
11. A data processing system, characterized in that the data processing system comprises: a data storage system;
the data storage system includes: a plurality of storage nodes, each storage node comprising a data storage device as claimed in any one of claims 6 to 10.
12. The system of claim 11, wherein the plurality of storage nodes comprises: a master storage node and a slave storage node;
the master storage node is further to: acquiring available state information of each memory fragment in the data storage system, and sending the available state information of each memory fragment to each slave storage node, wherein the available state information is used for indicating whether the memory fragment is available;
each of the storage nodes is further configured to: and establishing and storing a fragment information set based on the available state information of each storage fragment, wherein the fragment information set records the available state information of a plurality of storage fragments managed by the storage node.
13. The system of claim 11, wherein the data storage system comprises: an Elasticsearch server cluster.
14. The system of any of claims 11 to 13, wherein the data processing system further comprises: a data distribution system;
the data distribution system is used for receiving data of the data acquisition system and distributing the data to each storage node in the data storage system.
15. The system of claim 14, wherein the data distribution system comprises: a Kafka server cluster.
CN201810872737.0A 2018-08-02 2018-08-02 Data storage method and device and data processing system Active CN110798492B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810872737.0A CN110798492B (en) 2018-08-02 2018-08-02 Data storage method and device and data processing system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810872737.0A CN110798492B (en) 2018-08-02 2018-08-02 Data storage method and device and data processing system

Publications (2)

Publication Number Publication Date
CN110798492A CN110798492A (en) 2020-02-14
CN110798492B true CN110798492B (en) 2022-08-09

Family

ID=69425941

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810872737.0A Active CN110798492B (en) 2018-08-02 2018-08-02 Data storage method and device and data processing system

Country Status (1)

Country Link
CN (1) CN110798492B (en)

Also Published As

Publication number Publication date
CN110798492A (en) 2020-02-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant