WO2016054818A1 - Data processing method and apparatus - Google Patents

Data processing method and apparatus

Info

Publication number
WO2016054818A1
WO2016054818A1 PCT/CN2014/088351 CN2014088351W WO2016054818A1 WO 2016054818 A1 WO2016054818 A1 WO 2016054818A1 CN 2014088351 W CN2014088351 W CN 2014088351W WO 2016054818 A1 WO2016054818 A1 WO 2016054818A1
Authority
WO
WIPO (PCT)
Prior art keywords
value
key
data
metadata
storage disk
Prior art date
Application number
PCT/CN2014/088351
Other languages
English (en)
French (fr)
Inventor
罗雄
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2014/088351 (WO2016054818A1, zh)
Priority to EP14903664.2A (EP3196776B1, en)
Priority to CN201480075499.0A (CN106164898B, zh)
Publication of WO2016054818A1 (zh)
Priority to US15/484,152 (US11003719B2, en)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G06F3/0635 Configuration or reconfiguration of storage systems by changing the path, e.g. traffic rerouting, path reconfiguration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0644 Management of space entities, e.g. partitions, extents, pools
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Definitions

  • The present application relates to data processing technology, and more particularly to a data processing method and apparatus.
  • A storage system can consist of multiple physical storage nodes (referred to as physical nodes in this application), each of which can provide storage space.
  • This storage mode is called distributed storage.
  • One distributed storage method is called Key-Value storage.
  • In Key-Value storage, the stored data (or data fragment) is called a Value, and each piece of data has a unique identifier within the entire storage system. This identifier is the Key, and Keys and Values correspond one-to-one.
  • For example, the Key can be "volume name + logical block address (LBA)", and the Value is the data recorded in that sector, with a size of, for example, 512 bytes.
  • The Key and Value of the same data are collectively referred to as a Key-Value, abbreviated as K-V.
  • Each Key-Value is stored in one physical node of the storage system.
  • For each Key-Value, the physical node that stores it can be determined by a mapping rule: a hash value is generated by hashing the Key, and the hash value is then mapped to a physical node. Under this method, if two different Keys produce the same hash value, the Key-Values corresponding to the two Keys must be stored on the same physical node.
  • Because the number of physical nodes is often fixed while the number of Key-Values is variable and often exceeds the number of physical nodes, the same physical node is allowed to store multiple Key-Values with different Keys. In other words, even if two different Keys produce different hash values, the Key-Values corresponding to the two Keys may still be stored on the same physical node.
  • For example, the hash value calculated from a Key falls within the integer interval [0, 2^32-1].
  • This large integer interval is divided into multiple partitions of equal or approximately equal size, so that the number of hash values in each partition is basically the same.
  • For example, if the entire hash space is divided into two partitions, partition 1 and partition 2, then the interval represented by partition 1 is [0, 2^32/2] and the interval represented by partition 2 is [2^32/2+1, 2^32-1].
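  • The division of the hash space into partitions can be illustrated with a minimal Python sketch. The function names are hypothetical, and the sketch assumes a 32-bit hash space split into contiguous ranges whose sizes differ by at most one; with two partitions it places the boundary value 2^32/2 in partition 2, which differs by one value from the intervals quoted above.

```python
HASH_BITS = 32
HASH_SPACE = 1 << HASH_BITS  # hash values fall in [0, 2^32 - 1]


def partition_bounds(num_partitions):
    """Split [0, 2^32 - 1] into contiguous ranges of (nearly) equal size."""
    bounds = []
    for i in range(num_partitions):
        low = i * HASH_SPACE // num_partitions
        high = (i + 1) * HASH_SPACE // num_partitions - 1
        bounds.append((low, high))
    return bounds


def partition_of(hash_value, num_partitions):
    """Return the 1-based number of the partition containing hash_value."""
    if not 0 <= hash_value < HASH_SPACE:
        raise ValueError("hash value outside the 32-bit hash space")
    for number, (low, high) in enumerate(partition_bounds(num_partitions), start=1):
        if low <= hash_value <= high:
            return number
    raise AssertionError("unreachable: the ranges cover the whole hash space")
```

  • For instance, partition_of(0, 2) falls in partition 1 and partition_of(2**32 - 1, 2) falls in partition 2.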
  • the physical node includes a storage medium, and may further include components such as a memory and a CPU.
  • A virtual storage node (virtual node), in contrast to a physical storage node, is a logical division of the storage space of physical nodes. Each physical node can be virtualized into one or more virtual nodes; in a few cases, multiple physical nodes can also be virtualized into one virtual node. Each partition corresponds to a virtual node (Virtual Node). After receiving a new Key-Value, the storage system selects the virtual node that stores the Key-Value according to the partition into which the hash value of its Key falls.
  • In a network topology, a service host is connected to a server, and data is written to the storage medium through the server. The server therefore needs to record a mapping table (also called a partition view) that records the mapping between partitions and physical nodes.
  • As a result, the server is heavily loaded and its operation is complicated.
  • the present invention provides a data processing technique that can reduce server load and operational complexity.
  • A first aspect of the present invention provides a data processing method, including: a switching device receives a Key-Value packet sent by a server, where the destination address of the Key-Value packet is a partition number; the switching device obtains the partition number from the Key-Value packet and queries a partition view to obtain the storage disk address corresponding to the partition number, where the partition view records the correspondence between partition numbers and storage disk addresses; the switching device converts the Key-Value packet into a storage disk message by changing the destination address of the Key-Value packet to the storage disk address; and the switching device sends the storage disk message to the storage disk corresponding to the storage disk address.
  • The present invention provides a data exchange apparatus, including: a receiving module, configured to receive a Key-Value packet whose destination address is a partition number; a query module, configured to obtain the partition number from the Key-Value packet and query the partition view to obtain the storage disk address corresponding to the partition number, where the partition view records the correspondence between partition numbers and storage disk addresses; a message conversion module, configured to convert the Key-Value packet into a storage disk message by changing the destination address of the Key-Value packet to the storage disk address; and a sending module, configured to send the storage disk message to the storage disk corresponding to the storage disk address.
  • The present invention provides a write data system, including a server and the switching device of the second aspect of the present invention, wherein the Key-Value packet is a data Key-Value packet.
  • The server is configured to calculate the data Key corresponding to a data Value according to a data Key algorithm, obtain the data partition number of the data Value according to a partition number algorithm, generate the data Key-Value packet, and send the data Key-Value packet to the switching device, where the destination address of the data Key-Value packet is the data partition number and the data Key-Value packet carries the data Key and the data Value.
  • the present invention provides a write data system including a server and a switching device.
  • The server is configured to calculate the data Key corresponding to a data Value according to a data Key algorithm, obtain the data partition number of the data Value according to a partition number algorithm, and generate the data Key-Value packet.
  • The switching device is connected to the server and is configured to receive the Key-Value packet sent by the server. The switching device is further configured to obtain the data partition number from the data Key-Value packet and query the partition view to obtain the data storage disk address corresponding to the data partition number, where the partition view records the correspondence between the data partition number and the data storage disk address; to convert the data Key-Value packet into a data storage disk message by changing the destination address of the data Key-Value packet to the data storage disk address; and to send the data storage disk message to the data storage disk corresponding to the data storage disk address.
  • The present invention provides a switching device comprising: an interface configured to provide an external connection; a computer-readable medium configured to store a computer program; and a processor connected to the interface and the computer-readable medium.
  • The processor is configured to perform the following steps by running the program: receiving a Key-Value packet whose destination address is a partition number; obtaining the partition number from the Key-Value packet, and querying the partition view to obtain the storage disk address corresponding to the partition number, where the partition view records the correspondence between partition numbers and storage disk addresses; converting the Key-Value packet into a storage disk message by changing the destination address of the Key-Value packet to the storage disk address; and sending the storage disk message to the storage disk corresponding to the storage disk address.
  • the present invention provides a write data system including a switching device and a server.
  • the server is configured to calculate a data key corresponding to the data value Value according to a data key Key algorithm, obtain a data partition number of the data value according to a partition number algorithm, and generate the data Key-Value packet.
  • The switching device is connected to the server and is configured to: receive the data Key-Value packet; obtain the data partition number from the data Key-Value packet; query the partition view to obtain the data storage disk address corresponding to the data partition number, where the partition view records the correspondence between the partition number and the data storage disk address; convert the data Key-Value packet into a data storage disk message by changing the destination address of the data Key-Value packet to the data storage disk address; and send the data storage disk message to the data storage disk corresponding to the data storage disk address.
  • The Key-Value packet uses the partition number as its destination address, the partition view is removed from the server, and the switching device performs the packet conversion. This improves the data processing efficiency of the server and reduces its load and operational complexity. It also saves network bandwidth across the storage system.
  • FIG. 1 is a topological view of an embodiment of a storage system of the present invention
  • FIG. 2 is a flow chart of an embodiment of a method for writing data and reading data according to the present invention
  • FIG. 3 is a flow chart of an embodiment of a method for writing data according to the present invention.
  • FIG. 4 is a flow chart of an embodiment of a method for reading data according to the present invention.
  • FIG. 5 is a block diagram showing an embodiment of a switching device of the present invention.
  • FIG. 6 is a block diagram of an embodiment of a switching device of the present invention.
  • Key-Value storage is a distributed storage method.
  • the storage object is Key-Value.
  • the Key-Value storage can store multiple Key-Values in multiple storage disks.
  • Embodiments of the invention involve the following terms.
  • Key-Value: consists of two parts, a Value and a Key. The Value is the data itself, and the Key is the index of the Value. Keys and Values correspond one-to-one, so the corresponding Value can be found through its Key.
  • Metadata: descriptive information about data, also referred to as data about data; for example, it may be an index of the data.
  • Data: in the embodiments, "data" refers to the object described by the metadata and does not include the metadata.
  • Data Value: the data itself.
  • Data Key: the label of the data Value in Key-Value storage.
  • Metadata Value: the metadata itself.
  • Metadata Key: the label of the metadata Value in Key-Value storage. The data Key and the metadata Key are both Keys, and the data Value and the metadata Value are both Values.
  • Key-Value packet: a new packet format. It differs from an IP packet in that its destination address is a partition number and its frame type field marks it as a Key-Value packet. The rest can be the same as an IP packet.
  • Data Key-Value packet: a type of Key-Value packet whose payload carries the data Key and the data Value; it can be used to store the Key-Value in a storage disk.
  • Metadata Key-Value packet: a type of Key-Value packet whose payload carries the metadata Key and the metadata Value; it can be used to store the Key-Value in a storage disk.
  • Data request Key-Value packet: a type of Key-Value packet whose payload carries the data Key. It is sent by the server to the switching device and can be used to request the Value corresponding to the Key in the payload, that is, the data Value.
  • Metadata request Key-Value packet: a type of Key-Value packet whose payload carries the metadata Key. It is sent by the server to the switching device and can be used to request the Value corresponding to the Key in the payload, that is, the metadata Value.
  • Storage disk: a storage device that can use flash memory (Flash), magnetic disk or tape as its storage medium. Its external interface can be an IP interface or another interface. A storage disk that uses an IP interface is also called an IP disk. An IP disk has a CPU and memory and therefore a certain amount of processing power, so to some extent an IP disk is equivalent to a combination of a storage controller and a storage medium.
  • Storage disk message: a message that can be recognized by the storage disk. The storage disk message is an IP packet.
  • Metadata storage disk message: its payload is the same as that of the metadata Key-Value packet. The two differ in the destination address: the destination address of the storage disk message is an address that the storage disk can recognize, whereas the destination address of the Key-Value packet is the partition number. The marker field also differs, so that a device can distinguish storage disk messages from Key-Value packets.
  • Data storage disk message: similar to the metadata storage disk message, except that the payload is different.
  • Data response storage disk message: a type of storage disk message that is the response to a data request storage disk message; its payload carries the data Value requested by the data request Key-Value packet.
  • Metadata response storage disk message: a type of storage disk message that is the response to a metadata request storage disk message; its payload carries the metadata Value requested by the metadata request Key-Value packet.
  • the embodiment of the invention provides a method for writing data and a method for reading data, which can be applied to a storage system composed of a server and a switching device.
  • One end of the switching device 2 is connected to the server 1, and the other end of the switching device 2 can be connected to a storage device 3, such as a memory or a storage array composed of memories.
  • FIG. 2 shows a write data method and a read data method based on this system; the two methods are relatively independent of each other.
  • the switching device is, for example, a switch or a switch with routing capabilities.
  • When reading or writing data, the server 1 does not need to query the partition view, which reduces resource usage on the server.
  • The partition view needs to be sent to each switching device so that the storage devices can be accessed for reading or writing.
  • The number of servers 1 is often much larger than the number of switching devices 2, so the amount of data required to publish the partition view to all servers 1 is much larger than the amount required to publish it to all switching devices 2. Once the partition view is no longer stored on the servers 1, the amount of data needed to publish the partition view is greatly reduced.
  • Conventionally, the server and the switch communicate using the IP protocol.
  • When storing data, the server must know the address of the storage disk in order to access the data, and the switch acts only as a relay device.
  • In the embodiment of the present invention, each server 1 and the switching device 2 must communicate with each other using Key-Value packets.
  • The server 1 only needs to calculate the partition number of the Key-Value packet and does not need to know the address of the storage disk.
  • On the one hand, the embodiment of the present invention reduces resource usage on the server; on the other hand, since the server 1 does not hold storage disk addresses, the security of the entire storage system is improved, because the address of a storage disk cannot be obtained by intruding into the server 1.
  • Identical copies of the same data need to be stored in different storage devices.
  • Conventionally, when writing data, the server needs to send multiple identical copies to the switch, and the switch then forwards them to the corresponding storage devices one by one.
  • In the embodiment of the present invention, when writing data, the server 1 may send only one copy to the switching device 2; the switching device 2 generates the multiple copies and then forwards them to the respective storage devices.
  • The technique of the embodiment of the present invention thus reduces the number of packets sent between the server 1 and the switching device 2.
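  • The copy generation in the switching device can be sketched as follows. This is only an illustration: the text does not specify how the switching device chooses the target storage devices, so the sketch assumes a hypothetical replica_view that maps each partition number to a list of storage addresses, and it represents packets as plain dictionaries.

```python
def fan_out(kv_packet, replica_view):
    """Turn one incoming K-V packet into one outgoing message per copy.

    replica_view is a hypothetical mapping: partition number -> list of
    storage addresses that should each hold a copy.
    """
    addresses = replica_view[kv_packet["dst_partition"]]
    return [
        {"dst_address": address, "payload": kv_packet["payload"]}
        for address in addresses
    ]


# Illustrative addresses only.
replica_view = {3: ["192.0.2.11", "192.0.2.12", "192.0.2.13"]}
incoming = {"dst_partition": 3, "payload": b"aaa.doc-1:..."}
outgoing = fan_out(incoming, replica_view)
```

  • Here the server sends a single packet, and the three copies exist only on the link between the switching device and the storage devices.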
  • the server 1 can receive data from an external host and store it in the storage device 3 through the switching device 2. Alternatively, the data in the storage device 3 is read out by the switching device 2.
  • the switching device 2 can support an OSI Layer 2 protocol, such as a switch; or support both an OSI Layer 2 protocol and a Layer 3 protocol, such as a combination of a switch and a router.
  • the storage device 3 is used to store data.
  • Each storage device 3 contains one or more partitions; a partition is a logical concept.
  • The write data method includes the following steps.
  • the server receives the data to be written.
  • the data to be written can be, for example, a complete file, or a data stream, or it can be part of a file or part of a data stream.
  • the server splits the data to be written into at least one slice, which is also called a data value.
  • According to the data Key algorithm, the data Key of each data Value is obtained, and according to the partition number algorithm, the partition number of each data Value is obtained.
  • The payload of each data Key-Value packet carries a data Key and the data Value corresponding to that data Key; a Key-Value packet can also be referred to simply as a K-V packet.
  • For the partition number, reference can be made to distributed hash table (DHT) technology.
  • The hash space is divided evenly and in order into partitions, and each segment of the hash space is assigned a unique number, which is the partition number.
  • The hash space may be the range of values obtained after the Key is passed through a hash function.
  • For example, the server splits the data to be written at a granularity of 1 MB into a plurality of data fragments, each of which is also referred to as a data Value.
  • The algorithm for obtaining the data Key corresponding to a data Value may be set arbitrarily, as long as the one-to-one correspondence between the two is satisfied. For example, a third party may provide a number to serve as the Key of the Value, or the Key may be calculated from parameters of the Value. A Key can be regarded as a label of a Value that uniquely identifies the Value.
  • Each partition number uniquely identifies a partition, and partitions correspond to storage disks; for example, multiple partitions may correspond to one storage disk.
  • the partition number can be a number or a letter or other form of marking.
  • the partition number can also be called a partition address.
  • One of the algorithms numbers the Values and maps them to the partitions one by one.
  • For example, an algorithm may be: perform a hash calculation on the Key of the Value, take the resulting hash value modulo the total number of partitions owned by all the storage disks, use the remainder as the partition number, and store the Value in the partition with that number. For example, if the remainder is 3, the Value is stored in the third partition, and the partition number of this partition can be named partition 3.
  • Another algorithm can be: assume that there are three partitions in total, namely partition 1, partition 2 and partition 3. Then the first Value is stored in partition 1, the second in partition 2, the third in partition 3, the fourth in partition 1, the fifth in partition 2, the sixth in partition 3, the seventh in partition 1, and so on.
  • Another algorithm uses pseudo-random numbers and randomly assigns a partition number to each Value. Statistically, the Values are then distributed roughly evenly across the partitions.
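  • The sequential and pseudo-random assignments described above can be sketched in Python as follows. The function names and the fixed seed are assumptions made for illustration; the text does not prescribe a particular random number generator.

```python
import itertools
import random


def round_robin_partitions(num_values, num_partitions):
    """Assign Values to partitions 1..num_partitions in rotating order."""
    cycle = itertools.cycle(range(1, num_partitions + 1))
    return [next(cycle) for _ in range(num_values)]


def random_partitions(num_values, num_partitions, seed=0):
    """Assign each Value a pseudo-random partition number in 1..num_partitions."""
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    return [rng.randint(1, num_partitions) for _ in range(num_values)]
```

  • With three partitions, round_robin_partitions(7, 3) reproduces the order given above: partitions 1, 2, 3, 1, 2, 3, 1.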
  • For example, the Keys of the 9 data fragments into which the data is split are aaa.doc-1, aaa.doc-2, aaa.doc-3, ..., aaa.doc-9.
  • This naming method for the 9 data fragments is the data Key algorithm.
  • The hash value of each of the 9 data Keys is divided by the number of partitions, 4. A remainder of 1 corresponds to partition 1, a remainder of 2 corresponds to partition 2, and so on.
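  • The aaa.doc example can be traced with a short Python sketch. The text does not name a hash function, so zlib.crc32 is used purely as a stand-in; the helper names are hypothetical, and the remainder (0 to 3 for four partitions) is used directly as the partition number.

```python
import zlib

FRAGMENT_SIZE = 1024 * 1024  # 1 MB split granularity


def split_into_values(data, fragment_size=FRAGMENT_SIZE):
    """Split the data to be written into fragments (data Values)."""
    return [data[i:i + fragment_size] for i in range(0, len(data), fragment_size)]


def data_keys(file_name, count):
    """Data Key algorithm of the example: file name plus a running index."""
    return [f"{file_name}-{i}" for i in range(1, count + 1)]


def partition_number(key, num_partitions):
    """Hash the Key and use the remainder as the partition number.

    zlib.crc32 is an assumed stand-in for the unspecified hash function.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions
```

  • Splitting a 9 MB file this way yields 9 data Values with Keys aaa.doc-1 to aaa.doc-9, each of which maps deterministically to one of the 4 partitions.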
  • the K-V packet sent to the switching device is a newly defined packet, which uses the partition number as the destination address.
  • the source address of the K-V packet may be the IP address of the server.
  • Apart from the destination address, the marker field that distinguishes K-V packets from IP packets is also different. The remaining fields of a K-V packet can be consistent with those of an IP packet.
  • K-V packets are similar to IP packets and belong to the network layer of the Open Systems Interconnection (OSI) model.
  • The difference is that an IP packet carries a 32-bit destination IP address, whereas a K-V packet carries the partition number in its place; the partition number can also be 32 bits.
  • the upper layer protocol of the protocol used by the K-V packet is the UDP protocol, and the lower layer protocol is the MAC protocol.
  • the K-V packet is encapsulated into a MAC frame.
  • The source address of the MAC frame is the MAC address of the server, and the destination address is the MAC address of the switching device or the broadcast address 0xFFFFFFFFFFFF. The "Frame Type" field marks the frame content as a K-V packet to distinguish it from an IP packet.
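  • The encapsulation can be pictured with a simplified Python sketch. It is not the packet layout of the application: the K-V header here contains only a source IP address and a 32-bit partition number, the UDP layer and the other IP-like fields are omitted, and the frame type value 0x88B5 (an EtherType reserved by IEEE for local experimental use) is an assumed placeholder.

```python
import socket
import struct

KV_FRAME_TYPE = 0x88B5  # assumed placeholder value for "this is a K-V packet"
BROADCAST_MAC = bytes.fromhex("ffffffffffff")  # 48-bit MAC broadcast address


def build_kv_packet(src_ip, partition_number, payload):
    """Simplified K-V packet: source IP, 32-bit partition number, payload."""
    return socket.inet_aton(src_ip) + struct.pack("!I", partition_number) + payload


def build_mac_frame(dst_mac, src_mac, kv_packet):
    """Wrap the K-V packet in a MAC frame tagged with the K-V frame type."""
    return dst_mac + src_mac + struct.pack("!H", KV_FRAME_TYPE) + kv_packet


def is_kv_frame(frame):
    """Check the frame type field (bytes 12-13) to recognize a K-V packet."""
    (frame_type,) = struct.unpack("!H", frame[12:14])
    return frame_type == KV_FRAME_TYPE
```

  • A receiving device can call is_kv_frame on each frame and read the partition number from bytes 18 to 21 of frames that match.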
  • the MAC frame also needs to be encapsulated by the physical layer.
  • The switching device receives the data K-V packet and obtains the storage disk address corresponding to the partition number of the data Value by querying the partition view.
  • Using that storage disk address as the destination address, the switching device converts the data K-V packet into a data storage disk message and sends the data storage disk message.
  • the partition view records the correspondence between the partition number and the storage disk address, and the partition view may be stored in the switching device and updated by a controller that communicates with the switching device.
  • The destination address of the data storage disk message is the storage disk address.
  • The corresponding storage disk receives the data storage disk message and stores the Key and Value carried in it.
  • The payloads of the data storage disk message and the data K-V packet are the same.
  • the working principle of the switching device is: after receiving the Key-Value packet sent by the server, the switching device reads the partition number carried in the packet, and queries the partition view to obtain the storage disk address.
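  • The conversion performed by the switching device can be sketched as a lookup followed by a destination rewrite. The dictionary-based packet representation, the field names and the sample addresses are assumptions for illustration only.

```python
def convert_to_storage_disk_message(kv_packet, partition_view):
    """Replace the partition-number destination with a storage disk address.

    partition_view maps partition number -> storage disk address; the payload
    (Key and Value) is carried over unchanged.
    """
    partition_number = kv_packet["dst"]
    disk_address = partition_view[partition_number]  # KeyError if unmapped
    return {
        "frame_type": "storage-disk",
        "src": kv_packet["src"],
        "dst": disk_address,
        "payload": kv_packet["payload"],
    }


# Several partitions may map to the same storage disk.
partition_view = {1: "192.0.2.11", 2: "192.0.2.12", 3: "192.0.2.11"}
kv_packet = {"frame_type": "kv", "src": "192.0.2.10", "dst": 3,
             "payload": ("aaa.doc-1", b"fragment bytes")}
message = convert_to_storage_disk_message(kv_packet, partition_view)
```

  • Only the switching device holds partition_view in this sketch; the sender needs to know nothing beyond the partition number.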
  • When the number of partitions is greater than the number of storage disks, multiple partitions may correspond to the same storage disk.
  • the storage disk here is a physical storage, and may be, for example, a magnetic disk, a solid state hard disk SSD, a rewritable optical disk, or the like.
  • the address of the storage disk can be an IP address.
  • The switching device can be connected to the storage disk through Ethernet. If the packet sent by the switching device to the storage disk is an IP packet, then, because TCP/IP is not a bottom-layer protocol, in a practical solution the IP packet needs to be encapsulated again, for example into an Ethernet frame, before being sent to the storage disk. After receiving the Ethernet frame, the storage disk obtains the IP packet by decapsulating it.
  • A bearer protocol such as Asynchronous Transfer Mode (ATM) can also be used for the underlying data transmission between the switching device and the storage disk.
  • the Ethernet frame needs to be further encapsulated into a physical layer format to be sent to the storage disk.
  • Steps 12-13 describe the process of sending the data Key-Value. Note that when check fragments exist, for example 2 check fragments for every 5 data fragments, the check fragments are sent using the same operations as the data fragments. That is, when check fragments exist, the data Values mentioned in the steps of the method are of two types: one type is obtained directly by splitting the data and is the original data Value; the other type is check information for the original data Values. Because check Values are processed in the same way as original data Values, the two need not be distinguished and are collectively referred to as data Values.
  • Each data Key-Value is stored in the same manner as in step 13. After the storage disk successfully stores the data Key-Value, it sends a response message indicating that the storage is successful to the switching device, and the switching device sends a response message indicating that the storage is successful to the server. After receiving the storage success response message sent by the switching device, the server performs the processing of the metadata Key-Value of steps 14-15.
  • Metadata is defined relative to data. After the data is split into data Values, the metadata Value is generated.
  • The metadata Value can record which data Values the data was split into, that is, the index of the data Values, and the order of these data Values within the data.
  • The metadata Value can be obtained from the data. The server obtains the metadata Key corresponding to the metadata Value according to the metadata Key algorithm, and obtains the partition number of the metadata Value according to the partition number algorithm; this partition number is called the metadata partition number.
  • The server generates a metadata K-V packet with the metadata partition number as the destination address and sends the metadata K-V packet to the switching device, where the payload of each metadata K-V packet carries the metadata Key and the metadata Value corresponding to that metadata Key.
  • The metadata Value records the index information of the data Values. If the metadata is smaller than the size of a data Value, the metadata does not need to be split. Continuing the previous example, for the data whose file name is aaa.doc, the metadata Key algorithm may be to append -metadata to the file name, so the Key of the metadata may be aaa.doc-metadata.
  • The metadata Key algorithm can be different from the data Key algorithm.
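The metadata Key and metadata Value described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names and the JSON layout of the metadata Value are assumptions, and only the "-metadata" suffix and the idea of recording the data Keys in order come from the text.

```python
import json
from typing import List


def metadata_key(file_name: str) -> str:
    """Metadata Key algorithm from the example: file name plus "-metadata"."""
    return f"{file_name}-metadata"


def build_metadata_value(data_keys: List[str]) -> bytes:
    """Record the data Keys in fragment order.

    The JSON encoding is an assumption made for this sketch; the list order
    stands for the order of the data Values within the data.
    """
    return json.dumps({"data_keys": data_keys}).encode("utf-8")


key = metadata_key("aaa.doc")
value = build_metadata_value(["aaa.doc-1", "aaa.doc-2", "aaa.doc-3"])
print(key)  # aaa.doc-metadata
```

Keeping the data Keys as an ordered list lets one structure serve as both the index and the order record that the text asks the metadata Value to hold.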
  • The switching device receives the metadata K-V packet, obtains the storage disk address corresponding to the metadata partition number by querying the partition view, uses that storage disk address as the destination address to convert the metadata Key-Value message into a metadata storage disk message, and sends the metadata storage disk message to the corresponding storage disk.
  • Steps 14-15 are similar to steps 12-13, except that the data becomes metadata, so they are not described in detail in this embodiment.
  • any one of the three methods mentioned in step 12 may be employed.
  • the switching device can distinguish between K-V packets and other packets (such as IP packets) by reading the frame type field.
  • the read data method and the write data method are relatively independent. The following describes the read data method embodiment.
  • the server receives a read data request.
  • the data to be read requested by the read data request may be, for example, a complete file, or a data stream, or may be part of the file or part of the data stream.
  • The server calculates the metadata Key from the read data request using the metadata Key algorithm and obtains the partition number of the metadata according to the partition number algorithm. Using that partition number as the destination address, it generates a metadata read request Key-Value message whose payload carries the metadata Key, and sends the metadata read request Key-Value message to the switching device.
  • The Key of the metadata can also be called the metadata Key.
  • the read data request carries information of the data to be read, and the metadata Key algorithm is used to calculate the metadata key according to the information of the data to be read.
  • The metadata Key algorithm of this step and the metadata Key algorithm of step 14 may be the same. For example, if the information of the data to be read is its file name and the metadata Key algorithm appends the suffix -metadata to the file name, then the metadata Key for the data to be read whose file name is aaa.doc is aaa.doc-metadata.
  • the algorithm for obtaining the metadata partition number of the server is the same as that of step 12, that is, for the read request and the write request of the same data, the partition numbers calculated according to the preset partition number algorithm are the same.
  • the partition number can be obtained according to the Key.
  • the metadata key is also aaa.doc-metadata.
  • the metadata partition number can be obtained using one of the methods mentioned in step 12. Of course, you can also obtain the metadata partition number by other means without using the Key.
  • The metadata partition number is placed in the header of the metadata read request Key-Value message as the destination address, and the Key is placed in the payload, which yields the metadata read request Key-Value message.
  • the content requested by the metadata read request Key-Value message is the metadata Value, and the server sends the read request to the switching device.
  • The message format of the metadata read request is the same as that of the K-V message in step 12, with the metadata Key written in the payload. Similarly, this message needs to be encapsulated into a MAC frame and further converted into a physical signal for transmission to the switching device.
  • The switching device receives the metadata read request Key-Value packet, obtains the metadata partition number, which serves as the destination address, from the packet header, and obtains the storage disk address corresponding to the metadata partition number by querying the partition view. It replaces the destination address of the metadata read request Key-Value message with that storage disk address, thereby converting the metadata read request Key-Value message into a metadata request storage disk message, and sends the metadata request storage disk message to the corresponding storage disk.
  • the mapping between the metadata partition number and the corresponding storage disk address is recorded in the partition view.
  • The destination address of the converted metadata request storage disk message is a storage disk address. After receiving the metadata request storage disk message, the storage disk returns the metadata Value to the switching device.
  • the returned packet format can be an IP packet, and the metadata value is carried in the payload of the IP packet.
  • the storage disk sends a metadata response storage disk message to the switching device, and the switching device receives the metadata response storage disk message and forwards the message to the server.
  • the packets forwarded by the switching device to the server can use the IP packet without using the K-V packet format.
  • the metadata response storage disk message is a response message of the metadata request storage disk message.
  • The server decapsulates the received metadata response storage disk message and, from the metadata Value carried in its payload, obtains the data Key of each data Value and the data partition number of the partition that stores each data Value.
  • the data request key-value packet is generated by using the data partition number as the destination address, and the data request key-value packet carries the data key, and the data request Key-Value packet is sent to the switching device.
  • the specific method of obtaining the data key and the data partition number can be the same as step 22.
  • The metadata Value carries the order of the data Values and the index of each data Value (for example, the data Key corresponding to the data Value). The partition number of a data Value is obtained from its data Key using the same algorithm as in step 12.
  • Alternatively, the metadata Value may not carry the data Keys, in which case the data Keys are obtained by the server. The data partition number is then obtained from the data Key using the same algorithm as in step 12.
  • the metadata Value can also directly carry the data partition number, which can eliminate the process of obtaining the data partition number by the data Key calculation.
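A minimal sketch of how a server might turn a metadata Value into data request Key-Value packets is shown below. The JSON layout of the metadata Value, the dictionary used to stand in for a packet, and the `partition_of` callback are all assumptions made for illustration; the text only requires that the destination address be the data partition number and that the payload carry the data Key.

```python
import json
from typing import Callable, Dict, List


def build_data_requests(
    metadata_value: bytes,
    partition_of: Callable[[str], int],
) -> List[Dict[str, object]]:
    """Create one data request per data Key listed in the metadata Value.

    `partition_of` is whatever partition number algorithm the system uses;
    it is passed in so that reads and writes can share the same algorithm.
    """
    data_keys = json.loads(metadata_value.decode("utf-8"))["data_keys"]
    return [{"dest": partition_of(key), "payload": key} for key in data_keys]


metadata = json.dumps({"data_keys": ["aaa.doc-1", "aaa.doc-2"]}).encode("utf-8")
# A toy partition function, used here only to make the example runnable.
requests = build_data_requests(metadata, lambda key: len(key) % 4)
```

If the metadata Value carried the data partition numbers directly, as the last variant above allows, the `partition_of` call would simply be replaced by a lookup.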
  • The switching device receives the data request Key-Value message, obtains the storage disk address corresponding to the data partition number by querying the partition view, uses that storage disk address as the destination address to convert the data request Key-Value message into a data request storage disk message, and sends the data request storage disk message to the corresponding storage disk.
  • After receiving the data request storage disk message, the storage disk looks up the data Value corresponding to the data Key, carries the found data Value in the payload of an IP packet, and returns it to the switching device. This IP packet is called a data response storage disk message.
  • Step 23 obtains the metadata Value, whereas the object acquired in this step is the data Value.
  • the switching device is further configured to: receive a data response storage disk message and forward the message to the server.
  • the server is further configured to decapsulate the data response storage disk message, and combine the plurality of data values into the data to be read requested by the read data request.
  • After the server has collected all the data fragments that constitute the data to be read, it combines the data fragments in sequence according to the order information recorded in the metadata Value to form the data to be read, and returns the data to be read to the requester.
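The reassembly step can be sketched as below. This is an illustrative sketch under assumptions: the function name is hypothetical, the ordered Key list stands for the order information recorded in the metadata Value, and the dictionary stands for data Values that may arrive from the storage disks in any order.

```python
from typing import Dict, List


def assemble(ordered_keys: List[str], fetched: Dict[str, bytes]) -> bytes:
    """Concatenate data Values in the order recorded in the metadata Value."""
    missing = [key for key in ordered_keys if key not in fetched]
    if missing:
        # Do not return partial data if any fragment has not arrived yet.
        raise KeyError(f"missing data Values for: {missing}")
    return b"".join(fetched[key] for key in ordered_keys)


# Responses may arrive out of order; the metadata order decides the result.
responses = {"aaa.doc-2": b"world", "aaa.doc-1": b"hello "}
print(assemble(["aaa.doc-1", "aaa.doc-2"], responses))  # b'hello world'
```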
  • The header of a Key-Value packet also marks it as a Key-Value packet, to distinguish it from an IP packet. For example, the frame type field is set to 0x8040 to indicate a Key-Value message.
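A minimal sketch of the frame type check follows. It assumes that the frame type field is the 2-byte type field at offset 12 of an untagged Ethernet (MAC) frame; the text does not fix this layout. The value 0x8040 comes from the example above, and 0x0800 is the standard type value for IPv4.

```python
import struct

KV_FRAME_TYPE = 0x8040    # example value given in the text
IPV4_FRAME_TYPE = 0x0800  # standard type value for IPv4
ETHERNET_HEADER_LEN = 14  # assumed layout: 6 + 6 address bytes, 2 type bytes
FRAME_TYPE_OFFSET = 12


def frame_type(frame: bytes) -> int:
    """Read the 2-byte frame type field in network byte order."""
    if len(frame) < ETHERNET_HEADER_LEN:
        raise ValueError("frame shorter than an Ethernet header")
    (value,) = struct.unpack_from("!H", frame, FRAME_TYPE_OFFSET)
    return value


def is_kv_packet(frame: bytes) -> bool:
    """True if the frame is marked as a Key-Value packet."""
    return frame_type(frame) == KV_FRAME_TYPE


kv_frame = bytes(12) + struct.pack("!H", KV_FRAME_TYPE) + b"payload"
ip_frame = bytes(12) + struct.pack("!H", IPV4_FRAME_TYPE) + b"payload"
```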
  • Another embodiment of the present invention is a data processing method, which is another expression of the above method. The method includes the following steps 33-36 performed by the switching device. See Figure 3 and Figure 4.
  • The switching device receives the key-value (Key-Value) packet, where the destination address of the Key-Value packet is a partition number.
  • The switching device obtains the partition number from the Key-Value packet and queries the partition view to obtain the storage disk address corresponding to the partition number, where the partition view records the correspondence between the partition number and the storage disk address.
  • The switching device converts the Key-Value packet into a storage disk message by changing the destination address of the Key-Value packet to the storage disk address.
  • The switching device sends the storage disk message to the storage disk corresponding to the storage disk address.
  • Steps 33-36 are performed by the switching device, without distinguishing between a method of writing data or a method of reading data.
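The switching device's part of the method (obtain the partition number, query the partition view, rewrite the destination address) can be sketched as below. The `Packet` class, the `kind` marker, and the sample disk address are hypothetical names introduced for this sketch; a real device would operate on frames rather than Python objects.

```python
from dataclasses import dataclass, replace
from typing import Dict, Union


@dataclass(frozen=True)
class Packet:
    kind: str              # "kv" for a Key-Value packet, "disk" for a storage disk packet
    dest: Union[int, str]  # partition number for "kv", storage disk address for "disk"
    payload: bytes


def convert(kv_packet: Packet, partition_view: Dict[int, str]) -> Packet:
    """Turn a Key-Value packet into a storage disk packet.

    Only the destination address and the type marker change; the payload
    is forwarded untouched.
    """
    if kv_packet.kind != "kv":
        raise ValueError("not a Key-Value packet")
    disk_address = partition_view[kv_packet.dest]  # query the partition view
    return replace(kv_packet, kind="disk", dest=disk_address)


view = {3: "10.0.0.13"}  # partition number -> storage disk address (sample data)
out = convert(Packet("kv", 3, b"aaa.doc-1|..."), view)
```

The same function serves both writes and reads, which matches the statement that the switching device does not distinguish between the two.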
  • When steps 33-36 are used in a data writing method, see Figure 3. Assuming that the Key-Value packet is a data Key-Value packet, the method further includes step 30 before the step in which the switching device receives the Key-Value packet.
  • The server calculates the data Key corresponding to the data Value according to the data Key algorithm, obtains the data partition number of the data Value according to the partition number algorithm, generates the data Key-Value packet, and sends the data Key-Value packet to the switching device, where the destination address of the data Key-Value packet is the data partition number and the payload of the data Key-Value packet carries the data Key and the data Value.
  • The server calculates the metadata Key corresponding to the metadata Value according to the metadata Key algorithm, obtains the metadata partition number of the metadata Value according to the partition number algorithm, generates a metadata Key-Value packet, and sends the metadata Key-Value packet to the switching device, where the destination address of the metadata Key-Value packet is the metadata partition number, the payload of the metadata Key-Value packet carries the metadata Key and the metadata Value, and the metadata Value records index information of the data Value.
  • Obtaining the data partition number according to the partition number algorithm includes: taking the hash value of the data Key modulo the total number of partitions and using the resulting remainder as the data partition number.
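The hash-and-remainder rule can be sketched as follows. The choice of CRC-32 as the hash function is an assumption made for this sketch (the text does not name one), and partitions are numbered from 0 here so that every possible remainder is a valid partition number.

```python
import zlib


def partition_number(key: str, partition_count: int) -> int:
    """Hash the Key and take the remainder modulo the number of partitions."""
    if partition_count <= 0:
        raise ValueError("partition_count must be positive")
    # zlib.crc32 returns an unsigned 32-bit integer in Python 3.
    return zlib.crc32(key.encode("utf-8")) % partition_count


for n in range(1, 10):
    key = f"aaa.doc-{n}"
    print(key, "-> partition", partition_number(key, 4))
```

Because the result depends only on the Key and the partition count, a read request and a write request for the same data arrive at the same partition number, as the method requires.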
  • The switching device receives the metadata Key-Value packet, obtains the metadata storage disk address corresponding to the partition number of the metadata Value by querying the partition view, converts the metadata Key-Value packet into a metadata storage disk packet by changing the destination address of the metadata Key-Value packet to the metadata storage disk address, and sends the metadata storage disk packet to the storage disk corresponding to the metadata storage disk address.
  • Before step 30, the following steps 401 and 402 may also be included.
  • the server receives data to be written.
  • the server splits the data to be written into a data Value.
  • When the Key-Value packet is a metadata request Key-Value packet, the method further includes steps 31 and 32 before the switching device receives the Key-Value packet.
  • the server receives the read data request and requests to obtain the data to be read.
  • The server obtains the metadata Key of the data to be read according to the metadata Key algorithm, obtains the metadata partition number from the metadata Key according to the partition number algorithm, generates the metadata request Key-Value packet, and sends the metadata request Key-Value packet to the switching device, where the destination address of the metadata request Key-Value packet is the metadata partition number and the metadata request Key-Value packet carries the metadata Key.
  • the method may further include steps 37-41.
  • The switching device receives a metadata response message and forwards it to the server, where the payload of the metadata response message carries a metadata Value and the metadata response message is the response message of the storage disk message.
  • The server decapsulates the metadata response message, obtains a data Key from the metadata Value, uses the data Key to calculate a data partition number according to the partition number algorithm, generates a data request Key-Value packet, and sends the data request Key-Value packet to the switching device, where the destination address of the data request Key-Value packet is the data partition number and the data request Key-Value packet carries the data Key.
  • The switching device receives the data request Key-Value packet, obtains the storage disk address corresponding to the data partition number by querying the partition view, uses that storage disk address as the destination address to convert the data request Key-Value packet into a data request storage disk message, and sends the data request storage disk message to the storage disk corresponding to that storage disk address.
  • The switching device receives the data response storage disk message and forwards it to the server, where the data response message is the response message of the data request storage disk message and carries the data Value.
  • The server decapsulates the data response message to obtain a data Value and combines the data Values of the plurality of data Keys into the data to be read.
  • the Key-Value packet uses the frame type field to mark the packet type.
  • Embodiments of the method of the present invention include steps 11-28, steps 301-36, and steps 31-41.
  • part of the operations are performed by the server and another part of the operations are performed by the switching device.
  • The switching device 2 in the embodiment of the present invention may be composed of a processor 51, a computer readable medium 52, and an interface 53, with the processor 51 connected to the computer readable medium 52 and to the interface 53 via a bus.
  • The interface 53 provides an external connection, for example to the server 1. The computer readable medium 52 stores computer program code. By running the program code in the computer readable medium 52, the processor 51 performs the operations of the switching device in steps 11-28, in steps 301-36, or in steps 31-41.
  • The server provided by the embodiment of the present invention may be composed of a processor, a storage medium, and an interface, with the processor connected to the storage medium and to the interface.
  • The interface provides an external connection, and the storage medium stores computer program code. By running the program code in the storage medium, the processor performs the operations of the server in steps 11-28, in steps 301-36, or in steps 31-41.
  • the switching device 2 provided by the embodiment of the present invention includes a receiving module 61, a querying module 62, a message converting module 63, and a sending module 64.
  • The switching device 2 can perform the methods mentioned above, such as the operations of the switching device in steps 11-28, in steps 301-36, or in steps 31-41.
  • The receiving module 61 is configured to receive a key-value (Key-Value) packet, where the destination address of the Key-Value packet is a partition number. The query module 62 is configured to obtain the partition number from the Key-Value packet and query the partition view to obtain the storage disk address corresponding to the partition number, where the partition view records the correspondence between the partition number and the storage disk address. The message conversion module 63 is configured to convert the Key-Value packet into a storage disk packet by changing the destination address of the Key-Value packet to the storage disk address. The sending module 64 is configured to send the storage disk packet to the storage disk corresponding to the storage disk address.
  • the Key-Value packet received by the receiving module 61 may be a metadata request Key-Value packet.
  • The receiving module 61 is further configured to receive a metadata response message and forward it to the server 1, where the payload of the metadata response message carries a metadata Value and the metadata response message is the response message of the storage disk message.
  • The receiving module 61 is further configured to read the frame type field to determine the packet type of the Key-Value packet. Because a Key-Value packet is similar to an IP packet, the frame type field can be used to distinguish between the two.
  • the Key-Value packet received by the receiving module 61 may be a data request Key-Value packet.
  • the switching device 2 and the server 1 together constitute a write data system of an embodiment of the present invention.
  • The server 1 is configured to: calculate the data Key corresponding to the data Value according to the data Key algorithm, obtain the data partition number of the data Value according to the partition number algorithm, generate the data Key-Value packet, and send the data Key-Value packet to the switching device 2, where the destination address of the data Key-Value packet is the data partition number and the payload of the data Key-Value packet carries the data Key and the data Value.
  • The server 1 is further configured to: calculate the metadata Key corresponding to the metadata Value according to the metadata Key algorithm, obtain the metadata partition number of the metadata Value according to the partition number algorithm, generate a metadata Key-Value packet, and send the metadata Key-Value packet to the switching device 2, where the destination address of the metadata Key-Value packet is the metadata partition number, the payload of the metadata Key-Value packet carries the metadata Key and the metadata Value, and the metadata Value records index information of the data Value.
  • The receiving module 61 is further configured to receive the metadata Key-Value packet. The querying module 62 is further configured to obtain the metadata storage disk address corresponding to the partition number of the metadata Value by querying the partition view. The message conversion module 63 is further configured to convert the metadata Key-Value packet into a metadata storage disk packet by changing the destination address of the metadata Key-Value packet to the metadata storage disk address. The sending module 64 is further configured to send the metadata storage disk packet to the storage disk corresponding to the metadata storage disk address.
  • The server is further configured to, before processing the data Value, receive the data to be written and split the data to be written into data Values.
  • Obtaining the partition number of the data Value according to the partition number algorithm includes: the server 1 takes the hash value of the data Key modulo the total number of partitions and uses the resulting remainder as the data partition number.
  • The server 1 is used to execute the parts of the above method that involve the server, for example the operations of the server in steps 11-28, in steps 301-36, or in steps 31-41.
  • the write data system of the embodiment of the present invention will be described below from another angle.
  • the write data system consists of server 1 and switching device 2.
  • The server 1 is configured to calculate the data Key corresponding to the data Value according to the data Key algorithm, obtain the data partition number of the data Value according to the partition number algorithm, generate the data key-value (Key-Value) packet, and send the data Key-Value packet to the switching device, where the destination address of the data Key-Value packet is the data partition number and the payload of the data Key-Value packet carries the data Key and the data Value.
  • the switching device 2 is connected to the server 1 and configured to receive the Key-Value packet sent by the server 1.
  • The switching device 2 is further configured to obtain the data partition number from the data Key-Value packet and query the partition view to obtain the data storage disk address corresponding to the partition number, where the partition view records the correspondence between the data partition number and the data storage disk address.
  • the switching device 2 is further configured to convert the data Key-Value message into a data storage disk message by changing a destination address of the data Key-Value message to the data storage disk address;
  • the switching device 2 is further configured to send the data storage disk message to the data storage disk 3 corresponding to the data storage disk address.
  • The server 1 is further configured to: calculate the metadata Key corresponding to the metadata Value according to the metadata Key algorithm, obtain the metadata partition number of the metadata Value according to the partition number algorithm, generate a metadata Key-Value packet, and send the metadata Key-Value packet to the switching device 2, where the destination address of the metadata Key-Value packet is the metadata partition number, the payload of the metadata Key-Value packet carries the metadata Key and the metadata Value, and the metadata Value records index information of the data Value.
  • The switching device 2 is further configured to receive the metadata Key-Value packet, obtain the metadata storage disk address corresponding to the partition number of the metadata Value by querying the partition view, convert the metadata Key-Value packet into a metadata storage disk packet by changing the destination address of the metadata Key-Value packet to the metadata storage disk address, and send the metadata storage disk packet to the storage disk 3 corresponding to the metadata storage disk address.
  • the storage system includes the switching device 2 and the server 1.
  • The server 1 is configured to calculate the data Key corresponding to the data Value according to the data Key algorithm, obtain the data partition number of the data Value according to the partition number algorithm, generate the data Key-Value packet, and send the data Key-Value packet to the switching device, where the destination address of the data Key-Value packet is the data partition number and the payload of the data Key-Value packet carries the data Key and the data Value.
  • The switching device 2 is connected to the server 1 and configured to: receive the data Key-Value packet; obtain the data partition number from the data Key-Value packet and query the partition view to obtain the data storage disk address corresponding to the data partition number, where the partition view records the correspondence between the partition number and the data storage disk address; convert the data Key-Value packet into a data storage disk packet by changing the destination address of the data Key-Value packet to the data storage disk address; and send the data storage disk packet to the data storage disk 3 corresponding to the data storage disk address.
  • aspects of the present invention, or possible implementations of various aspects may be embodied as a system, method, or computer program product.
  • aspects of the invention, or possible implementations of various aspects may be in the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, etc.), or a combination of software and hardware aspects, They are collectively referred to herein as "circuits," “modules,” or “systems.”
  • aspects of the invention, or possible implementations of various aspects may take the form of a computer program product
  • a computer program product refers to computer readable program code stored on a computer readable medium.
  • the computer readable medium can be a computer readable signal medium or a computer readable storage medium.
  • The computer readable storage medium includes, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing, such as random access memory (RAM), read only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, and portable compact disc read-only memory (CD-ROM).
  • The processor in the computer reads the computer readable program code stored in the computer readable medium, so that the processor can perform the functional actions specified in each step, or combination of steps, of the flowchart, and produce an apparatus that implements the functional actions specified in each block, or combination of blocks, of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Data Mining & Analysis (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

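A data processing technique in which a switching device receives a Key-Value packet sent by a server, obtains a partition number, and queries a partition view to obtain the storage disk address corresponding to the partition number; converts the Key-Value packet into a storage disk packet by changing the destination address of the Key-Value packet to the storage disk address; and sends the storage disk packet to the storage disk corresponding to the storage disk address.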

Description

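Data processing method and apparatus
Technical Field
This application relates to data processing technology, and in particular to a data processing method and apparatus.
Background
As society develops, the volume of data that needs to be stored and managed keeps growing, to the point of being called massive data. When ultra-large-scale data is managed with traditional centralized storage, it is difficult to provide efficient read and write operations and difficult to achieve good scalability and high availability.
Against this background, storage systems composed of multiple physical storage nodes (referred to in this application as physical nodes) have emerged, in which every storage node can provide storage space. This approach is called distributed storage. One form of distributed storage is called key-value (Key-Value) storage. In Key-Value storage, the stored data (or data fragment) is called the value (Value), and each piece of data has an identifier that is unique across the whole storage system; this identifier is the key (Key). Keys and Values correspond one to one. For example, to access data located in a certain disk sector, the Key can be "volume name + logical block address (Logical Block Address, LBA)" and the Value is the data recorded in that sector, with a data size of, for example, 512 bytes.
The Key and Value of the same data taken together are called a Key-Value, K-V for short. Each Key-Value is stored in one physical node of the storage system. For a specific Key-Value, the physical node that stores it can be determined by a mapping rule: a hash operation is applied to the Key to produce a hash value, and the hash value is then mapped to a physical node. With this method, if two different Keys yield the same hash value, the Key-Values corresponding to these two Keys are necessarily stored on the same physical node.
In addition, the number of physical nodes is usually fixed, while the number of Key-Values is not and often exceeds the number of physical nodes, so the same physical node is allowed to store multiple Key-Values with different Keys. That is, even if two different Keys yield different hash values, their respective Key-Values may still be stored on the same physical node.
The hash value calculated from a Key falls within the integer interval [0, 2^32-1]. When the system is initialized, this large integer interval is segmented into multiple partitions (Partition) of equal or similar size, so that each partition contains essentially the same number of hash values. For example, if the whole hash space is divided into 2 partitions, partition 1 and partition 2, then partition 1 represents the interval [0, 2^32/2] and partition 2 represents the interval [2^32/2+1, 2^32-1].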
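The range-based partitioning in the background example can be sketched as below. The function name is hypothetical, and the generalization to more than 2 partitions is an assumption; the boundaries are chosen so that the 2-partition case reproduces the intervals given above, with partition 1 also absorbing the hash value 0.

```python
HASH_SPACE = 2 ** 32  # hash values lie in [0, 2^32 - 1]


def partition_for_hash(hash_value: int, partition_count: int) -> int:
    """Map a hash value to a 1-based partition covering a contiguous range."""
    if not 0 <= hash_value < HASH_SPACE:
        raise ValueError("hash value outside [0, 2^32 - 1]")
    if partition_count <= 0:
        raise ValueError("partition_count must be positive")
    # Integer form of ceil(hash_value * partition_count / HASH_SPACE).
    index = (hash_value * partition_count + HASH_SPACE - 1) // HASH_SPACE
    return max(1, index)  # hash value 0 belongs to partition 1


print(partition_for_hash(2 ** 32 // 2, 2))      # 1
print(partition_for_hash(2 ** 32 // 2 + 1, 2))  # 2
```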
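A physical node includes a storage medium and may further include components such as memory and a CPU. A virtual storage node is defined in contrast to a physical storage node and is a logical division of the storage space of physical nodes. Each physical node can be virtualized into one or more virtual nodes; in a few cases, multiple physical nodes can also be virtualized into one virtual node; alternatively, each partition corresponds to one virtual node (Virtual Node). After the storage system receives a new Key-Value, it selects the virtual node corresponding to the partition into which the hash value of the Key falls, and stores the Key-Value there.
In one network topology, a service host is connected to a server and writes data to the storage medium through the server. The server therefore has to keep a mapping table (also called a partition view) that records the mapping between partitions and physical nodes. As a result, the server is heavily loaded and its operation is complex.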
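Summary
The present invention provides a data processing technique that can reduce server load and operational complexity.
In a first aspect, the present invention provides a data processing method, including: a switching device receives a key-value (Key-Value) packet sent by a server, where the destination address of the Key-Value packet is a partition number; the switching device obtains the partition number from the Key-Value packet and queries a partition view to obtain the storage disk address corresponding to the partition number, where the partition view records the correspondence between the partition number and the storage disk address; the Key-Value packet is converted into a storage disk packet by changing the destination address of the Key-Value packet to the storage disk address; and the switching device sends the storage disk packet to the storage disk corresponding to the storage disk address.
In a second aspect, the present invention provides a data switching apparatus, including: a receiving module, configured to receive a key-value (Key-Value) packet, where the destination address of the Key-Value packet is a partition number; a query module, configured to obtain the partition number from the Key-Value packet and query a partition view to obtain the storage disk address corresponding to the partition number, where the partition view records the correspondence between the partition number and the storage disk address; a packet conversion module, configured to convert the Key-Value packet into a storage disk packet by changing the destination address of the Key-Value packet to the storage disk address; and a sending module, configured to send the storage disk packet to the storage disk corresponding to the storage disk address.
In a third aspect, the present invention provides a write-data system, including a server and the switching device of the second aspect of the present invention, where the Key-Value packet is a data Key-Value packet:
The server is configured to calculate the data Key corresponding to the data value (Value) according to a data key (Key) algorithm, obtain the data partition number of the data Value according to a partition number algorithm, generate the data Key-Value packet, and send the data Key-Value packet to the switching device, where the destination address of the data Key-Value packet is the data partition number and the payload of the data Key-Value packet carries the data Key and the data Value.
In a fourth aspect, the present invention provides a write-data system, including a server and a switching device. The server is configured to calculate the data Key corresponding to the data value (Value) according to a data key (Key) algorithm, obtain the data partition number of the data Value according to a partition number algorithm, generate the data key-value (Key-Value) packet, and send the data Key-Value packet to the switching device, where the destination address of the data Key-Value packet is the data partition number and the payload of the data Key-Value packet carries the data Key and the data Value. The switching device is connected to the server and is configured to receive the Key-Value packet sent by the server. The switching device is further configured to obtain the data partition number from the data Key-Value packet and query a partition view to obtain the data storage disk address corresponding to the partition number, where the partition view records the correspondence between the data partition number and the data storage disk address. The switching device is further configured to convert the data Key-Value packet into a data storage disk packet by changing the destination address of the data Key-Value packet to the data storage disk address. The switching device is further configured to send the data storage disk packet to the data storage disk corresponding to the data storage disk address.
In a fifth aspect, the present invention provides a switching device, including: an interface, configured to provide an external connection; a computer readable medium, configured to store a computer program; and a processor, connected to the interface and the computer readable medium. The processor is configured to perform the following steps by running the program: receiving a key-value (Key-Value) packet, where the destination address of the Key-Value packet is a partition number; obtaining the partition number from the Key-Value packet and querying a partition view to obtain the storage disk address corresponding to the partition number, where the partition view records the correspondence between the partition number and the storage disk address; converting the Key-Value packet into a storage disk packet by changing the destination address of the Key-Value packet to the storage disk address; and sending the storage disk packet to the storage disk corresponding to the storage disk address.
In a sixth aspect, the present invention provides a write-data system, including a switching device and a server. The server is configured to calculate the data Key corresponding to the data value (Value) according to a data key (Key) algorithm, obtain the data partition number of the data Value according to a partition number algorithm, generate the data Key-Value packet, and send the data Key-Value packet to the switching device, where the destination address of the data Key-Value packet is the data partition number and the payload of the data Key-Value packet carries the data Key and the data Value. The switching device is connected to the server and is configured to: receive the data Key-Value packet; obtain the data partition number from the data Key-Value packet and query a partition view to obtain the data storage disk address corresponding to the data partition number, where the partition view records the correspondence between the partition number and the data storage disk address; convert the data Key-Value packet into a data storage disk packet by changing the destination address of the data Key-Value packet to the data storage disk address; and send the data storage disk packet to the data storage disk corresponding to the data storage disk address.
With the solution of the present invention, Key-Value packets use the partition number as the destination address, the partition view is separated from the server, and packet conversion is performed by the switching device. This improves the data processing efficiency of the server and reduces its load and operational complexity. It can also save network bandwidth across the whole storage system.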
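Brief Description of the Drawings
Figure 1 is a topology diagram of an embodiment of the storage system of the present invention;
Figure 2 is a flowchart of embodiments of the write-data method and the read-data method of the present invention;
Figure 3 is a flowchart of an embodiment of the write-data method of the present invention;
Figure 4 is a flowchart of an embodiment of the read-data method of the present invention;
Figure 5 is a structural diagram of an embodiment of the switching device of the present invention;
Figure 6 is a block diagram of an embodiment of the switching device of the present invention.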
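Detailed Description
The technical solutions of the present invention are described clearly and completely below with reference to the accompanying drawings of the embodiments. Obviously, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained on the basis of the embodiments of the present invention fall within the protection scope of the present invention.
Key-Value storage is a distributed storage approach whose storage objects are Key-Values; it can distribute multiple Key-Values across multiple storage disks.
The embodiments of the present invention involve the following terms.
(1) Key-Value: consists of two parts, a value (Value) and a key (Key). The Value is the data itself and the Key is the index of the Value. Keys and Values correspond one to one, and the corresponding Value can be found through the Key.
(2) Metadata: descriptive information about data, also called data about data; for example, it can be an index of the data. In the embodiments of the present invention, unless otherwise stated, "data" refers specifically to the object described by the metadata and does not include the metadata. Data Value: the data itself. Data Key: in Key-Value storage, the label of a data Value. Metadata Value: the metadata itself. Metadata Key: in Key-Value storage, the label of a metadata Value. Just as data Keys and metadata Keys are both Keys, data Values and metadata Values are both Values.
(3) Key-Value packet: a new packet format. It differs from an IP packet in that its destination address is a partition number and a frame type field marks it as a Key-Value packet; the remaining parts can be the same as in an IP packet.
(4) Data Key-Value packet: a type of Key-Value packet whose payload carries a data Key and a data Value; it can be used to store the Key-Value in a storage disk.
(5) Metadata Key-Value packet: a type of Key-Value packet whose payload carries a metadata Key and a metadata Value; it can be used to store the Key-Value in a storage disk.
(6) Data request Key-Value packet: a type of Key-Value packet whose payload carries the Key of the data. It is sent by the server to the switching device and can be used to request the Value corresponding to the Key in the payload, that is, the data Value.
(7) Metadata request Key-Value packet: a type of Key-Value packet whose payload carries the Key of the metadata. It is sent by the server to the switching device and can be used to request the Value corresponding to the Key in the payload, that is, the metadata Value.
(8) Storage disk: a storage device that can use flash memory (Flash), magnetic disk, or magnetic tape as the storage medium. Its external interface can be an IP interface or another interface. When an IP interface is used, it is also called an IP disk. An IP disk has a CPU and memory and therefore has some processing capability, so to a certain extent an IP disk is equivalent to the combination of a storage controller and a storage medium.
(9) Storage disk packet: a packet that can be recognized by a storage disk. When the storage disk is an IP disk, the storage disk packet is an IP packet. The payload of a metadata storage disk packet is the same as that of the metadata Key-Value packet; the difference lies in the destination address: the destination address of a storage disk packet is an address that the storage disk can recognize, whereas the destination address of a Key-Value packet is a partition number. The tag field also differs, and a device can use the tag field to distinguish storage disk packets from Key-Value packets. A data storage disk packet is similar to a metadata storage disk packet, except that the payload differs.
(10) Data response storage disk packet: a type of storage disk packet that is the response to a data request storage disk packet; its payload carries the data Value requested by the data request Key-Value packet.
(11) Metadata response storage disk packet: a type of storage disk packet that is the response to a metadata request storage disk packet; its payload carries the metadata Value requested by the metadata request Key-Value packet.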
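The embodiments of the present invention provide a data writing method and a data reading method that can be applied in a storage system composed of a server and a switching device. As shown in FIG. 1, one end of the server 1 is connected to the switching device 2, and the other end of the switching device 2 may be connected to the storage device 3. The storage device 3 is, for example, a storage, or a storage array composed of storages. FIG. 2 shows the data writing method and the data reading method based on this system; the two methods are relatively independent of each other. The switching device is, for example, a switch, or a switch with routing capability.
With the solution provided by the present invention, the server 1 does not need to query the partition view when reading or writing data, which reduces the resource usage of the server. In addition, at system initialization, or when storage devices are added to or removed from the system, the partition view needs to be sent to each switching device so that the storage devices can be read or written. The number of servers 1 is often far greater than the number of switching devices 2, so the amount of data needed to publish the partition view to all servers 1 is far greater than the amount needed to publish it to all switching devices. Once the partition view is no longer stored on the servers 1, the amount of data needed to publish it is greatly reduced.
Furthermore, in the prior art the server and the switch communicate using the IP protocol, the server must know the address of the storage disk in order to store or read data, and the switch acts only as a relay. In the embodiments of the present invention, the server 1 and the switching device 2 communicate using Key-Value packets. The server 1 only needs to compute the partition number of the Key-Value packet and does not need to know the address of the storage disk. On the one hand, this reduces the resource usage of the server; on the other hand, because the server 1 holds no storage disk addresses, it also improves the security of the whole storage system, since the addresses of the storage disks cannot be obtained by compromising the server 1.
Further, in a multi-replica scenario, identical replicas need to be stored on different storage devices. In the prior art, when writing data, the server has to send multiple identical replicas to the switch, which then forwards them one by one to the corresponding storages. In the embodiments of the present invention, when writing data, the server 1 may send only one replica to the switching device 2, and the switching device 2 generates the multiple replicas and forwards them to the individual storages. The technique of the embodiments thus reduces the number of packets sent between the server 1 and the switching device 2.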
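The replica handling described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the embodiments: the `Packet` type, the `fan_out` function, and a view that lists several disk addresses per partition are hypothetical names and structures introduced for the example.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class Packet:
    # On the server side dest holds a partition number;
    # after conversion it holds a storage disk address.
    dest: str
    key: str
    value: bytes


def fan_out(kv_packet: Packet, replica_view: Dict[str, List[str]]) -> List[Packet]:
    """Turn one Key-Value packet into one storage disk packet per replica.

    replica_view is an assumed structure: partition number -> list of
    disk addresses that each hold one replica of that partition.
    """
    disk_addresses = replica_view[kv_packet.dest]
    # The payload (key and value) is unchanged; only the destination differs.
    return [replace(kv_packet, dest=address) for address in disk_addresses]


replica_view = {"3": ["10.0.0.11", "10.0.0.12", "10.0.0.13"]}
incoming = Packet(dest="3", key="aaa.doc-1", value=b"payload")
copies = fan_out(incoming, replica_view)
```

In this sketch the server sends a single packet and the switching device emits three, which is the reduction in server-to-switch traffic that the paragraph describes.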
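The server 1 can receive data from an external host and store it in the storage device 3 through the switching device 2, or read data from the storage device 3 through the switching device 2.
The switching device 2 may support OSI layer 2 protocols, for example a switch; or it may support both OSI layer 2 and layer 3 protocols, for example a combination of a switch and a router.
The storage device 3 is used to store data. Each storage device 3 contains one or more partitions; a partition is a logical concept.
The embodiment of the data writing method includes the following steps.
11. The server receives data to be written.
The data to be written may be, for example, a complete file or a data stream, or part of a file or part of a data stream.
12. The server splits the data to be written into at least one fragment; a fragment is also called a data value (Value). The server obtains the data Key of each data Value according to a data Key algorithm and obtains the partition number of each data Value according to a partition number algorithm. It generates a data Key-Value packet for the data Value with the partition number as the destination address and sends the data Key-Value packet to the switching device, where the payload of each data Key-Value packet carries the data Key and the data Value corresponding to that data Key. A Key-Value packet may also be referred to as a K-V packet.
In this step, because these Values are produced by splitting the data received by the server, they are called data Values. Values generated from metadata are referred to below as metadata Values. Data Values and metadata Values are collectively called Values.
The concept of a partition number can be understood with reference to distributed hash table (DHT) technology. The hash space is divided evenly and in order into partitions, and each partition is assigned a unique number; this number is the partition number. The hash space may be the range of values obtained by applying a hash function to a Key.
For example, when the data is larger than 1 MB (megabyte), the server splits the data to be written at a granularity of 1 MB into multiple data fragments, each of which is also called a data Value. The algorithm for obtaining the data Key corresponding to a data Value can be chosen freely, as long as the two correspond one to one. For example, a third party may provide a number to serve as the Key of the Value, or the Key may be computed from parameters of the Value. A Key can be regarded as the label of a Value and uniquely identifies one Value.
Each partition number uniquely identifies one partition. Partitions correspond to storage disks; for example, multiple partitions may correspond to one storage disk. A partition number may be a number, a letter, or some other form of label, and may also be called a partition address. There are also many ways to obtain the partition number of each Value; it is sufficient that the Values are distributed roughly evenly across the partitions, and in special cases an uneven distribution is also acceptable. One such algorithm is to number the Values and map them one by one onto the partitions.
For example, one algorithm may be: compute a hash of the Key of the Value, take the resulting hash value modulo the total number of partitions owned by all storage disks, use the remainder as the partition number, and store the Value in the partition with that number. For example, if the remainder is 3, the Value is stored in the third partition, which may also be named partition 3.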
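The hash-and-remainder algorithm just described can be sketched as below. The description does not name a hash function, so SHA-256 is an arbitrary choice here, and the function name `partition_number` is hypothetical.

```python
import hashlib


def partition_number(key: str, partition_count: int) -> int:
    """Map a Key to a partition number in the range 0 .. partition_count - 1."""
    if partition_count <= 0:
        raise ValueError("partition_count must be positive")
    # Assumption: any stable hash works; SHA-256 is used only for illustration.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest[:8], "big")
    # The remainder of the hash divided by the partition count is the partition number.
    return hash_value % partition_count


total_partitions = 4
pn = partition_number("aaa.doc-1", total_partitions)
```

Because the function is deterministic, a read request for the same Key yields the same partition number as the earlier write, which is what the later reading steps rely on. Note that a plain remainder ranges from 0 to the partition count minus 1, so a deployment that names its partitions starting from 1 would need an agreed offset.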
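Another algorithm may be as follows. Suppose there are three partitions in total: partition 1, partition 2, and partition 3. The first Value is stored in partition 1, the second in partition 2, the third in partition 3, the fourth in partition 1, the fifth in partition 2, the sixth in partition 3, the seventh in partition 1, and so on. Yet another algorithm uses pseudo-random numbers: each Value is assigned a random partition number, and statistically the Values are again distributed roughly evenly across the partition numbers.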
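The two alternatives above, sequential assignment and pseudo-random assignment, might look like this in code. The function names and the seed are assumptions made for the sketch.

```python
import itertools
import random
from typing import Iterator


def round_robin_partitions(partition_count: int) -> Iterator[int]:
    """Yield partition numbers 1, 2, ..., partition_count, 1, 2, ... in turn."""
    return itertools.cycle(range(1, partition_count + 1))


def random_partition(partition_count: int, rng: random.Random) -> int:
    """Pick a pseudo-random partition number between 1 and partition_count."""
    return rng.randint(1, partition_count)


# Sequential assignment of seven Values to three partitions.
assigner = round_robin_partitions(3)
first_seven = [next(assigner) for _ in range(7)]

# Pseudo-random assignment; the seed value is arbitrary.
rng = random.Random(2014)
random_choices = [random_partition(3, rng) for _ in range(1000)]
```

Unlike the hash-based algorithm, neither of these can be recomputed from the Key alone at read time, so the partition number would have to be recorded somewhere, for example in the metadata Value, which the description later allows.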
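For example, suppose the file name of the data to be written is aaa.doc and its size is 9 MB. The Keys of the nine resulting data fragments are then aaa.doc-1, aaa.doc-2, aaa.doc-3, ..., aaa.doc-9; this naming scheme for the nine data fragments is the data Key algorithm. Suppose there are four partitions in total: partition 1, partition 2, partition 3, and partition 4. The hash values of the nine data Keys are taken modulo the number of partitions, 4. A remainder of 1 corresponds to partition 1, a remainder of 2 corresponds to partition 2, and so on.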
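The splitting and Key naming in this example can be sketched as follows. The function name `split_into_values` is hypothetical, and the file content is a placeholder of zero bytes.

```python
from typing import List, Tuple

MB = 1024 * 1024


def split_into_values(name: str, data: bytes,
                      fragment_size: int = MB) -> List[Tuple[str, bytes]]:
    """Split data into fragments and name each one "<name>-<index>".

    Returns (data Key, data Value) pairs in their original order.
    """
    pairs = []
    offsets = range(0, len(data), fragment_size)
    for index, offset in enumerate(offsets, start=1):
        data_key = f"{name}-{index}"
        data_value = data[offset:offset + fragment_size]
        pairs.append((data_key, data_value))
    return pairs


# A 9 MB placeholder file yields nine 1 MB data Values.
pairs = split_into_values("aaa.doc", bytes(9 * MB))
keys = [key for key, _ in pairs]
```

Each resulting Key can then be fed to whichever partition number algorithm the system has chosen.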
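In this step, the K-V packet sent to the switching device is a newly defined packet that uses the partition number as its destination address. The source address of the K-V packet may be the IP address of the server. Besides the different destination address, the label field used to distinguish K-V packets from IP packets also differs; the remaining fields of the K-V packet may be the same as those of an IP packet.
A K-V packet is similar to an IP packet, and both belong to the network layer of the Open Systems Interconnection (OSI) model. The difference is that an IP packet carries a 32-bit destination IP address whereas a K-V packet carries a partition number; the partition number may also be 32 bits. As with IP packets, the upper-layer protocol of the protocol used by K-V packets is UDP and the lower-layer protocol is MAC. When the K-V packet is sent from the server to the switching device, it is encapsulated in a MAC frame whose source address is the MAC address of the server and whose destination address is the MAC address of the switching device or the broadcast address 0xFFFFFFFFFFFF. The "frame type" field marks the frame as carrying a K-V packet, to distinguish it from an IP packet. For actual transmission, the MAC frame additionally needs physical-layer encapsulation.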
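The encapsulation described above can be illustrated with a deliberately simplified frame layout. Everything about the K-V header here is an assumption made for the sketch (a 4-byte source address followed by a 32-bit partition number); the description only says the remaining fields may match an IP packet. The frame type value 0x8040 is the example value given later in this description, and 0x0800 is the standard IPv4 EtherType.

```python
import struct

KV_FRAME_TYPE = 0x8040      # example value from this description
IPV4_FRAME_TYPE = 0x0800    # standard EtherType for IPv4
BROADCAST_MAC = b"\xff" * 6


def build_kv_frame(src_mac: bytes, dst_mac: bytes, src_ip: bytes,
                   partition_number: int, payload: bytes) -> bytes:
    """Wrap a simplified K-V packet in a MAC frame."""
    # MAC header: destination MAC, source MAC, frame type (14 bytes).
    mac_header = struct.pack("!6s6sH", dst_mac, src_mac, KV_FRAME_TYPE)
    # Assumed K-V header: source address, then the partition number
    # in the position where an IP packet would carry its destination.
    kv_header = struct.pack("!4sI", src_ip, partition_number)
    return mac_header + kv_header + payload


def is_kv_frame(frame: bytes) -> bool:
    """Check the frame type field at byte offset 12 of the MAC header."""
    (frame_type,) = struct.unpack_from("!H", frame, 12)
    return frame_type == KV_FRAME_TYPE


frame = build_kv_frame(
    src_mac=bytes.fromhex("020000000001"),
    dst_mac=BROADCAST_MAC,
    src_ip=bytes([192, 168, 0, 1]),
    partition_number=3,
    payload=b"aaa.doc-1",
)
(decoded_partition,) = struct.unpack_from("!I", frame, 18)
```

A receiving switching device could call `is_kv_frame` first and fall back to ordinary IP handling for any other frame type.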
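13. The switching device receives the data K-V packet and, by querying the partition view, obtains the storage disk address corresponding to the partition number of the data Value. Using that storage disk address as the destination address, it converts the data K-V packet into a data storage disk packet and sends the data storage disk packet. The partition view records the correspondence between partition numbers and storage disk addresses; it may be stored in the switching device and updated by a controller that communicates with the switching device.
In this step, because the destination address of the data storage disk packet is the storage disk address, the corresponding storage disk receives the packet after it is sent and then stores the Key and Value carried in it. The data storage disk packet and the data K-V packet have the same payload.
The partition view records the correspondence between partition numbers and storage disk addresses. The switching device works as follows: after receiving a Key-Value packet sent by the server, it reads the partition number carried in the packet and queries the partition view to obtain the storage disk address. When there are more partitions than storage disks, multiple partitions may correspond to the same storage disk. The storage disk here is a physical storage, for example a magnetic disk, a solid-state drive (SSD), or a rewritable optical disc.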
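A partition view of the kind described above is essentially a lookup table. The sketch below assumes an in-memory dictionary and hypothetical method names; the addresses are made up for the example.

```python
from typing import Dict


class PartitionView:
    """Maps partition numbers to storage disk addresses."""

    def __init__(self, mapping: Dict[int, str]) -> None:
        self._mapping = dict(mapping)

    def lookup(self, partition_number: int) -> str:
        """Return the storage disk address for a partition number."""
        return self._mapping[partition_number]

    def update(self, partition_number: int, disk_address: str) -> None:
        """Apply a change pushed by the controller, e.g. after a disk is replaced."""
        self._mapping[partition_number] = disk_address


# Four partitions spread over two storage disks.
view = PartitionView({
    1: "192.168.0.11",
    2: "192.168.0.11",
    3: "192.168.0.12",
    4: "192.168.0.12",
})
address_before = view.lookup(3)
view.update(3, "192.168.0.13")
address_after = view.lookup(3)
```

Because only the switching device holds this table, moving a partition to another disk requires updating the view on the switching devices and nothing on the servers.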
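The address of a storage disk may be an IP address. Note that, as in step 12, the switching device and the storage disk may be connected by Ethernet. If the packet sent by the switching device to the storage disk is an IP packet, then, because TCP/IP is not a lowest-layer protocol, in a practical solution the IP packet needs a further layer of encapsulation, for example into an Ethernet frame, before being sent to the storage disk. After receiving the Ethernet frame, the storage disk decapsulates it to obtain the IP packet. Besides Ethernet, bearer protocols such as Asynchronous Transfer Mode (ATM) may also be used for the underlying data transmission between the switching device and the storage disk. In actual operation, the Ethernet frame further needs to be encapsulated in a physical-layer format before it can be sent to the storage disk.
Steps 12-13 describe how data Key-Values are sent. Note that when check fragments exist, for example two check fragments for every five data fragments, the check fragments are sent by the same operations as the data fragments. In other words, when there are check fragments, the data Values mentioned in the steps of this method are of two types: one type is obtained directly by splitting the data and is an original data Value; the other type is check information for the original data Values. Because check Values are processed by the same flow as original data Values, the two need not be distinguished and are collectively called data Values.
Each data Key-Value is stored in the manner of step 13. After a storage disk has successfully stored a data Key-Value, it sends a storage success response message to the switching device, and the switching device sends a storage success response message to the server. After receiving the storage success response message from the switching device, the server proceeds to the metadata Key-Value processing of steps 14-15.
The sending of metadata Key-Values is described next. Metadata is defined relative to data: after the data has been split into data Values, a metadata Value is generated. The metadata Value can record which data Values the data was split into, that is, an index of the data Values, as well as the order of these data Values within the data.
14. The metadata Value can be obtained from the data. The server obtains the metadata Key corresponding to the metadata Value according to a metadata Key algorithm and obtains the partition number of the metadata Value according to the partition number algorithm; this partition number is called the metadata partition number. The server generates a metadata K-V packet for the metadata Value with the metadata partition number as the destination address and sends the metadata K-V packet to the switching device, where the payload of each metadata K-V packet carries the metadata Key and the metadata Value corresponding to that metadata Key.
The metadata Value records the index information of the data Values. If the metadata is smaller than the size of a data Value, the metadata does not need to be split. Continuing the earlier example, for data whose file name is aaa.doc, the metadata Key algorithm may be to append -metadata to the file name, so the Key of its metadata may be aaa.doc-metadata. The metadata Key algorithm may differ from the data Key algorithm.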
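One possible shape for the metadata Key and metadata Value in this example is sketched below. The description does not fix an encoding for the metadata Value; the JSON layout, the field name `data_keys`, and both function names are assumptions made for illustration.

```python
import json
from typing import List, Tuple


def build_metadata(name: str, data_keys: List[str]) -> Tuple[str, bytes]:
    """Build the metadata Key and an assumed JSON-encoded metadata Value.

    The list order records the order of the data Values within the data.
    """
    metadata_key = f"{name}-metadata"
    metadata_value = json.dumps({"data_keys": data_keys}).encode("utf-8")
    return metadata_key, metadata_value


def read_data_keys(metadata_value: bytes) -> List[str]:
    """Recover the ordered data Keys from a metadata Value."""
    return json.loads(metadata_value.decode("utf-8"))["data_keys"]


data_keys = [f"aaa.doc-{i}" for i in range(1, 10)]
metadata_key, metadata_value = build_metadata("aaa.doc", data_keys)
```

The metadata Key-Value produced this way is then sent exactly like a data Key-Value, with its own partition number as the destination address.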
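15. The switching device receives the metadata K-V packet and, by querying the partition view, obtains the storage disk address corresponding to the metadata partition number. Using that storage disk address as the destination address, it converts the metadata Key-Value packet into a metadata storage disk packet and sends the metadata storage disk packet to the corresponding storage disk.
Because metadata is sent in much the same way as data, that is, steps 14-15 are similar to steps 12-13 with data replaced by metadata, this embodiment does not describe the process in further detail. For example, the partition number in step 14 may be obtained by any of the three methods mentioned in step 12. The switching device can distinguish K-V packets from other packets (for example IP packets) by reading the frame type field.
The data reading method is relatively independent of the data writing method. An embodiment of the data reading method is described below.
21. The server receives a read data request.
The data to be read that the read data request asks for may be, for example, a complete file or a data stream, or part of a file or part of a data stream.
22. According to the read data request, the server computes the metadata Key using the metadata Key algorithm, obtains the partition number of the metadata according to the partition number algorithm, and generates a metadata read request Key-Value packet with that partition number as the destination address. The payload of the metadata read request Key-Value packet carries the metadata Key. The server sends the metadata read request to the switching device. The Key of the metadata may also be called the metadata of the metadata.
The read data request carries information about the data to be read, and the metadata Key is computed from this information using the metadata Key algorithm. The metadata Key algorithm in this step may be the same as that in step 14. For example, if the information about the data to be read is its file name and the metadata Key algorithm appends the suffix -metadata to the file name, then the metadata Key of the data to be read named aaa.doc is aaa.doc-metadata.
The server obtains the metadata partition number using the same algorithm as in step 12. In other words, for a read request and a write request for the same data, the partition numbers computed by the preset partition number algorithm are the same. Continuing the earlier example, the partition number can be obtained from the Key. For a read request for the data named aaa.doc, because the metadata Key is generated by the same rule, its metadata Key in this step is also aaa.doc-metadata. Once the metadata Key has been obtained, one of the methods mentioned in step 12 can be used to obtain the metadata partition number. Of course, the metadata partition number may also be obtained by other means without using the Key.
After the metadata partition number has been obtained, it is placed in the header of the metadata read request Key-Value packet as the destination address, and the Key is placed in the payload, thereby generating the metadata read request Key-Value packet. What this packet requests is the metadata Value. The server sends this read request to the switching device. The packet format of the metadata read request is the same as that of the K-V packet in step 12, with the metadata Key carried in the payload. Likewise, this packet needs to be encapsulated in a MAC frame, and further converted into physical signals, in order to be sent to the switching device.
23. The switching device receives the metadata read request Key-Value packet, obtains from the packet header the metadata partition number that serves as the destination address, and queries the partition view to obtain the storage disk address corresponding to the metadata partition number. It replaces the destination address of the metadata read request Key-Value packet with that storage disk address, thereby converting the metadata read request into a metadata request storage disk packet, and sends the metadata request storage disk packet to the corresponding storage disk. The partition view records the correspondence between the metadata partition number and its storage disk address.
In this step, after conversion the destination address of the packet is the storage disk address. After receiving the metadata request storage disk packet, the storage disk returns the metadata Value to the switching device. The returned packet may be an IP packet, with the metadata Value carried in its payload.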
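24. The storage disk sends a metadata response storage disk packet to the switching device, and the switching device receives it and forwards it to the server. The packet that the switching device forwards to the server need not use the K-V packet format and may be an IP packet. The metadata response storage disk packet is the response to the metadata request storage disk packet.
25. The server decapsulates the received metadata response storage disk packet and, from the metadata Value carried in its payload, obtains the data Key of each data Value as well as the data partition number of the partition that stores each data Value. It generates a data request Key-Value packet with the data partition number as the destination address, carrying the data Key, and sends the data request Key-Value packet to the switching device.
The specific method for obtaining the data Key and the data partition number may be the same as in step 22.
The metadata Value carries the order of the data Values and the index of each data Value (for example, the data Key corresponding to the data Value). The partition number of a data Value is obtained from its data Key by the same algorithm as mentioned in step 12.
The metadata Value may also carry, instead of the data Keys, parameters from which the data Keys can be derived, in which case the server computes the data Keys. The specific algorithm for obtaining the data partition number from the data Key is the same as that mentioned in step 12. The metadata Value may also carry the data partition numbers directly, which removes the need to compute them from the data Keys.
26. The switching device receives the data request Key-Value packet and, by querying the partition view, obtains the storage disk address corresponding to the data partition number. Using that storage disk address as the destination address, it converts the data request Key-Value packet into a data request storage disk packet and sends it to the corresponding storage disk.
After receiving the data request storage disk packet, the storage disk looks up the data Value corresponding to the data Key and returns it to the switching device in the payload of an IP packet. This IP packet is called the data response storage disk packet.
The technical principle of this step is similar to that of step 23; the difference is that step 23 fetches a metadata Value whereas this step fetches a data Value.
27. The switching device receives the data response storage disk packet and forwards it to the server.
28. The server decapsulates the data response storage disk packets and combines the multiple data Values into the data to be read that the read data request asked for.
Once the server has collected all the data fragments that make up the data to be read, it combines them in order, according to the fragment order information recorded in the metadata Value, to form the data to be read, and returns the data to be read to the requester.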
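The final reassembly step can be sketched as follows. The function name `assemble` and the dictionary used to collect responses are assumptions; the point is only that the order comes from the metadata Value, not from the order in which responses arrive.

```python
from typing import Dict, List


def assemble(ordered_keys: List[str], fetched: Dict[str, bytes]) -> bytes:
    """Combine fetched data Values in the order recorded in the metadata Value."""
    missing = [key for key in ordered_keys if key not in fetched]
    if missing:
        # Not all fragments have arrived yet; the caller should keep waiting.
        raise KeyError(f"fragments not yet received: {missing}")
    return b"".join(fetched[key] for key in ordered_keys)


# Order taken from the metadata Value.
ordered_keys = ["aaa.doc-1", "aaa.doc-2", "aaa.doc-3"]
# Responses may arrive in any order.
fetched = {"aaa.doc-3": b"C", "aaa.doc-1": b"A", "aaa.doc-2": b"B"}
data = assemble(ordered_keys, fetched)
```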
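In the foregoing methods, besides using a partition number as the destination address, a Key-Value packet also uses a dedicated field in the packet header to mark that it is a Key-Value packet, so that it can be distinguished from an IP packet. For example, the value 0x8040 in the frame type field marks a Key-Value packet.
Another embodiment of the present invention provides a data processing method, which is another way of expressing the foregoing methods and includes the following steps 33-36 performed by the switching device. See FIG. 3 and FIG. 4.
33. The switching device receives a Key-Value packet whose destination address is a partition number.
34. The switching device obtains the partition number from the Key-Value packet and queries the partition view to obtain the storage disk address corresponding to the partition number, where the partition view records the correspondence between the partition number and the storage disk address.
35. The switching device converts the Key-Value packet into a storage disk packet by changing the destination address of the Key-Value packet to the storage disk address.
36. The switching device sends the storage disk packet to the storage disk corresponding to the storage disk address.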
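Steps 33-36 can be summarised in a short sketch. The packet classes, the `send` callback, and the dictionary used as the partition view are hypothetical stand-ins; a real switching device would operate on frames in hardware or firmware.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class KVPacket:
    dest_partition: int   # destination address is a partition number
    src: str
    payload: bytes


@dataclass(frozen=True)
class DiskPacket:
    dest_address: str     # destination address is a storage disk address
    src: str
    payload: bytes


def handle_kv_packet(packet: KVPacket,
                     partition_view: Dict[int, str],
                     send: Callable[[DiskPacket], None]) -> DiskPacket:
    # Step 33: the Key-Value packet has been received (passed in as `packet`).
    # Step 34: read the partition number and query the partition view.
    disk_address = partition_view[packet.dest_partition]
    # Step 35: change only the destination address; the payload is unchanged.
    disk_packet = DiskPacket(disk_address, packet.src, packet.payload)
    # Step 36: send the storage disk packet to the storage disk.
    send(disk_packet)
    return disk_packet


sent: List[DiskPacket] = []
partition_view = {1: "192.168.0.11", 2: "192.168.0.12"}
result = handle_kv_packet(
    KVPacket(dest_partition=2, src="192.168.0.1", payload=b"aaa.doc-metadata"),
    partition_view,
    sent.append,
)
```

The same function serves writes and reads, since the switching device treats data, metadata, and request packets identically at this stage.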
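Steps 33-36 are performed by the switching device and do not distinguish between the data writing method and the data reading method.
When steps 33-36 belong to the data writing method, see FIG. 3. Suppose the Key-Value packet is a data Key-Value packet; then before the step in which the switching device receives the Key-Value packet, the method further includes step 30.
30. The server computes the data Key corresponding to the data Value according to the data Key algorithm, obtains the data partition number of the data Value according to the partition number algorithm, generates the data Key-Value packet, and sends the data Key-Value packet to the switching device, where the destination address of the data Key-Value packet is the data partition number and the payload of the data Key-Value packet carries the data Key and the data Value.
In step 30, the server also computes the metadata Key corresponding to the metadata Value according to the metadata Key algorithm, obtains the metadata partition number of the metadata Value according to the partition number algorithm, generates a metadata Key-Value packet, and sends the metadata Key-Value packet to the switching device, where the destination address of the metadata Key-Value packet is the metadata partition number, the payload of the metadata Key-Value packet carries the metadata Key and the metadata Value, and the metadata Value records the index information of the data Value. Obtaining the data partition number according to the partition number algorithm includes: taking the hash value of the data Key modulo the number of partitions and using the result as the data partition number.
Based on step 30, the switching device receives the metadata Key-Value packet, obtains the metadata storage disk address corresponding to the partition number of the metadata Value by querying the partition view, converts the metadata Key-Value packet into a metadata storage disk packet by changing the destination address of the metadata Key-Value packet to the metadata storage disk address, and sends the metadata storage disk packet to the storage disk corresponding to the metadata storage disk address.
Before step 30, the following steps 301 and 302 may also be included.
301. The server receives data to be written.
302. The server splits the data to be written into data Values.
When steps 33-36 belong to the data reading method, see FIG. 4. The Key-Value packet is a metadata request Key-Value packet, and before the switching device receives the Key-Value packet, the method further includes steps 31 and 32.
31. The server receives a read data request that asks for data to be read.
32. The server obtains the metadata Key of the data to be read according to the metadata Key algorithm, uses the metadata Key to obtain the metadata partition number according to the partition number algorithm, generates the metadata request Key-Value packet, and sends the metadata request Key-Value packet to the switching device, where the destination address of the metadata request Key-Value packet is the metadata partition number and the metadata request Key-Value packet carries the metadata Key.
After the switching device sends the storage disk packet to the storage disk corresponding to the storage disk address, steps 37-41 may also be included.
37. The switching device receives a metadata response packet and forwards it to the server. The payload of the metadata response packet carries the metadata Value, and the metadata response packet is the response to the storage disk packet.
38. The server decapsulates the metadata response packet, obtains the data Key from the metadata Value, uses the data Key to compute the data partition number according to the partition number algorithm, and generates a data request Key-Value packet whose destination address is the data partition number and which carries the data Key. The server sends the data request Key-Value packet to the switching device.
39. The switching device receives the data request Key-Value packet, obtains the storage disk address corresponding to the data partition number by querying the partition view, converts the data request Key-Value packet into a data request storage disk packet with that storage disk address as the destination address, and sends the data request storage disk packet to the storage disk corresponding to that address.
40. The switching device receives a data response storage disk packet and forwards it to the server. The data response packet is the response to the data request storage disk packet and carries the data Value.
41. The server decapsulates the data response packet to obtain the data Value, and combines the data Values of the multiple data Keys into the data to be read.
In the foregoing steps, the Key-Value packet uses the frame type field to mark the packet type.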
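The method embodiments of the present invention include steps 11-28, steps 301-36, and steps 31-41. As described above, some of the operations are performed by the server and others by the switching device. As shown in FIG. 5, the switching device 2 in the embodiments of the present invention may consist of a processor 51, a computer-readable medium 52, and an interface 53. The processor 51 is connected to the computer-readable medium 52 and to the interface 53 through a bus. The interface 53 provides external connections, for example to the server 1. The computer-readable medium 52 stores computer program code, and by running the program code in the computer-readable medium 52, the processor 51 performs the operations of the switching device in steps 11-28, in steps 301-36, or in steps 31-41.
The server provided by the embodiments of the present invention may consist of a processor, a storage medium, and an interface, with the processor connected to the storage medium and to the interface. The interface provides external connections, the storage medium stores computer program code, and by running the program code in the storage medium, the processor performs the operations of the server in steps 11-28, in steps 301-36, or in steps 31-41.
Referring to FIG. 6, the switching device 2 provided by the embodiments of the present invention includes a receiving module 61, a query module 62, a packet conversion module 63, and a sending module 64. The switching device 2 can perform the methods described above, for example the operations of the switching device in steps 11-28, in steps 301-36, or in steps 31-41.
The receiving module 61 is configured to receive a Key-Value packet whose destination address is a partition number. The query module 62 is configured to obtain the partition number from the Key-Value packet and query the partition view to obtain the storage disk address corresponding to the partition number, where the partition view records the correspondence between the partition number and the storage disk address. The packet conversion module 63 is configured to convert the Key-Value packet into a storage disk packet by changing the destination address of the Key-Value packet to the storage disk address. The sending module 64 is configured to send the storage disk packet to the storage disk corresponding to the storage disk address.
The Key-Value packet received by the receiving module 61 may be a metadata request Key-Value packet. In that case the receiving module 61 is further configured to receive a metadata response packet and forward it to the server 1; the payload of the metadata response packet carries the metadata Value, and the metadata response packet is the response to the storage disk packet. The receiving module 61 is further configured to determine the packet type of the Key-Value packet by reading the frame type field. Because Key-Value packets and IP packets are similar, the frame type field can be used to tell them apart.
The Key-Value packet received by the receiving module 61 may be a data Key-Value packet. The switching device 2 and the server 1 together form the data writing system of the embodiments of the present invention. The server 1 is configured to: compute the data Key corresponding to the data Value according to the data Key algorithm, obtain the data partition number of the data Value according to the partition number algorithm, generate the data Key-Value packet, and send the data Key-Value packet to the switching device 2, where the destination address of the data Key-Value packet is the data partition number and the payload of the data Key-Value packet carries the data Key and the data Value.
The server 1 is further configured to: compute the metadata Key corresponding to the metadata Value according to the metadata Key algorithm, obtain the metadata partition number of the metadata Value according to the partition number algorithm, generate a metadata Key-Value packet, and send the metadata Key-Value packet to the switching device 2, where the destination address of the metadata Key-Value packet is the metadata partition number, the payload of the metadata Key-Value packet carries the metadata Key and the metadata Value, and the metadata Value records the index information of the data Value. The receiving module 61 is further configured to receive the metadata Key-Value packet. The query module 62 is further configured to obtain the metadata storage disk address corresponding to the partition number of the metadata Value by querying the partition view. The packet conversion module 63 is further configured to convert the metadata Key-Value packet into a metadata storage disk packet by changing the destination address of the metadata Key-Value packet to the metadata storage disk address. The sending module 64 is further configured to send the metadata storage disk packet to the storage disk corresponding to the metadata storage disk address.
Optionally, the server is further configured to receive the data to be written and split it into data Values before processing the data Values. Optionally, obtaining the partition number of a data Value according to the partition number algorithm specifically includes: the server 1 takes the hash value of the data Key modulo the number of partitions and uses the result as the data partition number.
Similarly, the server 1 is configured to perform the server-side parts of the foregoing methods, for example the operations of the server in steps 11-28, in steps 301-36, or in steps 31-41.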
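The data writing system of the embodiments of the present invention is described below from another angle. The data writing system consists of the server 1 and the switching device 2.
The server 1 is configured to compute the data Key corresponding to the data Value according to the data Key algorithm, obtain the data partition number of the data Value according to the partition number algorithm, generate the data Key-Value packet, and send the data Key-Value packet to the switching device, where the destination address of the data Key-Value packet is the data partition number and the payload of the data Key-Value packet carries the data Key and the data Value.
The switching device 2 is connected to the server 1 and is configured to receive the Key-Value packet sent by the server 1. The switching device 2 is further configured to obtain the data partition number from the data Key-Value packet and query the partition view to obtain the data storage disk address corresponding to the partition number, where the partition view records the correspondence between the data partition number and the data storage disk address.
The switching device 2 is further configured to convert the data Key-Value packet into a data storage disk packet by changing the destination address of the data Key-Value packet to the data storage disk address, and to send the data storage disk packet to the data storage disk 3 corresponding to the data storage disk address.
Optionally, the server 1 is further configured to compute the metadata Key corresponding to the metadata Value according to the metadata Key algorithm, obtain the metadata partition number of the metadata Value according to the partition number algorithm, generate a metadata Key-Value packet, and send the metadata Key-Value packet to the switching device 2, where the destination address of the metadata Key-Value packet is the metadata partition number, the payload of the metadata Key-Value packet carries the metadata Key and the metadata Value, and the metadata Value records the index information of the data Value. Optionally, the switching device 2 is further configured to receive the metadata Key-Value packet, obtain the metadata storage disk address corresponding to the partition number of the metadata Value by querying the partition view, convert the metadata Key-Value packet into a metadata storage disk packet by changing the destination address of the metadata Key-Value packet to the metadata storage disk address, and send the metadata storage disk packet to the storage disk 3 corresponding to the metadata storage disk address.
The storage system is described below from yet another angle. The storage system includes the switching device 2 and the server 1. The server 1 is configured to compute the data Key corresponding to the data Value according to the data Key algorithm, obtain the data partition number of the data Value according to the partition number algorithm, generate the data Key-Value packet, and send the data Key-Value packet to the switching device, where the destination address of the data Key-Value packet is the data partition number and the payload of the data Key-Value packet carries the data Key and the data Value.
The switching device 2 is connected to the server 1 and is configured to: receive the data Key-Value packet; obtain the data partition number from the data Key-Value packet and query the partition view to obtain the data storage disk address corresponding to the data partition number, where the partition view records the correspondence between the partition number and the data storage disk address; convert the data Key-Value packet into a data storage disk packet by changing the destination address of the data Key-Value packet to the data storage disk address; and send the data storage disk packet to the data storage disk 3 corresponding to the data storage disk address.
A person of ordinary skill in the art will understand that each aspect of the present invention, or a possible implementation of each aspect, may be embodied as a system, a method, or a computer program product. Accordingly, each aspect of the present invention, or a possible implementation of each aspect, may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, and so on), or an embodiment combining software and hardware aspects, all of which are referred to here collectively as a "circuit", "module", or "system". In addition, each aspect of the present invention, or a possible implementation of each aspect, may take the form of a computer program product, meaning computer-readable program code stored in a computer-readable medium.
The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium includes, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, device, or apparatus, or any suitable combination of the foregoing, such as random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, or portable compact disc read-only memory (CD-ROM).
A processor in a computer reads the computer-readable program code stored in the computer-readable medium, so that the processor can perform the functional actions specified in each step of the flowcharts or in a combination of steps, and so that an apparatus is produced that implements the functional actions specified in each block of the block diagrams or in a combination of blocks.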

Claims (27)

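  1. A data processing method, comprising:
    receiving, by a switching device, a Key-Value packet sent by a server, wherein the destination address of the Key-Value packet is a partition number;
    obtaining, by the switching device, the partition number from the Key-Value packet, and querying a partition view to obtain a storage disk address corresponding to the partition number, wherein the partition view records the correspondence between the partition number and the storage disk address;
    converting the Key-Value packet into a storage disk packet by changing the destination address of the Key-Value packet to the storage disk address; and
    sending, by the switching device, the storage disk packet to the storage disk corresponding to the storage disk address.
  2. The method according to claim 1, wherein the Key-Value packet is a data Key-Value packet, and before the switching device receives the Key-Value packet, the method further comprises:
    computing, by the server according to a data Key algorithm, a data Key corresponding to a data Value, obtaining a data partition number of the data Value according to a partition number algorithm, generating the data Key-Value packet, and sending the data Key-Value packet to the switching device, wherein the destination address of the data Key-Value packet is the data partition number, and the payload of the data Key-Value packet carries the data Key and the data Value.
  3. The method according to claim 2, further comprising:
    computing, by the server according to a metadata Key algorithm, a metadata Key corresponding to a metadata Value, obtaining a metadata partition number of the metadata Value according to the partition number algorithm, generating a metadata Key-Value packet, and sending the metadata Key-Value packet to the switching device, wherein the destination address of the metadata Key-Value packet is the metadata partition number, the payload of the metadata Key-Value packet carries the metadata Key and the metadata Value, and the metadata Value records index information of the data Value; and
    receiving, by the switching device, the metadata Key-Value packet, obtaining, by querying the partition view, a metadata storage disk address corresponding to the partition number of the metadata Value, converting the metadata Key-Value packet into a metadata storage disk packet by changing the destination address of the metadata Key-Value packet to the metadata storage disk address, and sending the metadata storage disk packet to the storage disk corresponding to the metadata storage disk address.
  4. The method according to claim 2 or 3, wherein before the server obtains the data Key of the data Value according to the data Key algorithm, the method further comprises:
    receiving, by the server, data to be written; and
    splitting, by the server, the data to be written into data Values.
  5. The method according to claim 2, wherein obtaining the partition number of the data Value according to the partition number algorithm specifically comprises:
    taking the hash value of the data Key modulo the number of partitions, and using the result as the data partition number.
  6. The method according to claim 1, wherein the Key-Value packet is a metadata request Key-Value packet, and before the switching device receives the Key-Value packet, the method further comprises:
    receiving, by the server, a read data request that asks for data to be read; and
    obtaining, by the server according to the metadata Key algorithm, a metadata Key of the data to be read, obtaining the metadata partition number from the metadata Key according to the partition number algorithm, generating the metadata request Key-Value packet, and sending the metadata request Key-Value packet to the switching device, wherein the destination address of the metadata request Key-Value packet is the metadata partition number, and the metadata request Key-Value packet carries the metadata Key.
  7. The method according to claim 6, wherein after the switching device sends the storage disk packet to the storage disk corresponding to the storage disk address, the method further comprises:
    receiving, by the switching device, a metadata response packet and forwarding it to the server, wherein the payload of the metadata response packet carries a metadata Value, and the metadata response packet is the response to the storage disk packet;
    decapsulating, by the server, the metadata response packet, obtaining a data Key from the metadata Value, computing a data partition number from the data Key according to the partition number algorithm, generating a data request Key-Value packet, wherein the destination address of the data request Key-Value packet is the data partition number and the data request Key-Value packet carries the data Key, and sending the data request Key-Value packet to the switching device;
    receiving, by the switching device, the data request Key-Value packet, obtaining, by querying the partition view, the storage disk address corresponding to the data partition number, converting the data request Key-Value packet into a data request storage disk packet with the storage disk address as the destination address, and sending the data request storage disk packet to the storage disk corresponding to the storage disk address of the data Value;
    receiving, by the switching device, a data response storage disk packet and forwarding it to the server, wherein the data response packet is the response to the data request storage disk packet and carries a data Value; and
    decapsulating, by the server, the data response packet to obtain the data Value, and combining the data Values of multiple data Keys into the data to be read.
  8. The method according to claim 6, wherein:
    the Key-Value packet uses a frame type field to mark the packet type.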
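  9. A data switching apparatus, comprising:
    a receiving module, configured to receive a Key-Value packet sent by a server, wherein the destination address of the Key-Value packet is a partition number;
    a query module, configured to obtain the partition number from the Key-Value packet and query a partition view to obtain a storage disk address corresponding to the partition number, wherein the partition view records the correspondence between the partition number and the storage disk address;
    a packet conversion module, configured to convert the Key-Value packet into a storage disk packet by changing the destination address of the Key-Value packet to the storage disk address; and
    a sending module, configured to send the storage disk packet to the storage disk corresponding to the storage disk address.
  10. The apparatus according to claim 9, wherein the Key-Value packet is a metadata request Key-Value packet, and the receiving module is further configured to:
    receive a metadata response packet and forward it to the server, wherein the payload of the metadata response packet carries a metadata Value, and the metadata response packet is the response to the storage disk packet.
  11. The apparatus according to claim 9, wherein:
    the receiving module is further configured to determine the packet type of the Key-Value packet by reading a frame type field.
  12. A data writing system, comprising a server and the switching device according to claim 9 or 11, wherein the Key-Value packet is a data Key-Value packet:
    the server is configured to compute, according to a data Key algorithm, a data Key corresponding to a data Value, obtain a data partition number of the data Value according to a partition number algorithm, generate the data Key-Value packet, and send the data Key-Value packet to the switching device, wherein the destination address of the data Key-Value packet is the data partition number, and the payload of the data Key-Value packet carries the data Key and the data Value.
  13. The system according to claim 12, wherein the server is further configured to:
    compute, according to a metadata Key algorithm, a metadata Key corresponding to a metadata Value, obtain a metadata partition number of the metadata Value according to the partition number algorithm, generate a metadata Key-Value packet, and send the metadata Key-Value packet to the switching device, wherein the destination address of the metadata Key-Value packet is the metadata partition number, the payload of the metadata Key-Value packet carries the metadata Key and the metadata Value, and the metadata Value records index information of the data Value;
    the receiving module is further configured to receive the metadata Key-Value packet;
    the query module is further configured to obtain, by querying the partition view, a metadata storage disk address corresponding to the partition number of the metadata Value;
    the packet conversion module is further configured to convert the metadata Key-Value packet into a metadata storage disk packet by changing the destination address of the metadata Key-Value packet to the metadata storage disk address; and
    the sending module is further configured to send the metadata storage disk packet to the storage disk corresponding to the metadata storage disk address.
  14. The system according to claim 12 or 13, wherein the server is further configured to:
    receive data to be written; and
    split the data to be written into data Values.
  15. The system according to claim 12 or 13, wherein obtaining the partition number of the data Value according to the partition number algorithm specifically comprises:
    the server takes the hash value of the data Key modulo the number of partitions and uses the result as the data partition number.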
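  16. A data writing system, comprising a server and a switching device, wherein:
    the server is configured to compute, according to a data Key algorithm, a data Key corresponding to a data Value, obtain a data partition number of the data Value according to a partition number algorithm, generate a data Key-Value packet, and send the data Key-Value packet to the switching device, wherein the destination address of the data Key-Value packet is the data partition number, and the payload of the data Key-Value packet carries the data Key and the data Value;
    the switching device is connected to the server and is configured to receive the Key-Value packet sent by the server;
    the switching device is further configured to obtain the data partition number from the data Key-Value packet and query a partition view to obtain a data storage disk address corresponding to the partition number, wherein the partition view records the correspondence between the data partition number and the data storage disk address;
    the switching device is further configured to convert the data Key-Value packet into a data storage disk packet by changing the destination address of the data Key-Value packet to the data storage disk address; and
    the switching device is further configured to send the data storage disk packet to the data storage disk corresponding to the data storage disk address.
  17. The system according to claim 16, wherein:
    the server is further configured to compute, according to a metadata Key algorithm, a metadata Key corresponding to a metadata Value, obtain a metadata partition number of the metadata Value according to the partition number algorithm, generate a metadata Key-Value packet, and send the metadata Key-Value packet to the switching device, wherein the destination address of the metadata Key-Value packet is the metadata partition number, the payload of the metadata Key-Value packet carries the metadata Key and the metadata Value, and the metadata Value records index information of the data Value; and
    the switching device is further configured to receive the metadata Key-Value packet, obtain, by querying the partition view, a metadata storage disk address corresponding to the partition number of the metadata Value, convert the metadata Key-Value packet into a metadata storage disk packet by changing the destination address of the metadata Key-Value packet to the metadata storage disk address, and send the metadata storage disk packet to the storage disk corresponding to the metadata storage disk address.
  18. The system according to claim 16 or 17, wherein before obtaining the data Key of the data Value according to the data Key algorithm, the server is further configured to:
    receive data to be written; and
    split the data to be written into data Values.
  19. The system according to claim 16, wherein obtaining the partition number of the data Value according to the partition number algorithm specifically comprises:
    the server takes the hash value of the data Key modulo the number of partitions and uses the result as the data partition number.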
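  20. A switching device, comprising:
    an interface, configured to provide external connections;
    a computer-readable medium, configured to store a computer program; and
    a processor, connected to the interface and to the computer-readable medium, and configured to perform the following steps by running the program:
    receiving a Key-Value packet sent by a server, wherein the destination address of the Key-Value packet is a partition number;
    obtaining the partition number from the Key-Value packet, and querying a partition view to obtain a storage disk address corresponding to the partition number, wherein the partition view records the correspondence between the partition number and the storage disk address;
    converting the Key-Value packet into a storage disk packet by changing the destination address of the Key-Value packet to the storage disk address; and
    sending the storage disk packet to the storage disk corresponding to the storage disk address.
  21. The switching device according to claim 20, wherein the Key-Value packet is a data Key-Value packet, and the processor is further configured to perform:
    receiving the metadata Key-Value packet, obtaining, by querying the partition view, a metadata storage disk address corresponding to the partition number of the metadata Value, converting the metadata Key-Value packet into a metadata storage disk packet by changing the destination address of the metadata Key-Value packet to the metadata storage disk address, and sending the metadata storage disk packet to the storage disk corresponding to the metadata storage disk address.
  22. The switching device according to claim 20, wherein the Key-Value packet is a metadata request Key-Value packet, and the processor is further configured to perform:
    receiving a metadata response packet and forwarding it to the server, wherein the payload of the metadata response packet carries a metadata Value, and the metadata response packet is the response to the storage disk packet.
  23. The switching device according to claim 20, wherein the processor is further configured to perform:
    determining the packet type of the Key-Value packet by reading a frame type field.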
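  24. A data writing system, comprising a switching device and a server, wherein:
    the server is configured to compute, according to a data Key algorithm, a data Key corresponding to a data Value, obtain a data partition number of the data Value according to a partition number algorithm, generate a data Key-Value packet, and send the data Key-Value packet to the switching device, wherein the destination address of the data Key-Value packet is the data partition number, and the payload of the data Key-Value packet carries the data Key and the data Value; and
    the switching device is connected to the server and is configured to: receive the data Key-Value packet sent by the server; obtain the data partition number from the data Key-Value packet and query a partition view to obtain a data storage disk address corresponding to the data partition number, wherein the partition view records the correspondence between the partition number and the data storage disk address; convert the data Key-Value packet into a data storage disk packet by changing the destination address of the data Key-Value packet to the data storage disk address; and send the data storage disk packet to the data storage disk corresponding to the data storage disk address.
  25. The system according to claim 24, wherein:
    the server is further configured to compute, according to a metadata Key algorithm, a metadata Key corresponding to a metadata Value, obtain a metadata partition number of the metadata Value according to the partition number algorithm, generate a metadata Key-Value packet, and send the metadata Key-Value packet to the switching device, wherein the destination address of the metadata Key-Value packet is the metadata partition number, the payload of the metadata Key-Value packet carries the metadata Key and the metadata Value, and the metadata Value records index information of the data Value; and
    the switching device is further configured to receive the metadata Key-Value packet, obtain, by querying the partition view, a metadata storage disk address corresponding to the partition number of the metadata Value, convert the metadata Key-Value packet into a metadata storage disk packet by changing the destination address of the metadata Key-Value packet to the metadata storage disk address, and send the metadata storage disk packet to the storage disk corresponding to the metadata storage disk address.
  26. The system according to claim 24, wherein the server is further configured to:
    receive data to be written; and
    split the data to be written into data Values.
  27. The system according to claim 24, wherein obtaining the partition number of the data Value according to the partition number algorithm specifically comprises:
    the server takes the hash value of the data Key modulo the number of partitions and uses the result as the data partition number.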
PCT/CN2014/088351 2014-10-11 2014-10-11 Data processing method and apparatus WO2016054818A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
PCT/CN2014/088351 WO2016054818A1 (zh) 2014-10-11 2014-10-11 Data processing method and apparatus
EP14903664.2A EP3196776B1 (en) 2014-10-11 2014-10-11 Method and device for data processing
CN201480075499.0A CN106164898B (zh) 2014-10-11 2014-10-11 Data processing method and apparatus
US15/484,152 US11003719B2 (en) 2014-10-11 2017-04-11 Method and apparatus for accessing a storage disk

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2014/088351 WO2016054818A1 (zh) 2014-10-11 2014-10-11 Data processing method and apparatus

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/484,152 Continuation US11003719B2 (en) 2014-10-11 2017-04-11 Method and apparatus for accessing a storage disk

Publications (1)

Publication Number Publication Date
WO2016054818A1 true WO2016054818A1 (zh) 2016-04-14

Family

ID=55652506

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/088351 WO2016054818A1 (zh) 2014-10-11 2014-10-11 Data processing method and apparatus

Country Status (4)

Country Link
US (1) US11003719B2 (zh)
EP (1) EP3196776B1 (zh)
CN (1) CN106164898B (zh)
WO (1) WO2016054818A1 (zh)


Also Published As

Publication number Publication date
CN106164898B (zh) 2018-06-26
US11003719B2 (en) 2021-05-11
CN106164898A (zh) 2016-11-23
EP3196776A4 (en) 2017-10-18
EP3196776A1 (en) 2017-07-26
US20170220699A1 (en) 2017-08-03
EP3196776B1 (en) 2020-01-15


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14903664

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

REEP Request for entry into the european phase

Ref document number: 2014903664

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2014903664

Country of ref document: EP