WO2019137321A1 - Data processing method, apparatus, and computing device - Google Patents

Data processing method, apparatus, and computing device

Info

Publication number
WO2019137321A1
WO2019137321A1 (PCT/CN2019/070580)
Authority
WO
WIPO (PCT)
Prior art keywords
data
metadata
stored
target
storage
Prior art date
Application number
PCT/CN2019/070580
Other languages
English (en)
French (fr)
Inventor
刘金鑫
董乘宇
刘善阳
Original Assignee
Alibaba Group Holding Limited (阿里巴巴集团控股有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Limited
Priority to JP2020537757A (patent JP7378403B2)
Priority to EP19738475.3A (patent EP3739450A4)
Publication of WO2019137321A1
Priority to US16/924,028 (patent US11354050B2)

Classifications

    • G - PHYSICS
      • G06 - COMPUTING; CALCULATING OR COUNTING
        • G06F - ELECTRIC DIGITAL DATA PROCESSING
          • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
            • G06F 3/06 - Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
              • G06F 3/0601 - Interfaces specially adapted for storage systems
                • G06F 3/0602 - Interfaces specifically adapted to achieve a particular effect
                  • G06F 3/0604 - Improving or facilitating administration, e.g. storage management
                • G06F 3/0628 - Interfaces making use of a particular technique
                  • G06F 3/0629 - Configuration or reconfiguration of storage systems
                    • G06F 3/0631 - Configuration or reconfiguration by allocating resources to storage systems
                  • G06F 3/0638 - Organizing or formatting or addressing of data
                    • G06F 3/064 - Management of blocks
                  • G06F 3/0653 - Monitoring storage devices or systems
                  • G06F 3/0655 - Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
                    • G06F 3/0656 - Data buffering arrangements
                    • G06F 3/0659 - Command handling arrangements, e.g. command buffers, queues, command scheduling
                • G06F 3/0668 - Interfaces adopting a particular infrastructure
                  • G06F 3/0671 - In-line storage system
                    • G06F 3/0673 - Single storage device
                    • G06F 3/0683 - Plurality of storage devices
          • G06F 9/00 - Arrangements for program control, e.g. control units
            • G06F 9/06 - Arrangements using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
              • G06F 9/46 - Multiprogramming arrangements
                • G06F 9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
                  • G06F 9/5005 - Allocation of resources to service a request
                    • G06F 9/5011 - Allocation of resources, the resources being hardware resources other than CPUs, Servers and Terminals
                      • G06F 9/5016 - Allocation of resources, the resource being the memory
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
      • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
        • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
          • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the embodiments of the present application relate to the field of data processing technologies, and in particular, to a data processing method, a data processing device, and a computing device.
  • When data is stored in a storage system, in order to ensure data security, metadata describing attributes of the data to be stored, also called data metadata, is usually appended to the end of the data to be stored, so that the data to be stored and the data metadata can be written to the storage device in a single operation.
  • When receiving the data to be stored sent by the requesting end, the storage end passively allocates a piece of memory whose size matches the data size of the data to be stored, to cache that data. To ensure that the data to be stored and the data metadata can be written to the storage device in one operation, one prior-art implementation is to apply for an additional piece of write-disk memory, whose size is the data size of the data to be stored plus the data size of the data metadata; the data to be stored is then copied into the write-disk memory and the data metadata is spliced after it, so that the two are organized together. A single write operation can then write the data to be stored and the data metadata in the write-disk memory to the storage device as a whole.
  • the embodiment of the present application provides a data processing method, device, and computing device, which are used to solve the technical problem of low data storage efficiency in the prior art.
  • In a first aspect, an embodiment of the present application provides a data processing method, including: adding a reserved field to data to be stored, to obtain target data; and sending the target data to a storage end;
  • where the memory location corresponding to the reserved field at the storage end is used to write data metadata of the data to be stored, so that the data to be stored and the data metadata are written to the storage device as a whole.
  • In a second aspect, an embodiment of the present application provides a data processing method, including: acquiring target data, where the target data is formed by a requesting end adding a reserved field to data to be stored; allocating a second memory to cache the target data; generating data metadata of the data to be stored in the target data; and writing the data metadata into the memory location corresponding to the reserved field in the second memory.
  • In a third aspect, an embodiment of the present application provides a data processing apparatus, including:
  • a data construction module, configured to add a reserved field to data to be stored, to obtain target data;
  • a data sending module, configured to send the target data to a storage end;
  • where the memory location corresponding to the reserved field at the storage end is used to write data metadata of the data to be stored, so that the data to be stored and the data metadata are written to the storage device as a whole.
  • In a fourth aspect, an embodiment of the present application provides a data processing apparatus, including:
  • a data acquisition module, configured to acquire target data, where the target data is formed by a requesting end adding a reserved field to data to be stored;
  • a memory allocation module, configured to allocate a second memory to cache the target data;
  • a data generation module, configured to generate data metadata of the data to be stored in the target data;
  • a data writing module, configured to write the data metadata into the memory location corresponding to the reserved field in the second memory.
  • In a fifth aspect, an embodiment of the present application provides a computing device, including a storage component and a processing component;
  • the storage component is configured to store one or more computer instructions, where the one or more computer instructions are invoked and executed by the processing component;
  • the processing component is configured to: add a reserved field to data to be stored, to obtain target data; and send the target data to a storage end;
  • where the memory location corresponding to the reserved field at the storage end is used to write data metadata of the data to be stored, so that the data to be stored and the data metadata are written to the storage device as a whole.
  • In a sixth aspect, an embodiment of the present application provides a computing device, including a storage component and a processing component;
  • the storage component is configured to store one or more computer instructions, where the one or more computer instructions are invoked and executed by the processing component;
  • the processing component is configured to: acquire target data, where the target data is formed by a requesting end adding a reserved field to data to be stored; allocate a second memory in the storage component to cache the target data; generate data metadata of the data to be stored in the target data; and write the data metadata into the memory location corresponding to the reserved field in the second memory.
  • In the embodiments of the present application, the requesting end adds a reserved field to the data to be stored, constructing the data to be stored into target data; the data size of the target data is the sum of the data size of the data to be stored and the data size occupied by the reserved field.
  • The requesting end sends the target data to the storage end, and the storage end allocates memory to cache the target data. Because the reserved field is set aside in the target data, the memory space corresponding to the reserved field is sufficient to write the data metadata, so the storage end does not need to allocate additional new memory; this causes no memory waste and avoids data copying, thereby improving data storage efficiency and system performance.
  • FIG. 1 is a flow chart showing an embodiment of a data processing method provided by the present application.
  • FIG. 2 is a flow chart showing still another embodiment of a data processing method provided by the present application.
  • FIG. 3 is a flow chart showing still another embodiment of a data processing method provided by the present application.
  • FIG. 4a is a schematic diagram of a data structure in the embodiment of the present application.
  • FIG. 4b is a schematic diagram showing another data structure in the embodiment of the present application.
  • FIG. 5 is a flow chart showing still another embodiment of a data processing method provided by the present application.
  • FIG. 6 is a schematic diagram showing interaction of data processing in an actual application according to an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of an embodiment of a data processing apparatus provided by the present application.
  • FIG. 8 is a schematic structural diagram of an embodiment of a computing device provided by the present application.
  • FIG. 9 is a schematic structural diagram of still another embodiment of a data processing apparatus provided by the present application.
  • FIG. 10 is a schematic structural diagram of still another embodiment of a computing device provided by the present application.
  • the technical solution of the embodiment of the present application is mainly applied to a storage system, which may be a traditional storage system or a distributed storage system.
  • Storage end: responsible for data access operations. A traditional storage system uses a centralized storage server to store data, and the storage end may refer to that storage server; in a distributed storage system, data is stored across multiple data storage nodes, and the storage end may refer to the one or more data storage nodes.
  • Data storage node: a node responsible for data storage in a distributed storage system, usually a physical server.
  • Requesting end: responsible for sending read/write requests; upper-layer business systems access or update data on the storage end through the requesting end.
  • Request metadata: data guidance information for the data to be stored, sent by the requesting end to the storage end. It may include the data length and/or data position of the data to be stored, as well as save indication information, which specifies into which storage device the data to be stored is written. The request metadata is not written into the stored data.
  • Data metadata (DataMeta): metadata describing attributes of the data to be stored, which may include data length, data checksum, storage location, the name of the file the data belongs to, and so on. The data metadata is written into the storage device together with the data to be stored.
  • Multi-copy technology: a data redundancy technique in a distributed storage system. To prevent data loss caused by the failure of a data storage node, several copies of the original data are usually made, and each copy is stored on a different data storage node; each copy of the original data is called replica data.
  • Storage device: a hardware device used to store data in a storage system; data ultimately needs to be written into the storage device, which may be a storage medium such as a magnetic disk.
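  • To make the two kinds of metadata above concrete, the sketch below models them as C structs. This is a minimal illustration only: the field names, types, and sizes are assumptions made for this write-up, not a layout defined by the application.

    #include <stdint.h>

    /* Request metadata: guidance info sent alongside a write request.
     * It is not persisted to the storage device. (Illustrative layout.) */
    typedef struct {
        uint64_t data_len;   /* data length of the data to be stored */
        uint64_t data_pos;   /* data position within the sent buffer */
        uint32_t device_id;  /* save indication: which storage device to use */
    } request_meta_t;

    /* Data metadata (DataMeta): describes the stored data and is written
     * to the storage device together with it. (Illustrative layout.) */
    typedef struct {
        uint64_t data_len;      /* data length */
        uint32_t checksum;      /* data checksum, e.g. CRC */
        uint64_t location;      /* storage location */
        char     filename[40];  /* name of the file the data belongs to */
    } data_meta_t;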
  • In the prior art, in order to write the data to be stored and the data metadata into the storage device together, the storage end needs to allocate an additional piece of write-disk memory; yet when the storage end receives the data to be stored, it passively allocates a piece of memory to cache that data, so the data to be stored must be copied again into the write-disk memory.
  • The data to be stored therefore undergoes one extra copy, and this data copy affects data storage efficiency and system performance.
  • To improve system performance and ensure data storage efficiency, in the embodiments of the present application the requesting end adds a reserved field to the data to be stored, constructing the data to be stored into target data; the data size of the target data is the sum of the data size of the data to be stored and the data size corresponding to the reserved field.
  • The requesting end sends the target data to the storage end, and the storage end allocates a piece of memory to cache the target data. Because the target data includes the reserved field, the memory space corresponding to the reserved field is sufficient to write the data metadata, so the storage end does not need to allocate additional new memory; this causes no memory waste and avoids data copying, thereby improving data storage efficiency and system performance.
  • FIG. 1 is a flowchart of an embodiment of a data processing method according to an embodiment of the present application. The technical solution of this embodiment is applied to a requesting end, which is responsible for performing the method.
  • the method can include the following steps:
  • When the requesting end constructs a write request, it adds a reserved field to the data to be stored. The reserved field occupies a certain data length; the data to be stored plus the reserved field forms a new data format, which constitutes the target data. The data size of the target data is thus equal to the sum of the data size of the data to be stored and the data size occupied by the reserved field.
  • The reserved field may be located at the end of the data to be stored. The data size it occupies is preset and may be set in combination with the data size of the data metadata; the data size occupied by the reserved field needs to be greater than or equal to the data size of the data metadata.
  • Optionally, the data size occupied by the reserved field may be equal to the data size of the data metadata.
  • Here, the data metadata refers to the data metadata of the data to be stored, generated based on the data to be stored. Since the information contained in data metadata is usually fixed, the data metadata of different data is usually the same size, and the data size occupied by the reserved field may be set accordingly.
  • It should be noted that the data sizes involved in the embodiments of the present application are expressed in units of bytes (Byte, abbreviated as B) or kilobytes (Kbytes, abbreviated as K); data size may also be referred to as data length.
  • As an optional manner, a predetermined character string of a preset size may be added at the end of the data to be stored as the reserved field, to obtain the target data.
  • The predetermined string may be an empty string, an agreed string, or the like.
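  • As a sketch of this construction: the snippet below appends a zero-filled reserved field of a preset size to the data to be stored. RESERVED_SIZE is an assumed constant; in practice it would be set to at least the (fixed) data size of the data metadata, as described above.

    #include <stdlib.h>
    #include <string.h>

    #define RESERVED_SIZE 64  /* preset; assumed >= data size of the data metadata */

    /* Build the target data: [ data to be stored | reserved field ].
     * The reserved field is an "empty string" (zero bytes) that the storage
     * end will later overwrite in place with the data metadata. */
    unsigned char *build_target_data(const unsigned char *data, size_t data_len,
                                     size_t *target_len)
    {
        unsigned char *target = malloc(data_len + RESERVED_SIZE);
        if (target == NULL)
            return NULL;
        memcpy(target, data, data_len);              /* data to be stored */
        memset(target + data_len, 0, RESERVED_SIZE); /* reserved field */
        *target_len = data_len + RESERVED_SIZE;
        return target;
    }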
  • The requesting end may send a write request to the storage end carrying the target data; the target data is sent to the storage end as the data to be written.
  • After obtaining the target data, the storage end passively allocates a piece of memory to cache it.
  • For ease of distinction in the description, the memory passively allocated by the storage end for caching the target data is named the "second memory". The storage end can read the data to be stored from the target data and generate the data metadata accordingly. Since the reserved field is set aside in the target data and is sufficient to hold the data metadata, the storage end can write the data metadata to the memory location corresponding to the reserved field in the second memory.
  • That is, the memory location corresponding to the reserved field at the storage end is used to write the data metadata of the data to be stored.
  • In this embodiment, the requesting end adds a reserved field to the data to be stored and reconstructs it into target data. Because the data size of the target data equals the sum of the data size of the data to be stored and the data size occupied by the reserved field, the second memory passively allocated by the storage end is sufficient to write the data metadata. There is therefore no need to apply for additional write-disk memory and no data copying, achieving zero copy; this improves system performance and data storage efficiency without causing memory waste.
  • Since the data to be stored needs to be temporarily placed in memory before being sent, in some embodiments the sending the target data to the storage end may include:
  • allocating a first memory consistent with the data size of the target data; filling the target data into the first memory; and sending the target data from the first memory to the storage end.
  • The first memory in the embodiments of the present application refers to the memory allocated for storing the target data to be sent to the storage end; it is named the "first memory" merely for ease of distinction in the description. Those skilled in the art will understand that "first" and "second" in "first memory" and "second memory" are merely for distinction in the description and do not indicate a relationship such as progression or inclusion.
  • The requesting end and the storage end can agree on the location of the reserved field and the location of the data to be stored in the target data; for example, the last 32 bytes of the target data are the reserved field. Following the agreed rule, the storage end can read the data to be stored from the target data and determine the reserved field.
  • the requesting end may further send the request metadata to the storage end, and the storage end may determine the data to be stored in the target data according to the data length and/or the data location of the data to be stored in the request metadata.
  • The request metadata may also indicate into which storage device the storage end should store the data to be stored, and so on.
  • Since the request metadata also needs to be temporarily placed in memory before being sent, it may be filled into the memory as well, so that the data to be stored and the request metadata can be sent together.
  • Therefore, the sending the target data to the storage end may include:
  • calculating the total data size of the target data and the request metadata; allocating a first memory consistent with the total data size; filling the target data and the request metadata into the first memory; and sending the target data from the first memory to the storage end.
  • In a practical application, the technical solution of the present application can be applied to a distributed storage system.
  • In a distributed storage system, in order to avoid data loss caused by the failure of a data storage node, multi-copy technology is usually used: several copies of the original data are made, each copy is stored on a different data storage node, and each copy of the original data is the replica data. The data to be stored may therefore refer to replica data.
  • Since a distributed storage system is a cluster system formed by several data storage nodes, there is a need to send the data to be stored to multiple data storage nodes for storage.
  • For example, when the data to be stored is replica data, the replica data needs to be sent separately to multiple data storage nodes.
  • If the data to be stored needs to be sent to multiple data storage nodes, the requesting end needs to send corresponding request metadata for each data storage node. Besides the data length and data position of the data to be stored, the request metadata sent to the multiple data storage nodes also includes differentiated information corresponding to the different data storage nodes, such as save indication information: the data to be stored may have different requirements as to which storage device it is written to on different data storage nodes.
  • In a distributed storage system, the storage end in the embodiments of the present application may include multiple data storage nodes.
  • Therefore, sending the target data to the storage end includes sending the target data and the respectively corresponding request metadata to the multiple data storage nodes.
  • At present, when sending the target data and the corresponding request metadata to multiple data storage nodes, the requesting end may first calculate the total data size of the target data and one request metadata, and then allocate a first memory consistent with that total size. The target data is filled into the first memory first; to send the data to be stored to any one data storage node, the request metadata corresponding to that node is spliced into the first memory, and the target data and that request metadata are sent to that node. To send the data to be stored to another data storage node, the request metadata corresponding to the other node is copied into the first memory, overwriting the previous request metadata, before sending. As can be seen, with this approach, sending the data to be stored to multiple data storage nodes requires copying request metadata multiple times, which is cumbersome and affects sending efficiency and thus data storage efficiency.
  • FIG. 2 is a flowchart of still another embodiment of a data processing method according to an embodiment of the present disclosure. In this embodiment, the requesting end adds a reserved field to the data to be stored to obtain target data, determines the request metadata corresponding to each of multiple data storage nodes, calculates the total data size of the target data and the multiple request metadata, and allocates a first memory consistent with the total data size.
  • That is, the memory size of the first memory can be equal to the total data size.
  • The target data can be filled into the first memory first, and the multiple request metadata are then filled into the first memory in sequence, spliced at the end of the target data.
  • The requesting end may then send a write request to each of the multiple data storage nodes, carrying the target data and the request metadata corresponding to that node.
  • That is, for any one data storage node, the target data in the first memory and the request metadata corresponding to that node are sent to that node.
  • In the first memory, the multiple request metadata may be filled in sequentially according to the sending order of the multiple data storage nodes: the request metadata of the first data storage node in the sending order is located at the end of the target data, the request metadata of the second data storage node in the sending order is located at the end of the request metadata of the first data storage node, and so on. When a write request needs to be sent to any one data storage node, since the data size of a request metadata is known, the request metadata corresponding to that node can be located according to its sending order.
  • After any data storage node obtains the data sent by the requesting end, it can read the request metadata it needs from the tail of the received data, and based on that request metadata it can determine the data to be stored, the reserved field, and so on from the target data.
  • In this embodiment, the multiple request metadata can all be written into the one allocated first memory, without specializing the request metadata for each data storage node; a single fill operation suffices to send data to different data storage nodes. The operation is simpler and the sending efficiency is improved, so data storage efficiency and system performance can be improved.
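  • A minimal sketch of this layout, assuming a fixed, known request-metadata size: the first memory is filled once as [ target data | m1 | ... | mN ], and the request metadata of the node with sending order i sits at a computable offset, so no per-node copy-and-overwrite is needed. META_SIZE and the helper names are assumptions of this sketch.

    #include <stdlib.h>
    #include <string.h>

    #define META_SIZE 64  /* fixed data size of one request metadata (assumed) */

    /* Fill the first memory once: [ target data | m1 | m2 | ... | mN ],
     * where m(i+1) is the request metadata of the node with sending order i. */
    unsigned char *fill_first_memory(const unsigned char *target, size_t target_len,
                                     const unsigned char metas[][META_SIZE], int n)
    {
        unsigned char *first_mem = malloc(target_len + (size_t)n * META_SIZE);
        if (first_mem == NULL)
            return NULL;
        memcpy(first_mem, target, target_len);
        for (int i = 0; i < n; i++)                  /* splice at the tail */
            memcpy(first_mem + target_len + (size_t)i * META_SIZE,
                   metas[i], META_SIZE);
        return first_mem;
    }

    /* Offset of the request metadata for the node with sending order i. */
    size_t meta_offset(size_t target_len, int i)
    {
        return target_len + (size_t)i * META_SIZE;
    }

  • With this layout, the write request for the node with sending order i carries the target data plus the META_SIZE bytes at meta_offset(target_len, i). Since those two pieces are not contiguous for i > 0, they would be sent as a two-part (gather) write, which is what motivates the contiguous variant of FIG. 3 below.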
  • In addition, to facilitate the operation of the requesting end and of each data storage node, as still another embodiment, as shown in FIG. 3, the data processing method may include: adding a reserved field to the data to be stored to obtain target data; determining the request metadata corresponding to each of the multiple data storage nodes; calculating the total data size of the target data and the multiple request metadata; and allocating a first memory consistent with the total data size.
  • That is, the memory size of the first memory can be equal to the total data size.
  • The target data is filled into the first memory, and the multiple request metadata may be filled in sequentially at the end of the target data according to the sending order of the multiple data storage nodes: for example, the request metadata of the first data storage node in the sending order is located at the end of the target data, the request metadata of the second data storage node in the sending order is located at the end of the request metadata of the first data storage node, and so on.
  • Therefore, as still another embodiment, the step of filling in the multiple request metadata in sequence at the end of the target data includes:
  • filling in, at the end of the target data and according to the sending order of the multiple data storage nodes, the request metadata corresponding to each of the multiple data storage nodes.
  • The requesting end may send, to any one data storage node, a write request carrying the target data, the request metadata corresponding to that node, and the request metadata located before the request metadata corresponding to that node.
  • That node is configured to read its corresponding request metadata from the tail of the received data.
  • Therefore, step 306 may include:
  • for any one data storage node, sending, from the first memory, the target data, the request metadata corresponding to that node's sending order, and the request metadata located before it to that storage node.
  • After receiving the data, the data storage node can parse out its corresponding request metadata from the tail of the received data.
  • Since the request metadata includes the data size and/or data position of the data to be stored, based on the request metadata the node can read, from the head of the received data, data of the size indicated by the request metadata, thereby obtaining the data to be stored; or it can obtain the data to be stored from the received data according to the data position.
  • FIG. 4a is a schematic diagram of the data structure of the target data. As can be seen, the target data is composed of the data to be stored 401 and a reserved field 402.
  • FIG. 4b is a schematic diagram of the data structure in the first memory, which is composed of the target data 403 and three request metadata m1, m2, and m3 filled in sequentially at the end of the target data 403.
  • The three request metadata may be filled in according to the sending order of three data storage nodes: request metadata m1 corresponds to the first data storage node, request metadata m2 corresponds to the second data storage node, and request metadata m3 corresponds to the third data storage node.
  • When a write operation is performed to the first data storage node, the target data 403 and the request metadata m1 may be sent to the first data storage node; when a write operation is performed to the second data storage node, the target data 403 and the request metadata m2 may be sent to the second data storage node; and when a write operation is performed to the third data storage node, the target data 403 and the request metadata m3 may be sent to the third data storage node.
  • Alternatively, the target data 403 and the request metadata m1 may be sent to the first data storage node; when a write operation is performed to the second data storage node, the target data 403, the request metadata m1, and the request metadata m2 may be sent to the second data storage node; and when a write operation is performed to the third data storage node, the target data 403, the request metadata m1, the request metadata m2, and the request metadata m3 may be sent to the third data storage node.
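  • In other words, under the second scheme each node receives one contiguous prefix of the first memory, which a plain single-buffer send can cover. A sketch of the send length, under the same assumptions as above:

    #include <stddef.h>

    /* FIG. 3 / FIG. 4b variant: the node with sending order i (0-based)
     * receives the contiguous region [ target data | m1 | ... | m(i+1) ];
     * its own request metadata is the last meta_size bytes it received.
     * With three nodes:
     *   order 0 receives [ target | m1 ]
     *   order 1 receives [ target | m1 | m2 ]
     *   order 2 receives [ target | m1 | m2 | m3 ] */
    size_t send_len_for_node(size_t target_len, size_t meta_size, int i)
    {
        return target_len + ((size_t)i + 1) * meta_size;
    }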
  • FIG. 5 is a flowchart of still another embodiment of a data processing method according to an embodiment of the present disclosure.
  • The technical solution in this embodiment is applied to a storage end.
  • In a traditional storage system, the storage server may perform the method.
  • In a distributed storage system, any one of the data storage nodes may perform the method.
  • the method can include the following steps:
  • 501: Acquire target data; the target data is formed by the requesting end adding a reserved field to the data to be stored.
  • The target data may be carried in a write request of the requesting end, and the storage end may obtain the target data from the write request.
  • 502: After obtaining the target data, a second memory is passively allocated to temporarily store the target data; the memory size of the second memory is consistent with the data size of the target data.
  • 503: Generate data metadata of the data to be stored in the target data.
  • 504: Write the data metadata into the memory location corresponding to the reserved field in the second memory; the storage end can then write the data to be stored and the data metadata in the second memory into the storage device as a whole.
  • Since the data size occupied by the reserved field is greater than or equal to the data size of the data metadata, there is enough space in the second memory to write the data metadata, and the storage end does not need to allocate write-disk memory; the data to be stored and the data metadata can be written to the storage device without data copying, achieving zero copy. This improves system performance and data storage efficiency without wasting memory.
  • The requesting end and the storage end can agree on the location of the reserved field and the location of the data to be stored in the target data; for example, the last 32 bytes of the target data are the reserved field. Following the agreed rule, the storage end can read the data to be stored from the target data and determine the reserved field.
  • As described above, the reserved field may be located at the end of the data to be stored.
  • Therefore, the generating the data metadata of the data to be stored in the target data may include: determining the data to be stored and the reserved field in the target data based on the preset size of the reserved field, and generating the data metadata of the data to be stored.
  • That is, the storage end can generate the data metadata based on the data to be stored.
  • In addition, the requesting end may further send request metadata to the storage end, and the storage end may determine the data to be stored in the target data according to the data length and/or data position of the data to be stored in the request metadata.
  • The request metadata may also indicate into which storage device the storage end should store the data to be stored.
  • The data metadata can then be generated based on the request metadata and the data to be stored.
  • The data sent by the requesting end to the storage end may include the target data and at least one request metadata.
  • Therefore, in some embodiments, the acquiring target data may include: receiving a write request sent by the requesting end, where the write request includes the target data and at least one request metadata, and determining target request metadata corresponding to the storage end.
  • Correspondingly, the generating the data metadata of the data to be stored in the target data includes:
  • reading the data to be stored from the received data based on the target request metadata; and generating the data metadata based on the target request metadata and the data to be stored.
  • The target request metadata may include the data size of the data to be stored, and that data size may serve as part of the information in the data metadata.
  • Other information included in the data metadata, such as the data checksum, may be generated based on the data to be stored; the data checksum may be computed using a CRC (Cyclic Redundancy Check) algorithm, in the same way as in the prior art, and details are not repeated here.
  • In some embodiments, the write request sent by the requesting end may include the target data, the target request metadata, and the request metadata located before the target request metadata, read by the requesting end from the first memory.
  • Therefore, the determining target request metadata may include:
  • reading the target request metadata from the tail of the write-request data based on the data size of a request metadata.
  • Based on the target request metadata, the data to be stored can then be read from the received data.
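  • The storage-end side of this flow can be sketched as follows. For simplicity the received buffer itself stands in for the "second memory", the request-metadata and data-metadata layouts are the illustrative ones assumed earlier, the checksum uses zlib's crc32(), and write_to_device() is a hypothetical I/O helper, not an API named by the application.

    #include <string.h>
    #include <stdint.h>
    #include <zlib.h>              /* crc32() */

    #define META_SIZE     64       /* data size of one request metadata (assumed) */
    #define RESERVED_SIZE 64       /* preset reserved-field size (assumed) */

    int write_to_device(int fd, const unsigned char *buf, size_t len); /* hypothetical */

    /* recv_buf holds [ target data | ... | target request metadata ]. */
    int handle_write(unsigned char *recv_buf, size_t recv_len, int disk_fd)
    {
        /* 1. Read the target request metadata from the tail of the received data. */
        const unsigned char *req_meta = recv_buf + recv_len - META_SIZE;
        uint64_t data_len;
        memcpy(&data_len, req_meta, sizeof data_len);   /* assumed field layout */

        /* 2. The target data sits at the head: data to be stored + reserved field. */
        unsigned char *target = recv_buf;               /* the "second memory" */
        size_t target_len = (size_t)data_len + RESERVED_SIZE;

        /* 3. Generate the data metadata and write it into the reserved field,
         *    i.e. directly after the data to be stored, with no extra buffer. */
        uint32_t crc = (uint32_t)crc32(0L, target, (uInt)data_len);
        memcpy(target + data_len, &data_len, sizeof data_len);
        memcpy(target + data_len + sizeof data_len, &crc, sizeof crc);

        /* 4. One write persists the data and its metadata as a whole: zero copy. */
        return write_to_device(disk_fd, target, target_len);
    }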
  • As shown in the interaction diagram of FIG. 6, the requesting end 60 first constructs the target data 601 by adding a reserved field 602 at the end of the replica data.
  • The reserved field can be formed by adding an empty string of a preset size to the end of the replica data.
  • Suppose there are three data storage nodes receiving the replica data: the first data storage node 61, the second data storage node 62, and the third data storage node 63; that is, the requesting end needs to send the replica data to the three data storage nodes.
  • The requesting end determines the request metadata corresponding to each of the three data storage nodes: assume the request metadata corresponding to the first data storage node 61 is m1, the request metadata corresponding to the second data storage node 62 is m2, and the request metadata corresponding to the third data storage node 63 is m3.
  • The requesting end applies for a piece of first memory and writes the target data 601 into it; then, according to the sending order of the three data storage nodes, assumed here to be the first data storage node 61, the second data storage node 62, and then the third data storage node 63, it fills the three request metadata in sequence at the end of the target data 601, in the order m1, m2, m3.
  • For any one data storage node, the requesting end reads from the first memory the target data, the request metadata corresponding to that node, and the request metadata located before it, and sends a write request to that node.
  • That is, the requesting end 60 sends the target data 601 and m1 to the first data storage node 61; the target data 601, m1, and m2 to the second data storage node 62; and the target data 601, m1, m2, and m3 to the third data storage node 63.
  • Each data storage node can obtain, from the tail of the write request, the target request metadata that differentiates it from the other data storage nodes.
  • Each data storage node determines the target data from the write request based on the number of request metadata and the data size of a request metadata, and allocates a second memory to temporarily store the target data.
  • Each data storage node can determine the replica data from the target data based on the target request metadata, generate the data metadata accordingly, and then write the data metadata into the second memory at the location corresponding to the reserved field.
  • Each data storage node can then write the replica data and the data metadata in the second memory into its respective storage device 64.
  • In this practical application, the requesting end applies for one piece of first memory and fills the request metadata of the three data storage nodes into it at one time, which simplifies the operation and ensures data sending efficiency.
  • The requesting end adds a reserved field to the replica data to form the target data, so that each data storage node has enough space in the passively allocated second memory that temporarily stores the target data to write the data metadata. There is thus no need to re-apply for write-disk memory and no data copy operation; data storage is achieved with a zero-copy scheme, improving system performance and data storage efficiency. No additional memory needs to be applied for, reducing memory waste.
  • FIG. 7 is a schematic structural diagram of an embodiment of a data processing apparatus according to an embodiment of the present disclosure.
  • the data processing apparatus may be configured on a requesting end, and the apparatus may include:
  • the data construction module 701 is configured to add a reserved field to the data to be stored to obtain target data.
  • the data construction module may specifically add a predetermined character string of a preset size as a reserved field at the end of the data to be stored to obtain target data.
  • a data sending module 702 configured to send the target data to a storage end
  • the reserved field is used to write data metadata of the to-be-stored data in a memory location corresponding to the storage end.
  • In this embodiment, a reserved field is added to the data to be stored, which is reconstructed into target data. Since the data size of the target data equals the sum of the data size of the data to be stored and the data size occupied by the reserved field, the second memory passively allocated by the storage end is sufficient to write the data metadata; there is no need to apply for additional write-disk memory and no data copying, achieving zero copy, so system performance and data storage efficiency can be improved without causing memory waste.
  • In some embodiments, the data sending module may be specifically configured to: allocate a first memory consistent with the data size of the target data; fill the target data into the first memory; and send the target data from the first memory to the storage end.
  • The requesting end may also send request metadata to the storage end, and the storage end may determine the data to be stored in the target data according to the data length and/or data position of the data to be stored in the request metadata.
  • The request metadata may also indicate into which storage device the storage end should store the data to be stored.
  • Since the request metadata needs to be temporarily placed in memory before being sent, it may also be filled into the memory, so that the data to be stored and the request metadata can be sent together.
  • Therefore, the data sending module may be specifically configured to:
  • calculate the total data size of the target data and the request metadata; allocate a first memory consistent with the total data size; fill the target data and the request metadata into the first memory; and send the target data from the first memory to the storage end.
  • In a distributed storage system, the storage end can include multiple data storage nodes.
  • Therefore, the data sending module may be specifically configured to:
  • send the target data and the respectively corresponding request metadata to the multiple data storage nodes.
  • In some embodiments, the data sending module may be specifically configured to:
  • for any one data storage node, send, from the first memory, the target data, the request metadata corresponding to that node, and the request metadata located before the request metadata corresponding to that node to that storage node; that node is configured to read its corresponding request metadata from the tail of the received data.
  • In some embodiments, the data sending module filling multiple request metadata in sequence at the end of the target data may specifically be: filling in, at the end of the target data and according to the sending order of the multiple data storage nodes, the request metadata corresponding to each of the multiple data storage nodes.
  • The data sending module sending, for any one data storage node, the target data, that node's request metadata, and the request metadata before it from the first memory to that node may specifically be: for any one data storage node, sending, from the first memory, the target data, the request metadata corresponding to that node's sending order, and the request metadata located before the request metadata corresponding to that node's sending order to that storage node.
  • The data processing apparatus of the embodiment shown in FIG. 7 can be implemented as a computing device deployed at the requesting end, which can be a request server.
  • As shown in FIG. 8, the computing device can include a storage component 801 and a processing component 802;
  • the storage component 801 is configured to store one or more computer instructions, where the one or more computer instructions are invoked and executed by the processing component 802;
  • the processing component 802 is configured to:
  • add a reserved field to data to be stored, to obtain target data; and send the target data to a storage end; where the memory location corresponding to the reserved field at the storage end is used to write data metadata of the data to be stored.
  • The processing component 802 can apply to the storage component 801 for the first memory used to cache the target data and/or the request metadata.
  • In addition, the processing component 802 can also be used to execute the data processing method described in any of the foregoing embodiments of FIG. 1 to FIG. 3.
  • The processing component 802 can include one or more processors to execute computer instructions, so as to perform all or part of the steps of the methods described above.
  • Of course, the processing component can also be implemented as one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), or other electronic elements, configured to perform the above methods.
  • The storage component 801 is configured to store various types of data to support operation on the computing device.
  • The memory can be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
  • Of course, the computing device necessarily also includes other components, such as input/output interfaces, communication components, and the like.
  • An embodiment of the present application further provides a computer-readable storage medium storing a computer program; when the computer program is executed by a computer, the data processing method of any of the foregoing embodiments of FIG. 1 to FIG. 3 can be implemented.
  • FIG. 9 is a schematic structural diagram of another embodiment of a data processing apparatus according to an embodiment of the present disclosure.
  • the apparatus may be configured on a storage end, and the apparatus may include:
  • the data obtaining module 901 is configured to acquire target data, where the target data is formed by the requesting end adding a reserved field to the data to be stored;
  • a memory allocation module 902 configured to allocate a second memory to cache the target data
  • a data generating module 903 configured to generate data metadata of the data to be stored in the target data
  • the data writing module 904 is configured to write the data metadata into a memory location corresponding to the reserved field in the second memory.
  • Since the data size occupied by the reserved field is greater than or equal to the data size of the data metadata, there is enough space in the second memory to write the data metadata, and the storage end does not need to allocate write-disk memory; the data to be stored and the data metadata can be written to the storage device without data copying, achieving zero copy. This improves system performance and data storage efficiency without wasting memory.
  • The location of the reserved field and the location of the data to be stored in the target data may be agreed with the requesting end; for example, the last 32 bytes of the target data are the reserved field. Following the agreed rule, the storage end can read the data to be stored from the target data and determine the reserved field.
  • As described above, the reserved field may be located at the end of the data to be stored.
  • the data generating module may be specifically configured to determine the to-be-stored data and the reserved field in the target data based on a preset size of the reserved field, and generate data metadata of the to-be-stored data.
  • The requesting end may also send request metadata to the storage end. Therefore, in some embodiments, the data obtaining module may be specifically configured to:
  • receive a write request sent by the requesting end, where the write request includes the target data and at least one request metadata, and determine target request metadata;
  • and the data generating module may be specifically configured to: read the data to be stored from the received data based on the target request metadata; and generate the data metadata based on the target request metadata and the data to be stored.
  • In some embodiments, the write request may include the target data, the target request metadata, and the request metadata located before the target request metadata, read by the requesting end from the first memory;
  • and the data obtaining module determining the target request metadata may specifically be: reading the target request metadata from the tail of the write request based on the data size of a request metadata.
  • The data processing apparatus of the embodiment shown in FIG. 9 can be implemented as a computing device, which can be a storage server in a traditional storage system or a data storage node in a distributed storage system, such as a physical server. As shown in FIG. 10, the computing device can include a storage component 1001 and a processing component 1002;
  • the storage component 1001 is configured to store one or more computer instructions, where the one or more computer instructions are invoked and executed by the processing component 1002;
  • the processing component 1002 is configured to:
  • acquire target data, where the target data is formed by the requesting end adding a reserved field to the data to be stored; allocate a second memory to cache the target data; generate data metadata of the data to be stored in the target data; and write the data metadata into the memory location corresponding to the reserved field in the second memory.
  • In addition, the processing component 1002 can also be used to perform the data processing method described in any of the foregoing embodiments.
  • The processing component 1002 can include one or more processors to execute computer instructions, so as to perform all or part of the steps of the methods described above.
  • Of course, the processing component can also be implemented as one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), or other electronic elements, configured to perform the above methods.
  • The storage component 1001 is configured to store various types of data to support operation on the computing device.
  • The memory can be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
  • Of course, the computing device necessarily also includes other components, such as input/output interfaces, communication components, and the like.
  • An embodiment of the present application further provides a computer-readable storage medium storing a computer program; when the computer program is executed by a computer, the data processing method of the foregoing embodiment of FIG. 5 can be implemented.
  • The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present application provide a data processing method, an apparatus, and a computing device. A requesting end adds a reserved field to data to be stored, constructing the data to be stored into target data; the requesting end sends the target data to a storage end; the storage end allocates memory to cache the target data and generates data metadata of the data to be stored in the target data; and the data metadata is written into the memory location corresponding to the reserved field in the allocated memory. The embodiments of the present application achieve zero-copy of data, ensure data storage efficiency, and improve system performance.

Description

Data processing method, apparatus, and computing device
This application claims priority to Chinese Patent Application No. 201810020121.0, filed on January 9, 2018 and entitled "Data processing method, apparatus, and computing device", the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
The embodiments of the present application relate to the field of data processing technologies, and in particular to a data processing method, a data processing apparatus, and a computing device.
BACKGROUND
When data is stored in a storage system, in order to ensure data security, metadata describing attributes of the data to be stored, also called data metadata, is usually appended to the end of the data to be stored, so that the data to be stored and the data metadata can be written to a storage device in a single operation.
When receiving the data to be stored sent by a requesting end, the storage end passively allocates a piece of memory whose size matches the data size of the data to be stored, to cache that data. To ensure that the data to be stored and the data metadata can be written to the storage device in one operation, one implementation in the prior art is: apply for an additional piece of write-disk memory, whose size is the data size of the data to be stored plus the data size of the data metadata; copy the data to be stored into the write-disk memory; and then splice the data metadata after it in the write-disk memory, so that the data to be stored and the data metadata are organized together. A single write operation can then write the data to be stored and the data metadata in the write-disk memory to the storage device as a whole.
However, as can be seen from the above description, the prior art first needs to copy the data to be stored into the write-disk memory, which affects data storage efficiency.
SUMMARY
The embodiments of the present application provide a data processing method, apparatus, and computing device, to solve the technical problem of low data storage efficiency in the prior art.
In a first aspect, an embodiment of the present application provides a data processing method, including:
adding a reserved field to data to be stored, to obtain target data;
sending the target data to a storage end;
where the memory location corresponding to the reserved field at the storage end is used to write data metadata of the data to be stored, so that the data to be stored and the data metadata are written to a storage device as a whole.
In a second aspect, an embodiment of the present application provides a data processing method, including:
acquiring target data, where the target data is formed by a requesting end adding a reserved field to data to be stored;
allocating a second memory to cache the target data;
generating data metadata of the data to be stored in the target data;
writing the data metadata into the memory location corresponding to the reserved field in the second memory.
In a third aspect, an embodiment of the present application provides a data processing apparatus, including:
a data construction module, configured to add a reserved field to data to be stored, to obtain target data;
a data sending module, configured to send the target data to a storage end;
where the memory location corresponding to the reserved field at the storage end is used to write data metadata of the data to be stored, so that the data to be stored and the data metadata are written to a storage device as a whole.
In a fourth aspect, an embodiment of the present application provides a data processing apparatus, including:
a data acquisition module, configured to acquire target data, where the target data is formed by a requesting end adding a reserved field to data to be stored;
a memory allocation module, configured to allocate a second memory to cache the target data;
a data generation module, configured to generate data metadata of the data to be stored in the target data;
a data writing module, configured to write the data metadata into the memory location corresponding to the reserved field in the second memory.
In a fifth aspect, an embodiment of the present application provides a computing device, including a storage component and a processing component,
where the storage component is configured to store one or more computer instructions, the one or more computer instructions being invoked and executed by the processing component;
and the processing component is configured to:
add a reserved field to data to be stored, to obtain target data;
send the target data to a storage end;
where the memory location corresponding to the reserved field at the storage end is used to write data metadata of the data to be stored, so that the data to be stored and the data metadata are written to a storage device as a whole.
In a sixth aspect, an embodiment of the present application provides a computing device, including a storage component and a processing component,
where the storage component is configured to store one or more computer instructions, the one or more computer instructions being invoked and executed by the processing component;
and the processing component is configured to:
acquire target data, where the target data is formed by a requesting end adding a reserved field to data to be stored;
allocate a second memory in the storage component to cache the target data;
generate data metadata of the data to be stored in the target data;
write the data metadata into the memory location corresponding to the reserved field in the second memory.
In the embodiments of the present application, the requesting end adds a reserved field to the data to be stored, constructing the data to be stored into target data; the data size of the target data is the sum of the data size of the data to be stored and the data size occupied by the reserved field. The requesting end sends the target data to the storage end, and the storage end allocates memory to cache the target data. Because the reserved field is set aside in the target data, the memory space corresponding to the reserved field is sufficient to write the data metadata, so the storage end does not need to allocate additional new memory; this causes no memory waste and avoids data copying, thereby improving data storage efficiency and system performance.
These and other aspects of the present application will be more clearly and readily understood in the description of the embodiments below.
BRIEF DESCRIPTION OF THE DRAWINGS
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are some embodiments of the present application, and those of ordinary skill in the art may further derive other drawings from these drawings without creative effort.
FIG. 1 is a flowchart of an embodiment of a data processing method provided by the present application;
FIG. 2 is a flowchart of still another embodiment of a data processing method provided by the present application;
FIG. 3 is a flowchart of still another embodiment of a data processing method provided by the present application;
FIG. 4a is a schematic diagram of a data structure in an embodiment of the present application;
FIG. 4b is a schematic diagram of another data structure in an embodiment of the present application;
FIG. 5 is a flowchart of still another embodiment of a data processing method provided by the present application;
FIG. 6 is a schematic interaction diagram of data processing in a practical application of an embodiment of the present application;
FIG. 7 is a schematic structural diagram of an embodiment of a data processing apparatus provided by the present application;
FIG. 8 is a schematic structural diagram of an embodiment of a computing device provided by the present application;
FIG. 9 is a schematic structural diagram of still another embodiment of a data processing apparatus provided by the present application;
FIG. 10 is a schematic structural diagram of still another embodiment of a computing device provided by the present application.
DETAILED DESCRIPTION
To enable those skilled in the art to better understand the solutions of the present application, the technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings in the embodiments of the present application.
Some of the flows described in the specification and claims of the present application and in the above drawings contain multiple operations that appear in a specific order, but it should be clearly understood that these operations may be executed out of the order in which they appear herein, or in parallel. The sequence numbers of the operations, such as 101 and 102, are merely used to distinguish the different operations; the numbers themselves do not represent any order of execution. In addition, these flows may include more or fewer operations, and these operations may be executed in order or in parallel. It should be noted that terms such as "first" and "second" herein are used to distinguish different messages, devices, modules, and the like; they do not represent a sequence, nor do they limit "first" and "second" to being of different types.
The technical solutions of the embodiments of the present application are mainly applied to storage systems; a storage system may be a traditional storage system or a distributed storage system.
For ease of understanding, technical terms that may appear in the embodiments of the present application are first explained below:
Storage end: responsible for data access operations. A traditional storage system uses a centralized storage server to store data, and the storage end may refer to that storage server; in a distributed storage system, data is stored across multiple data storage nodes, and the storage end may refer to the one or more data storage nodes.
Data storage node: a node responsible for data storage in a distributed storage system, usually a physical server.
Requesting end: responsible for sending read/write requests; upper-layer business systems access or update data on the storage end through the requesting end.
Request metadata: data guidance information for the data to be stored, sent by the requesting end to the storage end. It may include the data length and/or data position of the data to be stored, as well as save indication information, which specifies into which storage device the data to be stored is written. The request metadata is not written into the stored data.
Data metadata (DataMeta): metadata describing attributes of the data to be stored, which may include data length, data checksum, storage location, the name of the file the data belongs to, and so on. The data metadata is written into the storage device together with the data to be stored.
Multi-copy technology: a data redundancy technique in a distributed storage system. To prevent data loss caused by the failure of a data storage node, several copies of the original data are usually made, and each copy is stored on a different data storage node; each copy of the original data is called replica data.
Storage device: a hardware device used to store data in a storage system; data ultimately needs to be written into the storage device, which may be a storage medium such as a magnetic disk.
In the prior art, in order to write the data to be stored and the data metadata into the storage device together, the storage end needs to allocate an additional piece of write-disk memory; yet when the storage end receives the data to be stored, it passively allocates a piece of memory to cache that data, so the data to be stored must be copied again into the write-disk memory. The data to be stored thus undergoes one extra copy, and this data copy affects data storage efficiency and system performance.
To improve system performance and ensure data storage efficiency, the inventors proposed the technical solutions of the present application after a series of studies. In the embodiments of the present application, the requesting end adds a reserved field to the data to be stored, constructing the data to be stored into target data; the data size of the target data is the sum of the data size of the data to be stored and the data size corresponding to the reserved field. The requesting end sends the target data to the storage end, and the storage end allocates a piece of memory to cache the target data. Because the target data includes the reserved field, the memory space corresponding to the reserved field is sufficient to write the data metadata, so the storage end does not need to allocate additional new memory; this causes no memory waste and avoids data copying, thereby improving data storage efficiency and system performance.
The technical solutions in the embodiments of the present application will be described below clearly and completely with reference to the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are only some rather than all of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
Figure 1 is a flowchart of one embodiment of a data processing method provided by an embodiment of the present application; the technical solution of this embodiment is applied at, and executed by, the requesting end.
The method may include the following steps:
101: Add a reserved field to data to be stored, to obtain target data.
In this embodiment of the present application, when constructing a write request the requesting end adds a reserved field to the data to be stored. The reserved field occupies a certain data length; the data to be stored plus the reserved field forms a new data format constituting the target data, whose size equals the size of the data to be stored plus the size occupied by the reserved field.
The reserved field may be located at the tail of the data to be stored. Its size is preset and may be chosen with reference to the size of the data metadata; the size occupied by the reserved field must be greater than or equal to that of the data metadata.
Optionally, the size occupied by the reserved field may equal the size of the data metadata.
Here the data metadata is the data metadata of the data to be stored, generated from that data. Because the information contained in data metadata is usually fixed, the data metadata of different data is usually of the same size, and the size occupied by the reserved field can be set accordingly.
It should be noted that the data sizes involved in the embodiments of the present application are expressed in units of bytes (Byte, abbreviated B) or kilobytes (Kbytes, abbreviated K); data size may also be called data length.
As one option, a predetermined character string of preset size may be appended to the tail of the data to be stored as the reserved field, to obtain the target data.
The predetermined string may be an empty string, an agreed string, or the like.
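For concreteness, the construction of the target data can be sketched in a few lines of C. This is only an illustration under stated assumptions (a 32-byte reserved field, zero bytes as the "empty string"); the function and macro names are invented for the example and are not part of the claimed method.

```c
#include <stdlib.h>
#include <string.h>

#define RESERVED_SIZE 32   /* assumed preset size; must be >= size of the data metadata */

/* Build target data = data to be stored, followed by a zero-filled reserved
 * field at the tail. The caller owns and frees the returned buffer. */
unsigned char *build_target_data(const unsigned char *data, size_t data_len,
                                 size_t *target_len_out)
{
    size_t target_len = data_len + RESERVED_SIZE;
    unsigned char *target = malloc(target_len);
    if (target == NULL)
        return NULL;
    memcpy(target, data, data_len);              /* payload at the head        */
    memset(target + data_len, 0, RESERVED_SIZE); /* empty-string reserved tail */
    *target_len_out = target_len;
    return target;
}
```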
102: Send the target data to the storage end.
The requesting end may send the storage end a write request carrying the target data, which is thus sent to the storage end as the data to be written.
After obtaining the target data, the storage end passively allocates a block of memory to buffer it; for ease of distinction in the description, this passively allocated memory for buffering the target data is named the "second memory". The storage end can read the data to be stored from the target data and generate the data metadata from it. Because the reserved field is set aside within the target data and is large enough for the data metadata, the storage end can write the data metadata at the memory position corresponding to the reserved field in the second memory.
That is, the memory position corresponding to the reserved field at the storage end is used for writing the data metadata of the data to be stored.
In this embodiment, the requesting end adds a reserved field to the data to be stored and reconstructs it into target data. Because the size of the target data equals the size of the data to be stored plus the size occupied by the reserved field, the second memory passively allocated by the storage end is sufficient for writing the data metadata. No additional disk-write memory need be requested and no data copy is performed, achieving zero copy; system performance and data storage efficiency are thereby improved, and no memory is wasted.
Because the data to be stored has to be held temporarily in memory before being sent, in some embodiments sending the target data to the storage end may include:
allocating a first memory whose size matches the data size of the target data;
filling the target data into the first memory;
sending the target data from the first memory to the storage end.
Here, the "first memory" in the embodiments of the present application refers to the memory allocated for holding the target data to be sent to the storage end, so named for ease of distinction in the description. Those skilled in the art will appreciate that the "first" and "second" in "first memory" and "second memory" serve only to distinguish the description and do not denote any relationship such as ordering or containment.
The requesting end and the storage end may agree on the positions of the reserved field and of the data to be stored within the target data, for example that the last 32 bytes of the target data are the reserved field, so that the storage end, following the agreed rule, can read the data to be stored from the target data and determine the reserved field.
In addition, the requesting end may also send request metadata to the storage end, and the storage end may determine the data to be stored within the target data from the data length and/or data position carried in the request metadata.
The request metadata may further indicate, for example, to which storage device the storage end should store the data.
Because the data to be stored has to be held temporarily in memory before being sent, the request metadata may also be filled into that memory, so that the data to be stored and the request metadata can be sent together.
Accordingly, sending the target data to the storage end may include:
calculating the total data size of the target data and the request metadata;
allocating a first memory whose size matches the total data size;
filling the target data and the request metadata into the first memory;
sending the target data from the first memory to the storage end.
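The packing of the first memory can likewise be sketched. The RequestMeta layout below is an assumption made for illustration (the description only requires that it carry data length and/or position plus a disk-placement indication); a real implementation would serialize the fields explicitly rather than copying a raw struct.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative request-metadata layout; not mandated by the method. */
typedef struct {
    uint64_t data_len;   /* length of the data to be stored              */
    uint64_t data_off;   /* position of that data inside the target data */
    uint32_t device_id;  /* disk-placement indication                    */
} RequestMeta;

/* Allocate a first memory of the total size, then fill in the target data
 * followed by the request metadata, so both go out in a single send. */
unsigned char *pack_first_memory(const unsigned char *target, size_t target_len,
                                 const RequestMeta *meta, size_t *total_out)
{
    size_t total = target_len + sizeof(*meta);
    unsigned char *first_mem = malloc(total);
    if (first_mem == NULL)
        return NULL;
    memcpy(first_mem, target, target_len);               /* target data first */
    memcpy(first_mem + target_len, meta, sizeof(*meta)); /* metadata at tail  */
    *total_out = total;
    return first_mem;  /* a send(sock, first_mem, total, 0) would follow */
}
```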
In one practical application, the technical solution of the present application may be applied in a distributed storage system. In a distributed storage system, to avoid data loss caused by the failure of a data storage node, a multi-replica technique is usually adopted: several copies of the original data are made and each copy is stored on a different data storage node; each copy of the original data is replica data. The data to be stored may therefore be replica data.
Because a distributed storage system is a cluster formed by a number of data storage nodes, there is a need to send the data to be stored to multiple data storage nodes for storage; for example, when the data to be stored is replica data, the replica data has to be sent to multiple data storage nodes separately.
If the data to be stored needs to be sent to multiple data storage nodes, the requesting end has to send corresponding request metadata for each data storage node. Besides the data length and data position of the data to be stored, the request metadata sent to the respective nodes also carries differentiated per-node information such as the disk-placement indication, since different nodes may have different requirements as to which storage device the data is written to.
In a distributed storage system, the storage end in the embodiments of the present application may thus include multiple data storage nodes.
Sending the target data to the storage end accordingly includes sending the target data together with the respectively corresponding request metadata to the multiple data storage nodes.
At present, when the requesting end sends the target data and the respectively corresponding request metadata to multiple data storage nodes, it may first calculate the total size of the target data plus one piece of request metadata and allocate a first memory of that size. The target data is filled in first; when the data needs to be sent to one node, that node's request metadata is spliced on in the first memory and the target data plus that request metadata are sent to the node. When the data then needs to be sent to another node, the other node's request metadata is copied into the first memory, overwriting the previous request metadata, before sending.
As the above description shows, with this current approach, sending the data to be stored to multiple data storage nodes requires copying request metadata multiple times. The operation is cumbersome and degrades sending efficiency, and therefore data storage efficiency.
After further consideration, the inventors therefore arrived at a further embodiment of the technical solution of the present application.
Figure 2 is a flowchart of a further embodiment of a data processing method provided by an embodiment of the present application; the method may include the following steps:
201: Add a reserved field to data to be stored, to obtain target data.
202: Determine the request metadata of the data to be stored corresponding respectively to the multiple data storage nodes.
203: Calculate the total data size of the target data and the multiple pieces of request metadata.
204: Allocate a first memory whose size matches the total data size.
That is, the size of the first memory may equal the total data size.
205: Fill the target data and the multiple pieces of request metadata into the first memory.
The target data may be filled into the first memory first, after which the pieces of request metadata are filled in one after another, spliced onto the tail of the target data.
206: Send the target data and the respectively corresponding request metadata to the multiple data storage nodes.
The requesting end may send each of the multiple data storage nodes a write request carrying the target data and that node's corresponding request metadata.
That is, for any given data storage node, the target data in the first memory and the request metadata corresponding to that node are sent to that node.
In the first memory, the pieces of request metadata may be filled in according to the sending order of the multiple data storage nodes: the request metadata of the first node in the sending order sits at the tail of the target data, the request metadata of the second node sits at the tail of the first node's request metadata, and so on. When a write request needs to be sent to a given node, its request metadata can be located from that node's sending order, since the size of a piece of request metadata is known; the offset calculation is sketched below.
After receiving the data sent by the requesting end, a data storage node can read the request metadata it needs from the tail of the received data and, based on that request metadata, determine the data to be stored and the reserved field within the target data.
In this embodiment, the allocated first memory can hold all the pieces of request metadata, so the request metadata need not be specialized anew for each data storage node; a single operation suffices to send data to different nodes. The operation is simpler and sending efficiency is improved, which improves data storage efficiency and system performance.
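Because every piece of request metadata has the same known size, locating the metadata for a given node in the first memory reduces to offset arithmetic. A minimal sketch under that assumption (the function name is invented for the example):

```c
#include <stddef.h>

/* Offset, within the first memory, of the request metadata for the node whose
 * sending order is `order` (1-based), given equal-size metadata entries
 * appended in sending order after the target data. */
size_t request_meta_offset(size_t target_len, size_t meta_size, unsigned order)
{
    return target_len + (size_t)(order - 1) * meta_size;
}
```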
Furthermore, to simplify the operation at the requesting end and at each data storage node, in a further embodiment, as shown in Figure 3, the data processing method may include:
301: Add a reserved field to data to be stored, to obtain target data.
302: Determine the request metadata of the data to be stored corresponding respectively to the multiple data storage nodes.
303: Calculate the total data size of the target data and the multiple pieces of request metadata.
304: Allocate a first memory whose size matches the total data size.
That is, the size of the first memory may equal the total data size.
305: Fill the target data into the first memory, and fill the multiple pieces of request metadata one after another at the tail of the target data.
The pieces of request metadata may be filled into the first memory according to the sending order of the multiple data storage nodes: the request metadata of the first node in the sending order sits at the tail of the target data, the request metadata of the second node sits at the tail of the first node's request metadata, and so on.
Therefore, as a further embodiment, filling the multiple pieces of request metadata one after another at the tail of the target data includes:
filling, at the tail of the target data and in the sending order of the multiple data storage nodes, the request metadata respectively corresponding to the multiple data storage nodes.
306: For any given data storage node, send to that node the target data in the first memory, the request metadata corresponding to that node, and the request metadata located before the request metadata corresponding to that node.
The requesting end may send that node a write request carrying the target data, the request metadata corresponding to that node, and the request metadata located before it.
The data storage node reads its corresponding request metadata from the tail of the received data.
This applies where the request metadata respectively corresponding to the multiple data storage nodes has been filled at the tail of the target data in the nodes' sending order.
Optionally, the operation of step 306 may include:
for any given data storage node, sending to that node, from the first memory, the target data, the request metadata corresponding to that node's sending order, and the request metadata located before the request metadata corresponding to that node's sending order.
With this embodiment, based on the size of a piece of request metadata, a data storage node can parse its own request metadata from the tail of the received data.
Because the request metadata includes the data size and/or data position of the data to be stored, the node can, based on the request metadata, read the indicated amount of data from the head of the received data, or locate the data to be stored within the received data by its position, thereby obtaining the data to be stored.
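The node-side parsing described above can be sketched as follows; the RequestMeta layout is the same illustrative assumption as before, and error handling is reduced to bounds checks.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint64_t data_len;
    uint64_t data_off;
    uint32_t device_id;
} RequestMeta;  /* illustrative layout, as assumed earlier */

/* On a data storage node: the last sizeof(RequestMeta) bytes of the received
 * buffer are this node's own request metadata; the data to be stored is then
 * located from the head using the length/position the metadata carries. */
int parse_received(const unsigned char *recv_buf, size_t recv_len,
                   RequestMeta *meta_out,
                   const unsigned char **data_out, size_t *data_len_out)
{
    if (recv_len < sizeof(*meta_out))
        return -1;
    memcpy(meta_out, recv_buf + recv_len - sizeof(*meta_out), sizeof(*meta_out));
    if (meta_out->data_off + meta_out->data_len > recv_len)
        return -1;  /* inconsistent metadata */
    *data_out = recv_buf + meta_out->data_off;
    *data_len_out = (size_t)meta_out->data_len;
    return 0;
}
```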
For ease of understanding, three data storage nodes are taken as an example below. Figure 4a is a schematic diagram of the structure of the target data, which consists of data to be stored 401 and reserved field 402.
Figure 4b is a schematic diagram of the data structure in the first memory, which consists of target data 403 and three pieces of request metadata m1, m2, and m3 filled one after another at the tail of target data 403.
The three pieces of request metadata may be filled in according to the sending order of the three data storage nodes: request metadata m1 corresponds to the first data storage node, request metadata m2 to the second, and request metadata m3 to the third.
As one option, when performing a write operation to the first data storage node, the requesting end may send target data 403 and request metadata m1 to the first node; when writing to the second node, it may send target data 403 and request metadata m2; and when writing to the third node, it may send target data 403 and request metadata m3.
As another option, when writing to the first data storage node the requesting end may send target data 403 and request metadata m1; when writing to the second node it may send target data 403, request metadata m1, and request metadata m2; and when writing to the third node it may send target data 403, request metadata m1, request metadata m2, and request metadata m3.
Figure 5 is a flowchart of a further embodiment of a data processing method provided by an embodiment of the present application. The technical solution of this embodiment is applied at the storage end: in a conventional storage system the method may be executed by the storage server, and in a distributed storage system by any data storage node.
The method may include the following steps:
501: Acquire target data.
The target data is formed by the requesting end adding a reserved field to the data to be stored.
The target data may be carried in a write request from the requesting end, from which the storage end can obtain it.
502: Allocate a second memory to buffer the target data.
After obtaining the target data, the storage end passively allocates a block of second memory to hold it temporarily; the size of the second memory matches the data size of the target data.
503: Generate the data metadata of the data to be stored in the target data.
504: Write the data metadata to the memory position corresponding to the reserved field in the second memory.
The storage end can then write the data to be stored together with the data metadata from the second memory to the storage device in one pass.
The size occupied by the reserved field is greater than or equal to that of the data metadata, so enough space remains in the second memory for the data metadata to be written. The storage end can write the data to be stored and the data metadata to the storage device without allocating an additional disk-write memory and without any data copy, achieving zero copy; system performance and data storage efficiency are thereby improved, and no memory is wasted.
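The storage-end commit path — fill the reserved tail of the second memory with the data metadata in place, then issue one write — can be sketched as follows. The DataMeta layout, the 32-byte reserved size, and the use of a POSIX write() are assumptions for the example; the point is that no second buffer and no copy of the stored data are needed.

```c
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define RESERVED_SIZE 32  /* assumed preset size, >= sizeof(DataMeta) */

typedef struct {
    uint64_t data_len;  /* length of the stored data       */
    uint32_t checksum;  /* e.g. a CRC over the stored data */
} DataMeta;             /* illustrative metadata layout    */

/* second_mem already holds the target data (stored data + reserved tail).
 * Write the data metadata into the reserved slot in place, then persist
 * data and metadata to the storage device as one whole with a single write. */
int commit_to_device(int dev_fd, unsigned char *second_mem, size_t target_len,
                     const DataMeta *meta)
{
    size_t data_len = target_len - RESERVED_SIZE;
    memcpy(second_mem + data_len, meta, sizeof(*meta)); /* fill reserved field */
    ssize_t n = write(dev_fd, second_mem, target_len);  /* one write, one unit */
    return (n == (ssize_t)target_len) ? 0 : -1;
}
```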
The requesting end and the storage end may agree on the positions of the reserved field and of the data to be stored within the target data, for example that the last 32 bytes of the target data are the reserved field, so that the storage end, following the agreed rule, can read the data to be stored from the target data and determine the reserved field.
Accordingly, in some embodiments, the reserved field may be located at the tail of the data to be stored;
and generating the data metadata of the data to be stored in the target data may include:
determining, based on the preset size of the reserved field, the data to be stored and the reserved field within the target data;
generating the data metadata of the data to be stored.
The storage end can thus generate the data metadata based on the data to be stored.
In addition, the requesting end may also send request metadata to the storage end, and the storage end may determine the data to be stored within the target data from the data length and/or data position in the request metadata.
The request metadata may further indicate, for example, to which storage device the storage end should store the data.
The data metadata may be generated based on the request metadata and the data to be stored.
In a distributed storage system there are cases where the data to be stored is sent to multiple data storage nodes for storage, for example when it is replica data. Based on the description in the above embodiments, the data sent by the requesting end to the storage end may include the target data and at least one piece of request metadata.
Accordingly, in some embodiments, acquiring the target data may include:
receiving a write request sent by the requesting end, the write request including the target data and at least one piece of request metadata;
determining target request metadata;
and generating the data metadata of the data to be stored in the target data includes:
reading the data to be stored based on the target request metadata;
generating the data metadata based on the target request metadata and the data to be stored.
For example, the target request metadata may include the data size of the data to be stored, which can serve as information in the data metadata. Other information included in the data metadata, such as a data checksum, may be generated from the data to be stored; the checksum may be realized with a CRC (Cyclic Redundancy Check) algorithm, the same as in the prior art, and is not elaborated here.
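As one illustration of such a checksum, the stored data could be run through a CRC-32 routine; the sketch below uses zlib's crc32() purely as an example — the method does not prescribe any particular CRC implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <zlib.h>

/* CRC-32 over the data to be stored, as one possible realization of the
 * data checksum carried in the data metadata. */
uint32_t data_checksum(const unsigned char *data, size_t len)
{
    uLong crc = crc32(0L, Z_NULL, 0);   /* canonical initial value */
    crc = crc32(crc, data, (uInt)len);
    return (uint32_t)crc;
}
```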
If the requesting end sends the target data in the manner of the embodiment shown in Figure 3, its write request may include the target data that the requesting end requests to send from the first memory, the target request metadata, and the request metadata located before the target request metadata.
Determining the target request metadata may therefore include:
reading the target request metadata from the tail of the write request based on the data size of a piece of request metadata.
Based on the target request metadata, the data to be stored can then be read from the sent data.
The technical solution of an embodiment of the present application is described below taking, as an example, a distributed storage system in which the data to be stored is replica data. In a distributed storage system the multi-replica technique is usually adopted to guard against loss of the original data through failures.
As shown in Figure 6, requesting end 60 first constructs target data 601 by appending a reserved field at the tail of replica data 602; the reserved field may be formed by appending an empty string of preset size to the tail of the replica data.
Suppose there are three data storage nodes receiving the replica data: first data storage node 61, second data storage node 62, and third data storage node 63; that is, the requesting end needs to send the replica data to three nodes. The requesting end determines the request metadata corresponding to each of the three nodes: suppose the request metadata corresponding to first data storage node 61 is m1, that corresponding to second data storage node 62 is m2, and that corresponding to third data storage node 63 is m3.
The requesting end requests a block of first memory, writes target data 601 into it, and fills in the three pieces of request metadata according to the sending order of the three nodes. Supposing the sending order is first data storage node 61, second data storage node 62, third data storage node 63, the three pieces of request metadata are filled one after another at the tail of target data 601 in the arrangement m1, m2, m3.
For any given node, the requesting end can read from the first memory the replica data, the request metadata corresponding to that node, and the request metadata located before it, and send that node a write request.
As shown in Figure 6, requesting end 60 sends first data storage node 61 target data 601 and m1; sends the second node target data 601, m1, and m2; and sends the third node target data 601, m1, m2, and m3.
Upon receiving its write request, each node can thus read from the tail of the request the target request metadata that differentiates it from the other nodes.
Based on the number of pieces of request metadata and the data size of a piece of request metadata, each node can determine the target data within the write request and allocate a block of second memory to hold it temporarily.
Each node can then, based on the target request metadata, determine the replica data within the target data, generate the data metadata accordingly, and write the data metadata into the second memory.
Each node can then write the replica data and data metadata held in its second memory to its own storage device 64.
As the above description shows, the requesting end requests one block of first memory and fills in the request metadata for all three data storage nodes at once, which simplifies the operation and safeguards data sending efficiency. Moreover, the requesting end adds a reserved field to the replica data to form the target data, so that the second memory a data storage node passively allocates to hold the target data retains enough space for the data metadata to be written: no new disk-write memory has to be requested and no data copy has to be performed to accomplish data storage. The zero-copy scheme improves system performance and data storage efficiency, and since no extra memory is requested, memory waste is reduced.
Figure 7 is a schematic structural diagram of one embodiment of a data processing apparatus provided by an embodiment of the present application. The apparatus may be deployed at the requesting end and may include:
a data construction module 701, configured to add a reserved field to data to be stored, to obtain target data.
Optionally, the data construction module may specifically append a predetermined character string of preset size at the tail of the data to be stored as the reserved field, to obtain the target data.
a data sending module 702, configured to send the target data to a storage end;
wherein the memory position corresponding to the reserved field at the storage end is used for writing the data metadata of the data to be stored.
In this embodiment, a reserved field is added to the data to be stored, which is reconstructed into target data. Because the size of the target data equals the size of the data to be stored plus the size occupied by the reserved field, the second memory passively allocated by the storage end is sufficient for writing the data metadata; no additional disk-write memory need be requested and no data copy is performed, achieving zero copy, which improves system performance and data storage efficiency without wasting memory.
Because the data to be stored has to be held temporarily in memory before being sent, in some embodiments the data sending module may be specifically configured to: allocate a first memory whose size matches the data size of the target data; fill the target data into the first memory; and send the target data from the first memory to the storage end.
Request metadata may also be sent to the storage end, which can determine the data to be stored within the target data from the data length and/or data position in the request metadata.
The request metadata may further indicate, for example, to which storage device the storage end should store the data.
Because the data to be stored has to be held temporarily in memory before being sent, the request metadata may also be filled into that memory, so that the data to be stored and the request metadata can be sent together.
Accordingly, in some embodiments the data sending module may be specifically configured to:
calculate the total data size of the target data and the request metadata;
allocate a first memory whose size matches the total data size;
fill the target data and the request metadata into the first memory;
send the target data from the first memory to the storage end.
In one practical application, the technical solution of the present application may be applied in a distributed storage system, where, to avoid data loss caused by the failure of a data storage node, a multi-replica technique is usually adopted: several copies of the original data are made and each copy is stored on a different data storage node; each copy of the original data is replica data. The data to be stored may therefore be replica data.
Because a distributed storage system is a cluster formed by a number of data storage nodes, there is a need to send the data to be stored to multiple data storage nodes for storage; for example, when the data to be stored is replica data, the replica data has to be sent to multiple data storage nodes separately.
Accordingly, in some embodiments the storage end may include multiple data storage nodes.
As one option, the data sending module may be specifically configured to:
determine the request metadata of the data to be stored corresponding respectively to the multiple data storage nodes;
calculate the total data size of the target data and the multiple pieces of request metadata;
allocate a first memory whose size matches the total data size;
fill the target data and the multiple pieces of request metadata into the first memory;
send the target data and the respectively corresponding request metadata to the multiple data storage nodes.
As another option, the data sending module may be specifically configured to:
determine the request metadata of the data to be stored corresponding respectively to the multiple data storage nodes;
calculate the total data size of the target data and the multiple pieces of request metadata;
allocate a first memory whose size matches the total data size;
fill the target data into the first memory, and fill the multiple pieces of request metadata one after another at the tail of the target data;
for any given data storage node, send to that node the target data in the first memory, the request metadata corresponding to that node, and the request metadata located before the request metadata corresponding to that node; the data storage node reads its corresponding request metadata from the tail of the received data.
Optionally, the data sending module filling the multiple pieces of request metadata one after another at the tail of the target data may specifically be: filling, at the tail of the target data and in the sending order of the multiple data storage nodes, the request metadata respectively corresponding to the multiple data storage nodes;
and the data sending module sending, for any given data storage node, the target data in the first memory, the request metadata corresponding to that node, and the request metadata located before it may specifically be: for any given data storage node, sending to that node, from the first memory, the target data, the request metadata corresponding to that node's sending order, and the request metadata located before the request metadata corresponding to that node's sending order.
In one possible design, the data processing apparatus of the embodiment shown in Figure 7 may be implemented as a computing device deployed at the requesting end, which may be a request server. As shown in Figure 8, the computing device may include a storage component 801 and a processing component 802,
wherein the storage component 801 is configured to store one or more computer instructions, the one or more computer instructions being invoked and executed by the processing component;
and the processing component 802 is configured to:
add a reserved field to data to be stored, to obtain target data;
send the target data to a storage end;
wherein the memory position corresponding to the reserved field at the storage end is used for writing the data metadata of the data to be stored.
The processing component 802 may request a first memory in the storage component 801 to buffer the target data and/or the request metadata.
In addition, the processing component 802 may also be configured to execute the data processing method described in any of the embodiments of Figures 1 to 3 above.
The processing component 802 may include one or more processors that execute computer instructions to complete all or part of the steps of the above method. The processing component may of course also be implemented as one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components configured to execute the above method.
The storage component 801 is configured to store various types of data to support the operation of the computing device. The memory may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random-access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.
The computing device may of course also include other components, such as input/output interfaces and communication components.
In addition, an embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a computer, can implement the data processing method of any of the embodiments shown in Figures 1 to 3 above.
Figure 9 is a schematic structural diagram of a further embodiment of a data processing apparatus provided by an embodiment of the present application. The apparatus may be deployed at the storage end and may include:
a data acquisition module 901, configured to acquire target data, the target data being formed by a requesting end adding a reserved field to data to be stored;
a memory allocation module 902, configured to allocate a second memory to buffer the target data;
a data generation module 903, configured to generate data metadata of the data to be stored in the target data;
a data writing module 904, configured to write the data metadata to the memory position corresponding to the reserved field in the second memory.
The size occupied by the reserved field is greater than or equal to that of the data metadata, so enough space remains in the second memory for the data metadata. The storage end can write the data to be stored and the data metadata to the storage device without allocating an additional disk-write memory and without any data copy, achieving zero copy; system performance and data storage efficiency are thereby improved, and no memory is wasted.
In some embodiments, the positions of the reserved field and of the data to be stored within the target data may be agreed with the requesting end, for example that the last 32 bytes of the target data are the reserved field, so that the storage end, following the agreed rule, can read the data to be stored from the target data and determine the reserved field.
Accordingly, in some embodiments the reserved field may be located at the tail of the data to be stored;
and the data generation module may be specifically configured to determine, based on the preset size of the reserved field, the data to be stored and the reserved field within the target data, and to generate the data metadata of the data to be stored.
In addition, the requesting end may also send request metadata to the storage end; accordingly, in some embodiments the data acquisition module may be specifically configured to:
receive a write request sent by the requesting end, the write request including the target data and at least one piece of request metadata;
determine target request metadata;
and the data generation module may be specifically configured to: read the data to be stored from the sent data based on the target request metadata; and generate the data metadata based on the target request metadata and the data to be stored.
In some embodiments, the write request may include the target data that the requesting end requests to send from the first memory, the target request metadata, and the request metadata located before the target request metadata;
and the data acquisition module determining the target request metadata may specifically be: reading the target request metadata from the tail of the write request based on the data size of a piece of request metadata.
In one possible design, the data processing apparatus of the embodiment shown in Figure 9 may be implemented as a computing device, which may be the storage server in a conventional storage system or a data storage node in a distributed storage system, i.e. a physical server. As shown in Figure 10, the computing device may include a storage component 1001 and a processing component 1002;
wherein the storage component 1001 is configured to store one or more computer instructions, the one or more computer instructions being invoked and executed by the processing component 1002;
and the processing component 1002 is configured to:
acquire target data, the target data being formed by a requesting end adding a reserved field to data to be stored;
allocate, in the storage component 1001, a second memory to buffer the target data;
generate the data metadata of the data to be stored in the target data;
write the data metadata to the memory position corresponding to the reserved field in the second memory.
In addition, the processing component 1002 may also be configured to execute the data processing method described in any of the above embodiments.
The processing component 1002 may include one or more processors that execute computer instructions to complete all or part of the steps of the above method. The processing component may of course also be implemented as one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components configured to execute the above method.
The storage component 1001 is configured to store various types of data to support the operation of the computing device. The memory may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random-access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.
The computing device may of course also include other components, such as input/output interfaces and communication components.
In addition, an embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a computer, can implement the data processing method of the embodiment shown in Figure 5 above.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
The apparatus embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment, which a person of ordinary skill in the art can understand and implement without creative effort.
From the description of the above implementations, those skilled in the art can clearly understand that the implementations may be realized by means of software plus a necessary general-purpose hardware platform, or of course by hardware. Based on this understanding, the above technical solution, in essence or in the part contributing to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium such as ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the method described in the various embodiments or in certain parts of the embodiments.
Finally, it should be noted that the above embodiments merely illustrate, rather than limit, the technical solution of the present application. Although the present application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features equivalently replaced, without thereby departing in essence from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (14)

  1. A data processing method, comprising:
    adding a reserved field to data to be stored, to obtain target data;
    sending the target data to a storage end;
    wherein the memory position corresponding to the reserved field at the storage end is used for writing data metadata of the data to be stored, so that the data to be stored and the data metadata are written to a storage device as one whole.
  2. The method according to claim 1, wherein the storage end includes multiple data storage nodes;
    and sending the target data to the storage end comprises:
    determining request metadata of the data to be stored corresponding respectively to the multiple data storage nodes;
    calculating a total data size of the target data and the multiple pieces of request metadata;
    allocating a first memory whose size matches the total data size;
    filling the target data and the multiple pieces of request metadata into the first memory;
    sending the target data and the respectively corresponding request metadata to the multiple data storage nodes.
  3. The method according to claim 1, wherein the storage end includes multiple data storage nodes;
    and sending the target data to the storage end comprises:
    determining request metadata of the data to be stored corresponding respectively to the multiple data storage nodes;
    calculating a total data size of the target data and the multiple pieces of request metadata;
    allocating a first memory whose size matches the total data size;
    filling the target data into the first memory, and filling the multiple pieces of request metadata one after another at the tail of the target data;
    for any given data storage node, sending to that node the target data in the first memory, the request metadata corresponding to that node, and the request metadata located before the request metadata corresponding to that node; wherein the data storage node reads its corresponding request metadata from the tail of the received data.
  4. The method according to claim 3, wherein filling the multiple pieces of request metadata one after another at the tail of the target data comprises:
    filling, at the tail of the target data and in the sending order of the multiple data storage nodes, the request metadata respectively corresponding to the multiple data storage nodes;
    and wherein sending, for any given data storage node, from the first memory, the target data, the request metadata corresponding to that node, and the request metadata located before it comprises:
    for any given data storage node, sending to that node, from the first memory, the target data, the request metadata corresponding to that node's sending order, and the request metadata located before the request metadata corresponding to that node's sending order.
  5. The method according to claim 1, wherein adding a reserved field to the data to be stored, to obtain target data, comprises:
    appending a predetermined character string of preset size at the tail of the data to be stored as the reserved field, to obtain the target data.
  6. The method according to claim 1, wherein sending the target data to the storage end comprises:
    allocating a first memory whose size matches the data size of the target data;
    filling the target data into the first memory;
    sending the target data from the first memory to the storage end.
  7. A data processing method, comprising:
    acquiring target data, the target data being formed by a requesting end adding a reserved field to data to be stored;
    allocating a second memory to buffer the target data;
    generating data metadata of the data to be stored in the target data;
    writing the data metadata to the memory position corresponding to the reserved field in the second memory.
  8. The method according to claim 7, wherein the reserved field is located at the tail of the data to be stored;
    and generating the data metadata of the data to be stored in the target data comprises:
    determining, based on the preset size of the reserved field, the data to be stored in the target data;
    generating the data metadata of the data to be stored.
  9. The method according to claim 7, wherein acquiring the target data comprises:
    receiving a write request sent by the requesting end, the write request including the target data and at least one piece of request metadata;
    determining target request metadata;
    and generating the data metadata of the data to be stored in the target data comprises:
    reading the data to be stored from the sent data based on the target request metadata;
    generating the data metadata based on the target request metadata and the data to be stored.
  10. The method according to claim 9, wherein the write request includes the target data that the requesting end requests to send from a first memory, the target request metadata, and the request metadata located before the target request metadata;
    and determining the target request metadata comprises:
    reading the target request metadata from the tail of the write request based on the data size of a piece of request metadata.
  11. A data processing apparatus, comprising:
    a data construction module, configured to add a reserved field to data to be stored, to obtain target data;
    a data sending module, configured to send the target data to a storage end;
    wherein the memory position corresponding to the reserved field at the storage end is used for writing data metadata of the data to be stored, so that the data to be stored and the data metadata are written to a storage device as one whole.
  12. A data processing apparatus, comprising:
    a data acquisition module, configured to acquire target data, the target data being formed by a requesting end adding a reserved field to data to be stored;
    a memory allocation module, configured to allocate a second memory to buffer the target data;
    a data generation module, configured to generate data metadata of the data to be stored in the target data;
    a data writing module, configured to write the data metadata to the memory position corresponding to the reserved field in the second memory.
  13. A computing device, comprising a storage component and a processing component,
    wherein the storage component is configured to store one or more computer instructions, the one or more computer instructions being invoked and executed by the processing component;
    and the processing component is configured to:
    add a reserved field to data to be stored, to obtain target data;
    send the target data to a storage end;
    wherein the memory position corresponding to the reserved field at the storage end is used for writing data metadata of the data to be stored, so that the data to be stored and the data metadata are written to a storage device as one whole.
  14. A computing device, comprising a storage component and a processing component,
    wherein the storage component is configured to store one or more computer instructions, the one or more computer instructions being invoked and executed by the processing component;
    and the processing component is configured to:
    acquire target data, the target data being formed by a requesting end adding a reserved field to data to be stored;
    allocate, in the storage component, a second memory to buffer the target data;
    generate the data metadata of the data to be stored in the target data;
    write the data metadata to the memory position corresponding to the reserved field in the second memory.
PCT/CN2019/070580 2018-01-09 2019-01-07 Data processing method, apparatus and computing device WO2019137321A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2020537757A JP7378403B2 (ja) 2018-01-09 2019-01-07 Data processing method, apparatus, and computing device
EP19738475.3A EP3739450A4 (en) 2018-01-09 2019-01-07 DATA PROCESSING PROCESS AND APPARATUS, AND COMPUTER DEVICE
US16/924,028 US11354050B2 (en) 2018-01-09 2020-07-08 Data processing method, apparatus, and computing device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810020121.0A CN110018897B (zh) 2018-01-09 2018-01-09 Data processing method, apparatus and computing device
CN201810020121.0 2018-01-09

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/924,028 Continuation US11354050B2 (en) 2018-01-09 2020-07-08 Data processing method, apparatus, and computing device

Publications (1)

Publication Number Publication Date
WO2019137321A1

Family

ID=67187831

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/070580 WO2019137321A1 (zh) Data processing method, apparatus and computing device

Country Status (5)

Country Link
US (1) US11354050B2 (zh)
EP (1) EP3739450A4 (zh)
JP (1) JP7378403B2 (zh)
CN (1) CN110018897B (zh)
WO (1) WO2019137321A1 (zh)





Also Published As

Publication number Publication date
JP7378403B2 (ja) 2023-11-13
EP3739450A4 (en) 2021-10-27
US11354050B2 (en) 2022-06-07
EP3739450A1 (en) 2020-11-18
CN110018897A (zh) 2019-07-16
JP2021510222A (ja) 2021-04-15
CN110018897B (zh) 2023-05-26
US20200341661A1 (en) 2020-10-29

