WO2017157158A1 - Method and apparatus for writing data, and computer storage medium - Google Patents

Method and apparatus for writing data, and computer storage medium

Info

Publication number
WO2017157158A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
address
write
storage
target
Prior art date
Application number
PCT/CN2017/075058
Other languages
English (en)
French (fr)
Inventor
包俊
高洪
韩银俊
郭斌
陈正华
陈洪锋
吴宏松
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2017157158A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/065 Replication mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Definitions

  • The present invention relates to the field of data storage technologies, and in particular to a method and an apparatus for writing data, and a computer storage medium.
  • Snapshot technology is currently commonly used to back up data.
  • SNIA: Storage Networking Industry Association
  • COFW: Copy On First Write
  • ROW: Redirect On Write
  • In a storage system, the chunk is the smallest unit of data storage.
  • For a write operation, a new space of one chunk in size is usually allocated for the write.
  • When the write data in the write operation is smaller than the chunk size, the data of the old chunk in the source volume is read first, merged with the write data, and then written into the new chunk space. Because this method must read and merge data before writing it to the chunk, the write operation is cumbersome. When write operations are frequent, the large number of read and merge operations reduces write performance.
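  • The cost described above can be illustrated with a minimal Python sketch of the conventional chunk-sized write. The function name `conventional_row_write` and the 8-byte chunk in the usage example are hypothetical and chosen only for illustration; the patent does not give code.

```python
def conventional_row_write(old_chunk: bytes, offset: int, data: bytes) -> bytes:
    """Build the chunk-sized buffer that the conventional approach writes.

    Even a one-byte write forces a read of the whole old chunk and a merge,
    which is the overhead the background section describes.
    """
    if offset < 0 or offset + len(data) > len(old_chunk):
        raise ValueError("write does not fit inside the chunk")
    merged = bytearray(old_chunk)               # read the old chunk
    merged[offset:offset + len(data)] = data    # merge the new data into it
    return bytes(merged)                        # chunk-sized content for the new space


new_chunk = conventional_row_write(b"abcdefgh", 3, b"m")
```

  • In this sketch a one-byte write still produces an eight-byte buffer, so the read and merge cost scales with the chunk size rather than with the size of the write data.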
  • Embodiments of the present invention provide a method and an apparatus for writing data, and a computer storage medium, which are intended to improve the performance of write operations during redirect-on-write snapshotting.
  • An embodiment of the present invention provides an apparatus for writing data, where the apparatus for writing data includes:
  • a first obtaining module configured to obtain the size information of the write data included in a write operation command when the write operation command for the source volume is received before a snapshot generation command is obtained;
  • a write operation module configured to apply for, from a preset storage area, a storage space matching the size information according to the size information of the write data, and to perform the write operation in the applied storage space.
  • The write operation command includes the write data, and the write data is to be written to a first target write address of the source volume;
  • the write operation module includes:
  • a storage space requesting unit configured to apply for a first storage space from the storage area according to the size information of the write data, where the size of the first storage space equals the sum of the size of the write data and the data size of the first target write address;
  • a storage unit configured to store the write data and the first target write address in the first storage space;
  • an update unit configured to update the correspondence between the first target write address and the first storage address of the first storage space into a mapping table, and to record the update time of the correspondence in the mapping table, where the mapping table is configured to save the correspondence between an address in the source volume and a storage address of the storage area, together with the update time of that correspondence.
  • the device for writing data further includes:
  • a second obtaining module configured to obtain the target read address in a read operation command when the read operation command for the source volume is received before the snapshot generation command is obtained;
  • a first determining module configured to, if the mapping table includes the target read address, determine whether the update time corresponding to the target read address in the mapping table is after the last snapshot generated based on the target read address;
  • a data merge module configured to, when the mapping table includes the target read address and the update time corresponding to the target read address in the mapping table is after the last snapshot generated based on the target read address, search the mapping table for a second storage address corresponding to the target read address, obtain the data at the second storage address in the storage area, merge that data with acquired first data into second data, apply for a second storage space in the storage area to save the second data, and update the correspondence between the target read address and the storage address of the second storage space into the mapping table, where the first data is the data in the storage block corresponding to the target read address on the source volume;
  • a read operation module configured to read the data in the storage block corresponding to the target read address on the source volume when the mapping table does not include the target read address, or when the mapping table includes the target read address but the corresponding update time is not after the last snapshot generated based on the target read address.
  • the device for writing data further includes:
  • a receiving module configured to, if a snapshot generation command for the source volume is received during the redirect-on-write snapshot process, determine according to the mapping table whether the source volume had a write operation before the snapshot generation command was received;
  • a second determining module configured to, when the source volume had a write operation before the snapshot generation command was received, obtain the address of the write operation on the source volume as a third target write address, search the mapping table for the address in the storage area corresponding to the third target write address, acquire the data at that address as third data, and acquire the size of the third data;
  • a snapshot generation module configured to generate a snapshot according to the third data, the size of the third data, and the third target write address.
  • the snapshot generating module includes:
  • a determining unit configured to determine whether the size of the third data is equal to the size of the third storage block corresponding to the third target write address in the source volume;
  • a write operation unit configured to write the third data back to the third storage block of the source volume when the size of the third data is equal to the size of the third storage block;
  • a snapshot generating unit configured to, when the size of the third data is not equal to the size of the third storage block, acquire the data in the third storage block, merge the third data and the data in the third storage block into fourth data, and write the fourth data back to the third storage block of the source volume.
  • The read operation module, the receiving module, the second determining module, the snapshot generating module, the determining unit, the write operation unit, and the snapshot generating unit may be implemented by a central processing unit (CPU), a digital signal processor (DSP), or a field-programmable gate array (FPGA) when performing processing.
  • CPU: Central Processing Unit
  • DSP: Digital Signal Processor
  • FPGA: Field-Programmable Gate Array
  • the embodiment of the invention further provides a method for writing data, and the method for writing data includes:
  • When a write operation command for the source volume is received before the snapshot generation command is obtained, the size information of the write data included in the write operation command is acquired.
  • The write operation command further includes the write data, and the write data is to be written to a first target write address of the source volume;
  • applying for, from the preset storage area, a storage space that matches the size information according to the size information of the write data, and performing the write operation in the applied storage space, includes:
  • The mapping table is configured to save the correspondence between an address in the source volume and a storage address of the storage area, together with the update time of that correspondence.
  • the method for writing data further includes:
  • When a read operation command for the source volume is received before the snapshot generation command is obtained, the target read address in the read operation command is acquired;
  • if the mapping table includes the target read address, it is determined whether the update time corresponding to the target read address in the mapping table is after the last snapshot generated based on the target read address;
  • when the mapping table includes the target read address and the corresponding update time is after the last snapshot generated based on the target read address, the mapping table is searched for a second storage address corresponding to the target read address, the data at the second storage address in the storage area is obtained and merged with acquired first data into second data, a second storage space is applied for in the storage area to save the second data, and the correspondence between the target read address and the storage address of the second storage space is updated into the mapping table, where the first data is the data in the storage block corresponding to the target read address on the source volume;
  • when the mapping table does not include the target read address, or includes it but the corresponding update time is not after the last snapshot generated based on the target read address, the data in the storage block corresponding to the target read address on the source volume is read.
  • the method for writing data further includes:
  • If a snapshot generation command for the source volume is received during the redirect-on-write snapshot process, it is determined according to the mapping table whether the source volume had a write operation before the snapshot generation command was received.
  • When the source volume had such a write operation, the address of the write operation on the source volume is obtained as a third target write address, the mapping table is searched for the address in the storage area corresponding to the third target write address, the data at that address is obtained as third data, and the size of the third data is acquired;
  • a snapshot is generated according to the third data, the size of the third data, and the third target write address.
  • Generating a snapshot according to the third data, the size of the third data, and the third target write address includes:
  • the embodiment of the invention further provides a computer storage medium, wherein the computer storage medium stores computer executable instructions, and the computer executable instructions are configured to execute the above method for writing data.
  • In the solution for writing data, during the redirect-on-write snapshot process, when a write operation command for the source volume is received before the snapshot generation command is obtained, the size information of the write data included in the write operation command is acquired; a storage space matching the size information is then applied for from the preset storage area according to the size information of the write data, and the write operation is performed in the applied storage space.
  • FIG. 1 is a schematic diagram of functional modules of a first embodiment of an apparatus for writing data according to the present invention
  • FIG. 2 is a schematic diagram of a refinement function module of the write operation module 20 in the embodiment shown in FIG. 1 according to the present invention
  • FIG. 3 is a schematic diagram of functional modules of a third embodiment of an apparatus for writing data according to the present invention.
  • FIG. 4 is a schematic diagram of functional modules of a fourth embodiment of an apparatus for writing data according to the present invention.
  • FIG. 5 is a schematic diagram of functional modules of the snapshot generation module 90 in the embodiment shown in FIG. 4;
  • FIG. 6 is a schematic flow chart of a first embodiment of a method for writing data according to the present invention.
  • FIG. 7 is a schematic flowchart of the refinement of applying for a storage space from a preset storage area according to the size of the write data in the embodiment shown in FIG. 6;
  • FIG. 8 is a schematic flowchart diagram of a third embodiment of a method for writing data according to the present invention.
  • FIG. 9 is a schematic flow chart of a fourth embodiment of a method for writing data according to the present invention.
  • FIG. 10 is a schematic diagram showing a refinement process of generating a snapshot according to the third data, the size of the third data, and the third target write address in step S100 in the embodiment shown in FIG.
  • the apparatus for writing data includes:
  • the first obtaining module 10 is configured to obtain the size of the write data included in the write operation command when receiving a write operation command to the source volume before the snapshot generation command is obtained.
  • the write operation module 20 is configured to apply a storage space matching the size information from the preset storage area according to the size information of the write data, and perform a write operation in the applied storage space.
  • The apparatus for writing data provided by the present invention is mainly applied to distributed storage, and is an apparatus for writing data when redirect-on-write snapshot technology is used.
  • In a storage system, the chunk is the smallest unit of data storage.
  • Each storage system can divide its storage space into a number of chunks of the same size as needed and store data in those chunks. For example, the storage space can be divided into a number of blocks, each of 4 megabytes or 112 bytes.
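  • As a rough illustration of equal-sized chunks, the following Python sketch maps a byte address on a volume to a chunk index and an offset within that chunk. The byte-addressed layout and the name `locate` are assumptions made for this example; the patent only states that the storage space is divided into chunks of the same size.

```python
from typing import Tuple

FOUR_MEGABYTES = 4 * 1024 * 1024  # one of the chunk sizes mentioned in the text


def locate(byte_address: int, chunk_size: int) -> Tuple[int, int]:
    """Return (chunk index, offset inside the chunk) for a byte address.

    Assumes chunks are laid out back to back starting at address 0.
    """
    if chunk_size <= 0 or byte_address < 0:
        raise ValueError("chunk size must be positive and address non-negative")
    return divmod(byte_address, chunk_size)


chunk_index, offset = locate(FOUR_MEGABYTES + 5, FOUR_MEGABYTES)
```

  • With 4-megabyte chunks, the address five bytes past the first chunk boundary falls in chunk 1 at offset 5.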
  • A write operation command for the source volume may be received. For example, during the redirect-on-write snapshot process, snapshots are taken at times A, B, and C. After the snapshot at time A is completed and before the next snapshot generation command is received, a write operation command or a read operation command may be received. A write or read operation that occurs during the redirect-on-write snapshot process is specifically a write redirection operation or a read redirection operation.
  • The source volume in the first obtaining module 10 refers to the data storage space to be backed up during the snapshot process; snapshots may be taken of several blocks as needed.
  • The above write operation writes data to the source volume.
  • The size information of the write data included in the write operation command is acquired, where the write data is the data to be written to the source volume. Data is usually stored in memory as bytes, so the size information of the write data can be obtained by counting the bytes of the write data.
  • After the size information of the write data is obtained, a storage space is applied for in the preset storage area according to the size of the write data rather than according to the size of a chunk, and the data is then written in the newly applied storage space.
  • The preset storage area is storage space other than the source volume, and a new space may be selected for storage as required.
  • When a write operation command for the source volume is received before the snapshot generation command is obtained, the size information of the write data included in the write operation command is obtained; a storage space matching the size information is applied for from the preset storage area according to the size information of the write data, and the write operation is performed in the applied storage space.
  • The write operation command includes the write data, and the write data is to be written to the first target write address of the source volume;
  • the refinement function module of the write operation module 20 includes:
  • a storage space application unit 21 configured to apply for a first storage space from the storage area according to the size information of the write data, where the size of the first storage space equals the sum of the size of the write data and the data size of the first target write address;
  • a storage unit 22 configured to store the write data and the first target write address in the first storage space;
  • an updating unit 23 configured to update the correspondence between the first target write address and the first storage address of the first storage space into a mapping table, and to record the update time of the correspondence in the mapping table, where the mapping table is used to save the correspondence between an address in the source volume and a storage address of the storage area, together with the update time of that correspondence.
  • The write data is the data to be written to the source volume in the write operation command.
  • The first target write address is the address to which the write data is to be written. It may be the address of the chunk on the source volume to which the write data is to be written, or that chunk address together with the offset address of the data within the chunk.
  • The first storage space is used to store the write data and the address on the source volume to be written.
  • The size of the first storage space equals the sum of the size of the write data and the data size of the first target write address.
  • The correspondence between the first target write address of the source volume and the first storage address of the first storage space is updated into the mapping table, where the first storage address is the address of the first storage space, and the update time is recorded in the mapping table. It can be understood that the update time need not be recorded in the mapping table; it may be recorded elsewhere, as long as the time can be determined.
  • The new write data stored in the preset storage area can be found through the mapping table, and the chunk address on the source volume corresponding to the newly written data can also be found.
  • For example, the first chunk stores the data abcd. When a command to write m at position d of the first chunk is received before the snapshot generation command is received, a new storage space is applied for in the preset storage area.
  • The size of the storage space requested at this time equals the sum of the size of the data m and the length of the address of d on the source volume.
  • The correspondence between the address of d and the address of m is then updated in the mapping table, so that the newly written data m can be found when d is read.
  • The size of the storage space requested in the preset storage area equals the sum of the size of the write data and the data size of the write address in the write operation, and the data of the original chunk does not need to be read, which improves write efficiency and saves storage space.
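  • The write path of this embodiment can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class `RedirectStore`, the 8-byte `ADDRESS_SIZE`, the bump allocator, and the `(chunk index, offset)` address form are all hypothetical.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Address = Tuple[int, int]   # assumed form: (chunk index, offset inside the chunk)
ADDRESS_SIZE = 8            # assumed number of bytes needed to record a target address


@dataclass
class MappingEntry:
    storage_address: int    # where the redirected data lives in the storage area
    update_time: float      # when the correspondence was last updated


@dataclass
class RedirectStore:
    area: Dict[int, Tuple[Address, bytes]] = field(default_factory=dict)
    mapping: Dict[Address, MappingEntry] = field(default_factory=dict)
    allocated: int = 0      # total bytes requested from the storage area
    next_address: int = 0   # simple bump allocator, an assumption of this sketch

    def write(self, target: Address, data: bytes, now: Optional[float] = None) -> int:
        # Request exactly size(write data) + size(target address), not a whole chunk.
        size = len(data) + ADDRESS_SIZE
        storage_address = self.next_address
        self.next_address += size
        self.allocated += size
        # Store the write data together with its target write address.
        self.area[storage_address] = (target, data)
        # Record the correspondence and its update time in the mapping table.
        stamp = time.time() if now is None else now
        self.mapping[target] = MappingEntry(storage_address, stamp)
        return storage_address


store = RedirectStore()
m_address = store.write((0, 3), b"m", now=10.0)  # write m over d in chunk "abcd"
```

  • Note that `write` never touches the source volume, which mirrors the point that the original chunk does not have to be read during a write.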
  • Based on the second embodiment of the apparatus for writing data according to the present invention, in the third embodiment of the apparatus for writing data according to the present invention, the apparatus for writing data further includes:
  • a second obtaining module 30 configured to obtain the target read address in a read operation command when the read operation command for the source volume is received before the snapshot generation command is obtained;
  • a first determining module 40 configured to, if the mapping table includes the target read address, determine whether the update time corresponding to the target read address in the mapping table is after the last snapshot generated based on the target read address;
  • a data merge module 50 configured to, when the mapping table includes the target read address and the corresponding update time is after the last snapshot generated based on the target read address, search the mapping table for the second storage address corresponding to the target read address, obtain the data at the second storage address in the storage area, merge that data with the acquired first data into second data, apply for a second storage space in the storage area to save the second data, and update the correspondence between the target read address and the storage address of the second storage space into the mapping table, where the first data is the data in the storage block corresponding to the target read address on the source volume;
  • a read operation module 60 configured to read the data in the storage block corresponding to the target read address on the source volume when the mapping table does not include the target read address, or when the mapping table includes the target read address but the corresponding update time is not after the last snapshot generated based on the target read address.
  • A read operation command for the source volume may be received. The read operation command may read the whole source volume or only some of its chunks.
  • the write operation command may be received before the read operation command is received, or the write operation command may not be received, that is, the write operation may be performed before the read operation command is received, or the write operation may not be performed.
  • the target read address in the second acquisition module 30 is the address of the data to be read in the read operation command.
  • When a read operation is performed, the target read address in the read operation command is obtained first, and the mapping table is looked up according to that address to see whether it contains a record of the target read address. If a record exists, the target read address has been recorded as a target write address. Then, according to the update time in that record, it is determined whether the update time corresponding to the target read address in the mapping table is after the last snapshot generated based on the target read address. The purpose is to determine whether a write operation occurred at the target read address between the generation of the last snapshot and the current read operation.
  • When the target read address is not recorded in the mapping table, the target read address has not been written and its data has not changed, so the data in the storage block corresponding to the target read address on the source volume can be read directly. When the block or source volume to be read has been written, but the update time in the record shows that the write happened before the last snapshot was generated, no write operation occurred between the last snapshot and the current read operation. The data on the block or source volume to be read has not changed since the last snapshot was generated, so the data in the storage block corresponding to the target read address on the source volume can again be read directly.
  • When the mapping table contains a record of the target read address and the corresponding update time is after the last snapshot generated based on the target read address, the target read address has been written since that snapshot was generated, so the data stored in the storage block corresponding to the target read address has been updated. Because a write operation writes only the data that actually needs to be written, reading an entire block or the source volume requires merging the data on the source volume with the data in the write space.
  • The second storage address is the address, in the space newly applied for in the preset storage area, at which the write data was stored when the write operation on the second target write address was performed. The correspondence between the second storage address and the target read address can be obtained through the mapping table.
  • The first data is the data in the storage block corresponding to the target read address on the source volume. After the first data and the data at the second storage address are acquired, they are merged; specifically, the data at the second storage address is merged into the corresponding position of the first data, and the merged data is the second data.
  • The second storage space is a storage space newly applied for in the preset storage area to store the second data.
  • The second data is the latest data after the write operation. The second storage space saves the second data and the address on the source volume to which the second data corresponds.
  • When data is read, the second data in the second storage space is what is read.
  • The correspondence between the second target write address and the address of the second storage space is updated into the mapping table, and the original space is released, that is, the space that was applied for to store the write data and its address when the data was originally written.
  • For example, snapshots are generated in the order A, B, C. After the snapshot at time A, if a read operation command for the source volume is obtained and its read address is the third chunk, the mapping table is looked up for a record of the third chunk to determine whether that location was written between the last snapshot generation and receipt of the read operation command. When there is no record of the third chunk, or the update time of the third chunk's record is not after time A, the data in the third chunk is read directly.
  • When a record of the third chunk is obtained from the mapping table and its update time is after time A, a write operation was performed after time A. The address in the storage area corresponding to the third chunk is found by searching the mapping table, and the data at that address is found to be m; the modified location is the fifth offset address in the third chunk, that is, the e in abcde is changed to m.
  • m is then merged with abcde; specifically, m is written over the e of the third chunk to generate the new data abcdm. A new space is applied for in the preset storage area to save abcdm and the address of the corresponding third chunk, the mapping table is updated with the correspondence between the third chunk's address on the source volume and the address of the new space storing abcdm in the storage area, and the space for storing e is released.
  • When data is read, the latest data abcdm is read.
  • Data is read after determining whether a write operation occurred between the most recently generated snapshot and the received read command, so that the data read is the latest real-time data.
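  • The read decision and merge described above can be sketched in Python. This is a simplified illustration under stated assumptions: the storage area is folded into the mapping entry (the entry holds the redirected bytes directly), chunks are keyed by index, and the merged entry keeps the original update time. None of these details are specified by the patent, and the names `Entry` and `read_chunk` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Entry:
    offset: int         # offset of the redirected data inside the chunk
    data: bytes         # redirected data held in the storage area
    update_time: float  # when the correspondence was last updated


def read_chunk(chunk_index: int,
               source: Dict[int, bytes],
               mapping: Dict[int, Entry],
               last_snapshot_time: float) -> bytes:
    first_data = source[chunk_index]        # chunk content on the source volume
    entry = mapping.get(chunk_index)
    # No record, or no write since the last snapshot: read the source directly.
    if entry is None or entry.update_time <= last_snapshot_time:
        return first_data
    # Otherwise merge the redirected data into the chunk ("second data").
    second_data = bytearray(first_data)
    second_data[entry.offset:entry.offset + len(entry.data)] = entry.data
    # Save the merged chunk in a new entry; replacing the old entry stands in
    # for releasing the original space.
    mapping[chunk_index] = Entry(0, bytes(second_data), entry.update_time)
    return bytes(second_data)


source_volume = {2: b"abcde"}                  # third chunk holds abcde
mapping_table = {2: Entry(4, b"m", 20.0)}      # e was redirected to m after time A
latest = read_chunk(2, source_volume, mapping_table, last_snapshot_time=10.0)
```

  • The source volume itself is left unchanged by the read; only the mapping entry is replaced with the merged chunk.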
  • the apparatus for writing data further includes:
  • a receiving module 70 configured to, if a snapshot generation command for the source volume is received during the redirect-on-write snapshot process, determine according to the mapping table whether the source volume had a write operation before the snapshot generation command was received;
  • a second determining module 80 configured to, when the source volume had a write operation before the snapshot generation command was received, obtain the address of the write operation on the source volume as a third target write address, search the mapping table for the address in the storage area corresponding to the third target write address, acquire the data at that address as third data, and acquire the size of the third data at the same time;
  • the snapshot generation module 90 is configured to generate a snapshot according to the third data, the size of the third data, and the third target write address.
  • The snapshot to be generated may be the first snapshot, or older snapshots may already have been generated.
  • Before the snapshot generation command is received, write or read operations may have been performed on the source volume. If a redirected write has occurred, the data of the source volume has been modified by the written data, so in order to snapshot the latest data it is first determined whether the source volume had a write operation before the snapshot generation command was received. When there was no write operation, the content of the source volume at that moment is already the latest data.
  • the third target write address in the second judging module is an address at which data is to be written when the write operation is performed on the source volume.
  • the third target write address and the corresponding address in the storage area can be queried through the mapping table, and the data in the corresponding address in the storage area is obtained as the third data.
  • If no read operation followed the write, the corresponding address holds the write data that was saved in the storage area when the data was written.
  • If a read operation followed the write, the corresponding address holds the latest data produced at read time by merging the original write data with the original chunk. How the data is merged is explained in the second embodiment of the device for writing data and is not repeated here.
  • Whether the source volume had a write operation before the snapshot generation command was received can be determined by looking up the mapping table.
  • the size of the third data may be obtained by acquiring the number of bytes of the third data, and then generating a snapshot according to the third data, the size of the third data, and the write address of the third data.
  • the snapshot generation module 90 includes:
  • the determining unit 91 is configured to determine whether the size of the third data is equal to the size of the third storage block corresponding to the third target write address in the source volume;
  • the write operation unit 92 is configured to write the third data back to the third storage block of the source volume when the size of the third data is equal to the size of the third storage block;
  • the snapshot generating unit 93 is configured to, when the size of the third data is not equal to the size of the third storage block, acquire the data in the third storage block, merge the third data and the data in the third storage block into fourth data, and write the fourth data back to the third storage block of the source volume.
  • The third storage block is the block on the source volume that corresponds to the third target write address, that is, the block targeted when the write operation was performed. For example, if a write operation is received that writes c to the second chunk of the source volume, the third target write address is the address of the second chunk, the data written is c, and the third storage block corresponding to the third target write address is the second chunk.
  • When the size of the third data equals the size of the third storage block, the write data in the redirected storage space is already the latest data, and the third data is written back to the third storage block of the source volume, that is, to the chunk the write operation targeted.
  • When the size of the third data is not equal to the size of the third storage block, a write operation has been performed but it rewrote only part of the data in the chunk and no read operation has followed, while the snapshot must be generated from the latest data.
  • In that case the data in the third storage block as it was at the time of the write operation is acquired.
  • The data in the third storage block is the data from before the write operation, and the third data is merged with it to generate the fourth data.
  • The fourth data is the latest data for the third storage block of the source volume, and it is written back to that block.
  • For example, the data stored in the third chunk on the source volume is abcde.
  • A write operation changes e to m, and no read operation command is received afterwards.
  • When the snapshot generation command arrives, a lookup in the mapping table shows that the source volume has been written.
  • The data in the storage space of the write operation is obtained, that is, m, and the size of m is compared with the size of the chunk. Since m is smaller than the chunk, the data of the third chunk on the source volume is obtained and m is merged with the abcde in the third chunk to generate new data.
  • Specifically, m is written to the corresponding location of the third chunk, producing the new data abcdm.
  • abcdm is the latest data of the third chunk of the source volume; it is written to the third chunk on the source volume, and the data in the third chunk is then the content of the generated snapshot.
  • In this embodiment, the snapshot is generated according to the size of the data of the write operation, which ensures that the data at the moment the snapshot is generated is the latest real-time data.
  • The efficiency of snapshot generation is thereby improved.
  • the method of writing data includes the steps of:
  • Step S10: during the redirect-on-write snapshot process and before a snapshot generation command is obtained, when a write operation command for the source volume is received, the size information of the write data contained in the write operation command is acquired.
  • Step S20: a storage space matching the size information is requested from the preset storage area according to the size information of the write data, and the write operation is performed in the requested storage space.
  • The method for writing data provided by the present invention is mainly applied in distributed storage and is a method for writing data when a redirect-on-write snapshot technique is used.
  • In distributed storage scenarios, data is usually stored with the chunk as the smallest unit.
  • Each storage system can divide its storage space into a number of equal-sized chunks as needed and store the data in the chunks; for example, the storage space can be divided into a number of blocks, each of which may be 4 megabytes or 112 bytes in size.
  • In this embodiment, during the redirect-on-write snapshot process, a write operation command for the source volume may be received either when old snapshots have been generated but no new snapshot generation command has arrived yet, or before the generation command for the first snapshot has arrived.
  • For example, snapshots are taken at times A, B, and C during the redirect-on-write snapshot process. After the snapshot at time A is completed and before the snapshot at time B is generated, that is, before the snapshot generation command is received, a write operation command or a read operation command may be received.
  • A write or read operation that occurs during the redirect-on-write snapshot process is, more precisely, a redirected write or a redirected read.
  • The source volume is the data storage space to be backed up during the snapshot process; snapshots can be taken of any number of blocks as needed.
  • The write operation writes data to the source volume.
  • When a write operation command for the source volume is received, the size information of the write data contained in the command is acquired. The write data is the data to be written to the source volume.
  • Data is usually stored in memory as bytes, so the size information of the write data can be obtained from its number of bytes.
  • After the size information of the write data is obtained, storage space is requested in the preset storage area according to the size of the write data rather than the size of a chunk, and the write operation is then performed in the newly requested storage space.
  • The preset storage area is storage space other than the source volume, and a new space within it can be selected as needed.
  • In this embodiment, during the redirect-on-write snapshot process and before a snapshot generation command is obtained, the size information of the write data contained in a write operation command is acquired when the command for the source volume is received; a storage space matching that size information is requested from the preset storage area, and the write operation is performed in the requested storage space. Because the requested space matches the size of the write data and receives the write data together with the address to be written on the source volume, the data of the original chunk does not need to be read, and data is written efficiently.
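Steps S10 and S20 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `StorageArea` class, its append-only layout, and the function name `redirect_write` are all assumptions introduced here to stand in for the preset storage area.

```python
from typing import Tuple


class StorageArea:
    """Hypothetical stand-in for the preset storage area (append-only)."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def allocate(self, size: int) -> int:
        """Reserve `size` bytes and return the start address of the space."""
        addr = len(self._buf)
        self._buf.extend(bytes(size))
        return addr

    def write(self, addr: int, data: bytes) -> None:
        self._buf[addr:addr + len(data)] = data

    def read(self, addr: int, size: int) -> bytes:
        return bytes(self._buf[addr:addr + size])


def redirect_write(area: StorageArea, write_data: bytes) -> Tuple[int, int]:
    # Step S10: the size information is the byte count of the write data.
    size = len(write_data)
    # Step S20: request a space matching that size (not a whole chunk),
    # then perform the write in the requested space.
    addr = area.allocate(size)
    area.write(addr, write_data)
    return addr, size
```

Note that the source chunk is never read on this path; only as many bytes as the write carries are allocated.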
  • the write operation command further includes the write data and a first target write address at which the write data is to be written to the source volume;
  • the refinement of the foregoing step S20 then includes:
  • Step S21: requesting a first storage space from the storage area according to the size information of the write data, where the size of the first storage space equals the sum of the size of the write data and the data size of the first target write address;
  • Step S22: storing the write data and the first target write address in the first storage space;
  • Step S23: updating the correspondence between the first target write address and the first storage address of the first storage space into the mapping table, and recording the update time of the correspondence in the mapping table;
  • The mapping table holds the correspondences between addresses in the source volume and storage addresses of the storage area, together with the update time of each correspondence.
  • The write data is the data that the write operation command is to write to the source volume.
  • The first target write address is the address at which the write data is to be written. It may be the address of the chunk on the source volume to which the write data will be written, or that chunk address together with the offset of the data within the chunk.
  • The first storage space stores the written data and the address on the source volume at which the data is to be written; its size equals the sum of the size of the write data and the data size of the first target write address.
  • After the data is written, the correspondence between the first target write address of the source volume and the first storage address of the first storage space is updated into the mapping table, where the first storage address is the address of the first storage space, and the update time is recorded in the mapping table. The update time need not be recorded in the mapping table; it may be recorded elsewhere, as long as it can be used to determine when the write operation was performed.
  • Once the mapping table is updated, the newly written data stored in the preset storage area can be found through it, as can the chunk address on the source volume to which the newly written data corresponds.
  • For example, the first chunk of the source volume stores the data abcd. When a command to write m at position d of the first chunk is received before a snapshot generation command is received, a new storage space is requested in the preset storage area to hold the data m to be written and the address of the fourth data position of the first block of the source volume, that is, the address of d.
  • The size of the storage space requested at this point equals the sum of the size of the data m and the length of the address of d on the source volume.
  • The mapping table is then updated with the address of d and the address of m, so that the newly written data m can be found when d is read.
  • In this embodiment, the storage space requested in the preset storage area equals the sum of the size of the write data and the data size of the write address in the write operation, and the data of the original chunk does not need to be read, which improves write efficiency and saves storage space.
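Steps S21 to S23 might look like the sketch below. The 8-byte address encoding (`ADDR_SIZE`), the `MapEntry` record, the dictionary used as the mapping table, and the `bytearray` used as the storage area are illustrative assumptions; the text only requires that the space equal the write data size plus the address size and that the mapping table record the correspondence and its update time.

```python
import time
from dataclasses import dataclass

ADDR_SIZE = 8  # assumed width in bytes of an encoded source-volume address


@dataclass
class MapEntry:
    storage_addr: int  # first storage address in the preset storage area
    size: int          # size of the stored write data in bytes
    updated_at: float  # update time of the correspondence


def write_redirect(storage: bytearray, mapping: dict,
                   chunk: int, offset: int, data: bytes) -> MapEntry:
    # First target write address: chunk index plus offset inside the chunk.
    target = (chunk, offset)
    # S21: request a space equal to the write data size plus the address size.
    storage_addr = len(storage)
    storage.extend(bytes(len(data) + ADDR_SIZE))
    # S22: store the target write address, then the write data, in that space.
    encoded = chunk.to_bytes(4, "big") + offset.to_bytes(4, "big")
    storage[storage_addr:storage_addr + ADDR_SIZE] = encoded
    data_start = storage_addr + ADDR_SIZE
    storage[data_start:data_start + len(data)] = data
    # S23: record the correspondence and its update time in the mapping table.
    entry = MapEntry(storage_addr, len(data), time.time())
    mapping[target] = entry
    return entry
```

For the abcd example, writing m at the fourth position of the first chunk would be `write_redirect(storage, mapping, 0, 3, b"m")`, which consumes 9 bytes under the assumed encoding and never touches the chunk holding abcd.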
  • Based on the second embodiment of the method for writing data according to the present invention, in the third embodiment of the method the method for writing data further includes the following steps:
  • Step S30: during the redirect-on-write snapshot process and before a snapshot generation command is obtained, when a read operation command for the source volume is received, the target read address in the read operation command is obtained.
  • Step S40: determining whether the target read address is included in the mapping table; if yes, executing step S50; otherwise, executing step S70;
  • Step S50: determining whether the update time corresponding to the target read address in the mapping table is after the last snapshot generated based on the target read address; if yes, executing step S60; otherwise, executing step S70;
  • Step S60: looking up in the mapping table a second storage address corresponding to the target read address, acquiring the data at the second storage address in the storage area, merging the data at the second storage address with the acquired first data into second data, requesting a second storage space in the storage area to save the second data, and updating the correspondence between the target read address and the storage address of the second storage space into the mapping table, where the first data is the data in the storage block corresponding to the target read address on the source volume;
  • Step S70: reading the data in the storage block corresponding to the target read address on the source volume.
  • In this embodiment, during the redirect-on-write snapshot process, a read operation command for the source volume may be received; the read operation command may read the whole source volume or only some of its chunks.
  • A write operation command may or may not have been received before the read operation command; that is, a write operation may or may not have been performed before the read operation command is received.
  • The target read address is the address of the data to be read in the read operation command.
  • The target read address is first obtained from the read operation command and looked up in the mapping table. If a record exists, the target read address has previously served as a target write address and a write operation is recorded for it. The update time of the record is then used to determine whether it falls after the last snapshot generated based on the target read address; the purpose is to determine whether the write occurred between the last snapshot of the target read address and the current read operation.
  • If the target read address is not recorded in the mapping table, it has never been written, its data has not changed, and the data in the corresponding storage block on the source volume can be read directly.
  • If the block or source volume to be read has been written, but the update time of the record shows that the write took place before the last snapshot was generated, then no write occurred between the last snapshot and the current read operation. The data on the block or source volume has not changed since the last snapshot, and the data in the corresponding storage block on the source volume can likewise be read directly.
  • If the mapping table contains a record of the target read address and its update time is after the last snapshot generated based on the target read address, the target read address was written after that snapshot and the data stored in the corresponding storage block has been updated since. Because only the data actually being written was stored during the write operation, a read of the whole block or source volume must combine the source volume data with the data in the write space.
  • The second storage address is the address, in the space newly requested in the preset storage area, at which the write data was stored when the write operation was performed on the second target write address; the correspondence between the second storage address and the target read address can be obtained from the mapping table.
  • The first data is the data in the storage block corresponding to the target read address on the source volume. After the first data and the data at the second storage address are acquired, they are merged: the data at the second storage address is merged into the corresponding position of the first data, and the merged data is the second data.
  • The second storage space is a storage space newly requested in the preset storage area for saving the second data.
  • The second data is the latest data after the write operation; the second storage space holds the second data and the address on the source volume to which the second data corresponds.
  • When data is read, the second data in the second storage space is read.
  • The correspondence between the second target write address and the address of the second storage space is updated into the mapping table, and the original space is released, that is, the space that was requested at the time of the original write to store the write data and the storage address.
  • For example, snapshots are generated in the order A, B, and C.
  • After the snapshot at time A, if a read operation command for the source volume is received and its read address is the third chunk, the record of the third chunk in the mapping table is consulted to determine whether that location was written between the last snapshot generation and receipt of the read operation command. When there is no record for the third chunk, or the update time of its record is not after time A, the data in the third chunk is read directly.
  • When the mapping table contains a record for the third chunk and its update time is after time A, a write operation was performed after time A, and the mapping table is searched for the address in the storage area that corresponds to the third chunk.
  • The data at that address in the storage area is found to be m, and the modified location is the fifth offset in the third chunk; that is, the e in abcde was changed to m.
  • m is then merged with abcde: m is written over the e of the third chunk to produce the new data abcdm, a new space is requested in the preset storage area to hold abcdm together with the address of the corresponding third chunk, the mapping table is updated with the correspondence between the third chunk's address on the source volume and the address of the new space holding abcdm in the storage area, and the space originally requested for the write to e is released.
  • The data read is then the latest data abcdm.
  • In this embodiment, whether a write operation occurred between the most recently generated snapshot and the received read command is determined before the data is read, so that the data read is the latest real-time data.
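The read path of steps S30 to S70 can be sketched as below, using the abcde example. The dictionaries standing in for the source volume, mapping table, and storage area, the entry fields (`addr`, `offset`, `updated_at`), and the shortcut for already-merged entries are assumptions made for this illustration; the text does not say whether the update time changes when the merged data is stored, so it is left unchanged here.

```python
def merge(chunk_data: bytes, offset: int, write_data: bytes) -> bytes:
    """Overlay write_data onto chunk_data starting at offset."""
    return chunk_data[:offset] + write_data + chunk_data[offset + len(write_data):]


def read_chunk(source: dict, mapping: dict, storage: dict,
               chunk: int, last_snapshot_time: float) -> bytes:
    entry = mapping.get(chunk)                       # S40: is the address mapped?
    if entry is None or entry["updated_at"] <= last_snapshot_time:  # S50
        return source[chunk]                         # S70: read the source volume
    written = storage[entry["addr"]]                 # S60: fetch redirected data
    if len(written) == len(source[chunk]):
        return written                               # assumed: already a full chunk
    second_data = merge(source[chunk], entry["offset"], written)
    new_addr = max(storage) + 1                      # request a second storage space
    storage[new_addr] = second_data
    del storage[entry["addr"]]                       # release the original space
    mapping[chunk] = {"addr": new_addr, "offset": 0,
                      "updated_at": entry["updated_at"]}
    return second_data
```

With the third chunk holding abcde and a redirected write of m at offset 4 recorded after time A, a read returns abcdm while the chunk on the source volume still holds abcde.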
  • Based on the foregoing embodiments of the method for writing data according to the present invention, in the fourth embodiment of the method the method for writing data further includes the steps:
  • Step S80: if a snapshot generation command for the source volume is received during the redirect-on-write snapshot process, determining according to the mapping table whether the source volume had a write operation before the snapshot generation command was received.
  • Step S90: when the source volume had a write operation before the snapshot generation command was received, obtaining the address of the write operation on the source volume as a third target write address, looking up in the mapping table the address in the storage area corresponding to the third target write address, acquiring the data at that address as third data, and at the same time acquiring the size of the third data;
  • Step S100: generating a snapshot according to the third data, the size of the third data, and the third target write address.
  • The snapshot to be generated may be the first snapshot, or older snapshots may already have been generated.
  • Before the snapshot generation command is received, write or read operations may have been performed on the source volume. If a redirected write has occurred, the data of the source volume has been modified by the written data, so in order to snapshot the latest data it is first determined whether the source volume had a write operation before the snapshot generation command was received. When there was no write operation, the content of the source volume at that moment is already the latest data.
  • The third target write address is the address at which data was to be written when the write operation was performed on the source volume.
  • The third target write address and the corresponding address in the storage area can be queried through the mapping table, and the data at the corresponding address in the storage area is obtained as the third data.
  • If no read operation followed the write, the corresponding address holds the write data that was saved in the storage area when the data was written.
  • If a read operation followed the write, the corresponding address holds the latest data produced at read time by merging the original write data with the original chunk. How the data is merged is described in detail in the second embodiment of the method for writing data and is not repeated here.
  • The size of the third data can be obtained from its number of bytes; the snapshot is then generated according to the third data, the size of the third data, and the write address of the third data.
  • generating the snapshot according to the third data, the size of the third data, and the third target write address includes:
  • Step S110: determining whether the size of the third data is equal to the size of the third storage block corresponding to the third target write address on the source volume; if yes, executing step S120; otherwise, executing step S130;
  • Step S120: writing the third data back to the third storage block of the source volume;
  • Step S130: acquiring the data in the third storage block, merging the third data and the data in the third storage block into fourth data, and writing the fourth data back to the third storage block of the source volume.
  • The third storage block is the block on the source volume that corresponds to the third target write address, that is, the block targeted when the write operation was performed. For example, if a write operation is received that writes c to the second chunk of the source volume, the third target write address is the address of the second chunk, the data written is c, and the third storage block corresponding to the third target write address is the second chunk.
  • When the size of the third data equals the size of the third storage block, the write data in the redirected storage space is already the latest data, and the third data is written back to the third storage block of the source volume, that is, to the chunk the write operation targeted.
  • When the size of the third data is not equal to the size of the third storage block, a write operation has been performed but it rewrote only part of the data in the chunk and no read operation has followed, while the snapshot must be generated from the latest data.
  • In that case the data in the third storage block as it was at the time of the write operation is acquired.
  • The data in the third storage block is the data from before the write operation, and the third data is merged with it to generate the fourth data.
  • The fourth data is the latest data for the third storage block of the source volume, and it is written back to that block.
  • For example, the data stored in the third chunk on the source volume is abcde.
  • A write operation changes e to m, and no read operation command is received afterwards.
  • When the snapshot generation command arrives, a lookup in the mapping table shows that the source volume has been written.
  • The data in the storage space of the write operation is obtained, that is, m, and the size of m is compared with the size of the chunk. Since m is smaller than the chunk, the data of the third chunk on the source volume is obtained and m is merged with the abcde in the third chunk to generate new data. Specifically, m is written to the corresponding location of the third chunk, producing the new data abcdm.
  • abcdm is the latest data of the third chunk of the source volume; it is written to the third chunk on the source volume, and the data in the third chunk is then the content of the generated snapshot.
  • In this embodiment, the snapshot is generated according to the size of the data of the write operation, which ensures that the data at the moment the snapshot is generated is the latest real-time data.
  • The efficiency of snapshot generation is thereby improved.
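The write-back performed at snapshot time (steps S80 to S130) can be sketched as follows. The dictionary-based source volume, mapping table, and storage area, the entry fields `addr` and `offset`, and the function name are assumptions for illustration; for brevity the sketch treats every mapping-table entry as a write that preceded the snapshot generation command.

```python
def flush_for_snapshot(source: dict, mapping: dict, storage: dict) -> dict:
    """Write redirected data back to the source volume; return the snapshot content."""
    for chunk, entry in mapping.items():       # S80/S90: chunks that were written
        third_data = storage[entry["addr"]]
        block = source[chunk]                  # the third storage block
        if len(third_data) == len(block):      # S110: sizes equal?
            source[chunk] = third_data         # S120: write back as is
        else:                                  # S130: merge, then write back
            off = entry["offset"]
            source[chunk] = block[:off] + third_data + block[off + len(third_data):]
    return dict(source)
```

In the abcde example, the stored m is shorter than the chunk, so it is merged at offset 4 and abcdm is written back before the snapshot content is taken.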
  • the embodiment of the invention further provides a computer storage medium, wherein the computer storage medium stores computer executable instructions, and the computer executable instructions are configured to execute the above method for writing data.
  • In the solution for writing data proposed by the embodiments of the present invention, during the redirect-on-write snapshot process and before a snapshot generation command is obtained, the size information of the write data contained in a write operation command is acquired when the command for the source volume is received; a storage space matching that size information is requested from the preset storage area, and the write operation is performed in the requested storage space. Because the requested space matches the size of the write data and receives the write data together with the address to be written on the source volume, the data of the original chunk does not need to be read, and data is written efficiently.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

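A device for writing data, a method for writing data, and a computer storage medium. The device for writing data includes: a first acquisition module (10), configured to, during a redirect-on-write snapshot process and before a snapshot generation command is obtained, acquire the size information of the write data contained in a write operation command when the write operation command for a source volume is received; and a write operation module (20), configured to request, from a preset storage area and according to the size information of the write data, a storage space matching the size information, and to perform the write operation in the requested storage space.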

Description

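Method and Device for Writing Data, and Computer Storage Medium
Technical Field
The present invention relates to the technical field of data storage, and in particular to a method and device for writing data and a computer storage medium.
Background
With the development of information technology, digital resources keep growing, and data security and backup become increasingly important. Snapshot technology is now commonly used to back up data. According to the definition of the Storage Networking Industry Association (SNIA), a snapshot is a fully usable copy of a specified data set that contains an image of the corresponding data at the moment the copy was initiated. Mainstream snapshot technologies fall into two kinds: Copy On First Write (COFW) and Redirect On Write (ROW).
In distributed storage scenarios, data is stored with the chunk as the smallest unit. However, when a write operation is performed under the current redirect-on-write snapshot technique, a new chunk space of chunk size is usually allocated for the write. When the write data is smaller than the chunk size, the data of the old chunk in the source volume is read first and then merged with the write data before being written to the new chunk space. Because this method has to read the data, merge it, and then write the chunk, the write path is cumbersome; especially when writes are frequent, the large number of read and merge operations degrades write performance.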
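For contrast, the conventional chunk-granular write path described above can be sketched as follows. This is only an illustration of the read, merge, write sequence that the background criticizes; the toy `CHUNK_SIZE`, the function name, and the returned read counter are assumptions introduced here.

```python
CHUNK_SIZE = 5  # toy chunk size in bytes, assumed for illustration


def conventional_row_write(source_chunk: bytes, offset: int, data: bytes):
    """Return (new_chunk, bytes_read) for a chunk-granular redirected write."""
    bytes_read = 0
    if len(data) < CHUNK_SIZE:
        # A partial write must first read the old chunk from the source volume...
        old = source_chunk
        bytes_read = len(old)
        # ...then merge the write data into it before filling the new chunk space.
        new_chunk = old[:offset] + data + old[offset + len(data):]
    else:
        # A full-chunk write needs no read or merge.
        new_chunk = data
    return new_chunk, bytes_read
```

Even a one-byte write costs a full chunk read and a full chunk of new space on this path, which is the overhead the invention aims to avoid.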
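Summary of the Invention
Embodiments of the present invention provide a method and device for writing data and a computer storage medium, aiming to improve write performance during a redirect-on-write snapshot process.
An embodiment of the present invention provides a device for writing data, the device including:
a first acquisition module, configured to, during a redirect-on-write snapshot process and before a snapshot generation command is obtained, acquire the size information of the write data contained in a write operation command when the write operation command for a source volume is received;
a write operation module, configured to request, from a preset storage area and according to the size information of the write data, a storage space matching the size information, and to perform the write operation in the requested storage space.
In an implementation of the embodiment of the present invention, the write operation command includes the write data and a first target write address at which the write data is to be written to the source volume;
the write operation module then includes:
a storage space request unit, configured to request a first storage space from the storage area according to the size information of the write data, the size of the first storage space being equal to the sum of the size of the write data and the data size of the first target write address;
a storage unit, configured to store the write data and the first target write address in the first storage space;
an update unit, configured to update the correspondence between the first target write address and a first storage address of the first storage space into a mapping table and to record the update time of the correspondence in the mapping table, the mapping table being used to hold the correspondences between addresses in the source volume and storage addresses of the storage area together with the update times of those correspondences.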
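In an implementation of the embodiment of the present invention, the device for writing data further includes:
a second acquisition module, configured to, during the redirect-on-write snapshot process and before a snapshot generation command is obtained, acquire the target read address in a read operation command when the read operation command for the source volume is received;
a first determining module, configured to, if the mapping table contains the target read address, determine whether the update time corresponding to the target read address in the mapping table is after the last snapshot generated based on the target read address;
a data merging module, configured to, when the mapping table contains the target read address and the corresponding update time is after the last snapshot generated based on the target read address, look up in the mapping table a second storage address corresponding to the target read address, acquire the data at the second storage address in the storage area, merge the data at the second storage address with acquired first data into second data, request a second storage space in the storage area to save the second data, and update the correspondence between the target read address and the storage address of the second storage space into the mapping table, the first data being the data in the storage block corresponding to the target read address on the source volume;
a read operation module, configured to, when the mapping table does not contain the target read address, or contains it but the corresponding update time is not after the last snapshot generated based on the target read address, read the data in the storage block corresponding to the target read address on the source volume.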
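In an implementation of the embodiment of the present invention, the device for writing data further includes:
a receiving module, configured to, if a snapshot generation command for the source volume is received during the redirect-on-write snapshot process, determine according to the mapping table whether the source volume had a write operation before the snapshot generation command was received;
a second determining module, configured to, when the source volume had a write operation before the snapshot generation command was received, obtain the address of the write operation on the source volume as a third target write address, look up in the mapping table the address in the storage area corresponding to the third target write address, acquire the data at that address as third data, and at the same time acquire the size of the third data;
a snapshot generation module, configured to generate a snapshot according to the third data, the size of the third data, and the third target write address.
In an implementation of the embodiment of the present invention, the snapshot generation module includes:
a determining unit, configured to determine whether the size of the third data is equal to the size of the third storage block corresponding to the third target write address on the source volume;
a write operation unit, configured to write the third data back to the third storage block of the source volume when the size of the third data is equal to the size of the third storage block;
a snapshot generation unit, configured to, when the size of the third data is not equal to the size of the third storage block, acquire the data in the third storage block, merge the third data and the data in the third storage block into fourth data, and write the fourth data back to the third storage block of the source volume.
When performing processing, the first acquisition module, the write operation module, the storage space request unit, the storage unit, the update unit, the second acquisition module, the first determining module, the data merging module, the read operation module, the receiving module, the second determining module, the snapshot generation module, the determining unit, the write operation unit, and the snapshot generation unit may be implemented by a central processing unit (CPU), a digital signal processor (DSP), or a field-programmable gate array (FPGA).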
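An embodiment of the present invention further provides a method for writing data, the method including:
during a redirect-on-write snapshot process and before a snapshot generation command is obtained, acquiring the size information of the write data contained in a write operation command when the write operation command for a source volume is received;
requesting, from a preset storage area and according to the size information of the write data, a storage space matching the size information, and performing the write operation in the requested storage space.
In an implementation of the embodiment of the present invention, the write operation command further includes the write data and a first target write address at which the write data is to be written to the source volume;
the requesting of a storage space matching the size information from the preset storage area according to the size information of the write data, and the performing of the write operation in the requested storage space, then include:
requesting a first storage space from the storage area according to the size information of the write data, the size of the first storage space being equal to the sum of the size of the write data and the data size of the first target write address;
storing the write data and the first target write address in the first storage space;
updating the correspondence between the first target write address and a first storage address of the first storage space into a mapping table and recording the update time of the correspondence in the mapping table, the mapping table being used to hold the correspondences between addresses in the source volume and storage addresses of the storage area together with the update times of those correspondences.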
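In an implementation of the embodiment of the present invention, the method for writing data further includes:
during the redirect-on-write snapshot process and before a snapshot generation command is obtained, acquiring the target read address in a read operation command when the read operation command for the source volume is received;
if the mapping table contains the target read address, determining whether the update time corresponding to the target read address in the mapping table is after the last snapshot generated based on the target read address;
if the mapping table contains the target read address and the corresponding update time is after the last snapshot generated based on the target read address, looking up in the mapping table a second storage address corresponding to the target read address, acquiring the data at the second storage address in the storage area, merging the data at the second storage address with acquired first data into second data, requesting a second storage space in the storage area to save the second data, and updating the correspondence between the target read address and the storage address of the second storage space into the mapping table, the first data being the data in the storage block corresponding to the target read address on the source volume;
if the mapping table does not contain the target read address, or contains it but the corresponding update time is not after the last snapshot generated based on the target read address, reading the data in the storage block corresponding to the target read address on the source volume.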
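In an implementation of the embodiment of the present invention, the method for writing data further includes:
if a snapshot generation command for the source volume is received during the redirect-on-write snapshot process, determining according to the mapping table whether the source volume had a write operation before the snapshot generation command was received;
when the source volume had a write operation before the snapshot generation command was received, obtaining the address of the write operation on the source volume as a third target write address, looking up in the mapping table the address in the storage area corresponding to the third target write address, acquiring the data at that address as third data, and at the same time acquiring the size of the third data;
generating a snapshot according to the third data, the size of the third data, and the third target write address.
In an implementation of the embodiment of the present invention, the generating of a snapshot according to the third data, the size of the third data, and the third target write address includes:
determining whether the size of the third data is equal to the size of the third storage block corresponding to the third target write address on the source volume;
if yes, writing the third data back to the third storage block of the source volume;
if no, acquiring the data in the third storage block, merging the third data and the data in the third storage block into fourth data, and writing the fourth data back to the third storage block of the source volume.
An embodiment of the present invention further provides a computer storage medium storing computer-executable instructions, the computer-executable instructions being configured to execute the above method for writing data.
In the solution for writing data proposed by the embodiments of the present invention, during the redirect-on-write snapshot process and before a snapshot generation command is obtained, the size information of the write data contained in a write operation command is acquired when the command for the source volume is received; a storage space matching that size information is requested from the preset storage area, and the write operation is performed in the requested storage space. Because the requested space matches the size of the write data and receives the write data together with the address to be written on the source volume, the data of the original chunk does not need to be read, and data is written efficiently.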
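Brief Description of the Drawings
FIG. 1 is a schematic diagram of the functional modules of the first embodiment of the device for writing data of the present invention;
FIG. 2 is a schematic diagram of the refined functional modules of the write operation module 20 in the embodiment shown in FIG. 1;
FIG. 3 is a schematic diagram of the functional modules of the third embodiment of the device for writing data of the present invention;
FIG. 4 is a schematic diagram of the functional modules of the fourth embodiment of the device for writing data of the present invention;
FIG. 5 is a schematic diagram of the functional modules of the snapshot generation module 90 in the embodiment shown in FIG. 4;
FIG. 6 is a schematic flowchart of the first embodiment of the method for writing data of the present invention;
FIG. 7 is a refined schematic flowchart of step S20 in the embodiment shown in FIG. 6, in which storage space is requested from the preset storage area according to the size of the write data and the write operation is performed in the requested storage space;
FIG. 8 is a schematic flowchart of the third embodiment of the method for writing data of the present invention;
FIG. 9 is a schematic flowchart of the fourth embodiment of the method for writing data of the present invention;
FIG. 10 is a refined schematic flowchart of step S100 in the embodiment shown in FIG. 9, in which a snapshot is generated according to the third data, the size of the third data, and the third target write address.
The realization of the object, the functional features, and the advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.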
具体实施方式
应当理解,此处所描述的具体实施例仅仅用以解释本发明,并不用于限定本发明。
本发明提供一种写数据的装置,参照图1,在第一实施例中,该写数据的装置包括:
第一获取模块10,配置为写时重定向快照过程中,在未获取到快照生成命令之前,当接收到对源卷的写操作命令时,获取所述写操作命令包含的写入数据的大小信息;
写操作模块20,配置为根据所述写入数据的大小信息从预置的存储区域申请与所述大小信息匹配的存储空间,在申请到的存储空间进行写操作。
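The data writing apparatus provided by the present invention is mainly applied in distributed storage and is used when redirect-on-write snapshots are in use. In a distributed storage scenario, data is usually stored with the block (chunk) as the smallest unit: each storage system can divide its storage space into a number of equally sized chunks as needed and store data in them. For example, the storage space may be divided into a number of blocks, each of which may be 4 MB or 112 bytes in size.
In this embodiment, during the redirect-on-write snapshot process, a write command for the source volume may be received either when an old snapshot has already been generated but no new snapshot generation command has been received, or when no snapshot generation command has yet been received for the first snapshot. For example, suppose snapshots are taken at three moments A, B, and C. After the snapshot at moment A is complete and before the snapshot at moment B is generated, that is, before the snapshot generation command is received, a write command or a read command may arrive; a write or read that occurs during the redirect-on-write snapshot process is, specifically, a redirected write or a redirected read. The source volume referred to in the first acquisition module 10 is the data storage space to be backed up in the snapshot process; snapshots can be taken of any number of blocks as needed. A write operation writes data to the source volume. When a write command for the source volume is received, the size information of the write data contained in the command is obtained. The write data is the data to be written to the source volume; since data is normally held in memory as bytes, the size information can be obtained by counting the bytes of the write data.
Once the size information of the write data has been obtained, storage space is requested in the preset storage area according to that size, rather than according to the chunk size, and the write is then performed in the newly requested storage space. The preset storage area is storage space other than the source volume; the new space can be chosen as needed.
In this embodiment, during the redirect-on-write snapshot process and before a snapshot generation command is obtained, when a write command for the source volume is received, the size information of the write data contained in the write command is obtained; a storage space matching that size information is requested from the preset storage area, and the write is performed in the requested storage space. Because the requested space matches the size of the data actually being written, and holds only the write data contained in the write command and the source-volume address it targets, the data of the original chunk does not need to be read, which makes writes efficient.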
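The saving described above can be made concrete with a little arithmetic. The following is a minimal sketch, not the patented implementation: it compares the space requested for one redirected write (write data plus target address) with what a chunk-granular allocation would request. The 8-byte address width and both function names are assumptions introduced for illustration; the 4 MB chunk size is one of the sizes the text mentions.

```python
CHUNK_SIZE = 4 * 1024 * 1024  # 4 MB, one of the chunk sizes mentioned in the text
ADDRESS_SIZE = 8              # assumption: bytes used to store a source-volume address


def redirect_allocation_size(write_len: int, address_size: int = ADDRESS_SIZE) -> int:
    """Space requested for one redirected write: the payload plus its target address."""
    if write_len <= 0:
        raise ValueError("write_len must be positive")
    return write_len + address_size


def chunk_allocation_size(write_len: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Space a chunk-granular scheme would request: whole chunks covering the write."""
    if write_len <= 0:
        raise ValueError("write_len must be positive")
    chunks = -(-write_len // chunk_size)  # ceiling division
    return chunks * chunk_size


if __name__ == "__main__":
    for n in (1, 512, 4096):
        print(n, redirect_allocation_size(n), chunk_allocation_size(n))
```

For small writes the size-matched request is a few bytes, while the chunk-granular baseline always rounds up to at least one full chunk.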
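In an implementation of an embodiment of the present invention, building on the first embodiment of the data writing apparatus, in a second embodiment the write command includes the write data and a first target write address on the source volume to which the write data is to be written;
referring to FIG. 2, the detailed functional modules of the write module 20 then include:
a storage space request unit 21, configured to request a first storage space from the storage area according to the size information of the write data, where the size of the first storage space equals the sum of the size of the write data and the data size of the first target write address;
a storage unit 22, configured to store the write data and the first target write address in the first storage space;
an update unit 23, configured to update the correspondence between the first target write address and a first storage address of the first storage space into a mapping table, and to record the update time of the correspondence in the mapping table, where the mapping table stores the correspondences between addresses in the source volume and storage addresses in the storage area, together with the update time of each correspondence.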
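In this embodiment, the write data referred to above is the data in the write command that is to be written to the source volume, and the first target write address is the address to which the write data is to be written. It may be the address of the chunk on the source volume to which the data will be written, or that chunk address together with the offset of the data within the chunk. The first storage space stores the write data and the source-volume address to which the data is to be written; its size equals the sum of the size of the write data and the data size of the first target write address. After the data is written, the correspondence between the first target write address of the source volume and the first storage address of the first storage space is updated into the mapping table, where the first storage address is the address of the first storage space, and the update time is recorded in the mapping table at the same time. It will be understood that the update time need not be recorded in the mapping table and may be recorded elsewhere; the time makes it possible to determine when the write took place. Once the mapping table has been updated, it can be used to find the newly written data stored in the preset storage area, and also to find the chunk address on the source volume to which that data corresponds.
For example, suppose the first chunk of the source volume holds the data abcd, and before a snapshot generation command is received, a command arrives to write m at the position of d in the first chunk. A new storage space is requested in the preset storage area to hold the data m to be written and the address of the fourth data position in the first block of the source volume, that is, the address of d. The size of the requested storage space equals the size of the data m plus the length of the address of d on the source volume. The mapping table is then updated with the address of d and the address of m, so that the newly written data m can be found when d is read.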
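The write path in this example can be sketched as follows. This is an illustrative model under stated assumptions, not the apparatus itself: the names `RedirectArea` and `MappingEntry`, the encoding of the target address as two 4-byte integers (chunk index and offset), and the append-only buffer standing in for "requesting storage space" are all hypothetical choices made for the sketch.

```python
import struct
import time
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

ADDRESS_FORMAT = ">II"                          # assumption: chunk index and offset, 4 bytes each
ADDRESS_SIZE = struct.calcsize(ADDRESS_FORMAT)  # 8 bytes


@dataclass
class MappingEntry:
    storage_address: int  # where the record starts in the redirect area
    length: int           # number of bytes of write data in the record
    updated_at: float     # update time of this correspondence


@dataclass
class RedirectArea:
    """Preset storage area modelled as an append-only buffer plus a mapping table."""
    buffer: bytearray = field(default_factory=bytearray)
    mapping: Dict[Tuple[int, int], MappingEntry] = field(default_factory=dict)

    def write(self, chunk: int, offset: int, data: bytes,
              now: Optional[float] = None) -> MappingEntry:
        """Redirect one write: store address + data, then update the mapping table."""
        if not data:
            raise ValueError("data must not be empty")
        storage_address = len(self.buffer)                         # space is taken at the end
        self.buffer += struct.pack(ADDRESS_FORMAT, chunk, offset)  # first target write address
        self.buffer += data                                        # the write data itself
        entry = MappingEntry(storage_address, len(data),
                             time.time() if now is None else now)
        self.mapping[(chunk, offset)] = entry                      # source address -> storage address
        return entry
```

Writing `m` at the fourth position of the first chunk, `RedirectArea().write(0, 3, b"m")`, consumes `ADDRESS_SIZE + 1` bytes of the area and never touches the original chunk contents.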
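In this embodiment, the storage space requested in the preset storage area equals the sum of the size of the data written by the write operation and the data size of the write address. The data of the original chunk does not need to be read, which improves write efficiency and saves storage space.
In an implementation of an embodiment of the present invention, referring to FIG. 3 and building on the second embodiment of the data writing apparatus, in a third embodiment the data writing apparatus further includes:
a second acquisition module 30, configured to, during the redirect-on-write snapshot process and before a snapshot generation command is obtained, obtain the target read address in a read command when the read command for the source volume is received;
a first determination module 40, configured to, if the mapping table contains the target read address, determine whether the update time corresponding to the target read address in the mapping table is later than the most recent snapshot generated based on the target read address;
a data merging module 50, configured to, when the mapping table contains the target read address and the corresponding update time is later than the most recent snapshot generated based on the target read address, look up in the mapping table a second storage address corresponding to the target read address, obtain the data at the second storage address in the storage area, merge the data at the second storage address with obtained first data into second data, request a second storage space in the storage area to store the second data, and update the correspondence between the target read address and the storage address of the second storage space into the mapping table, where the first data is the data in the storage block of the source volume corresponding to the target read address;
a read module 60, configured to read the data in the storage block of the source volume corresponding to the target read address when the mapping table does not contain the target read address, or contains it but the corresponding update time is not later than the most recent snapshot generated based on the target read address.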
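In this embodiment, during the redirect-on-write snapshot process, a read command for the source volume may be received either when an old snapshot has already been generated but no new snapshot generation command has been received, or when no snapshot generation command has yet been received for the first snapshot. A read command may read the whole source volume or only some of its chunks. Write commands may or may not have been received before the read command; that is, writes may or may not have been performed before the read command arrives.
The target read address referred to in the second acquisition module 30 is the address, taken from the read command, of the data to be read. When a read command is received, the target read address is first obtained from it and looked up in the mapping table. If a record for the target read address exists, the address has previously been used as a target write address and a write has been recorded for it. The update time in that record is then used to determine whether the update is later than the most recent snapshot generated based on the target read address, that is, whether the write took place between the last snapshot generation for the target read address and the arrival of the current read. If the mapping table has no record for the target read address, no write has been performed on it, so the data at that address is unchanged and the data in the corresponding storage block of the source volume can be read directly. If the block or source volume to be read has been written, but the recorded update time shows that the write took place before the most recent snapshot was generated, then no write has occurred between the last snapshot generation and the current read; the data on the block or source volume has not changed since the last snapshot, and the data in the corresponding storage block of the source volume can again be read directly. If the mapping table contains a record for the target read address and the recorded update time is later than the most recent snapshot generated based on that address, a write was performed on the address after the snapshot was generated, so the data stored in the corresponding storage block has been updated since the last snapshot. Because a write stores only the data actually being written, reading a whole block or the whole source volume requires combining the data on the source volume with the data in the write space.
When a read finds that a write was performed on the target read address between the last snapshot generation and the current read, the target read address was previously used as a target write address. The second storage address is the address of the space that was newly requested in the preset storage area to store the write data when that write was performed (the address written at that time is referred to here as the second target write address); the correspondence between the second storage address and the target read address can be obtained from the mapping table. The first data is the data in the storage block of the source volume corresponding to the target read address. After the first data and the data at the second storage address have been obtained, the two are merged: specifically, the data at the second storage address is merged into the corresponding position of the first data, and the merged result is the second data. The second storage space is newly requested in the preset storage area to hold the second data, which is now the latest data after the write; the second storage space can hold the second data and the source-volume address to which the second data corresponds. The read returns the second data held in the second storage space. After the second storage space has been requested and the second data stored in it, the correspondence between the second target write address and the address of the second storage space is updated into the mapping table, and the original space is released, that is, the space requested at the time of the original write to store the write data and the storage address.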
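For example, suppose that during the redirect-on-write snapshot process, snapshots are generated at three moments A, B, and C in that order, and that a read command for the source volume is received after the snapshot at moment A has been generated. The read address is obtained from the read command. If the read address is the third chunk, the record for the third chunk in the mapping table is used to determine whether that location was written between the last snapshot generation and the arrival of the read command. If there is no record for the third chunk, or the update time of its record is not later than moment A, the data in the third chunk is read directly. If the third chunk holds the data abcde, and a record for the third chunk with an update time later than moment A is found in the mapping table, a write was performed after moment A. The mapping table is then used to find the address in the storage area that corresponds to the third chunk, and the data stored at that address is found to be m, with the modified position being the fifth offset in the third chunk; in other words, the e in abcde was changed to m. m is now merged with abcde: specifically, m is written at the position of e in the third chunk to produce the new data abcdm. New space is requested in the preset storage area to hold abcdm and the address of the corresponding third chunk, the mapping table is updated with the source-volume address of the third chunk and the address of the new space holding abcdm, and the space requested for the original write at the position of e is released. The read returns the latest data, abcdm.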
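The read decision in this example can be sketched as a single function. This is a simplified illustration under assumptions: a mapping-table record is modelled as a hypothetical `(offset, data, update_time)` tuple, `None` for `last_snapshot_time` stands for "no snapshot generated yet", and keeping the original update time on the merged record is a choice made for the sketch that the text does not specify.

```python
from typing import Optional, Tuple

Entry = Tuple[int, bytes, float]  # (offset in chunk, redirected data, update time)


def read_chunk(source_chunk: bytes, entry: Optional[Entry],
               last_snapshot_time: Optional[float]) -> Tuple[bytes, Optional[Entry]]:
    """Return the data to hand to the reader and the mapping record after the read."""
    if entry is None:
        # No record in the mapping table: the chunk was never redirected.
        return source_chunk, None
    offset, data, updated_at = entry
    if last_snapshot_time is not None and updated_at <= last_snapshot_time:
        # The write is not later than the last snapshot: read the source volume directly.
        return source_chunk, entry
    if offset < 0 or offset + len(data) > len(source_chunk):
        raise ValueError("redirected data does not fit inside the chunk")
    # Merge the redirected data into the corresponding position of the first data.
    merged = bytearray(source_chunk)
    merged[offset:offset + len(data)] = data
    second_data = bytes(merged)
    # The new record holds the whole merged chunk; the old space can be released.
    return second_data, (0, second_data, updated_at)
```

With the values from the example, `read_chunk(b"abcde", (4, b"m", 20.0), 10.0)` yields `b"abcdm"` together with a record that now covers the whole chunk.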
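In this embodiment, data is read according to whether a write took place between the most recent snapshot generation and the arrival of the read command, so that the data read is the latest real-time data.
In an implementation of an embodiment of the present invention, referring to FIG. 4 and building on the above embodiments of the data writing apparatus, in a fourth embodiment the data writing apparatus further includes:
a receiving module 70, configured to, if a snapshot generation command for the source volume is received during the redirect-on-write snapshot process, determine from the mapping table whether the source volume was written before the snapshot generation command was received;
a second determination module 80, configured to, when the source volume was written before the snapshot generation command was received, take the address of the write to the source volume as a third target write address, look up in the mapping table the address in the storage area that corresponds to the third target write address, obtain the data at that address as third data, and obtain the size of the third data;
a snapshot generation module 90, configured to generate a snapshot according to the third data, the size of the third data, and the third target write address.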
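In this embodiment, during the redirect-on-write snapshot process, a snapshot generation command for the source volume may request the first snapshot of the source volume, or may be a new snapshot generation command received when older snapshots already exist. Before the snapshot generation command is received, the source volume may have been written or read. If a redirected write has occurred, the data of the source volume has been modified by that write, so in order to generate a snapshot of the latest data, it is first determined whether the source volume was written before the snapshot generation command was received. If no write was performed, the current content of the source volume is already the latest data.
The third target write address referred to in the second determination module is the address at which data was to be written when the write was performed on the source volume. The mapping table is used to look up the address in the storage area that corresponds to the third target write address, and the data at that address is obtained as the third data. If a write was performed but no read followed, that address holds the write data stored in the storage area at write time; if a write was performed and a read followed, it holds the latest data produced at read time by merging the original write data with the original chunk. How the data is merged is described in detail in the third embodiment of the data writing apparatus and is not repeated here.
Whether the source volume was written before the snapshot generation command was received can be determined by looking up the mapping table. The size of the third data can be obtained by counting its bytes; the snapshot is then generated according to the third data, the size of the third data, and the write address of the third data.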
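In an implementation of an embodiment of the present invention, referring to FIG. 5, in this embodiment the snapshot generation module 90 includes:
a determination unit 91, configured to determine whether the size of the third data equals the size of the third storage block of the source volume corresponding to the third target write address;
a write unit 92, configured to write the third data back to the third storage block of the source volume when the size of the third data equals the size of the third storage block;
a snapshot generation unit 93, configured to, when the size of the third data does not equal the size of the third storage block, obtain the data in the third storage block, merge the third data with the data in the third storage block into fourth data, and write the fourth data back to the third storage block of the source volume.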
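In this embodiment, the third storage block is the block of the source volume corresponding to the third target write address, that is, the block targeted by the write when it was performed. For example, if a write is received that changes aaaa in the second chunk of the source volume to aaac, the third target write address is the address of the second chunk, the data written is c, and the third storage block corresponding to the third target write address is the second chunk. It is determined whether the size of the third data equals the size of the third storage block. If they are equal, either the write rewrote the entire data of the original chunk, or a read occurred after the write; in both cases the data in the redirected storage space is already the latest data, and the third data is written back to the third storage block of the source volume, that is, to the chunk on which the write was performed.
If the size of the third data does not equal the size of the third storage block, a write has been performed, but it rewrote only part of the data in the chunk and no read has followed; generating the snapshot requires the latest data at this moment. The data in the third storage block, which is still the data from before the write, is obtained and merged with the third data to produce fourth data. The fourth data is the latest data for the third storage block of the source volume and is written back to that block.
For example, suppose that before a snapshot generation command is received, the third chunk of the source volume holds the data abcde, a write has changed e to m, and no read command has been received. When the snapshot generation command arrives, the mapping table shows that the source volume has been written. The data in the storage space used by the write, namely m, is obtained, and its size is compared with the chunk size. The size of m is clearly smaller than the chunk size, so the data of the third chunk on the source volume is obtained and m is merged with the abcde in the third chunk to produce new data. Specifically, m is written at the corresponding position of the third chunk to give the new data abcdm, which is now the latest data of the third chunk. abcdm is written to the third chunk of the source volume, and the data now in the third chunk is the content of the generated snapshot.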
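The size comparison that drives snapshot generation can be sketched as below. This is a minimal illustration, assuming the offset of the write within the chunk is available alongside the third data (the text stores the target write address with the data); the function name `snapshot_chunk_content` is hypothetical.

```python
def snapshot_chunk_content(source_chunk: bytes, third_data: bytes, offset: int) -> bytes:
    """Return the bytes to write back to the third storage block at snapshot time."""
    if len(third_data) == len(source_chunk):
        # Sizes are equal: the redirected data already covers the whole chunk,
        # so it is written back as-is without reading the source chunk.
        return third_data
    if offset < 0 or offset + len(third_data) > len(source_chunk):
        raise ValueError("third data does not fit inside the chunk")
    # Sizes differ: only part of the chunk was rewritten, so merge the third data
    # into the pre-write chunk contents to obtain the fourth data.
    fourth_data = bytearray(source_chunk)
    fourth_data[offset:offset + len(third_data)] = third_data
    return bytes(fourth_data)
```

For the third-chunk example, `snapshot_chunk_content(b"abcde", b"m", 4)` takes the merge branch, while a chunk-sized third data such as `b"abcdm"` takes the direct write-back branch.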
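In this embodiment, it is determined whether a write occurred before the snapshot generation command was received, the written data is obtained, and the snapshot is generated according to the size of the written data. This ensures that the data is the latest real-time data at the moment the snapshot is generated, and writing data back selectively according to the size of the written data improves the efficiency of snapshot generation.
Referring to FIG. 6, a first embodiment of the data writing method of the present invention is proposed. In this embodiment, the data writing method includes the following steps:
Step S10: during a redirect-on-write snapshot process and before a snapshot generation command is obtained, when a write command for a source volume is received, obtaining the size information of the write data contained in the write command;
Step S20: requesting, from a preset storage area, a storage space matching the size information of the write data, and performing the write in the requested storage space.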
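The data writing method provided by the present invention is mainly applied in distributed storage and is used when redirect-on-write snapshots are in use. In a distributed storage scenario, data is usually stored with the block (chunk) as the smallest unit: each storage system can divide its storage space into a number of equally sized chunks as needed and store data in them. For example, the storage space may be divided into a number of blocks, each of which may be 4 MB or 112 bytes in size.
In this embodiment, during the redirect-on-write snapshot process, a write command for the source volume may be received either when an old snapshot has already been generated but no new snapshot generation command has been received, or when no snapshot generation command has yet been received for the first snapshot. For example, suppose snapshots are taken at three moments A, B, and C. After the snapshot at moment A is complete and before the snapshot at moment B is generated, that is, before the snapshot generation command is received, a write command or a read command may arrive; a write or read that occurs during the redirect-on-write snapshot process is, specifically, a redirected write or a redirected read. The source volume is the data storage space to be backed up in the snapshot process; snapshots can be taken of any number of blocks as needed. A write operation writes data to the source volume. When a write command for the source volume is received, the size information of the write data contained in the command is obtained. The write data is the data to be written to the source volume; since data is normally held in memory as bytes, the size information can be obtained by counting the bytes of the write data.
Once the size information of the write data has been obtained, storage space is requested in the preset storage area according to that size, rather than according to the chunk size, and the write is then performed in the newly requested storage space. The preset storage area is storage space other than the source volume; the new space can be chosen as needed.
In this embodiment, during the redirect-on-write snapshot process and before a snapshot generation command is obtained, when a write command for the source volume is received, the size information of the write data contained in the write command is obtained; a storage space matching that size information is requested from the preset storage area, and the write is performed in the requested storage space. Because the requested space matches the size of the data actually being written, and holds only the write data contained in the write command and the source-volume address it targets, the data of the original chunk does not need to be read, which makes writes efficient.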
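In an implementation of an embodiment of the present invention, building on the first embodiment of the data writing method, in a second embodiment the write command further includes the write data and a first target write address on the source volume to which the write data is to be written;
referring to FIG. 7, the detailed steps of step S20 then include:
Step S21: requesting a first storage space from the storage area according to the size information of the write data, where the size of the first storage space equals the sum of the size of the write data and the data size of the first target write address;
Step S22: storing the write data and the first target write address in the first storage space;
Step S23: updating the correspondence between the first target write address and a first storage address of the first storage space into a mapping table, and recording the update time of the correspondence in the mapping table, where the mapping table stores the correspondences between addresses in the source volume and storage addresses in the storage area, together with the update time of each correspondence.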
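In this embodiment, the write data is the data in the write command that is to be written to the source volume, and the first target write address is the address to which the write data is to be written. It may be the address of the chunk on the source volume to which the data will be written, or that chunk address together with the offset of the data within the chunk. The first storage space stores the write data and the source-volume address to which the data is to be written; its size equals the sum of the size of the write data and the data size of the first target write address. After the data is written, the correspondence between the first target write address of the source volume and the first storage address of the first storage space is updated into the mapping table, where the first storage address is the address of the first storage space, and the update time is recorded in the mapping table at the same time. It will be understood that the update time need not be recorded in the mapping table and may be recorded elsewhere; the time makes it possible to determine when the write took place. Once the mapping table has been updated, it can be used to find the newly written data stored in the preset storage area, and also to find the chunk address on the source volume to which that data corresponds.
For example, suppose the first chunk of the source volume holds the data abcd, and before a snapshot generation command is received, a command arrives to write m at the position of d in the first chunk. A new storage space is requested in the preset storage area to hold the data m to be written and the address of the fourth data position in the first block of the source volume, that is, the address of d. The size of the requested storage space equals the size of the data m plus the length of the address of d on the source volume. The mapping table is then updated with the address of d and the address of m, so that the newly written data m can be found when d is read.
In this embodiment, the storage space requested in the preset storage area equals the sum of the size of the data written by the write operation and the data size of the write address. The data of the original chunk does not need to be read, which improves write efficiency and saves storage space.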
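In an implementation of an embodiment of the present invention, referring to FIG. 8 and building on the second embodiment of the data writing method, in a third embodiment the data writing method further includes the following steps:
Step S30: during the redirect-on-write snapshot process and before a snapshot generation command is obtained, when a read command for the source volume is received, obtaining the target read address in the read command;
Step S40: determining whether the mapping table contains the target read address; if so, performing step S50; otherwise, performing step S70;
Step S50: determining whether the update time corresponding to the target read address in the mapping table is later than the most recent snapshot generated based on the target read address; if so, performing step S60; otherwise, performing step S70;
Step S60: looking up in the mapping table a second storage address corresponding to the target read address, obtaining the data at the second storage address in the storage area, merging the data at the second storage address with obtained first data into second data, requesting a second storage space in the storage area to store the second data, and updating the correspondence between the target read address and the storage address of the second storage space into the mapping table, where the first data is the data in the storage block of the source volume corresponding to the target read address;
Step S70: reading the data in the storage block of the source volume corresponding to the target read address.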
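In this embodiment, during the redirect-on-write snapshot process, a read command for the source volume may be received either when an old snapshot has already been generated but no new snapshot generation command has been received, or when no snapshot generation command has yet been received for the first snapshot. A read command may read the whole source volume or only some of its chunks. Write commands may or may not have been received before the read command; that is, writes may or may not have been performed before the read command arrives.
The target read address is the address, taken from the read command, of the data to be read. When a read command is received, the target read address is first obtained from it and looked up in the mapping table. If a record for the target read address exists, the address has previously been used as a target write address and a write has been recorded for it. The update time in that record is then used to determine whether the update is later than the most recent snapshot generated based on the target read address, that is, whether the write took place between the last snapshot generation for the target read address and the arrival of the current read. If the mapping table has no record for the target read address, no write has been performed on it, so the data at that address is unchanged and the data in the corresponding storage block of the source volume can be read directly. If the block or source volume to be read has been written, but the recorded update time shows that the write took place before the most recent snapshot was generated, then no write has occurred between the last snapshot generation and the current read; the data on the block or source volume has not changed since the last snapshot, and the data in the corresponding storage block of the source volume can again be read directly. If the mapping table contains a record for the target read address and the recorded update time is later than the most recent snapshot generated based on that address, a write was performed on the address after the snapshot was generated, so the data stored in the corresponding storage block has been updated since the last snapshot. Because a write stores only the data actually being written, reading a whole block or the whole source volume requires combining the data on the source volume with the data in the write space.
When a read finds that a write was performed on the target read address between the last snapshot generation and the current read, the target read address was previously used as a target write address. The second storage address is the address of the space that was newly requested in the preset storage area to store the write data when that write was performed (the address written at that time is referred to here as the second target write address); the correspondence between the second storage address and the target read address can be obtained from the mapping table. The first data is the data in the storage block of the source volume corresponding to the target read address. After the first data and the data at the second storage address have been obtained, the two are merged: specifically, the data at the second storage address is merged into the corresponding position of the first data, and the merged result is the second data. The second storage space is newly requested in the preset storage area to hold the second data, which is now the latest data after the write; the second storage space can hold the second data and the source-volume address to which the second data corresponds. The read returns the second data held in the second storage space. After the second storage space has been requested and the second data stored in it, the correspondence between the second target write address and the address of the second storage space is updated into the mapping table, and the original space is released, that is, the space requested at the time of the original write to store the write data and the storage address.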
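For example, suppose that during the redirect-on-write snapshot process, snapshots are generated at three moments A, B, and C in that order, and that a read command for the source volume is received after the snapshot at moment A has been generated. The read address is obtained from the read command. If the read address is the third chunk, the record for the third chunk in the mapping table is used to determine whether that location was written between the last snapshot generation and the arrival of the read command. If there is no record for the third chunk, or the update time of its record is not later than moment A, the data in the third chunk is read directly. If the third chunk holds the data abcde, and a record for the third chunk with an update time later than moment A is found in the mapping table, a write was performed after moment A. The mapping table is then used to find the address in the storage area that corresponds to the third chunk, and the data stored at that address is found to be m, with the modified position being the fifth offset in the third chunk; in other words, the e in abcde was changed to m. m is now merged with abcde: specifically, m is written at the position of e in the third chunk to produce the new data abcdm. New space is requested in the preset storage area to hold abcdm and the address of the corresponding third chunk, the mapping table is updated with the source-volume address of the third chunk and the address of the new space holding abcdm, and the space requested for the original write at the position of e is released. The read returns the latest data, abcdm.
In this embodiment, data is read according to whether a write took place between the most recent snapshot generation and the arrival of the read command, so that the data read is the latest real-time data.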
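In an implementation of an embodiment of the present invention, referring to FIG. 9 and building on the above embodiments of the data writing method, in a fourth embodiment the data writing method further includes the following steps:
Step S80: if a snapshot generation command for the source volume is received during the redirect-on-write snapshot process, determining from the mapping table whether the source volume was written before the snapshot generation command was received;
Step S90: when the source volume was written before the snapshot generation command was received, taking the address of the write to the source volume as a third target write address, looking up in the mapping table the address in the storage area that corresponds to the third target write address, obtaining the data at that address as third data, and obtaining the size of the third data;
Step S100: generating a snapshot according to the third data, the size of the third data, and the third target write address.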
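In this embodiment, during the redirect-on-write snapshot process, a snapshot generation command for the source volume may request the first snapshot of the source volume, or may be a new snapshot generation command received when older snapshots already exist. Before the snapshot generation command is received, the source volume may have been written or read. If a redirected write has occurred, the data of the source volume has been modified by that write, so in order to generate a snapshot of the latest data, it is first determined whether the source volume was written before the snapshot generation command was received. If no write was performed, the current content of the source volume is already the latest data.
The third target write address is the address at which data was to be written when the write was performed on the source volume. The mapping table is used to look up the address in the storage area that corresponds to the third target write address, and the data at that address is obtained as the third data. If a write was performed but no read followed, that address holds the write data stored in the storage area at write time; if a write was performed and a read followed, it holds the latest data produced at read time by merging the original write data with the original chunk. How the data is merged is described in detail in the third embodiment of the data writing method and is not repeated here.
Whether the source volume was written before the snapshot generation command was received can be determined by looking up the mapping table. The size of the third data can be obtained by counting its bytes; the snapshot is then generated according to the third data, the size of the third data, and the write address of the third data.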
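In an implementation of an embodiment of the present invention, referring to FIG. 10, in this embodiment generating the snapshot according to the third data, the size of the third data, and the third target write address includes:
Step S110: determining whether the size of the third data equals the size of the third storage block of the source volume corresponding to the third target write address; if so, performing step S120; otherwise, performing step S130;
Step S120: writing the third data back to the third storage block of the source volume;
Step S130: obtaining the data in the third storage block, merging the third data with the data in the third storage block into fourth data, and writing the fourth data back to the third storage block of the source volume.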
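In this embodiment, the third storage block is the block of the source volume corresponding to the third target write address, that is, the block targeted by the write when it was performed. For example, if a write is received that changes aaaa in the second chunk of the source volume to aaac, the third target write address is the address of the second chunk, the data written is c, and the third storage block corresponding to the third target write address is the second chunk. It is determined whether the size of the third data equals the size of the third storage block. If they are equal, either the write rewrote the entire data of the original chunk, or a read occurred after the write; in both cases the data in the redirected storage space is already the latest data, and the third data is written back to the third storage block of the source volume, that is, to the chunk on which the write was performed.
If the size of the third data does not equal the size of the third storage block, a write has been performed, but it rewrote only part of the data in the chunk and no read has followed; generating the snapshot requires the latest data at this moment. The data in the third storage block, which is still the data from before the write, is obtained and merged with the third data to produce fourth data. The fourth data is the latest data for the third storage block of the source volume and is written back to that block.
For example, suppose that before a snapshot generation command is received, the third chunk of the source volume holds the data abcde, a write has changed e to m, and no read command has been received. When the snapshot generation command arrives, the mapping table shows that the source volume has been written. The data in the storage space used by the write, namely m, is obtained, and its size is compared with the chunk size. The size of m is clearly smaller than the chunk size, so the data of the third chunk on the source volume is obtained and m is merged with the abcde in the third chunk to produce new data. Specifically, m is written at the corresponding position of the third chunk to give the new data abcdm, which is now the latest data of the third chunk. abcdm is written to the third chunk of the source volume, and the data now in the third chunk is the content of the generated snapshot.
In this embodiment, it is determined whether a write occurred before the snapshot generation command was received, the written data is obtained, and the snapshot is generated according to the size of the written data. This ensures that the data is the latest real-time data at the moment the snapshot is generated, and writing data back selectively according to the size of the written data improves the efficiency of snapshot generation.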
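To see how the write, read, and snapshot steps interact, the following toy model runs them end to end. It is a deliberately simplified sketch and not the method as claimed: the class name `RowVolume` is hypothetical, only one pending redirected write per chunk is tracked, and clearing the redirect table at snapshot time stands in for the update-time comparison that the description uses.

```python
from typing import Dict, List, Tuple


class RowVolume:
    """Toy source volume plus redirect area; one pending write per chunk."""

    def __init__(self, chunks: List[bytes]) -> None:
        self.chunks = list(chunks)                        # source volume contents
        self.redirect: Dict[int, Tuple[int, bytes]] = {}  # chunk -> (offset, data)

    def _merge(self, chunk: int, offset: int, data: bytes) -> bytes:
        merged = bytearray(self.chunks[chunk])
        merged[offset:offset + len(data)] = data
        return bytes(merged)

    def write(self, chunk: int, offset: int, data: bytes) -> None:
        """Redirected write: the source chunk is neither read nor modified."""
        if offset < 0 or offset + len(data) > len(self.chunks[chunk]):
            raise ValueError("write does not fit inside the chunk")
        self.redirect[chunk] = (offset, data)

    def read(self, chunk: int) -> bytes:
        """Redirected read: merge on demand and keep the merged chunk."""
        if chunk not in self.redirect:
            return self.chunks[chunk]
        offset, data = self.redirect[chunk]
        merged = self._merge(chunk, offset, data)
        self.redirect[chunk] = (0, merged)  # record now holds a full chunk
        return merged

    def snapshot(self) -> Tuple[int, int]:
        """Write pending data back; return (direct write-backs, merges)."""
        direct = merged = 0
        for chunk, (offset, data) in self.redirect.items():
            if len(data) == len(self.chunks[chunk]):
                self.chunks[chunk] = data   # sizes equal: write back as-is
                direct += 1
            else:
                self.chunks[chunk] = self._merge(chunk, offset, data)
                merged += 1
        self.redirect.clear()               # simplification: replaces update times
        return direct, merged
```

In this model a chunk that was read after being written reaches snapshot time with chunk-sized data and is written back directly, whereas a chunk that was only written takes the merge branch, which matches the two cases distinguished in steps S120 and S130.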
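An embodiment of the present invention further provides a computer storage medium storing computer-executable instructions configured to perform the data writing method described above.
The above are only preferred embodiments of the present invention and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.
Industrial Applicability
With the embodiments of the present invention, during a redirect-on-write snapshot process and before a snapshot generation command is obtained, when a write command for the source volume is received, the size information of the write data contained in the write command is obtained; a storage space matching that size information is requested from a preset storage area, and the write is performed in the requested storage space. Because the requested space matches the size of the data actually being written, and holds only the write data contained in the write command and the source-volume address it targets, the data of the original chunk does not need to be read, which makes writes efficient.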

Claims (11)

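  1. A data writing apparatus, comprising:
    a first acquisition module, configured to, during a redirect-on-write snapshot process and before a snapshot generation command is obtained, obtain the size information of the write data contained in a write command when the write command for a source volume is received;
    a write module, configured to request, from a preset storage area, a storage space matching the size information of the write data, and to perform the write in the requested storage space.
  2. The data writing apparatus according to claim 1, wherein the write command includes the write data and a first target write address on the source volume to which the write data is to be written;
    and the write module comprises:
    a storage space request unit, configured to request a first storage space from the storage area according to the size information of the write data, where the size of the first storage space equals the sum of the size of the write data and the data size of the first target write address;
    a storage unit, configured to store the write data and the first target write address in the first storage space;
    an update unit, configured to update the correspondence between the first target write address and a first storage address of the first storage space into a mapping table, and to record the update time of the correspondence in the mapping table, where the mapping table stores the correspondences between addresses in the source volume and storage addresses in the storage area, together with the update time of each correspondence.
  3. The data writing apparatus according to claim 2, further comprising:
    a second acquisition module, configured to, during the redirect-on-write snapshot process and before a snapshot generation command is obtained, obtain the target read address in a read command when the read command for the source volume is received;
    a first determination module, configured to, if the mapping table contains the target read address, determine whether the update time corresponding to the target read address in the mapping table is later than the most recent snapshot generated based on the target read address;
    a data merging module, configured to, when the mapping table contains the target read address and the corresponding update time is later than the most recent snapshot generated based on the target read address, look up in the mapping table a second storage address corresponding to the target read address, obtain the data at the second storage address in the storage area, merge the data at the second storage address with obtained first data into second data, request a second storage space in the storage area to store the second data, and update the correspondence between the target read address and the storage address of the second storage space into the mapping table, where the first data is the data in the storage block of the source volume corresponding to the target read address;
    a read module, configured to read the data in the storage block of the source volume corresponding to the target read address when the mapping table does not contain the target read address, or contains it but the corresponding update time is not later than the most recent snapshot generated based on the target read address.
  4. The data writing apparatus according to claim 2 or 3, further comprising:
    a receiving module, configured to, if a snapshot generation command for the source volume is received during the redirect-on-write snapshot process, determine from the mapping table whether the source volume was written before the snapshot generation command was received;
    a second determination module, configured to, when the source volume was written before the snapshot generation command was received, take the address of the write to the source volume as a third target write address, look up in the mapping table the address in the storage area that corresponds to the third target write address, obtain the data at that address as third data, and obtain the size of the third data;
    a snapshot generation module, configured to generate a snapshot according to the third data, the size of the third data, and the third target write address.
  5. The data writing apparatus according to claim 4, wherein the snapshot generation module comprises:
    a determination unit, configured to determine whether the size of the third data equals the size of the third storage block of the source volume corresponding to the third target write address;
    a write unit, configured to write the third data back to the third storage block of the source volume when the size of the third data equals the size of the third storage block;
    a snapshot generation unit, configured to, when the size of the third data does not equal the size of the third storage block, obtain the data in the third storage block, merge the third data with the data in the third storage block into fourth data, and write the fourth data back to the third storage block of the source volume.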
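  6. A data writing method, comprising:
    during a redirect-on-write snapshot process and before a snapshot generation command is obtained, when a write command for a source volume is received, obtaining the size information of the write data contained in the write command;
    requesting, from a preset storage area, a storage space matching the size information of the write data, and performing the write in the requested storage space.
  7. The data writing method according to claim 6, wherein the write command further includes the write data and a first target write address on the source volume to which the write data is to be written;
    and requesting, from the preset storage area, a storage space matching the size information of the write data and performing the write in the requested storage space comprises:
    requesting a first storage space from the storage area according to the size information of the write data, where the size of the first storage space equals the sum of the size of the write data and the data size of the first target write address;
    storing the write data and the first target write address in the first storage space;
    updating the correspondence between the first target write address and a first storage address of the first storage space into a mapping table, and recording the update time of the correspondence in the mapping table, where the mapping table stores the correspondences between addresses in the source volume and storage addresses in the storage area, together with the update time of each correspondence.
  8. The data writing method according to claim 7, further comprising:
    during the redirect-on-write snapshot process and before a snapshot generation command is obtained, when a read command for the source volume is received, obtaining the target read address in the read command;
    if the mapping table contains the target read address, determining whether the update time corresponding to the target read address in the mapping table is later than the most recent snapshot generated based on the target read address;
    if the mapping table contains the target read address and the corresponding update time is later than the most recent snapshot generated based on the target read address, looking up in the mapping table a second storage address corresponding to the target read address, obtaining the data at the second storage address in the storage area, merging the data at the second storage address with obtained first data into second data, requesting a second storage space in the storage area to store the second data, and updating the correspondence between the target read address and the storage address of the second storage space into the mapping table, where the first data is the data in the storage block of the source volume corresponding to the target read address;
    if the mapping table does not contain the target read address, or contains it but the corresponding update time is not later than the most recent snapshot generated based on the target read address, reading the data in the storage block of the source volume corresponding to the target read address.
  9. The data writing method according to claim 7 or 8, further comprising:
    if a snapshot generation command for the source volume is received during the redirect-on-write snapshot process, determining from the mapping table whether the source volume was written before the snapshot generation command was received;
    when the source volume was written before the snapshot generation command was received, taking the address of the write to the source volume as a third target write address, looking up in the mapping table the address in the storage area that corresponds to the third target write address, obtaining the data at that address as third data, and obtaining the size of the third data;
    generating a snapshot according to the third data, the size of the third data, and the third target write address.
  10. The data writing method according to claim 9, wherein generating the snapshot according to the third data, the size of the third data, and the third target write address comprises:
    determining whether the size of the third data equals the size of the third storage block of the source volume corresponding to the third target write address;
    if so, writing the third data back to the third storage block of the source volume;
    if not, obtaining the data in the third storage block, merging the third data with the data in the third storage block into fourth data, and writing the fourth data back to the third storage block of the source volume.
  11. A computer storage medium storing computer-executable instructions configured to perform the data writing method according to any one of claims 6 to 10.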
PCT/CN2017/075058 2016-03-17 2017-02-27 Data writing method and apparatus, and computer storage medium WO2017157158A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610159006.2A CN107203331B (zh) 2016-03-17 2016-03-17 Data writing method and apparatus
CN201610159006.2 2016-03-17

Publications (1)

Publication Number Publication Date
WO2017157158A1 true WO2017157158A1 (zh) 2017-09-21

Family

ID=59850735

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/075058 WO2017157158A1 (zh) 2016-03-17 2017-02-27 Data writing method and apparatus, and computer storage medium

Country Status (2)

Country Link
CN (1) CN107203331B (zh)
WO (1) WO2017157158A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111061429A (zh) * 2019-11-22 2020-04-24 北京浪潮数据技术有限公司 Data access method, apparatus, device, and medium
CN112099943A (zh) * 2020-08-13 2020-12-18 深圳云天励飞技术股份有限公司 Memory allocation method and related device

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
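CN110309100B (zh) * 2018-03-22 2023-05-23 腾讯科技(深圳)有限公司 Snapshot object generation method and apparatus
CN109144416B (zh) * 2018-08-03 2020-04-28 华为技术有限公司 Method and apparatus for querying data
CN110209351B (zh) * 2019-05-10 2021-02-19 星辰天合(北京)数据科技有限公司 Distributed storage data processing method and apparatus
CN116991542B (zh) * 2023-09-26 2024-02-13 苏州元脑智能科技有限公司 Virtual machine snapshot method, system, electronic device, and computer storage medium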

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
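CN101667161A (zh) * 2008-09-02 2010-03-10 联想(北京)有限公司 Data protection method for a storage device, data protection apparatus, and computer system
CN101997918A (zh) * 2010-11-11 2011-03-30 清华大学 Method for on-demand allocation of mass storage resources in a heterogeneous SAN environment
CN102193842A (zh) * 2010-03-15 2011-09-21 成都市华为赛门铁克科技有限公司 Data backup method and apparatus
CN104102521A (zh) * 2014-07-25 2014-10-15 浪潮(北京)电子信息产业有限公司 Method and apparatus for updating a non-volatile memory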

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8762671B2 (en) * 2011-06-28 2014-06-24 Hitachi, Ltd. Storage apparatus and its control method
CN103761190B (zh) * 2013-12-19 2017-01-11 华为技术有限公司 Data processing method and apparatus
CN104407936B (zh) * 2014-11-18 2017-08-18 华为数字技术(成都)有限公司 Data snapshot method and apparatus

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111061429A (zh) * 2019-11-22 2020-04-24 北京浪潮数据技术有限公司 Data access method, apparatus, device, and medium
CN111061429B (zh) * 2019-11-22 2022-06-17 北京浪潮数据技术有限公司 Data access method, apparatus, device, and medium
CN112099943A (zh) * 2020-08-13 2020-12-18 深圳云天励飞技术股份有限公司 Memory allocation method and related device
CN112099943B (zh) * 2020-08-13 2024-05-03 深圳云天励飞技术股份有限公司 Memory allocation method and related device

Also Published As

Publication number Publication date
CN107203331B (zh) 2022-05-06
CN107203331A (zh) 2017-09-26

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17765700

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 17765700

Country of ref document: EP

Kind code of ref document: A1