CN110502455B - Data storage method and system - Google Patents


Info

Publication number
CN110502455B
CN110502455B CN201810480504.6A CN201810480504A CN110502455B CN 110502455 B CN110502455 B CN 110502455B CN 201810480504 A CN201810480504 A CN 201810480504A CN 110502455 B CN110502455 B CN 110502455B
Authority
CN
China
Prior art keywords
zone
cache
index
storage
cache zone
Prior art date
Legal status
Active
Application number
CN201810480504.6A
Other languages
Chinese (zh)
Other versions
CN110502455A (en
Inventor
乔晖
Current Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN201810480504.6A
Publication of CN110502455A
Application granted
Publication of CN110502455B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0866 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches for peripheral storage systems, e.g. disk cache
    • G06F12/0871 Allocation or management of cache space
    • G06F12/0877 Cache access modes
    • G06F12/0882 Page mode

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention provides a data storage method and system. The method comprises: for data to be written, converting the address in the storage Zone at which the data is to be written into a target cache offset address in the cache Zone, according to the mapping state of the storage Zone and the cache Zone; writing the data to be written into the cache Zone according to the target cache offset address; and, when a preset synchronization condition is met, sequentially writing the data in the cache Zone into the storage Zone of the SMR hard disk according to the mapping state of the storage Zone and the cache Zone. By exploiting in software the property that a nonvolatile storage medium retains its data on power loss, the method works around the design constraint that an SMR hard disk storage Zone can only be written sequentially and, from the perspective of the user's upper-layer application, allows data to be written randomly at any specified position in an SMR hard disk storage Zone.

Description

Data storage method and system
Technical Field
The present invention relates to the field of data storage technologies, and in particular, to a data storage method and system.
Background
With the advent of the network big data age, data storage technologies face unprecedented challenges. For example, in the field of monitoring, with the increasing requirements of high-definition and intelligence, the requirement for data storage is higher and higher.
To cope with the large storage volumes and high data throughput of video surveillance, SMR (Shingled Magnetic Recording) has been developed. An SMR hard disk overlaps tracks and rearranges how data is stored in order to raise track density and increase the capacity of a single disk. SMR hard disks come in several main modes: Host Managed, Host Aware, and Drive Managed (also called Device Managed). However, in any mode, the SMR hard disk only supports sequential writing of data and cannot cope with the out-of-order writing that occurs in video surveillance.
Disclosure of Invention
The embodiment of the invention aims to provide a data storage method and a data storage system, which are used for realizing the out-of-order writing of an SMR hard disk. The specific technical scheme is as follows:
In a first aspect, an embodiment of the present invention provides a data storage method applied to a data storage system. The data storage system includes a read-write device, a nonvolatile storage medium, and a shingled magnetic recording (SMR) hard disk; the read-write device is connected to the nonvolatile storage medium and to the SMR hard disk. The SMR hard disk comprises a plurality of storage Zones, each corresponding to a different offset address; the nonvolatile storage medium is logically divided into a plurality of cache Zones, and each storage Zone maps a different cache Zone; the read-write device records the mapping state of the storage Zone and the cache Zone. The method includes:
for data to be written, the read-write device converts the address in the storage Zone at which the data is to be written into an offset address in the cache Zone, according to the mapping state of the storage Zone and the cache Zone, and uses that offset address as the target cache offset address;
the read-write device writes the data to be written into the corresponding cache Zone according to the target cache offset address;
and when a preset synchronization condition is met, the read-write device writes the data in the cache Zone into the storage Zone mapped by the cache Zone, according to the mapping state of the storage Zone and the cache Zone.
Optionally, in the data storage method according to the embodiment of the present invention, each of the cache zones corresponds to a different cache Zone index, where the cache Zone index represents a mapping state between the cache Zone and the storage Zone, and the cache Zone index is further used to represent a state of each page in the cache Zone, where the read-write device includes a radix tree, and the radix tree links the cache Zone index;
the step in which the read-write device converts the address in the storage Zone at which the data is to be written into an offset address in the cache Zone, used as the target cache offset address, includes:
for the data to be written, the read-write device converts the page offset of the data into an offset address of a storage Zone and uses it as the target storage offset address;
the read-write device queries whether the radix tree contains a cache Zone index corresponding to the target storage offset address;
when the radix tree contains the cache Zone index corresponding to the target storage offset address, the read-write device converts the target storage offset address into an offset address in the cache Zone according to that cache Zone index, and uses it as the target cache offset address;
the step in which the read-write device writes the data to be written into the corresponding cache Zone according to the target cache offset address includes:
the read-write device determines the page offset of the target cache offset address within the cache Zone as the target page offset, writes the data to be written into the corresponding page of the cache Zone according to the target page offset, and marks that page as dirty in the cache Zone index corresponding to the target storage offset address.
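The write path described above can be sketched in Python as follows. This is only an illustration under stated assumptions: a plain dictionary keyed by storage Zone number stands in for the radix tree, another dictionary stands in for the nonvolatile storage medium, and the names `CacheZoneIndex`, `write_page`, `PAGE_SIZE` and `ZONE_SIZE` are hypothetical (the 4K page and 256MB Zone sizes are taken from the example given later in the description).

```python
PAGE_SIZE = 4096                  # assumed 4 KiB page
ZONE_SIZE = 256 * 1024 * 1024     # assumed 256 MiB Zone
PAGES_PER_ZONE = ZONE_SIZE // PAGE_SIZE


class CacheZoneIndex:
    """Hypothetical per-cache-Zone index: mapping plus dirty page numbers."""

    def __init__(self, storage_zone_no, cache_base):
        self.storage_zone_no = storage_zone_no   # storage Zone this cache Zone maps
        self.cache_base = cache_base             # byte offset of the cache Zone
        self.dirty = set()                       # page numbers marked dirty


def write_page(index_by_zone, cache, page_offset, data):
    """Translate a page offset to a cache address, write, and mark dirty.

    Returns the target cache offset address, or None when no cache Zone
    index exists for the target storage Zone.
    """
    storage_addr = page_offset * PAGE_SIZE       # target storage offset address
    zone_no = storage_addr // ZONE_SIZE          # lookup key
    index = index_by_zone.get(zone_no)           # stand-in for the radix-tree query
    if index is None:
        return None
    in_zone = storage_addr % ZONE_SIZE
    cache_addr = index.cache_base + in_zone      # target cache offset address
    target_page = in_zone // PAGE_SIZE           # target page offset
    cache[cache_addr] = data                     # dict stands in for the NVM
    index.dirty.add(target_page)                 # set the page dirty in the index
    return cache_addr
```

The `None` return corresponds to the case where the radix tree holds no index for the target address, which the following optional steps handle by allocating or reclaiming a cache Zone.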
Optionally, after the read-write device queries whether the radix tree includes the cache Zone index corresponding to the target storage offset address, the method further includes:
when the radix tree does not contain a cache Zone index corresponding to the target storage offset address, the read-write device determines whether the nonvolatile storage medium contains a cache Zone that is not mapped to any storage Zone;
if such an unmapped cache Zone exists, the read-write device selects one of the unmapped cache Zones as the cache Zone mapped by the storage Zone into which the data is to be written;
the read-write device allocates a cache Zone index for the selected cache Zone, configures the mapping state of that cache Zone index, and sets every page in the cache Zone index to clean.
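A minimal sketch of this allocation step, assuming the cache Zone indexes are kept in a dictionary keyed by cache Zone number and that the first unmapped cache Zone is chosen (the text does not say which unmapped cache Zone to pick). The function name and the index fields are hypothetical.

```python
def allocate_cache_zone(indexes, num_cache_zones, storage_zone_no, pages_per_zone):
    """Map the first unmapped cache Zone to storage_zone_no.

    indexes maps cache Zone number -> index dict for every mapped cache Zone.
    Returns the chosen cache Zone number, or None if every cache Zone is mapped.
    """
    for cache_zone_no in range(num_cache_zones):
        if cache_zone_no not in indexes:
            indexes[cache_zone_no] = {
                "storage_zone": storage_zone_no,     # configure the mapping state
                "mapped": True,
                "dirty": [False] * pages_per_zone,   # every page starts clean
            }
            return cache_zone_no
    return None
```

A `None` result means no unmapped cache Zone is left, which is the situation the next optional step addresses.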
Optionally, after the read-write device determines whether the nonvolatile storage medium contains a cache Zone that is not mapped to any storage Zone, the method further includes:
if no unmapped cache Zone exists, the read-write device selects a cache Zone index to be reclaimed according to a preset selection rule;
the read-write device writes the data in the cache Zone corresponding to the cache Zone index to be reclaimed into the storage Zone corresponding to that index, and treats the index as the reclaimed cache Zone index;
the read-write device uses the cache Zone corresponding to the reclaimed cache Zone index as the cache Zone mapped by the storage Zone into which the data is to be written, sets every page in the reclaimed cache Zone index to clean, and configures the mapping state of the reclaimed cache Zone index so that its cache Zone maps the storage Zone into which the data is to be written.
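The reclaim step might look like the sketch below. The text only says "a preset selection rule"; choosing the index with the oldest last synchronization time is an assumption made here for illustration, as are the function name and the dictionary layout. What the cache Zone should contain for the newly mapped storage Zone is not specified in the text and is left out.

```python
def reclaim_cache_zone(indexes, cache, storage, new_storage_zone):
    """Flush one cache Zone to its storage Zone and remap it.

    indexes: cache Zone number -> {"storage_zone", "last_sync", "dirty"}
    cache:   cache Zone number -> list of page payloads
    storage: storage Zone number -> list of page payloads (toy SMR disk)
    Returns the number of the reclaimed cache Zone.
    """
    # Assumed selection rule: the index that was synchronized longest ago.
    victim_no = min(indexes, key=lambda n: indexes[n]["last_sync"])
    victim = indexes[victim_no]
    # Write the cache Zone contents back in ascending page order.
    storage[victim["storage_zone"]] = list(cache[victim_no])
    # Reset the page states and point the index at the new storage Zone.
    victim["dirty"] = [False] * len(victim["dirty"])
    victim["storage_zone"] = new_storage_zone
    return victim_no
```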
Optionally, the nonvolatile storage medium further includes an index management area, and the SMR hard disk further includes an index backup area; the index management area is used for managing and storing the cache Zone index of each cache Zone, and the index backup area is used for storing the cache Zone index of the storage Zone of the SMR hard disk; the method further comprises the following steps:
when the data storage system is started, the read-write device compares the last update time of the cache Zone index in the index management area with the last update time of the cache Zone index in the index backup area;
and if the last update time of the cache Zone index in the index backup area is earlier than that in the index management area, the read-write device updates the index backup area according to the cache Zone indexes in the index management area.
Optionally, after the read-write device compares the last update time of the cached Zone index in the index management area with the last update time of the cached Zone index in the index backup area, the method further includes:
if the last update time of the cache Zone index in the index management area is earlier than that in the index backup area, the read-write device updates the cache Zone indexes in the index management area according to the cache Zone indexes in the index backup area;
the read-write device updates the data in the cache Zone according to the data in the storage Zone;
and the read-write device links the updated cache Zone indexes in the index management area into the radix tree in the read-write device.
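The startup comparison between the index management area and the index backup area can be illustrated as follows. This is a sketch only: each area is modeled as a dictionary with a single `updated` timestamp and an `indexes` payload, which is an assumption, and the follow-up actions (refreshing cache Zone data from the storage Zone and linking indexes into the radix tree) are omitted.

```python
import copy


def reconcile_indexes(management, backup):
    """Bring the older index copy up to date; return which side was refreshed."""
    if backup["updated"] < management["updated"]:
        # Backup area on the SMR disk is stale: refresh it from the NVM copy.
        backup["indexes"] = copy.deepcopy(management["indexes"])
        backup["updated"] = management["updated"]
        return "backup"
    if management["updated"] < backup["updated"]:
        # Management area in the NVM is stale: refresh it from the backup.
        management["indexes"] = copy.deepcopy(backup["indexes"])
        management["updated"] = backup["updated"]
        return "management"
    return "none"
```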
Optionally, before the read-write device configures the cache Zone indexes in the radix tree according to the cache Zone indexes in the index management area, the method further includes:
when the data storage system is started, the read-write device reads each cache Zone index from the nonvolatile storage medium, determines whether a new storage Zone exists according to each read cache Zone index, and configures a cache Zone index for the new storage Zone if the new storage Zone exists.
Optionally, in the data storage method according to the embodiment of the present invention, the preset synchronization condition includes a power-off restart of the data storage system.
In a second aspect, an embodiment of the present invention provides a data storage system, where the system includes:
a read-write device, a nonvolatile storage medium, and a shingled magnetic recording (SMR) hard disk, wherein the read-write device is connected to the nonvolatile storage medium and to the SMR hard disk;
the SMR hard disk comprises a plurality of storage Zones, each corresponding to a different offset address; the nonvolatile storage medium is logically divided into a plurality of cache Zones, and each storage Zone maps a different cache Zone; the read-write device records the mapping state of the storage Zone and the cache Zone;
the read-write device is configured to, for data to be written, convert an address in the storage Zone, where the data to be written is to be written, into an offset address in the cache Zone according to the mapping state of the storage Zone and the cache Zone, where the address is used as a target cache offset address; writing the data to be written into the corresponding cache Zone according to the target cache offset address; and when a preset synchronization condition is met, writing the data in the cache Zone into the storage Zone mapped by the cache Zone according to the mapping state of the storage Zone and the cache Zone.
Optionally, in the data storage system in the embodiment of the present invention, each of the cache zones corresponds to a different cache Zone index, where the cache Zone index represents a mapping state between the cache Zone and the storage Zone, and the cache Zone index is further used to represent a state of each page in the cache Zone, where the read-write device includes a radix tree, and the radix tree links the cache Zone index;
the read-write device is specifically configured to, for data to be written, convert a page offset of the data to be written into an offset address of a storage Zone, where the offset address is used as a target storage offset address; inquiring whether the radix tree contains a cache Zone index corresponding to the target storage offset address; when the radix tree contains the cache Zone index corresponding to the target storage offset address, converting the target storage offset address into an offset address in the cache Zone according to the cache Zone index corresponding to the target storage offset address, and using the offset address as a target cache offset address; determining a page offset of the target cache offset address in the cache Zone, as a target page offset, writing the data to be written into a page corresponding to the cache Zone according to the target page offset, and setting a page corresponding to the cache Zone to be dirty in a cache Zone index corresponding to the target storage offset address; and when a preset synchronization condition is met, writing the data in the cache Zone into the storage Zone mapped by the cache Zone according to the mapping state of the storage Zone and the cache Zone.
Optionally, in the data storage system according to the embodiment of the present invention, the read-write device is further configured to: when the radix tree does not contain a cache Zone index corresponding to the target storage offset address, determine whether the nonvolatile storage medium contains a cache Zone that is not mapped to any storage Zone; if such an unmapped cache Zone exists, select one of the unmapped cache Zones as the cache Zone mapped by the storage Zone into which the data is to be written; and allocate a cache Zone index for the selected cache Zone, configure the mapping state of that cache Zone index, and set every page in the cache Zone index to clean.
Optionally, in the data storage system according to the embodiment of the present invention, the read-write device is further configured to:
if no cache Zone that is unmapped to a storage Zone exists, select a cache Zone index to be reclaimed according to a preset selection rule; write the data in the cache Zone corresponding to the cache Zone index to be reclaimed into the storage Zone corresponding to that index, and treat the index as the reclaimed cache Zone index; use the cache Zone corresponding to the reclaimed cache Zone index as the cache Zone mapped by the storage Zone into which the data is to be written, set every page in the reclaimed cache Zone index to clean, and configure the mapping state of the reclaimed cache Zone index so that its cache Zone maps the storage Zone into which the data is to be written.
Optionally, in the data storage system according to the embodiment of the present invention, the nonvolatile storage medium further includes an index management area, and the SMR hard disk further includes an index backup area; the index management area is used for managing and storing the cache Zone index of each cache Zone, and the index backup area is used for storing the cache Zone index of the storage Zone of the SMR hard disk;
the read-write device is further configured to: when the data storage system is started, comparing the last updating time of the cache Zone index in the index management area with the last updating time of the cache Zone index in the index backup area; and if the last updating time of the cache Zone index in the index backup area is earlier than the last updating time of the cache Zone index in the index management area, updating the index backup area according to each cache Zone index in the index management area.
Optionally, in the data storage system according to the embodiment of the present invention, the read-write device is further configured to: if the last update time of the cache Zone index in the index management area is earlier than that in the index backup area, update the cache Zone indexes in the index management area according to the cache Zone indexes in the index backup area; update the data in the cache Zone according to the data in the storage Zone; and link the updated cache Zone indexes in the index management area into the radix tree in the read-write device.
Optionally, in the data storage system according to the embodiment of the present invention, the read-write device is further configured to: when the data storage system is started, reading each cache Zone index from the nonvolatile storage medium, judging whether a new storage Zone exists according to each read cache Zone index, and if so, configuring the cache Zone index for the new storage Zone.
Optionally, in the data storage system according to the embodiment of the present invention, the preset synchronization condition includes a power-off restart of the data storage system.
In the data storage method and system provided by the embodiments of the invention, the SMR hard disk comprises a plurality of storage Zones, each corresponding to a different offset address; the nonvolatile storage medium is logically divided into a plurality of cache Zones, and each storage Zone maps a different cache Zone; the read-write device records the mapping state of the storage Zone and the cache Zone. For data to be written, the read-write device converts the address in the storage Zone at which the data is to be written into an offset address in the cache Zone, according to the mapping state of the storage Zone and the cache Zone, and uses it as the target cache offset address; the read-write device writes the data into the corresponding cache Zone according to the target cache offset address; and when the preset synchronization condition is met, the read-write device writes the data in the cache Zone into the storage Zone mapped by the cache Zone according to the mapping state of the storage Zone and the cache Zone. Because data is written to specified pages of the cache Zone (which permits out-of-order writes) and the data in the cache Zone is then written sequentially into the storage Zone of the SMR hard disk, the technical limitation that an SMR hard disk Zone can only be written sequentially is overcome at the logical level: from the perspective of an upper-layer application, the user can write data at any specified position in an SMR hard disk storage Zone, that is, random data writing is achieved. Of course, a product or method implementing the invention need not achieve all of the above advantages at the same time.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a data storage system according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a cache Zone index according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a radix tree linking cache Zone indexes according to an embodiment of the present invention;
FIG. 4 is a first flowchart illustrating a data storage method according to an embodiment of the invention;
FIG. 5 is a second flowchart illustrating a data storage method according to an embodiment of the invention;
FIG. 6 is a schematic flowchart of a data storage method according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the related storage technology of the SMR hard disk, the SMR hard disk can store data only by a sequential write method. In view of the fact that concurrent writing and out-of-order writing cannot be dealt with, an embodiment of the present invention provides a data storage system, and referring to fig. 1, the data storage system includes:
a read-write device 101, a nonvolatile storage medium 102, and a shingled magnetic recording (SMR) hard disk 103, the read-write device 101 being connected to the nonvolatile storage medium 102 and to the SMR hard disk 103;
the SMR hard disk 103 includes a plurality of storage Zones, each corresponding to a different offset address; the nonvolatile storage medium 102 is logically divided into a plurality of cache Zones, and each storage Zone maps a different cache Zone; the read-write device 101 records the mapping state of the storage Zone and the cache Zone;
the read-write device 101 is configured to, for data to be written, convert an address in the storage Zone, where the data to be written is to be written, into an offset address in the cache Zone according to the mapping state between the storage Zone and the cache Zone, where the address is used as a target cache offset address; writing the data to be written into the corresponding cache Zone according to the target cache offset address; and when a preset synchronization condition is met, writing the data in the cache Zone into the storage Zone mapped by the cache Zone according to the mapping state of the storage Zone and the cache Zone.
The read-write device 101 may be connected to the nonvolatile storage medium 102 through a memory controller, and may be connected to one or more SMR hard disks 103 through a SATA (Serial Advanced Technology Attachment) or SAS (Serial Attached Small Computer System Interface) controller. The nonvolatile storage medium 102 supports out-of-order writing. The read-write device 101 further includes a CPU, a memory, and other service-related peripheral devices.
Each storage Zone maps a different cache Zone, optionally, the storage zones and the cache zones are in a one-to-one mapping relationship, and the sizes of the storage zones and the cache zones mapped by the storage zones are equal.
For data to be written to the SMR hard disk, the read-write device 101 determines the address of the position at which the data is to be written, that is, the address to be written, and converts it into an offset address in the cache Zone, that is, the target cache offset address, according to the mapping state of the storage Zone and the cache Zone.
Because the nonvolatile storage medium 102 supports out-of-order writing, the read-write device 101 writes the data to be written into the cache Zone according to the target cache offset address. When the preset synchronization condition is met, the read-write device sequentially writes the data in the cache Zone into the storage Zone corresponding to the cache Zone. Writing data at a designated position of the SMR hard disk 103 is thus realized logically: the data is first written at the designated position in the cache Zone, and the contents of the cache Zone are then written sequentially into the storage Zone.
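The idea can be illustrated with a toy model in Python. The classes `SmrZone` and `CacheZone` and the function `sync` are hypothetical names; the model assumes that a storage Zone only accepts a write at its write pointer and that the pointer is reset before the whole Zone is rewritten, which is a simplification introduced here and not a detail stated in the text.

```python
class SmrZone:
    """Toy storage Zone that only accepts writes at its write pointer."""

    def __init__(self, num_pages):
        self.pages = [None] * num_pages
        self.write_pointer = 0

    def reset(self):
        # Assumed: rewriting a Zone starts again from its first page.
        self.write_pointer = 0

    def write_at(self, page_no, data):
        if page_no != self.write_pointer:
            raise ValueError("non-sequential write rejected")
        self.pages[page_no] = data
        self.write_pointer += 1


class CacheZone:
    """Toy cache Zone in nonvolatile memory: any page may be written."""

    def __init__(self, num_pages):
        self.pages = [None] * num_pages

    def write(self, page_no, data):
        self.pages[page_no] = data


def sync(cache_zone, smr_zone):
    """Write the cache Zone into the storage Zone in ascending page order."""
    smr_zone.reset()
    for page_no, data in enumerate(cache_zone.pages):
        smr_zone.write_at(page_no, data)
```

Writes may reach the cache Zone in any order (for example pages 3, 0, 2), while the storage Zone only ever sees pages 0, 1, 2, 3 in sequence during `sync`.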
Optionally, satisfying the preset synchronization condition includes reaching a preset write-back period. The read-write device 101 periodically writes the data in the cache Zone of the nonvolatile storage medium 102 into the corresponding storage Zone of the SMR hard disk 103 according to the preset write-back period, thereby ensuring that the data is stored normally and releasing the cache Zone in the nonvolatile storage medium 102.
In the embodiment of the invention, data is written to specified pages of the cache Zone (where out-of-order writes are permitted), and the data in the cache Zone is then written sequentially into the storage Zone corresponding to the cache Zone. Out-of-order writing to the SMR hard disk is thereby realized logically, which overcomes the technical problem that the SMR hard disk can only be written sequentially: from the perspective of an upper-layer application, the user writes data at a specified position in an SMR hard disk storage Zone, that is, random data writing is achieved. The invention is also applicable to concurrent writing, in which data is written to multiple storage Zones of the SMR hard disk.
Optionally, in the data storage system according to the embodiment of the present invention, each of the cache zones corresponds to a different cache Zone index, where the cache Zone index represents a mapping state between the cache Zone and the storage Zone, and is further used to represent a state of each page in the cache Zone, the read-write device 101 includes a radix tree, and the radix tree links the cache Zone index;
the read-write device is specifically configured to, for data to be written, convert a page offset of the data to be written into an offset address of a storage Zone, where the offset address is used as a target storage offset address; inquiring whether the radix tree contains a cache Zone index corresponding to the target storage offset address; when the radix tree contains the cache Zone index corresponding to the target storage offset address, converting the target storage offset address into an offset address in the cache Zone according to the cache Zone index corresponding to the target storage offset address, and using the offset address as a target cache offset address; determining a page offset of the target cache offset address in the cache Zone, as a target page offset, writing the data to be written into a page corresponding to the cache Zone according to the target page offset, and setting a page corresponding to the cache Zone to be dirty in a cache Zone index corresponding to the target storage offset address; and when a preset synchronization condition is met, writing the data in the cache Zone into the storage Zone mapped by the cache Zone according to the mapping state of the storage Zone and the cache Zone.
Each cache Zone has its own dedicated cache Zone index. The cache Zone index represents the correspondence between the cache Zone and the storage Zone; one cache Zone corresponds to one cache Zone index, and different cache Zones have different cache Zone indexes. Optionally, referring to FIG. 2, the cache Zone index includes: the byte offset (offset in units of bytes) of the storage Zone in the SMR hard disk, the bitmap of the pages inside the cache Zone, the memory-domain offset of the cache Zone, the last synchronization time of the cache Zone, the Zone offset (offset in units of Zones) of the storage Zone in the SMR hard disk, the last page-caching time of the cache Zone, the mapping state, and the cache index header. The cache index header of the cache Zone index includes: storage Zone information, SMR hard disk information, and the address of the first cache index.
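As an illustration, the fields listed above could be laid out as the following Python data classes. The field names, the types, and the interpretation of a set bit in the page bitmap as "dirty" are assumptions made for this sketch; the text does not fix an encoding.

```python
from dataclasses import dataclass


@dataclass
class CacheIndexHeader:
    storage_zone_info: str        # storage Zone information
    smr_disk_info: str            # SMR hard disk information
    first_index_address: int      # address of the first cache index


@dataclass
class CacheZoneIndex:
    storage_byte_offset: int      # byte offset of the storage Zone on the SMR disk
    page_bitmap: bytearray        # one bit per page inside the cache Zone
    cache_memory_offset: int      # memory-domain offset of the cache Zone
    last_sync_time: float         # last synchronization time of the cache Zone
    storage_zone_offset: int      # offset of the storage Zone, in units of Zones
    last_page_cache_time: float   # last time a page was cached
    mapped: bool                  # mapping state
    header: CacheIndexHeader

    def set_dirty(self, page_no):
        # Assumed encoding: bit value 1 means the page is dirty.
        self.page_bitmap[page_no // 8] |= 1 << (page_no % 8)

    def is_dirty(self, page_no):
        return bool(self.page_bitmap[page_no // 8] & (1 << (page_no % 8)))
```

With 256MB Zones and 4K pages, a cache Zone has 65536 pages, so a bitmap of this form would occupy 8192 bytes per index.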
As shown in FIG. 3, a radix tree for managing the cache Zone indexes is built on the super cache in the read-write device 101. If the mapping state of a cache Zone index is "mapped", a leaf of the radix tree holds a pointer to that cache Zone index in the nonvolatile storage medium 102; otherwise, the leaf pointer is NULL, which indicates that no cache Zone index is linked there. The position at which an index is linked in the radix tree is determined by the Zone-unit offset recorded in the cache Zone index.
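A small fixed-height radix tree keyed by the Zone-unit offset can be sketched as below. The fan-out of 16 and the height of 3 are arbitrary choices for the example, and the class and method names are hypothetical; `None` plays the role of the NULL leaf pointer.

```python
RADIX_BITS = 4                 # assumed: 4 key bits consumed per level
FANOUT = 1 << RADIX_BITS       # 16 slots per node
HEIGHT = 3                     # covers keys 0 .. 16**3 - 1


class RadixTree:
    """Fixed-height radix tree mapping a Zone offset to a cache Zone index."""

    def __init__(self):
        self.root = [None] * FANOUT

    def _slots(self, key):
        # Split the key into HEIGHT slot numbers, most significant first.
        return [(key >> (RADIX_BITS * level)) & (FANOUT - 1)
                for level in range(HEIGHT - 1, -1, -1)]

    def insert(self, key, value):
        if not 0 <= key < FANOUT ** HEIGHT:
            raise KeyError(key)
        node = self.root
        *inner, leaf = self._slots(key)
        for slot in inner:
            if node[slot] is None:
                node[slot] = [None] * FANOUT   # create missing interior node
            node = node[slot]
        node[leaf] = value                     # link the index at the leaf

    def lookup(self, key):
        if not 0 <= key < FANOUT ** HEIGHT:
            return None
        node = self.root
        *inner, leaf = self._slots(key)
        for slot in inner:
            node = node[slot]
            if node is None:
                return None                    # nothing linked on this path
        return node[leaf]
```

A lookup that returns `None` corresponds to a storage Zone with no linked cache Zone index.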
In the embodiment of the invention, the cache Zone and the storage Zone can be quickly accessed through the radix tree and the cache Zone index, so that the corresponding relation between the cache Zone and the storage Zone can be conveniently obtained, and the management and the storage of data are convenient.
Optionally, the SMR hard disk 103 may be a Host Managed mode SMR hard disk, the software layer of the read-write device 101 may adopt a Linux operating system, and the kernel supports the ZAC/ZBC commands for Zone management (kernels of version Linux-4.4 and above already support this command set).
In the related art, I/O (Input/Output) is performed in units of 4K pages. In the embodiment of the present invention, a file-based page offset is converted into an address offset based on a storage Zone, and an I/O operation is then initiated on the storage Zone according to the offset address. The nonvolatile storage medium 102 is divided into a plurality of cache Zones based on the storage Zone size of the SMR hard disk, and the storage Zone and the cache Zone are equal in size. For example, if the storage Zone size of the SMR hard disk is 256MB and the size of the nonvolatile storage medium 102 is 16GB, the nonvolatile storage medium 102 is divided into 63 cache Zones of 256MB each, and the remaining 256MB serves as the index management area. A mapping relationship is established between a cache Zone in the nonvolatile storage medium 102 and a storage Zone in the SMR hard disk 103; the storage Zone information, hard disk information and the like are stored in the cache index header, and the mapping relationship, state and the like are stored in the cache Zone index. A Zone space is separately reserved on the SMR hard disk for index backup and index synchronization of the SMR hard disk.
In the embodiment of the invention, data is written into a specified page of the cache Zone (out-of-order writing), and the data in the cache Zone is then written into the storage Zone of the SMR hard disk in sequence. This logically overcomes the technical limitation that data can only be written sequentially inside a storage Zone of the SMR hard disk: from the perspective of an upper-layer application, the user writes data at a specified position of the SMR hard disk storage Zone, that is, random writing of data is realized. Because the cache Zone and the storage Zone are equal in size, data is written into a specified page of the cache Zone, and the data in the cache Zone is written into the storage Zone corresponding to the cache Zone, sequential writing of data within each storage Zone space of the SMR hard disk can be realized, and concurrent writing can be accommodated.
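The arithmetic of the 16GB example, and one possible form of the page-offset conversion, can be sketched as follows. The function name and the flat linear layout (page offset multiplied by the page size, then split by the Zone size) are assumptions made for illustration only; the embodiment does not specify the exact conversion.

```python
PAGE_SIZE = 4 * 1024              # 4K page
ZONE_SIZE = 256 * 1024 * 1024     # 256MB storage Zone and cache Zone
MEDIUM_SIZE = 16 * 1024 ** 3      # 16GB nonvolatile storage medium

# 16GB / 256MB = 64 regions: 63 cache Zones plus one index management area.
total_regions = MEDIUM_SIZE // ZONE_SIZE
cache_zone_count = total_regions - 1


def page_to_storage_offset(page_offset):
    """Hypothetical conversion of a page offset into
    (storage Zone number, byte offset inside that Zone)."""
    byte_offset = page_offset * PAGE_SIZE
    return byte_offset // ZONE_SIZE, byte_offset % ZONE_SIZE
```

With these values, page 65536 is the first page of the second Zone, because one 256MB Zone holds 65536 pages of 4K each.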
Optionally, in the data storage system according to the embodiment of the present invention, the read-write device 101 is further configured to determine whether a cache Zone not mapped to any storage Zone exists in the nonvolatile storage medium 102 when the radix tree does not include the cache Zone index corresponding to the target storage offset address; if such an unmapped cache Zone exists, select one cache Zone from the unmapped cache Zones as the cache Zone mapped by the storage Zone to which the data to be written is to be written; and allocate a cache Zone index for the cache Zone mapped by the storage Zone to which the data to be written is to be written, configure the mapping state of the cache Zone index, and set each page of the cache Zone index to clean.
The read-write device sets each page in the selected cache Zone index to clean, sets the offset of the storage Zone in the cache Zone index corresponding to the selected cache Zone according to the address of the storage Zone to which the data to be written is to be written, and sets the cache index header and the like. After obtaining the cache Zone index representing the correspondence between the cache Zone and the storage Zone to which the data to be written is to be written, the read-write device 101 determines the page offset of the target cache offset address in the cache Zone according to the cache Zone index, writes the data to be written into the cache Zone according to the page offset, and sets the written page to dirty in the cache Zone index. When the preset synchronization condition is met, the read-write device 101 writes the data in the cache Zone into the corresponding storage Zone according to the cache Zone index.
In the embodiment of the present invention, when the storage Zone to which the data to be written is written does not establish a corresponding relationship with the cache Zone, one cache Zone is selected from the idle cache zones, the corresponding relationship between the selected cache Zone and the storage Zone to which the data to be written is established, and the cache Zone index is configured, so that the use of the newly added storage Zone can be ensured.
Optionally, the read-write device 101 is further configured to select a cache Zone index to be recovered according to a preset selection rule if no cache Zone unmapped to a storage Zone exists; write the data in the cache Zone corresponding to the to-be-recovered cache Zone index into the storage Zone corresponding to the to-be-recovered cache Zone index, and take the to-be-recovered cache Zone index as a recovered cache Zone index; and take the cache Zone corresponding to the recovered cache Zone index as the cache Zone mapped by the storage Zone to which the data to be written is to be written, set each page of the recovered cache Zone index to clean, and configure the mapping state of the recovered cache Zone index so that the storage Zone to which the data to be written is to be written is mapped to the cache Zone corresponding to the recovered cache Zone index.
The preset selection rule is a rule for selecting the cache Zone index to be recovered, for example, selecting the cache Zone index with the lowest average access frequency per unit time, or selecting the least recently accessed cache Zone index through an LRU (Least Recently Used) algorithm, and the like. The cache Zone index selected according to the preset selection rule is taken as the cache Zone index to be recovered.
Optionally, in the data storage system according to the embodiment of the present invention, the read-write device 101 writing the data in the cache Zone corresponding to the to-be-recovered cache Zone index into the storage Zone corresponding to the to-be-recovered cache Zone index includes:
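The two example rules can be sketched as simple selections over per-index statistics. The function names and the dictionary-based bookkeeping are hypothetical and serve only to illustrate the rules; the embodiment does not prescribe how access statistics are recorded.

```python
def select_least_recently_used(last_access_times):
    """LRU rule: pick the cache Zone index with the oldest access time.

    last_access_times maps a cache Zone index id to its last access time.
    """
    return min(last_access_times, key=last_access_times.get)


def select_lowest_frequency(access_counts, elapsed_seconds):
    """Frequency rule: pick the index with the lowest average number of
    accesses per unit time over the observed interval."""
    return min(access_counts,
               key=lambda index_id: access_counts[index_id] / elapsed_seconds)
```

Either function returns the identifier of the cache Zone index to be recovered; which rule suits a given workload is a design choice left open by the description.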
the read-write equipment 101 determines whether a clean page exists in the cache Zone corresponding to the cache Zone index to be recovered according to the cache Zone index to be recovered;
if the cache Zone corresponding to the cache Zone index to be recovered has a clean page, acquiring an address of each clean page, reading data corresponding to each clean page from the storage Zone corresponding to the cache Zone index to be recovered according to the address of each clean page, writing the data into the clean page of the cache Zone, and writing all the data in the cache Zone corresponding to the cache Zone index to be recovered into the storage Zone corresponding to the cache Zone index to be recovered;
and if no clean pages exist in the cache Zone corresponding to the cache Zone index to be recovered, writing all data in the cache Zone corresponding to the cache Zone index to be recovered into the storage Zone corresponding to the cache Zone index to be recovered.
Optionally, in the data storage system according to the embodiment of the present invention, the read-write device 101 writing the data in the cache Zone corresponding to the to-be-recovered cache Zone index into the storage Zone corresponding to the to-be-recovered cache Zone index includes:
the read-write equipment 101 determines a first dirty page and a last dirty page in the cache Zone corresponding to the cache Zone index to be recovered according to the cache Zone index to be recovered, and judges whether a clean page exists between the first dirty page and the last dirty page;
if a clean page exists between the first dirty page and the last dirty page, acquiring addresses of the clean pages between the first dirty page and the last dirty page, reading data corresponding to the clean pages from a storage Zone corresponding to the cache Zone index to be recovered according to the addresses of the clean pages, writing the data into the clean pages corresponding to the cache Zone, and sequentially writing the data between the first dirty page and the last dirty page in the cache Zone corresponding to the cache Zone index to be recovered into the storage Zone corresponding to the cache Zone index to be recovered;
and if no clean page exists between the first dirty page and the last dirty page, writing continuous data between the first dirty page and the last dirty page in the cache Zone corresponding to the cache Zone index to be recovered into the storage Zone corresponding to the cache Zone index to be recovered.
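The span-based write-back described in the steps above can be sketched as follows. This is a simplified in-memory model under stated assumptions: Zones are plain Python lists of page payloads, the function name is hypothetical, and the write-pointer handling of a real SMR hard disk is not modelled.

```python
def write_back_dirty_span(dirty, cache_pages, storage_pages):
    """Write the span from the first to the last dirty page back to storage.

    dirty: list of bool, one entry per page (True means dirty).
    cache_pages / storage_pages: lists of page payloads of equal length.
    Returns (first, last) of the written span, or None if nothing is dirty.
    """
    dirty_pages = [i for i, is_dirty in enumerate(dirty) if is_dirty]
    if not dirty_pages:
        return None
    first, last = dirty_pages[0], dirty_pages[-1]
    # Fill every clean page inside the span from the storage Zone so that
    # the span in the cache Zone becomes one continuous run of valid data.
    for page in range(first, last + 1):
        if not dirty[page]:
            cache_pages[page] = storage_pages[page]
    # Write the continuous span to the storage Zone in page order.
    storage_pages[first:last + 1] = cache_pages[first:last + 1]
    return first, last
```

Filling the clean pages first is what allows the span to be written as one sequential run instead of several scattered writes.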
Optionally, after writing the data in the cache Zone corresponding to the to-be-recovered cache Zone index into the storage Zone corresponding to the to-be-recovered cache Zone index, the read-write device 101 is further configured to update the access time of the to-be-recovered cache Zone index in the index management area and the index backup area.
And after the data recovery of the cache Zone corresponding to the cache Zone index to be recovered is completed, taking the cache Zone index to be recovered as the recovered cache Zone index. The read-write device 101 uses the recovered cache Zone index as a cache Zone index of a cache Zone corresponding to a storage Zone to which data to be written is to be written, updates a mapping relation represented by the recovered cache index, and sets each page in the cache Zone index clean.
The read-write equipment 101 determines the page offset of the target cache offset address in the cache Zone according to the updated recycled cache Zone index, and writes the data to be written into the cache Zone according to the page offset; setting the corresponding page dirty according to the page offset; when the preset synchronization condition is satisfied, the read-write device 101 writes the data in the cache Zone into the corresponding storage Zone according to the updated cache Zone index.
In the data I/O process, since the number of cache Zone indexes is limited, the cache Zone indexes may run out, and therefore the cache Zone indexes and the cache Zones need to be recovered. The data in the cache Zone is recovered first: the span of dirty pages is found, and if a clean page exists within the span, the data corresponding to that clean page is read from the storage Zone of the SMR hard disk, after which the whole span is written continuously into the storage Zone of the SMR hard disk. The cache Zone index to be recovered can be selected based on an LRU algorithm; the recovered cache Zone index is obtained after the data is synchronized to the storage Zone of the SMR hard disk, and the recovered cache Zone index can then be used for writing to another, new storage Zone. The read-write device clears the storage Zone parameters in the cache Zone index, sets each page in the new cache Zone index to clean, sets the offset of the storage Zone in the target cache Zone index according to the address of the storage Zone to which the data to be written is to be written, and sets the cache index header and the like.
In the embodiment of the present invention, the cache Zone index and the cache Zone are recovered, and the mapping relationship between the recovered cache Zone and the storage Zone to which data is to be written is established by configuring the cache Zone index, so that the case that the number of the cache Zone indexes is limited can be dealt with, and the use of the newly added storage Zone is ensured.
Optionally, the nonvolatile storage medium 102 further includes an index management area, and the SMR hard disk 103 further includes an index backup area;
the index management area is used for managing and storing the cache Zone index of each cache Zone, and the index backup area is used for storing the cache Zone index of the storage Zone of the SMR hard disk 103;
the read-write device 101 is further configured to, when the data storage system is started, compare the last update time of the cache Zone index in the index management area with the last update time of the cache Zone index in the index backup area, and update the index backup area according to each cache Zone index in the index management area if the last update time of the cache Zone index in the index backup area is earlier than the last update time of the cache Zone index in the index management area.
When the last update time of the cache Zone index in the index backup area is earlier than the last update time of the cache Zone index in the index management area, which indicates that the data in the index management area is newer, the read-write device 101 updates each cache Zone index in the index backup area according to each cache Zone index in the index management area, and synchronizes the data in each cache Zone to the storage Zone. The read-write device 101 links the cache Zone index to the radix tree of the super block according to the mapping state of the cache Zone index in the index management area and the offset of the cache Zone index in units of Zone.
For example, if the mapping state of a cache Zone index is mapped and the offset of the cache Zone index in units of Zones is 3, the cache Zone index is attached to the leaf pointer at position 3 of the radix tree. If the mapping state of a cache Zone index is unmapped and the offset of the cache Zone index in units of Zones is 4, the leaf pointer at position 4 of the radix tree is NULL.
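The attachment rule in this example can be illustrated with a flat list standing in for the leaves of the radix tree. The list representation, the function name and the dictionary keys are assumptions made for brevity; a real radix tree is a multi-level structure and is not reproduced here.

```python
def attach_indexes(cache_zone_indexes, leaf_count):
    """Return a list of leaf pointers; None plays the role of NULL.

    Each cache Zone index is a dict with the hypothetical keys
    'zone_offset' (offset in units of Zones) and 'mapped' (mapping state).
    """
    leaves = [None] * leaf_count
    for index in cache_zone_indexes:
        if index["mapped"]:
            # The leaf position is the offset in units of Zones.
            leaves[index["zone_offset"]] = index
    return leaves


leaves = attach_indexes(
    [{"zone_offset": 3, "mapped": True},
     {"zone_offset": 4, "mapped": False}],
    leaf_count=8,
)
```

Here `leaves[3]` refers to the mapped cache Zone index, while `leaves[4]` stays `None` because that index is unmapped.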
In the embodiment of the present invention, when the last update time of the cache Zone index in the index backup area is earlier than the last update time of the cache Zone index in the index management area, the cache Zone index in the radix tree is configured according to each cache Zone index in the index management area, so that it can be ensured that the cache Zone index is the latest cache Zone index, and the stored latest data is ensured.
Optionally, the read-write device 101 is further configured to, if the last update time of the cache Zone index in the index management area is earlier than the last update time of the cache Zone index in the index backup area, update the cache Zone index in the index management area according to each cache Zone index in the index backup area, and update the data in the cache Zone according to the data in the storage Zone, where the updated cache Zone index in the index management area is hooked in a radix tree in the read-write device.
When the last update time of the cache Zone index in the index management area is earlier than the last update time of the cache Zone index in the index backup area, it indicates that the data in the index backup area is newer, the read-write device 101 synchronizes each cache Zone index in the index backup area into the index management area, reads each dirty page from the storage Zone and stores the dirty page into the cache Zone according to the bitmap of the cache Zone index, and the read-write device 101 hangs the cache Zone index onto the radix tree of the superblock according to the mapping state of the cache Zone index and the offset of the cache Zone index in units of Zone.
In the embodiment of the present invention, when the last update time of the cache Zone index in the index management area is earlier than the last update time of the cache Zone index in the index backup area, each cache Zone index in the index backup area is synchronized into the index management area, and the cache Zone indexes in the radix tree are configured according to each cache Zone index in the index management area, so that it can be ensured that the cache Zone indexes are the latest cache Zone indexes, and data in the storage zones are synchronized to the cache zones according to the cache Zone indexes, so that it can be ensured that the data in the nonvolatile storage medium is the latest data.
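The start-up comparison of the two last update times can be sketched as a small decision function. The function name and the returned labels are hypothetical, and the behavior when both times are equal is an assumption, since the description only covers the two unequal cases.

```python
def choose_sync_direction(management_update_time, backup_update_time):
    """Decide which copy of the cache Zone indexes is newer at start-up."""
    if backup_update_time < management_update_time:
        # Index management area is newer: update the index backup area
        # and synchronize cache Zone data to the storage Zones.
        return "management_to_backup"
    if management_update_time < backup_update_time:
        # Index backup area is newer: update the index management area
        # and refresh the cache Zones from the storage Zones.
        return "backup_to_management"
    # Assumption: equal times mean no synchronization is needed.
    return "in_sync"
```

In either unequal case, the cache Zone indexes that result are then attached to the radix tree according to their mapping state and Zone offset.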
Optionally, the read-write device 101 is further configured to, when the data storage system is started, read each of the cache Zone indexes from the nonvolatile storage medium 102, determine whether a new storage Zone exists according to each of the read cache Zone indexes, and if a new storage Zone exists, configure a cache Zone index for the new storage Zone.
The read-write device 101 reads the cache Zone indexes in the nonvolatile storage medium 102 and at the same time obtains the information of each storage Zone, and allocates a new cache Zone index to each storage Zone of a newly connected SMR hard disk. When the storage Zone to which data to be written is to be written is a storage Zone of the newly connected SMR hard disk, the read-write device 101 sets the mapping state and parameters of the uninitialized cache Zone index and attaches the cache Zone index to the radix tree.
In the embodiment of the present invention, a cache Zone index is configured for the new storage Zone, so as to ensure normal use of the new storage Zone.
Optionally, in the data storage system according to the embodiment of the present invention, the preset synchronization condition includes a power-off restart of the data storage system.
The nonvolatile storage medium 102 serves as a cache between the memory of the data storage system and the SMR hard disk 103. Optionally, the nonvolatile storage medium is an Optane memory, which has the characteristics of high bandwidth, low latency, high quality of service and long service life, so that the caching efficiency of the data storage system and the reliability of the cached data can be improved. The nonvolatile storage medium may consist of one or more Optane memories, which are divided into a plurality of cache Zones. For each cache Zone, the size of the cache Zone may be equal to the size of the storage Zone to which the cache Zone is mapped.
In order to increase the reliability of data storage, a nonvolatile storage medium is used as a cache between the memory of the data storage system and the SMR hard disk in the embodiment of the present invention. When the data storage system according to the embodiment of the present invention is powered off and restarted, the read-write device 101 synchronizes the data in the cache Zone and the storage Zone mapped by the cache Zone for each cache Zone according to the correspondence between the storage Zone and the cache Zone.
In the embodiment of the invention, the data is cached in the cache Zone of the nonvolatile storage medium, so the data in the nonvolatile storage medium is not lost after an unexpected power failure, and the data in the nonvolatile storage medium and the storage Zone of the SMR hard disk are synchronized after the power-off restart, which reduces data loss caused by power failure and improves the reliability of data storage. In addition, different storage Zones in the SMR hard disk are dynamically mapped to cache Zones as data is written, which facilitates data management; the read-write device can store the data in a cache Zone directly into the storage Zone through the correspondence between the storage Zone and the cache Zone, so that storage is fast and efficient.
An embodiment of the present invention further provides a data storage method, which is applied to a data storage system. The data storage system includes a read-write device, a nonvolatile storage medium, and a shingled magnetic recording (SMR) hard disk, where the read-write device is connected to the nonvolatile storage medium and to the SMR hard disk; the SMR hard disk includes a plurality of storage Zones, and each storage Zone corresponds to a different offset address; the nonvolatile storage medium is logically divided into a plurality of cache Zones, and each cache Zone maps a different storage Zone; the read-write device records the mapping state between the storage Zone and the cache Zone, and the method includes:
S401, for the data to be written, according to the mapping state between the storage Zone and the cache Zone, the read-write device converts the address in the storage Zone, where the data to be written is to be written, into an offset address in the cache Zone, which is used as a target cache offset address;
S402, the read-write device writes the data to be written into the corresponding cache Zone according to the target cache offset address;
S403, when a preset synchronization condition is satisfied, the read-write device writes the data in the cache Zone into the storage Zone mapped by the cache Zone according to the mapping state between the storage Zone and the cache Zone.
In the embodiment of the invention, data is written into a specified page of the cache Zone (the data can be written out of order), and the data in the cache Zone is written into the storage Zone corresponding to the cache Zone in sequence, so that out-of-order writing to the SMR hard disk is realized logically and the technical problem that the SMR hard disk can only be written sequentially is solved: from the perspective of an upper-layer application, the user writes data at a specified position of the SMR hard disk storage Zone, that is, random writing of data is realized. The embodiment can also be applied to concurrent writing, in which data is written to each storage Zone of the SMR hard disk.
Optionally, in the data storage method according to the embodiment of the present invention, the preset synchronization condition includes a power-off restart of the data storage system.
In the embodiment of the invention, the data is cached in the cache Zone of the nonvolatile storage medium, so the data in the nonvolatile storage medium is not lost after an unexpected power failure, and the data in the nonvolatile storage medium and the SMR hard disk are synchronized after the power-off restart, which reduces data loss caused by power failure and improves the reliability of data storage. In addition, different storage Zones in the hard disk are dynamically mapped to cache Zones as data is written, which facilitates data management; the read-write device can store the data in a cache Zone directly into the storage Zone through the correspondence between the storage Zone and the cache Zone, so that storage is fast and efficient.
Optionally, in the data storage method according to the embodiment of the present invention, each of the cache zones corresponds to a different cache Zone index, where the cache Zone index represents a mapping state between the cache Zone and the storage Zone, and the cache Zone index is further used to represent a state of each page in the cache Zone, where the read-write device includes a radix tree, and the radix tree links the cache Zone index;
for the data to be written, the read-write device converting the address in the storage Zone, where the data to be written is to be written, into an offset address in the cache Zone as a target cache offset address according to the mapping state between the storage Zone and the cache Zone includes:
for data to be written, converting the page offset of the data to be written into an offset address of a storage Zone by the read-write equipment, and taking the offset address as a target storage offset address;
the read-write equipment inquires whether the radix tree contains a cache Zone index corresponding to the target storage offset address;
when the radix tree includes the cache Zone index corresponding to the target storage offset address, the read-write device converts the target storage offset address into an offset address in the cache Zone according to the cache Zone index corresponding to the target storage offset address, and uses the offset address as a target cache offset address;
the read-write device writing the data to be written into the corresponding cache Zone according to the target cache offset address includes:
the read-write device determines the page offset of the target cache offset address in the cache Zone as a target page offset, writes the data to be written into the corresponding page of the cache Zone according to the target page offset, and sets the corresponding page of the cache Zone to dirty in the cache Zone index corresponding to the target storage offset address.
Optionally, after the read-write device queries whether the radix tree includes the cache Zone index corresponding to the target storage offset address, the method further includes:
step one, when the radix tree does not contain the cache Zone index corresponding to the target storage offset address, the read-write equipment judges whether a cache Zone of an unmapped storage Zone exists in the nonvolatile storage medium;
step two, if a cache Zone of the unmapped storage Zone exists, the read-write device selects one cache Zone from the cache Zone of the unmapped storage Zone as the cache Zone mapped by the storage Zone into which the data to be written is to be written;
step three, the read-write device allocates a cache Zone index for the cache Zone mapped by the storage Zone to which the data to be written is to be written, configures the mapping state of the cache Zone index, and sets each page of the cache Zone index to clean.
In the embodiment of the present invention, when the storage Zone to which the data to be written is written does not establish a corresponding relationship with the cache Zone, one cache Zone is selected from the idle cache zones, the corresponding relationship between the selected cache Zone and the storage Zone to which the data to be written is established, and the corresponding relationship is configured in the cache Zone index parameter, so that the use of the newly added storage Zone can be ensured.
Optionally, after the read/write device determines whether the cache Zone of the unmapped storage Zone exists in the nonvolatile storage medium, the method further includes:
step one, if the cache Zone which is not mapped with the storage Zone does not exist, the read-write equipment selects a cache Zone index to be recovered according to a preset selection rule;
step two, the read-write equipment writes the data in the cache Zone corresponding to the cache Zone index to be recovered into the storage Zone corresponding to the cache Zone index to be recovered, and takes the cache Zone index to be recovered as the recovered cache Zone index;
step three, the read-write device takes the cache Zone corresponding to the recovered cache Zone index as the cache Zone mapped by the storage Zone to which the data to be written is to be written, sets each page of the recovered cache Zone index to clean, and configures the mapping state of the recovered cache Zone index so that the storage Zone to which the data to be written is to be written is mapped to the cache Zone corresponding to the recovered cache Zone index.
Optionally, writing the data in the cache Zone corresponding to the to-be-recovered cache Zone index into the storage Zone corresponding to the to-be-recovered cache Zone index includes:
the read-write equipment determines a first dirty page and a last dirty page in a cache Zone corresponding to the cache Zone index to be recovered according to the cache Zone index to be recovered, and judges whether a clean page exists between the first dirty page and the last dirty page;
if a clean page exists between the first dirty page and the last dirty page, acquiring addresses of the clean pages between the first dirty page and the last dirty page, reading data corresponding to the clean pages from a storage Zone corresponding to the cache Zone index to be recovered according to the addresses of the clean pages, writing the data into the clean pages corresponding to the cache Zone, and sequentially writing the data between the first dirty page and the last dirty page in the cache Zone corresponding to the cache Zone index to be recovered into the storage Zone corresponding to the cache Zone index to be recovered;
and if no clean page exists between the first dirty page and the last dirty page, writing continuous data between the first dirty page and the last dirty page in the cache Zone corresponding to the cache Zone index to be recovered into the storage Zone corresponding to the cache Zone index to be recovered.
In the embodiment of the present invention, the cache Zone index and the cache Zone are recovered, the correspondence between the recovered cache Zone and the storage Zone of the data to be newly written is established, and the cache Zone index is configured, so that the case that the number of the cache Zone indexes is limited can be dealt with, and the use of the newly added storage Zone is ensured.
Optionally, as shown in fig. 5, in the data storage method according to the embodiment of the present invention, for the data to be written, the storing, by the read-write device, the data to be written into the cache Zone corresponding to the storage Zone into which the data to be written is to be written according to the cache Zone index includes:
S501, for the data to be written, converting the page offset of the data to be written into an offset address of a storage Zone as a target storage offset address.
S502, querying whether the radix tree includes the cache Zone index corresponding to the target storage offset address, if the radix tree includes the cache Zone index corresponding to the target storage offset address, performing S509, and if the radix tree does not include the cache Zone index corresponding to the target storage offset address, performing S503.
S503, determining whether there is a cache Zone not corresponding to the storage Zone in the nonvolatile storage medium, if there is a cache Zone not corresponding to the storage Zone, executing S507, and if there is no cache Zone not corresponding to the storage Zone, executing S504.
S504, according to a preset selection rule, selecting a cache Zone index needing to be recycled.
S505, the read-write device finds the cache Zone index to be recovered, and writes the data in the cache Zone corresponding to the cache Zone index to be recovered into the storage Zone corresponding to that cache Zone index.
The read-write device determines the first dirty page and the last dirty page in the cache Zone corresponding to the cache Zone index to be recovered according to that cache Zone index, and judges whether a clean page exists between the first dirty page and the last dirty page;
if a clean page exists between the first dirty page and the last dirty page, the read-write device acquires the address of each clean page between the first dirty page and the last dirty page, reads the data corresponding to each clean page from the storage Zone corresponding to the cache Zone index to be recovered according to the address of each clean page, stores the data into the corresponding clean page of the cache Zone, and then sequentially writes the continuous data between the first dirty page and the last dirty page in the cache Zone into the storage Zone corresponding to the cache Zone index to be recovered;
and if no clean page exists between the first dirty page and the last dirty page, the read-write device sequentially writes the continuous data between the first dirty page and the last dirty page in the cache Zone into the storage Zone corresponding to the cache Zone index to be recovered, which completes the recovery.
S506, the read-write device selects the recovered cache Zone index as the cache Zone index of the cache Zone corresponding to the storage Zone to which the data to be written is to be written, sets each page corresponding to the recovered cache Zone index clean, and updates the cache Zone index.
The read-write device clears the storage Zone parameters in the recovered cache Zone index, sets each page in that index clean, sets the offset and mapping relation of the storage Zone in that index according to the address of the storage Zone to which the data to be written is to be written, and sets the cache index head and the like.
S507, selecting one cache Zone from the cache Zones that do not correspond to any storage Zone as the cache Zone corresponding to the storage Zone to which the data to be written is to be written.
S508, the read-write device clears the parameters of the cache Zone index of the cache Zone corresponding to the storage Zone to which the data to be written is to be written, and configures the cache Zone index.
The read-write device sets each page in the selected cache Zone index clean, sets the offset of the storage Zone in the cache Zone index of the selected cache Zone according to the address of the storage Zone to which the data to be written is to be written, and sets the cache index head and the like.
S509, according to the cache Zone index corresponding to the target storage offset address, the target storage offset address is converted into an offset address in the cache Zone, which is used as the target cache offset address.
S510, determining a page offset of a target cache offset address in a cache Zone as a target page offset, writing data to be written into the cache Zone according to the target page offset as required (out of order), and setting a corresponding page in a cache Zone index to be dirty according to the target page offset.
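Steps S502 to S510 can be illustrated with a minimal sketch. It is not the patented implementation: a Python dict stands in for the radix tree, a bytearray stands in for the nonvolatile storage medium, and the names `WriteCache`, `CacheZoneIndex`, `PAGE_SIZE` and `PAGES_PER_ZONE`, as well as the sizes and the "first mapped index" selection rule, are assumptions chosen only for illustration. The write-back of S505 is deliberately omitted here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

PAGE_SIZE = 4096        # assumed page size in bytes
PAGES_PER_ZONE = 8      # assumed; kept tiny for illustration
ZONE_SIZE = PAGE_SIZE * PAGES_PER_ZONE


@dataclass
class CacheZoneIndex:
    cache_zone_no: int                      # cache Zone described by this index
    storage_zone_no: Optional[int] = None   # mapped storage Zone, None if unmapped
    dirty: List[bool] = field(default_factory=lambda: [False] * PAGES_PER_ZONE)


class WriteCache:
    def __init__(self, num_cache_zones: int) -> None:
        self.indexes = [CacheZoneIndex(i) for i in range(num_cache_zones)]
        # Stand-in for the nonvolatile storage medium.
        self.cache = bytearray(num_cache_zones * ZONE_SIZE)
        # Stand-in for the radix tree: storage Zone number -> cache Zone index.
        self.tree: Dict[int, CacheZoneIndex] = {}

    def _reclaim(self) -> CacheZoneIndex:
        # S504: hypothetical selection rule, the first index hooked into the tree.
        victim = next(iter(self.tree.values()))
        # S505 (writing the victim's data back to its storage Zone) is omitted.
        del self.tree[victim.storage_zone_no]
        victim.storage_zone_no = None
        return victim

    def _acquire(self, storage_zone_no: int) -> CacheZoneIndex:
        idx = self.tree.get(storage_zone_no)            # S502: query the tree
        if idx is not None:
            return idx
        # S503/S507: look for a cache Zone that maps no storage Zone.
        idx = next((i for i in self.indexes if i.storage_zone_no is None), None)
        if idx is None:
            idx = self._reclaim()                       # S504 to S506
        # S506/S508: clear the index, set the mapping, hook it into the tree.
        idx.storage_zone_no = storage_zone_no
        idx.dirty = [False] * PAGES_PER_ZONE
        self.tree[storage_zone_no] = idx
        return idx

    def write(self, storage_offset: int, data: bytes) -> int:
        storage_zone_no, in_zone = divmod(storage_offset, ZONE_SIZE)
        if in_zone + len(data) > ZONE_SIZE:
            raise ValueError("write crosses a Zone boundary")
        idx = self._acquire(storage_zone_no)
        # S509: translate the storage offset into a cache offset.
        cache_offset = idx.cache_zone_no * ZONE_SIZE + in_zone
        # S510: write in place (any order) and mark the touched pages dirty.
        self.cache[cache_offset:cache_offset + len(data)] = data
        if data:
            first = in_zone // PAGE_SIZE
            last = (in_zone + len(data) - 1) // PAGE_SIZE
            for page in range(first, last + 1):
                idx.dirty[page] = True
        return cache_offset
```

In this sketch, writes to the same storage Zone may arrive in any order; only the dirty flags record which pages must later be synchronized to the SMR hard disk.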
In the embodiment of the invention, a specific method for writing data to be written into the cache Zone is provided. The method ensures that a newly added storage Zone can be used and supports writing out-of-order data into the cache Zone; the data is later synchronized from the cache Zone to the storage Zone in order, which overcomes the limitation that an SMR hard disk can only be written sequentially. Through the radix tree and the cache Zone indexes, the cache Zones can be quickly accessed and managed together with their correspondence to the storage Zones, so that data can be better managed and stored.
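The recovery write-back described for S505 (fill the clean pages that lie between the first and last dirty page from the storage Zone, then write the whole run sequentially) can be sketched as follows. This is only an illustration under stated assumptions: pages are modeled as list elements, and `read_storage_page` and `write_storage_run` are hypothetical callbacks standing in for SMR hard disk I/O.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def write_back(
    cache_pages: List[bytes],
    dirty: List[bool],
    read_storage_page: Callable[[int], bytes],
    write_storage_run: Callable[[int, Sequence[bytes]], None],
) -> Optional[Tuple[int, int]]:
    """Flush one cache Zone; return (first, last) page written, or None."""
    dirty_pages = [i for i, is_dirty in enumerate(dirty) if is_dirty]
    if not dirty_pages:
        return None                      # nothing to synchronize
    first, last = dirty_pages[0], dirty_pages[-1]

    # Fill every clean page inside the run from the storage Zone so that
    # the run [first, last] is contiguous and complete.
    for page in range(first, last + 1):
        if not dirty[page]:
            cache_pages[page] = read_storage_page(page)

    # One sequential write of the contiguous run.
    write_storage_run(first, cache_pages[first:last + 1])

    # After synchronization every page of the run is clean again.
    for page in range(first, last + 1):
        dirty[page] = False
    return first, last
```

With dirty pages 1 and 3 and a clean page 2, the sketch reads page 2 from the storage Zone and then issues a single write covering pages 1 to 3.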
Optionally, the nonvolatile storage medium further includes an index management area, and the SMR hard disk further includes an index backup area; the index management area is used for managing and storing the cache Zone index of each cache Zone, and the index backup area is used for storing the cache Zone index of the storage Zone of the SMR hard disk; the method further comprises the following steps:
when the data storage system is started, the read-write equipment compares the last updating time of the cache Zone index in the index management area with the last updating time of the cache Zone index in the index backup area;
and if the last update time of the cache Zone index in the index backup area is earlier than the last update time of the cache Zone index in the index management area, the read-write equipment updates the index backup area according to each cache Zone index in the index management area.
In the embodiment of the present invention, when the last update time of the cache Zone index in the index backup area is earlier than the last update time of the cache Zone index in the index management area, the cache Zone index in the radix tree is configured according to each cache Zone index in the index management area, so that the cache Zone index in use is the latest one and the most recently stored data is preserved.
Optionally, after the read-write device compares the last update time of the cache Zone index in the index management area with the last update time of the cache Zone index in the index backup area, the method further includes:
if the last update time of the cache Zone index in the index management area is earlier than the last update time of the cache Zone index in the index backup area, the read-write equipment updates the cache Zone index in the index management area according to each cache Zone index in the index backup area;
the read-write equipment updates the data in the cache Zone according to the data in the storage Zone;
and hooking the updated cache Zone index in the index management area in the radix tree in the read-write equipment.
In the embodiment of the present invention, when the last update time of the cache Zone index in the index management area is earlier than the last update time of the cache Zone index in the index backup area, each cache Zone index in the index backup area is synchronized into the index management area, and the cache Zone index in the radix tree is configured according to each cache Zone index in the index management area, so that it can be ensured that the cache Zone index is the latest cache Zone index.
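The start-up comparison of the two index copies can be sketched as below. The `IndexArea` structure, its field names and the returned strings are assumptions for illustration; the patent only specifies that the older copy is updated from the newer one according to the last update time.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class IndexArea:
    last_update: float                                   # last update time
    indexes: Dict[int, dict] = field(default_factory=dict)


def sync_on_startup(management: IndexArea, backup: IndexArea) -> str:
    """Bring the older index copy up to date; report which side changed."""
    if backup.last_update < management.last_update:
        # The backup area on the SMR hard disk is stale: refresh it.
        backup.indexes = copy.deepcopy(management.indexes)
        backup.last_update = management.last_update
        return "backup_updated"
    if management.last_update < backup.last_update:
        # The management area on the nonvolatile medium is stale: refresh it.
        # The caller would then reload the cache Zone data from the storage
        # Zones and hook the updated indexes into the radix tree.
        management.indexes = copy.deepcopy(backup.indexes)
        management.last_update = backup.last_update
        return "management_updated"
    return "in_sync"
```

A deep copy is used so that later changes to one area do not silently alter the other.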
Optionally, before the read-write device configures the cache Zone indexes in the radix tree according to the cache Zone indexes in the index management area, the method further includes:
when the data storage system is started, the read-write device reads each cache Zone index from the nonvolatile storage medium, determines whether a new storage Zone exists according to each read cache Zone index, and configures a cache Zone index for the new storage Zone if the new storage Zone exists.
In the embodiment of the present invention, a cache Zone index is allocated for the new storage Zone, so as to ensure normal use of the new storage Zone.
In the data storage method according to the embodiment of the present invention, when the data storage system is started, the superblock and the storage Zone need to be bound, that is, hooking the cache Zone index in the radix tree is completed, and optionally, as shown in fig. 6, the method includes the following steps:
S601, checking and setting the cache address of each cache Zone index in the nonvolatile storage medium.
The read-write equipment checks and sets the cache addresses of the cache Zone indexes in the nonvolatile storage medium, and obtains the cache addresses of the cache Zone indexes in the nonvolatile storage medium.
S602, obtaining the cache index head in the nonvolatile storage medium, and checking the hard disk information to judge whether the hard disk is a newly accessed hard disk.
The read-write equipment acquires each cache index head in the nonvolatile storage medium, judges whether the hard disk is a newly accessed hard disk according to the hard disk information, and executes S607 if the hard disk is the newly accessed hard disk, and executes S603 if the SMR hard disk is not the newly accessed hard disk.
S603, judging whether the backup index of the index backup area is newer.
The read-write device compares the last update time of the cache Zone index in the index management area with the last update time of the cache Zone index in the index backup area, and determines whether the cache Zone index of the index backup area is newer, if the cache Zone index of the index backup area is newer, S604 is executed, and if the cache Zone index of the index management area is newer, S606 is executed.
S604, synchronizing the cache Zone indexes in the index backup area to the index management area.
The read-write device copies each cache Zone index in the index backup area to the index management area.
S605, reading the dirty pages into the cache Zone according to the bitmap in the cache Zone index.
The read-write device determines each dirty page according to the per-page bitmap in the cache Zone index, reads the data of each dirty page from the storage Zone, and stores the data in the cache Zone, wherein the cache Zone and the storage Zone are equal in size.
S606, hooking the cache Zone index onto the radix tree of the superblock according to the mapping state of the cache Zone index and the Zone offset.
The read-write device hooks the cache Zone index onto the radix tree of the superblock according to the mapping state of the cache Zone index and its offset in units of Zones.
S607, establishing a cache index head for the storage Zone of the newly accessed hard disk, and initializing the parameter of the cache index head according to the parameters of the newly accessed hard disk and the storage Zone.
An uninitialized cache Zone index is configured for the storage Zone of the newly accessed SMR hard disk. When the storage Zone to which data is to be written is a storage Zone of the newly accessed SMR hard disk, the mapping state and parameters of the uninitialized cache Zone index are set and the index is hooked onto the radix tree. If a new storage Zone is written, a new cache Zone index is allocated and instantiated. The cache Zone indexes are linked in a doubly linked list to facilitate search and delete operations, and the index header information inside the cache Zone index structure marks the storage Zone to which the index belongs. For example, if the number of cache Zones in the nonvolatile storage medium is 63, the number of corresponding cache Zone indexes is also 63.
In the embodiment of the invention, a method for hooking the cache Zone indexes onto the radix tree when the data storage system is started is provided, which can ensure that the stored data is the latest data and ensure the normal use of a new storage Zone.
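Steps S605 and S606 can be illustrated with a short sketch. The bitmap layout (bit p set means page p is dirty, least significant bit first), the dict-based index with keys `mapped`, `storage_zone_no` and `bitmap`, and the use of a dict in place of the radix tree are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def dirty_pages(bitmap: int, pages_per_zone: int) -> List[int]:
    """Return the page numbers whose bit is set (assumed LSB-first layout)."""
    return [p for p in range(pages_per_zone) if (bitmap >> p) & 1]


def restore_zone(index: dict,
                 storage_zone_pages: List[bytes],
                 cache_zone_pages: List[Optional[bytes]]) -> None:
    """S605: copy each page marked in the bitmap from the storage Zone
    into the equally sized cache Zone."""
    for page in dirty_pages(index["bitmap"], len(cache_zone_pages)):
        cache_zone_pages[page] = storage_zone_pages[page]


def hook_indexes(indexes: List[dict]) -> Dict[int, dict]:
    """S606: hook every mapped index into the tree, keyed by its
    storage Zone number (its offset in units of Zones)."""
    tree: Dict[int, dict] = {}
    for index in indexes:
        if index["mapped"]:
            tree[index["storage_zone_no"]] = index
    return tree
```

Keying the tree by the Zone-granular offset means a later write only needs to divide its storage offset by the Zone size to find the matching index.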
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (16)

1. A data storage method, characterized by being applied to a data storage system, wherein the data storage system comprises a read-write device, a nonvolatile storage medium and a shingled magnetic recording (SMR) hard disk, the read-write device is connected with the nonvolatile storage medium, and the read-write device is connected with the SMR hard disk; the SMR hard disk comprises a plurality of storage Zones, and each storage Zone corresponds to different offset addresses; the nonvolatile storage medium is logically divided into a plurality of cache Zones, and each cache Zone maps a different storage Zone; the read-write device records the mapping state of the storage Zone and the cache Zone, and the method includes:
for data to be written, the read-write device converts an address, in the storage Zone, to which the data to be written is to be written, into an offset address in the cache Zone according to the mapping state of the storage Zone and the cache Zone, and uses the offset address as a target cache offset address;
the read-write equipment writes the data to be written into the corresponding cache Zone according to the target cache offset address;
and when a preset synchronization condition is met, the read-write equipment writes the data in the cache Zone into the storage Zone mapped by the cache Zone according to the mapping state of the storage Zone and the cache Zone.
2. The method according to claim 1, wherein each of the caching zones corresponds to a different caching Zone index, the caching Zone index represents a mapping state between the caching Zone and the storage Zone, the caching Zone index is further used for representing a state of each page in the caching Zone, the read-write device includes a radix tree, and the radix tree links the caching Zone index;
for the data to be written, the read-write device converts, according to the mapping state between the storage Zone and the cache Zone, an address in the storage Zone, where the data to be written is to be written, into an offset address in the cache Zone, where the address is used as a target cache offset address, where the method includes:
for data to be written, converting the page offset of the data to be written into an offset address of a storage Zone by the read-write equipment, and using the offset address as a target storage offset address;
the read-write equipment inquires whether the radix tree contains a cache Zone index corresponding to the target storage offset address;
when the radix tree includes the cache Zone index corresponding to the target storage offset address, the read-write device converts the target storage offset address into an offset address in the cache Zone according to the cache Zone index corresponding to the target storage offset address, and the offset address serves as a target cache offset address;
the read-write device writes the data to be written into the corresponding cache Zone according to the target cache offset address, including:
and the read-write equipment determines the page offset of the target cache offset address in the cache Zone as a target page offset, writes the data to be written into the page corresponding to the cache Zone according to the target page offset, and sets the page corresponding to the cache Zone to be dirty in the cache Zone index corresponding to the target storage offset address.
3. The method according to claim 2, wherein after the read/write device queries whether the radix tree includes the cache Zone index corresponding to the target storage offset address, the method further comprises:
when the radix tree does not contain the cache Zone index corresponding to the target storage offset address, the read-write equipment judges whether a cache Zone of an unmapped storage Zone exists in the nonvolatile storage medium;
if the cache Zone of the unmapped storage Zone exists, the read-write equipment selects one cache Zone from the cache zones of the unmapped storage Zone as the cache Zone mapped by the storage Zone into which the data to be written is to be written;
the read-write device allocates a cache Zone index for the cache Zone mapped by the storage Zone to which the data to be written is to be written, configures the mapping state of the cache Zone index, and sets each page of the cache Zone index clean.
4. The method according to claim 3, wherein after the read-write apparatus determines whether there is a cache Zone of the unmapped storage Zone in the non-volatile storage medium, the method further comprises:
if the cache Zone which is not mapped with the storage Zone does not exist, the read-write equipment selects the cache Zone index to be recovered according to a preset selection rule;
the read-write equipment writes the data in the cache Zone corresponding to the cache Zone index to be recovered into the storage Zone corresponding to the cache Zone index to be recovered, and takes the cache Zone index to be recovered as the recovered cache Zone index;
the read-write device uses the cache Zone corresponding to the recovered cache Zone index as the cache Zone mapped by the storage Zone into which the data to be written is to be written, sets each page of the recovered cache Zone index clean, and configures the mapping state of the recovered cache Zone index as follows: the cache Zone corresponding to the recovered cache Zone index maps the storage Zone to which the data to be written is to be written.
5. The method of claim 2, wherein the non-volatile storage medium further comprises an index management area, the SMR hard disk further comprises an index backup area; the index management area is used for managing and storing the cache Zone index of each cache Zone, and the index backup area is used for storing the cache Zone index of the storage Zone of the SMR hard disk; the method further comprises the following steps:
when the data storage system is started, the read-write equipment compares the last updating time of the cache Zone index in the index management area with the last updating time of the cache Zone index in the index backup area;
and if the last updating time of the cache Zone index in the index backup area is earlier than the last updating time of the cache Zone index in the index management area, the read-write equipment updates the index backup area according to each cache Zone index in the index management area.
6. The method according to claim 5, wherein after the read-write device compares the last update time of the cached Zone index in the index management area with the last update time of the cached Zone index in the index backup area, the method further comprises:
if the last update time of the cache Zone index in the index management area is earlier than the last update time of the cache Zone index in the index backup area, the read-write equipment updates the cache Zone index in the index management area according to each cache Zone index in the index backup area;
the read-write equipment updates the data in the cache Zone according to the data in the storage Zone;
and hooking the updated cache Zone index in the index management area in a radix tree in the read-write equipment.
7. The method according to claim 5, wherein before the read-write device configures the cache Zone indexes in the radix tree according to the cache Zone indexes in the index management area, the method further comprises:
when the data storage system is started, the read-write device reads each cache Zone index from the nonvolatile storage medium, determines whether a new storage Zone exists according to each read cache Zone index, and configures a cache Zone index for the new storage Zone if the new storage Zone exists.
8. The method of claim 1, wherein the preset synchronization condition comprises a power-off reboot of the data storage system.
9. A data storage system, the system comprising:
a read-write device, a nonvolatile storage medium and a shingled magnetic recording (SMR) hard disk, wherein the read-write device is connected with the nonvolatile storage medium and the read-write device is connected with the SMR hard disk;
the SMR hard disk comprises a plurality of storage Zones, and each storage Zone corresponds to different offset addresses; the nonvolatile storage medium is logically divided into a plurality of cache Zones, and each cache Zone maps a different storage Zone; the read-write device records the mapping state of the storage Zone and the cache Zone;
the read-write device is configured to, for data to be written, convert an address in the storage Zone, where the data to be written is to be written, into an offset address in the cache Zone according to the mapping state of the storage Zone and the cache Zone, where the address is used as a target cache offset address; writing the data to be written into the corresponding cache Zone according to the target cache offset address; and when a preset synchronization condition is met, writing the data in the cache Zone into the storage Zone mapped by the cache Zone according to the mapping state of the storage Zone and the cache Zone.
10. The system according to claim 9, wherein each of said caching zones corresponds to a different caching Zone index, said caching Zone index represents a mapping status between said caching Zone and said storing Zone, said caching Zone index is further used for representing a status of each page in said caching Zone, said read-write device includes a radix tree, and said radix tree links said caching Zone index;
the read-write device is specifically configured to, for data to be written, convert a page offset of the data to be written into an offset address of a storage Zone, where the offset address is used as a target storage offset address; inquiring whether the radix tree contains a cache Zone index corresponding to the target storage offset address; when the radix tree contains the cache Zone index corresponding to the target storage offset address, converting the target storage offset address into an offset address in the cache Zone according to the cache Zone index corresponding to the target storage offset address, and using the offset address as a target cache offset address; determining a page offset of the target cache offset address in the cache Zone, as a target page offset, writing the data to be written into a page corresponding to the cache Zone according to the target page offset, and setting a page corresponding to the cache Zone to be dirty in a cache Zone index corresponding to the target storage offset address; and when a preset synchronization condition is met, writing the data in the cache Zone into the storage Zone mapped by the cache Zone according to the mapping state of the storage Zone and the cache Zone.
11. The system of claim 10, wherein the read-write device is further configured to: when the radix tree does not contain the cache Zone index corresponding to the target storage offset address, judge whether a cache Zone of an unmapped storage Zone exists in the nonvolatile storage medium; if the cache Zone of the unmapped storage Zone exists, select one cache Zone from the cache Zones of the unmapped storage Zone as the cache Zone mapped by the storage Zone to which the data to be written is to be written; and allocate a cache Zone index for the cache Zone mapped by the storage Zone to which the data to be written is to be written, configure the mapping state of the cache Zone index, and set each page of the cache Zone index clean.
12. The system of claim 11, wherein the read-write device is further configured to:
if the cache Zone which is not mapped with the storage Zone does not exist, selecting a cache Zone index to be recovered according to a preset selection rule; writing the data in the cache Zone corresponding to the cache Zone index to be recovered into the storage Zone corresponding to the cache Zone index to be recovered, and taking the cache Zone index to be recovered as the recovered cache Zone index; taking the cache Zone corresponding to the recovered cache Zone index as the cache Zone mapped by the storage Zone to which the data to be written is to be written, setting each page of the recovered cache Zone index clean, and configuring the mapping state of the recovered cache Zone index as follows: the cache Zone corresponding to the recovered cache Zone index maps the storage Zone to which the data to be written is to be written.
13. The system of claim 10, wherein the non-volatile storage medium further comprises an index management area, and wherein the SMR hard disk further comprises an index backup area; the index management area is used for managing and storing the cache Zone index of each cache Zone, and the index backup area is used for storing the cache Zone index of the storage Zone of the SMR hard disk;
the read-write device is further configured to: when the data storage system is started, comparing the last updating time of the cache Zone index in the index management area with the last updating time of the cache Zone index in the index backup area; and if the last updating time of the cache Zone index in the index backup area is earlier than the last updating time of the cache Zone index in the index management area, updating the index backup area according to each cache Zone index in the index management area.
14. The system of claim 13, wherein the read-write device is further configured to: if the last update time of the cache Zone index in the index management area is earlier than the last update time of the cache Zone index in the index backup area, updating the cache Zone index in the index management area according to each cache Zone index in the index backup area; updating the data in the cache Zone according to the data in the storage Zone; and hooking the updated cache Zone index in the index management area in a radix tree in the read-write equipment.
15. The system of claim 13, wherein the read-write device is further configured to: when the data storage system is started, reading each cache Zone index from the nonvolatile storage medium, judging whether a new storage Zone exists according to each read cache Zone index, and if so, configuring the cache Zone index for the new storage Zone.
16. The system of claim 9, wherein the preset synchronization condition comprises a power-off reboot of the data storage system.
CN201810480504.6A 2018-05-18 2018-05-18 Data storage method and system Active CN110502455B (en)

Publications (2)

Publication Number Publication Date
CN110502455A CN110502455A (en) 2019-11-26
CN110502455B true CN110502455B (en) 2021-10-12

Family

ID=68584535

Country Status (1)

Country Link
CN (1) CN110502455B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant