CN115857813A - Data storage system and device - Google Patents


Publication number
CN115857813A
Authority
CN
China
Prior art keywords
hdd
data
address
physical address
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211585503.0A
Other languages
Chinese (zh)
Inventor
李舒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Original Assignee
Alibaba China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba China Co Ltd
Priority to CN202211585503.0A
Publication of CN115857813A

Abstract

An embodiment of the present application provides a data storage system and device. The data storage system comprises a processor, a memory, a read cache, and at least one hard disk drive (HDD), and the processor is connected to the memory, the read cache, and the HDD respectively. The memory is used to store an address mapping table, which comprises the correspondence between a logical address and a physical address in the HDD, and the correspondence between the logical address and a physical address in the read cache. The processor is used to write data in the HDD in a physical-address-contiguous manner and to update the address mapping table; the processor is further used to read data from the read cache or the HDD according to the address mapping table. The read-write balance of the data storage system is thereby improved.

Description

Data storage system and device
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data storage system and device.
Background
An electronic device may include a data storage system, and the data storage system may include a hard disk drive (HDD) and a write cache for reading and writing data.
In the related art, data is generally read from the HDD and written to the write cache. However, the read performance of the HDD is poor because it is limited by the head and the rotational speed of the HDD; and because the write cache typically uses non-volatile memory, some designs give the write cache better write performance than the read performance of the HDD.
As can be seen from the above, the read-write balance of such a data storage system is poor.
Disclosure of Invention
Aspects of the present application provide a data storage system and device for improving read-write balance of the data storage system.
In a first aspect, an embodiment of the present application provides a data storage system, which includes a processor, a memory, a read cache, and at least one hard disk drive HDD, where the processor is connected to the memory, the read cache, and the HDD respectively,
the memory is used for storing an address mapping table, and the address mapping table comprises the correspondence between a logical address and a physical address in the HDD and the correspondence between the logical address and a physical address in the read cache;
the processor is used for writing data in the HDD in a physical-address-contiguous manner and updating the address mapping table;
the processor is further configured to read data in the read cache or the HDD according to the address mapping table.
In one possible embodiment, writing data in the HDD in a physical-address-contiguous manner and updating the address mapping table includes:
determining a target physical address in the HDD in the physical-address-contiguous manner;
writing data in the HDD according to the target physical address;
and updating the address mapping table according to the target physical address.
In one possible embodiment, determining the target physical address in the HDD in the physical-address-contiguous manner includes:
determining a current physical address where a magnetic head of the HDD is currently located;
determining the current physical address as the target physical address.
In one possible implementation, updating the address mapping table according to the target physical address includes:
determining a target logical address corresponding to the target physical address;
and if the target physical address and the target logical address are not included in the address mapping table, correspondingly adding the target physical address and the target logical address to the address mapping table.
In one possible implementation, the address mapping table includes a plurality of logical addresses and a merged physical address corresponding to each logical address, wherein,
the merged physical address comprises a first field and a second field;
the first field is used for storing a physical address in the HDD corresponding to the logical address;
the second field is used for storing the physical address in the read cache corresponding to the logical address.
In one possible implementation, reading data in the read cache or the HDD according to the address mapping table includes:
determining a first logical address of data to be read;
determining a first merged physical address in the address mapping table according to the first logical address;
and reading data in the read cache or the HDD according to the first merged physical address.
In one possible implementation, reading data in the read cache or the HDD according to the first merged physical address includes:
obtaining a read cache physical address in the second field of the first merged physical address;
if the read cache physical address is a valid address, reading data in the read cache according to the read cache physical address;
if the read cache physical address is an invalid address, obtaining an HDD physical address in the first field of the first merged physical address, and reading data in the HDD according to the HDD physical address when the HDD physical address is a valid address.
In one possible implementation, the processor is further configured to:
determining data to be migrated in the at least one HDD and the read cache;
determining a target HDD corresponding to the data to be migrated and a first physical address in the target HDD;
and writing the data to be migrated into the target HDD according to the first physical address, and updating the address mapping table.
In one possible embodiment, determining data to be migrated in the at least one HDD and the read cache includes:
determining a target physical storage unit in the at least one HDD, wherein the size of an invalid storage space in the target physical storage unit is greater than or equal to a preset threshold value, and the data stored in the invalid storage space is invalid data;
determining first data in the read cache according to the access heat of each piece of data in the read cache, wherein the access heat of the first data is less than or equal to a first threshold;
determining that the data to be migrated comprises: the data in the valid storage space of the target physical storage unit, and the first data.
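The selection rule described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `select_data_to_migrate`, the dictionary layout of a physical storage unit, and measuring the size of the invalid storage space as a count of invalid items are all hypothetical and are not specified by the application.

```python
from typing import Dict, List


def select_data_to_migrate(
    units: Dict[str, Dict[str, List[str]]],
    cache_heat: Dict[str, int],
    invalid_threshold: int,
    first_threshold: int,
) -> List[str]:
    """Return the names of the data items to migrate.

    units: hypothetical layout, unit name -> {"valid": [...], "invalid": [...]}.
    cache_heat: access heat of each data item held in the read cache.
    """
    to_migrate: List[str] = []

    # Target physical storage units: invalid space >= preset threshold.
    # Only the still-valid data of such a unit needs to be moved.
    for unit in units.values():
        if len(unit["invalid"]) >= invalid_threshold:
            to_migrate.extend(unit["valid"])

    # First data: read-cache items whose access heat <= first threshold.
    for name, heat in cache_heat.items():
        if heat <= first_threshold:
            to_migrate.append(name)

    return to_migrate
```

In this sketch a unit that is mostly invalid contributes only its valid data, and cold items in the read cache are added to the same migration list.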
In a possible implementation, the read cache is a solid state disk SSD.
In a second aspect, an embodiment of the present application provides an electronic device, which includes the data storage system described in any one of the first aspects.
The embodiments of the present application provide a data storage system and device. The data storage system may include a processor, a memory, a read cache, and at least one hard disk drive (HDD), and the processor may be connected to the memory, the read cache, and the HDD respectively. Because data can be written in the HDD in a physical-address-contiguous manner, the time the HDD head spends on repeated seeking and switching is reduced, which improves the write performance of the HDD; and because the read cache can use a high-capacity, low-cost solid state disk (SSD), the read performance is improved. Together, these two aspects improve the read-write balance of the data storage system.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a schematic diagram of an application scenario provided in an exemplary embodiment of the present application;
FIG. 2 is a schematic structural diagram of a data storage system according to an exemplary embodiment of the present application;
FIG. 3 is a flowchart illustrating a method for writing data according to an exemplary embodiment of the present disclosure;
FIG. 4 is a process diagram of a method for writing data according to an exemplary embodiment of the present application;
FIG. 5 is a flowchart illustrating a method for reading data according to an exemplary embodiment of the present disclosure;
FIG. 6 is a process diagram of a method for reading data according to an exemplary embodiment of the present application;
FIG. 7 is a flowchart illustrating a method for migrating data according to an exemplary embodiment of the present application;
FIG. 8 is a process diagram of a method for migrating data according to an exemplary embodiment of the present application;
FIG. 9 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, the technical solutions of the present application will be clearly and completely described below with reference to specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
FIG. 1 is a schematic diagram of an application scenario provided in an exemplary embodiment of the present application. Referring to FIG. 1, the electronic device may include a data storage system. The data storage system may include a processor (CPU), a read cache, and at least one hard disk drive (HDD). For example, the at least one HDD may be HDD-1, HDD-2, …, HDD-n.
After the electronic device receives a write data request, the processor may write data to the HDD in the data storage system according to the write data request; after the electronic device receives a read data request, the processor may read data from the read cache or the HDD in the data storage system according to the read data request.
In the related art, data is generally read from the HDD and written to the write cache. However, the read performance of the HDD is poor because it is limited by the head and the rotational speed of the HDD; and because the write cache typically uses non-volatile memory, some designs give the write cache better write performance than the read performance of the HDD. As can be seen from the above, the read-write balance of such a data storage system is poor.
In the embodiment of the application, the electronic device may include a data storage system, and the data storage system may include a processor, a read cache, and at least one HDD. Data may be written to the HDD and read from the read cache or the HDD. Because data can be written in the HDD in a physical-address-contiguous manner, the time the HDD head spends on repeated seeking and switching is reduced, which improves the write performance of the HDD; and because the read cache can use a high-capacity, low-cost solid state disk (SSD), the read performance is improved. Together, these two aspects improve the read-write balance of the data storage system.
The technical means shown in the present application will be described in detail below with reference to specific examples. It should be noted that the following embodiments may exist alone or in combination with each other, and description of the same or similar contents is not repeated in different embodiments.
Fig. 2 is a schematic structural diagram of a data storage system according to an exemplary embodiment of the present application. Referring to fig. 2, the data storage system may include a processor, a memory, a read cache, and at least one hard disk drive HDD. The processor may be connected to the memory, the read cache, and the HDD, respectively.
The memory may be Dynamic Random Access Memory (DRAM) in a Dual In-line Memory Module (DIMM).
The read cache may be a Solid State Disk (SSD).
The at least one hard disk drive HDD may be HDD-1, HDD-2, … …, HDD-n, respectively.
Optionally, the memory may be used to store an address mapping table. The address mapping table may include the correspondence between a Logical Block Address (LBA) and a Physical Block Address (PBA) in the HDD, and the correspondence between the LBA and a physical address in the read cache.
The physical address PBA is an address actually existing in the HDD or the read cache, and is also called an absolute address; the logical address LBA is an address generated by the CPU, and is also called a relative address.
In an alternative embodiment, the address mapping table may include a plurality of logical addresses and a merged physical address corresponding to each logical address.
Optionally, the merged physical address includes a first field and a second field. The first field may be used to store the physical address in the HDD corresponding to the logical address; the second field may be used to store the physical address in the read cache corresponding to the logical address. If the physical address in the HDD is denoted HDD PBA and the physical address in the read cache is denoted SSD PBA, the merged physical address can be denoted HDD PBA + SSD PBA.
For example, the address mapping table may include m logical addresses and a merged physical address corresponding to each logical address. As shown in table 1:
TABLE 1
Logical address    Merged physical address
LBA-1              HDD PBA-1 + SSD PBA-1
LBA-2              HDD PBA-2 + SSD PBA-2
…                  …
LBA-m              HDD PBA-m + SSD PBA-m
As shown in Table 1, LBA-1 may correspond to merged physical address-1, i.e., HDD PBA-1 + SSD PBA-1; LBA-2 may correspond to merged physical address-2, i.e., HDD PBA-2 + SSD PBA-2; …; LBA-m may correspond to merged physical address-m, i.e., HDD PBA-m + SSD PBA-m.
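As an illustration only, the table above might be held in memory as a dictionary keyed by logical address. The class name `MergedPBA`, the integer addresses, and the use of `None` to mark an invalid field are assumptions made for this sketch; the application does not specify an encoding.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class MergedPBA:
    """Merged physical address with two fields.

    None stands for an invalid (unused) field; this encoding is an assumption.
    """
    hdd_pba: Optional[int] = None  # first field: physical address in the HDD
    ssd_pba: Optional[int] = None  # second field: physical address in the read cache


# Address mapping table: logical address -> merged physical address.
mapping_table: Dict[int, MergedPBA] = {
    1: MergedPBA(hdd_pba=1, ssd_pba=1),  # LBA-1 -> HDD PBA-1 + SSD PBA-1
    2: MergedPBA(hdd_pba=2, ssd_pba=2),  # LBA-2 -> HDD PBA-2 + SSD PBA-2
}
```

Keeping both fields in one entry means a single lookup by logical address yields both candidate locations of the data.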
Optionally, the processor may be configured to write data in the HDD in a physical-address-contiguous manner and update the address mapping table.
The physical-address-contiguous manner is a manner in which data is written at consecutive physical addresses. For example, if physical address 1, physical address 2, and physical address 3 are consecutive, data may first be written at physical address 1; once physical address 1 is full, data may be written at physical address 2; and once physical address 2 is full, data may be written at physical address 3.
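The fill-then-advance behaviour in this example can be sketched as a small allocator. The class name `SequentialWriter`, the notion of a fixed capacity per physical address, and the abstract size units are hypothetical and serve only to illustrate the idea.

```python
from typing import List


class SequentialWriter:
    """Hypothetical allocator that fills physical addresses in ascending order."""

    def __init__(self, num_addresses: int, capacity_per_address: int) -> None:
        self.num_addresses = num_addresses
        self.capacity = capacity_per_address
        self.current = 1  # physical address currently being filled
        self.used = 0     # units already written at `current`

    def write(self, size: int) -> List[int]:
        """Write `size` units and return the physical addresses touched."""
        touched: List[int] = []
        remaining = size
        while remaining > 0:
            if self.used == self.capacity:
                # Current address is full: move to the next consecutive one.
                self.current += 1
                self.used = 0
            if self.current > self.num_addresses:
                raise RuntimeError("no free physical address left")
            chunk = min(remaining, self.capacity - self.used)
            self.used += chunk
            remaining -= chunk
            touched.append(self.current)
        return touched
```

Because the next address is always the one immediately after the current one, the allocator never jumps backwards, which is the property the contiguous manner relies on.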
In an alternative embodiment, the data may be written and the address mapping table updated in the HDD by: determining a target physical address in the HDD according to a physical address continuous mode; writing data in the HDD according to the target physical address; and updating the address mapping table according to the target physical address.
For example, if the data to be written is data-1 and the target physical address determined in the HDD in the physical-address-contiguous manner is HDD PBA-2, data-1 may be written in the HDD according to HDD PBA-2, and the address mapping table may be updated according to HDD PBA-2.
Optionally, the processor may be further configured to read data in the read cache or the HDD according to the address mapping table. For example, the processor may determine a first logical address of data to be read, and determine a corresponding first merged physical address in the address mapping table according to the first logical address. Since the first merged physical address includes the physical address of the data to be read in the HDD and the physical address in the read cache, the data can be read in the HDD or the read cache according to the first merged physical address.
In an embodiment of the present application, a data storage system may include a processor, a memory, a read cache, and at least one hard disk drive HDD. The processor may be connected to the memory, the read cache, and the HDD, respectively. Because data can be written in the HDD according to a physical address continuous mode, the time of repeatedly seeking and switching a magnetic head of the HDD is reduced, and the writing performance of the HDD is improved; the read cache can use a Solid State Disk (SSD) with high capacity and low cost, so that the performance of reading data is improved. Therefore, the read-write balance of the data storage system is improved by combining the two aspects.
Next, based on the embodiment shown in fig. 2, a process of writing data and updating the address mapping table in the data storage system will be described in detail with reference to fig. 3.
Fig. 3 is a flowchart illustrating a method for writing data according to an exemplary embodiment of the present disclosure. Referring to fig. 3, the method may include:
S301, determining the current physical address where the magnetic head of the HDD is currently located.
Optionally, the processor may determine the current physical address where the head of the HDD is currently located.
For example, the processor may determine that the current physical address where the head of the HDD is currently located is HDD PBA-1.
S302, determining the current physical address as a target physical address.
For example, if the current physical address is HDD PBA-1, HDD PBA-1 may be determined to be the target physical address.
S303, writing data in the HDD according to the target physical address.
For example, if the data to be written is data-1 and the target physical address is HDD PBA-1, data-1 may be written in the HDD according to HDD PBA-1.
S304, determining a target logical address corresponding to the target physical address.
Optionally, after writing data in the target physical address, the processor may determine a target logical address corresponding to the target physical address according to the target physical address.
For example, if the target physical address is HDD PBA-1, after data-1 is written in HDD PBA-1, the processor can determine the corresponding target logical address according to HDD PBA-1, assuming that the corresponding target logical address is LBA-1.
S305, determining whether a target physical address and a target logical address exist in an address mapping table.
Optionally, the processor may determine whether the target physical address and the target logical address exist in the address mapping table.
If so, the target physical address and the target logical address already exist in the address mapping table and do not need to be added to it;
if not, the target physical address and the target logical address do not exist in the address mapping table, and S306 may be executed.
For example, if the target physical address is HDD PBA-1 and the target logical address is LBA-1, then when HDD PBA-1 and LBA-1 exist in the address mapping table, they do not need to be added to it; when HDD PBA-1 and LBA-1 do not exist in the address mapping table, S306 can be executed.
S306, correspondingly adding the target physical address and the target logical address to the address mapping table.
For example, if the target physical address is HDD PBA-1 and the target logical address is LBA-1, and if the HDD PBA-1 and LBA-1 do not exist in the address mapping table, the HDD PBA-1 and LBA-1 may be correspondingly added to the address mapping table.
In the embodiment of the application, a processor in the data storage system may determine a current physical address where a head of the HDD is currently located, may determine the current physical address as a target physical address, and may further write data in the HDD according to the target physical address. The processor may determine a target logical address corresponding to the target physical address and determine whether the target physical address and the target logical address are present in an address mapping table. And if the address mapping table does not comprise the target physical address and the target logical address, correspondingly adding the target physical address and the target logical address into the address mapping table. Since data can be written in the HDD in a manner of continuous physical addresses, the time for repeatedly seeking and switching the HDD head is reduced, thereby improving the writing performance of the HDD.
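Steps S301 to S306 can be sketched roughly as below. The `Hdd` class, the `write_data` function, the tuple layout `(hdd_pba, ssd_pba)`, and the choice to advance the head by one address after each write are all assumptions for illustration. In particular, the logical address is passed in by the caller here, whereas the application describes determining it from the target physical address.

```python
from typing import Dict, Optional, Tuple

# Mapping table: logical address -> (HDD PBA, SSD PBA); None marks an invalid field.
Table = Dict[int, Tuple[Optional[int], Optional[int]]]


class Hdd:
    """Toy HDD model that only tracks the head position and written blocks."""

    def __init__(self, head_pba: int = 1) -> None:
        self.head_pba = head_pba
        self.blocks: Dict[int, bytes] = {}


def write_data(hdd: Hdd, table: Table, lba: int, data: bytes) -> int:
    # S301 + S302: the address under the head becomes the target physical address.
    target_pba = hdd.head_pba
    # S303: write the data at the target physical address.
    hdd.blocks[target_pba] = data
    # Assumption: the head then sits at the next consecutive address.
    hdd.head_pba = target_pba + 1
    # S304: the target logical address is supplied as `lba` in this sketch.
    # S305: check whether the (logical, physical) pair is already in the table.
    entry = table.get(lba)
    if entry is None or entry[0] != target_pba:
        # S306: add the pair; the read-cache field starts out invalid.
        table[lba] = (target_pba, None)
    return target_pba
```

Since the write always lands where the head already is, no seek is needed before the write, which is the source of the write-performance benefit described above.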
Next, on the basis of any of the above embodiments, a process of writing data and updating the address mapping table is further described in detail by a specific example in conjunction with fig. 4.
Fig. 4 is a process diagram of a method for writing data according to an exemplary embodiment of the present application. Please refer to fig. 4, which includes steps (1), (2) and (3).
As in fig. 4, a data storage system may include a processor, a memory, a read cache, and at least one HDD. The processor may be connected to the memory, the read cache, and the HDD, respectively.
An address mapping table may be stored in the memory, and the address mapping table may include a plurality of logical addresses and corresponding merged physical addresses. For example, the address mapping table may include LBA-1 and the corresponding HDD-1 PBA-1 + SSD PBA-1, and LBA-2 and the corresponding HDD-1 PBA-2 + SSD PBA-2.
The at least one HDD may be HDD-1, HDD-2, …, HDD-n.
Any one HDD may include multiple physical addresses. For example, HDD-1 may include 5 physical addresses: HDD-1 PBA-1, HDD-1 PBA-2, HDD-1 PBA-3, HDD-1 PBA-4, and HDD-1 PBA-5.
In step (1), the processor may receive a write data request. After receiving the write data request, the processor may determine the current HDD, determine the current physical address where the head of that HDD is currently located, and determine the current physical address as the target physical address. For example, if the processor determines that the current HDD is HDD-1 and that the head of HDD-1 is currently located at HDD-1 PBA-3, then HDD-1 PBA-3 may be determined to be the target physical address.
In step (2), the processor may write data in the HDD according to the target physical address in the physical-address-contiguous manner. For example, if the target physical address is HDD-1 PBA-3 and the write data request includes data-1, the processor may write data-1 to HDD-1 PBA-3 in the physical-address-contiguous manner.
Optionally, after the processor writes data according to the target physical address, the processor may determine a target logical address corresponding to the target physical address. For example, if the target physical address is HDD-1 PBA-3, the processor may determine the corresponding target logical address from HDD-1 PBA-3. Assume that the determined target logical address is LBA-3.
In step (3), the processor may update the address mapping table according to the target physical address and the target logical address.
Specifically, the processor may determine whether the target physical address and the target logical address exist in the address mapping table in the memory. If they exist, they do not need to be added to the address mapping table; if they do not exist, the target physical address and the target logical address can be correspondingly added to the address mapping table.
For example, suppose the target physical address is HDD-1 PBA-3, the target logical address is LBA-3, and the address mapping table includes LBA-1 and the corresponding HDD-1 PBA-1, and LBA-2 and the corresponding HDD-1 PBA-2. The processor can determine whether HDD-1 PBA-3 and LBA-3 are present in the address mapping table. Since the address mapping table includes only LBA-1 with HDD-1 PBA-1 and LBA-2 with HDD-1 PBA-2, HDD-1 PBA-3 and LBA-3 are not present, so the processor can correspondingly add HDD-1 PBA-3 and LBA-3 to the address mapping table.
It should be noted that the address mapping table includes a logical address and a merged physical address, and the merged physical address includes a first field and a second field: the first field may be used to store the physical address in the HDD corresponding to the logical address, and the second field may be used to store the physical address in the read cache corresponding to the logical address. After the processor writes data-1 to HDD-1 PBA-3, data-1 has not been written to the read cache. Therefore, when the processor adds HDD-1 PBA-3 and LBA-3 to the address mapping table, the entry may be stored as LBA-3 and HDD-1 PBA-3 + SSD PBA-3, where SSD PBA-3 is an invalid address. If the processor later writes data-1 to SSD PBA-3, SSD PBA-3 becomes a valid address.
Optionally, the processor may further write, into the read cache, the data in the HDD whose access heat is greater than the first threshold, so as to reduce read latency and improve read performance.
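A rough sketch of this heat-based copy into the read cache follows. The function name `promote_hot_data`, the per-logical-address heat counters, the tuple layout `(hdd_pba, ssd_pba)`, and the way a free read-cache address is chosen are all hypothetical; the application does not describe how access heat is measured or how read-cache space is allocated.

```python
from typing import Dict, Optional, Tuple

Table = Dict[int, Tuple[Optional[int], Optional[int]]]


def promote_hot_data(
    hdd_blocks: Dict[int, bytes],
    ssd_blocks: Dict[int, bytes],
    table: Table,
    heat: Dict[int, int],
    first_threshold: int,
) -> None:
    """Copy data whose access heat exceeds the threshold into the read cache."""
    # Assumption: read-cache addresses are handed out in ascending order.
    next_ssd_pba = max(ssd_blocks, default=0) + 1
    for lba, (hdd_pba, ssd_pba) in list(table.items()):
        already_cached = ssd_pba is not None
        if already_cached or hdd_pba is None:
            continue
        if heat.get(lba, 0) > first_threshold:
            ssd_blocks[next_ssd_pba] = hdd_blocks[hdd_pba]
            # The second field of the merged physical address becomes valid.
            table[lba] = (hdd_pba, next_ssd_pba)
            next_ssd_pba += 1
```

After promotion, later reads of the same logical address find a valid read-cache address in the second field and can avoid the HDD.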
In the embodiment of the application, a processor in the data storage system may determine a current physical address where a head of the HDD is currently located, may determine the current physical address as a target physical address, and may further write data in the HDD according to the target physical address. The processor may determine a target logical address corresponding to the target physical address and determine whether the target physical address and the target logical address are present in an address mapping table. And if the address mapping table does not comprise the target physical address and the target logical address, correspondingly adding the target physical address and the target logical address into the address mapping table. Since data can be written in a manner of continuous physical addresses in the HDD, the time for the HDD head to repeatedly seek and switch is reduced, thereby improving the write performance of the HDD.
Next, based on the embodiment shown in fig. 2, a process of reading data in the data storage system will be described in detail with reference to fig. 5.
FIG. 5 is a schematic flowchart of a method for reading data according to an exemplary embodiment of the present application. Referring to FIG. 5, the method may include:
S501, determining a first logical address of data to be read.
Optionally, the processor may determine the first logical address of the data to be read in an address mapping table of the memory.
For example, if the data to be read is data-1, the processor may determine the first logical address in the address mapping table according to the data-1, assuming that the first logical address of the data-1 is LBA-1.
S502, determining a first merged physical address in the address mapping table according to the first logical address.
Since the address mapping table includes a plurality of logical addresses and a merged physical address corresponding to each logical address, the processor may determine the first merged physical address in the address mapping table according to the first logical address.
For example, if the first logical address of data-1 is LBA-1, the processor may determine a first merged physical address in the address mapping table according to the first logical address LBA-1, assuming that the first merged physical address is HDD PBA-1 + SSD PBA-1.
S503, obtaining the read cache physical address in the second field of the first merged physical address.
Since the merged physical address includes the first field and the second field, where the second field is used to store the physical address in the read cache corresponding to the logical address, the processor may obtain the physical address of the read cache in the second field in the first merged physical address.
For example, if the first merged physical address is HDD PBA-1+ SSD PBA-1, the processor may determine that the read cache physical address is SSD PBA-1 in the first merged physical address.
S504, determining whether the read cache physical address is a valid address.
Optionally, the processor may determine whether the read cache physical address is a valid address. If so, the data to be read is stored at the read cache physical address, and S505 may be executed; if not, the data to be read does not exist at the read cache physical address, and S506 may be executed.
For example, if the data to be read is data-1 and the read cache physical address is SSD PBA-1, the processor may determine whether SSD PBA-1 is a valid address. If so, data-1 is stored at SSD PBA-1, and S505 may be executed; if not, data-1 does not exist at SSD PBA-1, and S506 may be executed.
S505, reading data in the read cache according to the read cache physical address.
Because the read cache physical address is a valid address, the data to be read is stored at the read cache physical address, and the processor can read the data in the read cache according to the read cache physical address.
For example, if the data to be read is data-1 and the read cache physical address is SSD PBA-1, which is a valid address, the processor may read data-1 in the read cache according to SSD PBA-1.
S506, obtaining the HDD physical address in the first field of the first merged physical address.
Since the merged physical address includes a first field and a second field, where the first field is used to store the physical address in the HDD corresponding to the logical address, the processor may obtain the HDD physical address in the first field of the first merged physical address.
For example, if the first merged physical address is HDD PBA-1+ SSD PBA-1, the processor may determine that the HDD physical address is HDD PBA-1 in the first merged physical address.
S507, judging whether the HDD physical address is an effective address.
Alternatively, the processor may determine whether the HDD physical address is a valid address. If yes, it indicates that the HDD physical address stores data to be read, and S508 may be executed; if not, the fact that the data to be read does not exist in the HDD physical address is indicated, and the data to be read is invalid data because the read cache physical address and the HDD physical address do not exist the data to be read.
For example, if the data to be read is data-1 and the physical address of the HDD is HDD PBA-1, the processor may determine whether HDD PBA-1 is an effective address, if so, it indicates that data-1 is stored in HDD PBA-1, and S508 may be executed; if not, the result shows that the data-1 does not exist in the HDD PBA-1, and the data-1 is invalid data because the read cache physical address and the HDD physical address do not have the data-1.
And S508, reading data in the HDD according to the HDD physical address.
And the processor can read the data in the HDD according to the physical address of the HDD because the physical address of the HDD is an effective address and the data to be read is stored in the physical address of the HDD.
For example, if the data to be read is data-1, the HDD physical address is HDD PBA-1, and is a valid address, the processor can read data-1 in the HDD according to HDD PBA-1.
In an embodiment of the present application, a processor in a data storage system may determine a first logical address of data to be read, and determine a first merged physical address in an address mapping table according to the first logical address. The processor may obtain a read cache physical address from the second field of the first merged physical address. If the read cache physical address is a valid address, the processor may read data in the read cache according to the read cache physical address; if the read cache physical address is an invalid address, the processor may obtain the HDD physical address from the first field of the first merged physical address, and read data from the HDD according to the HDD physical address when the HDD physical address is a valid address. Because the processor can read data from the read cache, and the read cache can use a high-capacity, low-cost solid state disk (SSD), the number of random reads on the HDD is reduced and the data reading performance is improved.
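The read path described above can be illustrated with a minimal Python sketch. It is not the implementation of the application: the class name `MergedPhysicalAddress`, the function `read_data`, the use of dictionaries to stand in for the read cache and the HDD, and the convention that an invalid address is represented by `None` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class MergedPhysicalAddress:
    """One entry of the address mapping table (hypothetical layout)."""
    hdd_pba: Optional[str]  # first field: physical address in the HDD
    ssd_pba: Optional[str]  # second field: physical address in the read cache


def read_data(lba: str,
              mapping_table: Dict[str, MergedPhysicalAddress],
              read_cache: Dict[str, bytes],
              hdd: Dict[str, bytes]) -> Optional[bytes]:
    """Return the data for a logical address, or None if it is invalid data."""
    # Look up the first merged physical address by the first logical address.
    merged = mapping_table[lba]
    # Check the second field first: a valid read cache address means a cache hit.
    if merged.ssd_pba is not None:
        return read_cache[merged.ssd_pba]      # corresponds to S505
    # Otherwise fall back to the first field (S506) and check it (S507).
    if merged.hdd_pba is not None:
        return hdd[merged.hdd_pba]             # corresponds to S508
    # Neither address is valid: the data to be read is invalid data.
    return None
```

In this sketch the HDD is touched only when the second field is invalid, which is how the read cache absorbs random reads.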
Next, on the basis of any of the above embodiments, a process of reading data is further described in detail by a specific example in conjunction with fig. 6.
Fig. 6 is a process diagram of a method for reading data according to an exemplary embodiment of the present application. Referring to fig. 6, the process includes steps (1), (2), (3), and (4).
As shown in fig. 6, a data storage system may include a processor, a memory, a read cache, and at least one HDD. The processor may be connected to the memory, the read cache, and the HDD, respectively.
An address mapping table may be stored in the memory, and the address mapping table may include a plurality of logical addresses and corresponding merged physical addresses. For example, the address mapping table may include LBA-1 and the corresponding HDD-1 PBA-1 + SSD PBA-1, LBA-2 and the corresponding HDD-1 PBA-2 + SSD PBA-2, …, LBA-5 and the corresponding HDD-1 PBA-5 + SSD PBA-5, etc.
The at least one HDD can be HDD-1, HDD-2, …, HDD-n, respectively.
Any one HDD may include multiple physical addresses. For example, HDD-1 may include 5 physical addresses: HDD-1 PBA-1, HDD-1 PBA-2, HDD-1 PBA-3, HDD-1 PBA-4, and HDD-1 PBA-5.
The read cache may include multiple physical addresses. For example, the read cache may include SSD PBA-1, SSD PBA-2, …, SSD PBA-k.
In step (1), a processor may receive a read data request. For example, the processor may receive a read data request that includes an identification of data-1.
In step (2), the processor may determine a first merged physical address of the data to be read.
Specifically, the processor may determine a first logical address of the data to be read in the address mapping table, and determine a first merged physical address in the address mapping table according to the first logical address. For example, if the read data request includes the identifier of data-1, the processor may determine the first logical address of data-1 in the address mapping table. If the first logical address of data-1 is LBA-1, the processor may determine, according to LBA-1, that the corresponding first merged physical address in the address mapping table is HDD-1 PBA-1 + SSD PBA-1.
Since the merged physical address includes the first field and the second field, and the second field stores the physical address in the read cache corresponding to the logical address, the processor may obtain the read cache physical address from the second field of the first merged physical address. For example, if the first merged physical address is HDD PBA-1 + SSD PBA-1, the processor may determine that the read cache physical address in the first merged physical address is SSD PBA-1.
In step (3), the processor may read data in the read cache according to the read cache physical address.
Specifically, the processor may determine whether the read cache physical address is a valid address. If so, the processor reads data in the read cache according to the read cache physical address; if not, the processor obtains the HDD physical address from the first field of the first merged physical address. For example, if the read data request includes the identifier of data-1 and the first merged physical address is HDD PBA-1 + SSD PBA-1, in which the read cache physical address is SSD PBA-1, the processor may determine whether SSD PBA-1 is a valid address. If so, data-1 is stored at SSD PBA-1, and the processor may read data-1 in the read cache according to SSD PBA-1; if not, data-1 is not stored at SSD PBA-1, and the processor may determine that the HDD physical address in the first merged physical address is HDD PBA-1.
In step (4), the processor may read data in the HDD according to the HDD physical address.
Specifically, the processor may determine whether the HDD physical address is a valid address. If so, the processor reads data in the HDD according to the HDD physical address; if not, the data to be read is stored at neither the read cache physical address nor the HDD physical address, and the data to be read is invalid data.
For example, if the read data request includes the identifier of data-1 and the HDD physical address is HDD PBA-1, the processor may determine whether HDD PBA-1 is a valid address. If so, the processor can read data-1 from the HDD according to HDD PBA-1; if not, data-1 is stored at neither SSD PBA-1 nor HDD PBA-1, and data-1 is invalid data.
It should be noted that steps (3) and (4) are not limited to a fixed execution order.
In an embodiment of the present application, a processor in a data storage system may determine a first logical address of data to be read, and determine a first merged physical address in an address mapping table according to the first logical address. The processor may obtain a read cache physical address from the second field of the first merged physical address. If the read cache physical address is a valid address, the processor may read data in the read cache according to the read cache physical address; if the read cache physical address is an invalid address, the processor may obtain the HDD physical address from the first field of the first merged physical address, and read data from the HDD according to the HDD physical address when the HDD physical address is a valid address. Because the processor can read data from the read cache, and the read cache can use a high-capacity, low-cost solid state disk (SSD), the number of random reads on the HDD is reduced and the data reading performance is improved.
Optionally, the processor in the data storage system shown in fig. 2 may be further configured to determine data to be migrated in at least one HDD and the read cache, determine a target HDD corresponding to the data to be migrated and a first physical address in the target HDD, further write the data to be migrated to the target HDD according to the first physical address, and update the address mapping table.
Next, a process of migrating data will be described with reference to fig. 7.
Fig. 7 is a flowchart illustrating a method for migrating data according to an exemplary embodiment of the present application. Referring to fig. 7, the method may include:
S701, determining a target physical storage unit in at least one HDD.
The HDD may include a plurality of physical storage units, which may be tracks.
Optionally, the processor may determine the target physical storage unit among a plurality of physical storage units of the at least one HDD.
The target physical storage unit may include a valid storage space and an invalid storage space. The data stored in the valid storage space is valid data; the data stored in the invalid storage space is invalid data, and the size of the invalid storage space is greater than or equal to a preset threshold.
Invalid data may be deleted data, expired data, or the like.
For example, if the preset threshold is 90%, the processor may determine physical storage unit-1 in HDD-1 as the target physical storage unit, where physical storage unit-1 includes valid storage space-1 and invalid storage space-1. The size of valid storage space-1 may be 8% of physical storage unit-1, and the data stored in valid storage space-1 is valid data; the size of invalid storage space-1 may be 92% of physical storage unit-1, and the data stored in invalid storage space-1 is invalid data.
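The selection of a target physical storage unit in S701 can be sketched as a simple filter on the share of invalid storage space. The following Python sketch is illustrative only: the `StorageUnit` class, its field names, and the tie-breaking rule (preferring the unit with the largest invalid share) are assumptions and are not stated in the application.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StorageUnit:
    """A physical storage unit such as a track (hypothetical model)."""
    name: str
    capacity: int       # total size of the unit
    invalid_size: int   # size of the invalid storage space

    @property
    def invalid_ratio(self) -> float:
        return self.invalid_size / self.capacity


def pick_target_unit(units: List[StorageUnit],
                     preset_threshold: float = 0.90) -> Optional[StorageUnit]:
    """Return a unit whose invalid storage space reaches the preset threshold."""
    candidates = [u for u in units if u.invalid_ratio >= preset_threshold]
    # Assumption: among qualifying units, prefer the one with the most invalid space.
    return max(candidates, key=lambda u: u.invalid_ratio, default=None)
```

With a threshold of 90%, a unit that is 92% invalid qualifies, while a unit that is 50% invalid does not.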
S702, determining first data in the read cache according to the access heat of each data in the read cache.
The access heat of the first data may be less than or equal to a first threshold.
The access heat may be the number of times the data is read per second. For example, the access heat of data-1 may be 100 times/s (second), which may indicate that data-1 is read 100 times within 1 s.
The first threshold may be a manually preset threshold. For example, the first threshold may be 20 times/s.
Optionally, the processor may determine an access heat of each data in the read cache, and determine the first data in the read cache according to the access heat of each data in the read cache.
For example, suppose the read cache includes 100 pieces of data, namely data-1, data-2, data-3, …, and data-100, and the first threshold is 20 times/s. If the access heat of data-2 is 20 times/s and the access heat of data-3 is 18 times/s, then data-2 and data-3 can be determined as the first data.
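The selection of the first data in S702 amounts to filtering the read cache by access heat. A minimal Python sketch follows; the function name `select_first_data` and the representation of access heat as a dictionary from data identifier to reads per second are assumptions made for illustration.

```python
from typing import Dict, List


def select_first_data(access_heat: Dict[str, float],
                      first_threshold: float = 20.0) -> List[str]:
    """Return the identifiers of cached data whose access heat is at or below the threshold."""
    return [name for name, heat in access_heat.items() if heat <= first_threshold]


# Access heat in reads per second, using the figures from the example above.
heat = {"data-1": 100, "data-2": 20, "data-3": 18}
cold = select_first_data(heat)
```

Here `cold` contains data-2 and data-3, because their access heat does not exceed 20 times/s, while data-1 stays in the read cache.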
S703, determining that the data to be migrated comprises: the data in the valid storage space of the target physical storage unit, and the first data.
For example, if the valid storage space of the target physical storage unit stores 10 pieces of data, namely data-1, data-2, data-3, …, and data-10, and the first data in the read cache includes data-11 and data-12, it may be determined that the data to be migrated includes data-1, data-2, data-3, …, data-10, data-11, and data-12.
S704, determining a target HDD corresponding to the data to be migrated and a first physical address in the target HDD.
The target HDD may be the HDD that receives the data to be migrated. The target HDD may include a plurality of physical addresses, including the first physical address.
Optionally, the processor may determine the target HDD corresponding to the data to be migrated and the first physical address in the target HDD. For example, if the data to be migrated includes data-1, data-2, data-3, …, data-10, data-11, and data-12, the processor may determine that the target HDD is HDD-2 and that the first physical address in HDD-2 is HDD PBA-1.
S705, according to the first physical address, writing the data to be migrated into the target HDD, and updating the address mapping table.
For example, if the data to be migrated includes data-1, data-2, data-3, …, data-10, data-11, and data-12, the target HDD is HDD-2, and the first physical address in HDD-2 is HDD PBA-1, the processor can write the data to be migrated into HDD-2 starting at HDD PBA-1, and update the address mapping table according to HDD PBA-1.
It should be noted that, the process of updating the address mapping table may refer to steps S304-S306, and is not described herein again.
In an embodiment of the application, the processor may determine a target physical storage unit in the at least one HDD, and determine a valid storage space and an invalid storage space in the target physical storage unit, thereby determining the data in the valid storage space. The processor may determine the first data in the read cache according to the access heat of each piece of data in the read cache. The processor can determine that the data to be migrated includes the data in the valid storage space of the target physical storage unit and the first data, determine a target HDD corresponding to the data to be migrated and a first physical address in the target HDD, write the data to be migrated into the target HDD according to the first physical address, and update the address mapping table. Owing to the characteristics of the HDD, the data in the original target physical storage unit does not need to be erased after migration; new data can be written by directly overwriting it, which saves the time and energy of an erase operation and improves the writing performance of the HDD.
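The write and table update in S705 can be sketched as follows. This Python sketch is only an illustration under stated assumptions: the function `migrate`, the dictionary model of the target HDD, integer physical addresses that increase by one per piece of data, and the two-element list used for a mapping entry are all hypothetical. The sketch updates only the first field of each entry and leaves the second field untouched, because the handling of the read cache field after migration is not described in this passage.

```python
from typing import Dict, List, Optional, Tuple

# LBA -> [first field (HDD name, physical address), second field (read cache address)]
MappingTable = Dict[str, List[Optional[object]]]


def migrate(to_migrate: List[Tuple[str, bytes]],
            target_hdd: Dict[int, bytes],
            target_name: str,
            first_pba: int,
            mapping_table: MappingTable) -> int:
    """Write data at consecutive addresses and return the next free address."""
    pba = first_pba
    for lba, payload in to_migrate:
        # Direct overwrite: whatever was stored at this address is not erased first.
        target_hdd[pba] = payload
        # Add the entry if it is missing, then record the new HDD address.
        entry = mapping_table.setdefault(lba, [None, None])
        entry[0] = (target_name, pba)
        pba += 1
    return pba
```

Because the addresses handed out are consecutive, the data to be migrated lands in one contiguous run on the target HDD, starting at the first physical address.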
Next, on the basis of any of the above embodiments, a process of migrating data is further described in detail by a specific example in conjunction with fig. 8.
Fig. 8 is a process diagram of a method for migrating data according to an exemplary embodiment of the present application. Referring to fig. 8, the process includes steps (1), (2), (3), and (4).
As shown in fig. 8, a data storage system may include a processor, a memory, a read cache, and at least one HDD. The processor may be connected to the memory, the read cache, and the HDD, respectively.
An address mapping table may be stored in the memory, and the address mapping table may include a plurality of logical addresses and corresponding merged physical addresses.
The at least one HDD may be HDD-1, HDD-2, …, HDD-n, respectively.
Any one HDD may include multiple physical storage units. For example, HDD-1 may include physical storage unit 1, physical storage unit 2, …, physical storage unit s.
Any one HDD may include multiple physical addresses. For example, HDD-2 may include z physical addresses: HDD-2 PBA-1, HDD-2 PBA-2, HDD-2 PBA-3, HDD-2 PBA-4, …, HDD-2 PBA-z.
It should be noted that one physical storage unit may include a plurality of physical addresses. For example, the physical storage unit may be a track of the HDD, and one track may include a plurality of physical addresses.
A plurality of pieces of data may be stored in the read cache. For example, the read cache may include data-1, data-2, …, data-k.
In step (1), the processor may determine valid data in the target physical storage unit.
Specifically, the processor may determine a target physical storage unit among a plurality of physical storage units of the at least one HDD. The target physical storage unit may store invalid data and valid data, and the size of the invalid storage space occupied by the invalid data is greater than or equal to a preset threshold. The processor may determine the valid data in the target physical storage unit. For example, the processor may determine physical storage unit 1 in HDD-1 as the target physical storage unit, where physical storage unit 1 stores invalid data and valid data. If the preset threshold is 90%, the size of the invalid storage space occupied by the invalid data may be 95% of physical storage unit 1. The processor may determine the valid data in physical storage unit 1.
In step (2), the processor may determine the access heat of each piece of data in the read cache, and determine the data whose access heat is less than or equal to the first threshold as the first data. For example, suppose the first threshold is 20 times/s and the read cache includes data-1, data-2, …, and data-k. If the access heat of data-1 is 20 times/s and the access heat of data-2 is 18 times/s, then data-1 and data-2 can be determined as the first data.
It should be noted that steps (1) and (2) may be executed in any order.
Optionally, the processor may determine that the data to be migrated includes: the data in the valid storage space of the target physical storage unit, and the first data. For example, the processor may determine that the data to be migrated includes the data stored in the valid storage space of physical storage unit 1, together with data-1 and data-2.
In step (3), the processor may write the data to be migrated to the target HDD in such a manner that the physical addresses are consecutive.
Specifically, the processor may determine a target HDD corresponding to the data to be migrated and a first physical address in the target HDD, and write the data to be migrated to the target HDD according to the first physical address.
For example, if the data to be migrated includes the valid data in physical storage unit 1, data-1, and data-2, and the processor determines that the target HDD corresponding to the data to be migrated is HDD-2 and the first physical address is HDD-2 PBA-1, the processor may migrate the valid data in physical storage unit 1, data-1, and data-2 to HDD-2, starting at HDD-2 PBA-1.
In step (4), the processor may update the address mapping table according to the first physical address in the target HDD.
It should be noted that, the process of updating the address mapping table may refer to steps S304-S306, and is not described herein again.
In an embodiment of the application, the processor may determine a target physical storage unit in the at least one HDD, and determine a valid storage space and an invalid storage space in the target physical storage unit, thereby determining the data in the valid storage space. The processor may determine the first data in the read cache according to the access heat of each piece of data in the read cache. The processor can determine that the data to be migrated includes the data in the valid storage space of the target physical storage unit and the first data, determine a target HDD corresponding to the data to be migrated and a first physical address in the target HDD, write the data to be migrated into the target HDD according to the first physical address, and update the address mapping table. Owing to the characteristics of the HDD, the data in the original target physical storage unit does not need to be erased after migration; new data can be written by directly overwriting it, which saves the time and energy of an erase operation and improves the writing performance of the HDD.
Fig. 9 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present application. Referring to fig. 9, the electronic device may include a data storage system, and the data storage system may include a processor, a memory, a read cache, and at least one HDD. The processor may be connected to the memory, the read cache, and the HDD, respectively.
Optionally, a memory may be used to store the address mapping table. The address mapping table may include a plurality of logical addresses and a merged physical address corresponding to each logical address.
Optionally, the merged physical address includes a first field and a second field, and the first field may be used to store the physical address in the HDD corresponding to the logical address; the second field may be used to store a physical address in the read cache corresponding to the logical address.
Optionally, the processor may determine a target physical address in the HDD in a physical address continuous manner, write data in the HDD according to the target physical address, and update the address mapping table according to the target physical address.
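Writing in a physical address continuous manner can be pictured as an append-only log whose next address follows the position of the magnetic head. The Python sketch below is a simplified model under stated assumptions: the class `SequentialHDD`, the integer counter standing in for the head position, and the plain dictionary used as the address mapping table are hypothetical, and the read cache field of the merged physical address is omitted for brevity.

```python
from typing import Dict


class SequentialHDD:
    """Toy HDD model that only writes at the current head position."""

    def __init__(self) -> None:
        self.blocks: Dict[int, bytes] = {}
        self.head = 0  # assumption: the head position is modelled as an integer address

    def append(self, payload: bytes) -> int:
        # The target physical address is the address where the head currently is.
        pba = self.head
        self.blocks[pba] = payload
        self.head += 1  # the next write continues at the adjacent address
        return pba


def write_data(lba: str, payload: bytes,
               hdd: SequentialHDD, mapping_table: Dict[str, int]) -> int:
    """Write at the target physical address, then record it for the logical address."""
    pba = hdd.append(payload)
    mapping_table[lba] = pba
    return pba
```

In this model, rewriting a logical address does not seek back to its old location. The new copy is appended and the table is repointed, so the old copy becomes invalid data of the kind that the migration process later reclaims.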
Optionally, the processor may be further configured to read data in the read cache or the HDD according to the address mapping table. That is, the processor may determine a first logical address of the data to be read, and determine a corresponding first merged physical address in the address mapping table according to the first logical address. Since the first merged physical address includes the physical address of the data to be read in the HDD and its physical address in the read cache, the data can be read from the HDD or the read cache according to the first merged physical address.
Optionally, the processor may be further configured to determine data to be migrated in the at least one HDD and the read cache, determine a target HDD corresponding to the data to be migrated and a first physical address in the target HDD, write the data to be migrated to the target HDD according to the first physical address, and update the address mapping table.
In an embodiment of the present application, an electronic device may include a data storage system, which may include a processor, a memory, a read cache, and at least one HDD. The processor may be connected to the memory, the read cache, and the HDD, respectively. Because data can be written in the HDD in a physical address continuous manner, the time the magnetic head of the HDD spends on repeated seeking and switching is reduced, and the writing performance of the HDD is improved; because the read cache can use a high-capacity, low-cost solid state disk (SSD), the data reading performance is improved. Together, these two aspects improve the read-write balance of the data storage system.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer readable medium does not include a transitory computer readable medium such as a modulated data signal or a carrier wave.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement or the like made within the spirit and principle of the present application shall be included in the scope of the claims of the present application.

Claims (11)

1. A data storage system comprising a processor, a memory, a read cache and at least one Hard Disk Drive (HDD), said processor being connected to said memory, said read cache and said HDD respectively,
the memory is used for storing an address mapping table, and the address mapping table comprises the corresponding relation between a logical address and a physical address in the HDD and the corresponding relation between the logical address and a physical address in the read cache;
the processor is used for writing data in the HDD in a physical address continuous manner and updating the address mapping table;
the processor is further configured to read data in the read cache or the HDD according to the address mapping table.
2. The system of claim 1, wherein writing data in the HDD in a physical address continuous manner and updating the address mapping table comprises:
determining a target physical address in the HDD in a physical address continuous manner;
writing data in the HDD according to the target physical address;
and updating the address mapping table according to the target physical address.
3. The system of claim 2, wherein determining a target physical address in the HDD in a physical address continuous manner comprises:
determining a current physical address where a magnetic head of the HDD is currently located;
determining the current physical address as the target physical address.
4. The system of claim 2 or 3, wherein updating the address mapping table according to the target physical address comprises:
determining a target logical address corresponding to the target physical address;
and if the target physical address and the target logical address are not included in the address mapping table, correspondingly adding the target physical address and the target logical address to the address mapping table.
5. The system according to any one of claims 1-4, wherein the address mapping table comprises a plurality of logical addresses and a merged physical address corresponding to each logical address, wherein,
the merged physical address comprises a first field and a second field;
the first field is used for storing a physical address in the HDD corresponding to the logical address;
the second field is used for storing the physical address in the read cache corresponding to the logical address.
6. The system of claim 5, wherein reading data in the read cache or the HDD according to the address mapping table comprises:
determining a first logical address of data to be read;
determining a first merged physical address in the address mapping table according to the first logical address;
and reading data in the read cache or the HDD according to the first merged physical address.
7. The system of claim 6, wherein reading data in the read cache or the HDD according to the first merged physical address comprises:
obtaining a read cache physical address in the second field in the first merged physical address;
if the read cache physical address is a valid address, reading data in the read cache according to the read cache physical address;
if the read cache physical address is an invalid address, obtaining an HDD physical address in the first field of the first merged physical address, and reading data in the HDD according to the HDD physical address when the HDD physical address is a valid address.
8. The system of any one of claims 1-7, wherein the processor is further configured to:
determining data to be migrated in the at least one HDD and the read cache;
determining a target HDD corresponding to the data to be migrated and a first physical address in the target HDD;
and writing the data to be migrated into the target HDD according to the first physical address, and updating the address mapping table.
9. The system of claim 8, wherein determining data to migrate in the at least one HDD and the read cache comprises:
determining a target physical storage unit in the at least one HDD, wherein the size of an invalid storage space in the target physical storage unit is greater than or equal to a preset threshold value, and the data stored in the invalid storage space is invalid data;
determining first data in the read cache according to the access heat of each data in the read cache, wherein the access heat of the first data is less than or equal to a first threshold;
determining that the data to be migrated comprises: data in the valid storage space in the target physical storage unit, and the first data.
10. The system of any of claims 1-9, wherein the read cache is a Solid State Disk (SSD).
11. An electronic device comprising a data storage system as claimed in any one of claims 1 to 10.
CN202211585503.0A 2022-12-09 2022-12-09 Data storage system and device Pending CN115857813A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211585503.0A CN115857813A (en) 2022-12-09 2022-12-09 Data storage system and device


Publications (1)

Publication Number Publication Date
CN115857813A true CN115857813A (en) 2023-03-28

Family

ID=85671934


Country Status (1)

Country Link
CN (1) CN115857813A (en)

Similar Documents

Publication Title
US10671290B2 (en) Control of storage of data in a hybrid storage system
TWI494761B (en) Method of partitioning physical block and memory system thereof
US9317214B2 (en) Operating a memory management controller
JP5943095B2 (en) Data migration for composite non-volatile storage
EP4137924A1 (en) Fragment management method and fragment management apparatus
CN107564558B (en) Implementing decentralized atomic I/O writing
US9524110B2 (en) Page replacement algorithms for use with solid-state drives
CN111930643B (en) Data processing method and related equipment
CN113568562A (en) Storage system, memory management method and management node
US20140372673A1 (en) Information processing apparatus, control circuit, and control method
CN108491290B (en) Data writing method and device
CN111399767A (en) IO request processing method, system, equipment and computer readable storage medium
JP2014220021A (en) Information processor, control circuit, control program, and control method
CN112835828A (en) Direct Memory Access (DMA) commands for non-sequential source and destination memory addresses
CN107632779B (en) Data processing method and device and server
US20150121033A1 (en) Information processing apparatus and data transfer control method
US11010091B2 (en) Multi-tier storage
US20110264848A1 (en) Data recording device
CN108334457B (en) IO processing method and device
CN115857813A (en) Data storage system and device
CN111796757B (en) Solid state disk cache region management method and device
CN114005476A (en) Flash memory, flash memory erasing and writing counting method, electronic equipment and computer storage medium
CN109002265B (en) Data processing method and related device
Paulson et al. Page Replacement Algorithms–Challenges and Trends
KR101874748B1 (en) Hybrid storage and method for storing data in hybrid storage

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination