CN116931830A - Data moving method, device, equipment and storage medium - Google Patents
- Publication number
- CN116931830A CN116931830A CN202310900665.7A CN202310900665A CN116931830A CN 116931830 A CN116931830 A CN 116931830A CN 202310900665 A CN202310900665 A CN 202310900665A CN 116931830 A CN116931830 A CN 116931830A
- Authority
- CN
- China
- Prior art keywords
- data
- information
- moved
- cache
- host
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Memory System Of A Hierarchy Structure (AREA)
Abstract
The application discloses a data moving method, apparatus, device, and storage medium in the field of computer technology. The method includes: acquiring a host data page list, and querying cache information of each piece of data information based on the host data page list, so as to output hit information and a valid flag corresponding to the data information according to the cache information; determining data to be moved and a corresponding target moving area from all the data information based on the hit information and the valid flag; and moving the data to be moved to the corresponding target moving area according to a received target instruction. By obtaining the hit information and the valid flag corresponding to the data information and determining the data to be issued from that information, the application reduces both the demand on cache bandwidth and the burden that finely divided IO segmentation places on the central processing unit, and improves the performance of the whole storage system and the utilization efficiency of the cache.
Description
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data moving method, apparatus, device, and storage medium.
Background
Because continuous logical block addresses mapped to a hard disk may yield different cache hit results, access to those addresses must be segmented according to the cache hit query results, with each run of consecutive logical blocks split into a separate disk read-write operation. The CPU (Central Processing Unit) therefore has to process many finely divided segmentation tasks, and the repeated interactions between the CPU and the hardware accelerator place a severe burden on the CPU, further degrading the performance of the storage system.
Disclosure of Invention
Accordingly, the present application is directed to a data moving method, apparatus, device, and storage medium that reduce the demand on cache bandwidth and the CPU burden caused by finely divided IO segmentation, and improve the performance of the whole storage system and the utilization efficiency of the cache. The specific scheme is as follows:
in a first aspect, the present application discloses a data moving method, including:
acquiring a host data page list, and querying cache information of each piece of data information based on the host data page list, so as to output hit information and a valid flag corresponding to the data information according to the cache information;
determining data to be moved and a corresponding target moving area from all the data information based on the hit information and the valid flag;
and moving the data to be moved to the corresponding target moving area according to the received target instruction.
Optionally, the acquiring the host data page list includes:
and acquiring a host data page list from a host memory, and moving the host data page list to the local.
Optionally, the querying, based on the host data page list, cache information of each data information to output hit information and a valid flag corresponding to the data information according to the cache information includes:
accessing the host data page list locally through a cache controller, and querying the cache information of each piece of data information based on the acquired host data page list;
acquiring hit information corresponding to each piece of data information based on the cache information, and sending the hit information to a hard disk controller and a data moving area;
and outputting the data information corresponding to the effective mark through the cache controller.
Optionally, the outputting, by the cache controller, the data information corresponding to the valid flag includes:
determining, by the cache controller and based on a current cache policy and an input/output type, the valid flag corresponding to each piece of data information, and outputting the valid flag.
Optionally, the determining, based on the hit information and the valid flag, the data to be moved and the corresponding target moving area from all the data information includes:
determining the data information whose valid flag is a first valid flag as the data to be moved, the first valid flag indicating that an actual operation is performed on the hard disk;
acquiring the hit information corresponding to each piece of data to be moved;
judging whether the hit information of each piece of data to be moved is first hit information;
if the hit information of the data to be moved is the first hit information, determining that the corresponding target moving area is a cache;
if the hit information of the data to be moved is not the first hit information, judging whether the hit information of the data to be moved is second hit information;
if the hit information of the data to be moved is the second hit information, determining that the corresponding target moving area is a host; and
determining the data information whose valid flag is a second valid flag as non-moving data, the second valid flag indicating that no actual operation is performed on the hard disk.
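The per-block selection rules above can be sketched as a small decision function. This is an illustrative model only: the names and encodings (`FIRST_VALID`, `FIRST_HIT`, and so on) are assumptions for the sketch, not the patent's actual hardware data layout.

```python
# Hypothetical encodings for the two valid flags and two kinds of hit
# information described in the text; the real encoding is not given.
FIRST_VALID = 1    # actually operate on the hard disk
SECOND_VALID = 0   # no actual operation on the hard disk

FIRST_HIT = "first"    # first hit information -> target area is the cache
SECOND_HIT = "second"  # second hit information -> target area is the host

def target_area(valid_flag, hit_info):
    """Return 'cache', 'host', or None (non-moving data) for one block."""
    if valid_flag == SECOND_VALID:
        return None                    # non-moving data
    if hit_info == FIRST_HIT:
        return "cache"
    if hit_info == SECOND_HIT:
        return "host"
    raise ValueError("unrecognized hit information")
```

A block is thus only ever dispatched when its valid flag calls for an actual hard-disk operation, and the hit information alone selects the destination.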
Optionally, after determining the data to be moved and the corresponding target moving area from all the data information based on the hit information and the valid flag, the method further includes:
when a task is received, reading the valid flags of all the data information;
determining continuous data information whose valid flags are the first valid flag as the current data to be moved; and
placing the logical block addresses of the current data to be moved into one read-write operation to generate the corresponding target instruction, and issuing the target instruction to the hard disk.
Optionally, the moving the data to be moved to the corresponding target moving area according to the received target instruction includes:
receiving the target instruction, and acquiring the hit information of the current data to be moved corresponding to each logical block address contained in the target instruction;
and moving the current data to be moved to the corresponding target moving area based on the hit information.
In a second aspect, the present application discloses a data mover, comprising:
the list acquisition module is used for acquiring a host data page list;
the information output module is used for inquiring the cache information of each data information based on the host data page list so as to output hit information and a valid mark corresponding to the data information according to the cache information;
the information determining module is used for determining data to be moved and a corresponding target moving area from all the data information based on the hit information and the effective mark;
and the data moving module is used for moving the data to be moved to the corresponding target moving area according to the received target instruction.
In a third aspect, the present application discloses an electronic device, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of the data migration method as disclosed above.
In a fourth aspect, the present application discloses a computer-readable storage medium for storing a computer program; wherein the computer program, when executed by a processor, implements the data migration method as previously disclosed.
It can be seen that the present application provides a data moving method, including: acquiring a host data page list, and querying cache information of each piece of data information based on the host data page list, so as to output hit information and a valid flag corresponding to the data information according to the cache information; determining data to be moved and a corresponding target moving area from all the data information based on the hit information and the valid flag; and moving the data to be moved to the corresponding target moving area according to a received target instruction. The application thus obtains the hit information and the valid flag corresponding to the data information, determines the data to be issued from that information, and moves the data to be moved directly to the obtained target moving area. This reduces the demand on cache bandwidth and the CPU burden caused by finely divided IO segmentation, and improves the performance of the whole storage system and the utilization efficiency of the cache.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a data migration method disclosed by the application;
FIG. 2 is a schematic diagram of a hard disk array according to the present disclosure;
FIG. 3 is a flowchart of a specific data movement method according to the present application;
FIG. 4 is a flow chart of a conventional data moving method according to the present application;
FIG. 5 is a diagram of a newly added data structure and data path according to the present disclosure;
FIG. 6 is a schematic diagram of a data flow of a data structure of the present disclosure;
FIG. 7 is a schematic diagram of the IO segmentation logic of the NVMe accelerator disclosed by the application;
FIG. 8 is a flowchart of a specific data movement method according to the present application;
FIG. 9 is a schematic diagram of a data transfer device according to the present application;
fig. 10 is a block diagram of an electronic device according to the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
At present, because continuous logical block addresses mapped to a hard disk may yield different cache hit results, access to those addresses must be segmented according to the cache hit query results, with each run of consecutive logical blocks split into a separate disk read-write operation. The CPU therefore has to process many finely divided segmentation tasks, and the repeated interactions between the CPU and the hardware accelerator place a serious burden on the CPU, further degrading the performance of the storage system. The application therefore provides a data moving method that reduces the demand on cache bandwidth and the CPU burden caused by finely divided IO segmentation, and improves the performance of the whole storage system and the utilization efficiency of the cache.
The embodiment of the invention discloses a data moving method, which is shown in fig. 1 and comprises the following steps:
step S11: and acquiring a host data page list, and inquiring cache information of each data message based on the host data page list so as to output hit information and an effective mark corresponding to the data message according to the cache information.
In this embodiment, a host data page list is obtained, and cache information of each data information is queried based on the host data page list, so as to output hit information and a valid flag corresponding to the data information according to the cache information. Specifically, a host data page list is obtained from a host memory, the host data page list is moved to a local place, the host data page list in the local place is accessed through a cache controller, and the cache information of each data information is queried based on the obtained host data page list; acquiring hit information corresponding to each piece of data information based on the cache information, and sending the hit information to a hard disk controller and a data moving area; and outputting the data information corresponding to the effective mark through the cache controller. It is understood that each data information is determined by the cache controller and based on the current cache policy and input/output type to correspond to the valid flag and output.
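As an illustration of step S11, the per-page query can be modeled as follows. Representing the cache as a plain set of page addresses and the cache policy as a callable are assumptions made for the sketch, not the patent's actual structures.

```python
def query_cache_info(host_page_list, cache_pages, policy_allows):
    """For each page in the host data page list, output its hit
    information and a valid flag, as the cache controller is
    described to do. `cache_pages` and `policy_allows` are
    hypothetical stand-ins for the cache and the cache policy."""
    results = []
    for page in host_page_list:
        hit = page in cache_pages                # cache hit information
        valid = 1 if policy_allows(page) else 0  # valid flag from policy/IO type
        results.append((page, hit, valid))
    return results
```

The output — one (page, hit, valid) triple per entry — is exactly the additional information that the later steps consume.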
It is appreciated that NVMe (Non-Volatile Memory Express, the non-volatile memory host controller interface specification) hard disks are widely used in storage systems such as storage servers because of their excellent performance. Compared with traditional mechanical hard disks with SATA (Serial ATA) or SAS (Serial Attached SCSI) interfaces, the advantages of NVMe hard disks mainly include the following. The NVMe protocol uses the higher-speed PCIe (Peripheral Component Interconnect Express) bus as its underlying transport, raising transmission bandwidth from hundreds of MB/s to several GB/s. The read-write latency of solid-state storage media drops from the order of 10 ms for magnetic media to tens of µs. The NVMe protocol also optimizes the input/output (IO) interaction flow for solid-state media and supports very high concurrency, raising the IO processing capacity of a hard disk from hundreds of IOPS (input/output operations per second) to hundreds of thousands of IOPS; using NVMe hard disks therefore lets a storage system break through its traditional performance bottleneck and greatly increase IOPS. Because conventional hard disk IOPS are very low, the total IOPS of a traditional storage system only reaches the tens to hundreds of thousands, whereas a storage system using NVMe hard disks can provide several million to tens of millions of IOPS; such a system therefore generally needs extensive hardware acceleration, including an NVMe accelerator.
The NVMe accelerator receives, in batches from the CPU, instructions for read and write operations on the hard disk together with the related data structures, translates them into the data structures used for NVMe protocol interaction, and completes the interaction with the hard disk. Because the NVMe protocol requires each read or write to the hard disk to cover one continuous logical block address range, a conventional NVMe accelerator requires that each instruction in the batch issued by the CPU also use continuous logical block addresses. Since nearly every storage system uses a cache to improve read-write performance, continuous logical block access to the hard disk becomes complicated by the hit state of the cache. In the disk array shown in fig. 2, one host IO accesses 13 consecutive logical block addresses of the hard disk array, denoted by numerals 0-12 in the figure; these 13 logical blocks are mapped onto the data hard disks of the array in a certain order. Because of the system cache, when a logical block is written, the data is generally staged in the cache to avoid frequent calculation and writing of the hard disks' redundant protection data and the large latency of an actual hard-disk write. Consequently, part of the logical block addresses to be read may hit in the cache, and that data can be obtained directly from the cache and returned to the host; the other part is not in the cache and must be read from the hard disk. In the specific case shown in fig. 2, consecutive logical block addresses mapped to a hard disk produce different cache hit results: taking hard disk 1 as an example, logical blocks 2 and 4 are not in the cache and must be read from the hard disk, while logical blocks 3 and 5 hit in the cache and need not be read.
Therefore, the conventional method must segment the logical block address access mapped to each hard disk according to the cache hit query results, placing each run of consecutive missed logical blocks into one disk read-write operation. For example, for disk 1, the reads of logical block 2 and logical block 4 must be split into two read operations, whereas for disk 2, logical blocks 7 and 8 can be combined into one read operation.
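The conventional segmentation just described — one disk read per run of consecutive missed blocks — can be sketched as the following loop. The cache-hit lookup is passed in as a callable; this is an illustration under that assumption, not the patent's implementation.

```python
def split_misses_into_reads(lbas, is_hit):
    """Group consecutive missed logical block addresses into separate
    disk reads. For blocks 2-5 on disk 1 with 3 and 5 hit, this yields
    two reads ([2], [4]); for blocks 7-8 with no hits, one read ([7, 8])."""
    reads, run = [], []
    for lba in lbas:
        if is_hit(lba):
            if run:                            # a run of misses just ended
                reads.append(run)
                run = []
        else:
            if run and lba != run[-1] + 1:     # non-consecutive miss
                reads.append(run)
                run = []
            run.append(lba)
    if run:                                    # flush the final run
        reads.append(run)
    return reads
```

Each element of the result corresponds to one disk read operation that the CPU would otherwise have to construct and track individually — which is precisely the burden the invention aims to remove.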
As shown for the conventional method in fig. 3, after a host IO is received it is split and a processing sequence is created, and the cache is queried according to the processing sequence. On a miss, it is judged whether the current sequence is the last processing sequence, and if so, the cache data is moved to the host. If it is not the last processing sequence, variables are initialized and it is judged whether the next block has the same type as the current one: if the types are the same, the block count is incremented by 1, and it is judged whether the current block is the last one — if not, the block pointer is incremented by 1 and the type comparison is repeated. If the types differ, it is judged whether the current type is a hit; if so, an IO is issued to the hard disk and the data is read into the cache. It follows that the prior art has the following drawbacks. The CPU must segment the hard disk IO according to the cache query results, and because the hit state of the cache is generally highly random, the CPU must process many finely divided segmentation tasks. Because the IO segmentation depends on the cache query results, processing a single host IO involves multiple interactions between the CPU and the hardware accelerator, which also burdens the CPU heavily. The hard disk accelerator receives more finely fragmented tasks, which pressures the system's task queue, and the CPU must be notified after each task completes, further increasing its burden. Since NVMe gives the storage system an extremely high IOPS, these drawbacks cause the CPU, already the performance bottleneck, to further compromise system performance.
Step S12: and determining data to be moved and a corresponding target moving area from all the data information based on the hit information and the effective mark.
In this embodiment, after the hit information and the valid flag corresponding to the data information are output according to the cache information, the data to be moved and the corresponding target moving area are determined from all the data information based on the hit information and the valid flag. Unlike a conventional accelerator, the invention does not issue all the logical block addresses to the hard disk at once; instead it determines the number of issues, and which logical block addresses need to be issued, according to additional conditions, namely the hit information and the valid flag. All potentially continuous logical block addresses to be operated on are sent to the NVMe accelerator as one task. The NVMe accelerator also receives the additional information corresponding to each block in the continuous range — the validity information (i.e., the valid flag) and the hit information — and performs the IO segmentation itself according to this information, so that it splits the IO and performs actual operations only on the required addresses.
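A minimal model of the task handed to the NVMe accelerator — one contiguous logical block address range plus the per-block additional information (valid flag and hit information) — might look like this. The field and class names are illustrative assumptions, not the patent's data structures.

```python
from dataclasses import dataclass, field

@dataclass
class BlockInfo:
    lba: int        # logical block address within the contiguous range
    valid: bool     # valid flag: perform an actual hard-disk operation?
    hit: bool       # hit information: True if the block hit in the cache

@dataclass
class AcceleratorTask:
    """One task: all potentially continuous LBAs with per-block info,
    from which the accelerator performs IO segmentation by itself."""
    blocks: list = field(default_factory=list)

# A task covering two blocks: LBA 2 missed the cache, LBA 3 hit it.
task = AcceleratorTask(blocks=[BlockInfo(2, True, False),
                               BlockInfo(3, False, True)])
```

The point of the structure is that the CPU hands over the whole candidate range once, and all splitting decisions move into the accelerator.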
Step S13: and moving the data to be moved to the corresponding target moving area according to the received target instruction.
In this embodiment, after the data to be moved and the corresponding target moving area are determined from all the data information based on the hit information and the valid flag, the data to be moved is moved to the corresponding target moving area according to the received target instruction. Specifically, the target instruction is received, and the hit information of the current data to be moved corresponding to each logical block address contained in the target instruction is acquired; the current data to be moved is then moved to the corresponding target moving area based on the hit information. That is, the hard disk moves the specified data to the host memory according to the instruction issued by the hard disk controller, and the data moving module likewise determines from the hit information which data is moved from the cache to the host, and performs that moving operation.
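The division of labor described here — the hard disk controller moving missed data directly to the host while the data moving module moves hit data from the cache to the host — reduces to a dispatch over the hit information. The tuple layout below is an assumption for the sketch.

```python
def dispatch_moves(blocks):
    """Split valid blocks into two move lists: disk->host for cache
    misses, cache->host for cache hits. Each block is modeled as a
    (lba, hit, valid) tuple; invalid blocks are not moved at all."""
    disk_to_host, cache_to_host = [], []
    for lba, hit, valid in blocks:
        if not valid:
            continue                   # non-moving data
        (cache_to_host if hit else disk_to_host).append(lba)
    return disk_to_host, cache_to_host
```

The first list would be handled by the hard disk controller, the second by the data moving module; neither path stages missed data in the cache.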
According to the invention, data read from the disk can be moved by the NVMe accelerator directly to the host memory instead of into the cache, which reduces the CPU burden caused by finely divided IO and improves the performance of the whole storage system. Software-hardware interaction during the processing of a single host IO is reduced, so that the hardware works continuously and autonomously as far as possible, lowering IO processing latency and CPU load and realizing the hardware acceleration effect. Split data IO can be moved directly from the hard disk to the host memory, reducing the demand on cache bandwidth and thereby improving the utilization efficiency of the cache.
It can be seen that the present application provides a data moving method, including: acquiring a host data page list, and querying cache information of each piece of data information based on the host data page list, so as to output hit information and a valid flag corresponding to the data information according to the cache information; determining data to be moved and a corresponding target moving area from all the data information based on the hit information and the valid flag; and moving the data to be moved to the corresponding target moving area according to a received target instruction. The application thus obtains the hit information and the valid flag corresponding to the data information, determines the data to be issued from that information, and moves the data to be moved directly to the obtained target moving area. This reduces the demand on cache bandwidth and the CPU burden caused by finely divided IO segmentation, and improves the performance of the whole storage system and the utilization efficiency of the cache.
Referring to fig. 4, an embodiment of the present application discloses a data moving method; compared with the previous embodiment, this embodiment further describes and optimizes the technical solution.
Step S21: and acquiring a host data page list, and inquiring cache information of each data message based on the host data page list so as to output hit information and an effective mark corresponding to the data message according to the cache information.
In this embodiment, a host data page list is acquired, and cache information of each piece of data information is queried based on the host data page list, so as to output hit information and a valid flag corresponding to the data information according to the cache information. Using the data page list obtained from the host, part of the data is moved directly from the disk to the host, and another part is moved from the cache to the host. That is, moving data to the host is split into two processes — moving the host data page list and moving the data — where the moved host data page list must also be usable by the NVMe accelerator. As shown in fig. 5, the existing original data structure is the cache data page list; the invention adds two new data structures, namely the host data page list, and the hit information with its valid flag. The original data path moves data from the cache to the host memory; with the new data structures a new data path is created in which data is moved directly through the hard disk controller according to the complete data structure information, without passing through the cache. In other words, a direct path from the accelerator to the host is added to the data path of a conventional NVMe accelerator, which generally reads cache-missed data from the hard disk, stores it in the cache, and then moves it to the host through DMA. This is because a conventional NVMe DMA controller, like a conventional NVMe accelerator, autonomously moves and manages the host data page list and is responsible for the continuous data movement of one NVMe read/write IO. In the invention, data read from the NVMe hard disk is sent directly to the host memory instead of being written to the cache first, which effectively reduces the access bandwidth demand on the cache and improves the utilization efficiency of the cache.
The host data page list is an address pointer list in which each address pointer points to a data page in host memory; the cache data page list is similar, but its pointers point to locally cached data pages. The key points of splitting the data moving process into two steps are as follows. First, the host data page list changes from internal data managed solely by the data moving module into public data that can be provided to both the data moving module and the NVMe accelerator; only then can the NVMe accelerator move data on the hard disk directly to the host. Second, the behavior of the data moving module can be redefined: it can receive additional information to determine which data needs to be moved from the cache to the host, instead of moving, in one pass, the continuous data corresponding to the host IO as in the conventional method. The process of moving data to the host is thus split into two parts, data page list movement and data movement. Data page list movement means moving the list of pointers that describes the discrete target data pages in host memory into the storage system locally. Data movement means reading the data page list and the other information acquired from the host (i.e., the hit information and the valid flag) and moving the data in the cache to the host.
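The two page lists can be modeled as one structure pointing at different memories. The page size and pointer values below are illustrative assumptions; the patent does not specify them.

```python
PAGE_SIZE = 4096  # assumed data page size

class PageList:
    """An address pointer list: each entry points to one data page
    (host memory pages for the host list, locally cached pages for
    the cache list)."""
    def __init__(self, pointers):
        self.pointers = list(pointers)

    def page_for_offset(self, byte_offset):
        """Resolve a byte offset within the IO to its page pointer."""
        return self.pointers[byte_offset // PAGE_SIZE]

# Discrete, non-contiguous host pages backing one host IO (hypothetical).
host_list = PageList([0x10000, 0x84000])
```

Once such a list is public data, either the data moving module or the NVMe accelerator can resolve where each block of the IO lands in host memory.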
As shown in fig. 6, the host data page list is first moved locally and then provided to the hard disk controller and the data moving module. The cache controller then queries the cache and outputs the cache hit information, which is likewise provided to both the hard disk controller and the data moving module, and the cache controller outputs the valid flag according to the current cache policy and IO type. From the valid flag and the cache hit information, the hard disk accelerator determines the data to be moved to the host, the data to be moved to the cache, and the data not to be moved, splits the IO accordingly, and issues it to the hard disk. The hard disk moves the specified data to the host memory according to the instruction issued by the hard disk controller. The data moving module likewise decides from the hit information which data is moved from the cache to the host, and executes that moving operation.
Step S22: and determining data to be moved and a corresponding target moving area from all the data information based on the hit information and the effective mark.
In this embodiment, after the hit information and the valid flag corresponding to the data information are output according to the cache information, the data to be moved and the corresponding target moving area are determined from all the data information based on the hit information and the valid flag. Specifically, the data information whose valid flag is a first valid flag is determined as the data to be moved, the first valid flag indicating that an actual operation is performed on the hard disk. The hit information corresponding to each piece of data to be moved is acquired, and it is judged whether the hit information of each piece of data to be moved is first hit information. If the hit information of the data to be moved is the first hit information, the corresponding target moving area is determined to be the cache; if not, it is judged whether the hit information is second hit information, and if so, the corresponding target moving area is determined to be the host. The data information whose valid flag is a second valid flag is determined as non-moving data, the second valid flag indicating that no actual operation is performed on the hard disk.
It can be appreciated that the NVMe accelerator determines the processing manner of each logical block address in the task according to the valid flag and the cache hit information. Each valid flag and each piece of cache hit information corresponds to one logical block address. The processing rules are shown in table 1 below:
TABLE 1
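Table 1 itself is not reproduced in this text. The following sketch encodes only the per-address rules stated above (first valid flag plus first/second hit information); the enum names and return strings are illustrative, not taken from the patent:

```python
from enum import Enum

class ValidFlag(Enum):
    OPERATE_DISK = 1   # "first valid flag": an actual hard-disk operation is performed
    SKIP_DISK = 0      # "second valid flag": no hard-disk operation for this address

class Hit(Enum):
    FIRST_HIT = "first"    # hit information indicating the cache is the target area
    SECOND_HIT = "second"  # hit information indicating the host is the target area

def classify_lba(valid: ValidFlag, hit: Hit) -> str:
    """Decide the target moving area for one logical block address."""
    if valid is ValidFlag.SKIP_DISK:
        return "no-move"   # second valid flag: treated as non-moving data
    if hit is Hit.FIRST_HIT:
        return "cache"     # move the data to the cache
    if hit is Hit.SECOND_HIT:
        return "host"      # move the data to host memory
    return "no-move"
```

The accelerator would evaluate this rule per address before deciding how to split the task.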
Step S23: when a task is received, reading the valid flags of all the data information, and determining the consecutive data information whose valid flags are the first valid flag as the current data to be moved.
In this embodiment, after determining the data to be moved and the corresponding target moving area from all the data information based on the hit information and the valid flag, when a task is received, the valid flags of all the data information are read, and the consecutive data information whose valid flags are the first valid flag is determined as the current data to be moved. It can be understood that, after the NVMe accelerator receives the task, it reads the valid flags one by one and places the logical block addresses corresponding to consecutive valid flags of 1 into one read operation on the hard disk, which is then issued to the hard disk. The specific implementation flow is as shown in fig. 7: after the task is received, the block address pointer is initialized, and it is judged whether the valid flag is 1. If it is 1, it is judged whether this is the first data block or whether the valid flag of the previous address is 0; if so, a new IO record is created with this starting block address and the block count is cleared. If the valid flag of the previous address is not 0, it is judged whether this is the last data block; if it is not the last data block, it is judged whether the next valid flag is 0; if the next valid flag is 0, the IO is issued to the hard disk, and if not, the block count is incremented by 1.
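The fig. 7 flow above amounts to grouping runs of consecutive addresses whose valid flag is 1 into (starting block, block count) read operations. A minimal sketch of that grouping, with illustrative names:

```python
def split_into_ios(valid_flags):
    """Group runs of consecutive LBAs whose valid flag is 1 into
    (start_block, block_count) read operations, mirroring the fig. 7 flow."""
    ios = []
    start = None   # starting block address of the current IO record
    count = 0      # block count of the current IO record
    for lba, flag in enumerate(valid_flags):
        if flag == 1:
            if start is None:       # previous flag was 0 (or first block): new IO record
                start = lba
                count = 0           # clear the block count
            count += 1              # block count incremented by 1
        elif start is not None:     # run ended: issue the IO to the hard disk
            ios.append((start, count))
            start, count = None, 0
    if start is not None:           # last data block closed the final run
        ios.append((start, count))
    return ios
```

For example, valid flags `[1,1,0,1,0,0,1,1,1]` would be split into three read operations: `(0, 2)`, `(3, 1)` and `(6, 3)`.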
With the above NVMe accelerator, the performance of read IO can be improved through IO splitting. For write IO, however, the redundancy protection data is calculated by different methods depending on the number of disks, so the NVMe accelerator cannot readily take over the splitting from the CPU; this portion of the CPU load is instead reduced by improving the accelerator.
Step S24: storing the logical block addresses of the current data to be moved in one read-write operation to generate the corresponding target instruction, and issuing the target instruction to the hard disk.
In this embodiment, after determining the consecutive data information whose valid flags are the first valid flag as the current data to be moved, the logical block addresses of the current data to be moved are stored in one read-write operation to generate the corresponding target instruction, and the target instruction is issued to the hard disk.
After the method of the present invention is implemented, the new system flow is as shown in fig. 8: after a host IO is received, the host IO is split and a processing sequence is created, the host data page list is obtained, the cache is queried, and a task is issued to the accelerator according to the query result, so that the data to be moved is moved to the host according to the task. Dark flow boxes in the figure indicate hardware acceleration, and light boxes indicate CPU processing. For one host IO, the CPU only needs to split the host IO at the beginning and create a hardware task sequence; the subsequent execution is completed independently by the hardware, which greatly reduces the amount of work processed by the CPU and avoids back-and-forth interaction between the CPU and the hardware.
Step S25: and moving the data to be moved to the corresponding target moving area according to the received target instruction.
In this embodiment, after the target instruction is issued to the hard disk, the data to be moved is moved to the corresponding target moving area according to the received target instruction. It can be understood that, to implement the above data flow, the present invention designs a new NVMe accelerator. Unlike a conventional NVMe accelerator, it allows all the consecutive logical block addresses that might be operated on to be issued to it as a single task; after receiving the task, the NVMe accelerator does not directly initiate disk operations on those addresses, but instead decides, according to the valid flags and hit information, which addresses need to operate on the hard disk and which should be ignored, and splits the single task into multiple read-write operations on the hard disk accordingly. Because the IO splitting is determined by the cache query result, the conventional method must first perform the cache query, then have the CPU intervene to remove the addresses that do not require an actual hard-disk operation (for a read operation, the cache-hit portion of the data), and then issue multiple tasks to the NVMe accelerator, so that every address the accelerator receives actually requires a hard-disk operation. As a result, one host IO must switch between hardware and the CPU many times, and the CPU issues a large number of fragmented tasks to the NVMe accelerator, which increases the CPU load and amplifies the CPU bottleneck that drags down system performance.
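For the read path described above, the accelerator's role can be sketched as follows: it receives one task covering all candidate addresses, drops the cache-hit addresses (which are served from the cache), and groups the remaining consecutive addresses into disk read operations. Function and parameter names are illustrative:

```python
def accel_split_reads(lbas, cache_hits):
    """One task in, multiple disk reads out: skip cache-hit LBAs and group
    the remaining consecutive LBAs into (start, count) read operations."""
    ios, run = [], []
    for lba, hit in zip(lbas, cache_hits):
        if not hit:
            if run and lba != run[-1] + 1:   # gap in addresses: close current run
                ios.append((run[0], len(run)))
                run = []
            run.append(lba)                  # address needs an actual disk read
        elif run:                            # cache hit ends the current run
            ios.append((run[0], len(run)))
            run = []
    if run:                                  # flush the final run
        ios.append((run[0], len(run)))
    return ios
```

This is the filtering the conventional method performs on the CPU before issuing tasks; moving it into the accelerator is what removes the back-and-forth interaction.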
The accelerator in the present invention allows all potentially operated addresses to be issued, so the CPU can determine the addresses involved in the NVMe accelerator task (determined by the address mapping rule) as soon as it receives the host IO, without waiting for the cache hit result. The CPU can therefore create the cache query task and the NVMe accelerator task in sequence and hand them over in full to the hardware for execution, without back-and-forth interaction.
Therefore, according to the embodiment of the present application, the host data page list is obtained, and the cache information of each piece of data information is queried based on the host data page list, so that the hit information and the valid flag corresponding to the data information are output according to the cache information; the data to be moved and the corresponding target moving area are determined from all the data information based on the hit information and the valid flag; when a task is received, the valid flags of all the data information are read, and the consecutive data information whose valid flags are the first valid flag is determined as the current data to be moved; the logical block addresses of the current data to be moved are stored in one read-write operation to generate the corresponding target instruction, which is issued to the hard disk; and the data to be moved is moved to the corresponding target moving area according to the received target instruction. This reduces the demand on cache access bandwidth and the burden that finely divided IO splitting places on the central processing unit, and improves the performance of the overall storage system and the utilization efficiency of the cache.
Referring to fig. 9, the embodiment of the application also correspondingly discloses a data moving device, which comprises:
A list acquisition module 11, configured to acquire a host data page list;
an information output module 12, configured to query cache information of each data information based on the host data page list, so as to output hit information and a valid flag corresponding to the data information according to the cache information;
an information determining module 13, configured to determine data to be moved and a corresponding target moving area from all the data information based on the hit information and the valid flag;
the data moving module 14 is configured to move the data to be moved to the corresponding target moving area according to the received target instruction.
It can be seen that the present application includes: acquiring a host data page list, and querying the cache information of each piece of data information based on the host data page list, so as to output the hit information and the valid flag corresponding to the data information according to the cache information; determining the data to be moved and the corresponding target moving area from all the data information based on the hit information and the valid flag; and moving the data to be moved to the corresponding target moving area according to the received target instruction. The present application thus obtains the hit information and the valid flag corresponding to the data information, determines the data to be issued according to this information, and directly performs the move operation on the data to be moved according to the obtained target moving area, which reduces the demand on cache access bandwidth and the burden that finely divided IO splitting places on the central processing unit, and improves the performance of the overall storage system and the utilization efficiency of the cache.
In some specific embodiments, the list obtaining module 11 specifically includes:
the host data page list moving unit is used for acquiring a host data page list from a host memory and moving the host data page list to the local;
and the host data page list access unit is used for accessing the host data page list in the local area through the cache controller.
In some embodiments, the information output module 12 specifically includes:
a cache information inquiry unit configured to inquire the cache information of each data information based on the acquired host data page list;
the first hit information acquisition unit is used for acquiring hit information corresponding to each piece of data information based on the cache information;
a hit information sending unit, configured to send the hit information to the hard disk controller and the data moving module;
and a valid flag determining unit, configured to determine, by means of the cache controller and based on the current cache policy and the input/output type, the valid flag corresponding to each piece of data information, and to output the valid flag.
In some embodiments, the information determining module 13 specifically includes:
the data information determining unit is used for determining the data information with the effective mark as a first effective mark as the data to be moved; the first effective mark is a mark for representing actual operation on the hard disk;
The second hit information acquisition unit is used for acquiring the hit information corresponding to each piece of data to be moved;
the first hit information judging unit is used for judging whether the hit information of each data to be moved is first hit information or not;
the first target moving area determining unit is used for determining that the corresponding target moving area is a cache if the hit information of the data to be moved is the first hit information;
the second hit information judging unit is used for judging whether the hit information of each piece of data to be moved is second hit information or not if the hit information of the data to be moved is not the first hit information;
the second target moving area determining unit is used for determining that the corresponding target moving area is a host if the hit information of the data to be moved is the second hit information;
a non-moving data determining unit configured to determine the data information whose effective flag is a second effective flag as non-moving data; the second effective mark is a mark for representing that actual operation is not executed on the hard disk;
an effective mark reading unit for reading the effective marks of all the data information when a task is received;
The current data to be moved determining unit is used for determining the continuous data information with the effective marks being the first effective marks as the current data to be moved;
and the target instruction issuing unit is used for storing the logical block address of the current data to be moved in one-time read-write operation so as to generate the corresponding target instruction and issuing the target instruction to the hard disk.
In some embodiments, the data mover module 14 specifically includes:
a target instruction receiving unit configured to receive the target instruction;
a third hit information obtaining unit, configured to obtain the hit information of the current data to be moved corresponding to each logical block address included in the target instruction;
and a data moving unit, configured to move the current data to be moved to the corresponding target moving area based on the hit information.
Further, the embodiment of the application also provides electronic equipment. Fig. 10 is a block diagram of an electronic device 20, according to an exemplary embodiment, and the contents of the diagram should not be construed as limiting the scope of use of the present application in any way.
Fig. 10 is a schematic structural diagram of an electronic device 20 according to an embodiment of the present application. The electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input output interface 25, and a communication bus 26. The memory 22 is configured to store a computer program, which is loaded and executed by the processor 21 to implement the relevant steps in the data migration method disclosed in any of the foregoing embodiments. In addition, the electronic device 20 in the present embodiment may be specifically an electronic computer.
In this embodiment, the power supply 23 is configured to provide an operating voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and an external device, and the communication protocol to be followed is any communication protocol applicable to the technical solution of the present application, which is not specifically limited herein; the input/output interface 25 is used for acquiring external input data or outputting external output data, and the specific interface type thereof may be selected according to the specific application requirement, which is not limited herein.
The memory 22 may be a carrier for storing resources, such as a read-only memory, a random access memory, a magnetic disk, or an optical disk, and the resources stored thereon may include an operating system 221, a computer program 222, and the like, and the storage may be temporary storage or permanent storage.
The operating system 221 is used for managing and controlling the various hardware devices on the electronic device 20 and the computer program 222, and may be Windows Server, Netware, Unix, Linux, etc. The computer program 222 may, in addition to the computer program capable of performing the data moving method performed by the electronic device 20 as disclosed in any of the foregoing embodiments, further comprise a computer program capable of performing other specific tasks.
Further, the embodiment of the application also discloses a storage medium, wherein the storage medium stores a computer program, and when the computer program is loaded and executed by a processor, the steps of the data moving method disclosed in any embodiment are realized.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, so that the same or similar parts between the embodiments are referred to each other. For the device disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant points refer to the description of the method section.
Finally, it is further noted that relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing has described in detail a method, apparatus, device and storage medium for data migration provided by the present invention, and specific examples have been applied herein to illustrate the principles and embodiments of the present invention, and the above examples are only for aiding in the understanding of the method and core idea of the present invention; meanwhile, as those skilled in the art will have variations in the specific embodiments and application scope in accordance with the ideas of the present invention, the present description should not be construed as limiting the present invention in view of the above.
Claims (10)
1. A method of data movement, comprising:
acquiring a host data page list, and inquiring cache information of each data message based on the host data page list so as to output hit information and an effective mark corresponding to the data message according to the cache information;
determining data to be moved and a corresponding target moving area from all the data information based on the hit information and the effective mark;
and moving the data to be moved to the corresponding target moving area according to the received target instruction.
2. The data migration method of claim 1, wherein the obtaining a list of host data pages comprises:
And acquiring a host data page list from a host memory, and moving the host data page list to the local.
3. The data migration method according to claim 2, wherein the querying cache information of each data information based on the host data page list to output hit information and a valid flag corresponding to the data information according to the cache information includes:
accessing the host data page list in the local area through a cache controller, and inquiring the cache information of each data information based on the acquired host data page list;
acquiring hit information corresponding to each piece of data information based on the cache information, and sending the hit information to a hard disk controller and a data moving module;
and outputting, by the cache controller, the valid flag corresponding to the data information.
4. The data migration method of claim 3, wherein the outputting, by the cache controller, the valid flag corresponding to the data information comprises:
determining, by the cache controller and based on the current cache policy and the input/output type, the valid flag corresponding to each piece of data information, and outputting the valid flag.
5. The method according to any one of claims 1 to 4, wherein the determining data to be moved and a corresponding target movement area from all the data information based on the hit information and the valid flag includes:
determining the data information with the effective mark as a first effective mark as the data to be moved; the first effective mark is a mark for representing actual operation on the hard disk;
acquiring the hit information corresponding to each piece of data to be moved;
judging whether the hit information of each data to be moved is first hit information or not;
if the hit information of the data to be moved is the first hit information, determining that the corresponding target moving area is a cache;
if the hit information of the data to be moved is not the first hit information, judging whether the hit information of each data to be moved is second hit information or not;
if the hit information of the data to be moved is the second hit information, determining the corresponding target moving area as a host;
determining the data information with the valid mark as a second valid mark as non-moving data; the second valid flag is a flag indicating that no actual operation is performed on the hard disk.
6. The method according to claim 5, wherein after determining the data to be moved and the corresponding target moving area from all the data information based on the hit information and the valid flag, further comprising:
when a task is received, reading all the valid marks of the data information;
determining the data information which is continuous and the effective marks are the first effective marks as the current data to be moved;
storing the logical block address of the current data to be moved in a read-write operation to generate the corresponding target instruction, and issuing the target instruction to the hard disk.
7. The method of claim 6, wherein the moving the data to be moved to the corresponding target moving area according to the received target instruction comprises:
receiving the target instruction and acquiring the hit information of the current data to be moved, which corresponds to each logic block address contained in the target instruction;
and moving the current data to be moved to the corresponding target moving area based on the hit information.
8. A data mover, comprising:
the list acquisition module is used for acquiring a host data page list;
the information output module is used for inquiring the cache information of each data information based on the host data page list so as to output hit information and a valid mark corresponding to the data information according to the cache information;
the information determining module is used for determining data to be moved and a corresponding target moving area from all the data information based on the hit information and the effective mark;
and the data moving module is used for moving the data to be moved to the corresponding target moving area according to the received target instruction.
9. An electronic device, comprising:
a memory for storing a computer program;
a processor for executing the computer program to carry out the steps of the data migration method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program; wherein the computer program when executed by a processor implements a data migration method as claimed in any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310900665.7A CN116931830A (en) | 2023-07-21 | 2023-07-21 | Data moving method, device, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310900665.7A CN116931830A (en) | 2023-07-21 | 2023-07-21 | Data moving method, device, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116931830A true CN116931830A (en) | 2023-10-24 |
Family
ID=88383949
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310900665.7A Pending CN116931830A (en) | 2023-07-21 | 2023-07-21 | Data moving method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116931830A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118331512A (en) * | 2024-06-14 | 2024-07-12 | 山东云海国创云计算装备产业创新中心有限公司 | Processing method and device based on memory control card |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11099769B1 (en) | Copying data without accessing the data | |
US9612975B2 (en) | Page cache device and method for efficient mapping | |
JP2018509695A (en) | Computer program, system, and method for managing data in storage | |
US9940023B2 (en) | System and method for an accelerator cache and physical storage tier | |
US10552045B2 (en) | Storage operation queue | |
CN112463753B (en) | Block chain data storage method, system, equipment and readable storage medium | |
US11036641B2 (en) | Invalidating track format information for tracks demoted from cache | |
US11080197B2 (en) | Pre-allocating cache resources for a range of tracks in anticipation of access requests to the range of tracks | |
US10275175B2 (en) | System and method to provide file system functionality over a PCIe interface | |
US9946660B2 (en) | Memory space management | |
US10996857B1 (en) | Extent map performance | |
CN116931830A (en) | Data moving method, device, equipment and storage medium | |
CN108132760A (en) | A kind of method and system for promoting SSD reading performances | |
US11080254B2 (en) | Maintaining data associated with a storage device | |
CN116755625A (en) | Data processing method, device, equipment and readable storage medium | |
CN110352410B (en) | Tracking access patterns of index nodes and pre-fetching index nodes | |
WO2022029563A1 (en) | Obtaining cache resources for expected writes to tracks in write set after the cache resources were released for the tracks in the write set | |
JP2007102436A (en) | Storage controller and storage control method | |
US8140804B1 (en) | Systems and methods for determining whether to perform a computing operation that is optimized for a specific storage-device-technology type | |
CN117170582A (en) | Method, device, equipment and storage medium for implementing redirection-on-write snapshot | |
CN118331512B (en) | Processing method and device based on memory control card | |
KR102697447B1 (en) | Half-match deduplication | |
US20240168876A1 (en) | Solving submission queue entry overflow using metadata or data pointers | |
US12135655B2 (en) | Saving track metadata format information for tracks demoted from cache for use when the demoted track is later staged into cache | |
JP2023137488A (en) | Storage system and data cache method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||