CN117032596B - Data access method and device, storage medium and electronic equipment - Google Patents

Data access method and device, storage medium and electronic equipment

Info

Publication number
CN117032596B
CN117032596B
Authority
CN
China
Prior art keywords
data
partition
target
address
disk
Prior art date
Legal status
Active
Application number
CN202311300081.2A
Other languages
Chinese (zh)
Other versions
CN117032596A (en)
Inventor
李飞龙
马艳
许永良
王磊
康佳
孙明刚
Current Assignee
Suzhou Metabrain Intelligent Technology Co Ltd
Original Assignee
Suzhou Metabrain Intelligent Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Suzhou Metabrain Intelligent Technology Co Ltd
Priority to CN202311300081.2A
Publication of CN117032596A
Application granted
Publication of CN117032596B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/0604 Improving or facilitating administration, e.g. storage management
    • G06F 3/061 Improving I/O performance
    • G06F 3/062 Securing storage systems
    • G06F 3/0622 Securing storage systems in relation to access
    • G06F 3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F 3/0629 Configuration or reconfiguration of storage systems
    • G06F 3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F 3/0671 In-line storage system
    • G06F 3/0683 Plurality of storage devices
    • G06F 3/0689 Disk arrays, e.g. RAID, JBOD

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

An embodiment of the present application provides a data access method and apparatus, a storage medium, and an electronic device. The data access method includes: obtaining a first access request for accessing first data stored in a RAID storage system that comprises a plurality of disk groups, where each disk includes a cache partition and a data partition; when the first data is stored on a target disk in the plurality of disk groups and belongs to a first type (data whose access count or access frequency exceeds a preset threshold), determining whether the first data is stored in the cache partition of the target disk; and when the first data is not stored in the cache partition of the target disk, accessing the first data in the data partition of the target disk, copying the first data into the cache partition of the target disk, and adding a target mapping relationship to a mapping table, the target mapping relationship being a mapping between a first address of the first data in the data partition of the target disk and a second address of the first data in the cache partition of the target disk.

Description

Data access method and device, storage medium and electronic equipment
Technical Field
The embodiment of the application relates to the field of computers, in particular to a data access method and device, a storage medium and electronic equipment.
Background
A RAID (Redundant Array of Independent Disks) card is a peripheral of the storage server and an embedded device. Because the performance gap between the disks attached to a RAID storage system and the RAID storage system's memory keeps widening, the disks can become the I/O read/write bottleneck of the system. To avoid this, frequently accessed data is usually kept in the RAID storage system's memory so that it can be served quickly and read efficiently.
However, as an embedded device, the RAID storage system has limited memory capacity; keeping frequently accessed data in that memory consumes a large amount of space and can exhaust it. In addition, under intensive I/O (Input/Output) requests from the foreground host, the memory resources consumed in this way degrade the I/O read/write performance of the RAID storage system.
The data access methods in the related art therefore suffer from poor data access efficiency in RAID storage systems.
Disclosure of Invention
Embodiments of the present application provide a data access method and apparatus, a storage medium, and an electronic device, so as to at least solve the problem in the related art that the data access efficiency of a RAID storage system is poor.
According to one embodiment of the present application, a data access method is provided, including: obtaining a first access request for accessing first data stored in a storage system, where the storage system is a redundant array of independent disks (RAID) storage system comprising a plurality of disk groups, and each disk in each disk group includes a cache partition and a data partition; when the first data is stored on a target disk in the plurality of disk groups and is data of a first type, determining whether the first data is stored in the cache partition of the target disk, where data of the first type is data whose access count or access frequency is greater than a preset threshold; and when the first data is not stored in the cache partition of the target disk, accessing the first data in the data partition of the target disk, copying the first data into the cache partition of the target disk, and adding a target mapping relationship to a mapping table, where the mapping table resides in the memory of the storage system, the target mapping relationship maps a first address to a second address, the first address is the storage address of the first data in the data partition of the target disk, and the second address is the storage address of the first data in the cache partition of the target disk.
According to still another embodiment of the present application, a data access apparatus is provided, including: a first obtaining unit, configured to obtain a first access request for accessing first data stored in a storage system, where the storage system is a RAID storage system comprising a plurality of disk groups, and each disk in each disk group includes a cache partition and a data partition; a first determining unit, configured to determine, when the first data is stored on a target disk in the plurality of disk groups and is data of a first type, whether the first data is stored in the cache partition of the target disk, where data of the first type is data whose access count or access frequency is greater than a preset threshold; and a first execution unit, configured to, when the first data is not stored in the cache partition of the target disk, access the first data in the data partition of the target disk, copy the first data into the cache partition of the target disk, and add a target mapping relationship to a mapping table, where the mapping table resides in the memory of the storage system, the target mapping relationship maps a first address to a second address, the first address is the storage address of the first data in the data partition of the target disk, and the second address is the storage address of the first data in the cache partition of the target disk.
According to a further embodiment of the present application, there is also provided a computer readable storage medium having stored therein a computer program, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.
According to a further embodiment of the present application, there is also provided an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
According to the embodiments of the present application, frequently accessed data is kept in the cache partitions of the disks attached to the RAID system, while the storage addresses of that data are recorded in the memory of the RAID storage system. Frequently accessed data can be copied from the data partition of an attached disk into its cache partition, and the mapping between the data's address in the data partition and its address in the cache partition is then recorded in a mapping table in the memory of the RAID storage system. Because the mapping table occupies few resources, the available memory of the RAID storage system is not affected, and because memory reads and writes are fast, the address of the accessed data in the cache partition can be determined in a short time. In addition, keeping a copy of frequently accessed data in the cache partition effectively reduces the time spent searching for the data on disk, which improves the data access efficiency of the RAID storage system and solves the problem in the related art that the data access efficiency of a RAID storage system is poor.
Drawings
FIG. 1 is a schematic diagram of a hardware environment of a data access method according to an embodiment of the present application;
FIG. 2 is a flow chart of a method of data access according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a data access method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of another data access method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of yet another data access method according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a cache mapping module according to an embodiment of the present application;
FIG. 7 is a flow chart of another data access method according to an embodiment of the present application;
FIG. 8 is a block diagram of a data access apparatus according to an embodiment of the present application.
Detailed Description
Embodiments of the present application will be described in detail below with reference to the accompanying drawings in conjunction with the embodiments.
It should be noted that the terms "first," "second," and the like in the description and the claims of the embodiments of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.
The method embodiments provided in the embodiments of the present application may be performed in a mobile terminal, a computer terminal or similar computing device. Taking a computer terminal as an example, fig. 1 is a schematic diagram of a hardware environment of a data access method according to an embodiment of the present application. As shown in fig. 1, the computer terminal may include one or more (only one is shown in fig. 1) processors 102 (the processor 102 may include, but is not limited to, a microprocessor MCU or a processing device such as a programmable logic device FPGA) and a memory 104 for storing data, wherein the computer terminal may further include a transmission device 106 for communication functions and an input-output device 108. It will be appreciated by those skilled in the art that the configuration shown in fig. 1 is merely illustrative and is not intended to limit the configuration of the computer terminal described above. For example, the computer terminal may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
The memory 104 may be used to store a computer program, for example, a software program of application software and a module, such as a computer program corresponding to a data access method in the embodiment of the present application, and the processor 102 executes the computer program stored in the memory 104, thereby performing various functional applications and data processing, that is, implementing the method described above. Memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory remotely located relative to the processor 102, which may be connected to the mobile terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of a computer terminal. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, simply referred to as NIC) that can connect to other network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is configured to communicate with the internet wirelessly.
According to an aspect of the embodiments of the present application, there is provided a data access method, taking as an example the data access method in the embodiment being executed by a computer terminal, fig. 2 is a flowchart of a data access method according to the embodiments of the present application, as shown in fig. 2, the flowchart including the steps of:
step S202, a first access request is obtained, wherein the first access request is used for requesting to access first data stored in a storage system, the storage system is a redundant array of independent disks storage system, the storage system comprises a plurality of disk groups, and disks in each disk group comprise a cache partition and a data partition.
The data access method in this embodiment can be applied to controlling data access in a RAID storage system. The core component of the RAID storage system may be a RAID card: a board that implements data management, the RAID algorithms, and all other important functions in chip hardware, and that organizes the hard disks connected to the server into RAID arrays according to the configured RAID level. The RAID controller on the card is a chip composed of components such as a cache memory, an I/O processor, a disk controller, and disk connectors; it can be regarded as a peripheral of the storage server. Hardware RAID of this kind was proposed on the basis of software RAID technology.
A RAID card is a peripheral of a storage server and an embedded device. The widening performance gap between the disks attached to the RAID card and the RAID card's memory is the main reason the disks become the card's I/O read/write bottleneck. The usual countermeasure is to keep frequently used data in memory so that it can be accessed quickly, which yields shorter response times and efficient read speeds. The drawbacks are just as apparent: a large amount of memory space is required, which can lead to memory exhaustion, and the RAID card, being an embedded device, has very limited memory capacity. Prior-art methods therefore come at the expense of RAID card memory resources. When intensive I/O requests arrive from the foreground host, the extra memory consumption easily degrades the card's I/O read/write performance, affecting user services and user experience.
To at least partially solve the above problem, in this embodiment a small portion of each disk in the RAID array attached to the RAID card may be carved out to create a cache partition; that is, each disk in a disk group contains a cache partition and a data partition. The cache partition holds copies of frequently accessed data: data that is accessed often in the data partition is copied from the data partition into the cache partition, while data that is not currently being accessed remains only in the data partition, which occupies the rest of the disk. Splitting each disk into a cache partition and a data partition separates the data that is currently important to the client from data that is rarely used.
Carving a cache partition out of the disks of the RAID array and placing frequently used data in it has the following advantages:
creating a large cache from a small portion of all available disks increases the time that important data resides in the cache;
the disk-based cache is persistent, so any optimized layout remains valid for foreground host I/O read/write tasks initiated by the user, even if the RAID card needs to be shut down or reconfigured;
aggregating frequently accessed data provides an opportunity to improve access patterns: with an appropriate layout, accesses that were initially scattered can be performed sequentially, and placing hot data (i.e., frequently accessed data) together reduces addressing time and delays across all disks;
relocating the most commonly accessed data onto the disks' cache partitions makes it possible to maintain QoS (Quality of Service).
In this embodiment, a RAID array attached to the RAID storage system may be a large-capacity disk set formed by combining multiple independent disks; in other words, the RAID storage system may contain a plurality of disk groups. When the first access request enters the RAID card, it may be captured by a request monitoring module in the card. The first access request may be an I/O request used to request access to first data stored in the RAID storage system; correspondingly, the request monitoring module may be the I/O read/write monitoring module of the RAID storage system.
FIG. 3 takes as an example a RAID card carrying three RAID arrays: one RAID 0, one RAID 5, and one RAID 6. In this embodiment, an I/O read/write monitoring module, a cache mapping module, and an I/O redirection module are added to the RAID storage system. In FIG. 3, 101 denotes the I/O read/write monitoring module, 102 the I/O redirection module, 103 the cache mapping module, and 104 a cache partition (each disk in a RAID array attached to the RAID card contains a cache partition and a data partition). The firmware layer of the RAID card includes drivers, the RAID card kernel, a file system, and so on; the kernel and file system provide and manage access to files and logical unit numbers, while the drivers and processors at the firmware layer execute the program instructions that handle host I/O requests. The I/O read/write monitoring module is responsible for capturing and analyzing foreground host I/O read/write requests and tracking the k most recently active distinct data items (k being the current capacity of the cache partition); the I/O redirection module is responsible for intercepting all read/write requests sent to the RAID and redirecting them to the appropriate partition. The cache mapping module stores the mapping between the storage address of data in the data partition and the storage address of the corresponding copy in the cache partition, and occupies only a minimal amount of RAID card memory. A stripe in FIG. 3 is a set of location-related strips on different disks of the array and is the unit by which data is organized across the disks.
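As an illustration of the state these modules share, the following C sketch shows how a mapping record and a per-disk partition split might be represented; the type and field names (lba_t, cache_map_entry, disk_layout) are assumptions made for this example and are not taken from the patent.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t lba_t;  /* logical block address */

/* One record of the cache mapping module: <LBA_original, LBA_cache>.
 * LBA_original is the address in the data partition, LBA_cache the
 * address of the copy in the cache partition. */
struct cache_map_entry {
    lba_t lba_original;
    lba_t lba_cache;
    bool  dirty;   /* set when the cached copy has been modified */
};

/* Per-disk split into a small cache partition and a large data partition. */
struct disk_layout {
    lba_t cache_start;   /* first block of the cache partition  */
    lba_t cache_blocks;  /* size of the cache partition (blocks) */
    lba_t data_start;    /* first block of the data partition   */
    lba_t data_blocks;   /* size of the data partition (blocks) */
};
```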
In practice, the storage capacity managed by a RAID card is huge; it is common for dozens of disks to form a dozen or even twenty or thirty RAID arrays of different levels.
In step S204, if the first data is data stored on the target disk in the plurality of disk groups and the first data is data of a first type, it is determined whether the first data is stored in the cache partition in the target disk, where the first type of data is data having a number of accesses or a frequency of accesses greater than a preset threshold.
In the present embodiment, after determining that the first data is data stored on a target disk in a plurality of disk groups, it may be determined first whether the first data belongs to hot data, that is, data of the first type in the present embodiment. In the event that it is determined that the first data is not the first type of data, the first data may only be present in the data partition. In the event that the first data is determined to be a first type of data, the first data may exist in both the data partition and the cache partition. Here, the first type of data may be data whose access number or access frequency is greater than a preset threshold.
Note that the number of accesses in this embodiment may refer to the number of accesses within a preset time window, and likewise the access frequency may be the access frequency within that window. Data whose access count or access frequency within the preset time exceeds the preset threshold can be regarded as frequently accessed data. The access count or frequency that qualifies data as the first type may differ between storage systems and products.
Thus, when it is determined that the first data is data stored on a target disk in the plurality of disk groups and the first data is data of the first type, it may be first determined whether the first data is stored in a cache partition in the target disk to determine whether the first data can be accessed directly from the cache partition.
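A minimal sketch of the first-type ("hot") test described above, assuming a hypothetical per-window access counter; the threshold value and all names are illustrative, since the patent only requires the count or frequency to exceed a preset threshold.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define PRESET_THRESHOLD 16u   /* illustrative preset threshold */

struct access_stats {
    uint32_t accesses_in_window;   /* access count in the current preset window */
    uint32_t window_seconds;       /* length of the window                      */
};

/* "First type" (hot) data: access count or access frequency within the
 * preset window exceeds the preset threshold. */
bool is_first_type(const struct access_stats *s)
{
    double frequency = (double)s->accesses_in_window / (double)s->window_seconds;
    return s->accesses_in_window > PRESET_THRESHOLD ||
           frequency > (double)PRESET_THRESHOLD;
}

int main(void)
{
    struct access_stats s = { .accesses_in_window = 20, .window_seconds = 60 };
    printf("hot: %d\n", is_first_type(&s));
    return 0;
}
```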
In step S206, when the first data is not stored in the cache partition in the target disk, the first data is accessed in the data partition in the target disk, and copied to the cache partition in the target disk, and the target mapping relationship is added in the mapping relationship table, where the mapping relationship table is located in the memory of the storage system, the target mapping relationship is a mapping relationship between the first address and the second address, the first address is a storage address of the first data in the data partition in the target disk, and the second address is a storage address of the first data in the cache partition in the target disk.
In this embodiment, in a case where it is determined that the first data is not stored in the cache partition in the target disk, the I/O redirection module in the foregoing embodiment may redirect the first access request to the data partition to access the first data in the data partition in the target disk. In addition, when the first data is accessed, the first data in the data partition can be copied to a cache partition in the target disk, so that subsequent access is facilitated, and the first data can be directly accessed from the cache partition.
In this embodiment, copying the data into the cache partition may refer to storing a copy of the data obtained by copying the data in the cache partition.
In order to improve the subsequent access efficiency to the first data in the cache partition, after the first data is copied to the cache partition in the target disk, a target mapping relationship may be added to the mapping relationship table. Here, the mapping relationship table is located in the memory of the RAID storage system, that is, in the foregoing cache mapping module. The target mapping may be a mapping between the first address and the second address. The first address is a storage address of the first data in the data partition in the target disk, and the second address is a storage address of the first data in the cache partition in the target disk.
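The miss path of step S206 might look roughly like the sketch below, which stands in for the data partition and the cache partition with in-memory buffers: the block is read from the data partition, copied into the cache partition, and the <first address, second address> mapping is appended to the table held in memory. All sizes and names are assumptions for illustration.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE   512
#define DATA_BLOCKS  1024   /* size of the (simulated) data partition  */
#define CACHE_BLOCKS 64     /* size of the (simulated) cache partition */
#define MAP_MAX      CACHE_BLOCKS

/* In-memory stand-ins for the two partitions of the target disk. */
static uint8_t data_partition[DATA_BLOCKS][BLOCK_SIZE];
static uint8_t cache_partition[CACHE_BLOCKS][BLOCK_SIZE];

/* Mapping table kept in RAID-card memory: <first address, second address>. */
static struct { uint32_t first; uint32_t second; } map_table[MAP_MAX];
static uint32_t map_count;
static uint32_t next_cache_slot;

/* Cache miss for hot data: access the block in the data partition, copy it
 * into the cache partition, and add the target mapping to the table. */
int handle_hot_miss(uint32_t first_address, uint8_t out[BLOCK_SIZE])
{
    if (first_address >= DATA_BLOCKS || map_count >= MAP_MAX)
        return -1;

    memcpy(out, data_partition[first_address], BLOCK_SIZE);       /* access */
    uint32_t second_address = next_cache_slot++ % CACHE_BLOCKS;
    memcpy(cache_partition[second_address],
           data_partition[first_address], BLOCK_SIZE);            /* copy   */

    map_table[map_count].first  = first_address;                  /* record */
    map_table[map_count].second = second_address;
    map_count++;
    return 0;
}

int main(void)
{
    uint8_t buf[BLOCK_SIZE];
    data_partition[7][0] = 42;
    handle_hot_miss(7, buf);
    printf("read %d, mappings recorded: %u\n", buf[0], map_count);
    return 0;
}
```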
For example, as shown in FIG. 4, the RAID card includes an I/O read/write monitoring module, an I/O redirection module, and a cache mapping module, with the data partition denoted D_a and the cache partition denoted D_c. When an I/O request enters the RAID card (flow A in FIG. 4), it is captured by the card's I/O read/write monitoring module, which determines whether the data accessed by the request is hot. If the data is hot and is not yet in the cache partition, the data is copied to the cache partition (flow B.1 in FIG. 4) and the corresponding mapping <LBA_original, LBA_cache> is stored in the cache mapping module (flow B.2 in FIG. 4).
Optionally, for each received access request, in order to fully exploit the parallelism of the disk array and improve data access efficiency and performance, the host user's I/O request may first be split and then divided into blocks, as shown in FIG. 5: the I/O request is split according to the stripes of the RAID array, and each stripe is then divided according to the individual disk partitions, yielding the strips (stripe units, i.e., a predefined number of consecutive blocks on each disk), the parity (parity values), and the stripes (sets of location-related strips on different disks of the array) shown in FIG. 5. Splitting I/O requests into blocks turns a large I/O request into many small ones, each operating on a single block, which reduces the transfer and response time of the I/O requests and improves data read/write speed.
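The stripe/strip decomposition described above can be sketched as follows; the strip size and the number of data disks are illustrative assumptions, not values taken from the patent.

```c
#include <stdint.h>
#include <stdio.h>

#define STRIP_BLOCKS 8u   /* blocks per strip (stripe unit) - illustrative */
#define DATA_DISKS   4u   /* data disks per stripe          - illustrative */

/* Map one block of a large I/O onto (stripe, disk, offset-within-strip). */
struct block_location {
    uint32_t stripe;   /* which stripe of the array      */
    uint32_t disk;     /* which disk holds the strip     */
    uint32_t offset;   /* block offset inside that strip */
};

struct block_location locate_block(uint32_t lba)
{
    struct block_location loc;
    uint32_t stripe_blocks = STRIP_BLOCKS * DATA_DISKS;
    loc.stripe = lba / stripe_blocks;
    loc.disk   = (lba % stripe_blocks) / STRIP_BLOCKS;
    loc.offset = lba % STRIP_BLOCKS;
    return loc;
}

/* Split a large request [30, 36) into per-block sub-requests. */
int main(void)
{
    for (uint32_t lba = 30; lba < 36; lba++) {
        struct block_location l = locate_block(lba);
        printf("block %u -> stripe %u, disk %u, offset %u\n",
               lba, l.stripe, l.disk, l.offset);
    }
    return 0;
}
```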
Through the above steps, a first access request is obtained for accessing first data stored in a RAID storage system that comprises a plurality of disk groups, each disk containing a cache partition and a data partition. When the first data is stored on a target disk in the disk groups and is of the first type (data whose access count or access frequency exceeds a preset threshold), it is determined whether the first data is stored in the cache partition of the target disk. If it is not, the first data is accessed in the data partition of the target disk, copied into the cache partition of the target disk, and a target mapping relationship is added to the mapping table held in the memory of the storage system, mapping the first address (the storage address of the first data in the data partition of the target disk) to the second address (the storage address of the first data in the cache partition of the target disk). This solves the problem in the related art of poor data access efficiency in RAID storage systems and achieves the technical effect of improving the data access efficiency of the RAID storage system.
In an exemplary embodiment, the above method further comprises:
s11, acquiring a second access request, wherein the second access request is used for requesting to access first data stored in a storage system;
s12, determining whether the first data is stored in a cache partition in the target disk in the plurality of disk groups when the first data is data stored on the target disk in the plurality of disk groups and the first data is data of a first type;
s13, under the condition that the first data is stored in a cache partition in the target disk, acquiring a second address according to a target mapping relation in a mapping relation table;
s14, accessing the first data with the storage address being the second address in the cache partition in the target disk.
In this embodiment, the data stored in the storage system may be accessed multiple times by multiple access requests. The second access request, which requests access to the first data stored in the storage system, may likewise be acquired by the I/O read/write monitoring module described above. The second access request is processed in a way similar to the first access request in the foregoing embodiment, so the details are not repeated here.
In the case where it is determined that the first data is data stored on a target disk of the plurality of disk groups and the first data is a first type of data, it may be determined first whether the first data is stored in a cache partition in the target disk, and then to which partition the second access request needs to be redirected.
Under the condition that the first data is stored in the cache partition in the target disk, the second address can be acquired according to the target mapping relation in the mapping relation table, and the first data with the storage address being the second address can be accessed in the cache partition in the target disk.
For example, as shown in FIG. 4, when an I/O request enters a RAID card (flow A in FIG. 4), the I/O read/write monitor module determines that the data accessed by the I/O request is hot data and is present in the cache partition, and the I/O redirection module redirects access to the data to the cache partition (flows C.1 and C.2 in FIG. 4).
By the embodiment, the first type of data is accessed in the cache partition, so that the data access efficiency can be improved, and the I/O read-write performance in the RAID card can be improved.
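A rough sketch of the redirection decision: if a mapping exists for the requested first address, the I/O is served from the cache partition at the second address, otherwise it stays in the data partition. The linear lookup and all names are placeholders; the patent's mapping table is tree-based, as discussed below.

```c
#include <stdint.h>
#include <stdio.h>

struct mapping { uint64_t first_address; uint64_t second_address; };

static struct mapping map_table[] = { { 1024, 8 }, { 4096, 9 } };
static const int map_count = 2;

/* Returns the address the I/O redirection module should use:
 * the second address (cache partition) on a hit, or the original
 * first address (data partition) on a miss. */
uint64_t redirect(uint64_t first_address, int *served_from_cache)
{
    for (int i = 0; i < map_count; i++) {
        if (map_table[i].first_address == first_address) {
            *served_from_cache = 1;
            return map_table[i].second_address;
        }
    }
    *served_from_cache = 0;
    return first_address;
}

int main(void)
{
    int hit;
    uint64_t target = redirect(4096, &hit);
    printf("hit=%d target=%llu\n", hit, (unsigned long long)target);
    return 0;
}
```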
In one exemplary embodiment, determining whether first data is stored in a cache partition in a target disk includes:
s21, under the condition that the second access request comprises the first address, searching a target mapping relation corresponding to the first address in a mapping relation table;
s22, under the condition that the mapping relation table is searched for the target mapping relation, determining that the first data is stored in the cache partition in the target disk.
Since the mapping relation table records the mapping relation of the addresses of the data in the data partition and the cache partition, in the present embodiment, the access request for requesting to access the data stored in the storage system may be a request including the storage address of the data in the data partition.
When determining whether the first data is stored in the cache partition in the target disk, if the second access request includes the first address, a target mapping relationship corresponding to the first address may be searched in the mapping relationship table. Under the condition that the target mapping relation is found, it can be determined that the first data is stored in the cache partition in the target disk.
As shown in FIG. 6, the mapping relationship recorded in the cache mapping module may be <LBA_original, LBA_cache>, where LBA_original represents the address of the data in the data partition and LBA_cache represents the address in the cache partition.
According to the embodiment, whether the first data is stored in the cache partition is determined through the data recorded in the mapping relation table, and whether the first data is stored in the cache partition can be determined in a short time because the mapping relation table is located in the memory of the RAID storage system, so that the efficiency of data access is improved.
In an exemplary embodiment, the mapping relation table is a binary structure of a tree, and searching the mapping relation table for the target mapping relation corresponding to the first address includes:
S31, searching a target mapping relation corresponding to the first address from a root node in a binary structure of a tree, wherein the binary structure of the tree comprises the root node and a group of leaf nodes.
In this embodiment, the mapping relationship may be processed using a tree-based binary structure algorithm, i.e., the mapping relationship is recorded in a binary structure of a tree. Here, the binary structure of the tree may include one root node and one set of leaf nodes.
In the binary tree, each node has two pointers, one pointing to the left child node and the other pointing to the right child node. The two pointers may be represented by a single binary bit.
To find the target mapping relationship corresponding to the first address in the mapping table, the search starts at the root node: the lookup key is compared with the key of the current node, and according to the result the left or right subtree is chosen for the next comparison, until a matching key is found or a leaf node is reached.
By using the tree-based binary structure to process the mapping, the embodiment can reduce the complexity of searching and improve the searching efficiency.
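A minimal binary-search-tree lookup keyed on the address in the data partition, matching the search just described (compare with the current node's key and descend left or right until a match or a leaf is reached); the node layout and function names are assumptions.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct map_node {
    uint64_t lba_original;   /* key: address in the data partition    */
    uint64_t lba_cache;      /* value: address in the cache partition */
    struct map_node *left, *right;
};

/* Search from the root, comparing keys and descending left or right
 * until the key matches or a null child (past a leaf) is reached. */
struct map_node *map_lookup(struct map_node *root, uint64_t lba_original)
{
    while (root != NULL) {
        if (lba_original == root->lba_original)
            return root;
        root = (lba_original < root->lba_original) ? root->left : root->right;
    }
    return NULL;   /* no target mapping: data not in the cache partition */
}

struct map_node *map_insert(struct map_node *root, uint64_t orig, uint64_t cache)
{
    if (root == NULL) {
        struct map_node *n = calloc(1, sizeof(*n));
        n->lba_original = orig;
        n->lba_cache = cache;
        return n;
    }
    if (orig < root->lba_original)
        root->left = map_insert(root->left, orig, cache);
    else if (orig > root->lba_original)
        root->right = map_insert(root->right, orig, cache);
    return root;
}

int main(void)
{
    struct map_node *root = NULL;
    root = map_insert(root, 4096, 9);
    root = map_insert(root, 1024, 8);
    struct map_node *hit = map_lookup(root, 1024);
    printf("%s\n", hit ? "stored in cache partition" : "not cached");
    return 0;
}
```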
In an exemplary embodiment, after the second access request is obtained, the method further includes:
S41, when the first data is data stored on the target disk in the plurality of disk groups, the second access request includes the first address, and the first data is not data of the first type, accessing the first data stored at the first address in the data partition of the target disk.
In order to reduce the time of data access when the first data is not of the first type, the second access request may be redirected directly into the data partition. The second access request may include an address of the accessed data in the data partition.
Optionally, consider the case where the first data was of the first type during some earlier period, so it is stored in the cache partition and the mapping between its addresses in the data partition and the cache partition is recorded in the cache mapping module, but by the time the second access request is obtained its access count or frequency has dropped and it is no longer of the first type. In that case, when the first data is stored on a target disk in the plurality of disk groups, the second access request includes the first address, and the first data is not of the first type, it may still be determined in the manner described above whether the cache mapping table contains a target mapping relationship for the first address; if that mapping is found, the data at the second address determined from it is accessed in the cache partition.
By the embodiment, when the first data is not the first type of data, the second access request is directly redirected to the data partition, so that the time for data access can be reduced.
In an exemplary embodiment, after the first access request is acquired and before the second access request is acquired, the method further includes:
s51, determining the type of the first data as a second type different from the first type when the access times or the access frequency of the first data are detected to be smaller than or equal to a preset threshold, wherein the second type of data is the data with the access times or the access frequency smaller than or equal to the preset threshold.
In this embodiment, after the first data is copied to the cache partition because the first access request is acquired and the first data accessed by the first access request is hot data, the number of accesses or the access frequency of the first data may be detected in real time or at a certain time interval. As in the previous embodiment, the number of accesses or the access frequency here may be the number of accesses or the access frequency within a preset time.
It should be noted that, when the first data is hot data, the I/O redirection module may redirect all accesses to the first data into the cache partition until the I/O read/write monitoring module determines that the first data is no longer hot data.
When it is determined that the number of accesses or the frequency of accesses of the first data is less than or equal to the preset threshold, the type of the first data may be determined as a second type different from the first type. The second type here may be data of which the number of accesses or the access frequency is less than or equal to a preset threshold.
Optionally, in the case that the type of the first data is determined to be a second type different from the first type, the method further includes:
deleting the target mapping relation in the mapping relation table; or,
deleting the target mapping relation in the mapping relation table under the condition that the number of the mapping relations recorded in the mapping relation table is larger than a preset number threshold; or,
and deleting the target mapping relation in the mapping relation table when the mapping relation table is full.
In this embodiment, after the first data is copied to the cache partition because the first data is hot data, when it is detected that the number of accesses or the access frequency of the first data is reduced to be smaller than a preset threshold, it may be determined that the first data is no longer hot data, and then the target mapping relationship may be deleted in the mapping relationship table.
Alternatively, when the first data is no longer hot data, the first data in the cache partition may be replaced with new hot data for the first data cached in the cache partition. The data stored in the cache partition may be arranged according to the determined number of accesses or the size of the access frequency, that is, the data with the largest number of accesses or the largest access frequency may be located at the position where the cache partition is accessed first.
In this embodiment, for the target mapping relationship of the first data that is no longer the hot data and is recorded in the mapping relationship table, the target mapping relationship may be deleted in the mapping relationship table if the number of the mapping relationships recorded in the mapping relationship table is greater than a preset number threshold, or the target mapping relationship may be deleted in the mapping relationship table if the mapping relationship table is full.
A full mapping table means that the number of recorded mappings has reached the maximum; for example, if the mapping table can record at most 100 mappings and 100 mappings are currently recorded, the mapping table is full.
By the embodiment, when the first data is no longer hot data, the target mapping relation corresponding to the first data recorded in the mapping relation table is deleted, so that the resource utilization rate of the mapping relation table can be improved, and the memory resource occupied by the mapping relation table can be reduced.
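The deletion policy described above might be sketched as follows, with the table capacity, the preset number threshold, and all names chosen only for illustration.

```c
#include <stdint.h>

#define MAP_CAPACITY   100u   /* maximum number of recorded mappings  */
#define MAP_SOFT_LIMIT  80u   /* preset number threshold (assumption) */

struct mapping { uint64_t first; uint64_t second; };

static struct mapping map_table[MAP_CAPACITY];
static uint32_t map_count;

static void delete_mapping(uint64_t first_address)
{
    for (uint32_t i = 0; i < map_count; i++) {
        if (map_table[i].first == first_address) {
            map_table[i] = map_table[--map_count];   /* swap-remove */
            return;
        }
    }
}

/* Called when the monitoring module downgrades data to the second type.
 * Policy 0: delete immediately; policy 1: delete only once the table holds
 * more than the preset number threshold; policy 2: delete only when full. */
void on_no_longer_hot(uint64_t first_address, int policy)
{
    if (policy == 0 ||
        (policy == 1 && map_count > MAP_SOFT_LIMIT) ||
        (policy == 2 && map_count == MAP_CAPACITY))
        delete_mapping(first_address);
}
```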
In one exemplary embodiment, accessing first data having a storage address of a second address in a cache partition in a target disk includes:
s61, when the value of the first data in the data partition in the target disk is the second value and the second access request is used for requesting to update the value of the first data to the first value, updating the value of the first data with the access storage address of the second address in the cache partition in the target disk from the second value to the first value.
In a RAID storage system, data in the storage system may be read or written by sending I/O requests, i.e., reading data stored by the storage system or writing new data to the storage system. In this embodiment, the second access request may be used to request updating of the value of the first data to the first value. The value here may be the actual data content corresponding to the data itself. For example, the first data is the temperature at 13 stored in the storage system, and the value of the first data is a specific value of the temperature.
Since the data in the cache partition is a copy of the portion of the data stored in the data partition, the value of the data in the cache partition and the value of the corresponding data in the data partition are generally the same. But the value of the data may be changed by the I/O request.
In this embodiment, because the first data is stored in the cache partition, when the value of the first data in the data partition of the target disk is the second value and the second access request asks to update that value to the first value, accessing the first data at the second address in the cache partition of the target disk means updating its value there from the second value to the first value.
Optionally, after updating the value of the first data having the access storage address of the second address in the cache partition in the target disk from the second value to the first value, the method further includes:
and updating the value of the first data in the data partition in the target disk from the second value to the first value under the condition that the first value of the first data in the cache partition in the target disk is inconsistent with the second value of the first data in the data partition in the target disk.
It should be noted that any update to the data in the cache partition may be written back to the corresponding data in the data partition (as shown in the flow D of fig. 4). Any updates herein may include changes, deletions, additions, etc. to the data content.
Alternatively, in the case where the first data is stored in the cache partition due to the second access request, the value of the first data in the cache partition may be updated immediately after the second access request is obtained, and the updating of the value of the first data in the data partition may be performed when the RAID storage system is idle.
According to the embodiment, when the access request is used for updating the first data, the value of the first data in the cache partition is updated first, and then the value of the first data in the data partition is updated, so that the response efficiency to the access request can be improved.
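The write flow above, in which the cached copy is updated first and the data partition is brought back into line later (for example when the storage system is idle), might look like the following sketch; the in-memory buffers and single-entry mapping are simplifications for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BLOCKS     16
#define BLOCK_SIZE 512

static uint8_t data_partition[BLOCKS][BLOCK_SIZE];
static uint8_t cache_partition[BLOCKS][BLOCK_SIZE];

struct mapping { uint32_t first; uint32_t second; bool dirty; };
static struct mapping map;   /* single entry, for brevity */

/* Write request for cached hot data: update the copy in the cache
 * partition and mark the mapping dirty; the data partition still
 * holds the stale second value for now. */
void write_first_data(const uint8_t new_value[BLOCK_SIZE])
{
    memcpy(cache_partition[map.second], new_value, BLOCK_SIZE);
    map.dirty = true;
}

/* Later (e.g. when the RAID storage system is idle), write the first
 * value back so the data partition is consistent again. */
void write_back_if_dirty(void)
{
    if (map.dirty) {
        memcpy(data_partition[map.first], cache_partition[map.second], BLOCK_SIZE);
        map.dirty = false;
    }
}
```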
In an exemplary embodiment, the above method further comprises:
s71, responding to the received operation instruction, and determining whether a first value of first data in a cache partition in the target disk is consistent with a second value of first data in a data partition in the target disk;
s72, when the first value of the first data in the cache partition in the target disk is inconsistent with the second value of the first data in the data partition in the target disk, updating the value of the first data in the data partition in the target disk from the second value to the first value.
In this embodiment, the data stored in the cache partition can also be forcibly flushed down to the data partition of the corresponding disk. Flushing down here refers to writing data from the cache partition to the data partition of the disk; a forced flush may be triggered by a user-issued instruction.
It should be noted that after the data is flushed from the cache partition to the data partition, the data will not be present in the cache partition.
In this embodiment, after the received operation instruction, it may be determined whether the first value of the first data in the cache partition in the target disk is consistent with the second value of the first data in the data partition in the target disk, considering that the data in the cache partition is copied from the data partition and the data in the cache partition is not necessarily updated all. The operation instruction herein may be to flush down the first data in the cache partition into the data partition.
If the first value of the first data in the cache partition in the target disk is inconsistent with the second value of the first data in the data partition in the target disk, the value of the first data in the data partition in the target disk can be updated from the second value to the first value.
Optionally, before determining, in response to the received operation instruction, whether the first value of the first data in the cache partition in the target disk is consistent with the second value of the first data in the data partition in the target disk, the method further includes:
and receiving an operation instruction, wherein the operation instruction is an instruction triggered by the detected command line interface or the graphical interface.
It should be noted that, a command input by a user through a command line CLI (command-line interface) or a graphical interface GUI (Graphical User Interface ) may be used as an operation instruction in this embodiment.
For example, taking the foregoing judgment of whether the data values are consistent as an example, when the user inputs a command to force the cache partition to flush down the cached hot data through the command line CLI or the graphical interface GUI, the I/O read/write monitoring module checks whether the copy cached in the cache partition is dirty data (i.e., the data in the cache partition is inconsistent with the corresponding original data in the data partition), and if so, schedules the corresponding I/O operation to update the original data.
According to this embodiment, when an instruction to flush the first data from the cache partition down to the data partition is received, it is first checked whether the values of the first data in the cache partition and the data partition are consistent, so as to decide whether the command needs to be executed; this avoids wasting resources.
In one exemplary embodiment, after determining whether the first value of the first data in the cache partition in the target disk is consistent with the second value of the first data in the data partition in the target disk in response to the received operation instruction, the method further comprises:
s81, when the first value of the first data in the cache partition in the target disk is consistent with the second value of the first data in the data partition in the target disk, the first data is replaced by the data newly copied to the cache partition.
When it is determined that the value of the first data in the data partition is consistent with the value in the cache partition, since the value of the first data is consistent, no influence is given to the value of the first data in the data partition no matter whether the operation instruction is executed or not, in this embodiment, the operation instruction may be directly ignored, and when new data enters the cache partition, the first data cached in the cache partition may be replaced with the new data.
It should be noted that replacing the data stored in the cache partition may be performed by the I/O read/write monitoring module, which may use the LFUDA algorithm (Least Frequently Used with Dynamic Aging, a cache-eviction algorithm whose core idea is to delete the least frequently used cached data) to decide, based on how often data is accessed, which data in the cache partition can be replaced with new data. When replacing data, the entry with the minimum key is replaced. The key K_i is computed as shown in equation (1), where C_i is the retrieval cost (the time and resource cost required to retrieve the data), F_i is the frequency count (the number of times the data is accessed over a period), L is the running age factor (reflecting the effect of how often and how long data is accessed on the life and performance of the storage system), and i starts from 0.
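Equation (1) itself is not reproduced in this text; a common form of the LFU-DA priority key that is consistent with the definitions above is K_i = C_i x F_i + L, and the sketch below uses that form as an assumption, together with the usual dynamic-aging step of raising L to the evicted entry's key.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed key form (equation (1) is not reproduced in the text):
 * K_i = C_i * F_i + L, where C_i is the retrieval cost, F_i the
 * frequency count, and L the running age factor. */
struct cache_entry {
    double   cost;       /* C_i: time/resource cost to retrieve the data */
    uint32_t frequency;  /* F_i: number of accesses over a period        */
};

static double running_age;   /* L: raised to the evicted entry's key */

double lfuda_key(const struct cache_entry *e)
{
    return e->cost * (double)e->frequency + running_age;
}

/* Pick the victim with the smallest key and age the cache. */
int lfuda_pick_victim(const struct cache_entry *entries, int n)
{
    int victim = 0;
    for (int i = 1; i < n; i++)
        if (lfuda_key(&entries[i]) < lfuda_key(&entries[victim]))
            victim = i;
    running_age = lfuda_key(&entries[victim]);  /* dynamic aging */
    return victim;
}

int main(void)
{
    struct cache_entry e[3] = { {1.0, 5}, {1.0, 2}, {2.0, 1} };
    printf("evict entry %d\n", lfuda_pick_victim(e, 3));
    return 0;
}
```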
According to the embodiment, when the values of the first data in the cache partition and the data partition indicated by the operation instruction are consistent, the values of the first data in the cache partition are not used for updating the values of the first data in the data partition according to the operation instruction, so that the waste of resources can be avoided.
In one exemplary embodiment, determining whether first data is stored in a cache partition in a target disk includes:
S91, if the first access request comprises the first address, searching whether a target mapping relation corresponding to the first address exists in a mapping relation table;
s92, under the condition that the target mapping relation is not found in the mapping relation table, determining that the first data is not stored in the cache partition in the target disk.
For the received first access request, when the first access request includes the first address, the mapping relation table may be searched for whether the target mapping relation corresponding to the first address exists, and if the mapping relation table does not find the target mapping relation, it may be determined that the first data is not stored in the cache partition in the target disk, similar to the foregoing process of determining whether the first data is stored in the cache partition in the target disk based on the second access request.
It should be noted that, in the present embodiment, the process of searching the mapping relationship corresponding to the first address in the mapping relationship table is similar to the description of the foregoing embodiment, and the description of the present embodiment is omitted herein.
In an exemplary embodiment, the above method further comprises:
s101, under the condition that a new disk appears in a storage system, determining second data which are in a cache partition and are inconsistent with corresponding values in a data partition;
S102, updating the value of the second data in the data partition by using the value of the second data in the cache partition, and controlling all data stored in the cache partition to be invalid.
In this embodiment, disks may be added to or removed from the RAID array attached to the RAID card. After a new disk is added, the cache partitions may be rebalanced by the I/O read/write monitoring module: all data in the cache partitions of the current RAID array is invalidated and the size of the cache partitions is reconfigured. Invalidating the data can be done by purging it, refreshing it, updating it, or in other ways.
In this embodiment, when it is determined that a new disk has appeared in the RAID array, before all data stored in the cache partitions is invalidated, the second data in the cache partition whose value is inconsistent with the corresponding value in the data partition may be identified first; the value of the second data in the data partition is updated using its value in the cache partition, and only then is the data in the cache partition invalidated.
Optionally, after all data stored in the control cache partition fails, the method further includes:
and in response to the acquired third access request, accessing third data corresponding to the third access request in a data partition in the target disk, copying the third data into a cache partition in the target disk, adding a mapping relation between a third address and a fourth address in a mapping relation table, wherein the third address is a storage address of the third data in the data partition, and the fourth address is a storage address of the third data in the cache partition.
After all data in the cache partition has been invalidated because a new disk was added, the cache partition can be refilled as access requests arrive: the data in the data partition targeted by an access request is copied into the cache partition while it is being accessed, and its mapping relationship is added to the mapping table.
According to this embodiment, after a disk is newly added, the corresponding data in the data partition is first updated using the cached data that is inconsistent with the original data in the data partition, and only then is all data in the cache partition invalidated, so that data loss caused by rebalancing the cache partition can be avoided.
In one exemplary embodiment, adding the target mapping relationship to the mapping relationship table includes:
S111, adding a target mapping relation and a modification mark of the first data in the mapping relation table, wherein the modification mark is used for indicating whether the value of the first data in the cache partition is modified or not.
To improve the response efficiency of data access, when data that is stored in both the data partition and the cache partition needs to be updated, the update can be performed in the cache partition. Correspondingly, in addition to the mapping relation between the addresses of the data in the data partition and the cache partition, the mapping relation table can record whether the data in the cache partition has been updated.
In this embodiment, when adding the target mapping relation corresponding to the first data to the mapping relation table, a modification flag of the first data may also be added. The modification flag is used to indicate whether the value of the first data in the cache partition has been modified. The modification here refers to the data update described in the foregoing embodiment.
For example, for the address of each piece of data stored in the cache partition, a 4-tuple and one dirty bit may be stored in the mapping table, plus 8 extra bytes for a structure pointer. Here, the dirty bit indicates whether the data has been modified in the cache partition.
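As an illustration of this layout, one possible C structure is sketched below; the concrete fields chosen for the 4-tuple are assumptions, since the application does not enumerate them here.

```c
#include <stdint.h>

/* One possible entry layout matching the "4-tuple + dirty bit + 8-byte
 * structure pointer" description; the concrete fields chosen for the 4-tuple
 * are illustrative assumptions. */
struct map_entry {
    uint64_t data_addr;      /* assumed tuple field: address in the data partition  */
    uint64_t cache_addr;     /* assumed tuple field: address in the cache partition */
    uint32_t block_len;      /* assumed tuple field: length of the cached block     */
    uint32_t flags : 31;     /* assumed tuple field: bookkeeping flags              */
    uint32_t dirty : 1;      /* dirty bit: set when the cached value was modified   */
    struct map_entry *next;  /* the 8-byte structure pointer (e.g. a tree/list link) */
};
```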
By the embodiment, the mapping relation of the data and the modification mark of the data are recorded in the mapping relation table, so that the efficiency of determining the data update can be improved.
In an exemplary embodiment, after adding the modification flag of the first data and the target mapping relation to the mapping relation table, the method further includes:
S121, under the condition that the value of the first data in the cache partition is modified, adding a modification record of the value of the first data in the cache partition in a modification log, wherein the modification log is positioned in a memory in a storage system.
When data is stored in both the data partition and the cache partition, updates are generally performed in the cache partition, and the updated data is later written back to the corresponding position in the data partition based on the mapping relation recorded in the mapping relation table; if the mapping relation table is damaged, the updated data is therefore easily lost. Considering this, in this embodiment a modification log may be added in the memory of the RAID storage system, and modification records for the data stored in the cache partition are recorded in this modification log.
It should be noted that, in addition to the modification records of the data, information such as the position of the data in the data partition may be recorded in the modification log, so that when the mapping relation table fails, the data can be recovered based on the modification log.
In this embodiment, when it is determined that the value of the first data in the cache partition is modified, a modification record of the value of the first data in the cache partition may be added to the modification log. In the event of a failure of the mapping relation table, the dirty data in the cache partition (i.e., data whose value is inconsistent with the original data in the data partition) can be restored based on the modification records in the modification log, as in the sketch below.
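A minimal sketch of such a modification log follows; the record layout, the 512-byte block size, and the helper functions are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory modification log: each record keeps the position of
 * the data in the data partition and its new value, so dirty cache data can
 * be replayed if the mapping relation table fails. */
struct mod_record {
    uint64_t data_addr;
    uint8_t  new_value[512];
};

void log_append(const struct mod_record *rec);                            /* assumed */
void write_to_data_partition(uint64_t addr, const void *val, size_t len); /* assumed */

/* Called whenever the value of the first data in the cache partition is modified. */
void on_cache_modify(uint64_t data_addr, const uint8_t value[512])
{
    struct mod_record rec = { .data_addr = data_addr };
    memcpy(rec.new_value, value, sizeof rec.new_value);
    log_append(&rec);
}

/* Replay the log to restore dirty data after a mapping-table failure. */
void replay_log(const struct mod_record *records, size_t count)
{
    for (size_t i = 0; i < count; i++)
        write_to_data_partition(records[i].data_addr, records[i].new_value,
                                sizeof records[i].new_value);
}
```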
According to this embodiment, recording the data modification records of the cache partition in a log provides recovery capability for this part of the data, avoids data loss, and improves the reliability of the storage system.
In an exemplary embodiment, the above method further comprises:
S131, configuring the size of a cache partition of the magnetic disk in each magnetic disk group according to the acquired partition indication, wherein the partition indication is used for indicating the size of the cache partition.
In this embodiment, the size of the cache partition need not be fixed: the user may configure it through a command line interface (CLI) or a graphical user interface (GUI) to better meet storage requirements. The size of the cache partition of the disks in each disk group may be configured or changed according to the obtained partition indication, where the partition indication may be an indication triggered by the user via the CLI or GUI.
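A minimal sketch of applying such a partition indication follows; the structure names and fields are assumptions, since the application does not prescribe a concrete data model for the disk groups.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed in-memory view of the disk groups managed by the RAID card. */
struct disk       { uint64_t cache_part_bytes; };
struct disk_group { struct disk *disks; size_t ndisks; };

/* Apply a partition indication (e.g. triggered from the CLI or GUI) to the
 * cache partition of every disk in every disk group. */
void apply_partition_indication(struct disk_group *groups, size_t ngroups,
                                uint64_t cache_size_bytes)
{
    for (size_t g = 0; g < ngroups; g++)
        for (size_t d = 0; d < groups[g].ndisks; d++)
            groups[g].disks[d].cache_part_bytes = cache_size_bytes;
}
```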
It should be noted that the data partition may be managed by any data allocation policy, provided that the data partition can grow moderately and any data in it can be accessed with acceptable performance.
According to the embodiment, the size of the cache partition is configured based on the indication of the user, so that the RAID storage system can meet different storage requirements.
The data access method in the embodiment of the present application is explained below in conjunction with an alternative example.
In this optional example, an I/O read/write monitoring module, a cache mapping module and an I/O redirection module are newly added in the RAID card. The I/O read/write monitoring module is responsible for analyzing the I/O read/write requests of the foreground host, identifying the working set, and scheduling appropriate operations to copy data between partitions (the partitions are divided into cache partitions and data partitions, where the cache partitions store frequently accessed data). The cache mapping module occupies only a very small amount of the RAID card's memory resources and records the mapping relation between the addresses of data in the cache partitions and in the data partitions. The I/O redirection module is responsible for intercepting all read/write requests sent to the RAID array and redirecting them to the appropriate partition. Through the cooperation of these three modules, the I/O read/write performance of the RAID card can be improved to the greatest extent, the experience of users using the RAID card is improved, and the market competitiveness of the RAID card is enhanced.
As shown in fig. 7, the data access method in this alternative example may include the steps of:
In step S702, an I/O request is acquired by an I/O read/write monitoring module.
In step S704, when the target data accessed by the I/O request is hot data and is not in the cache partition, the I/O request is redirected into the data partition by the I/O redirection module.
In step S706, the target data in the data partition is copied to the cache partition, the address of the target data in the data partition and the cache partition is recorded in the cache mapping module, and then all accesses to the target data are redirected to the cache partition.
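The read path of steps S702 to S706 might be sketched as follows; all helper names are assumptions, and the sketch only illustrates the cooperation of the I/O read/write monitoring, cache mapping and I/O redirection modules.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All helper names below are assumptions used to illustrate steps S702-S706. */
bool is_hot(uint64_t addr);                                      /* I/O read/write monitor */
bool map_table_lookup(uint64_t data_addr, uint64_t *cache_addr); /* cache mapping module   */
void map_table_insert(uint64_t data_addr, uint64_t cache_addr);
void read_data_partition(uint64_t addr, void *buf, size_t len);
void read_cache_partition(uint64_t addr, void *buf, size_t len);
uint64_t copy_to_cache_partition(const void *buf, size_t len);

/* I/O redirection module: a hot-data miss is served from the data partition,
 * copied to the cache partition, and the mapping is recorded so that later
 * accesses to the same data are redirected to the cache partition. */
void handle_read(uint64_t addr, void *buf, size_t len)
{
    uint64_t cache_addr;
    if (map_table_lookup(addr, &cache_addr)) {
        read_cache_partition(cache_addr, buf, len);        /* hit: use the cache partition     */
    } else if (is_hot(addr)) {
        read_data_partition(addr, buf, len);               /* S704: redirect to data partition */
        map_table_insert(addr, copy_to_cache_partition(buf, len));  /* S706: copy and record   */
    } else {
        read_data_partition(addr, buf, len);               /* cold data: data partition only   */
    }
}
```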
With this alternative example, the I/O read/write performance of the RAID card can be improved without affecting the memory storage space of the RAID card.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of combinations of actions, but those skilled in the art should understand that the present application is not limited by the described order of actions, as some steps may be performed in another order or simultaneously in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the present application.
From the description of the above embodiments, it will be clear to those skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, or by means of hardware, although in many cases the former is the preferred implementation. Based on such understanding, the technical solutions of the embodiments of the present application, or the portions thereof contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, magnetic disk, or optical disk) that includes several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods of the embodiments of the present application.
According to still another aspect of the embodiments of the present application, a data access device is further provided, and the device is configured to implement the data access method provided in the foregoing embodiments, which is not described herein. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the means described in the following embodiments are preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
Fig. 8 is a block diagram of a data access device according to an embodiment of the present application, as shown in fig. 8, the device includes:
a first obtaining unit 802, configured to obtain a first access request, where the first access request is used to request access to first data stored in a storage system, where the storage system is a redundant array of independent disks storage system, and the storage system includes a plurality of disk groups, and a disk in each disk group includes a cache partition and a data partition;
a first determining unit 804, coupled to the first obtaining unit 802, configured to determine, if the first data is data stored on a target disk in the plurality of disk groups and the first data is a first type of data, where the first type of data is data with a number of accesses or an access frequency greater than a preset threshold, whether the first data is stored in a cache partition in the target disk;
the first execution unit 806 is connected to the first determination unit 804, and is configured to access the first data in the data partition in the target disk and copy the first data to the cache partition in the target disk when the first data is not stored in the cache partition in the target disk, and add a target mapping relationship in the mapping relationship table, where the mapping relationship table is located in a memory of the storage system, the target mapping relationship is a mapping relationship between a first address and a second address, the first address is a storage address of the first data in the data partition in the target disk, and the second address is a storage address of the first data in the cache partition in the target disk.
According to the embodiment of the present application, a first access request is acquired, where the first access request is used to request access to first data stored in a storage system, the storage system is a redundant array of independent disks storage system, the storage system includes a plurality of disk groups, and the disks in each disk group include a cache partition and a data partition. In the case where the first data is data stored on a target disk in the plurality of disk groups and the first data is first-type data, it is determined whether the first data is stored in the cache partition in the target disk, where the first-type data is data whose number of accesses or access frequency is greater than a preset threshold. In the case where the first data is not stored in the cache partition in the target disk, the first data is accessed in the data partition in the target disk and copied into the cache partition in the target disk, and a target mapping relation is added to a mapping relation table, where the mapping relation table is located in a memory of the storage system, the target mapping relation is a mapping relation between a first address and a second address, the first address is the storage address of the first data in the data partition in the target disk, and the second address is the storage address of the first data in the cache partition in the target disk. This solves the problem in the related art that the data access efficiency of a RAID storage system is poor, and achieves the technical effect of improving the data access efficiency of the RAID storage system.
Optionally, the apparatus further includes:
the second acquisition unit is used for acquiring a second access request, wherein the second access request is used for requesting to access the first data stored in the storage system;
a second determining unit configured to determine whether or not the first data is stored in a cache partition in a target disk in the plurality of disk groups, in a case where the first data is data stored on the target disk and the first data is data of a first type;
the third obtaining unit is used for obtaining a second address according to the target mapping relation in the mapping relation table under the condition that the first data is stored in the cache partition in the target disk;
and the first access unit is used for accessing the first data with the storage address being the second address in the cache partition in the target disk.
Optionally, the second determining unit includes:
the first searching module is used for searching a target mapping relation corresponding to the first address in the mapping relation table under the condition that the second access request comprises the first address;
the first determining module is used for determining that the first data is stored in the cache partition in the target disk under the condition that the target mapping relation is found in the mapping relation table.
Optionally, the mapping relation table is a binary structure of a tree, and the first searching module includes:
and the searching sub-module is used for searching the target mapping relation corresponding to the first address from the root node in the binary structure of the tree, wherein the binary structure of the tree comprises the root node and a group of leaf nodes.
Optionally, the apparatus further includes:
and the second access unit is used for accessing the first data with the storage address of the first address in the data partition in the target disk under the condition that the first data is the data stored on the target disk in the plurality of disk groups after the second access request is acquired, the second access request comprises the first address, and the first data is not the first type of data.
Optionally, the apparatus further includes:
and a third determining unit configured to determine, after the first access request is acquired and before the second access request is acquired, a type of the first data as a second type different from the first type in a case where the number of accesses or the frequency of accesses of the first data is detected to be less than or equal to a preset threshold, wherein the second type of data is data whose number of accesses or frequency of accesses is less than or equal to the preset threshold.
Optionally, the apparatus further includes:
a first deleting unit configured to delete the target mapping relationship in the mapping relationship table in a case where the type of the first data is determined to be a second type different from the first type; or,
a second deleting unit, configured to delete the target mapping relationship in the mapping relationship table when the number of the mapping relationships recorded in the mapping relationship table is greater than a preset number threshold; or,
and a third deleting unit, configured to delete the target mapping relationship in the mapping relationship table when the mapping relationship table is full.
Optionally, the first access unit includes:
and the updating module is used for updating the value of the first data with the access storage address of the second address in the cache partition in the target disk from the second value to the first value under the condition that the value of the first data in the data partition in the target disk is the second value and the second access request is used for requesting to update the value of the first data to the first value.
Optionally, the apparatus further includes:
the first updating unit is used for updating the value of the first data in the data partition in the target disk from the second value to the first value when the first value of the first data in the cache partition in the target disk is inconsistent with the second value of the first data in the data partition in the target disk after updating the value of the first data with the second address in the cache partition in the target disk from the second value to the first value.
Optionally, the apparatus further includes:
a fourth determining unit, configured to determine, in response to the received operation instruction, whether a first value of the first data in the cache partition in the target disk is consistent with a second value of the first data in the data partition in the target disk;
and the second updating unit is used for updating the value of the first data in the data partition in the target disk from the second value to the first value under the condition that the first value of the first data in the cache partition in the target disk is inconsistent with the second value of the first data in the data partition in the target disk.
Optionally, the apparatus further includes:
the receiving unit is used for receiving the operation instruction before determining, in response to the received operation instruction, whether the first value of the first data in the cache partition in the target disk is consistent with the second value of the first data in the data partition in the target disk, wherein the operation instruction is a detected instruction triggered through a command line interface or a graphical interface.
Optionally, the apparatus further includes:
and the replacing unit is used for replacing the first data with the data newly copied into the cache partition, in the case where the first value of the first data in the cache partition in the target disk is consistent with the second value of the first data in the data partition in the target disk, after determining, in response to the received operation instruction, whether the first value of the first data in the cache partition in the target disk is consistent with the second value of the first data in the data partition in the target disk.
Optionally, the first determining unit includes:
the second searching module is used for searching whether a target mapping relation corresponding to the first address exists in the mapping relation table under the condition that the first access request comprises the first address;
and the second determining module is used for determining that the first data is not stored in the cache partition in the target disk under the condition that the target mapping relation is not found in the mapping relation table.
Optionally, the apparatus further includes:
a fifth determining unit, configured to determine, when it is determined that the newly added disk appears in the storage system, second data in the cache partition that is inconsistent with the corresponding value in the data partition;
and the third updating unit is used for updating the value of the second data in the data partition by using the value of the second data in the cache partition and controlling all data stored in the cache partition to be invalid.
Optionally, the apparatus further includes:
the second execution unit is configured to, after all data stored in the cache partition is controlled to be invalidated, access, in response to the obtained third access request, third data corresponding to the third access request in the data partition in the target disk, copy the third data to the cache partition in the target disk, and add a mapping relation between a third address and a fourth address in the mapping relation table, where the third address is the storage address of the third data in the data partition, and the fourth address is the storage address of the third data in the cache partition.
Optionally, the first execution unit includes:
the adding module is used for adding the target mapping relation and the modification mark of the first data in the mapping relation table, wherein the modification mark is used for indicating whether the value of the first data in the cache partition is modified or not.
Optionally, the apparatus further includes:
and the adding unit is used for adding the modification record of the value of the first data in the cache partition in the modification log under the condition that the value of the first data in the cache partition is modified after adding the target mapping relation and the modification mark of the first data in the mapping relation table, wherein the modification log is positioned in a memory in the storage system.
Optionally, the apparatus further includes:
the configuration unit is used for configuring the size of the cache partition of the magnetic disk in each magnetic disk group according to the acquired partition indication, wherein the partition indication is used for indicating the size of the cache partition.
It should be noted that each of the above modules may be implemented by software or hardware, and for the latter, it may be implemented by, but not limited to: the modules are all located in the same processor; alternatively, the above modules may be located in different processors in any combination.
According to a further aspect of the embodiments of the present application, there is also provided a computer readable storage medium having stored therein a computer program, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.
In one exemplary embodiment, the computer readable storage medium may include, but is not limited to: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other media capable of storing a computer program.
According to a further aspect of embodiments of the present application, there is also provided an electronic device comprising a memory, in which a computer program is stored, and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
In an exemplary embodiment, the electronic device may further include a transmission device connected to the processor, and an input/output device connected to the processor.
Specific examples in this embodiment may refer to the examples described in the foregoing embodiments and the exemplary implementation, and this embodiment is not described herein.
It will be appreciated by those skilled in the art that the modules or steps of the embodiments of the present application described above may be implemented by a general-purpose computing device; they may be concentrated on a single computing device or distributed across a network of multiple computing devices; they may be implemented in program code executable by computing devices, so that they may be stored in a storage device and executed by computing devices; in some cases, the steps shown or described may be performed in a different order than that shown or described; alternatively, they may be separately fabricated into individual integrated circuit modules, or multiple modules or steps among them may be fabricated into a single integrated circuit module. Thus, the embodiments of the present application are not limited to any specific combination of hardware and software.
The foregoing description is only a preferred embodiment of the present application and is not intended to limit the embodiments of the present application; various modifications and changes may be made to the embodiments of the present application by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the principles of the embodiments of the present application shall be included in the protection scope of the embodiments of the present application.

Claims (20)

1. A method of data access, comprising:
acquiring a first access request, wherein the first access request is used for requesting to access first data stored in a storage system, the storage system is a redundant array of independent disks storage system, the storage system comprises a plurality of disk groups, and the disks in each disk group comprise a cache partition and a data partition;
determining whether the first data is stored in the cache partition in the target disk in the plurality of disk groups or not when the first data is data stored on the target disk in the plurality of disk groups and the first data is data of a first type, wherein the first type of data is data with access times or access frequencies larger than a preset threshold;
accessing the first data in the data partition in the target disk and copying the first data into the cache partition in the target disk under the condition that the first data is not stored in the cache partition in the target disk, and adding a target mapping relation in a mapping relation table, wherein the mapping relation table is positioned in a memory of the storage system, the target mapping relation is a mapping relation between a first address and a second address, the first address is a storage address of the first data in the data partition in the target disk, and the second address is a storage address of the first data in the cache partition in the target disk;
Wherein the method further comprises:
acquiring a second access request, wherein the second access request is used for requesting to access the first data stored in the storage system;
determining whether the first data is stored in the cache partition in the target disk in the case where the first data is data stored on the target disk in the plurality of disk groups and the first data is the first type of data;
under the condition that the first data is stored in the cache partition in the target disk, acquiring the second address according to the target mapping relation in the mapping relation table;
and accessing the first data with the storage address being the second address in the cache partition in the target disk.
2. The method of claim 1, wherein the determining whether the first data is stored in the cache partition in the target disk comprises:
if the second access request includes the first address, searching the mapping relation table for the target mapping relation corresponding to the first address;
and under the condition that the target mapping relation is found in the mapping relation table, determining that the first data is stored in the cache partition in the target disk.
3. The method according to claim 2, wherein the mapping table is a binary structure of a tree, and the searching the mapping table for the target mapping corresponding to the first address includes:
searching the target mapping relation corresponding to the first address from a root node in a binary structure of the tree, wherein the binary structure of the tree comprises the root node and a set of leaf nodes.
4. The method of claim 1, wherein after obtaining the second access request, the method further comprises:
when the first data is data stored on the target disk in the plurality of disk groups, the first address is included in the second access request, and the first data is not the first type of data, the first data with a storage address of the first address is accessed in the data partition in the target disk.
5. The method of claim 1, wherein after the first access request is obtained and before the second access request is obtained, the method further comprises:
and determining the type of the first data as a second type different from the first type under the condition that the access frequency or the access frequency of the first data is smaller than or equal to a preset threshold value, wherein the second type of data is the data with the access frequency or the access frequency smaller than or equal to the preset threshold value.
6. The method of claim 5, wherein in the event that the type of the first data is determined to be a second type different from the first type, the method further comprises:
deleting the target mapping relation in the mapping relation table; or alternatively
Deleting the target mapping relation in the mapping relation table under the condition that the number of the mapping relations recorded in the mapping relation table is larger than a preset number threshold; or alternatively
And deleting the target mapping relation in the mapping relation table under the condition that the mapping relation table is full.
7. The method of claim 1, wherein said accessing the first data having a storage address of the second address in the cache partition in the target disk comprises:
and updating the value of the first data with the access storage address of the second address in the cache partition in the target disk from the second value to the first value when the value of the first data in the data partition in the target disk is the second value and the second access request is used for requesting to update the value of the first data to the first value.
8. The method of claim 7, wherein after updating the value of the first data having the access storage address of the second address from the second value to the first value in the cache partition in the target disk, the method further comprises:
and updating the value of the first data in the data partition in the target disk from the second value to the first value under the condition that the first value of the first data in the cache partition in the target disk is inconsistent with the second value of the first data in the data partition in the target disk.
9. The method according to claim 1, wherein the method further comprises:
in response to a received operation instruction, determining whether a first value of the first data in the cache partition in the target disk is consistent with a second value of the first data in the data partition in the target disk;
and updating the value of the first data in the data partition in the target disk from the second value to the first value under the condition that the first value of the first data in the cache partition in the target disk is inconsistent with the second value of the first data in the data partition in the target disk.
10. The method of claim 9, wherein prior to the determining, in response to the received operation instruction, whether the first value of the first data in the cache partition in the target disk is consistent with the second value of the first data in the data partition in the target disk, the method further comprises:
and receiving the operation instruction, wherein the operation instruction is an instruction triggered by the detected command line interface or the graphical interface.
11. The method of claim 9, wherein after said determining whether a first value of said first data in said cache partition in said target disk is consistent with a second value of said first data in said data partition in said target disk in response to a received operation instruction, said method further comprises:
and under the condition that the first value of the first data in the cache partition in the target disk is consistent with the second value of the first data in the data partition in the target disk, replacing the first data by using the data newly copied to the cache partition.
12. The method of claim 1, wherein the determining whether the first data is stored in the cache partition in the target disk comprises:
if the first address is included in the first access request, searching whether the target mapping relation corresponding to the first address exists in the mapping relation table;
and under the condition that the target mapping relation is not found in the mapping relation table, determining that the first data is not stored in the cache partition in the target disk.
13. The method according to claim 1, wherein the method further comprises:
under the condition that the newly added disk appears in the storage system, determining second data which are inconsistent with the corresponding value in the data partition in the cache partition;
updating the value of the second data in the data partition by using the value of the second data in the cache partition, and controlling all data stored in the cache partition to be invalid.
14. The method of claim 13, wherein after said controlling all data stored in said cache partition to be invalidated, said method further comprises:
And in response to the acquired third access request, accessing third data corresponding to the third access request in the data partition in the target disk, copying the third data into the cache partition in the target disk, and adding a mapping relation between a third address and a fourth address in the mapping relation table, wherein the third address is a storage address of the third data in the data partition, and the fourth address is a storage address of the third data in the cache partition.
15. The method of claim 1, wherein adding the target mapping relationship to the mapping relationship table comprises:
and adding a modification mark of the target mapping relation and the first data in the mapping relation table, wherein the modification mark is used for indicating whether the value of the first data in the cache partition is modified or not.
16. The method of claim 15, wherein after adding the target mapping and the modification flag for the first data to the mapping table, the method further comprises:
and under the condition that the value of the first data in the cache partition is modified, adding a modification record of the value of the first data in the cache partition in a modification log, wherein the modification log is positioned in a memory in the storage system.
17. The method according to claim 1, wherein the method further comprises:
and configuring the size of the cache partition of the magnetic disk in each magnetic disk group according to the acquired partition indication, wherein the partition indication is used for indicating the size of the cache partition.
18. A data access device, comprising:
the first access request is used for requesting to access first data stored in a storage system, the storage system is a redundant array of independent disks storage system, the storage system comprises a plurality of disk groups, and the disks in each disk group comprise a cache partition and a data partition;
a first determining unit, configured to determine, if the first data is data stored on a target disk in the plurality of disk groups and the first data is a first type of data, whether the first data is stored in the cache partition in the target disk, where the first type of data is data with a number of accesses or an access frequency greater than a preset threshold;
the first execution unit is used for accessing the first data in the data partition in the target disk and copying the first data into the cache partition in the target disk under the condition that the first data is not stored in the cache partition in the target disk, and adding a target mapping relation in a mapping relation table, wherein the mapping relation table is positioned in a memory of the storage system, the target mapping relation is a mapping relation between a first address and a second address, the first address is a storage address of the first data in the data partition in the target disk, and the second address is a storage address of the first data in the cache partition in the target disk;
Wherein the apparatus further comprises:
a second obtaining unit, configured to obtain a second access request, where the second access request is used to request access to the first data stored in the storage system;
a second determining unit configured to determine whether the first data is stored in the cache partition in the target disk in a case where the first data is data stored on the target disk in the plurality of disk groups and the first data is the first type of data;
a third obtaining unit, configured to obtain, when the first data is stored in the cache partition in the target disk, the second address according to the target mapping relationship in the mapping relationship table;
and the first access unit is used for accessing the first data with the storage address of the second address in the cache partition in the target disk.
19. A computer readable storage medium, characterized in that a computer program is stored in the computer readable storage medium, wherein the computer program, when being executed by a processor, implements the steps of the method according to any of the claims 1 to 17.
20. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method of any one of claims 1 to 17 when the computer program is executed.
CN202311300081.2A 2023-10-09 2023-10-09 Data access method and device, storage medium and electronic equipment Active CN117032596B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311300081.2A CN117032596B (en) 2023-10-09 2023-10-09 Data access method and device, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN117032596A CN117032596A (en) 2023-11-10
CN117032596B true CN117032596B (en) 2024-01-26

Family

ID=88639421

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311300081.2A Active CN117032596B (en) 2023-10-09 2023-10-09 Data access method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN117032596B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101604226A (en) * 2009-07-14 2009-12-16 浪潮电子信息产业股份有限公司 A kind of method that makes up raising performance of storage system in dynamic buffering pond based on virtual RAID
CN103246616A (en) * 2013-05-24 2013-08-14 浪潮电子信息产业股份有限公司 Global shared cache replacement method for realizing long-short cycle access frequency
CN111736764A (en) * 2020-05-28 2020-10-02 苏州浪潮智能科技有限公司 Storage system of database all-in-one machine and data request processing method and device

Also Published As

Publication number Publication date
CN117032596A (en) 2023-11-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant