CN112463333A

CN112463333A - Data access method, device and medium based on multithreading concurrency

Info

Publication number: CN112463333A
Application number: CN202011395097.2A
Authority: CN
Inventors: 苑忠科
Original assignee: Beijing Inspur Data Technology Co Ltd
Current assignee: Beijing Inspur Data Technology Co Ltd
Priority date: 2020-12-03
Filing date: 2020-12-03
Publication date: 2021-03-09

Abstract

The embodiments of the invention disclose a data access method, a device and a medium based on multithreading concurrency. When an I/O write task is received, all target storage areas in a target area group are started, and a plurality of write threads are called to concurrently write I/O data into the target area group; each I/O data has a corresponding logical block address. The read address of each I/O data can be determined by utilizing the corresponding relation between the logical block address and the start address of the target area group and the number of target storage areas contained in the target area group. Because all the storage areas in an area group are started at the same time, the problem of concurrent writing to zones is solved. The storage location of the data can be obtained by a simple conversion, without additional performance overhead or additional storage space.

Description

Data access method, device and medium based on multithreading concurrency

Technical Field

The present invention relates to the field of data storage technologies, and in particular, to a data access method and apparatus based on multithread concurrency, and a computer-readable storage medium.

Background

A solid state drive (SSD) is a hard disk built from an array of solid-state electronic memory chips. Its I/O performance is much higher than that of a conventional hard disk, so it is widely used. The open-channel SSD standard was formulated so that a host can manage the SSD itself and configure it to better suit its own use. However, the standard exposes too many details to the host, which places too heavy a burden on the host; software even has to deal with the differences between vendors and between media.

NVMe Zoned Namespaces (NVMe ZNS) divides the flash memory into a plurality of storage areas (zones), and data having the same attribute can be stored in each zone. NVMe ZNS builds on the advantages of open-channel SSD 2.0 and strikes a good balance between efficient use of the SSD and ease of use for the host.

However, the currently proposed scheme for implementing "zone multithreaded write" based on the append command needs to modify the original Logical Block Address (LBA) of the application, which poses a great challenge to updating existing application interfaces and usage modes.

Therefore, how to reduce the difficulty of zone multithread writing is a problem to be solved by those skilled in the art.

Disclosure of Invention

Embodiments of the present invention provide a data access method, an apparatus, and a computer-readable storage medium based on multithreading concurrency, which can reduce difficulty in zone multithreading writing.

To solve the foregoing technical problem, an embodiment of the present invention provides a data access method based on multithreading concurrency, including:

dividing a plurality of storage areas into at least one area group according to the number of concurrent threads of the host end and a preset proportional relation;

when an I/O write task is received, starting all target storage areas in a target area group, and calling a plurality of write threads to concurrently write I/O data into the target area group; wherein each I/O data has a corresponding logical block address; the target area group is any one idle area group in all the area groups;

and determining the reading address of each I/O data by utilizing the corresponding relation between the logical block address and the initial address of the target area group and the number of target storage areas contained in the target area group.

Optionally, the determining, by using the corresponding relationship between the logical block address and the start address of the target area group and the number of target storage areas included in the target area group, a read address of each I/O data includes:

determining the read address of the I/O data according to the following corresponding relation formula;

LBA = ZGSLBA + ZGCAP * ZOFFSET + ZINDEX2;

wherein LBA represents the logical block address of the I/O data, ZGSLBA represents the start address of the target area group, ZGCAP represents the number of target storage areas contained in the target area group, the read address of the I/O data comprises ZOFFSET and ZINDEX2, ZINDEX2 represents the sequence number, within the target area group, of the storage area in which the I/O data is located, and ZOFFSET represents the offset of the I/O data within that storage area.
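The correspondence formula can also be read in the inverse direction. As a sketch only, and assuming that ZINDEX2 is counted from zero and is always smaller than ZGCAP (an assumption; the text only calls it a sequence number), the two components of the read address follow from an integer division:

```latex
\[
\mathrm{ZOFFSET} = \left\lfloor \frac{\mathrm{LBA} - \mathrm{ZGSLBA}}{\mathrm{ZGCAP}} \right\rfloor,
\qquad
\mathrm{ZINDEX2} = (\mathrm{LBA} - \mathrm{ZGSLBA}) \bmod \mathrm{ZGCAP}.
\]
% Illustrative values (not from the source): ZGSLBA = 1000, ZGCAP = 4, LBA = 1011.
\[
\mathrm{ZOFFSET} = \lfloor 11 / 4 \rfloor = 2,
\qquad
\mathrm{ZINDEX2} = 11 \bmod 4 = 3,
\qquad
1000 + 4 \cdot 2 + 3 = 1011.
\]
```

Under this reading, consecutive logical block addresses land in consecutive storage areas of the group, which is what lets several write threads proceed in parallel.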

Optionally, for the adjustment process of the proportional relationship, the method includes:

calculating the actual ratio of the number of the storage areas in each group to the number of the concurrent threads according to the preset grouping number;

judging whether the actual ratio is larger than the ratio of the storage area to the thread;

if the actual ratio is larger than the proportional value of the storage area and the thread, the value of the proportional value is increased;

and if the actual ratio is smaller than the proportional value of the storage area and the thread, reducing the value of the proportional value.

Optionally, before the reducing the value of the proportional value, the method further includes:

judging whether the number of the preset groups is smaller than a preset limit value or not;

if the number of the preset groups is smaller than a preset limit value, the step of reducing the value of the proportional value is executed;

if the preset grouping number is not smaller than the preset limit value, reducing the value of the grouping number, taking the reduced grouping number as the preset grouping number, and executing the step of calculating the actual ratio of the number of the storage areas in each group to the number of the concurrent threads according to the preset grouping number.

Optionally, after the invoking the multiple write threads to concurrently write the I/O data into the target zone group, the method further includes:

judging whether the target area group has a residual storage space or not;

if the target area group does not have the residual storage space, setting a non-idle identifier for the target area group, and judging whether I/O data of unfinished write operation exist or not;

if I/O data which are not written in operation exist, calling a new idle area group, starting all storage areas in the new idle area group, and calling a plurality of write threads to write the I/O data into the new idle area group concurrently until the writing operation of all the I/O data is completed.

The embodiment of the invention also provides a data access device based on multithreading concurrence, which comprises a dividing unit, a starting unit, a calling unit and a determining unit;

the dividing unit is used for dividing the plurality of storage areas into at least one area group according to the number of the concurrent threads at the host end and a preset proportional relation;

the starting unit is used for starting all the target storage areas in the target area group when an I/O write task is received;

the calling unit is used for calling a plurality of write threads to write I/O data into the target area group concurrently; wherein each I/O data has a corresponding logical block address; the target area group is any one idle area group in all the area groups;

and the determining unit is used for determining the reading address of each I/O data by utilizing the corresponding relation between the logical block address and the initial address of the target area group as well as the number of the target storage areas contained in the target area group.

Optionally, the determining unit is specifically configured to determine a read address of the I/O data according to the following correspondence formula;

LBA = ZGSLBA + ZGCAP * ZOFFSET + ZINDEX2;

Optionally, for the adjustment process of the proportional relationship, the apparatus includes a calculating unit, a judging unit, a first adjusting unit, and a second adjusting unit;

the calculating unit is used for calculating the actual ratio of the number of the storage areas in each group to the number of the concurrent threads according to the preset grouping number;

the judging unit is used for judging whether the actual ratio is larger than the proportional value of the storage area and the thread;

the first adjusting unit is used for increasing the value of the proportional value if the actual ratio is larger than the proportional value of the storage area and the thread;

and the second adjusting unit is used for reducing the value of the proportional value if the actual ratio is smaller than the proportional value of the storage area and the thread.

Optionally, the apparatus further comprises a number judging unit and a third adjusting unit;

the number judging unit is used for judging whether the number of the preset groups is smaller than a preset limit value or not; if the number of the preset groups is smaller than a preset limit value, triggering the second adjusting unit to execute the step of reducing the value of the proportional value;

and the third adjusting unit is used for reducing the value of the number of the groups if the number of the preset groups is not less than a preset limit value, taking the reduced number of the groups as the number of the preset groups, and triggering the calculating unit to execute the step of calculating the actual ratio of the number of the storage areas in each group to the number of the concurrent threads according to the number of the preset groups.

Optionally, for the processing after the plurality of write threads are called to concurrently write the I/O data into the target area group, the apparatus further includes a space judging unit, a marking unit, and an operation judging unit;

the space judging unit is used for judging whether the target area group has a residual storage space or not;

the marking unit is used for setting a non-idle identifier for the target area group if the target area group has no residual storage space;

the operation judging unit is used for judging whether I/O data of unfinished writing operation exists or not;

the starting unit is also used for calling a new idle area group and starting all storage areas in the new idle area group if I/O data of the unfinished write operation exists;

the calling unit is also used for calling a plurality of write threads to write the I/O data into the new idle area group concurrently until the write operation of all the I/O data is completed.

The embodiment of the invention also provides a data access device based on multithreading concurrence, which comprises:

a memory for storing a computer program;

a processor for executing the computer program to implement the steps of the multi-thread concurrency-based data access method as in any one of the above.

An embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the data access method based on the multi-thread concurrence are implemented as described in any one of the above.

According to the above technical scheme, a plurality of storage areas are divided into at least one area group according to the number of concurrent threads at the host end and a preset proportional relation. Dividing the storage areas into area groups allows the storage areas within one area group to be opened in parallel. Therefore, when an I/O write task is received, any idle area group among all the area groups can be used as the target area group, all target storage areas in the target area group are started, and a plurality of write threads are called to concurrently write I/O data into the target area group; each I/O data has a corresponding logical block address. Because the start address of each area group and the number of storage areas it contains are known, the read address of each I/O data can be determined by utilizing the corresponding relation between the logical block address and the start address of the target area group and the number of target storage areas contained in the target area group, so that the I/O data can be read. By dividing area groups and opening all the storage areas in an area group at the same time, the technical scheme solves the problem of concurrent writing to zones and greatly improves the service performance and QoS of the system. Moreover, the storage location of the data can be obtained through a simple conversion, the software implementation logic is simple, and no extra performance overhead or extra storage space is needed.

Drawings

In order to illustrate the embodiments of the present invention more clearly, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings can be obtained by those skilled in the art without inventive effort.

FIG. 1 is a flowchart of a data access method based on multi-thread concurrency according to an embodiment of the present invention;

FIG. 2 is a block diagram of a data access apparatus based on multi-thread concurrency according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a hardware structure of a data access apparatus based on multi-thread concurrency according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without any creative work belong to the protection scope of the present invention.

In order that those skilled in the art will better understand the disclosure, the invention will be described in further detail with reference to the accompanying drawings and specific embodiments.

Next, a data access method based on multi-thread concurrency according to an embodiment of the present invention will be described in detail. Fig. 1 is a flowchart of a data access method based on multi-thread concurrency according to an embodiment of the present invention, where the method includes:

S101: Dividing the plurality of storage areas into at least one area group according to the number of concurrent threads at the host end and a preset proportional relation.

The preset proportional relationship refers to the ratio between the storage areas and the threads. For example, the proportional relationship between the storage areas and the threads can be set to 1:1, that is, one thread corresponds to 1 storage area; or it can be set to 3:1, that is, one thread corresponds to 3 storage areas.

The number of the storage areas corresponding to one thread can be obtained according to the proportional relation, and the product value of the number multiplied by the number of the concurrent threads at the host end is the number of the storage areas contained in one area group.
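The group-size rule above can be illustrated with a minimal sketch. The function name `divide_into_zone_groups`, the use of integer zone identifiers, and the decision to drop a trailing partial group are all assumptions made for illustration; the text does not specify them.

```python
def divide_into_zone_groups(zone_ids, num_threads, zones_per_thread):
    """Split zone_ids into consecutive groups of num_threads * zones_per_thread zones."""
    if num_threads < 1 or zones_per_thread < 1:
        raise ValueError("num_threads and zones_per_thread must be positive")
    # Group size = (storage areas per thread) * (concurrent host threads).
    group_size = num_threads * zones_per_thread
    groups = [zone_ids[i:i + group_size]
              for i in range(0, len(zone_ids), group_size)]
    # Assumption: a trailing group with fewer zones than group_size is dropped.
    if groups and len(groups[-1]) < group_size:
        groups.pop()
    return groups


# 24 zones, 4 host threads, ratio 3:1 -> groups of 12 zones each.
groups = divide_into_zone_groups(list(range(24)), num_threads=4, zones_per_thread=3)
print(len(groups), len(groups[0]))  # prints: 2 12
```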

S102: when an I/O write task is received, all target storage areas in the target area group are started, and a plurality of write threads are called to concurrently write I/O data into the target area group.

Each I/O data has a corresponding logical block address when stored in the target area group. The target area group is any one idle area group among all the area groups; for ease of distinction, in the embodiment of the present invention, a storage area contained in the target area group is referred to as a target storage area.

Each thread has a corresponding storage area, and the simultaneous opening of a plurality of storage areas can be realized by starting all the target storage areas in the target area group, so that a plurality of parallel write threads are ensured to write data into the corresponding target storage areas simultaneously, and the parallel processing of multiple threads is realized.
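As a rough in-memory sketch of step S102, the following models each storage area as a Python list and gives each write thread its own area. It assumes a 1:1 ratio of storage areas to threads, contiguous logical block addresses starting at the group's start address, and zero-based area indices; the function name and data layout are hypothetical and stand in for real device I/O.

```python
from concurrent.futures import ThreadPoolExecutor


def write_group_concurrently(blocks, zg_start_lba, zg_cap):
    """blocks maps LBA -> payload; returns zg_cap lists, one per storage area."""
    # "Start" every storage area in the target group before any write begins.
    zones = [[] for _ in range(zg_cap)]

    def write_zone(zone_index):
        # Each thread handles only the LBAs that map to its own storage area,
        # in ascending order, so the area is written sequentially.
        lbas = sorted(lba for lba in blocks
                      if (lba - zg_start_lba) % zg_cap == zone_index)
        for lba in lbas:
            zones[zone_index].append(blocks[lba])

    # One write thread per storage area (assumed 1:1 ratio).
    with ThreadPoolExecutor(max_workers=zg_cap) as pool:
        list(pool.map(write_zone, range(zg_cap)))
    return zones


zones = write_group_concurrently(
    {1000 + i: f"blk{i}" for i in range(8)}, zg_start_lba=1000, zg_cap=4)
print(zones[0])  # prints: ['blk0', 'blk4']
```

Because each thread appends only to its own list, no locking is needed in this sketch.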

S103: Determining the read address of each I/O data by utilizing the corresponding relation between the logical block address and the start address of the target area group and the number of the target storage areas contained in the target area group.

Each I/O data has a corresponding Logical Block Address (LBA); when the I/O data needs to be read from the target area group, the corresponding read address of the I/O data in the target area group needs to be acquired.

In practical applications, the read address of the I/O data can be determined according to the following correspondence formula:

LBA = ZGSLBA + ZGCAP * ZOFFSET + ZINDEX2;

In a specific implementation, provided that the logical block address of the I/O data, the start address of the target area group where the I/O data is located, and the number of target storage areas contained in the target area group are known, the difference between the logical block address and the start address of the target area group is divided by the number of target storage areas to obtain a quotient and a remainder. According to the above formula, the quotient is ZOFFSET, the offset of the I/O data within the storage area where it is located, and the remainder is ZINDEX2, the serial number of that storage area within the target area group.
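The conversion can be sketched in a few lines. Reading the correspondence formula literally, with ZINDEX2 assumed to be zero-based and smaller than ZGCAP, the quotient of the division is ZOFFSET and the remainder is ZINDEX2. The function names below are hypothetical.

```python
def lba_to_read_address(lba, zg_start_lba, zg_cap):
    """Return (z_index, z_offset) for an LBA inside a zone group."""
    if zg_cap < 1:
        raise ValueError("zg_cap must be positive")
    if lba < zg_start_lba:
        raise ValueError("lba lies before the start of the zone group")
    # Quotient -> offset inside the storage area; remainder -> storage area index.
    z_offset, z_index = divmod(lba - zg_start_lba, zg_cap)
    return z_index, z_offset


def read_address_to_lba(z_index, z_offset, zg_start_lba, zg_cap):
    """Inverse mapping: LBA = ZGSLBA + ZGCAP * ZOFFSET + ZINDEX2."""
    return zg_start_lba + zg_cap * z_offset + z_index


print(lba_to_read_address(1011, zg_start_lba=1000, zg_cap=4))  # prints: (3, 2)
print(read_address_to_lba(3, 2, zg_start_lba=1000, zg_cap=4))  # prints: 1011
```

No lookup table is involved: the mapping is pure arithmetic on values that are already known for each group.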

In the embodiment of the present invention, in order to make the divided region groups more suitable for the actual service requirements, the preset proportional relationship may be dynamically adjusted. In a specific implementation, an actual ratio of the number of the storage areas in each group to the number of the concurrent threads can be calculated according to a preset grouping number; and judging whether the actual ratio is larger than the proportional value of the storage area and the thread.

The preset grouping number refers to the number of the area groups to be divided, and the value of the preset grouping number can be set according to actual requirements, and is not limited herein.

If the actual ratio is greater than the proportional value of the storage area and the thread, the number of area groups divided according to the current proportional value is large, which may cause frequent switching of area groups when data is read and written. In this case, the value of the proportional value can be increased.

There are various ways of increasing the value of the proportional value. For example, the actual ratio can be used directly as the increased proportional value; alternatively, 1 can be added to the original proportional value, for example, if the ratio of the storage area to the thread is 3:1, it can be increased to 4:1.

If the actual ratio is smaller than the proportional value of the storage area and the thread, the number of area groups divided according to the current proportional value is small, and when data is read and written, data may be concentrated in a certain area group, resulting in low utilization of the other area groups. In this case, the value of the proportional value can be reduced.

The ways of reducing the value of the proportional value are similar to the ways of increasing it. For example, the actual ratio can be used directly as the reduced proportional value; alternatively, 1 can be subtracted from the original proportional value, for example, if the ratio of the storage area to the thread is 3:1, it can be reduced to 2:1.
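The comparison and adjustment can be sketched as follows, using the "add or subtract 1" option. The function name, the integer division used to obtain the storage areas per group, and the lower bound of 1 on the proportional value are assumptions for illustration.

```python
from fractions import Fraction


def adjust_ratio(total_zones, num_threads, preset_groups, ratio):
    """Return the adjusted zones-per-thread ratio."""
    # Storage areas per group if the zones were split into preset_groups groups.
    zones_per_group = total_zones // preset_groups
    actual = Fraction(zones_per_group, num_threads)
    if actual > ratio:
        return ratio + 1          # e.g. 3:1 -> 4:1
    if actual < ratio:
        return max(1, ratio - 1)  # e.g. 3:1 -> 2:1, never below 1:1 (assumption)
    return ratio


print(adjust_ratio(64, 4, preset_groups=2, ratio=3))  # prints: 4
print(adjust_ratio(64, 4, preset_groups=8, ratio=3))  # prints: 2
```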

It should be noted that, in the embodiment of the present invention, the preset number of groups is only used as a basis for adjusting the ratio of the storage area to the thread, and in practical applications, the storage area does not need to be grouped according to the preset number of groups, but the storage area is divided into the group according to the ratio of the storage area to the thread.

Because the preset number of groups serves only as a basis for adjusting the proportional value of the storage area and the thread, its value can itself be adjusted in practical applications. Therefore, before the value of the proportional value is reduced, it may be determined whether the preset number of groups is smaller than a preset limit value. The preset limit value may be the minimum number of area groups.

If the number of the preset groups is smaller than the preset limit value, executing the step of reducing the value of the proportion value; if the preset grouping number is not less than the preset limit value, the value of the grouping number can be reduced, the grouping number after the value reduction is taken as the preset grouping number, and the step of calculating the actual ratio of the number of the storage areas in each group to the number of the concurrent threads according to the preset grouping number is executed.
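The loop described above can be sketched as follows. The function name and the step size of 1 are assumptions, and the extra `groups == 1` guard is not in the text; it is added here only so that the sketch can never divide by zero.

```python
from fractions import Fraction


def adjust_with_group_limit(total_zones, num_threads, preset_groups, ratio, min_groups):
    """Return (adjusted_ratio, final_preset_groups)."""
    if preset_groups < 1:
        raise ValueError("preset_groups must be positive")
    groups = preset_groups
    while True:
        # Recompute the actual ratio for the current preset number of groups.
        actual = Fraction(total_zones // groups, num_threads)
        if actual > ratio:
            return ratio + 1, groups
        if actual == ratio:
            return ratio, groups
        # actual < ratio: check the preset number of groups against the limit first.
        if groups < min_groups or groups == 1:  # "groups == 1" is a safety assumption
            return max(1, ratio - 1), groups
        groups -= 1  # reduce the preset number of groups and try again


print(adjust_with_group_limit(64, 4, preset_groups=8, ratio=3, min_groups=4))  # prints: (3, 5)
print(adjust_with_group_limit(16, 4, preset_groups=4, ratio=3, min_groups=3))  # prints: (2, 2)
```

In the first call the number of groups shrinks from 8 to 5, at which point the actual ratio equals the proportional value and nothing else changes; in the second, the limit is reached first and the proportional value is reduced.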

By dynamically adjusting the proportion value of the storage areas and the threads according to the number of the preset groups and the actual ratio of the number of the storage areas in each group to the number of the concurrent threads, the division of the area groups can be more suitable for actual service requirements, and the performance of each storage area can be fully exerted.

In the embodiment of the present invention, in order to ensure the ordered storage of data and the effective management of the regional group, after a plurality of write threads are invoked to concurrently write the I/O data into the target regional group, it may be determined whether the target regional group has a remaining storage space. If the target area group does not have the residual storage space, setting a non-idle identifier for the target area group, and judging whether I/O data of the unfinished write operation exists or not.

If there is I/O data whose write operation has not been completed, that I/O data still needs to be stored. At this time, a new idle area group can be called, all storage areas in the new idle area group are started, and a plurality of write threads are called to concurrently write the I/O data into the new idle area group until the write operation of all the I/O data is completed.

The idle zone group refers to a zone group without a non-idle identifier.
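The roll-over to a new idle group can be sketched with a simple in-memory model. `ZoneGroup`, its `non_idle` flag, and `write_with_rollover` are hypothetical names, and the single `extend` call merely stands in for the concurrent write threads.

```python
class ZoneGroup:
    """Toy model of an area group with a fixed capacity in blocks."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = []
        self.non_idle = False  # set once the group has no remaining space

    def remaining(self):
        return self.capacity - len(self.blocks)


def write_with_rollover(groups, pending):
    """Write all pending blocks, switching to a new idle group when one fills up."""
    pending = list(pending)
    while pending:
        # An idle group is one that carries no non-idle identifier.
        target = next((g for g in groups if not g.non_idle), None)
        if target is None:
            raise RuntimeError("no idle zone group left")
        n = target.remaining()
        target.blocks.extend(pending[:n])  # stand-in for the concurrent write
        pending = pending[n:]
        if target.remaining() == 0:
            target.non_idle = True         # mark the full group as non-idle
    return groups


groups = write_with_rollover([ZoneGroup(4) for _ in range(3)], range(10))
print([len(g.blocks) for g in groups])  # prints: [4, 4, 2]
print([g.non_idle for g in groups])     # prints: [True, True, False]
```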

By detecting the residual storage space of the area group, the condition that the area group is fully stored can be found in time, so that unwritten data is written into a new idle area group in time, and the ordered execution of data writing is ensured. By setting the non-idle identification for the fully stored area group, the storage system can conveniently and quickly screen out the currently available idle area group, and the data processing efficiency of the storage system is improved.

Fig. 2 is a schematic structural diagram of a data access apparatus based on multi-thread concurrency according to an embodiment of the present invention, including a dividing unit 21, a starting unit 22, a calling unit 23, and a determining unit 24;

the dividing unit 21 is configured to divide the plurality of storage areas into at least one area group according to the number of concurrent threads at the host and a preset proportional relationship;

a starting unit 22, configured to start all target storage areas in the target area group when the I/O write task is received;

a calling unit 23, configured to call multiple write threads to concurrently write the I/O data into the target zone group; wherein, each I/O data has a corresponding logical block address; the target area group is any one idle area group in all the area groups;

the determining unit 24 is configured to determine a read address of each I/O data by using a corresponding relationship between the logical block address and a start address of the target area group and the number of target storage areas included in the target area group.

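Optionally, the determining unit is specifically configured to determine the read address of the I/O data according to the following correspondence formula;

LBA = ZGSLBA + ZGCAP * ZOFFSET + ZINDEX2;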

Optionally, for the adjustment process of the proportional relationship, the apparatus includes a calculating unit, a judging unit, a first adjusting unit, and a second adjusting unit;

and the first adjusting unit is used for increasing the value of the proportional value if the actual ratio is greater than the proportional value of the storage area and the thread.

the number judging unit is used for judging whether the number of the preset groups is smaller than a preset limit value or not; if the number of the preset groups is smaller than the preset limit value, triggering the second adjusting unit to execute the step of reducing the value of the proportional value;

and the third adjusting unit is used for reducing the value of the number of the groups if the number of the preset groups is not less than the preset limit value, taking the reduced number of the groups as the number of the preset groups, and triggering the calculating unit to execute the step of calculating the actual ratio of the number of the storage areas in each group to the number of the concurrent threads according to the number of the preset groups.

Optionally, for the processing after the plurality of write threads are called to concurrently write the I/O data into the target area group, the apparatus further includes a space judging unit, a marking unit and an operation judging unit;

the marking unit is used for setting a non-idle identifier for the target area group if the target area group does not have the residual storage space;

an operation judgment unit for judging whether there is I/O data of an uncompleted write operation;

the starting unit is also used for calling a new idle area group and starting all storage areas in the new idle area group if I/O data of which the writing operation is not finished exist;

The description of the features in the embodiment corresponding to fig. 2 may refer to the related description of the embodiment corresponding to fig. 1, and is not repeated here.

Fig. 3 is a schematic hardware structure diagram of a data access apparatus 30 based on multithreading concurrency according to an embodiment of the present invention, including:

a memory 31 for storing a computer program;

a processor 32 for executing a computer program to implement the steps of the multi-thread concurrency-based data access method as described in any of the embodiments above.

The embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the steps of the data access method based on the multi-thread concurrence described in any of the above embodiments are implemented.

The data access method, apparatus and computer-readable storage medium based on multi-thread concurrency provided by the embodiments of the present invention have been described in detail above. The embodiments are described in a progressive manner in the specification, each embodiment focuses on its differences from the other embodiments, and the same and similar parts among the embodiments can be referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so its description is brief, and the relevant points can be found in the description of the method. It should be noted that those skilled in the art can make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.

Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

Claims

1. A method for data access based on multi-thread concurrency, comprising:

2. The method of claim 1, wherein determining the read address of each I/O data using the correspondence between the logical block address and the start address of the target area group and the number of target storage areas included in the target area group comprises:

LBA = ZGSLBA + ZGCAP * ZOFFSET + ZINDEX2;

3. The multithreaded concurrency based data access method of claim 1, wherein for the adjustment process of the proportional relationship, the method comprises:

4. The method of claim 3, further comprising, before the reducing the value of the proportional value:

5. The multithreaded concurrency based data access method of any one of claims 1-4, further comprising, after the invoking of the plurality of write threads to concurrently write I/O data to the target area group:

judging whether the target area group has a residual storage space or not;

6. A data access device based on multi-thread concurrence is characterized by comprising a dividing unit, a starting unit, a calling unit and a determining unit;

7. The multithreading-based concurrent data access apparatus according to claim 6, wherein the determining unit is specifically configured to determine a read address of the I/O data according to the following correspondence formula;

LBA = ZGSLBA + ZGCAP * ZOFFSET + ZINDEX2;

8. The multithread-based concurrent data access apparatus according to claim 6, wherein the apparatus includes a calculation unit, a judgment unit, a first adjustment unit, and a second adjustment unit for the adjustment process of the proportional relationship;

the first adjusting unit is configured to increase a value of the proportional value if the actual ratio is greater than the proportional value of the storage area and the thread;

9. The multithreading-based concurrent data access apparatus according to claim 8, further comprising a number judgment unit and a third adjustment unit;

10. The multithread concurrency-based data access apparatus of any one of claims 6-9, further comprising a space determination unit, a marking unit, and an operation determination unit;

11. A data access device based on multi-thread concurrency, comprising:

a memory for storing a computer program;

a processor for executing the computer program to implement the steps of the method for multi-thread concurrency-based data access according to any one of claims 1 to 5.

12. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the method for multithreaded concurrency based data access according to any one of claims 1-5.