CN115827508B - Data processing method, system, equipment and storage medium - Google Patents
Data processing method, system, equipment and storage medium
- Publication number: CN115827508B (application CN202310026962.3A)
- Authority
- CN
- China
- Prior art keywords
- read
- data
- reading
- data object
- response
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a data processing method comprising the following steps: in response to receiving a read IO, acquiring the number of times the data object corresponding to the read IO has been read within a preset time period; in response to the read count being not less than a threshold, starting pre-reading and reading all remaining data in the corresponding data object according to the address offset in the read IO; and screening the data to be read corresponding to the read IO from the remaining data according to the address length in the read IO, returning the data to be read to the upper layer, and placing the remaining data into the cache of the BlueStore layer. The invention also discloses a corresponding system, device, and storage medium. The scheme provided by the invention can improve sequential small-IO read performance, reduce frequent disk accesses, prolong disk service life, lower power consumption, reduce hardware cost, and improve product competitiveness.
Description
Technical Field
The present invention relates to the field of storage, and in particular, to a data processing method, system, device, and storage medium.
Background
With the continued development of information technology, data has become an increasingly important and precious resource, and how to process data resources quickly and obtain the desired results is one of the key issues in turning resources into assets. People's activities in work and life constantly generate data; by collecting, analyzing, and processing that data, useful information can be obtained and the conversion from resources to assets realized, which has catalyzed the rapid development of big data and high-performance computing. Data storage has accordingly become one of the core elements of data resources during this period of rapid development. A traditional network storage system uses a centralized storage server to hold all data; the storage server becomes a bottleneck for system performance as well as a single point for reliability and security, and cannot meet the requirements of large-scale storage applications. A distributed network storage system adopts a scalable architecture, which improves the reliability, availability, and access efficiency of the system, is easy to extend, and has been accepted by more and more enterprises. Distributed storage systems typically have from 3 to N nodes to provide high-performance, mass data storage.
In block-device usage of distributed storage, the data access pattern of some application scenarios is sequential small-IO reads. Sequential small-IO reads of the same object effectively turn parallel processing into serial processing, reducing concurrency, while each read request spends considerable time reading the disk; as a result, sequential small-IO read performance is low and the performance advantage of distributed storage cannot be fully exploited. The industry typically addresses this by adding non-volatile caches or higher-performance CPUs, which requires additional hardware and CPU upgrades and in turn increases cost.
Disclosure of Invention
In view of this, in order to overcome at least one aspect of the above-mentioned problems, an embodiment of the present invention proposes a data processing method, including performing the following steps at the BlueStore layer:
in response to receiving a read IO, acquiring the number of times the data object corresponding to the read IO has been read within a preset time period;
in response to the read count being not less than a threshold, starting pre-reading and reading all remaining data in the corresponding data object according to the address offset in the read IO;
and screening the data to be read corresponding to the read IO from the remaining data according to the address length in the read IO, returning the data to be read to the upper layer, and placing the remaining data into the cache of the BlueStore layer.
In some embodiments, further comprising:
an access counter for recording the read count is added to the metadata of each data object, together with a timestamp recorder for recording the read time.
In some embodiments, in response to receiving a read IO, acquiring a number of times that a data object corresponding to the read IO is read in a preset time period, further includes:
in response to receiving the read IO, judging whether the data to be read can be hit in the cache;
and in response to the data to be read being hit in the cache, directly returning the data to be read to the upper layer.
In some embodiments, further comprising:
and in response to the data to be read not being hit in the cache, reading the access counter and the timestamp recorder in the metadata of the corresponding data object.
In some embodiments, in response to the number of times read is not less than a threshold, starting pre-reading and reading all remaining data in the corresponding data object according to an address offset in the read IO, further comprising:
in response to the access counter being 0, setting the access counter in the corresponding metadata to 1 and not starting pre-reading.
In some embodiments, further comprising:
and directly reading the data to be read from the corresponding data object and returning it to the upper layer.
In some embodiments, further comprising:
and updating the timestamp recorder in the corresponding metadata according to the current time.
In some embodiments, in response to the number of times read is not less than a threshold, starting pre-reading and reading all remaining data in the corresponding data object according to an address offset in the read IO, further comprising:
in response to the access counter not being 0, acquiring the timestamp recorder in the corresponding metadata and comparing it with the current time;
and in response to the difference between the recorded timestamp and the current time being greater than the preset time period, setting the access counter of the corresponding data object to 1 and not starting pre-reading.
In some embodiments, further comprising:
and directly reading the data to be read from the corresponding data object and returning it to the upper layer.
In some embodiments, further comprising:
and updating the timestamp recorder in the corresponding metadata according to the current time.
In some embodiments, further comprising:
and in response to the difference between the recorded timestamp and the current time being smaller than the preset time period, adding 1 to the access counter of the corresponding data object.
In some embodiments, further comprising:
judging whether the access counter of the corresponding data object is not less than the threshold;
and in response to the access counter being not less than the threshold, starting pre-reading and reading all remaining data in the corresponding data object according to the address offset in the read IO.
In some embodiments, further comprising:
and in response to pre-reading not being started, directly reading the data to be read from the corresponding data object and returning it to the upper layer.
In some embodiments, further comprising:
and updating the timestamp recorder in the corresponding metadata according to the current time.
In some embodiments, in response to the number of times read is not less than a threshold, starting pre-reading and reading all remaining data in the corresponding data object according to an address offset in the read IO, further comprising:
acquiring and parsing the object name of the corresponding data object to obtain the object number and the storage volume name;
determining the corresponding storage pool according to the storage volume name and acquiring the number of PGs in the storage pool;
and determining the number of the next data object as the object number plus the number of PGs, and thereby determining the name of the next data object.
In some embodiments, further comprising:
and reading all data in the next data object and placing the data into the cache of the BlueStore layer.
In some embodiments, further comprising:
acquiring the preset size of each pre-read;
judging whether all remaining data in the corresponding data object is smaller than the per-read pre-read size;
and in response to all remaining data in the corresponding data object being smaller than the per-read pre-read size, reading data of the corresponding size from the next data object so that the pre-read data reaches the per-read pre-read size.
Based on the same inventive concept, according to another aspect of the present invention, an embodiment of the present invention further provides a data processing system, including:
the acquisition module is configured to, in response to receiving a read IO, acquire the number of times the data object corresponding to the read IO has been read within a preset time period;
the pre-reading module is configured to, in response to the read count being not less than a threshold, start pre-reading and read all remaining data in the corresponding data object according to the address offset in the read IO;
and the return module is configured to screen the data to be read corresponding to the read IO from the remaining data according to the address length in the read IO, return the data to be read to the upper layer, and place the remaining data into the cache of the BlueStore layer.
Based on the same inventive concept, according to another aspect of the present invention, an embodiment of the present invention further provides a computer apparatus, including:
at least one processor; and
a memory storing a computer program executable on the processor, wherein the processor performs the steps of any one of the data processing methods described above when executing the program.
Based on the same inventive concept, according to another aspect of the present invention, there is also provided a computer-readable storage medium storing a computer program which, when executed by a processor, performs the steps of any of the data processing methods as described above.
The invention has one of the following beneficial technical effects: in the scheme provided by the invention, the BlueStore layer identifies the number of times the same object is read within the threshold time; for an object judged to be hot data, the range of the data read is enlarged and the read data is placed in the memory cache. This realizes data pre-reading, turning many small-IO reads into one large-IO read; the data of the next object can also be intelligently identified and pre-read into memory, so that subsequent sequential reads of the object fetch data directly from memory rather than from the disk each time, reducing the latency of each read and improving sequential small-IO read performance.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings required for describing the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the invention, and that other drawings may be obtained from them by a person skilled in the art without inventive effort.
FIG. 1 is a flow chart of a data processing method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a data processing system according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a computer device according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention will be described in further detail with reference to the accompanying drawings.
It should be noted that, in the embodiments of the present invention, the expressions "first" and "second" are used to distinguish two entities or parameters with the same name. "First" and "second" are used only for convenience of expression and should not be construed as limiting the embodiments of the present invention; subsequent embodiments do not repeat this note.
In an embodiment of the invention, distributed block storage is a scalable storage architecture. It can distribute data across devices and share load among multiple servers. In physical-machine and virtual-machine applications, block storage may be used as a long-term storage device and typically offers high-level services such as backup and snapshot.
Object: stored data is divided into a number of objects, each with an object id. The object size is configurable and defaults to 4MB; the object can be regarded as the minimum storage unit of distributed storage.
OSD: the English name is Object Storage Device, and its main functions are data storage, data copying, balance data, recovery data, etc., and heartbeat checking with other OSD. In general, a hard disk corresponds to an OSD, and the OSD manages the hard disk storage, and of course, a partition may also be an OSD.
Bluestone: an object storage engine for managing underlying data is disclosed.
Onode: and recording a data structure of object metadata information, wherein each object corresponds to one onode.
Small IO: IO reads smaller than a preset size (e.g., 128 KB) are defined as small-IO reads.
PG: the PG is an aggregate of some objects, is a basic unit for forming a storage pool, and various characteristics of the storage pool, such as multiple copies, erasure codes and other data backup strategies, are finally realized by means of the PG.
Shard: the OSD set to which a PG is distributed is called a shard set, and each OSD in the set is called a shard. Each OSD in the set has its own number, starting from 0; the number of OSDs in a shard set varies with the data backup strategy. For example, under a three-copy backup strategy, a shard set contains 3 OSDs, numbered 0, 1, and 2.
Volume: data in block storage is stored in blocks within volumes, which are attached to nodes. Volumes provide greater storage capacity for applications, with better reliability and performance. The volumes formed by these blocks are mapped into the operating system and controlled by the file system layer.
Object name in block storage: the object size in block storage is typically 4MB. A volume is divided into objects of 4MB each, numbered from 0 according to the offset within the volume; an object name is composed of two parts: the volume name and the object number.
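The offset-to-object mapping described above can be sketched as follows. The `volume.hex-number` name layout and the separator are illustrative assumptions for the sketch; the patent only states that the name is composed of the volume name and the object number.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # default object size in block storage: 4 MB


def object_name(volume_name: str, volume_offset: int) -> str:
    """Map a byte offset within a volume to the name of the object holding it.

    Objects are numbered from 0 by 4 MB offset within the volume; the
    "volume.number" format and 16-digit hex width are illustrative.
    """
    object_number = volume_offset // OBJECT_SIZE
    return f"{volume_name}.{object_number:016x}"
```

For example, an offset of 9 MB inside a volume falls in object number 2, since objects cover 0-4 MB, 4-8 MB, 8-12 MB, and so on.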
According to an aspect of the present invention, an embodiment of the present invention proposes a data processing method, as shown in fig. 1, which may include performing the following steps at the BlueStore layer:
S1, in response to receiving a read IO, acquiring the number of times the data object corresponding to the read IO has been read within a preset time period;
S2, in response to the read count being not less than a threshold, starting pre-reading and reading all remaining data in the corresponding data object according to the address offset in the read IO;
and S3, screening the data to be read corresponding to the read IO from the remaining data according to the address length in the read IO, returning the data to be read to the upper layer, and placing the remaining data into the cache of the BlueStore layer.
The scheme provided by the invention can improve sequential small-IO read performance, reduce frequent disk accesses, prolong disk service life, lower power consumption, reduce hardware cost, and improve product competitiveness.
In some embodiments, further comprising:
an access counter for recording the read count is added to the metadata of each data object, together with a timestamp recorder for recording the read time.
Specifically, an access counter and a timestamp recorder may be added to the data structure that records object metadata information (the onode), so as to count the number and times of small-IO read accesses to the object within a period of time.
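As a minimal sketch of this bookkeeping, the two recorders could be carried per object as below. The field and method names are illustrative assumptions, not the actual onode layout:

```python
from dataclasses import dataclass


@dataclass
class OnodeMeta:
    """Per-object metadata extended with read-ahead bookkeeping (illustrative)."""
    access_counter: int = 0    # small-IO reads observed in the current window
    last_read_ts: float = 0.0  # timestamp of the most recent small-IO read

    def record_read(self, now: float) -> None:
        """Count one small-IO read and remember when it happened."""
        self.access_counter += 1
        self.last_read_ts = now
```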
In some embodiments, in response to receiving a read IO, acquiring a number of times that a data object corresponding to the read IO is read in a preset time period, further includes:
in response to receiving the read IO, judging whether the data to be read can be hit in the cache;
and in response to the data to be read being hit in the cache, directly returning the data to be read to the upper layer.
Specifically, after receiving a read request, BlueStore first tries to read the data from the memory cache; on a cache hit it returns the data directly to the upper layer, and on a miss it reads the data from the disk.
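This cache-first read path can be sketched as follows. A plain dict stands in for the buffer cache and the disk; the names are illustrative, not BlueStore's actual API:

```python
def handle_read(cache: dict, disk: dict, key):
    """Serve a read: try the memory cache first, fall back to disk on miss.

    Returns (data, "hit") when the cache satisfies the read, otherwise
    (data, "miss") after fetching from disk (where the pre-read logic runs).
    """
    if key in cache:
        return cache[key], "hit"   # hit: return directly to the upper layer
    return disk[key], "miss"       # miss: read from the disk
```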
In some embodiments, further comprising:
and in response to the data to be read not being hit in the cache, reading the access counter and the timestamp recorder in the metadata of the corresponding data object.
In some embodiments, in response to the number of times read is not less than a threshold, starting pre-reading and reading all remaining data in the corresponding data object according to an address offset in the read IO, further comprising:
in response to the access counter being 0, setting the access counter in the corresponding metadata to 1 and not starting pre-reading.
In some embodiments, further comprising:
and directly reading the data to be read from the corresponding data object and returning it to the upper layer.
In some embodiments, further comprising:
and updating the timestamp recorder in the corresponding metadata according to the current time.
In some embodiments, in response to the number of times read is not less than a threshold, starting pre-reading and reading all remaining data in the corresponding data object according to an address offset in the read IO, further comprising:
in response to the access counter not being 0, acquiring the timestamp recorder in the corresponding metadata and comparing it with the current time;
and in response to the difference between the recorded timestamp and the current time being greater than the preset time period, setting the access counter of the corresponding data object to 1 and not starting pre-reading.
In some embodiments, further comprising:
and directly reading the data to be read from the corresponding data object and returning it to the upper layer.
In some embodiments, further comprising:
and updating the timestamp recorder in the corresponding metadata according to the current time.
In some embodiments, further comprising:
and in response to the difference between the recorded timestamp and the current time being smaller than the preset time period, adding 1 to the access counter of the corresponding data object.
In some embodiments, further comprising:
judging whether the access counter of the corresponding data object is not less than the threshold;
and in response to the access counter being not less than the threshold, starting pre-reading and reading all remaining data in the corresponding data object according to the address offset in the read IO.
In some embodiments, further comprising:
and in response to pre-reading not being started, directly reading the data to be read from the corresponding data object and returning it to the upper layer.
In some embodiments, further comprising:
and updating the timestamp recorder in the corresponding metadata according to the current time.
Specifically, when the number of times the same object is read from the disk by small IOs accumulates to a threshold (for example, 3 times) within a preset period (for example, 3 seconds), pre-reading is started.
When the data object is read for the first time, the counter value is still 0, so the object's counter is set to 1 and the current timestamp is recorded; no pre-read operation is performed at this point.
If the data object is not being read for the first time (the counter is not 0), the recorded timestamp is compared with the current time. If the interval exceeds the preset time period, the counter is set to 1, the timestamp is updated, and no pre-read is performed; if it does not exceed the preset time period, the counter is incremented by 1, the timestamp is updated, and the counter value is checked against the threshold: if the threshold is reached, pre-reading is started, otherwise it is not.
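The decision logic above can be sketched as one function. `WINDOW` and `THRESHOLD` use the example values from the text (3 seconds, 3 reads); all names are illustrative, and the behavior at exactly the window boundary is an assumption, since the text only specifies "greater than" and "smaller than":

```python
WINDOW = 3.0     # preset time period in seconds (example value from the text)
THRESHOLD = 3    # read-count threshold (example value from the text)


def should_prefetch(meta, now: float) -> bool:
    """Decide whether this small-IO read triggers pre-reading.

    `meta` carries .access_counter and .last_read_ts. A first read, or a
    read after the window has lapsed, restarts the count without
    pre-reading; otherwise the counter is incremented and compared
    against the threshold.
    """
    if meta.access_counter == 0 or now - meta.last_read_ts > WINDOW:
        meta.access_counter = 1       # first read or window expired: restart
        meta.last_read_ts = now
        return False
    meta.access_counter += 1          # still inside the window
    meta.last_read_ts = now
    return meta.access_counter >= THRESHOLD
```

With these values, three reads of the same object within 3 seconds turn pre-reading on; a long pause resets the count.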
In some embodiments, in response to the number of times read is not less than a threshold, starting pre-reading and reading all remaining data in the corresponding data object according to an address offset in the read IO, further comprising:
acquiring and parsing the object name of the corresponding data object to obtain the object number and the storage volume name;
determining the corresponding storage pool according to the storage volume name and acquiring the number of PGs in the storage pool;
and determining the number of the next data object as the object number plus the number of PGs, and thereby determining the name of the next data object.
In some embodiments, further comprising:
and reading all data in the next data object and placing the data into the cache of the BlueStore layer.
Specifically, after pre-reading is started, the read length of the read IO is expanded from length to all remaining data of the object, starting from the read position (offset) of the read IO.
After pre-reading is started, the next object to be read on the PG can also be identified automatically from the current object name, and all of its data pre-read. The method is as follows: parse the object name to obtain the volume name and the object number; the current object number plus the number of PGs in the current storage pool is the number of the next object on the current shard, which yields the name of the next object. The data of that whole object is then pre-read according to the acquired object name.
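The next-object derivation can be sketched as below. It assumes the illustrative `volume.hex-number` name layout from the earlier definition of object names; the "add the PG count" step is exactly the rule stated in the text:

```python
def next_object_name(obj_name: str, pg_count: int) -> str:
    """Derive the next object on the same shard/PG to pre-read.

    Parses the object name into volume name and object number, then adds
    the storage pool's PG count to the number. The "volume.number" layout
    and hex width are illustrative assumptions.
    """
    volume, num_str = obj_name.rsplit(".", 1)
    next_num = int(num_str, 16) + pg_count
    return f"{volume}.{next_num:0{len(num_str)}x}"
```

For example, with 128 PGs in the pool, object number 2 of a volume is followed on the same PG by object number 130 (0x82).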
In some embodiments, further comprising:
acquiring the preset size of each pre-read;
judging whether all remaining data in the corresponding data object is smaller than the per-read pre-read size;
and in response to all remaining data in the corresponding data object being smaller than the per-read pre-read size, reading data of the corresponding size from the next data object so that the pre-read data reaches the per-read pre-read size.
Specifically, in some scenarios not all data of the next object is pre-read. For example, a per-read pre-read size may be set; if all remaining data in the data object holding the data to be read is smaller than this size, data of the corresponding size is read from the next data object so that the pre-read reaches the per-read pre-read size.
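Splitting one fixed-size pre-read across the object boundary, as described above, can be sketched as follows (names and the returned tuple shape are illustrative):

```python
def plan_prefetch(offset: int, object_size: int, prefetch_size: int):
    """Split one pre-read of `prefetch_size` bytes across object boundaries.

    Returns (bytes_from_current, bytes_from_next): when the data remaining
    in the current object past `offset` is smaller than the per-read
    pre-read size, the shortfall is taken from the next object.
    """
    remaining = object_size - offset
    if remaining >= prefetch_size:
        return prefetch_size, 0
    return remaining, prefetch_size - remaining
```

For instance, with 4 MB objects, a 2 MB pre-read starting 3 MB into an object takes the last 1 MB of the current object and the first 1 MB of the next.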
In some embodiments, after BlueStore reads the remaining data of the object locally, the length bytes originally requested are returned to the upper layer to ensure data consistency, and the remaining data is placed in the BlueStore cache. When the cached data reaches a certain threshold, the data added to the buffer earliest is trimmed to free up cache space.
It should be noted that the read-ahead at the BlueStore layer is not perceived by the upper layers.
According to the scheme provided by the invention, the BlueStore layer identifies the number of times the same object is read within the threshold time; for an object judged to be hot data, the range of the data read is enlarged and the read data is placed in the memory cache. This realizes data pre-reading, turning many small-IO reads into one large-IO read; the data of the next object can also be intelligently identified and pre-read into memory, so that subsequent sequential reads of the object fetch data directly from memory rather than from the disk each time, reducing the latency of each read and improving sequential small-IO read performance.
Based on the same inventive concept, according to another aspect of the present invention, there is also provided a data processing system 400, as shown in fig. 2, including:
an acquisition module 401, configured to, in response to receiving a read IO, acquire the number of times the data object corresponding to the read IO has been read within a preset time period;
a pre-reading module 402, configured to, in response to the read count being not less than a threshold, start pre-reading and read all remaining data in the corresponding data object according to the address offset in the read IO;
and a return module 403, configured to screen the data to be read corresponding to the read IO from the remaining data according to the address length in the read IO, return the data to be read to the upper layer, and place the remaining data into the cache of the BlueStore layer.
In some embodiments, the system further comprises a metadata module configured to:
an access counter for recording the read count is added to the metadata of each data object, together with a timestamp recorder for recording the read time.
In some embodiments, the acquisition module 401 is further configured to:
in response to receiving the read IO, judging whether the data to be read can be hit in the cache;
and in response to the data to be read being hit in the cache, directly returning the data to be read to the upper layer.
In some embodiments, the acquisition module 401 is further configured to:
and in response to the data to be read not being hit in the cache, reading the access counter and the timestamp recorder in the metadata of the corresponding data object.
In some embodiments, the acquisition module 401 is further configured to:
in response to the access counter being 0, setting the access counter in the corresponding metadata to 1 and not starting pre-reading.
In some embodiments, the acquisition module 401 is further configured to:
and directly reading the data to be read from the corresponding data object and returning it to the upper layer.
In some embodiments, the acquisition module 401 is further configured to:
and updating the timestamp recorder in the corresponding metadata according to the current time.
In some embodiments, the pre-read module 402 is further configured to:
in response to the access counter not being 0, acquiring the timestamp recorder in the corresponding metadata and comparing it with the current time;
and in response to the difference between the recorded timestamp and the current time being greater than the preset time period, setting the access counter of the corresponding data object to 1 and not starting pre-reading.
In some embodiments, the pre-read module 402 is further configured to:
and directly reading the data to be read from the corresponding data object and returning it to the upper layer.
In some embodiments, the pre-read module 402 is further configured to:
and updating the timestamp recorder in the corresponding metadata according to the current time.
In some embodiments, the pre-read module 402 is further configured to:
and in response to the difference between the recorded timestamp and the current time being smaller than the preset time period, adding 1 to the access counter of the corresponding data object.
In some embodiments, the pre-read module 402 is further configured to:
judging whether the access counter of the corresponding data object is not less than a threshold value;
and in response to the access counter being not less than the threshold value, starting pre-reading and reading all the remaining data in the corresponding data object according to the address offset in the read IO.
In some embodiments, the pre-read module 402 is further configured to:
and in response to pre-reading not being started, directly reading the data to be read from the corresponding data object and returning it to the upper layer.
In some embodiments, the pre-read module 402 is further configured to:
and updating the timestamp recorder in the corresponding metadata according to the current time.
In some embodiments, the pre-read module 402 is further configured to:
Acquiring and parsing the object name of the corresponding data object to obtain an object number and a storage volume name;
determining the corresponding storage pool according to the storage volume name and acquiring the number of PGs (placement groups) in the storage pool;
and determining the number of the next data object by adding the number of PGs to the object number, and further determining the name of the next data object.
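As a sketch of the naming arithmetic just described: assuming the objects of a volume are named `<volume name>.<zero-padded hex object number>` (an illustrative convention modeled on RBD-style data object names, not specified by the patent), the next object's name is obtained by adding the PG count to the parsed object number:

```python
def next_object_name(object_name: str, pg_count: int) -> str:
    """Parse '<volume>.<hex number>', add the PG count to the number,
    and rebuild the name with the same zero-padded width."""
    volume, hex_no = object_name.rsplit(".", 1)
    next_no = int(hex_no, 16) + pg_count
    return f"{volume}.{next_no:0{len(hex_no)}x}"
```

For example, with 128 PGs in the pool, the object after number 0x4 would be numbered 0x84 under this rule.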
In some embodiments, the pre-read module 402 is further configured to:
and reading all data in the next data object and putting the data into the cache of the BlueStore layer.
In some embodiments, the pre-read module 402 is further configured to:
acquiring the preset size of each pre-read;
judging whether all the remaining data in the corresponding data object is smaller than the per-pre-read size;
and in response to all the remaining data in the corresponding data object being smaller than the per-pre-read size, reading data of the corresponding size from the next data object so that the pre-read data reaches the per-pre-read size.
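The top-up rule above — when the current object's remaining data is smaller than the per-pre-read size, read the shortfall from the next object — amounts to a small split calculation (names are illustrative):

```python
def plan_preread(remaining_in_object: int, preread_size: int) -> tuple[int, int]:
    """Return (bytes to read from the current object, bytes to top up from
    the next object) so the total pre-read reaches preread_size."""
    from_current = min(remaining_in_object, preread_size)
    return from_current, preread_size - from_current
```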
According to the scheme provided by the invention, the BlueStore layer tracks how many times the same object is read within a threshold time window. For an object judged to be hot data, the range of each read is enlarged and the data read is placed in the memory cache, so that data pre-reading is realized and many small-IO reads are converted into a single large-IO read. The scheme can also intelligently recognize and pre-read the data of the next object into memory, so that subsequent sequential reads of that object are served directly from memory rather than from disk each time, reducing the latency of each read and improving small-IO sequential read performance.
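Putting the embodiments together, the read path can be modeled roughly as below. This is an illustrative sketch, not BlueStore code: the dict-based cache and metadata stores, the string stand-in for disk data, and the threshold/period defaults are all assumptions:

```python
def handle_read(obj, offset, length, cache, meta, now, threshold=4, period=2.0):
    """Return (data, preread_started). A cache hit returns directly; a miss
    updates the per-object counter/timestamp and, once the counter reaches
    the threshold, starts pre-reading (modeled here as caching the data)."""
    key = (obj, offset)
    if key in cache:                          # hit: serve from memory
        return cache[key], False
    counter, last_ts = meta.get(obj, (0, 0.0))
    counter = 1 if counter == 0 or now - last_ts > period else counter + 1
    meta[obj] = (counter, now)                # update counter and timestamp
    data = f"{obj}[{offset}:{offset + length}]"   # stand-in for a disk read
    preread = counter >= threshold            # object judged hot: pre-read
    if preread:
        cache[key] = data
    return data, preread
```

With these defaults, four reads of the same object within the window trip the threshold, and a later read at that offset is then served from the cache.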
Based on the same inventive concept, according to another aspect of the present invention, as shown in fig. 3, an embodiment of the present invention further provides a computer device 501, comprising:
at least one processor 520; and
a memory 510 storing a computer program 511 executable on the processor, wherein the processor 520 performs the following steps when executing the program:
in response to receiving a read IO, acquiring the number of times that the data object corresponding to the read IO has been read in a preset time period;
in response to the number of reads being not less than a threshold value, starting pre-reading and reading all the remaining data in the corresponding data object according to the address offset in the read IO;
and screening the data to be read corresponding to the read IO from all the remaining data according to the address length in the read IO, returning the data to be read to the upper layer, and putting the remaining data into the cache of the BlueStore layer.
In some embodiments, further comprising:
adding, to the metadata of each data object, an access counter for recording the number of reads and a timestamp recorder for recording the read time.
In some embodiments, in response to receiving a read IO, acquiring a number of times that a data object corresponding to the read IO is read in a preset time period, further includes:
In response to receiving the read IO, judging whether the data to be read can be hit in the cache;
and in response to the data to be read being hit in the cache, directly returning the data to be read to the upper layer.
In some embodiments, further comprising:
and in response to the data to be read not being hit in the cache, reading the access counter and the timestamp recorder in the metadata of the corresponding data object.
In some embodiments, in response to the number of times read is not less than a threshold, starting pre-reading and reading all remaining data in the corresponding data object according to an address offset in the read IO, further comprising:
in response to the access counter being 0, setting the access counter in the corresponding metadata to 1 and not starting pre-reading.
In some embodiments, further comprising:
and directly reading the data to be read from the corresponding data object and returning it to the upper layer.
In some embodiments, further comprising:
and updating the timestamp recorder in the corresponding metadata according to the current time.
In some embodiments, in response to the number of times read is not less than a threshold, starting pre-reading and reading all remaining data in the corresponding data object according to an address offset in the read IO, further comprising:
in response to the access counter not being 0, acquiring the timestamp recorder in the corresponding metadata and comparing it with the current time;
and in response to the difference between the recorded timestamp and the current time being greater than the preset time period, setting the access counter of the corresponding data object to 1 and not starting pre-reading.
In some embodiments, further comprising:
and directly reading the data to be read from the corresponding data object and returning it to the upper layer.
In some embodiments, further comprising:
and updating the timestamp recorder in the corresponding metadata according to the current time.
In some embodiments, further comprising:
and in response to the difference between the recorded timestamp and the current time being smaller than the preset time period, adding 1 to the access counter of the corresponding data object.
In some embodiments, further comprising:
judging whether the access counter of the corresponding data object is not less than a threshold value;
and in response to the access counter being not less than the threshold value, starting pre-reading and reading all the remaining data in the corresponding data object according to the address offset in the read IO.
In some embodiments, further comprising:
and in response to pre-reading not being started, directly reading the data to be read from the corresponding data object and returning it to the upper layer.
In some embodiments, further comprising:
and updating the timestamp recorder in the corresponding metadata according to the current time.
In some embodiments, in response to the number of times read is not less than a threshold, starting pre-reading and reading all remaining data in the corresponding data object according to an address offset in the read IO, further comprising:
acquiring and parsing the object name of the corresponding data object to obtain an object number and a storage volume name;
determining the corresponding storage pool according to the storage volume name and acquiring the number of PGs (placement groups) in the storage pool;
and determining the number of the next data object by adding the number of PGs to the object number, and further determining the name of the next data object.
In some embodiments, further comprising:
and reading all data in the next data object and putting the data into the cache of the BlueStore layer.
In some embodiments, further comprising:
acquiring the preset size of each pre-read;
judging whether all the remaining data in the corresponding data object is smaller than the per-pre-read size;
and in response to all the remaining data in the corresponding data object being smaller than the per-pre-read size, reading data of the corresponding size from the next data object so that the pre-read data reaches the per-pre-read size.
Based on the same inventive concept, according to another aspect of the present invention, as shown in fig. 4, an embodiment of the present invention further provides a computer-readable storage medium 601, the computer-readable storage medium 601 storing a computer program 610, the computer program 610 when executed by a processor performing the steps of:
in response to receiving a read IO, acquiring the number of times that the data object corresponding to the read IO has been read in a preset time period;
in response to the number of reads being not less than a threshold value, starting pre-reading and reading all the remaining data in the corresponding data object according to the address offset in the read IO;
and screening the data to be read corresponding to the read IO from all the remaining data according to the address length in the read IO, returning the data to be read to the upper layer, and putting the remaining data into the cache of the BlueStore layer.
In some embodiments, further comprising:
adding, to the metadata of each data object, an access counter for recording the number of reads and a timestamp recorder for recording the read time.
In some embodiments, in response to receiving a read IO, acquiring a number of times that a data object corresponding to the read IO is read in a preset time period, further includes:
In response to receiving the read IO, judging whether the data to be read can be hit in the cache;
and in response to the data to be read being hit in the cache, directly returning the data to be read to the upper layer.
In some embodiments, further comprising:
and in response to the data to be read not being hit in the cache, reading the access counter and the timestamp recorder in the metadata of the corresponding data object.
In some embodiments, in response to the number of times read is not less than a threshold, starting pre-reading and reading all remaining data in the corresponding data object according to an address offset in the read IO, further comprising:
in response to the access counter being 0, setting the access counter in the corresponding metadata to 1 and not starting pre-reading.
In some embodiments, further comprising:
and directly reading the data to be read from the corresponding data object and returning it to the upper layer.
In some embodiments, further comprising:
and updating the timestamp recorder in the corresponding metadata according to the current time.
In some embodiments, in response to the number of times read is not less than a threshold, starting pre-reading and reading all remaining data in the corresponding data object according to an address offset in the read IO, further comprising:
in response to the access counter not being 0, acquiring the timestamp recorder in the corresponding metadata and comparing it with the current time;
and in response to the difference between the recorded timestamp and the current time being greater than the preset time period, setting the access counter of the corresponding data object to 1 and not starting pre-reading.
In some embodiments, further comprising:
and directly reading the data to be read from the corresponding data object and returning it to the upper layer.
In some embodiments, further comprising:
and updating the timestamp recorder in the corresponding metadata according to the current time.
In some embodiments, further comprising:
and in response to the difference between the recorded timestamp and the current time being smaller than the preset time period, adding 1 to the access counter of the corresponding data object.
In some embodiments, further comprising:
judging whether the access counter of the corresponding data object is not less than a threshold value;
and in response to the access counter being not less than the threshold value, starting pre-reading and reading all the remaining data in the corresponding data object according to the address offset in the read IO.
In some embodiments, further comprising:
and in response to pre-reading not being started, directly reading the data to be read from the corresponding data object and returning it to the upper layer.
In some embodiments, further comprising:
and updating the timestamp recorder in the corresponding metadata according to the current time.
In some embodiments, in response to the number of times read is not less than a threshold, starting pre-reading and reading all remaining data in the corresponding data object according to an address offset in the read IO, further comprising:
acquiring and parsing the object name of the corresponding data object to obtain an object number and a storage volume name;
determining the corresponding storage pool according to the storage volume name and acquiring the number of PGs (placement groups) in the storage pool;
and determining the number of the next data object by adding the number of PGs to the object number, and further determining the name of the next data object.
In some embodiments, further comprising:
and reading all data in the next data object and putting the data into the cache of the BlueStore layer.
In some embodiments, further comprising:
acquiring the preset size of each pre-read;
judging whether all the remaining data in the corresponding data object is smaller than the per-pre-read size;
and in response to all the remaining data in the corresponding data object being smaller than the per-pre-read size, reading data of the corresponding size from the next data object so that the pre-read data reaches the per-pre-read size.
Finally, it should be noted that, as will be appreciated by those skilled in the art, all or part of the processes in the methods of the embodiments described above may be implemented by a computer program instructing relevant hardware; the program may be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above.
Further, it should be appreciated that the computer-readable storage medium (e.g., memory) herein can be either volatile memory or nonvolatile memory, or can include both volatile and nonvolatile memory.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as software or hardware depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
The foregoing is an exemplary embodiment of the present disclosure, but it should be noted that various changes and modifications could be made herein without departing from the scope of the disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims in accordance with the disclosed embodiments described herein need not be performed in any particular order. Furthermore, although elements of the disclosed embodiments may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.
It should be understood that as used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly supports the exception. It should also be understood that "and/or" as used herein is meant to include any and all possible combinations of one or more of the associated listed items.
The serial numbers of the foregoing embodiments of the present invention are for description only and do not represent the superiority or inferiority of the embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program for instructing relevant hardware, and the program may be stored in a computer readable storage medium, where the storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
Those of ordinary skill in the art will appreciate that: the above discussion of any embodiment is merely exemplary and is not intended to imply that the scope of the disclosure of embodiments of the invention, including the claims, is limited to such examples; combinations of features of the above embodiments or in different embodiments are also possible within the idea of an embodiment of the invention, and many other variations of the different aspects of the embodiments of the invention as described above exist, which are not provided in detail for the sake of brevity. Therefore, any omission, modification, equivalent replacement, improvement, etc. of the embodiments should be included in the protection scope of the embodiments of the present invention.
Claims (20)
1. A method of data processing, comprising performing the following steps at a BlueStore layer:
in response to receiving a read IO, obtaining the number of times that the data object corresponding to the read IO has been read in a preset time period, wherein the size of the data to be read corresponding to the read IO is smaller than a preset size;
in response to the number of reads being not less than a threshold value, starting pre-reading and reading all the remaining data in the corresponding data object according to the address offset in the read IO;
screening the data to be read corresponding to the read IO from all the remaining data according to the address length in the read IO, returning the data to be read to the upper layer, and putting the remaining data into the cache of the BlueStore layer;
wherein, in response to the number of times read is not less than a threshold, starting pre-reading and reading all remaining data in the corresponding data object according to the address offset in the read IO, further comprising:
acquiring and parsing the object name of the corresponding data object to obtain an object number and a storage volume name;
determining the corresponding storage pool according to the storage volume name and acquiring the number of PGs (placement groups) in the storage pool;
determining the number of the next data object by adding the number of PGs to the object number, and further determining the name of the next data object;
and reading all data in the next data object and putting the data into the cache of the BlueStore layer.
2. The method as recited in claim 1, further comprising:
adding, to the metadata of each data object, an access counter for recording the number of reads and a timestamp recorder for recording the read time.
3. The method of claim 2, wherein in response to receiving a read IO, obtaining a number of times a data object corresponding to the read IO is read within a preset time period, further comprising:
In response to receiving the read IO, judging whether the data to be read can be hit in the cache;
and in response to the data to be read being hit in the cache, directly returning the data to be read to the upper layer.
4. A method as recited in claim 3, further comprising:
and in response to the data to be read not being hit in the cache, reading the access counter and the timestamp recorder in the metadata of the corresponding data object.
5. The method of claim 4, wherein in response to the number of times read is not less than a threshold, starting a read-ahead and reading all remaining data in the corresponding data object according to an address offset in the read IO, further comprising:
in response to the access counter being 0, setting the access counter in the corresponding metadata to 1 and not starting pre-reading.
6. The method as recited in claim 5, further comprising:
and directly reading the data to be read from the corresponding data object and returning it to the upper layer.
7. The method as recited in claim 5, further comprising:
and updating the timestamp recorder in the corresponding metadata according to the current time.
8. The method of claim 4, wherein in response to the number of times read is not less than a threshold, starting a read-ahead and reading all remaining data in the corresponding data object according to an address offset in the read IO, further comprising:
in response to the access counter not being 0, acquiring the timestamp recorder in the corresponding metadata and comparing it with the current time;
and in response to the difference between the recorded timestamp and the current time being greater than the preset time period, setting the access counter of the corresponding data object to 1 and not starting pre-reading.
9. The method as recited in claim 8, further comprising:
and directly reading the data to be read from the corresponding data object and returning it to the upper layer.
10. The method as recited in claim 8, further comprising:
and updating the timestamp recorder in the corresponding metadata according to the current time.
11. The method as recited in claim 8, further comprising:
and in response to the difference between the recorded timestamp and the current time being smaller than the preset time period, adding 1 to the access counter of the corresponding data object.
12. The method as recited in claim 11, further comprising:
judging whether the access counter of the corresponding data object is not less than a threshold value;
and in response to the access counter being not less than the threshold value, starting pre-reading and reading all the remaining data in the corresponding data object according to the address offset in the read IO.
13. The method as recited in claim 12, further comprising:
and in response to pre-reading not being started, directly reading the data to be read from the corresponding data object and returning it to the upper layer.
14. The method of claim 11 or 13, further comprising:
and updating the timestamp recorder in the corresponding metadata according to the current time.
15. The method as recited in claim 1, further comprising:
acquiring the preset size of each pre-read;
judging whether all the remaining data in the corresponding data object is smaller than the per-pre-read size;
and in response to all the remaining data in the corresponding data object being smaller than the per-pre-read size, reading data of the corresponding size from the next data object so that the pre-read data reaches the per-pre-read size.
16. The method as recited in claim 1, further comprising:
and reclaiming the data in the cache of the BlueStore layer.
17. The method of claim 16, wherein reclaiming the data in the cache of the BlueStore layer further comprises:
and in response to the amount of data in the cache of the BlueStore layer reaching a preset threshold value, reclaiming the data that was added to the cache earliest.
18. A data processing system, comprising:
the acquisition module is configured to, in response to receiving a read IO, acquire the number of times that the data object corresponding to the read IO has been read in a preset time period, wherein the size of the data to be read corresponding to the read IO is smaller than a preset size;
the pre-reading module is configured to, in response to the number of reads being not less than a threshold value, start pre-reading and read all the remaining data in the corresponding data object according to the address offset in the read IO;
the return module is configured to screen the data to be read corresponding to the read IO from all the remaining data according to the address length in the read IO, return the data to be read to the upper layer, and put the remaining data into the cache of the BlueStore layer;
the pre-reading module is further configured to:
acquiring and parsing the object name of the corresponding data object to obtain an object number and a storage volume name;
determining the corresponding storage pool according to the storage volume name and acquiring the number of PGs (placement groups) in the storage pool;
determining the number of the next data object by adding the number of PGs to the object number, and further determining the name of the next data object;
and reading all data in the next data object and putting the data into the cache of the BlueStore layer.
19. A computer device, comprising:
at least one processor; and
a memory storing a computer program executable on the processor, wherein the processor performs the steps of the method of any one of claims 1-17 when the program is executed.
20. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor performs the steps of the method according to any one of claims 1-17.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310026962.3A CN115827508B (en) | 2023-01-09 | 2023-01-09 | Data processing method, system, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115827508A CN115827508A (en) | 2023-03-21 |
CN115827508B true CN115827508B (en) | 2023-05-09 |
Family
ID=85520452
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310026962.3A Active CN115827508B (en) | 2023-01-09 | 2023-01-09 | Data processing method, system, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115827508B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105786401A (en) * | 2014-12-25 | 2016-07-20 | 中国移动通信集团公司 | Data management method and device in server cluster system |
CN106844740A (en) * | 2017-02-14 | 2017-06-13 | 华南师范大学 | Data pre-head method based on memory object caching system |
CN111190655A (en) * | 2019-12-30 | 2020-05-22 | 中国银行股份有限公司 | Processing method, device, equipment and system for application cache data |
CN113687781A (en) * | 2021-07-30 | 2021-11-23 | 济南浪潮数据技术有限公司 | Method, device, equipment and medium for pulling up thermal data |
CN114138688A (en) * | 2021-11-14 | 2022-03-04 | 郑州云海信息技术有限公司 | Data reading method, system, device and medium |
CN114527938A (en) * | 2022-01-24 | 2022-05-24 | 苏州浪潮智能科技有限公司 | Data reading method, system, medium and device based on solid state disk |
CN115203072A (en) * | 2022-06-07 | 2022-10-18 | 中国电子科技集团公司第五十二研究所 | File pre-reading cache allocation method and device based on access heat |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115686385B (en) * | 2023-01-03 | 2023-03-21 | 苏州浪潮智能科技有限公司 | Data storage method and device, computer equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7711916B2 (en) | Storing information on storage devices having different performance capabilities with a storage system | |
CN102782683B (en) | Buffer pool extension for database server | |
CN113296696B (en) | Data access method, computing device and storage medium | |
US20150039837A1 (en) | System and method for tiered caching and storage allocation | |
CN110413685B (en) | Database service switching method, device, readable storage medium and computer equipment | |
CN114281762B (en) | Log storage acceleration method, device, equipment and medium | |
JP6409105B2 (en) | Storage constrained synchronization of shared content items | |
EP3789883A1 (en) | Storage fragment managing method and terminal | |
CN108108089B (en) | Picture loading method and device | |
CN113778662B (en) | Memory recovery method and device | |
CN113806300B (en) | Data storage method, system, device, equipment and storage medium | |
CN109558456A (en) | A kind of file migration method, apparatus, equipment and readable storage medium storing program for executing | |
CN107133334B (en) | Data synchronization method based on high-bandwidth storage system | |
CN117235088A (en) | Cache updating method, device, equipment, medium and platform of storage system | |
CN115827508B (en) | Data processing method, system, equipment and storage medium | |
JP2005258789A (en) | Storage device, storage controller, and write back cache control method | |
CN110413689B (en) | Multi-node data synchronization method and device for memory database | |
CN111913913A (en) | Access request processing method and device | |
KR101419428B1 (en) | Apparatus for logging and recovering transactions in database installed in a mobile environment and method thereof | |
CN114328007B (en) | Container backup and restoration method, device and medium thereof | |
CN117785933A (en) | Data caching method, device, equipment and readable storage medium | |
CN115543930A (en) | Method, device and related equipment for locking file in memory | |
CN103164431B (en) | The date storage method of relevant database and storage system | |
CN117950597B (en) | Data modification writing method, data modification writing device, and computer storage medium | |
CN112131433B (en) | Interval counting query method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||