CN114564154B - Data reading method based on distributed storage - Google Patents
- Publication number
- CN114564154B (CN202210194691.8A)
- Authority
- CN
- China
- Prior art keywords
- read
- read operation
- merging
- request message
- queue
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/061—Improving I/O performance
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a data reading method based on distributed storage, comprising the following steps: receiving a plurality of read operations, each carrying a read request message; executing a merging flow among the read operations, that is, merging their read request messages so that the size of a merged read request message is an integer multiple of the size of the basic storage unit; and passing the merged read operations to the read flow, where the data is read. Because the merging flow leaves most read request messages at an integer multiple of the basic storage unit size, they are stripe-aligned, and reading in the stripe-aligned state effectively improves data reading efficiency and read performance.
Description
Technical Field
The invention relates to the technical field of distributed storage, in particular to a data reading method based on distributed storage.
Background
With the continued development of information technology, data has become an increasingly valuable resource, and how to process data quickly and obtain the desired results is one of the key issues in turning that resource into an asset. People's activities at work and in daily life generate data, and collecting, analyzing, and processing it yields useful information; this conversion from resource to asset has driven the rapid development of big data and high-performance computing. Data storage, as one of the core elements of data resources, has likewise entered a period of rapid development. A traditional network storage system stores all data on a centralized storage server, which becomes the bottleneck of system performance as well as the focal point for reliability and security, and cannot meet the needs of large-scale storage applications. A distributed network storage system adopts a scalable architecture, which improves reliability, availability, and access efficiency and makes the system easy to expand, so it is being adopted by more and more enterprises.
In a scenario where iSCSI (Internet Small Computer System Interface) is used to connect to distributed storage, storage volumes are mounted on compute nodes, and virtual machines read and write these volumes. Because compute nodes and virtual machine nodes are managed flexibly according to actual usage in large-scale deployments, batch and automated operation is particularly important. By default, a 1M read request issued by a virtual machine to a data disk is split into two IO patterns, [504k, 504k, 16k] and [512k, 512k], before being issued to the distributed storage. The distributed storage handles both of these non-stripe-aligned IO patterns inefficiently, which degrades read performance.
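The arithmetic behind the two split patterns can be checked with a short sketch. It assumes that "k" means KiB, that the first pattern reads as 504k + 504k + 16k, and that the basic storage unit is 1M as given in the detailed description; the helper name `is_stripe_aligned` is hypothetical.

```python
KIB = 1024
UNIT = 1024 * KIB  # assumed basic storage unit of 1M


def is_stripe_aligned(length: int, unit: int = UNIT) -> bool:
    """Treat a request as aligned when its length is a positive multiple of the unit."""
    return length > 0 and length % unit == 0


split_a = [504 * KIB, 504 * KIB, 16 * KIB]
split_b = [512 * KIB, 512 * KIB]

# Each pattern still adds up to the original 1M request...
total_a = sum(split_a)
total_b = sum(split_b)

# ...but no individual fragment is a multiple of the basic storage unit.
aligned_fragments = [n for n in split_a + split_b if is_stripe_aligned(n)]
```

Under these assumptions every fragment is unaligned on its own, while the sum of each pattern is aligned, which is the gap the merging flow is meant to close.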
Disclosure of Invention
In order to solve the technical problems, the application provides a data reading method based on distributed storage, which can improve the data reading efficiency in the distributed storage.
In order to achieve the above object, the present application proposes a first technical solution:
A data reading method based on distributed storage, comprising the steps of:
receiving a plurality of read operations, each carrying a read request message;
executing a merging flow among the plurality of read operations, that is, merging the read request messages of the read operations, so that the size of a merged read request message is an integer multiple of the size of the basic storage unit;
passing the merged read operations to the read flow and performing the data read.
In one embodiment of the present application, before the performing of the merging procedure between the plurality of read operations, the method further includes:
adding the received read operation to the first queue, and periodically transferring the read operation in the first queue to the second queue.
In one embodiment of the present application, the merging process between the performing a plurality of read operations specifically includes:
taking a read operation in the first queue as the source read operation and a read operation in the second queue as the destination read operation, and executing the merging flow, wherein new read operations can continue to be added to the first queue;
within the second queue, taking a read operation that has not yet been merged as the source read operation and a read operation that has already been merged as the destination read operation, and executing the merging flow;
executing the merging flow among the read operations remaining in the second queue.
In one embodiment of the present invention, the merging process specifically includes:
judging whether the source read operation and the destination read operation meet the merging condition or not;
if the merging condition is met, inserting the read request message of the source read operation into the read request message of the destination read operation;
if the merge condition is not satisfied, a source read operation that does not satisfy the merge condition is sent to the second queue.
In one embodiment of the present invention, after the read request message of the source read operation is inserted into the read request message of the destination read operation, the method further includes:
a merge flag is added to both the source read operation and the destination read operation that perform the read request message merge.
In one embodiment of the present invention, after the read request message of the source read operation is inserted into the read request message of the destination read operation, the method further includes:
and recording the combined message record of the source read operation and the destination read operation.
In one embodiment of the present invention, after the merging process between the plurality of read operations is performed, the method further includes:
judging whether the size of the read request message is divisible by the basic storage unit, or whether the read operation has reached the waiting time threshold;
if the size of the read request message is divisible by the basic storage unit, or the waiting time threshold has been reached, continuing to the read flow;
if the size of the read request message is not divisible by the basic storage unit and the waiting time threshold has not been reached, adding the read operation back to the second queue.
In one embodiment of the present application, the read process specifically includes:
judging whether the read operation has a merging mark, if not, returning a reply message according to the original flow, and ending the read flow; if a merge flag is present, the merge message record is traversed.
In one embodiment of the present application, after traversing the merged message record, the method further includes:
splitting the read data and the merged read request message according to the merge message record, forming a reply message from each piece of split data and its corresponding read request message, returning the reply message, and ending the read flow.
In order to achieve the above object, the present application proposes a second technical solution:
A data reading system based on distributed storage, the system comprising:
the object storage module receives a read operation from the client module;
the operation merging module is used for executing a merging flow of the read operation;
the integer division judging module is used for judging whether the read request message of the read operation can be integer divided by the basic storage unit;
and the data reading module is used for reading the data in the object storage module.
In order to achieve the above object, the present application proposes a third technical solution:
a computer-readable storage medium, characterized by: the computer readable storage medium stores a program which, when executed by a processor, causes the processor to perform the steps of a data reading method based on distributed storage.
Compared with the prior art, the technical scheme of the application has the following advantages:
In the data reading method based on distributed storage of the present application, the merging flow is executed on the received read operations, that is, their read request messages are merged, so that most read request messages after merging are an integer multiple of the basic storage unit size and are therefore stripe-aligned. Reading in the stripe-aligned state effectively improves data reading efficiency and read performance.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a first method flow diagram of a distributed storage based data reading method of the present invention;
FIG. 2 is a flowchart of the merging flow of the distributed storage based data reading method of the present invention;
FIG. 3 is a second method flow diagram of a distributed storage based data reading method of the present invention;
FIG. 4 is a system configuration diagram of a data reading system based on distributed storage of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, but not all embodiments, and that the steps can be interchanged to achieve the same or similar effects, which are all within the scope of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Embodiment one:
Referring to FIG. 1, FIG. 1 is a flowchart of a first method of the distributed storage based data reading method of the present invention.
The method of the embodiment comprises the following steps:
receiving a read operation with a read request message;
Each read operation carries a read request message, and the corresponding data is located and read through that message. The read operation and its read request message are therefore received first.
Executing a merging flow among a plurality of read operations, namely executing the merging flow on the read request information of the read operations, so that the size of the read request information after executing the merging flow is an integer multiple of that of the basic storage unit;
In prior-art distributed storage, non-stripe-aligned requests are common; in that case data reading is inefficient and read performance suffers considerably. The invention executes a merging flow among a plurality of read operations. Each read operation corresponds to one read request message, and the corresponding data is read according to that message, so merging read operations essentially means merging their read request messages, and the merging flow of read operations is also referred to as the merging flow of read request messages. After the merging flow, the size of most read request messages is an integer multiple of the basic storage unit size, so they are stripe-aligned, which effectively improves data reading efficiency and read performance. The size of the basic storage unit is 1M.
And the read operation after the merging process is executed enters the read process, and the data read operation is performed.
After the merging flow, most read request messages have a size that is an integer multiple of the basic storage unit and are therefore stripe-aligned. The merged read operations then enter the read flow, where the corresponding data is read according to the read request message, completing the data read.
In one embodiment, before the merging process between the plurality of read operations is performed, the method further includes:
adding the received read operation to the first queue, and periodically transferring the read operation in the first queue to the second queue.
In the operation of executing the merging flow, two storage queues are generally set, and one storage queue is used for storing new read operations, namely, a first queue; the other store queue is used to store the merged read operation, i.e., the second queue. When the merging process is executed at the beginning, the first queue and the second queue are empty queues, and no read operation exists. When a read operation is received, the received read operation is added to the first queue, and the read operation in the first queue is transferred to the second queue periodically, so that the read operation in the first queue and the read operation in the second queue can be combined.
In one embodiment, after the received read operation is added to the first queue, the method further includes:
the enqueue time for the read operation is recorded.
The enqueue time is recorded when each read operation is added to the first queue, so that it can later be determined whether the read operation has reached the waiting time threshold: a read operation that has reached the threshold proceeds to the read flow, and one that has not is sent to the second queue.
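The two queues and the recorded enqueue time can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `ReadOp`, `ReadQueues`, `receive`, and `transfer` are hypothetical, and the timer that would call `transfer` periodically is left out because the text does not specify the period.

```python
import time
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ReadOp:
    index: int    # index value of the read request message
    length: int   # length of the read request message
    # Recorded when the operation is created, i.e. when it is enqueued here.
    enqueue_time: float = field(default_factory=time.monotonic)


class ReadQueues:
    def __init__(self) -> None:
        self.first = deque()   # newly received read operations
        self.second = deque()  # read operations taking part in merging

    def receive(self, op: ReadOp) -> None:
        """Add a newly received read operation to the first queue."""
        self.first.append(op)

    def transfer(self) -> int:
        """Move everything from the first queue to the second; meant to be called periodically."""
        moved = len(self.first)
        while self.first:
            self.second.append(self.first.popleft())
        return moved


queues = ReadQueues()
queues.receive(ReadOp(index=0, length=4096))
queues.receive(ReadOp(index=4096, length=4096))
moved = queues.transfer()
```

Both queues start empty, matching the description, and arrival order is preserved across the transfer.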
In one embodiment, the merging process between the performing a plurality of read operations specifically includes:
taking the read operation in the first queue as a source read operation, and taking the read operation in the second queue as a target read operation, and executing a merging flow, wherein the first queue can continuously add new read operations;
After the transfer of read operations is completed, the first queue is traversed. Each read operation in the first queue is taken as a source read operation and each read operation in the second queue as a destination read operation, and the merging flow is executed on source and destination in turn, that is, on the read request message of the source read operation and the read request message of the destination read operation. For example, when the merging flow is executed on read operation A and read operation B, the read request message of A is inserted into the read request message of B; throughout the flow, A is called the source read operation and B the destination read operation. The merging flow aims to make the size of the read request message an integer multiple of the basic storage unit; such a message is stripe-aligned, which effectively improves data reading efficiency. One read operation corresponds to one read request message, so executing the merging flow on read operations is essentially executing it on read request messages. While the merging flow runs, new read operations continue to be added to the first queue, and each traversal of the first queue feeds the new read operations into the merging flow.
Within the second queue, a read operation that has not yet been merged is taken as the source read operation and a read operation that has already been merged as the destination read operation, and the merging flow is executed;
When the merging flow is executed between the first queue and the second queue, the read request message from the first queue is actually inserted into the read request message in the second queue. Therefore, once the first queue has been traversed, most of the merged read request messages reside in the second queue. Read operations from the first queue that could not be merged with any read operation in the second queue are transferred to the second queue at this point, and some of the read operations now in the second queue may be mergeable with one another. The second queue is therefore traversed, with already-merged read operations as destinations and not-yet-merged read operations as sources, so that the merging flow runs again on whatever can be merged there and most merged read request messages become divisible by the basic storage unit.
And executing a merging flow between the rest read operations in the second queue.
After the merging flow has been executed between the first queue and the second queue, some read operations will not have satisfied the merge condition; these are sent to the second queue, and the merging flow is executed within the second queue. Even after that, there may still be read operations that do not satisfy the merge condition, or merged read operations whose read request message size is not divisible by the basic storage unit, and some already-merged read operations may still be mergeable with one another. The merging flow is therefore executed once more among the read operations remaining in the second queue, so that every read operation that can be merged is merged. In this way most mergeable read operations are merged and most read request messages become divisible by the basic storage unit, which greatly improves data reading efficiency. Some read operations may never be mergeable, but they have little effect on the efficiency of the overall read process and can be ignored.
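One way to picture the passes over the two queues is the sketch below. It is a simplification under stated assumptions: the second and third passes described above are collapsed into a single loop that keeps merging inside the second queue until nothing changes, plain lists stand in for the queues, and the names `ReadOp`, `contiguous`, `absorb`, and `merge_pass` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReadOp:
    index: int
    length: int


def contiguous(a: ReadOp, b: ReadOp) -> bool:
    """True when one request ends exactly where the other begins."""
    return a.index + a.length == b.index or b.index + b.length == a.index


def absorb(src: ReadOp, dst: ReadOp) -> None:
    """Insert the source request into the destination request."""
    dst.index = min(src.index, dst.index)
    dst.length += src.length


def merge_pass(first: list, second: list) -> None:
    # Pass 1: sources come from the first queue, destinations from the second.
    for src in first:
        dst = next((d for d in second if contiguous(src, d)), None)
        if dst is not None:
            absorb(src, dst)
        else:
            second.append(src)  # not mergeable yet: send to the second queue
    first.clear()

    # Later passes: keep merging inside the second queue until stable.
    changed = True
    while changed:
        changed = False
        for i, src in enumerate(second):
            dst = next((d for j, d in enumerate(second)
                        if j != i and contiguous(src, d)), None)
            if dst is not None:
                absorb(src, dst)
                del second[i]
                changed = True
                break  # the list was modified, so restart the scan


first_queue = [ReadOp(1008, 16), ReadOp(0, 504), ReadOp(504, 504)]
second_queue = []
merge_pass(first_queue, second_queue)
```

With the three fragments arriving out of order, the first pass merges only two of them, and the pass inside the second queue joins the remainder into a single request of length 1024.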
In one embodiment, the merging process specifically includes:
judging whether the source read operation and the destination read operation meet the merging condition or not;
if the merging condition is met, inserting the read request message of the source read operation into the read request message of the destination read operation;
if the merge condition is not satisfied, a source read operation that does not satisfy the merge condition is sent to the second queue.
Not all read operations can be merged; only read operations that satisfy the merge condition can. The merge condition is that the two read request messages to be merged are contiguous. A read request message is described by an index value and a length, and contiguity can be on the left or on the right. For example, given three read request messages A, B, and C: if the index value of A plus the length of A equals the index value of B, then B is said to be left-contiguous with A; if the index value of B plus the length of B equals the index value of C, then B is said to be right-contiguous with C. The merging flow can be executed on two read request messages only if this contiguity condition is satisfied. If it is, the source and destination read operations are merged, that is, the read request message of the source read operation is inserted into the read request message of the destination read operation; if it is not, the source read operation is sent to the second queue to take part in the merging within the second queue.
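The contiguity check can be written down directly from the index-plus-length rule. The sketch below is illustrative only; `ReadRequest` and `try_merge` are hypothetical names, and returning `None` stands in for "send the source read operation to the second queue".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReadRequest:
    index: int   # index value of the read request message
    length: int  # length of the read request message


def try_merge(src: ReadRequest, dst: ReadRequest) -> Optional[ReadRequest]:
    """Return the merged request if src and dst are contiguous, else None."""
    if src.index + src.length == dst.index:
        # src ends where dst begins: src is inserted on the left of dst.
        return ReadRequest(src.index, src.length + dst.length)
    if dst.index + dst.length == src.index:
        # dst ends where src begins: src is inserted on the right of dst.
        return ReadRequest(dst.index, dst.length + src.length)
    return None  # merge condition not satisfied


a = ReadRequest(0, 504)
b = ReadRequest(504, 504)
c = ReadRequest(1008, 16)

ab = try_merge(a, b)     # A is contiguous with B
abc = try_merge(c, ab)   # C is contiguous with the merged A+B
gap = try_merge(a, c)    # A and C are not adjacent
```

Overlapping or separated requests are deliberately left unmerged here, since the text only names contiguity as the merge condition.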
In one embodiment, after the read request message of the source read operation is inserted into the read request message of the destination read operation, the method further includes:
a merge flag is added to both the source read operation and the destination read operation that perform the read request message merge.
Both the merging flow and the read flow need to determine whether a read operation has been merged and act accordingly, so a flag is needed to record this, namely the merge flag. When read request messages are merged, a merge flag is added to both the source read operation and the destination read operation, so that it can later be determined from the flag whether a read operation has been merged.
In one embodiment, after the read request message of the source read operation is inserted into the read request message of the destination read operation, the method further includes:
and recording the combined message record of the source read operation and the destination read operation.
When read request messages are merged, a merge message record of the source and destination read operations is recorded. Merging read request messages improves the efficiency of the overall read process, but when the data has been read and a reply message is returned, each reply must package an original read request message with its corresponding data, where the original read request message is the read request message as it was before the merging flow. With the merge message record, the original read request messages can be recovered and the read data split accordingly, so that each piece of read data corresponds one-to-one to an original read request message; each pair is then packaged and returned as a reply message.
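A possible shape for the merge flag and the merge message record is sketched below. The layout of the record is an assumption (a list of the original `(index, length)` pairs); the patent text does not specify its format, and the names `ReadOp` and `merge_into` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ReadOp:
    index: int
    length: int
    merged: bool = False                        # the merge flag
    record: list = field(default_factory=list)  # assumed merge message record

    def __post_init__(self) -> None:
        # Before any merge, the record holds only the operation's own request.
        if not self.record:
            self.record.append((self.index, self.length))


def merge_into(src: ReadOp, dst: ReadOp) -> None:
    """Insert src's request into dst's; the caller is assumed to have checked contiguity."""
    dst.record = sorted(dst.record + src.record)  # keep originals in index order
    dst.index = min(dst.index, src.index)
    dst.length += src.length
    src.merged = True  # flag both participants
    dst.merged = True


a = ReadOp(0, 504)
b = ReadOp(504, 520)
merge_into(a, b)
```

After the call, `b` covers the whole range while its record still lists the two original requests, which is what the read flow needs in order to split the reply later.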
In one embodiment, after the merging process between the plurality of read operations is performed, the method further includes:
judging whether the size of the read request message can be divided by the basic storage unit or reaches a waiting time threshold;
After the merging flow there are two possible outcomes. In the first, the merged read request message is divisible by the basic storage unit. In the second, the read request message is not divisible by the basic storage unit, either because the merge condition was never satisfied or because merging took place but the result is still not divisible. Read operations that are divisible by the basic storage unit are sent to the read flow to read data. Read operations that are not divisible cannot remain in the second queue indefinitely, so a waiting time threshold is set to keep them from lingering there and hurting read efficiency. The waiting time threshold is generally set to 70 ms to 80 ms, preferably 75 ms, and is measured from the enqueue time.
If the size of the read request message can be divided by the basic storage unit or reaches the waiting time threshold value, continuing to execute the read flow;
After the merging flow, if the size of the read request message is divisible by the basic storage unit, the read operation enters the read flow directly to read the data. There are also read operations that never satisfy the merge condition and are never merged; they have little effect on the efficiency of the overall read process and can be ignored, but they cannot remain in the second queue indefinitely. A waiting time threshold is therefore set: once the time since the enqueue time recorded on entry to the first queue reaches the threshold, the read operation is sent to the read flow regardless of whether it is divisible by the basic storage unit.
If the size of the read request message is not divisible by the basic storage unit and the waiting time threshold has not been reached, the read operation is added back to the second queue.
Because new read operations keep arriving in the first queue, new read operations also keep arriving in the second queue. A read operation that is not divisible by the basic storage unit and has not reached the waiting time threshold may still merge with one of these new operations into a read request message that is divisible by the basic storage unit. Such read operations are therefore sent back to the second queue to continue through the merging flow.
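The decision between entering the read flow and returning to the second queue reduces to a small predicate. The sketch assumes a 1M basic storage unit in bytes and the preferred 75 ms threshold from the text; `ready_for_read` is a hypothetical name, and times are passed in explicitly so the logic stays independent of any particular clock.

```python
UNIT = 1024 * 1024        # assumed basic storage unit of 1M, in bytes
WAIT_THRESHOLD_S = 0.075  # preferred waiting time threshold of 75 ms


def ready_for_read(length: int, enqueue_time: float, now: float,
                   unit: int = UNIT, threshold: float = WAIT_THRESHOLD_S) -> bool:
    """True if the operation should enter the read flow, False if it goes back to the second queue."""
    aligned = length > 0 and length % unit == 0
    timed_out = (now - enqueue_time) >= threshold
    return aligned or timed_out


aligned_now = ready_for_read(UNIT, enqueue_time=0.0, now=0.0)
still_waiting = ready_for_read(504 * 1024, enqueue_time=0.0, now=0.010)
timed_out = ready_for_read(504 * 1024, enqueue_time=0.0, now=0.080)
```

An unaligned request is held back only while it is both unaligned and younger than the threshold, which mirrors the "not divisible and threshold not reached" condition.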
In one embodiment, the reading process specifically includes:
judging whether the read operation has a merging mark, if not, returning a reply message according to the original flow, and ending the read flow; if a merge flag is present, the merge message record is traversed.
The read flow executes the data read according to the read request message. Because read request messages may have been merged, while the read data must be packaged with the original read request messages when it is returned, the read flow first judges whether the read operation has a merge flag and handles the reply accordingly. A read operation that was not merged returns the read data and its read request message according to the original flow, and the read flow ends. For a read operation that was merged, the merge message record is traversed, the read data and the merged read request message are split according to it, and each piece of split data is packaged with its original read request message to form a reply message, which is returned.
In one embodiment, after traversing the merge message record, the method further includes:
splitting the read data and the merged read request message according to the merge message record of the read request message, forming a reply message from the split data and the corresponding read request message, returning the reply message, and ending the read flow.
If a merge flag is present, the merge message record is traversed, and the read data and the merged read request message are split according to that record, so that each original read request message corresponds one-to-one with its portion of the read data. Each read request message is then packaged with its corresponding read data to form a reply message, and the reply message is returned.
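The read flow described above can be illustrated with a minimal sketch. It is not the patented implementation: the `ReadOp` layout, the representation of the merge message record as `(op_id, offset, length)` triples, and the function name `build_replies` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ReadOp:
    """Hypothetical read operation; field names are illustrative only."""
    op_id: int
    offset: int            # start of the (possibly merged) byte range
    length: int            # length of the (possibly merged) byte range
    merged: bool = False   # stands in for the merge flag
    # Assumed form of the merge message record: one
    # (original op_id, original offset, original length) triple per request.
    record: List[Tuple[int, int, int]] = field(default_factory=list)


def build_replies(op: ReadOp, data: bytes) -> List[Tuple[int, bytes]]:
    """Return (original op_id, payload) pairs for one completed read."""
    if not op.merged:
        # No merge flag: reply according to the original flow.
        return [(op.op_id, data)]
    replies = []
    # Merge flag present: traverse the record and split the read data.
    for orig_id, orig_offset, orig_length in op.record:
        start = orig_offset - op.offset
        replies.append((orig_id, data[start:start + orig_length]))
    return replies
```

For example, a merged operation covering bytes 0 to 7 with the record `[(1, 0, 3), (2, 3, 5)]` yields one reply of three bytes for request 1 and one reply of five bytes for request 2.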
Embodiment two:
Referring to Fig. 2, Fig. 2 is a flowchart of the merging flow of the data reading method based on distributed storage according to the present invention.
S100, determining whether the source read operation and the destination read operation satisfy the merging condition;
It is determined whether the read request messages of the two read operations entering the merging flow satisfy the continuous merging condition. If the condition is satisfied, step S200 is performed; if it is not satisfied, the source read operation is sent to the second queue and the merging flow ends.
S200, inserting the read request message of the source read operation into the read request message of the destination read operation;
If the continuous merging condition is satisfied, the read request message of the source read operation is inserted into the read request message of the destination read operation.
S300, adding a merge flag to both the source read operation and the destination read operation whose read request messages were merged;
When the read request messages are merged, a merge flag is added to both the source read operation and the destination read operation, so that it can later be determined from the merge flag whether a read operation has been merged.
S400, recording the merge message record of the source read operation and the destination read operation.
Once the merge message record of the source read operation and the destination read operation has been recorded, the original read request messages can be recovered from it and the read data can be split accordingly, so that each portion of the read data corresponds one-to-one with an original read request message. The read data is then packaged with the corresponding original read request message and returned as a reply message.
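Steps S100 to S400 can be sketched as follows. This is a hedged illustration only: the text does not define the continuous merging condition in detail, so the sketch assumes it means that the two requested byte ranges are directly adjacent. The `ReadOp` fields and the names `can_merge` and `merge` are likewise hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ReadOp:
    """Hypothetical read operation; field names are illustrative only."""
    op_id: int
    offset: int
    length: int
    merged: bool = False   # merge flag (S300)
    # Merge message record (S400): (op_id, offset, length) per original request.
    record: List[Tuple[int, int, int]] = field(default_factory=list)


def can_merge(src: ReadOp, dst: ReadOp) -> bool:
    """S100. Assumed condition: the two byte ranges are directly adjacent."""
    return (src.offset == dst.offset + dst.length
            or dst.offset == src.offset + src.length)


def merge(src: ReadOp, dst: ReadOp) -> bool:
    """Merge src into dst; return False if the condition is not met."""
    if not can_merge(src, dst):
        return False  # the caller would send src to the second queue
    # S400: remember the original requests before changing dst.
    if not dst.record:
        dst.record.append((dst.op_id, dst.offset, dst.length))
    dst.record.extend(src.record or [(src.op_id, src.offset, src.length)])
    # S200: fold the source range into the destination range.
    dst.offset = min(src.offset, dst.offset)
    dst.length += src.length
    # S300: flag both operations as merged.
    src.merged = dst.merged = True
    return True
```

With an assumed basic storage unit of 4096 bytes, merging a 1024-byte request at offset 3072 into a 3072-byte request at offset 0 produces a single 4096-byte request, which is an integer multiple of the unit.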
Embodiment three:
Referring to Fig. 3, Fig. 3 is a second flowchart of the data reading method based on distributed storage according to the present invention.
The data reading method based on distributed storage of the embodiment comprises the following steps:
S10, receiving a read operation carrying a read request message;
A read operation carries a read request message, and the corresponding data is read according to that message. The read operation is therefore received first, and the read request message it carries is read.
S20, adding the received read operation to a first queue, and periodically transferring the read operations in the first queue to a second queue;
When the merging flow first starts, the first queue and the second queue are both empty. Each received read operation is added to the first queue, and the read operations in the first queue are periodically transferred to the second queue, so that the merging flow can be executed between the read operations in the first queue and those in the second queue.
S30, taking a read operation in the first queue as the source read operation and a read operation in the second queue as the destination read operation, and executing the merging flow;
After the transfer is complete, the read operations in the first queue are traversed. Each read operation in the first queue is taken in turn as the source read operation, a read operation in the second queue is taken as the destination read operation, and the merging flow is executed on the pair, that is, on the read request message of the source read operation and the read request message of the destination read operation.
S40, in the second queue, taking a read operation that has not undergone the merging flow as the source read operation and a read operation that has undergone the merging flow as the destination read operation, and executing the merging flow;
The read operations in the second queue are traversed: a read operation that has already undergone the merging flow is taken as the destination read operation and one that has not is taken as the source read operation. Mergeable read operations in the second queue thus go through the merging flow again, so that most merged read request messages are evenly divisible by the basic storage unit.
S50, executing the merging flow among the remaining read operations in the second queue;
The merging flow is executed among the remaining read operations in the second queue, so that every read operation that can be merged is merged. Through these passes, most mergeable read operations are merged and most read request messages become evenly divisible by the basic storage unit, which improves data reading efficiency.
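The three merging passes of steps S30 to S50 can be sketched as below. This is an illustrative reading of the steps, not the patented implementation: the adjacency-based merging condition, the `ReadOp` fields, and the helper names `try_merge`, `absorb`, and `run_merge_passes` are assumptions, and the merge message record is omitted for brevity.

```python
from dataclasses import dataclass
from typing import List


@dataclass(eq=False)  # compare by identity so list removal targets this object
class ReadOp:
    """Hypothetical read operation; field names are illustrative only."""
    offset: int
    length: int
    merged: bool = False


def try_merge(src: ReadOp, dst: ReadOp) -> bool:
    """Assumed merging condition: the two byte ranges are directly adjacent."""
    if src.offset == dst.offset + dst.length:
        pass                      # src follows dst; dst keeps its offset
    elif dst.offset == src.offset + src.length:
        dst.offset = src.offset   # src precedes dst
    else:
        return False
    dst.length += src.length
    src.merged = dst.merged = True
    return True


def absorb(src: ReadOp, candidates: List[ReadOp]) -> bool:
    """Merge src into the first candidate that accepts it."""
    return any(try_merge(src, dst) for dst in candidates if dst is not src)


def run_merge_passes(first: List[ReadOp], second: List[ReadOp]) -> List[ReadOp]:
    # S30: first-queue operations are sources, second-queue ones destinations.
    while first:
        src = first.pop(0)
        if not absorb(src, second):
            second.append(src)    # not mergeable yet: wait in the second queue
    # S40: unmerged operations are sources, merged ones destinations.
    for src in [op for op in second if not op.merged]:
        if absorb(src, [op for op in second if op.merged]):
            second.remove(src)
    # S50: merge among the remaining (still unmerged) operations.
    for src in [op for op in second if not op.merged]:
        if src.merged:
            continue              # became part of a merge earlier in this pass
        if absorb(src, [op for op in second if not op.merged]):
            second.remove(src)
    return second
```

In this sketch an absorbed source simply disappears from the queues because its range now lives inside the destination; a fuller version would also keep the merge message record so the reply can be split later.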
S60, determining whether the size of the read request message is evenly divisible by the basic storage unit or whether the waiting time threshold has been reached; if so, continuing to execute the read flow; if the size of the read request message is not divisible by the basic storage unit and the waiting time threshold has not been reached, adding that read operation to the second queue.
After the merging flow is executed, it is determined whether the size of the read request message is evenly divisible by the basic storage unit or whether the waiting time threshold has been reached. If the size is divisible by the basic storage unit, the read operation enters the read flow of step S70 directly. If the read operation, whose enqueue time was recorded when it entered the first queue, has reached the waiting time threshold, it is sent to the read flow of step S70 regardless of whether it is divisible by the basic storage unit. Read operations that cannot be merged have little effect on the efficiency of the overall data reading flow and can be disregarded, but they must not remain in the second queue indefinitely, since that would degrade the efficiency of the overall flow. Because new read operations continue to be added to the first queue and, in turn, to the second queue, a read operation that is not divisible by the basic storage unit and has not reached the waiting time threshold may still be merged with a new operation so that the merged read request message becomes divisible by the basic storage unit. Therefore, if the size of the read request message is not divisible by the basic storage unit and the waiting time threshold has not been reached, the read operation is added to the second queue and the flow returns to step S30.
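The decision in step S60 reduces to a single predicate: a read operation proceeds to the read flow if its size is a multiple of the basic storage unit or if it has waited long enough. The sketch below illustrates this under assumed values; the 4096-byte unit, the 5 ms threshold, and all names are hypothetical and do not come from the text.

```python
from dataclasses import dataclass
from typing import List

BASIC_UNIT = 4096        # assumed basic storage unit size in bytes
WAIT_THRESHOLD = 0.005   # assumed waiting time threshold in seconds


@dataclass
class ReadOp:
    """Hypothetical read operation; field names are illustrative only."""
    length: int          # size of the (possibly merged) read request
    enqueue_time: float  # recorded when the operation entered the first queue


def ready_for_read(op: ReadOp, now: float) -> bool:
    """True if op is unit-aligned or has reached the waiting time threshold."""
    aligned = op.length % BASIC_UNIT == 0
    timed_out = now - op.enqueue_time >= WAIT_THRESHOLD
    return aligned or timed_out


def dispatch(ops: List[ReadOp], second_queue: List[ReadOp],
             now: float) -> List[ReadOp]:
    """Return the operations that enter the read flow; requeue the rest."""
    to_read = []
    for op in ops:
        if ready_for_read(op, now):
            to_read.append(op)       # continue to the read flow (S70)
        else:
            second_queue.append(op)  # wait for another merging pass (S30)
    return to_read
```

In practice `now` would come from a monotonic clock such as `time.monotonic()`; it is passed in explicitly here so that the behavior is easy to check.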
S70, determining whether the read operation carries a merge flag; if not, sending a reply message according to the original flow and ending the read flow; if a merge flag is present, traversing the read request message in the read operation.
The read flow performs the data read that corresponds to the read request message. Because read request messages are combined during the merging flow, while the data read must be returned together with the original read request message, the read flow first determines whether the read operation carries a merge flag and builds the reply accordingly. A read operation that did not go through the merging flow returns the read data and the read request message directly, according to the original flow. For a read operation that did go through the merging flow, the read data and the merged read request message are split according to the merge message record, the split data is packaged with the corresponding original read request message to form a reply message, and the reply message is returned. In other words, if no merge flag is present, a reply message is returned according to the original flow and the read flow ends; if a merge flag is present, the process advances to step S80.
S80, if a merge flag is present, traversing the merge message record and splitting the read data and the merged read request message according to that record, so that each original read request message corresponds one-to-one with its portion of the read data; packaging each read request message with its corresponding read data to form a reply message, returning the reply message, and ending the read flow.
Embodiment four:
Referring to Fig. 4, Fig. 4 is a system configuration diagram of the data reading system based on distributed storage according to the present invention.
The data reading system based on distributed storage of the present embodiment includes:
An object storage module, which receives read operations from the client module;
The client sends a read operation carrying a read request message to the object storage module, which receives it so that the merging flow can be executed on it.
An operation merging module, used for executing the merging flow on read operations;
Executing the merging flow on read operations means executing it on their read request messages. The operation merging module performs the merging flow on the read request messages so that, for most of them, the merged size is an integer multiple of the size of the basic storage unit. The read request message is then stripe-aligned, which improves data reading efficiency and read performance.
An integer division judging module, used for determining whether the size of the read request message of a read operation is evenly divisible by the basic storage unit;
After the merging flow is executed, the integer division judging module determines whether the size of the read request message is evenly divisible by the basic storage unit; read operations that are divisible enter the read flow directly to perform the data read.
A data reading module, used for reading the data in the object storage module.
A read operation that has undergone the merging flow enters the read flow, and the data reading module reads the corresponding data according to the read request message to complete the data read.
In one embodiment, the data reading system further comprises:
An operation transfer module, used for transferring read operations between the queues;
When a read operation is received, the operation transfer module adds it to the first queue and periodically transfers the read operations in the first queue to the second queue, so that the merging flow can be executed between the read operations in the first queue and those in the second queue.
A time recording module, used for recording the enqueue time of each read operation;
After each read operation is added to the first queue, the time recording module records its enqueue time, so that it can later be determined whether the read operation has reached the waiting time threshold. A read operation that has reached the waiting time threshold enters the read flow; one that has not is sent to the second queue.
A condition judging module, used for determining whether the merging condition of the read operations is satisfied;
Before the merging flow is executed, the condition judging module first determines whether the two read operations satisfy the merging condition. If they do, the source read operation and the destination read operation are merged, that is, the read request message of the source read operation is inserted into the read request message of the destination read operation. If they do not, the source read operation is sent to the second queue, where it takes part in the merging of the read operations of the second queue.
A message recording module, used for recording the merge message record of the read operations.
Once the message recording module has recorded the merge message record of the source read operation and the destination read operation, the original read request messages can be recovered from it and the read data can be split accordingly, so that each portion of the read data corresponds one-to-one with an original read request message. The read data is then packaged with the corresponding original read request message and returned as the reply message.
In one embodiment, the data reading system further comprises:
A time judging module, used for determining whether a read operation has reached the waiting time threshold;
After the merging flow is executed, some read operations never satisfy the merging condition and therefore never go through the merging flow. These read operations have little effect on the efficiency of the overall data reading flow and can be disregarded, but they cannot remain in the second queue indefinitely, so a waiting time threshold is set. The time judging module checks the enqueue time recorded when each read operation entered the first queue; a read operation that has reached the waiting time threshold is sent to the read flow for data reading even if it is not evenly divisible by the basic storage unit.
A merge flag module, used for adding merge flags to the read operations that take part in the merging flow;
When read request messages are merged, the merge flag module adds a merge flag to both the source read operation and the destination read operation, so that it can later be determined from the merge flag whether a read operation has been merged.
A merge judging module, used for determining whether a read operation has been merged;
When the read flow is executed, the merge judging module first determines whether the read operation carries a merge flag. If no merge flag is present, a reply message is returned according to the original flow and the read flow ends; if a merge flag is present, the read request message in the read operation is traversed so that the read data can be split.
A message reply module, used for returning the reply message;
The read data and the merged read request message are split according to the merge message record of the read request message, and the split data and the corresponding read request message form a reply message. The message reply module returns the reply message, and the read flow ends.
A client module, used for sending read operations to the object storage module.
The client initiates a read operation and sends the read operation carrying the read request message to the object storage module.
Embodiment five:
The present embodiment provides a computer-readable storage medium storing a program which, when executed by a processor, causes the processor to execute the steps of the data reading method based on distributed storage in the above-described embodiments.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as methods, systems, or computer program products. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Note that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the invention, which is set forth in the following claims.
Claims (9)
1. A data reading method based on distributed storage, the method comprising the steps of:
receiving a plurality of read operations each carrying a read request message;
executing a merging flow among the plurality of read operations, namely executing the merging flow on the read request messages of the read operations, so that the size of a read request message after the merging flow is an integer multiple of the size of the basic storage unit;
causing the read operations that have undergone the merging flow to enter a read flow, and performing the data read operation;
wherein executing the merging flow among the plurality of read operations specifically comprises:
taking a read operation in a first queue as a source read operation and a read operation in a second queue as a destination read operation, and executing the merging flow, wherein new read operations can be continuously added to the first queue;
in the second queue, taking a read operation that has not undergone the merging flow as the source read operation and a read operation that has undergone the merging flow as the destination read operation, and executing the merging flow;
executing the merging flow among the remaining read operations in the second queue;
and wherein, after the merging flow among the plurality of read operations is executed, the method further comprises:
determining whether the size of the read request message is evenly divisible by the basic storage unit or whether a waiting time threshold has been reached;
if the size of the read request message is evenly divisible by the basic storage unit or the waiting time threshold has been reached, continuing to execute the read flow;
if the size of the read request message is not evenly divisible by the basic storage unit and the waiting time threshold has not been reached, adding that read operation to the second queue.
2. The data reading method based on distributed storage according to claim 1, further comprising, before executing the merging flow among the plurality of read operations:
adding the received read operations to the first queue, and periodically transferring the read operations in the first queue to the second queue.
3. The data reading method based on distributed storage according to claim 1, wherein the merging flow specifically comprises:
determining whether the source read operation and the destination read operation satisfy the merging condition;
if the merging condition is satisfied, inserting the read request message of the source read operation into the read request message of the destination read operation;
if the merging condition is not satisfied, sending the source read operation that does not satisfy the merging condition to the second queue.
4. The data reading method based on distributed storage according to claim 3, wherein after inserting the read request message of the source read operation into the read request message of the destination read operation, the method further comprises:
adding a merge flag to both the source read operation and the destination read operation whose read request messages were merged.
5. The data reading method based on distributed storage according to claim 3, wherein after inserting the read request message of the source read operation into the read request message of the destination read operation, the method further comprises:
recording the merge message record of the source read operation and the destination read operation.
6. The data reading method based on distributed storage according to claim 1, wherein the read flow specifically comprises:
determining whether the read operation carries a merge flag; if not, returning a reply message according to the original flow and ending the read flow; if a merge flag is present, traversing the merge message record.
7. The data reading method based on distributed storage according to claim 6, wherein after traversing the merge message record, the method further comprises:
splitting the read data and the merged read request message according to the merge message record of the read request message, forming a reply message from the split data and the corresponding read request message, returning the reply message, and ending the read flow.
8. A data reading system based on distributed storage, characterized in that the system comprises:
an object storage module, which receives read operations from a client module;
an operation merging module, used for executing a merging flow on the read operations;
an integer division judging module, used for determining whether the size of the read request message of a read operation is evenly divisible by the basic storage unit;
a data reading module, used for reading the data in the object storage module;
wherein executing the merging flow among the plurality of read operations specifically comprises:
taking a read operation in a first queue as a source read operation and a read operation in a second queue as a destination read operation, and executing the merging flow, wherein new read operations can be continuously added to the first queue;
in the second queue, taking a read operation that has not undergone the merging flow as the source read operation and a read operation that has undergone the merging flow as the destination read operation, and executing the merging flow;
executing the merging flow among the remaining read operations in the second queue;
and wherein, after the merging flow among the plurality of read operations is executed, the following is further performed:
determining whether the size of the read request message is evenly divisible by the basic storage unit or whether a waiting time threshold has been reached;
if the size of the read request message is evenly divisible by the basic storage unit or the waiting time threshold has been reached, continuing to execute the read flow;
if the size of the read request message is not evenly divisible by the basic storage unit and the waiting time threshold has not been reached, adding that read operation to the second queue.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a program which, when executed by a processor, causes the processor to perform the steps of the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210194691.8A CN114564154B (en) | 2022-03-01 | 2022-03-01 | Data reading method based on distributed storage |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210194691.8A CN114564154B (en) | 2022-03-01 | 2022-03-01 | Data reading method based on distributed storage |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114564154A (en) | 2022-05-31 |
CN114564154B (en) | 2023-08-18 |
Family
ID=81715077
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210194691.8A Active CN114564154B (en) | 2022-03-01 | 2022-03-01 | Data reading method based on distributed storage |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114564154B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103425439A (en) * | 2013-07-16 | 2013-12-04 | 记忆科技(深圳)有限公司 | Method for reading and writing solid-state disk and solid-state disk thereof |
KR20170095524A (en) * | 2016-02-15 | 2017-08-23 | 에스케이하이닉스 주식회사 | Memory system and operation method thereof |
CN109976679A (en) * | 2019-04-11 | 2019-07-05 | 苏州浪潮智能科技有限公司 | A kind of distributed type assemblies volume pre-head method, system, equipment and computer media |
CN111881096A (en) * | 2020-07-24 | 2020-11-03 | 北京浪潮数据技术有限公司 | File reading method, device, equipment and storage medium |
-
2022
- 2022-03-01 CN CN202210194691.8A patent/CN114564154B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN114564154A (en) | 2022-05-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
TW202004584A (en) | Block chain-based data migration method and device | |
CN109493223B (en) | Accounting method and device | |
CN105824846B (en) | Data migration method and device | |
CN105827678B (en) | Communication means and node under a kind of framework based on High Availabitity | |
JP6987340B2 (en) | Database data change request processing method and equipment | |
CN104346373A (en) | Partition log queue synchronization management method and device | |
JP2023541298A (en) | Transaction processing methods, systems, devices, equipment, and programs | |
CN111708738A (en) | Method and system for realizing data inter-access between hdfs of hadoop file system and s3 of object storage | |
US20110131288A1 (en) | Load-Balancing In Replication Engine of Directory Server | |
CN107992358B (en) | Asynchronous IO execution method and system suitable for extra-core image processing system | |
CN107277022B (en) | Process marking method and device | |
CN114006946B (en) | Method, device, equipment and storage medium for processing homogeneous resource request | |
CN102724301B (en) | Cloud database system and method and equipment for reading and writing cloud data | |
CN116701387A (en) | Data segmentation writing method, data reading method and device | |
CN113297159B (en) | Data storage method and device | |
CN114564154B (en) | Data reading method based on distributed storage | |
CN112347080B (en) | Data migration method and related device | |
CN115470235A (en) | Data processing method, device and equipment | |
CN112596669A (en) | Data processing method and device based on distributed storage | |
CN117472534A (en) | Distributed asset packaging method, cluster deployment server and storage medium | |
CN111209263A (en) | Data storage method, device, equipment and storage medium | |
CN112631994A (en) | Data migration method and system | |
CN102163164B (en) | Processing method and processor for critical data in shared memory | |
CN109791541B (en) | Log serial number generation method and device and readable storage medium | |
CN114785662B (en) | Storage management method, device, equipment and machine-readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |