CN111190541A - Flow control method of storage system and computer readable storage medium - Google Patents

Publication number
CN111190541A
Authority
CN
China
Prior art keywords
time
storage system
processed
flow control
control method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911367540.2A
Other languages
Chinese (zh)
Other versions
CN111190541B (en)
Inventor
张廷全
纪志祥
沈海嘉
吕方川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Zhongke Shuguang Storage Technology Co Ltd
Original Assignee
Tianjin Zhongke Shuguang Storage Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Zhongke Shuguang Storage Technology Co Ltd
Priority to CN201911367540.2A
Publication of CN111190541A
Application granted
Publication of CN111190541B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a flow control method for a storage system and a computer-readable storage medium. The method comprises: splitting the io of a distributed object storage system into objects of different types, and dividing the objects into groups according to their target nodes; establishing a plurality of time queues for each type of object in each group; and selecting, according to weights, the type of object to be processed, checking the time queues of that type in priority order for an object to be processed, and processing the object if one exists. With this technical solution, the invention can at least keep the io processing delay uniform.

Description

Flow control method of storage system and computer readable storage medium
Technical Field
The present invention relates to the field of distributed storage technologies, and in particular, to a flow control method for a storage system and a computer-readable storage medium.
Background
In a distributed object storage system, an io (input/output) request is split into a plurality of objects by what is referred to here as the distribution module, and each object is sent to its target storage node, which is responsible for reading and writing data on disk. The processing capacity of a disk is limited: when objects are sent faster than the target node can process them, the excess objects are returned and the distribution module must retry them, so a retried object generally consumes more time. Under the dynamic WRR (weighted round robin) flow control algorithm, different types of io (such as data or metadata reads and writes) are split into different types of objects, each type has its own buffer queue, the processing weight of each queue is adjusted according to the queue depth, and the queues are then scheduled by the WRR algorithm. However, this approach has a drawback: an old object that has already consumed more time may be queued behind many new objects and is processed only after them. As a result, the processing delay of objects fluctuates widely, with some objects completing quickly and others slowly, which hinders throughput.
Disclosure of Invention
In view of the above problems in the related art, the present invention provides a flow control method for a storage system and a computer-readable storage medium, which can at least ensure uniform io processing delay.
The technical scheme of the invention is realized as follows:
According to an aspect of the present invention, there is provided a flow control method for a distributed storage system, including:
splitting the io of the distributed object storage system into different types of objects, and dividing the objects into different groups according to target nodes;
respectively establishing a plurality of time queues for each type of object in each group;
and selecting, according to the weights, the type of object to be processed, checking each time queue of that type in priority order for an object to be processed, and processing the object if one exists.
In one embodiment, the time range corresponding to each of the plurality of time queues is set according to the processing time of the objects.
In one embodiment, the number of objects of the same type in each of the time queues is the same during each cycle.
In one embodiment, the greater the number of the plurality of time queues, the smaller the corresponding time of each of the time queues.
In one embodiment, the flow control method of the distributed storage system further includes:
determining whether any object is queued to be sent on the corresponding target;
if no object is queued, the current thread stops sending objects.
In one embodiment, the object to be processed is taken from the head of the corresponding time queue for processing; after the object has been processed, if it needs to be delivered again, it is added at the tail of the corresponding time queue.
According to another aspect of the present invention, there is provided a computer readable storage medium storing a program executable by an electronic device, the program, when run on the electronic device, causing the electronic device to perform the steps of the above-described method.
According to this technical solution, a dynamic WRR ordered-queuing flow control mode is adopted in which each type of object uses time queues, so that objects are arranged in order by time segment. This ensures, as far as possible, that the io that arrived first is processed first, which keeps the io processing delay uniform.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of a flow control method of a distributed storage system according to an embodiment of the present invention;
FIG. 2 is a queuing schematic of a flow control method of a distributed storage system according to an embodiment of the present invention;
FIG. 3 is a flowchart of a flow control method of a distributed storage system according to another embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present invention.
FIG. 1 is a flowchart of a flow control method of a distributed storage system according to an embodiment of the present invention. As shown in FIG. 1, a flow control method of a distributed storage system according to an embodiment of the present invention may include the following steps:
S11, splitting the io of the distributed object storage system into different types of objects, and dividing the objects into different groups according to the target nodes.
At this step, the different types of io may be split by the distribution module into different types of objects, which are queued at two levels: the first level is the target node and the second level is the object type. The objects are first divided into groups according to their target nodes, and the objects in each group are then processed by the subsequent steps.
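The two-level grouping of step S11 can be sketched as a nested mapping. This is only an illustration under stated assumptions: the `Obj` record, its field names, and the node and type labels are hypothetical and do not come from the patent.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Obj:
    """Hypothetical object record produced by splitting an io."""
    obj_id: int
    target_node: str  # first grouping level
    obj_type: str     # second grouping level


def group_objects(objects):
    """Return {target_node: {obj_type: [objects in arrival order]}}."""
    groups = defaultdict(lambda: defaultdict(list))
    for obj in objects:
        groups[obj.target_node][obj.obj_type].append(obj)
    return groups


objs = [
    Obj(1, "node-a", "data_write"),
    Obj(2, "node-b", "metadata_read"),
    Obj(3, "node-a", "data_write"),
    Obj(4, "node-a", "metadata_read"),
]
groups = group_objects(objs)
```

Each inner list would later be replaced by the set of time queues that step S12 establishes for that object type.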
S12, a plurality of time queues are established for each type of object in each group.
Specifically, the distribution module can receive N types of io (data, metadata read, write, etc.) from the front-end module and accordingly decomposes them into N types of objects. Assuming that there is only one target node, M time queues are established for each type of object. FIG. 2 illustrates the case N = 3 (3 types of objects) and M = 3. The processing weights of the 3 types of objects are W1 to W3. A weight is fixed within one processing period; before the next period starts, the processing weight of each type is recalculated from the priority of the type, the number of objects of that type waiting to be processed (the sum over its M time queues), the log resource occupancy at that time, and so on, together with the influence factor of each of these inputs. In one embodiment, a deficit weighted round robin (D_WRR) algorithm is employed. During processing, if the target device returns a processing failure for an object and the distribution module decides to retry it, the object is placed into one of the 3 time queues according to the time it has already consumed. Take queue 1011 in FIG. 2, the highest-priority time queue of the T1 object type (obj_type): if the 3 time queues correspond to consumed times of 0 to timeout/3, timeout/3 to 2 × timeout/3, and 2 × timeout/3 to timeout, then queue 1011 holds retried objects whose consumed time falls in the range 2 × timeout/3 to timeout. After step S12, the method proceeds to step S13.
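As a rough illustration of how a retried object might be mapped to a time queue, the sketch below divides the timeout into equal slices, as in the M = 3 example. The function name and the convention that index 0 denotes the highest-priority queue (the oldest objects) are assumptions made for this example.

```python
def time_queue_index(elapsed, timeout, num_queues):
    """Map an object's consumed time to a time-queue index.

    Index 0 is the highest-priority queue (largest consumed time);
    index num_queues - 1 holds the newest objects. Equal-width slices
    are assumed here.
    """
    if num_queues < 1 or timeout <= 0:
        raise ValueError("num_queues must be >= 1 and timeout must be > 0")
    elapsed = min(max(elapsed, 0), timeout)  # clamp into [0, timeout]
    # slot 0 covers [0, timeout / num_queues), the youngest objects
    slot = min(int(elapsed * num_queues / timeout), num_queues - 1)
    return num_queues - 1 - slot


# With timeout = 30 s and 3 queues, an object that has already consumed
# 25 s lands in the highest-priority queue, a fresh one in the lowest.
oldest = time_queue_index(25, 30, 3)
newest = time_queue_index(5, 30, 3)
```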
In some embodiments, the weight of each type may be derived from the fixed priority of the type (Pi), its queue depth (COUNTi), the log resource occupancy (RESOURCEi), and the influence factors (a, b, c) of these three inputs: Wi = a × Pi + b × COUNTi + c × RESOURCEi.
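The weight formula can be evaluated once per processing period, for example as in the following sketch. The function names, the shape of the `stats` mapping, and the numeric values are illustrative assumptions; the patent does not give concrete influence factors.

```python
def processing_weight(priority, queue_depth, resource_usage, a, b, c):
    """Wi = a * Pi + b * COUNTi + c * RESOURCEi for one object type."""
    return a * priority + b * queue_depth + c * resource_usage


def period_weights(stats, a, b, c):
    """Compute the weights used for the next period.

    stats maps an object type to (Pi, COUNTi, RESOURCEi), where COUNTi
    is the total number of objects across that type's time queues.
    """
    return {
        obj_type: processing_weight(p, count, res, a, b, c)
        for obj_type, (p, count, res) in stats.items()
    }


weights = period_weights(
    {"T1": (3, 10, 4), "T2": (1, 2, 0)},  # hypothetical measurements
    a=2.0, b=1.0, c=0.5,
)
```

The weights are computed before a period starts and then held fixed for that period, as the description states.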
S13, selecting the type of object to be processed according to the weights, checking each time queue of that type in priority order for an object to be processed, and processing the object if one exists. That is, when a given type of object is processed, the highest-priority time queue is checked first, and any object found there is processed. For example, if the weight calculation selects type T1 as the type to be processed, queue 1011 is checked first; if it holds an object, the object is taken out of the queue and processed. Queues 1011 to 1013 are checked in this order.
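Step S13 can be sketched as a type selection followed by a priority-ordered scan of that type's time queues. The patent does not spell out the selection procedure, so the sketch substitutes smooth weighted round robin, one common WRR variant; the class name, method names, and the convention that index 0 is the highest-priority queue are likewise assumptions.

```python
from collections import deque


class TypeScheduler:
    """Pick an object type by weight, then scan its time queues by priority."""

    def __init__(self, weights, num_queues):
        self.weights = dict(weights)
        self.credit = {t: 0 for t in self.weights}
        # queues[t][0] is the highest-priority (oldest) time queue of type t
        self.queues = {t: [deque() for _ in range(num_queues)] for t in self.weights}

    def enqueue(self, obj_type, queue_index, obj):
        self.queues[obj_type][queue_index].append(obj)  # add at the tail

    def _pick_type(self):
        # Smooth WRR restricted to types that currently have work queued.
        candidates = [t for t in self.weights if any(self.queues[t])]
        if not candidates:
            return None
        for t in candidates:
            self.credit[t] += self.weights[t]
        chosen = max(candidates, key=lambda t: self.credit[t])
        self.credit[chosen] -= sum(self.weights[t] for t in candidates)
        return chosen

    def next_object(self):
        obj_type = self._pick_type()
        if obj_type is None:
            return None
        for queue in self.queues[obj_type]:  # highest priority first
            if queue:
                return obj_type, queue.popleft()  # take from the head
        return None


sched = TypeScheduler({"T1": 2, "T2": 1}, num_queues=3)
sched.enqueue("T1", 2, "new")  # recently arrived object
sched.enqueue("T1", 0, "old")  # retried object that has consumed the most time
sched.enqueue("T2", 1, "x")
order = [sched.next_object() for _ in range(4)]
```

In this run the old T1 object is served before the new one even though it was enqueued later, which is the ordering effect the time queues are meant to provide.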
It should be appreciated that the number of object types, number of established time queues, etc. shown in FIG. 2 are merely exemplary, and in other embodiments, any other suitable configuration may be made.
According to this technical solution, a dynamic WRR ordered-queuing flow control mode is adopted in which each type of object uses time queues, so that objects are arranged in order by time segment. This ensures, as far as possible, that the io that arrived first is processed first, which keeps the io processing delay uniform.
In addition, the allocation of the time queues can be set with reference to the density distribution of the objects. For example, if the processing timeout of an object is 40 seconds and 80% of the objects complete within 10 seconds, then with 5 time queues per object type, 4 time queues cover 0-10 seconds and 1 time queue covers 10-40 seconds. The principle of the arrangement is that, in each cycle, the time queues of one type hold roughly the same number of objects. The number of time queues also affects the uniformity of the io processing time: the larger the number, the smaller the time range corresponding to each time queue, and the more uniform the processing time of io.
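One way to obtain time ranges that hold roughly equal numbers of objects is to cut a sample of observed processing times at evenly spaced ranks. The sketch below is an assumption about how such boundaries could be derived; the patent states only the equal-count principle, and the sample values and function names are hypothetical.

```python
from bisect import bisect_right


def queue_boundaries(samples, num_queues, timeout):
    """Return num_queues + 1 boundaries so each slice holds about the same
    number of samples. Slice i covers [b[i], b[i + 1])."""
    if num_queues < 1 or not samples:
        raise ValueError("need at least one queue and one sample")
    ordered = sorted(min(s, timeout) for s in samples)
    n = len(ordered)
    # Each cut is the first sample that belongs to the next slice.
    cuts = [ordered[(k * n) // num_queues] for k in range(1, num_queues)]
    return [0.0] + cuts + [float(timeout)]


def slice_of(elapsed, boundaries):
    """Index of the slice containing elapsed; the last slice is closed."""
    i = bisect_right(boundaries, elapsed) - 1
    return min(max(i, 0), len(boundaries) - 2)


# Hypothetical processing times in seconds: 80% finish within 10 s.
samples = [1, 2, 3, 4, 5, 6, 7, 8, 12, 30]
bounds = queue_boundaries(samples, num_queues=5, timeout=40)
counts = [0] * 5
for s in samples:
    counts[slice_of(s, bounds)] += 1
```

With these samples, four of the five slices end up covering the dense short-latency region and one wide slice covers the tail, which mirrors the 4-plus-1 split in the example above.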
In addition, the distribution module is an intermediate link on the io path. When objects already exist in the module's buffer queue for the corresponding object type, an object decomposed from io by the front-end module also enters the corresponding time queue according to the rule above. After the target node executes an object, it returns the result; if execution failed and objects exist in the module's buffer queue, the object is delivered to the module's queue again. When a front-end or back-end module thread delivers an object to a time queue of the module, it also triggers the sending of io to the back end. Consequently, the front-end and back-end threads and the module thread contend for locks when enqueuing and dequeuing, which hurts efficiency even though the M time queues already spread out part of the contention. Lock contention can be reduced further as follows: since objects are added at the tail of a queue and taken out at its head, each queue can be given two locks, one protecting enqueue and one protecting dequeue, which reduces lock granularity and improves performance.
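The two-lock idea can be illustrated with a linked queue that keeps a dummy node at the head, so that enqueue only touches the tail and dequeue only touches the head. This follows the well-known two-lock queue design and is not taken from the patent; the class name is hypothetical, and `None` is reserved here to mean "queue empty".

```python
import threading


class _Node:
    __slots__ = ("value", "next")

    def __init__(self, value=None):
        self.value = value
        self.next = None


class TwoLockQueue:
    """FIFO queue with separate locks for the enqueue and dequeue sides."""

    def __init__(self):
        dummy = _Node()
        self._head = dummy  # always a dummy node; the first item is head.next
        self._tail = dummy
        self._head_lock = threading.Lock()  # protects dequeue
        self._tail_lock = threading.Lock()  # protects enqueue

    def enqueue(self, value):
        node = _Node(value)
        with self._tail_lock:
            self._tail.next = node  # link at the tail
            self._tail = node

    def dequeue(self):
        """Return the oldest value, or None if the queue is empty."""
        with self._head_lock:
            first = self._head.next
            if first is None:
                return None
            value = first.value
            first.value = None  # first becomes the new dummy node
            self._head = first
            return value


q = TwoLockQueue()
for i in range(3):
    q.enqueue(i)
drained = [q.dequeue() for _ in range(4)]
```

Because producers and the consumer take different locks, a thread delivering an object never waits on a thread that is taking one out, apart from the brief interaction through the dummy node.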
FIG. 3 is a flowchart of a flow control method of a distributed storage system according to another embodiment of the present invention. The embodiment shown in FIG. 3 describes the processing flow once the target node has started to apply back pressure to the module. At step S2011, the object is added, according to its consumed time, to the time queue of the corresponding type for the corresponding target. At step S2012, it is determined whether the number of objects that have been sent to the target node but have not yet received a response exceeds the number that can be outstanding on the target node at the same time. This limit can be derived from two parameters: (1) the target processing capacity estimated by the distribution module; and (2) the processing capacity value that the target returns in its responses to the distribution module. Then, at step S2013, the type of object to send this time is selected according to the weights. At step S2014, an object is taken from a time queue of that type and sent. At step S2015, when no obj (object) is queued for sending on the corresponding target, the current thread stops sending. In addition, to prevent a thread from sending obj continuously and being unable to handle other tasks when many obj are queued, an upper limit is set on the number of obj that one send trigger may send; when the limit is reached, sending stops and resumes the next time a thread triggers obj sending.
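The stopping conditions of steps S2012 to S2015 can be sketched as a single send loop. For brevity, one deque stands in for the weighted type selection and time-queue lookup of steps S2013 and S2014; the function name, the parameter names, and the way the in-flight limit is supplied are assumptions, since the patent does not say how its two capacity parameters are combined.

```python
from collections import deque


def trigger_send(pending, in_flight, window, max_per_trigger, send):
    """Send queued objects for one target until a stop condition is hit.

    pending         -- deque of objects ready to send (head = next to send)
    in_flight       -- objects already sent but not yet answered
    window          -- how many objects may be outstanding on the target
    max_per_trigger -- cap on objects sent by one trigger, so the thread
                       can return to its other tasks
    send            -- callable that transmits one object

    Returns (new_in_flight, number_sent).
    """
    sent = 0
    while pending and in_flight < window and sent < max_per_trigger:
        send(pending.popleft())
        in_flight += 1
        sent += 1
    return in_flight, sent


wire = []
pending = deque(range(10))
# The target allows 5 outstanding objects and 2 are already in flight.
in_flight, sent = trigger_send(pending, in_flight=2, window=5,
                               max_per_trigger=8, send=wire.append)
```

Objects left in `pending` are picked up by the next trigger, for example when a response arrives and frees a slot in the window.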
The flow control method provided by the invention was verified with purely random 4k writes. Compared with plain dynamic WRR flow control, the dynamic WRR ordered-queuing flow control provided by the invention reduces the average io delay and improves IOPS (input/output operations per second, a performance measure for computer storage devices).
In summary, for a storage system in which object processing can fail on the target node, the invention reduces the processing delay of each object by adopting a dynamic WRR ordered-queuing flow control mode in which each type of object uses time queues to arrange objects in order by time segment; the more time queues there are, the more uniform the delay of the objects. Correspondingly, the allocation of time queues follows the distribution of objects along the time axis: time periods that contain more objects are given more time queues.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (7)

1. A flow control method for a distributed storage system, comprising:
splitting the io of the distributed object storage system into different types of objects, and dividing the objects into different groups according to target nodes;
respectively establishing a plurality of time queues for each type of object in each group;
and calculating an object of one type to be processed according to the weight, sequentially detecting whether the object to be processed exists in each time queue according to the priority order of the plurality of time queues, and processing the object to be processed if the object to be processed exists.
2. The flow control method for the distributed storage system according to claim 1, wherein the corresponding time of each of the plurality of time queues is set according to a processing time of the object.
3. The flow control method for the distributed storage system according to claim 2, wherein the number of objects of the same type in each time queue is the same in each cycle.
4. The flow control method for the distributed storage system according to claim 2, wherein the greater the number of the plurality of time queues, the smaller the corresponding time of each time queue.
5. The flow control method for the distributed storage system according to claim 1, further comprising:
judging whether no object is queued to be sent on the corresponding target,
if no object is queued, the current thread no longer sends the object.
6. The flow control method for a distributed storage system according to claim 1, wherein,
taking out the object to be processed from the head of the corresponding time queue for processing;
and after the object to be processed is processed, adding the object at the tail part of the corresponding time queue when the object needs to be delivered again.
7. A computer-readable storage medium, storing a program executable by an electronic device, which when run on the electronic device causes the electronic device to perform the steps of the method of any one of claims 1 to 6.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911367540.2A CN111190541B (en) 2019-12-26 2019-12-26 Flow control method of storage system and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911367540.2A CN111190541B (en) 2019-12-26 2019-12-26 Flow control method of storage system and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN111190541A (en) 2020-05-22
CN111190541B (en) 2024-04-12

Family

ID=70708015

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911367540.2A Active CN111190541B (en) 2019-12-26 2019-12-26 Flow control method of storage system and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN111190541B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103327094A (en) * 2013-06-19 2013-09-25 成都市欧冠信息技术有限责任公司 Data distributed type memory method and data distributed type memory system
CN104009936A (en) * 2014-05-21 2014-08-27 深圳市邦彦信息技术有限公司 Queue scheduling method based on dynamic weight calculation
CN105162878A (en) * 2015-09-24 2015-12-16 网宿科技股份有限公司 Distributed storage based file distribution system and method
CN106294870A (en) * 2016-08-25 2017-01-04 苏州酷伴软件科技有限公司 Object-based distributed cloud storage method
CN107733689A (en) * 2017-09-15 2018-02-23 西南电子技术研究所(中国电子科技集团公司第十研究所) Dynamic weighting polling dispatching strategy process based on priority
CN108710686A (en) * 2018-05-21 2018-10-26 北京五八信息技术有限公司 A kind of date storage method, device, storage medium and terminal

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111708494A (en) * 2020-06-17 2020-09-25 浪潮云信息技术股份公司 Method for realizing distributed storage QOS
CN112000543A (en) * 2020-07-29 2020-11-27 北京浪潮数据技术有限公司 Method, device and equipment for detecting time delay performance of storage system
CN112000543B (en) * 2020-07-29 2023-03-31 北京浪潮数据技术有限公司 Method, device and equipment for detecting time delay performance of storage system

Also Published As

Publication number Publication date
CN111190541B (en) 2024-04-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant