CN115914119A - File download rate limiting method and apparatus - Google Patents

File download rate limiting method and apparatus Download PDF

Info

Publication number
CN115914119A
Authority
CN
China
Prior art keywords
token bucket
request
priority
token
download
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211553606.9A
Other languages
Chinese (zh)
Inventor
孙长生
孙元田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Software Group Co Ltd
Original Assignee
Inspur Software Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Software Group Co Ltd
Priority to CN202211553606.9A
Publication of CN115914119A
Pending legal status

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/50: Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Abstract

The invention relates to the technical field of server rate limiting, and in particular provides a file download rate limiting method. Compared with the prior art, the method and apparatus distribute server load correctly across different download requests, greatly improving the stability of the server program in a trusted environment. The token bucket is optimized so that it retains its rate-limiting function and its ability to absorb instantaneous traffic bursts, while gaining the capability of tiered, priority-based request handling; the method therefore has good value for wider adoption.

Description

File download rate limiting method and apparatus
Technical Field
The invention relates to the technical field of server rate limiting, and in particular provides a file download rate limiting method and apparatus.
Background
In recent years, China has strongly supported the development of domestic hardware and software with independent intellectual property rights, and many foundational products, represented by domestic operating systems and CPUs, have emerged. The ecosystems of domestic operating systems such as Kylin and UnionTech UOS are steadily maturing, high-end general-purpose chips with independent intellectual property rights such as Loongson and Phytium are developing rapidly, and their technical level has reached or approaches the world's advanced level for comparable products.
The rapid development and adoption of domestic foundational hardware and software brings unprecedented opportunities. A server, as the device that receives the corresponding service requests, carries the services, and guarantees them, faces extremely high demands on performance and stability.
Server availability means that the chosen server can run stably over the long term. Download requests consume a great deal of server traffic, especially requests to download or distribute large files. Rate-limiting the download module helps keep the server stable and prevents it from being overwhelmed by the pressure of a large number of download tasks.
The mainstream rate-limiting algorithms today include the fixed window, sliding window, leaky bucket, and token bucket algorithms. The fixed window algorithm has no way to cope with burst traffic; the sliding window algorithm addresses burst traffic, but the request frequency is difficult to control in real scenarios; the leaky bucket algorithm can only limit traffic to a constant rate when facing bursts; and although the token bucket algorithm handles burst traffic well, it cannot process requests by priority, so when a long-lasting, high-volume burst arrives it can only buffer the server's load and cannot respond quickly to high-priority requests.
Disclosure of Invention
Aiming at the shortcomings of the prior art, the invention provides a highly practical file download rate limiting method.
A further aim of the invention is to provide a file download rate limiting apparatus that is reasonably designed, safe, and practical.
The technical solution adopted by the invention to solve this technical problem is as follows:
A file download rate limiting method in which, first, the token bucket is configured: the size of the token bucket and the rate at which tokens are put into it are set; then the priority queues are configured and file downloads are segmented; the requested file is stored in a MinIO cluster, and after the server receives a segmented download request it fetches one segment of data from MinIO and caches it locally.
Further, the size of the token bucket, i.e. its capacity, is determined by the maximum instantaneous traffic the server can carry; when the token bucket is full, the bucket's capacity determines the rate limit.
The rate at which tokens are put into the token bucket limits the average request rate.
Further, when the priority queues are configured, a request first enters a priority queue after the server receives it, and after being dequeued enters the token bucket corresponding to its priority.
The higher the priority, the smaller the corresponding token bucket and the faster its token input rate.
When the token bucket corresponding to a low-priority request is congested, a delay is set for the request, and if the request fails repeatedly its priority may be raised appropriately.
Further, during segmented file download, one request corresponds to one token while the token bucket is running, and the maximum amount of data per download is limited so as to control the total traffic.
Further, segmented download is implemented by setting a threshold: files larger than the threshold are downloaded in segments, while files not exceeding the threshold are downloaded normally.
The lowest-priority queue uses a large bucket with a low token input rate, designed for the ordinary download requests the server carries.
The token bucket used by the medium-priority queue has a lower capacity than the lowest-priority bucket, with an input rate close to but slightly lower than it, and is used to share the load of the low-priority queue's bucket.
The high-priority token bucket should use a small bucket with a high token input rate, for handling urgent download requests.
Further, under normal conditions all tasks except those marked high priority enter the token bucket through the low-priority queue, and after a token is obtained the server processes the download request.
Further, during a sudden traffic burst, the tokens in the low-priority queue's bucket are exhausted and the server runs at nearly full load; overflowing requests are blocked at the server for a period of time, and if new tokens are added to the bucket within that interval the overflowing requests are served in order.
If an overflowing request receives no response within that period, a timeout is triggered, the excess requests are rejected with HTTP status code 429, and an expected wait interval is calculated from the low-priority token bucket's rate.
Further, under sustained high traffic the number of download requests is lower than during a sudden burst, but the low-priority queue's token bucket can still overflow, and the overflow can persist for some time.
Because of the large number of continuous requests, the timeout wait returned by the low-priority bucket cannot be estimated accurately, and after the second attempt of a timed-out request still fails, its priority is raised to medium.
The medium priority has a higher token generation rate and can quickly absorb the requests the low-priority queue's bucket cannot handle.
After a request rises from the low-priority queue to the medium-priority queue, its priority generally does not rise further; the high-priority queue is usually idle and processes only important download requests, or pre-requests for important downloads.
A file download rate limiting apparatus, comprising: at least one memory and at least one processor;
the at least one memory storing a machine-readable program;
the at least one processor being configured to invoke the machine-readable program and execute the file download rate limiting method.
Compared with the prior art, the file download rate limiting method and apparatus have the following notable beneficial effects:
The invention distributes server load correctly across different download requests, greatly improving the stability of the server program in a trusted environment. The token bucket is optimized so that it retains its rate-limiting function and its ability to absorb instantaneous traffic bursts, while gaining the capability of tiered, priority-based request handling.
By prioritizing download requests, the overall download workload stays smooth, preventing the wasted traffic caused by congested high-priority tasks.
The token bucket and priority queues can be tuned to suit different server environments, and the implementation is simple and effective.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a basic schematic diagram of token bucket optimization in the file download rate limiting method;
FIG. 2 is a basic schematic diagram of token bucket optimization in the file download rate limiting method.
Detailed Description
The present invention will be described in further detail with reference to specific embodiments in order to better understand the technical solutions of the present invention. It should be apparent that the described embodiments are only some embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A preferred embodiment is given below:
as shown in fig. 1 and 2, in the current limiting method for file downloading in this embodiment, in actual development, a file downloading process generally includes a file storage system, a server downloading module, a current limiting module, and a client file receiving module. The file storage system stores files to be downloaded and ensures the reliability and stability of the whole file system; the downloading module is mainly matched with the client receiving module to finish the theme operation of file downloading, and the theme operation comprises special processing of sectional downloading of large files, breakpoint continuous transmission and the like; the current limiting module limits the flow of file downloading, and prevents the server and the file storage system from being impacted due to overlarge flow.
Common rate-limiting algorithms include the fixed window, sliding window, leaky bucket, and token bucket algorithms.
The fixed window algorithm limits traffic based on the number of accesses within a time window, and the counter resets automatically each time a window passes. The problem with fixed windows is that if a flood of requests arrives in the short interval just before and just after the window resets, the server bears twice its intended load during that time. For example, with a reset interval of 1 s and a maximum load of 10,000 requests per second, receiving 10,000 requests in the final 100 ms of the current window and another 10,000 in the first 100 ms of the next window means the server must process 20,000 requests within 200 ms, a multiple that may far exceed its load capacity.
The sliding window solves the fixed window's problem to some extent: it subdivides the fixed window into smaller intervals while keeping the total over the whole window unchanged. Taking a 1 s window as an example, it is divided into 10 small windows that slide forward one slot every 0.1 s; on each slide the count from the expired 0.1 s slot is recycled, so the total number of requests across the large window stays stable. The pain point of this algorithm is that the frequency of requests in real scenarios cannot be controlled.
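The subdivided-window mechanism just described can be sketched as follows; the class name, slot count, and API are illustrative assumptions rather than anything specified in the patent:

```python
from collections import deque

class SlidingWindowCounter:
    """Sliding window: the full window is split into small slots; on each
    slide the oldest slot's count is dropped, so the total over the whole
    window stays bounded by the limit."""

    def __init__(self, limit, slots=10):
        self.limit = limit
        self.counts = deque([0] * slots, maxlen=slots)

    def slide(self):
        """Advance one slot (e.g. called every 0.1 s for a 1 s window)."""
        self.counts.append(0)   # maxlen drops the oldest slot automatically

    def allow(self):
        """Admit the request only if the whole-window total is under the limit."""
        if sum(self.counts) < self.limit:
            self.counts[-1] += 1
            return True
        return False
```

In a real server a timer (or a timestamp check on each call) would drive `slide()`; it is shown as an explicit method here to keep the sketch deterministic.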
The basic idea of the leaky bucket algorithm is that traffic keeps flowing into the bucket while the outlet at the bottom paces the requests; if traffic enters faster than it drains and the contents exceed the bucket's size, the excess overflows and is discarded. The leaky bucket is strictly first-in, first-out, and the bottom drains at a constant rate no matter how high the request rate is, somewhat like the processing mechanism of a message queue. Under burst traffic, however, what is usually wanted is to keep the system stable while processing requests faster, not to keep working at the scheduled pace.
The token bucket algorithm is similar to the leaky bucket, except that where the leaky bucket drains at a constant rate, the token bucket adds tokens to the bucket at a constant rate, and a request can be processed by the server only after it takes a token. For sudden bursts, traffic can be absorbed smoothly as long as it does not exceed the bucket's total capacity. But the token bucket lacks a priority mechanism.
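The plain token bucket described above can be sketched as follows; the class name and API are illustrative assumptions, not the patent's own implementation:

```python
import time

class TokenBucket:
    """Token bucket: tokens are refilled at a fixed rate up to a capacity;
    each request must take one token before the server processes it."""

    def __init__(self, capacity, rate):
        self.capacity = capacity      # bucket size = largest burst absorbable
        self.rate = rate              # tokens added per second = average rate limit
        self.tokens = capacity        # start full
        self.last = time.monotonic()

    def _refill(self):
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self):
        """Consume one token and return True if available, else False."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A burst of up to `capacity` requests is served immediately from the stored tokens; after that, requests are admitted at the refill `rate`, which is exactly the behaviour the patent's S1 configuration controls.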
The specific steps of the method are as follows:
S1, token bucket configuration.
The token bucket algorithm has two main settings. The first is the token bucket size, i.e. the bucket capacity, which is determined by the maximum instantaneous traffic the server can carry; when the token bucket is full, its capacity determines the rate limit.
The second is the rate at which tokens are put into the bucket, which mainly limits the average request rate.
S2, priority queue configuration.
The token bucket is optimized to work together with priority queues. A priority queue is built on top of an ordinary queue and keeps the first-in, first-out property within each priority. After the server receives a request, the request first enters the priority queue; once dequeued, it enters the token bucket corresponding to its priority. The higher the priority, the smaller the corresponding bucket and the faster its token input rate.
When the token bucket corresponding to a low-priority request is congested, a delay is set for the request, and if the request fails repeatedly its priority may be raised appropriately.
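A minimal sketch of the priority-queue-plus-token-bucket arrangement of S2, including the failure-driven escalation; all names, the numeric priority encoding, and the escalation threshold are illustrative assumptions:

```python
import heapq

LOW, MEDIUM, HIGH = 2, 1, 0   # smaller number = higher priority (illustrative)

class PriorityDispatcher:
    """Requests dequeue in priority order, then draw a token from the bucket
    matching their priority; repeated failures escalate LOW -> MEDIUM."""

    def __init__(self, buckets, max_failures=2):
        self.buckets = buckets        # {priority: object with try_acquire()}
        self.max_failures = max_failures
        self.queue = []
        self.counter = 0              # tie-breaker keeps FIFO order per priority

    def submit(self, request, priority):
        heapq.heappush(self.queue, (priority, self.counter, request, 0))
        self.counter += 1

    def dispatch_one(self):
        """Try to serve the head request; on failure re-queue it, raising
        its priority to MEDIUM after max_failures failed attempts."""
        if not self.queue:
            return None
        priority, _, request, failures = heapq.heappop(self.queue)
        if self.buckets[priority].try_acquire():
            return request            # caller now processes the download
        failures += 1
        if priority == LOW and failures >= self.max_failures:
            priority, failures = MEDIUM, 0   # escalate after repeated failures
        heapq.heappush(self.queue, (priority, self.counter, request, failures))
        self.counter += 1
        return None
```

In a real server the failed request would also be delayed before its retry, as the text describes; the sketch only models the escalation path.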
S3, file download segmentation.
While the token bucket is running, one request corresponds to one token, and the maximum amount of data per download is limited, thereby achieving the goal of controlling the traffic.
S4, storing the requested file in a MinIO cluster.
MinIO is a distributed object storage system that protects data with erasure coding: it splits data into shards, encodes additional redundant blocks, and stores them in different locations. This can be understood as splitting an object file and then encoding it, so that the original data can still be reconstructed even if some of the shards are lost.
After receiving a segmented download request, the server fetches one segment of data from MinIO and caches it locally, reducing the load on MinIO.
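The shard-and-encode idea behind MinIO's erasure coding can be illustrated with a toy single-parity code; MinIO itself uses Reed-Solomon coding, which tolerates the loss of several shards, whereas this sketch tolerates exactly one:

```python
def xor_parity(shards):
    """Toy erasure code: one XOR parity shard allows rebuilding any single
    lost data shard (all shards must have equal length)."""
    parity = bytes(len(shards[0]))
    for s in shards:
        parity = bytes(a ^ b for a, b in zip(parity, s))
    return parity

def recover(shards_with_gap, parity, missing_index):
    """Rebuild the missing shard by XOR-ing the parity with the survivors."""
    rebuilt = parity
    for i, s in enumerate(shards_with_gap):
        if i != missing_index:
            rebuilt = bytes(a ^ b for a, b in zip(rebuilt, s))
    return rebuilt
```

The recovery works because XOR-ing all data shards with the parity cancels every shard except the missing one, which is the same cancellation principle a Reed-Solomon decoder generalizes.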
Segmented download is implemented by setting a threshold: files larger than the threshold must be downloaded in segments, while other files are downloaded normally.
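The threshold rule can be sketched as a byte-range planner; the function and parameter names are illustrative assumptions, and the ranges are inclusive, as in an HTTP Range header:

```python
def plan_download(file_size, threshold, chunk_size):
    """Return the byte ranges to request: files above the threshold are
    split into chunk_size segments; smaller files come down in one piece."""
    if file_size <= threshold:
        return [(0, file_size - 1)]
    return [(start, min(start + chunk_size, file_size) - 1)
            for start in range(0, file_size, chunk_size)]
```

Each tuple would become one segmented request against the server, which in turn fetches and caches that segment from MinIO.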
The lowest-priority queue uses a large bucket with a low token input rate, mainly for the ordinary download requests the server carries. "Low" here is relative to the other two higher-priority queues; under normal conditions the token input rate should still satisfy the download requests.
The medium-priority queue should use a token bucket with lower capacity than the lowest-priority bucket, with an input rate close to but slightly lower than it; this bucket mainly shares the load of the low-priority queue's bucket.
The high-priority token bucket should use a small bucket with a high token input rate, mainly for handling urgent download requests.
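The three tiers described above can be summarized as a configuration whose ordering invariants are checked explicitly; the numeric values are purely illustrative assumptions, not taken from the patent:

```python
# Illustrative three-tier setup: low priority = big bucket, modest rate;
# medium = smaller bucket, slightly lower rate; high = small bucket, fast rate.
BUCKET_CONFIG = {
    "low":    {"capacity": 1000, "rate": 100},
    "medium": {"capacity": 200,  "rate": 90},
    "high":   {"capacity": 50,   "rate": 500},
}

def tiers_consistent(cfg):
    """Check the ordering described in the text: capacities shrink toward
    higher priority, medium refills slightly slower than low, and the
    high tier refills fastest of all."""
    return (cfg["low"]["capacity"] > cfg["medium"]["capacity"] > cfg["high"]["capacity"]
            and cfg["low"]["rate"] >= cfg["medium"]["rate"]
            and cfg["high"]["rate"] > cfg["low"]["rate"])
```

Running the check at startup catches a misconfigured deployment before it silently starves one of the tiers.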
Under normal conditions, all tasks except those marked high priority enter the token bucket through the low-priority queue, and after a token is obtained the server processes the download request.
During a sudden traffic burst, the low-priority queue's bucket is drained of tokens and the server runs at nearly full load. Overflowing requests are blocked at the server for a period of time; if new tokens are added to the bucket within that interval, the overflowing requests are served in order. If an overflowing request receives no response within that period, a timeout is triggered, the excess requests are rejected with HTTP status code 429, and an expected wait interval is calculated from the low-priority bucket's rate.
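The timeout-and-429 behaviour can be sketched as follows: the expected wait is estimated from the low-priority bucket's refill rate, and a request whose estimate exceeds the timeout is rejected with HTTP 429; the function names and return shape are illustrative assumptions:

```python
def expected_wait_seconds(queue_position, refill_rate):
    """With tokens arriving at refill_rate per second, the request at
    position n in the overflow line expects roughly n / rate seconds
    before a token frees up."""
    return queue_position / refill_rate

def overflow_response(queue_position, refill_rate, timeout):
    """Return (status, wait): keep the request blocked (200 once served)
    if the estimated wait fits in the timeout, otherwise reject with
    429 Too Many Requests plus a Retry-After-style hint."""
    wait = expected_wait_seconds(queue_position, refill_rate)
    if wait <= timeout:
        return 200, wait
    return 429, wait   # client should retry after roughly `wait` seconds
```

Returning the estimate alongside the 429 lets clients back off for a sensible interval instead of hammering the server with immediate retries.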
Under sustained high traffic, the number of download requests is lower than during a sudden burst, but the low-priority queue's token bucket can still overflow, and the overflow can persist for some time.
In this case, because of the large number of continuous requests, the timeout wait returned by the low-priority bucket cannot be estimated accurately, so after the second attempt of a timed-out request still fails, its priority is raised to medium. The medium priority has a higher token generation rate and can quickly absorb the requests the low-priority queue's bucket cannot handle; and because the medium-priority bucket is smaller, the total number of requests it carries cannot grow too large, which protects the server's health and achieves the rate-limiting effect.
After a request rises from the low-priority queue to the medium-priority queue, its priority generally does not rise further. The high-priority queue is usually idle and processes only important download requests, or pre-requests for important downloads.
A file download rate limiting apparatus, comprising: at least one memory and at least one processor;
the at least one memory storing a machine-readable program;
the at least one processor being configured to invoke the machine-readable program and execute the file download rate limiting method.
The above embodiments are only specific examples; the scope of the invention includes but is not limited to them, and any suitable change or substitution consistent with the claims of this file download rate limiting method and apparatus that a person skilled in the art could make falls within the scope of the invention.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that various changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (9)

1. A file download rate limiting method, characterized in that first the token bucket is configured: the size of the token bucket and the rate at which tokens are put into it are set; then the priority queues are configured and file downloads are segmented; the requested file is stored in a MinIO cluster, and after the server receives a segmented download request it fetches one segment of data from MinIO and caches it locally.
2. The file download rate limiting method according to claim 1, characterized in that the size of the token bucket, i.e. its capacity, is determined by the maximum instantaneous traffic the server can carry, and when the token bucket is full the bucket's capacity determines the rate limit;
the rate at which tokens are put into the token bucket limits the average request rate.
3. The file download rate limiting method according to claim 2, characterized in that when the priority queues are configured, a request first enters a priority queue after the server receives it, and after being dequeued enters the token bucket corresponding to its priority;
the higher the priority, the smaller the corresponding token bucket and the faster its token input rate;
when the token bucket corresponding to a low-priority request is congested, a delay is set for the request, and if the request fails repeatedly its priority may be raised appropriately.
4. The file download rate limiting method according to claim 3, characterized in that during segmented file download, one request corresponds to one token while the token bucket is running, and the maximum amount of data per download is limited so as to control the total traffic.
5. The file download rate limiting method according to claim 4, characterized in that segmented download is implemented by setting a threshold: files larger than the threshold are downloaded in segments, while files not exceeding the threshold are downloaded normally;
the lowest-priority queue uses a large bucket with a low token input rate, designed for the ordinary download requests the server carries;
the token bucket used by the medium-priority queue has a lower capacity than the lowest-priority bucket, with an input rate close to but slightly lower than it, and is used to share the load of the low-priority queue's bucket;
the high-priority token bucket uses a small bucket with a high token input rate, for handling urgent download requests.
6. The file download rate limiting method according to claim 5, characterized in that under normal conditions all tasks except those marked high priority enter the token bucket through the low-priority queue, and after a token is obtained the server processes the download request.
7. The file download rate limiting method according to claim 6, characterized in that during a sudden traffic burst the tokens in the low-priority queue's bucket are exhausted and the server runs at nearly full load; overflowing requests are blocked at the server for a period of time, and if new tokens are added to the bucket within that interval the overflowing requests are served in order;
if an overflowing request receives no response within that period, a timeout is triggered, the excess requests are rejected with HTTP status code 429, and an expected wait interval is calculated from the low-priority token bucket's rate.
8. The file download rate limiting method according to claim 7, characterized in that under sustained high traffic the number of download requests is lower than during a sudden burst, but the low-priority queue's token bucket can still overflow and the overflow can persist for some time;
because of the large number of continuous requests, the timeout wait returned by the low-priority bucket cannot be estimated accurately, and after the second attempt of a timed-out request still fails, its priority is raised to medium;
the medium priority has a higher token generation rate and can quickly absorb the requests the low-priority queue's bucket cannot handle;
after a request rises from the low-priority queue to the medium-priority queue, its priority generally does not rise further; the high-priority queue is usually idle and processes only important download requests, or pre-requests for important downloads.
9. A file download rate limiting apparatus, comprising: at least one memory and at least one processor;
the at least one memory storing a machine-readable program;
the at least one processor being configured to invoke the machine-readable program to perform the method of any of claims 1 to 7.
CN202211553606.9A 2022-12-06 2022-12-06 File download rate limiting method and apparatus Pending CN115914119A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211553606.9A CN115914119A (en) 2022-12-06 2022-12-06 File download rate limiting method and apparatus

Publications (1)

Publication Number Publication Date
CN115914119A true CN115914119A (en) 2023-04-04

Family

ID=86474572



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination