CN107704322B - Request distribution method and device - Google Patents


Publication number
CN107704322B
CN107704322B
Authority
CN
China
Prior art keywords
request
requests
processed
threads
thread
Prior art date
Legal status
Active
Application number
CN201710944087.1A
Other languages
Chinese (zh)
Other versions
CN107704322A
Inventor
许俊豪
曲仲元
金梦
Current Assignee
Shanghai Elephant Jintai Technology Co Ltd
Original Assignee
Shanghai Elephant Jintai Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Elephant Jintai Technology Co Ltd filed Critical Shanghai Elephant Jintai Technology Co Ltd
Priority to CN201710944087.1A
Publication of CN107704322A
Application granted
Publication of CN107704322B
Status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 - Arrangements for program control, e.g. control units
    • G06F 9/06 - Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 - Multiprogramming arrangements
    • G06F 9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 - Allocation of resources to service a request
    • G06F 9/5027 - Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F 2209/00 - Indexing scheme relating to G06F9/00
    • G06F 2209/50 - Indexing scheme relating to G06F9/50
    • G06F 2209/5013 - Request control
    • G06F 2209/5018 - Thread allocation

Landscapes

  • Engineering & Computer Science
  • Software Systems
  • Theoretical Computer Science
  • Physics & Mathematics
  • General Engineering & Computer Science
  • General Physics & Mathematics
  • Data Exchanges in Wide-Area Networks

Abstract

The invention discloses a request distribution method and device, belonging to the field of internet technology. The method comprises the following steps: acquiring the sum of the predicted processing times of k to-be-processed requests in a request queue; calculating the slicing time corresponding to each of n threads according to that sum and the number n of threads that execute the k to-be-processed requests; and allocating to-be-processed requests to each of the n threads according to its slicing time, where the number of to-be-processed requests allocated to the i-th thread is p and the sum of the predicted processing times of those p requests does not exceed the i-th thread's slicing time. By calculating a slicing time for each thread and using it as the basis for distributing to-be-processed requests, the total time spent processing all to-be-processed requests in the queue is reduced and the overall processing efficiency is improved.

Description

Request distribution method and device
Technical Field
The embodiment of the invention relates to the technical field of internet, in particular to a request distribution method and a request distribution device.
Background
The interface is a channel used by the service providing device to provide resources and information to the service requesting device, and the interface request is a request sent by the service requesting device to the service providing device to obtain its internal resources and information. When a large number of interface requests are highly concurrent, how to process the interface requests more efficiently often depends on the method of allocating the interface requests.
In the related art, a method of starting multiple threads to process interface requests in parallel is mostly adopted, so that multiple interface requests can be processed at the same time, and the time consumption for processing a large number of interface requests is shortened. When allocating interface requests to a plurality of threads, the related art typically allocates the same number of interface requests to each thread.
However, different interface requests fetch different resources and information, so their processing times differ. Under this equal-count allocation, a thread that happens to receive the more time-consuming pending requests finishes much later than the other threads, which lowers the overall processing efficiency of the request queue.
Disclosure of Invention
The embodiment of the invention provides a request distribution method and a request distribution device, which can be used for solving the problems of long time consumption and low efficiency in processing interface requests due to the lack of a reasonable distribution method in the related art. The technical scheme is as follows:
according to a first aspect of the embodiments of the present invention, there is provided a request allocation method, including:
acquiring the sum of predicted processing time consumption of k to-be-processed requests in a request queue, wherein k is an integer larger than 1;
calculating the slicing time corresponding to each thread in the n threads according to the sum of the predicted processing time consumption of the k to-be-processed requests and the number n of threads for executing the k to-be-processed requests, wherein the n is an integer greater than 1;
and respectively allocating a to-be-processed request to each thread of the n threads according to the slicing time corresponding to each thread of the n threads, wherein the number of the to-be-processed requests allocated to the ith thread of the n threads is p, the sum of the predicted processing time consumption of the p to-be-processed requests is not more than the slicing time corresponding to the ith thread, p is a positive integer smaller than k, and i is a positive integer smaller than or equal to n.
In a possible implementation manner, the calculating, according to a sum of predicted processing time consumptions of the k pending requests and a number n of threads for executing the k pending requests, a slice time corresponding to each of the n threads includes:
acquiring the average value of the timeout rates of the k requests to be processed, wherein the timeout rate of the requests to be processed refers to the probability of the requests to be processed being overtime;
if the average value is equal to 0, dividing the sum of the predicted processing time consumption of the k to-be-processed requests by the number n of the threads to obtain the slicing time;
if the average value is larger than 0, calculating the product of the number n of the threads and a first coefficient a, and dividing the sum of the predicted processing time consumption of the k to-be-processed requests by the product to obtain the slicing time, wherein a is larger than 1.
In another possible implementation, the obtaining an average value of the timeout rates of the k pending requests includes:
obtaining historical statistical data of each type of request, wherein the historical statistical data of the target type of request comprises the following steps: a statistical number of the target type of requests within a historical statistical period, and an actual processing time consumption of each of the target type of requests;
respectively determining the timeout rate of each type of request according to the historical statistical data of each type of request and the timeout threshold of each type of request, wherein the timeout rate of the target type of request is equal to the quotient of the timeout quantity of the target type of request and the statistical quantity, and the timeout quantity of the target type of request refers to the quantity of requests which are in the target type of request and have the actual processing time consumption larger than the timeout threshold of the target type of request;
and calculating the average value of the timeout rates of the k requests to be processed according to the respective types of the k requests to be processed and the timeout rate of each type of request.
In another possible implementation, the historical statistics of the requests of the target type further include: each request in the target type requests a corresponding equipment load state, wherein the equipment load state comprises a normal load state and a high load state;
the timeout rate for requests of the target type includes: a timeout rate of the target type of request in the normal load state, and a timeout rate of the target type of request in the high load state.
In another possible implementation manner, the allocating a pending request to each of the n threads according to the slice time corresponding to each of the n threads respectively includes:
dividing the k to-be-processed requests into w groups, wherein each group comprises a plurality of to-be-processed requests, the minimum value of the predicted processing time consumption of all to-be-processed requests in the jth group in the w groups is larger than or equal to the maximum value of the predicted processing time consumption of all to-be-processed requests in the j +1 th group, w is an integer larger than 1, and j is a positive integer smaller than w;
starting from the 1st of the w packets, taking one pending request out of each packet in turn and allocating it to the first thread; after a pending request taken from the u-th packet is allocated to the first thread, if the sum of the predicted processing times of the pending requests allocated to the first thread exceeds the slicing time corresponding to the first thread, withdrawing that pending request taken from the u-th packet and instead taking a pending request from the (u+1)-th packet to allocate to the first thread; and so on, until a pending request taken from the last of the w packets is either allocated to the first thread or withdrawn after being allocated, where u is a positive integer smaller than w and the first thread is any one of the n threads.
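The grouping-and-withdrawal procedure above can be sketched in Python as follows. This is a minimal sketch under assumed data structures, not the patented implementation: requests are represented only by their predicted processing times in milliseconds, and each group is a list of those times, sorted so that every time in group j is at least every time in group j+1.

```python
def allocate_for_thread(groups, slice_time_ms):
    """One pass of the packet scheme: take one request from each group
    in turn, from the longest-running group to the shortest; when a
    request would push the thread past its slicing time, withdraw it
    and try the next (shorter) group instead. Taken requests are
    removed from `groups`; the times assigned are returned."""
    assigned, total = [], 0
    for group in groups:                 # 1st packet ... w-th packet
        if not group:
            continue
        t = group[0]
        if total + t <= slice_time_ms:   # fits: keep the request
            assigned.append(group.pop(0))
            total += t
        # otherwise: withdraw it and move on to the next group
    return assigned
```

With groups [[500, 400], [300, 300], [100]] and a slicing time of 700 ms, the thread receives the 500 ms request, skips the 300 ms group (which would overshoot), and takes the 100 ms request.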
In another possible implementation manner, after allocating a pending request to each of the n threads according to the slice time corresponding to each of the n threads, the method further includes:
when a target thread exists among the n threads, calculating an increase value of the slicing time corresponding to the threads other than the target thread among the n threads, wherein a target thread is a thread that has been allocated a to-be-processed request with a probability of timing out;
and allocating the requests to be processed for the other threads according to the increment values of the slicing time corresponding to the other threads.
In another possible implementation, the calculating the increased value of the slice time corresponding to the other threads except the target thread in the n threads includes:
calculating the increment value delta T of the slice time corresponding to the threads except the target thread in the n threads according to the following formula:
$$\Delta T = \frac{\sum_{q=1}^{m} \Delta t_q \cdot \text{timeout}_q}{a \cdot (n - x)}$$
wherein m is the number of the requests to be processed with the probability of overtime among the requests to be processed which have been allocated in the n threads, and m is a positive integer less than or equal to k; the delta t is a preset buffer value of each request to be processed in the m requests to be processed; the timeout is the timeout rate of each of the m pending requests; a is a first coefficient; and x is the number of the target threads.
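Because the original equation survives only as an image placeholder, the sketch below implements one plausible reading of the variable definitions above, and is an assumption rather than the patented formula: the expected overrun contributed by the m timeout-prone requests (each request's preset buffer value times its timeout rate, summed) is spread over a * (n - x).

```python
def slice_time_increase(buffer_values, timeout_rates, a, n, x):
    """Increase value dT of the slicing time for the n - x threads that
    received no timeout-prone request. `buffer_values` and
    `timeout_rates` each hold one entry per timeout-prone request, so
    both have length m. How the sum is divided is an assumed reading
    of the text, not a reproduction of the missing figure."""
    expected_overrun = sum(dt * rate
                           for dt, rate in zip(buffer_values, timeout_rates))
    return expected_overrun / (a * (n - x))
```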
In another possible embodiment, the method further comprises:
when a second thread in the n threads finishes processing the to-be-processed request distributed to the second thread, detecting whether an unallocated to-be-processed request still exists in the request queue;
and if the request queue has the unallocated request to be processed, allocating the request to be processed for the second thread according to the slicing time corresponding to the second thread.
According to a second aspect of embodiments of the present invention, there is provided a request distribution apparatus, the apparatus including:
the device comprises an acquisition module, a processing module and a processing module, wherein the acquisition module is used for acquiring the sum of predicted processing time consumption of k to-be-processed requests in a request queue, and k is an integer larger than 1;
a calculating module, configured to calculate a slicing time corresponding to each thread of the n threads according to a sum of predicted processing time consumptions of the k to-be-processed requests and a number n of threads for executing the k to-be-processed requests, where n is an integer greater than 1;
the allocation module is configured to allocate a to-be-processed request to each of the n threads according to the slicing time corresponding to each of the n threads, where the number of to-be-processed requests allocated to an ith thread of the n threads is p, a sum of predicted processing time consumptions of the p to-be-processed requests is not greater than the slicing time corresponding to the ith thread, p is a positive integer smaller than k, and i is a positive integer smaller than or equal to n.
In one possible implementation, the calculation module includes:
the acquiring unit is used for acquiring the average value of the timeout rates of the k requests to be processed, wherein the timeout rate of the requests to be processed refers to the probability of the requests to be processed being overtime;
a calculating unit, configured to divide a sum of predicted processing time consumptions of the k to-be-processed requests by the number n of threads to obtain the slicing time when the average value is equal to 0;
the calculating unit is further configured to calculate a product of the number n of threads and a first coefficient a when the average value is greater than 0, and divide a sum of predicted processing time consumptions of the k to-be-processed requests by the product to obtain the slicing time, where a is greater than 1.
In another possible implementation, the obtaining unit includes:
an obtaining subunit, configured to obtain historical statistical data of each type of request, where the historical statistical data of the target type of request includes: a statistical number of the target type of requests within a historical statistical period, and an actual processing time consumption of each of the target type of requests;
a calculating subunit, configured to determine a timeout rate of each type of request according to historical statistical data of each type of request and a timeout threshold of each type of request, where the timeout rate of the target type of request is equal to a quotient of the timeout number of the target type of request and the statistical number, and the timeout number of the target type of request is the number of requests in the target type of request, where the actual processing time consumption is greater than the timeout threshold of the target type of request;
the calculating subunit is further configured to calculate an average value of the timeout rates of the k to-be-processed requests according to the respective types of the k to-be-processed requests and the timeout rate of each type of request.
In another possible implementation, the historical statistics of the requests of the target type further include: each request in the target type requests a corresponding equipment load state, wherein the equipment load state comprises a normal load state and a high load state;
the timeout rate for requests of the target type includes: a timeout rate of the target type of request in the normal load state, and a timeout rate of the target type of request in the high load state.
In another possible embodiment, the allocation module includes:
the dividing unit is used for dividing the k to-be-processed requests into w groups, wherein each group comprises a plurality of to-be-processed requests, the minimum value of the predicted processing time consumption of all to-be-processed requests in the jth group in the w groups is larger than or equal to the maximum value of the predicted processing time consumption of all to-be-processed requests in the jth +1 group, w is an integer larger than 1, and j is a positive integer smaller than w;
starting from the 1st of the w packets, taking one pending request out of each packet in turn and allocating it to the first thread; after a pending request taken from the u-th packet is allocated to the first thread, if the sum of the predicted processing times of the pending requests allocated to the first thread exceeds the slicing time corresponding to the first thread, withdrawing that pending request taken from the u-th packet and instead taking a pending request from the (u+1)-th packet to allocate to the first thread; and so on, until a pending request taken from the last of the w packets is either allocated to the first thread or withdrawn after being allocated, where u is a positive integer smaller than w and the first thread is any one of the n threads.
In another possible embodiment,
the computing module is further configured to compute, when a target thread exists among the n threads, an increase value of the slicing time corresponding to the threads other than the target thread, where a target thread is a thread that has been allocated a to-be-processed request with a probability of timing out;
the allocation module is further configured to allocate the to-be-processed request to the other threads according to the added value of the slicing time corresponding to the other threads.
In another possible implementation, the calculation module is configured to:
calculating the increment value delta T of the slice time corresponding to the threads except the target thread in the n threads according to the following formula:
$$\Delta T = \frac{\sum_{q=1}^{m} \Delta t_q \cdot \text{timeout}_q}{a \cdot (n - x)}$$
wherein m is the number of the requests to be processed with the probability of overtime among the requests to be processed which have been allocated in the n threads, and m is a positive integer less than or equal to k; the delta t is a preset buffer value of each request to be processed in the m requests to be processed; the timeout is the timeout rate of each of the m pending requests; a is a first coefficient; and x is the number of the target threads.
In another possible embodiment, the apparatus further comprises: a detection module;
the detection module is configured to detect whether an unallocated pending request still exists in the request queue when a second thread of the n threads completes processing of a pending request allocated to the second thread;
the allocation module is further configured to allocate the pending request to the second thread according to the slice time corresponding to the second thread when the pending request that is not allocated still exists in the request queue.
According to a third aspect of embodiments of the present invention, there is provided a computer-readable storage medium having stored therein at least one instruction, at least one program, set of codes, or set of instructions, which is loaded and executed by a processor to implement the request distribution method according to the first aspect.
The technical scheme provided by the embodiment of the invention has the following beneficial effects:
The slicing time corresponding to each thread is calculated according to the sum of the predicted processing times of all to-be-processed requests in the request queue and the number of threads executing those requests, and this slicing time is used as the basis for distributing to-be-processed requests to each thread. All threads therefore finish their allocated requests within the same or similar time, avoiding the situation in which some threads receive the more time-consuming requests and run longer than the others; this reduces the total time spent processing all to-be-processed requests in the request queue and improves the overall processing efficiency.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a diagram illustrating a request processing method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a request distribution method provided by an embodiment of the invention;
FIG. 3A is a diagram illustrating a normal distribution of historical statistics provided by an embodiment of the present invention;
FIG. 3B is a schematic diagram of calculating the expected processing time provided by one embodiment of the present invention;
FIG. 4 is another flow chart of a request distribution method provided by an embodiment of the invention;
FIG. 5 is a flow chart of a request distribution method provided by another embodiment of the present invention;
FIG. 6 is a schematic diagram of an allocation buffer provided by another embodiment of the present invention;
FIG. 7 is a block diagram of a request distribution apparatus provided by one embodiment of the present invention;
fig. 8 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
Fig. 1 shows a schematic diagram of a request processing method provided by an embodiment of the present invention. First, a preset number of threads for executing pending requests is created, and the pending requests in the request queue are allocated to each thread according to the request allocation method provided by the embodiment of the present invention. Subsequently, once one or more threads finish processing the requests allocated to them, any unallocated pending requests remaining in the queue are allocated to those finished threads according to the same method; threads that have not yet finished are covered by a timeout protection mechanism and receive no new pending requests. In addition, after each pending request is executed, its execution information is uploaded to and stored in the database, including the actual processing time of this execution and the device load state at the time the request was processed.
In the embodiment of the invention, the slicing time corresponding to each thread is calculated according to the sum of the predicted processing times of all to-be-processed requests in the request queue and the number of threads executing those requests, and this slicing time is used as the basis for distributing to-be-processed requests to each thread. All threads can therefore finish their allocated requests within the same or similar time, avoiding the situation in which some threads receive the more time-consuming requests and run longer than the others, thereby reducing the total time spent processing all to-be-processed requests in the request queue and improving the overall processing efficiency.
The request distribution method provided by the embodiment of the invention can be completed by a computer device. The computer device includes a Central Processing Unit (CPU), and can distribute and process requests sent by various service request devices. Illustratively, the computer device is a server.
The embodiments of the present invention will be described in further detail below based on the common aspects related to the embodiments of the present invention described above.
Referring to fig. 2, a flowchart of a request distribution method according to an embodiment of the present invention is shown. As shown in fig. 2, the method may include:
step 201, obtain the sum of predicted processing time consumption of k pending requests in the request queue.
Wherein k is an integer greater than 1. The predicted processing time is the processing time predicted for each pending request (denoted by T_q in this embodiment). The predicted processing times of the k pending requests are accumulated to obtain the sum of the predicted processing times of the k pending requests. A pending request may be an interface request: an interface is a channel through which a service providing device provides resources and information to a service requesting device, and an interface request is a request sent by the service requesting device to the service providing device to obtain its internal resources and information.
In one possible implementation, the expected processing time of each pending request in the request queue is obtained by:
first, historical statistics are obtained for each type of request in the k pending requests.
Each pending request has its corresponding type, but the expected processing time is the same for pending requests of the same type. Optionally, the pending requests are divided into a plurality of types according to the difference of the resources or information used for requesting to obtain the pending requests. In the present embodiment, any one of the types to which the k pending requests belong is referred to as a "target type". For a target type of request, its historical statistics include: the statistical number of requests of the target type within the historical statistical period, and the actual processing time of each of the requests of the target type. Optionally, the historical statistical period is 2 to 4 weeks. The statistical number of the requests of the target type in the historical statistical period is the total number of the requests of the target type processed in the historical statistical period; the actual processing time consumption of each of the target type requests is the actual time consumption of each target type request being processed within the historical statistics period.
Secondly, a rectangular coordinate system is established, and historical statistical data of the request of the target type are displayed in the rectangular coordinate system.
The horizontal axis of the rectangular coordinate system corresponds to the actual processing time of a request of the target type, and the vertical axis corresponds to the number of requests of the target type with that actual processing time. The historical statistical data of the target type of request are first plotted in the constructed rectangular coordinate system as scatter points, and a normal distribution curve as shown in fig. 3A can be drawn according to the distribution of these points.
Thirdly, calculating the expected processing time consumption of the target type request according to the historical statistical data of the target type request.
As shown in fig. 3B, the lowest 5% and the highest 5% of the scatter points in the rectangular coordinate system are removed; according to the historical statistical data corresponding to the remaining 90% of the points, the average of their actual processing times is calculated, and this average is taken as the predicted processing time of the target type of request.
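The trimmed-mean computation above can be sketched as follows. How a 5% cut that does not correspond to a whole number of samples is rounded is an assumption (here it is truncated toward zero).

```python
def expected_processing_time(actual_times):
    """Predicted processing time T_q for one request type: drop the
    fastest 5% and slowest 5% of historical samples and average the
    remaining 90%, as described above. For fewer than 20 samples the
    5% cut truncates to zero and all samples are kept."""
    s = sorted(actual_times)
    cut = len(s) // 20                       # 5% of the samples
    middle = s[cut:len(s) - cut] if cut else s
    return sum(middle) / len(middle)
```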
In this way, the predicted processing time of each of the k pending requests can be obtained. For example, Table 1 shows one possible set of predicted processing times for the k pending requests.
TABLE 1

Pending request    Predicted processing time (s)
1st                0.25
2nd                0.29
3rd                0.35
4th                0.20
···                ···
kth                0.29
Step 202, calculating the slicing time corresponding to each thread of the n threads according to the sum of the predicted processing time consumption of the k to-be-processed requests and the number n of threads for executing the k to-be-processed requests.
Wherein n is an integer greater than 1. A thread is the smallest unit of program execution that is independently scheduled and dispatched by the system, and in this embodiment is used to execute pending requests in a request queue. The slicing time is the expected maximum time consumption for a single thread to perform a single request processing task, and is also the basis for allocating pending requests to each thread in the following.
In a possible embodiment, the step specifically includes:
first, the average of the timeout rates for k pending requests is obtained.
The timeout rate (denoted by timeout in this embodiment) of a pending request is the probability that the request times out, and pending requests of the same type share the same timeout rate. Specifically, the timeout rate of each type of request can be determined from its historical statistical data and its timeout threshold: the timeout rate of the target type of request equals the quotient of its timeout count and its statistical count, where the timeout count is the number of requests of the target type whose actual processing time exceeded the type's timeout threshold. In this embodiment, the timeout threshold of the target type of request may be the product of the longest actual processing time among the retained 90% of scatter points and a second coefficient b, or the product of the type's predicted processing time and a third coefficient c, where c is generally greater than b; this embodiment does not limit the choice. Illustratively, if 100 requests of the target type were counted in the historical statistical period and the actual processing time of 3 of them exceeded the type's timeout threshold, the timeout rate of the target type of request is 3%.
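The per-type timeout rate reduces to a simple proportion over the historical statistics; a minimal sketch (the 100-request example above, with 3 timeouts, gives 3%):

```python
def timeout_rate(actual_times, timeout_threshold):
    """Timeout rate of one request type: the fraction of that type's
    historical requests whose actual processing time exceeded the
    type's timeout threshold."""
    timed_out = sum(1 for t in actual_times if t > timeout_threshold)
    return timed_out / len(actual_times)
```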
Then, an average value Timeout of the Timeout rates of the k pending requests is calculated according to the respective types of the k pending requests and the Timeout rate of each type of request. See in particular the following formula:
Timeout = (timeout_1 + timeout_2 + ... + timeout_k) / k
where timeout_i is the timeout rate of the i-th pending request in the request queue.
Secondly, if the average value is equal to 0, no pending request has a probability of timing out, and the pending requests may be evenly allocated among the threads; that is, the sum of the expected processing time consumption of the k pending requests is divided by the number n of threads to obtain the slicing time assigntime. See in particular the following formula:
assigntime = (T_1 + T_2 + ... + T_k) / n
where T_i is the expected processing time consumption of the i-th pending request.
Thirdly, if the average value is greater than 0, the request queue contains pending requests with a probability of timing out. To avoid the same thread being allocated several such requests, the slicing time corresponding to each thread needs to be shortened: the product of the number n of threads and a first coefficient a is calculated first, and the sum of the expected processing time consumption of the k pending requests is then divided by that product to obtain the slicing time assigntime. See in particular the following formula:
assigntime = (T_1 + T_2 + ... + T_k) / (a × n)
In general, a is greater than 1, and a is positively correlated with the average of the timeout rates of the k pending requests. Table 2 shows one possible relationship between the first coefficient a and the average of the timeout rates of the k pending requests.
TABLE 2
Average value of timeout rate | First coefficient a
0.01%-5% | 2
5%-10% | 3
10%-20% | 4
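The two branches above (average timeout rate equal to 0 versus greater than 0) and the coefficient lookup of Table 2 can be sketched as follows; the interval boundary handling is an assumption, since the text does not specify behavior outside the listed ranges, and all names are illustrative.

```python
def first_coefficient(avg_timeout_rate):
    """First coefficient a, following Table 2; interval boundaries are assumed."""
    if avg_timeout_rate <= 0.05:
        return 2.0
    if avg_timeout_rate <= 0.10:
        return 3.0
    return 4.0

def slicing_time(expected_costs, n_threads, avg_timeout_rate):
    """assigntime per thread from the sum of expected processing times."""
    total = sum(expected_costs)
    if avg_timeout_rate == 0:
        return total / n_threads  # even split: no request is likely to time out
    # shorten the slice by the first coefficient a when timeouts are likely
    return total / (n_threads * first_coefficient(avg_timeout_rate))
```

For example, with expected costs [1, 2, 3] and 2 threads, the slicing time is 3.0 when no request may time out, and shrinks to 1.5 at a 3% average timeout rate.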
Optionally, before the slicing time is calculated, the device load state is determined, and the slicing time corresponding to each thread is calculated according to the device load state, where the device load state includes a normal load state and a high load state. Optionally, the average load of the device over a preset period (for example, the 15-minute load average reported by the system's load statistics) is obtained; if the average load is less than or equal to the total number of CPU cores in the device, the device is in the normal load state, otherwise it is in the high load state. The timeout rate chosen for each pending request when calculating the slicing time also differs under different device load states; therefore, the historical statistical data of each type of request (e.g., of the target type of request) further includes the device load state corresponding to each request of that type. From these device load states, the timeout rate in the normal load state and the timeout rate in the high load state can be calculated separately for each type of request, thereby obtaining the slicing time in the normal load state and the slicing time in the high load state.
Step 203, allocating a pending request to each of the n threads according to the slicing time corresponding to each of the n threads.
As can be seen from the description of step 202, the slicing time calculated in step 202 is the same for each of the n threads.
In a possible embodiment, the step specifically includes:
dividing the k to-be-processed requests into w groups, where each group includes a plurality of to-be-processed requests, w is an integer greater than 1, the minimum expected processing time consumption among all to-be-processed requests in the j-th of the w groups is greater than or equal to the maximum expected processing time consumption among all to-be-processed requests in the (j+1)-th group, and j is a positive integer less than w. In one possible grouping manner, pending requests with the same expected processing time consumption among the k pending requests are placed in one group. Optionally, to facilitate grouping, the k to-be-processed requests are first arranged in the request queue in descending order of expected processing time consumption, and the arranged k to-be-processed requests are then divided into w groups.
In the present embodiment, any one of the n threads is referred to as a "first thread". In the process of allocating pending requests to the first thread, starting from the 1st of the w groups, one pending request is taken out of each group in turn and allocated to the first thread. After a pending request taken out of the u-th group is allocated to the first thread, if the sum of the expected processing time consumption of the pending requests already allocated to the first thread is greater than the slicing time corresponding to the first thread, the pending request taken out of the u-th group is withdrawn, and one pending request is taken out of the (u+1)-th group and allocated to the first thread instead; and so on, until a pending request taken out of the last of the w groups is allocated to the first thread, or is allocated to the first thread and then withdrawn, where u is a positive integer less than w. Illustratively, suppose p pending requests have already been allocated to the first thread, and the sum of their expected processing time consumption is not greater than the slicing time corresponding to the first thread. When the (p+1)-th pending request is to be allocated, if the sum of the expected processing time consumption of the p+1 pending requests would exceed the slicing time corresponding to the first thread, that pending request is not allocated to the first thread; instead, one pending request is taken from the group after the one it belongs to and allocated to the first thread, and so on. If, even after a pending request from the w-th group is tried, the sum of the expected processing time consumption of the pending requests allocated to the first thread would still exceed the slicing time corresponding to the first thread, the allocation of the first thread is complete, and the pending requests to be executed by the first thread are all the pending requests allocated to it before the request from the w-th group was withdrawn. Optionally, if after allocating a pending request from the w-th group to the first thread, the sum of the expected processing time consumption of the pending requests allocated to the first thread is still less than the slicing time corresponding to the first thread, the above steps of taking one pending request out of each group, starting from the 1st group, and allocating it to the first thread are repeated. p is a positive integer less than k.
For each of the n threads, the assignment is made in the manner described above for assigning pending requests to the first thread.
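The grouped allocation described above can be sketched as follows. Instead of assigning a request and then withdrawing it, this sketch checks the remaining slicing-time budget before taking a request from a group, which is behaviorally equivalent; all names and the data layout are illustrative.

```python
from collections import deque

def assign_to_thread(groups, slice_time):
    """groups: list of deques of expected costs, ordered so that earlier
    groups hold the more expensive requests (group j's minimum >= group
    j+1's maximum). Removes the assigned requests from `groups` and
    returns the list of costs assigned to one thread."""
    assigned, total = [], 0.0
    progressed = True
    while progressed:  # repeat from the 1st group while budget remains
        progressed = False
        for group in groups:
            if not group:
                continue
            cost = group[0]
            if total + cost > slice_time:
                continue  # would exceed the slicing time: try a cheaper group
            group.popleft()
            assigned.append(cost)
            total += cost
            progressed = True
    return assigned

groups = [deque([5.0, 5.0]), deque([3.0, 3.0]), deque([1.0, 1.0])]
print(assign_to_thread(groups, 9.0))  # [5.0, 3.0, 1.0]
```

Calling this once per thread reproduces the per-thread allocation; the loop naturally stops once no group holds a request that still fits the budget.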
It should be noted that, when a second thread of the n threads finishes processing the pending requests allocated to it, the device may detect whether any unallocated pending request remains in the request queue, where the second thread is any one of the n threads. If the request queue still contains unallocated pending requests, the device continues to allocate pending requests to the second thread according to the slicing time corresponding to the second thread.
As shown in fig. 4, another flowchart of the request distribution method according to an embodiment of the present invention is exemplarily shown. Firstly, n threads for executing the pending requests are created, and the number of pending requests in the request queue is acquired; secondly, the pending requests in the request queue are divided into a plurality of groups, and as long as unallocated pending requests are detected in the request queue, pending requests are allocated to an idle thread according to the slicing time corresponding to that thread; then, after one of the n threads finishes executing its allocated pending requests, the device re-detects whether any unallocated pending request remains in the request queue and, if so, continues to allocate pending requests to that thread; if the request queue contains no unallocated pending request, the thread waits for the other threads to finish processing.
In summary, according to the method provided in the embodiment of the present invention, the slicing time corresponding to each thread is obtained by calculating according to the sum of the expected processing time consumption of all the pending requests in the request queue and the number of threads for executing the pending requests, and the slicing time is used as a basis for allocating the pending requests to each thread, so that all the threads can process and complete the allocated pending requests in the same or similar time, and a situation that some threads delay processing for a longer time than other threads due to being allocated to pending requests that consume longer time is avoided, thereby reducing the total time consumption for processing all the pending requests in the request queue and improving the overall processing efficiency.
Referring to fig. 5, a flowchart of a request distribution method according to another embodiment of the present invention is shown. As shown in fig. 5, the method may include:
step 501, obtain the sum of predicted processing time consumption of k pending requests in the request queue.
Wherein k is an integer greater than 1.
Step 502, calculating the slicing time corresponding to each thread of the n threads according to the sum of the predicted processing time consumption of the k pending requests and the number n of threads for executing the k pending requests.
Wherein n is an integer greater than 1.
Step 503, allocating a pending request to each of the n threads according to the slicing time corresponding to each of the n threads.
Step 504, when there is a target thread in the n threads, calculating the increment of the slice time corresponding to the threads except the target thread in the n threads.
The target thread is a thread that has been allocated a pending request with a probability of timing out, where a pending request with a probability of timing out is a pending request whose timeout rate is greater than a timeout rate threshold. Typically, the timeout rate threshold is 5%.
Optionally, the increase value Δ T of the slice time corresponding to the other threads except the target thread in the n threads is calculated according to the following formula:
ΔT = (Δt_1 × timeout_1 + Δt_2 × timeout_2 + ... + Δt_m × timeout_m) / (a × (n - x))
wherein m is the number of pending requests with a probability of timing out among the pending requests already allocated to the n threads, and m is a positive integer less than or equal to k; Δt is a preset buffer value for each of the m pending requests; timeout is the timeout rate of each of the m pending requests; a is the first coefficient; and x is the number of target threads.
Δt may be a value preset by a developer and stored in the device, or a value calculated according to a certain algorithm. Optionally, when Δt is calculated, Δt is the difference between the timeout threshold of each of the m pending requests and the 90th percentile of the actual running times of that request's type (that is, the longest actual running time among 90% of the sample points).
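Assuming the increment takes the form ΔT = sum(Δt_i × timeout_i) / (a × (n - x)), which is a reconstruction from the variable definitions above rather than the patent's exact formula image, the calculation can be sketched as:

```python
def slice_time_increment(buffers, timeout_rates, a, n_threads, x_targets):
    """Increment ΔT of the slicing time for the n - x non-target threads.
    buffers[i] is the preset buffer value Δt of the i-th at-risk request
    and timeout_rates[i] its timeout rate; the formula is an assumption."""
    # expected extra time contributed by the at-risk requests
    expected_extra = sum(dt * r for dt, r in zip(buffers, timeout_rates))
    # spread over the non-target threads, damped by the first coefficient a
    return expected_extra / (a * (n_threads - x_targets))
```

For example, two at-risk requests with buffers of 0.5 and timeout rates of 10%, with a = 2, n = 4, and x = 1, give ΔT ≈ 0.0167 per non-target thread.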
And 505, allocating the to-be-processed requests to other threads according to the increment values of the slicing time corresponding to other threads.
In the process of allocating pending requests to any one of the other threads, starting from the 1st of the w groups, one pending request is taken out of each group in turn and allocated to the thread until the sum of the expected processing time consumption of the pending requests allocated to the thread is greater than the sum of the slicing time corresponding to the thread and the increment of the slicing time; allocation to the thread then stops, and the last pending request allocated to the thread is withdrawn. The specific allocation process is described in step 203 of the above embodiment.
If a thread is then allocated a pending request with a probability of timing out, execution starts again from step 504, and a new increment of the slicing time is calculated for all of the n threads (including the previous target thread) other than the thread that was allocated that pending request.
In step 506, when the second thread of the n threads finishes processing the pending request allocated to the second thread, it is detected whether there is any pending request allocated in the request queue.
The second thread is any one of the n threads. If there is an unallocated pending request in the request queue, step 507 is executed; if the request queue contains no unallocated pending request, the second thread waits for the other threads to finish processing.
And 507, distributing the request to be processed for the second thread according to the slicing time corresponding to the second thread.
The slicing time corresponding to the second thread is the slicing time without the added value.
Note that, in this step, if the second thread is allocated a pending request with a probability of timing out, execution starts again from step 504, and the increment of the slicing time corresponding to the threads other than the second thread among the n threads is calculated.
For details not disclosed in the embodiment of fig. 5, please refer to the embodiment of fig. 2.
In summary, according to the method provided in the embodiment of the present invention, the slicing time corresponding to each thread is obtained by calculating according to the sum of the expected processing time consumption of all the pending requests in the request queue and the number of threads for executing the pending requests, and the slicing time is used as a basis for allocating the pending requests to each thread, so that all the threads can process and complete the allocated pending requests in the same or similar time, and a situation that some threads delay processing for a longer time than other threads due to being allocated to pending requests that consume longer time is avoided, thereby reducing the total time consumption for processing all the pending requests in the request queue and improving the overall processing efficiency.
In addition, by allocating the added value of the slice time corresponding to the threads except the target thread, all threads can be further ensured to process and finish the allocated pending requests at the same time.
The following are embodiments of the apparatus of the present invention that may be used to perform embodiments of the method of the present invention. For details which are not disclosed in the embodiments of the apparatus of the present invention, reference is made to the embodiments of the method of the present invention.
Referring to fig. 7, a block diagram of a request distribution apparatus according to an embodiment of the present invention is shown. The device has the functions of realizing the method examples, and the functions can be realized by hardware or by hardware executing corresponding software. The apparatus may include: an acquisition module 710, a calculation module 720, and an assignment module 730.
An obtaining module 710, configured to obtain a total of predicted processing time consumptions of k to-be-processed requests in a request queue, where k is an integer greater than 1;
a calculating module 720, configured to calculate a slicing time corresponding to each thread of the n threads according to a sum of predicted processing time consumptions of the k to-be-processed requests and a number n of threads for executing the k to-be-processed requests, where n is an integer greater than 1;
an allocating module 730, configured to allocate to-be-processed requests to each of the n threads according to the slicing time corresponding to each of the n threads, where the number of to-be-processed requests allocated to an ith thread of the n threads is p, a sum of predicted processing time consumptions of the p to-be-processed requests is not greater than the slicing time corresponding to the ith thread, p is a positive integer smaller than k, and i is a positive integer smaller than or equal to n.
In summary, the apparatus provided in the embodiment of the present invention calculates the slicing time corresponding to each thread according to the sum of the expected processing time consumption of all pending requests in the request queue and the number of threads for executing the pending requests, and uses the slicing time as a basis for allocating the pending requests to each thread, so that all threads can process and complete the allocated pending requests in the same or similar time, and a situation that some threads delay processing for a longer time than other threads due to being allocated to pending requests that consume longer time is avoided, thereby reducing the total time consumption for processing all pending requests in the request queue and improving the overall processing efficiency.
In an optional embodiment provided based on the embodiment of fig. 7, the calculating module 720 includes:
the acquiring unit is used for acquiring the average value of the timeout rates of the k requests to be processed, wherein the timeout rate of the requests to be processed refers to the probability of the requests to be processed being overtime;
a calculating unit, configured to divide a sum of predicted processing time consumptions of the k to-be-processed requests by the number n of threads to obtain the slicing time when the average value is equal to 0;
the calculating unit is further configured to calculate a product of the number n of threads and a first coefficient a when the average value is greater than 0, and divide a sum of predicted processing time consumptions of the k to-be-processed requests by the product to obtain the slicing time, where a is greater than 1.
In another optional embodiment provided on the basis of the embodiment of fig. 7, the obtaining unit includes:
an obtaining subunit, configured to obtain historical statistical data of each type of request, where the historical statistical data of the target type of request includes: a statistical number of the target type of requests within a historical statistical period, and an actual processing time consumption of each of the target type of requests;
a calculating subunit, configured to determine a timeout rate of each type of request according to historical statistical data of each type of request and a timeout threshold of each type of request, where the timeout rate of the target type of request is equal to a quotient of the timeout number of the target type of request and the statistical number, and the timeout number of the target type of request is the number of requests in the target type of request, where the actual processing time consumption is greater than the timeout threshold of the target type of request;
the calculating subunit is further configured to calculate an average value of the timeout rates of the k to-be-processed requests according to the respective types of the k to-be-processed requests and the timeout rate of each type of request.
In another optional embodiment provided based on the embodiment of fig. 7, the historical statistics of the requests of the target type further include: a device load state corresponding to each request of the target type, where the device load state includes a normal load state and a high load state;
the timeout rate for requests of the target type includes: a timeout rate of the target type of request in the normal load state, and a timeout rate of the target type of request in the high load state.
In another optional embodiment provided based on the embodiment of fig. 7, the allocating module 730 includes:
the dividing unit is used for dividing the k to-be-processed requests into w groups, wherein each group comprises a plurality of to-be-processed requests, the minimum value of the predicted processing time consumption of all to-be-processed requests in the jth group in the w groups is larger than or equal to the maximum value of the predicted processing time consumption of all to-be-processed requests in the jth +1 group, w is an integer larger than 1, and j is a positive integer smaller than w;
an allocating unit, configured to: starting from the 1st of the w groups, take one pending request out of each group in turn and allocate it to a first thread; after a pending request taken out of the u-th group is allocated to the first thread, if the sum of the expected processing time consumption of the pending requests allocated to the first thread is greater than the slicing time corresponding to the first thread, withdraw the pending request taken from the u-th group and take a pending request from the (u+1)-th group to allocate to the first thread; and so on, until a pending request taken from the last of the w groups is allocated to the first thread or withdrawn after being allocated to the first thread, where u is a positive integer less than w and the first thread is any one of the n threads.
In another alternative embodiment provided based on the embodiment of figure 7,
the calculating module 720 is further configured to, when a target thread exists among the n threads, calculate an increment of the slicing time corresponding to the threads other than the target thread among the n threads, where the target thread is a thread that has been allocated a to-be-processed request with a probability of a timeout occurring;
the allocating module 730 is further configured to allocate the pending requests to the other threads according to the added value of the slicing time corresponding to the other threads.
In another optional embodiment provided based on the embodiment of fig. 7, the calculating module 720 is configured to:
calculating the increment value delta T of the slice time corresponding to the threads except the target thread in the n threads according to the following formula:
ΔT = (Δt_1 × timeout_1 + Δt_2 × timeout_2 + ... + Δt_m × timeout_m) / (a × (n - x))
wherein m is the number of to-be-processed requests with a probability of timing out among the to-be-processed requests already allocated to the n threads, and m is a positive integer less than or equal to k; Δt is a preset buffer value for each of the m to-be-processed requests; timeout is the timeout rate of each of the m to-be-processed requests; a is the first coefficient; and x is the number of target threads.
In another optional embodiment provided based on the embodiment of fig. 7, the apparatus further comprises: a detection module (not shown);
the detection module is configured to detect whether an unallocated pending request still exists in the request queue when a second thread of the n threads completes processing of a pending request allocated to the second thread;
the allocating module 730 is further configured to, when an unallocated pending request still exists in the request queue, allocate the pending request to the second thread according to the slice time corresponding to the second thread.
It should be noted that: the request distribution device provided in the foregoing embodiment is only illustrated by dividing the functional modules, and in practical applications, the function distribution may be completed by different functional modules according to needs, that is, the internal structure of the server is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the embodiments of the request allocating apparatus and the request allocating method provided by the foregoing embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments in detail and are not described herein again.
Referring to fig. 8, a schematic structural diagram of a computer device according to an embodiment of the present invention is shown. The computer device is used for implementing the request distribution method provided in the above embodiment. Specifically, the method comprises the following steps:
the computer device 800 includes a Central Processing Unit (CPU)801, a system memory 804 including a Random Access Memory (RAM)802 and a Read Only Memory (ROM)803, and a system bus 805 connecting the system memory 804 and the central processing unit 801. The computer device 800 also includes a basic input/output system (I/O system) 806, which facilitates transfer of information between devices within the computer, and a mass storage device 807 for storing an operating system 813, application programs 814, and other program modules 815.
The basic input/output system 806 includes a display 808 for displaying information and an input device 809 such as a mouse, keyboard, etc. for user input of information. Wherein the display 808 and the input device 809 are connected to the central processing unit 801 through an input output controller 810 connected to the system bus 805. The basic input/output system 806 may also include an input/output controller 810 for receiving and processing input from a number of other devices, such as a keyboard, mouse, or electronic stylus. Similarly, input-output controller 810 also provides output to a display screen, a printer, or other type of output device.
The mass storage device 807 is connected to the central processing unit 801 through a mass storage controller (not shown) connected to the system bus 805. The mass storage device 807 and its associated computer-readable media provide non-volatile storage for the computer device 800. That is, the mass storage device 807 may include a computer-readable medium (not shown) such as a hard disk or CD-ROM drive.
Without loss of generality, the computer-readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes RAM, ROM, EPROM, EEPROM, flash memory or other solid state memory technology, CD-ROM, DVD, or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices. Of course, those skilled in the art will appreciate that the computer storage media is not limited to the foregoing.
The system memory 804 and mass storage 807 described above may be collectively referred to as memory. The memory also includes at least one instruction, at least one program, set of codes, or set of instructions, wherein the at least one instruction, at least one program, set of codes, or set of instructions is stored in the memory and configured to be executed by one or more processors to implement the request allocation method described above.
The computer device 800 may also operate as a remote computer connected to a network via a network, such as the internet, in accordance with various embodiments of the invention. That is, the computer device 800 may be connected to the network 812 through the network interface unit 811 coupled to the system bus 805, or may be connected to other types of networks or remote computer systems (not shown) using the network interface unit 811.
In an exemplary embodiment, a computer readable storage medium is also provided, in which at least one instruction, at least one program, code set, or set of instructions is stored, which is loaded and executed by a processor of an electronic device to implement the above request distribution method. Alternatively, the computer-readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program product is also provided for implementing the request distribution method described above when executed.
It should be understood that reference to "a plurality" herein means two or more. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. As used herein, the terms "first," "second," and the like, do not denote any order, quantity, or importance, but rather are used to distinguish one element from another.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (8)

1. A method of request distribution, the method comprising:
acquiring the sum of predicted processing time consumption of k to-be-processed requests in a request queue, wherein k is an integer larger than 1;
obtaining historical statistical data of each type of request, wherein the historical statistical data of the target type of request includes: a statistical number of the target type of requests within a historical statistical period, and an actual processing time consumption of each of the target type of requests;
respectively determining the timeout rate of each type of request according to the historical statistical data of each type of request and the timeout threshold of each type of request, wherein the timeout rate of the target type of request is equal to the quotient of the timeout quantity of the target type of request and the statistical quantity, and the timeout quantity of the target type of request refers to the quantity of requests which are in the target type of request and have the actual processing time consumption larger than the timeout threshold of the target type of request;
calculating the average value of the timeout rates of the k requests to be processed according to the respective types of the k requests to be processed and the timeout rate of each type of request, wherein the timeout rate of the requests to be processed refers to the probability of the requests to be processed being overtime;
if the average value is equal to 0, dividing the sum of the predicted processing time consumption of the k to-be-processed requests by the number n of threads to obtain the slicing time corresponding to each thread in the n threads, wherein the n is an integer greater than 1;
if the average value is larger than 0, calculating the product of the number n of the threads and a first coefficient a, and dividing the sum of the predicted processing time consumption of the k requests to be processed by the product to obtain the slicing time, wherein a is larger than 1;
and respectively allocating a to-be-processed request to each thread of the n threads according to the slicing time corresponding to each thread of the n threads, wherein the number of the to-be-processed requests allocated to the ith thread of the n threads is p, the sum of the predicted processing time consumption of the p to-be-processed requests is not more than the slicing time corresponding to the ith thread, p is a positive integer smaller than k, and i is a positive integer smaller than or equal to n.
2. The method of claim 1, wherein the historical statistical data of the target type of request further comprises: a device load state corresponding to each request of the target type, wherein the device load state comprises a normal load state and a high load state;
the timeout rate of the target type of request comprises: a timeout rate of the target type of request in the normal load state, and a timeout rate of the target type of request in the high load state.
3. The method according to claim 1 or 2, wherein the respectively allocating to-be-processed requests to each of the n threads according to the slicing time corresponding to each of the n threads comprises:
dividing the k to-be-processed requests into w groups, wherein each group comprises a plurality of to-be-processed requests, the minimum value of the predicted processing time consumption of all to-be-processed requests in the jth group of the w groups is greater than or equal to the maximum value of the predicted processing time consumption of all to-be-processed requests in the (j+1)th group, w is an integer greater than 1, and j is a positive integer smaller than w;
starting from the 1st group of the w groups, taking one to-be-processed request out of each group in turn and allocating it to a first thread; after a to-be-processed request taken from the uth group is allocated to the first thread, if the sum of the predicted processing time consumption of the to-be-processed requests allocated to the first thread is greater than the slicing time corresponding to the first thread, withdrawing the to-be-processed request taken from the uth group and taking a to-be-processed request from the (u+1)th group to allocate to the first thread, and so on, until a to-be-processed request taken from the last of the w groups is allocated to the first thread or withdrawn after being allocated to the first thread, wherein u is a positive integer smaller than w, and the first thread is any one of the n threads.
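The per-thread fill of claim 3 can be sketched as follows. Groups are ordered by descending predicted cost, and the claim's "allocate, then withdraw if over budget" step is expressed here as a check before assignment, which is behaviorally equivalent; all identifiers are hypothetical:

```python
# Sketch of the grouped fill of claim 3 (hypothetical names).

def fill_thread(groups, slice_t):
    """groups: list of lists of predicted costs, group j holding only
    costs >= every cost in group j+1. Pops assigned costs in place.
    Returns (assigned costs, total time used)."""
    assigned, used = [], 0.0
    for group in groups:
        # try this group's next request; if it would exceed the
        # slicing time, fall through to the next (cheaper) group
        if group and used + group[0] <= slice_t:
            cost = group.pop(0)
            assigned.append(cost)
            used += cost
    return assigned, used
```

For groups [[5, 5], [3, 3], [1, 1]] and a slicing time of 9, the thread receives 5, 3, and 1; with a slicing time of 6 the 3 is skipped (withdrawn, in the claim's terms) and the cheaper 1 is taken instead.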
4. The method according to claim 1 or 2, wherein after the to-be-processed requests are allocated to each of the n threads according to the slicing time corresponding to each of the n threads, the method further comprises:
when a target thread exists among the n threads, calculating an increase value of the slicing time corresponding to the threads other than the target thread among the n threads, wherein the target thread refers to a thread to which a to-be-processed request having a probability of timing out has been allocated;
and allocating to-be-processed requests to the other threads according to the increase values of the slicing time corresponding to the other threads.
5. The method of claim 4, wherein the calculating the increase value of the slicing time corresponding to the threads other than the target thread among the n threads comprises:
calculating the increase value ΔT of the slicing time corresponding to the threads other than the target thread among the n threads according to the following formula:
(formula presented in the original only as image FDA0002537500830000031)
wherein m is the number of to-be-processed requests having a probability of timing out among the to-be-processed requests already allocated to the n threads, and m is a positive integer smaller than or equal to k; Δt is a preset buffer value for each of the m to-be-processed requests; timeout is the timeout rate of each of the m to-be-processed requests; a is the first coefficient; and x is the number of target threads.
6. The method according to claim 1 or 2, characterized in that the method further comprises:
when a second thread of the n threads finishes processing the to-be-processed requests allocated to it, detecting whether an unallocated to-be-processed request still exists in the request queue;
and if an unallocated to-be-processed request exists in the request queue, allocating the to-be-processed request to the second thread according to the slicing time corresponding to the second thread.
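The re-allocation of claim 6 can be sketched as a thread draining the shared queue once its batch is done, taking further requests only while each one's predicted cost still fits within that thread's slicing time. All identifiers are hypothetical:

```python
# Sketch of the post-completion re-allocation of claim 6.

def drain_queue(queue, slice_t):
    """queue: list of (request_id, predicted_cost) of unallocated
    requests, mutated in place. Returns the ids taken by the thread."""
    taken, used = [], 0.0
    while queue and used + queue[0][1] <= slice_t:
        rid, cost = queue.pop(0)
        taken.append(rid)
        used += cost
    return taken
```

For a queue [("a", 2), ("b", 3), ("c", 4)] and a slicing time of 5, the finished thread takes "a" and "b" and leaves "c" for the next thread to free up.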
7. A request distribution apparatus, characterized in that the apparatus comprises:
an obtaining module, configured to obtain the sum of the predicted processing time consumption of k to-be-processed requests in a request queue, where k is an integer greater than 1;
the obtaining module is further configured to obtain historical statistical data of each type of request, where the historical statistical data of the target type of request includes: a statistical number of the target type of requests within a historical statistical period, and an actual processing time consumption of each of the target type of requests;
a calculating module, configured to respectively determine a timeout rate of each type of request according to the historical statistical data of each type of request and a timeout threshold of each type of request, where the timeout rate of the target type of request is equal to the quotient of the timeout number of the target type of request divided by the statistical number, and the timeout number of the target type of request is the number of requests of the target type whose actual processing time consumption is greater than the timeout threshold of the target type of request;
the calculating module is further configured to calculate an average value of timeout rates of the k to-be-processed requests according to respective types of the k to-be-processed requests and a timeout rate of each type of request, where the timeout rate of the to-be-processed requests refers to a probability that the to-be-processed requests are overtime;
the computing module is further configured to, when the average value is equal to 0, divide a sum of predicted processing time consumptions of the k to-be-processed requests by a number n of threads to obtain a slicing time corresponding to each of the n threads, where n is an integer greater than 1;
the calculating module is further configured to calculate a product of the number n of threads and a first coefficient a when the average value is greater than 0, and divide a sum of predicted processing time consumptions of the k to-be-processed requests by the product to obtain the slicing time, where a is greater than 1;
an allocation module, configured to respectively allocate to-be-processed requests to each of the n threads according to the slicing time corresponding to each of the n threads, where the number of to-be-processed requests allocated to an ith thread of the n threads is p, the sum of the predicted processing time consumption of the p to-be-processed requests is not greater than the slicing time corresponding to the ith thread, p is a positive integer smaller than k, and i is a positive integer smaller than or equal to n.
8. A computer-readable storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the request distribution method of any one of claims 1 to 6.
CN201710944087.1A 2017-09-30 2017-09-30 Request distribution method and device Active CN107704322B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710944087.1A CN107704322B (en) 2017-09-30 2017-09-30 Request distribution method and device


Publications (2)

Publication Number Publication Date
CN107704322A CN107704322A (en) 2018-02-16
CN107704322B true CN107704322B (en) 2020-08-25

Family

ID=61185021

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710944087.1A Active CN107704322B (en) 2017-09-30 2017-09-30 Request distribution method and device

Country Status (1)

Country Link
CN (1) CN107704322B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108810130B (en) * 2018-06-01 2022-03-11 北京京东尚科信息技术有限公司 Method and device for planning distribution request
WO2019238128A1 (en) * 2018-06-14 2019-12-19 Shanghai United Imaging Healthcare Co., Ltd. Methods and systems for image processing
CN109948818A (en) * 2019-02-18 2019-06-28 天津五八到家科技有限公司 Dispatching request distribution method, device, equipment and storage medium
CN113326114A (en) * 2021-06-11 2021-08-31 深圳前海微众银行股份有限公司 Batch task processing method and device
CN113743628A (en) * 2021-09-18 2021-12-03 重庆允成互联网科技有限公司 Maintenance response timeliness rate calculation method, device, computer equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102147746A (en) * 2010-03-05 2011-08-10 微软公司 Dynamic thread pool management
CN102594891A (en) * 2012-02-17 2012-07-18 中国科学院计算技术研究所 Method and system for processing remote procedure call request
CN106354552A (en) * 2015-07-17 2017-01-25 宁波森浦融讯科技有限公司 Parallel Computer Task Distribution Method and Device



Similar Documents

Publication Publication Date Title
CN107704322B (en) Request distribution method and device
CN108446176B (en) Task allocation method, computer readable storage medium and terminal device
CN107688492B (en) Resource control method and device and cluster resource management system
JP6241300B2 (en) Job scheduling apparatus, job scheduling method, and job scheduling program
CN107430529B (en) Load balancing device for large-scale memory database
US10884667B2 (en) Storage controller and IO request processing method
CN109086135B (en) Resource scaling method and device, computer equipment and storage medium
US20160378557A1 (en) Task allocation determination apparatus, control method, and program
US10846144B2 (en) Multistep automated scaling for cluster containers
CN106657327A (en) Message pushing method and message pushing device
KR20110075295A (en) Job allocation method on multi-core system and apparatus thereof
US10114866B2 (en) Memory-constrained aggregation using intra-operator pipelining
CN111813523A (en) Duration pre-estimation model generation method, system resource scheduling method, device, electronic equipment and storage medium
KR20110080735A (en) Computing system and method
US11983564B2 (en) Scheduling of a plurality of graphic processing units
EP2854029A2 (en) Information processing system, management device control program, and control method of information processing system
CN111064746A (en) Resource allocation method, device, equipment and storage medium
JP6237170B2 (en) Allocation determination apparatus, control method, and program
CN105303307B (en) Method and device for distributing work tasks
CN110178119B (en) Method, device and storage system for processing service request
CN109347982A (en) A kind of dispatching method and device of data center
CN107634978B (en) Resource scheduling method and device
CN112181498A (en) Concurrency control method, device and equipment
US20140047454A1 (en) Load balancing in an sap system
US9894670B1 (en) Implementing adaptive resource allocation for network devices

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 298, zone 0, 2 / F, 158 Shuanglian Road, Qingpu District, Shanghai 201700

Applicant after: Shanghai Elephant Jintai Technology Co., Ltd.

Address before: Room 298, Zone O, 2/F, No. 158 Shuanglian Road, Qingpu District, Shanghai 201700

Applicant before: SHANGHAI TIANXI INFORMATION TECHNOLOGY Co.,Ltd.

GR01 Patent grant