RU2296362C1

RU2296362C1 - Method for servicing varying priority requests from users of computer system

Info

Publication number: RU2296362C1
Application number: RU2005129370/09A
Authority: RU
Inventors: Анатолий Афанасьевич Андриенко (RU); Валентин Эдуардович Гель (RU); Галина Сергеевна Колбасова (RU); Роман Викторович Максимов (RU); Антон Владимирович Павловский (RU); Геннадий Юрьевич Стародубцев (RU); Юрий Иванович Стародубцев (RU)
Original assignee: Военная академия связи
Priority date: 2005-09-20
Filing date: 2005-09-20
Publication date: 2007-03-27

Abstract

FIELD: digital communication systems and computer systems, possible use in automated information processing systems.

SUBSTANCE: according to the method, the parameters of the computing system are set, user requests are generated and received, and a request service queue is formed on each server according to user priorities. The computing resource required to service the received requests is then calculated, together with the difference ΔQ_p between the maximum and required computing resources of the p-th server. When ΔQ_p ≥ 0, each server services the requests from the users assigned to it. When ΔQ_p < 0, the server services a portion of the user requests that does not exceed its maximum computing resource, and the remaining volume of requests of the p-th server is compared with the positive values ΔQ_n of the other servers. When ΔQ_n ≥ |ΔQ_p|, the excess volume of requests of the p-th server is transferred to the n-th server. When ΔQ_n < |ΔQ_p| on all servers, the excess requests of the p-th server are divided into parts ΔQ_pi, the resource required to service ΔQ_p1, ΔQ_p2, ..., ΔQ_pi is compared in turn with the positive values ΔQ_n of the other servers, and, based on the comparison, one or more parts ΔQ_pi of the divided resource are transferred for service.

EFFECT: increased probability of timely servicing of requests from users of different priorities within the dynamic range of possible variation in the number of requests, together with reduced spending on the computing resources of the servicing devices.

3 cl, 6 dwg

Description

The invention relates to digital communication systems and computer systems and can be applied in automated information processing and control systems to service requests from users of different priorities.

The claimed technical solution expands the range of means available for this purpose.

Methods of servicing requests are known, implemented, for example, in the device of USSR Inventor's Certificate No. 1441398, "Multichannel dynamic priority device", IPC G 06 F 9/46, published 30.11.88, in which the priority of a request is raised linearly at fixed time intervals; and in the device of USSR Inventor's Certificate No. 1562912, "Multichannel device with dynamic priority change", IPC G 06 F 9/46, published 07.05.90, in which the priority of a request is raised after each servicing of a request on another priority direction.

The disadvantage of these methods is the low probability that low-priority requests are served, because requests may remain in the service queue longer than the allowable waiting time.

The method closest in technical essence to the claimed one is the request servicing method of Russian Federation patent No. 2140666, "Method for servicing requests of users of a computing system and a device implementing it (variants)", IPC G 06 F 9/46, published 27.10.99 (Variant No. 1).

The prototype method is as follows. The parameters of the computing system are set in advance, including the codes of the maximum request waiting time, the number of users of different priorities, and the number of user priorities. User requests are generated and received, and a second-order service queue is formed according to user priorities. Requests whose maximum waiting time has expired are moved to the first-order queue. Requests are served from the first-order queue or, when it is empty, from the second-order queue.

Compared with the analogues, the prototype method increases the probability of serving requests from low-priority users because it takes into account the maximum time that requests may spend in the service queue.

The disadvantage of the prototype is the relatively high probability that user requests are not served in time within the dynamic range of possible variation in the number of requests; the lower the priority of a request, the higher this probability. The cause is the limited computing power of the server. Here and below, the term "server" means a servicing device that performs computing operations, i.e. services user requests. New requests from higher-priority users push the requests of low-priority users back to the end of the queue. However, using a server whose computing resource guarantees timely servicing of all received requests during periods of maximum load is not economically justified, because when the number of requests falls the computing power of an expensive server is not fully used.

The aim of the claimed technical solution is to develop a method for servicing requests from users of different priorities of a computing system that increases the probability of timely servicing of such requests within the dynamic range of possible variation in the number of requests, while keeping spending on the computing capacity of the servicing devices economically justified. This is achieved through distributed use of the computing resource of the system and dynamic redistribution of the incoming stream of user requests among servers with relatively modest computing resources and, consequently, low cost and minimal operating expenses.

This aim is achieved as follows. In the known method of servicing requests from users of different priorities of a computing system, the parameters of the computing system are set in advance, including M≥2 users of different priorities and Z≥2 user priorities; user requests are generated and received; a request service queue is formed according to user priorities; and the received requests are then served. In the claimed method, for a computing system that includes P≥2 servers, each with a maximum computing resource Q_mp, where p = 1, 2, ..., P is the server number, and each with M_p users of different priorities assigned to it, the generated user requests are received by the servers to which the users are assigned, and the queue of received requests is formed on each server.

Each server then calculates the computing resource Q_pn required to service the received requests (the subscript n in Q_pn denotes "necessary" and is not the server index n). The required computing resource Q_pn of the p-th server is the sum of the resources needed to carry out all requests received from users; in this application the per-request resources are assumed to be equal. For example, if 20 requests arrive at a server and each requires a computing resource q, the resource required to execute them is Q_pn = 20q. The difference ΔQ_p = Q_mp - Q_pn between the maximum resource Q_mp and the required resource Q_pn of the p-th server is then calculated, and the resulting differences ΔQ_p are stored.

On each server with ΔQ_p ≥ 0, the requests from the users assigned to it are served. With ΔQ_p < 0, the server serves a portion of the user requests that does not exceed its maximum computing resource Q_mp, and the excess volume of requests ΔQ_p remaining in the queue of the p-th server is compared with the positive values ΔQ_n of the other servers, where n = 1, 2, ..., P and n ≠ p. If a server with ΔQ_n ≥ |ΔQ_p| is found, the excess volume of requests ΔQ_p of the p-th server is transferred to the n-th server, which serves these requests. If ΔQ_n < |ΔQ_p| on all servers, the excess requests of the p-th server are divided into parts ΔQ_pi, where i = 1, 2, 3, ... is the ordinal number of a part of the divided resource ΔQ_p, each part comprising the remaining requests of one user of the p-th server. The resource required to service ΔQ_p1, ΔQ_p2, ..., ΔQ_pi, ... is then compared in turn with the positive values ΔQ_n of the other servers, and, based on the comparison, one or more parts ΔQ_pi whose combined service resource does not exceed ΔQ_n of the n-th server are transferred for service.

If ΔQ_pi > ΔQ_n on all servers, each server recalculates its ΔQ_p while servicing its own requests. Once the p-th server has enough spare resource that ΔQ_p ≥ ΔQ_pi, the requests ΔQ_pi are served on the p-th server; once the condition ΔQ_n ≥ ΔQ_pi is met on the n-th server, the requests ΔQ_pi are transferred to it for service. The transfer of excess request volumes ΔQ_p, or of their parts ΔQ_pi, to the n-th server is repeated until the requests of all users have been accepted for service.

When the queue for servicing user requests is formed, among requests from users of equal priority, the requests of the user with the larger number of requests are scheduled for service first.

Equal-priority requests of one user are served on one server.

Owing to this new set of essential features, the claimed method makes it possible to redistribute the changing stream of requests from users of different priorities among the servers that make up the distributed computing system, within the dynamic range of possible variation in the number of requests. This increases the probability of timely servicing of user requests while reducing the cost of the computing system as a whole.

The analysis of the prior art established that there are no analogues characterized by a set of features identical to all the features of the claimed technical solution, which indicates that the claimed solution meets the patentability condition of "novelty". A search for known solutions in this and related fields, carried out to identify features matching those that distinguish the claimed object from the prototype, showed that they do not follow explicitly from the prior art. Nor does the prior art reveal any known influence of the transformations provided for by the essential features of the claimed invention on the achievement of the stated technical result. The claimed invention therefore meets the patentability condition of "inventive step".

The claimed method is illustrated by the drawings, which show:

Fig. 1 - the structure of the computing system;

Fig. 2 - the distribution of the users of the computing system among the computing servers;

Fig. 3 - a flowchart of the algorithm implementing the claimed method of servicing requests from users of different priorities of the computing system;

Fig. 4 - the layout of the experiment;

Fig. 5 - the mathematical expectations of the number of processed requests.

It is known that the main requirement for modern computing systems that service requests from users of different priorities is timeliness of service (in accordance with the established priorities) or, in other words, minimizing the probability that a request of any priority is not fulfilled within the established time frame.

One way to solve this problem is to increase the computing power of the system (its server), but with sharply varying request flows within the dynamic range of possible variation in their number this often leads to economically unacceptable costs. Simply increasing the number of servers within known approaches does not solve the problem either, because when each server is used in the traditional way it can become overloaded with user requests, which leads to untimely service.

There is thus a contradiction between the requirement to achieve the highest possible probability of timely servicing of requests from users of different priorities when the number of simultaneously arriving requests varies widely, and the requirement to optimize (minimize) the economic costs of installing and operating the computing system.

One promising direction for building computing systems that largely resolves this contradiction is dynamic regulation of the flow of requests from users of different priorities across a computing resource distributed among groups of users.

The claimed technical solution implements this approach, which can be explained as follows.

Initially, in a computing system containing P≥2 servers (server No. 1, server No. 2, ..., server No. P in Fig. 1), each with a maximum computing resource Q_mp, where p = 1, 2, ..., P is the server number, and M≥2 users of different priorities with a total of Z≥2 priorities, the users are distributed among the servers. To do this, an approximately equal number M_p of users of different priorities is assigned to each p-th server (Fig. 2).
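The description only requires that each server receive an approximately equal number of users. A minimal sketch of one way to do that is round-robin assignment; the function name `assign_users` and the user labels are hypothetical and not part of the patent.

```python
def assign_users(user_ids, num_servers):
    """Assign users to servers round-robin (an assumed policy).

    Group sizes differ by at most one, which satisfies the
    "approximately equal number M_p" condition.
    """
    if num_servers < 2:
        raise ValueError("the method assumes P >= 2 servers")
    groups = [[] for _ in range(num_servers)]
    for index, user in enumerate(user_ids):
        groups[index % num_servers].append(user)
    return groups


# Five hypothetical users spread over two servers.
groups = assign_users(["u1", "u2", "u3", "u4", "u5"], 2)
```

Any other policy that keeps the group sizes close would serve equally well; the text does not say how priorities should be mixed within a group.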

When the maximum computing resource of the individual servers is determined, their total computing resource is chosen within 70-80% of the computing resource required under peak request load, which is determined from system operating statistics. This minimizes the cost of the servers used in the system. Choosing a larger maximum computing resource raises the probability of timely service only slightly while raising the cost significantly. Conversely, if the maximum resource is chosen below 70% of that required under peak request load, the probability of timely service drops sharply, although the cost of the servers is low.
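As a rough illustration of this sizing rule, the sketch below computes the 70-80% band for the total resource and splits it evenly across servers. The even split, the function name `capacity_bounds`, and the sample numbers are assumptions; the description fixes only the 70-80% band for the total.

```python
def capacity_bounds(peak_demand, num_servers):
    """Return (low, high) per-server capacity for the 70-80% rule.

    Assumes the total resource is divided equally among servers.
    """
    low_total = peak_demand * 70 / 100
    high_total = peak_demand * 80 / 100
    return low_total / num_servers, high_total / num_servers


# Hypothetical peak demand of 1000 resource units served by 4 servers.
per_server_low, per_server_high = capacity_bounds(peak_demand=1000, num_servers=4)
```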

The users assigned to the respective servers then generate requests, and the requests are received by the corresponding servers (blocks 2 and 3 in Fig. 3). A queue of received requests is then formed on each p-th server according to user priorities (block 4 in Fig. 3); among requests from users of equal priority, the requests of the user with the larger number of requests are scheduled for service first.

The request queue is formed as follows. Packets containing user requests are received from the communication channel, and the identifiers of the users (senders) are extracted from them. The priority of each user (sender) is determined from the identifier using a table that maps user identifiers to priorities. The content of the request is then stored in an array. The received requests are grouped in the arrays according to user priority, and the requests of one user are kept together as one subgroup. This is necessary so that the requests of one user are served on one server.
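The queue-forming procedure can be sketched roughly as below. The packet layout `(user_id, payload)`, the convention that a smaller number means a higher priority, and the names `build_queue` and `priority_of` are assumptions made for illustration. The tie-break among equal-priority users (more requests first) follows the rule stated for queue formation.

```python
from collections import defaultdict


def build_queue(packets, priority_of):
    """Group requests per user and order the users for service.

    packets: list of (user_id, payload) tuples (assumed layout).
    priority_of: dict user_id -> priority, 1 = highest (assumed).
    """
    per_user = defaultdict(list)
    for user_id, payload in packets:
        per_user[user_id].append(payload)  # one subgroup per user
    # Higher priority first; among equals, the user with more requests first.
    ordered = sorted(per_user, key=lambda u: (priority_of[u], -len(per_user[u])))
    return [(user, per_user[user]) for user in ordered]


packets = [("bob", "r1"), ("alice", "r2"), ("carol", "r3"),
           ("bob", "r4"), ("carol", "r5"), ("carol", "r6")]
priority_of = {"alice": 1, "bob": 2, "carol": 2}
queue = build_queue(packets, priority_of)
```

Keeping each user's requests as a single `(user, requests)` entry makes it straightforward to move a whole user to another server later.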

The computing resource Q_pn required to service the received requests is then calculated on each server (block 5 in Fig. 3). The difference ΔQ_p = Q_mp - Q_pn between the maximum resource Q_mp and the required resource Q_pn of the p-th server is calculated to assess whether the computing resource of the p-th server is sufficient to service all received requests. The resulting differences ΔQ_p are stored on each server (blocks 6 and 7 in Fig. 3).
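A small worked sketch of this step, assuming (as the description does) that every request costs the same resource q. The function name `resource_margin`, the server labels, and the capacities are hypothetical.

```python
def resource_margin(max_resource, num_requests, cost_per_request):
    """Return delta_Q = Q_max - Q_required for one server."""
    required = num_requests * cost_per_request  # e.g. 20 requests * q = 20q
    return max_resource - required


# Two hypothetical servers, each with capacity 25q, measured in units of q.
margins = {
    "server1": resource_margin(max_resource=25, num_requests=20, cost_per_request=1),
    "server2": resource_margin(max_resource=25, num_requests=30, cost_per_request=1),
}
```

Here `server1` has spare resource (positive margin) and `server2` is overloaded (negative margin), which are the two cases the method distinguishes next.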

On each p-th server, when the server's resources are sufficient to service all received requests (ΔQ_p ≥ 0), the requests from the users assigned to it are served (block 15 in Fig. 3). When the resources of one or more servers are insufficient (ΔQ_p < 0), the server serves a portion of the user requests that does not exceed its maximum computing resource Q_mp. The served portion can be smaller than the maximum resource allows, because the requests of one user must be served on one server: the requests accepted so far may use less than Q_mp, yet adding the requests of one more user would push the required resource above Q_mp. The excess volume of requests ΔQ_p remaining in the queue of the p-th server is compared with the positive values ΔQ_n of the other servers, where n = 1, 2, ..., P and n ≠ p (block 9 in Fig. 3). Requests are served in queue order, starting with the highest priorities, so that the requests left over are those of lower priority.
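The partial-service step can be sketched as below, under two assumptions: requests cost one unit each, and the server stops accepting at the first user whose requests no longer fit, so that everything after that point becomes excess. The name `split_local_and_excess` and the sample queue are hypothetical.

```python
def split_local_and_excess(queue, max_resource, cost_per_request=1):
    """Split an ordered queue into locally served users and excess.

    queue: list of (user_id, requests), highest priority first.
    Whole users are accepted in order while they fit in max_resource.
    """
    local, excess, used = [], [], 0
    overflow = False
    for user_id, requests in queue:
        need = len(requests) * cost_per_request
        if not overflow and used + need <= max_resource:
            local.append((user_id, requests))
            used += need
        else:
            overflow = True  # keep lower-priority users in the excess
            excess.append((user_id, requests))
    return local, excess


queue = [("alice", ["a1", "a2", "a3"]),
         ("bob", ["b1", "b2", "b3", "b4"]),
         ("carol", ["c1", "c2"])]
local, excess = split_local_and_excess(queue, max_resource=6)
```

In this example only three of the six available units are used locally, which illustrates why the served portion can be smaller than the maximum resource allows.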

If a server with sufficient free computing resource is found (ΔQ_n ≥ |ΔQ_p|), the excess volume of requests ΔQ_p of the p-th server is transferred to the n-th server for service (block 16 in Fig. 3).
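The search for a receiving server might look like the sketch below. The description does not say which server to prefer when several qualify, so taking the first match is an assumption, as are the name `find_target` and the sample margins.

```python
def find_target(margins, overloaded):
    """Return a server n with delta_Q_n >= |delta_Q_p|, or None.

    margins: dict server -> delta_Q (positive means free resource).
    overloaded: the server p whose margin is negative.
    """
    deficit = -margins[overloaded]  # |delta_Q_p|
    for server, margin in margins.items():
        if server != overloaded and margin > 0 and margin >= deficit:
            return server  # first fit (assumed policy)
    return None


target = find_target({"s1": -5, "s2": 3, "s3": 8}, overloaded="s1")
no_target = find_target({"s1": -10, "s2": 3, "s3": 8}, overloaded="s1")
```

A `None` result corresponds to the case where no single server can absorb the whole excess, so the excess has to be split into per-user parts.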

If no server has a free computing resource ΔQ_n that covers the excess volume of requests ΔQ_p of the p-th server, the excess requests are divided (block 11 in FIG. 3) into parts ΔQ_pi, where i = 1, 2, 3, ... is the serial number of a part of the divided resource ΔQ_p, each part being the resource required to service the remaining requests of one user of the p-th server. That is, each part ΔQ_pi contains the requests of only one user. The resource required to service ΔQ_p1, ΔQ_p2, ..., ΔQ_pi, ... is then compared in turn with the positive values ΔQ_n of the other servers, and based on the comparison one or more parts ΔQ_pi of the divided resource ΔQ_p, whose total servicing resource does not exceed ΔQ_n of the n-th server, are transferred for servicing (block 16 in FIG. 3). For example, suppose the n-th server has a free resource ΔQ_n and the requests awaiting service are divided into parts ΔQ_p1 ≤ ΔQ_p2 ≤ ΔQ_p3. If the conditions (ΔQ_p1 + ΔQ_p2) ≤ ΔQ_n, ΔQ_p2 ≤ ΔQ_n and ΔQ_p3 ≤ ΔQ_n hold, while at the same time (ΔQ_p1 + ΔQ_p3) ≥ ΔQ_n and (ΔQ_p2 + ΔQ_p3) ≥ ΔQ_n, then ΔQ_p1, or ΔQ_p2, or ΔQ_p3, or (ΔQ_p1 + ΔQ_p2) may be transferred to the n-th server for servicing. Of these, the maximum possible volume of requests from the p-th server is transferred to the n-th server; in this example that is the volume (ΔQ_p1 + ΔQ_p2). The remaining parts of the divided excess resource ΔQ_p are transferred to other servers.
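The selection of the maximum transferable volume can be sketched in Python as follows. This is an illustrative exhaustive search, not a procedure taken from the description; the function name and the numeric values are assumptions chosen to satisfy the conditions of the example (parts 4 ≤ 5 ≤ 8 and a free resource of 10).

```python
from itertools import combinations
from typing import List, Tuple


def best_parts(parts: List[float], free: float) -> Tuple[int, ...]:
    """Return the indices of the parts whose total is as large as possible
    without exceeding the free resource of the receiving server.

    Every combination is checked, which is adequate for a handful of
    per-user parts; an empty tuple means no part fits.
    """
    best: Tuple[int, ...] = ()
    best_total = 0.0
    for k in range(1, len(parts) + 1):
        for idx in combinations(range(len(parts)), k):
            total = sum(parts[i] for i in idx)
            if best_total < total <= free:
                best, best_total = idx, total
    return best


parts = [4, 5, 8]   # hypothetical Delta Q_p1 <= Delta Q_p2 <= Delta Q_p3
free = 10           # hypothetical Delta Q_n of the n-th server
chosen = best_parts(parts, free)
```

With these values the first two parts are chosen (total 9), matching the example in which (ΔQ_p1 + ΔQ_p2) is transferred and the third part is left for another server.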

If individual parts of the excess volume of requests exceed the free computing resource on every server (ΔQ_pi ≥ ΔQ_n), each server recalculates ΔQ_p while servicing its own requests. When the free resource on the p-th server reaches the level needed for at least one of the parts ΔQ_pi, i.e. when ΔQ_p ≥ ΔQ_pi, the excess requests ΔQ_pi are serviced on that server (block 15 in FIG. 3). When the required level of free resource is reached on the n-th server (ΔQ_n ≥ ΔQ_pi), the excess requests ΔQ_pi are transferred to the n-th server for servicing (block 16 in FIG. 3). The steps of transferring excess volumes of requests ΔQ_p or their parts ΔQ_pi to the n-th server and/or accepting them for servicing on the p-th server are repeated until the requests of all users have been accepted for service.
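One pass of this repeated placement can be sketched in Python. The function name, the dictionary layout and the preference for the home server over other servers are assumptions for illustration; the description states only the two conditions (ΔQ_p ≥ ΔQ_pi and ΔQ_n ≥ ΔQ_pi) and that the steps are repeated.

```python
from typing import Dict, List, Tuple


def place_pending(
    pending: List[float], free: Dict[str, float], home: str
) -> Tuple[List[Tuple[float, str]], List[float]]:
    """Try once to place each waiting part Delta Q_pi.

    pending: sizes of the parts still waiting for service
    free:    recomputed free resource per server (mutated in place)
    home:    key of the overloaded server p
    Returns (placements, still_pending).
    """
    placements: List[Tuple[float, str]] = []
    still: List[float] = []
    for part in pending:
        # Assumption: the home server (block 15) is tried before the others (block 16).
        candidates = [home] + [s for s in free if s != home]
        target = next((s for s in candidates if free[s] >= part), None)
        if target is None:
            still.append(part)          # keep waiting for a later recalculation
        else:
            free[target] -= part        # reserve the resource on the chosen server
            placements.append((part, target))
    return placements, still


# First pass: nothing fits anywhere (hypothetical values).
placed_1, waiting = place_pending([6, 9], {"p": -3, "n1": 4, "n2": 5}, "p")
# Later pass, after the servers have recalculated their free resources.
placed_2, waiting_2 = place_pending(waiting, {"p": 7, "n1": 4, "n2": 10}, "p")
```

Calling the function again after each recalculation, until the list of waiting parts is empty, corresponds to repeating the steps until all requests are accepted for service.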

The feasibility of achieving the stated technical result was verified by simulation modeling of the computing system in the "MathCad 2003" software environment.

A generalized scheme of the experiment is shown in FIG. 4.

The model of the computing system comprised three servers with specified maximum computing capacities Q_mp = 50, 60 and 70, respectively. Ten users of differing priority were assigned to each server. Each server was assigned the same number of equal-priority users of the 1st, 2nd and 3rd priorities, as shown in FIG. 2, and the users were grouped into three segments of the computing system.

The model provided for the parallel execution of three mutually independent processes:

a process modeling the background of tasks that the server performs besides servicing requests (simulated by a stream of external influences (λ_vv) on the service device);

a process modeling the loading of the server with service requests;

a process modeling the functioning of the server in servicing requests of different-priority users of the computing system.

The process modeling the loading of the server with service requests is simulated by an incoming stream of service requests with the following characteristics:

the arrival rate of requests λ, which varies in time;

the distribution law of the request arrival times f(Δt), taken to be exponential, because it produces the most severe conditions for servicing requests, and because with a large number of request sources, as in a real computing system, the aggregate stream tends to the simplest (Poisson) stream with an exponential distribution, regardless of the distribution laws of the individual sources (see Ivanov E.V. Simulation Modeling of Communication and Automation Facilities and Complexes. St. Petersburg: VAS, 1992. 206 p.).
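An incoming stream with exponentially distributed inter-arrival times can be generated as in the following Python sketch. For an exponential law with rate λ the density of the interval Δt between arrivals is λ·exp(−λΔt). The function name, the fixed rate and the seed are assumptions for illustration; the model described here lets λ vary in time, which this simplified sketch does not do.

```python
import random
from typing import List, Optional


def arrival_times(rate: float, horizon: float, seed: Optional[int] = None) -> List[float]:
    """Generate arrival instants on (0, horizon] with exponentially
    distributed gaps of mean 1/rate (a simplest, i.e. Poisson, stream).

    Assumption: the rate is constant over the whole horizon.
    """
    rng = random.Random(seed)
    t = 0.0
    times: List[float] = []
    while True:
        t += rng.expovariate(rate)   # next inter-arrival interval
        if t > horizon:
            return times
        times.append(t)


# About rate * horizon = 500 requests are expected on average.
requests = arrival_times(rate=5.0, horizon=100.0, seed=1)
```

A time-varying arrival rate could be approximated by calling the generator on consecutive intervals with different rates.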

The process modeling the functioning of the server is simulated by the operation of a service device, i.e. a device or set of devices that performs the servicing and dispatching of requests. The service device is thus a set of service channels, with one service channel corresponding to one server. The service device is characterized by the number of service channels, the service rate (μ packets/s), and the distribution law of the service time F(t₀).

The output parameters of the service device are the stream of requests it has serviced, λ_obs, and the stream of requests lost (serviced late) because of insufficient server performance, λ_pp.

Given the input parameters and the characteristics of the resulting queuing system, service requests must be distributed over the given computing resources of the servers so that the stream of requests lost because of insufficient server performance, λ_pp, is less than the permissible value λ_dop.

In the experiment with the model, the streams of user requests arriving at the servers were serviced in three variants: without resource distribution, using the developed method, and with non-optimal resource distribution when the computing capacity of the servers is insufficient (less than 70% of the peak load).

The simulation results are given in tabular form in FIG. 5a and as a diagram in FIG. 5b. They show that the total stream of requests serviced using the developed technical solution (shown in bold in the table and framed in the diagram) exceeds the best results obtained with the other processing variants by 20% at maximum server load, by 14.16% at average load, and by 1.98% at minimum load.

From these results it can be concluded that the developed method makes it possible to service the requests of different-priority users of a computing system most efficiently, i.e. in a more timely manner, while keeping expenditure on the computing capacity of the service devices (servers) economically justified.

Claims

1. A method for servicing requests of different-priority users of a computing system, which consists in presetting the parameters of the computing system, including M ≥ 2 users of differing priority and Z ≥ 2 user priorities, generating user requests, receiving them, forming a queue for servicing the requests in accordance with user priorities, and then servicing the received requests, characterized in that, for a computing system including P ≥ 2 servers, each of which has a maximum computing resource Q_mp, where p = 1, 2, ..., P is the server number, and to each of which M_p users of differing priority are assigned, user requests are received at the servers to which the users are assigned, and the queue for servicing requests is formed on each server, then on each server the computing resource Q_pn required to service the received requests and the difference ΔQ_p = Q_mp − Q_pn between the maximum Q_mp and required Q_pn computing resources of the p-th server are calculated, and the resulting difference ΔQ_p is stored; on each server, when ΔQ_p ≥ 0, the requests of the users assigned to it are serviced, and when ΔQ_p < 0, the part of the user requests that does not exceed the maximum computing resource Q_mp of the server is serviced, and the volume of requests ΔQ_p of the p-th server remaining in the queue is compared with the positive values ΔQ_n of the other servers, and when ΔQ_n ≥ |ΔQ_p| is found, where n = 1, 2, ..., P and n ≠ p, the excess volume of requests ΔQ_p of the p-th server is transferred to the n-th server, where these requests are serviced, and when ΔQ_n < |ΔQ_p| on all servers, the excess requests of the p-th server are divided into parts ΔQ_pi, where i = 1, 2, 3, ... is the serial number of a part of the divided resource ΔQ_p, each required to service the remaining requests of one user of the p-th server, then the resource required to service ΔQ_p1, ΔQ_p2, ..., ΔQ_pi, ... is compared in turn with the positive values ΔQ_n of the other servers and, based on the comparison, one or more parts ΔQ_pi of the divided resource ΔQ_p, whose total servicing resource does not exceed ΔQ_n of the n-th server, are transferred for servicing, and when ΔQ_pi ≥ ΔQ_n on all servers, ΔQ_p is recalculated on each of them while it services its own requests, and when the p-th server reaches a resource surplus at which the condition ΔQ_p ≥ ΔQ_pi holds, the requests ΔQ_pi are serviced on the p-th server, and when the condition ΔQ_n ≥ ΔQ_pi is reached on the n-th server, the requests ΔQ_pi are transferred to it for servicing, and the steps of transferring excess volumes of requests ΔQ_p or their parts ΔQ_pi to the n-th server are repeated until the requests of all users have been accepted for service.

2. The method according to claim 1, characterized in that, when the queue for servicing user requests is formed, among the requests of equal-priority users the requests of the user with the larger number of requests are assigned for servicing first.

3. The method according to claim 1, characterized in that equal priority requests of one user are served on a single server.