WO2004092971A1 - Server allocation control method - Google Patents

Server allocation control method

Info

Publication number
WO2004092971A1
WO2004092971A1 (PCT/JP2003/004679, JP0304679W)
Authority
WO
WIPO (PCT)
Prior art keywords
server
server group
servers
response time
requests
Prior art date
Application number
PCT/JP2003/004679
Other languages
English (en)
Japanese (ja)
Inventor
Yasuhiro Kokusho
Satoshi Tutiya
Tsutomu Kawai
Original Assignee
Fujitsu Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Limited filed Critical Fujitsu Limited
Priority to JP2004570858A priority Critical patent/JP3964909B2/ja
Priority to PCT/JP2003/004679 priority patent/WO2004092971A1/fr
Publication of WO2004092971A1 publication Critical patent/WO2004092971A1/fr
Priority to US11/099,538 priority patent/US20050193113A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L 67/1004 Server selection for load balancing
    • H04L 67/1008 Server selection for load balancing based on parameters of servers, e.g. available memory or workload
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L 67/1004 Server selection for load balancing
    • H04L 67/1012 Server selection for load balancing based on compliance of requirements or conditions with available server resources
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L 67/10015 Access to distributed or replicated servers, e.g. using brokers

Definitions

  • the present invention relates to a method for dynamically changing the configuration of a group of servers assigned to a network service so that a plurality of servers providing the network service achieve a certain level of response time.
  • So-called xSPs, such as Internet service providers (ISPs) and application service providers (ASPs), which provide network services, may outsource the maintenance and management of the servers used to provide those network services to a data center operator.
  • a data center operator has multiple servers in its own data center, and allocates some of them to each network service. That is, a server group consisting of a plurality of servers is formed for each network service, and a service request for a network service is processed by one of the servers belonging to the corresponding server group.
  • In its contracts with each xSP, the data center operator guarantees a certain level of service, such as reliability, maintainability, availability, and response time, to the service users. If the number of servers assigned to a network service is too large, some servers will have a low operating rate and the servers will not be used effectively. Conversely, if the number of servers is too small, the guaranteed service level cannot be maintained.
  • Conventionally, the data center operator installs a load balancer in front of the multiple servers it owns and, by changing the settings of the load balancer, adds a specific server to the group of servers that provide a certain network service or removes a specific server from that group.
  • The load balancer also balances the load among the servers belonging to a server group by distributing service requests from user terminals according to preset distribution ratios.
  • the load on network services fluctuates in a complex manner due to time of day, seasonal factors, and human factors, and the fluctuations vary from network service to network service.
  • The conventional method allows the settings of the load balancer to be changed automatically in real time without manual intervention. However, servers are allocated only according to the current operation status and the configured service level, without determining whether more or fewer servers will be needed in the future. There is therefore a problem that the servers assigned on the basis of the current operation status are not always optimal in the future.
  • The above object is achieved by the invention according to claim 1, which provides a method for adjusting the number of servers in a server group in a network system having a plurality of user terminals connected to a network and a server group including a plurality of servers connected to the network for processing requests from the plurality of user terminals. In this method, the number of requests from the plurality of user terminals is accumulated for each predetermined time, a characteristic of time versus number of requests is obtained as a function from the accumulated past numbers of requests, the number of requests at a future time is predicted using the function, the predicted number of requests is substituted into a relational expression between the response time and the number of requests to obtain a first average response time per server of the plurality of servers, it is determined whether the first average response time is a positive value falling within a range equal to or less than a preset threshold, and the number of servers included in the server group is increased or decreased according to the result of the determination.
  • The above object is further achieved by the invention according to claim 2, which provides a method of adjusting the number of servers in the server group wherein, when the first average response time is included in the range as a result of the determination, (a) one server included in the server group is selected, (b) a configuration in which the selected server is excluded from the server group is assumed, (c) a second average response time per server of the plurality of servers included in the assumed server group is obtained, (d) a new determination is made as to whether the second average response time is included in the range, and (e) when the second average response time is included in the range, the selected server is excluded from the configuration of the server group.
  • The above object is also achieved by the invention according to claim 3, which provides a method for adjusting the number of servers in a server group in the configuration according to claim 2, wherein, after (e), the above (a) to (e) are repeated, and the servers included in the server group are excluded from the configuration one by one until, as a result of the new determination, the second average response time is no longer included in the range.
  • The above object is also achieved by the invention according to claim 4, which provides a method of adjusting the number of servers in the configuration according to claim 1, further comprising an unused server group including a plurality of unused servers connected to the network, wherein, when the first average response time is not included in the range as a result of the determination, one server included in the unused server group is selected and the selected server is added to the server group.
  • The above object is also achieved by the invention according to claim 5, which, in the configuration according to claim 4, provides a method for adjusting the number of servers in a server group wherein (f) a third average response time per server of the plurality of servers included in the new server group after the selected server has been added is obtained, (g) a new determination is performed to determine whether or not the third average response time is included in the range, (h) when, as a result of the new determination, the third average response time is not included in the range, one server included in the unused server group is selected, (i) the selected server is added to the new server group, and, after (i), the above (f) to (i) are repeated, so that the servers included in the unused server group are added to the server group one by one until, as a result of the new determination, the third average response time is included in the range.
  • The above object is also achieved by the invention described in claim 7, which provides a resource allocation control device in a network system having a plurality of user terminals connected to a network, a server group including a plurality of servers connected to the network for processing requests from the user terminals, a load balancing device connected to the network and including storage means for storing, at predetermined time intervals, the number of requests from the user terminals, the distribution rates of the numbers of requests, and the configuration information of the server group, and the resource allocation control device connected to the network. The resource allocation control device obtains a characteristic of time versus number of requests as a function based on the past numbers of requests stored in the load balancing device, predicts the number of requests from the plurality of user terminals at a future time using the function on the assumption that the number of requests from the plurality of user terminals follows a predetermined probability distribution, substitutes the predicted number of requests into the relational expression between the response time and the number of requests to obtain a first average response time per server of the plurality of servers, determines whether the first average response time is a positive value and is included in a range equal to or less than a preset threshold, and increases or decreases the number of servers included in the server group according to the result of the determination.
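  • As a concrete illustration of the adjustment rule summarized above, the following is a minimal sketch in Python. It assumes, as in the embodiment described later, that the average response time per server can be written as n / (Σ capacity − R); the function and variable names are illustrative and are not part of the claimed method.

```python
# Minimal sketch of the claim-1 rule: predict the arrival rate, compute the
# per-server average response time, and grow or shrink the server group so
# that the response time is a positive value at or below the threshold.
# All names and the n / (sum(capacity) - R) model are assumptions for illustration.

def adjust_group_size(predicted_requests_per_s: float,
                      group_capacities: list,
                      spare_capacities: list,
                      threshold_s: float) -> list:
    """Return the per-server capacities (requests/s) kept in the server group."""
    group = list(group_capacities)
    spares = list(spare_capacities)

    def avg_response_time(caps):
        margin = sum(caps) - predicted_requests_per_s
        return len(caps) / margin if margin > 0 else float("inf")

    # Too slow (or overloaded): add unused servers one by one.
    while spares and not (0 < avg_response_time(group) <= threshold_s):
        group.append(spares.pop())

    # Fast enough: try removing servers one by one while the threshold still holds.
    while len(group) > 1 and 0 < avg_response_time(group[:-1]) <= threshold_s:
        spares.append(group.pop())

    return group

# Example: four servers able to handle 150 requests/s each, threshold 0.05 s.
print(adjust_group_size(400.0, [150.0] * 4, [150.0] * 2, 0.05))
```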
  • FIG. 1 is a diagram illustrating an example of the configuration of the entire system according to an embodiment of the present invention.
  • FIG. 2 is a block diagram illustrating a configuration example of a load distribution device.
  • FIG. 3 is a block diagram showing a configuration example of a mobile terminal such as a mobile phone or a PDA used as a user terminal.
  • FIG. 4 is a block diagram illustrating a configuration example of the resource allocation control device.
  • FIG. 5 is a diagram illustrating an example of a data configuration of server group configuration information stored in the RAM of the load distribution device.
  • FIG. 6 is a diagram illustrating a data configuration example of statistical information stored in the RAM of the load balancing device.
  • FIG. 7 is a diagram illustrating a data configuration example of data center configuration information stored in the RAM of the resource allocation control device.
  • FIG. 8 is a diagram illustrating a data configuration example of a table, stored in the RAM of the resource allocation control device, of the number of requests processed per second for each piece of Web server software.
  • FIG. 9 is a flowchart illustrating the server group new creation processing.
  • FIGS. 10 and 11 are flowcharts illustrating the server allocation adjustment processing.
  • FIG. 12 is a diagram for explaining a prediction method using the least squares method.
  • BEST MODE FOR CARRYING OUT THE INVENTION
  • FIG. 1 is a diagram illustrating an example of the configuration of the entire system according to an embodiment of the present invention.
  • In the following description, all servers that provide network services are described as servers on which Web server software runs.
  • the user terminal sends an http request including the homepage address to be viewed as a service request, and the web server sends the content corresponding to the corresponding homepage address to the user terminal.
  • The network 2 connects a plurality of user terminals 1 to a plurality of servers 10 in a data center 3.
  • Network 2 can be wired or wireless.
  • the network 2 may be a LAN (Local Area Network) or a WAN (Wide Area Network), and the network 2 may be an internet connecting LANs and WANs.
  • As the user terminal 1, a PC (personal computer) or a mobile terminal such as a mobile phone or a PDA (Personal Digital Assistant) is used.
  • the plurality of servers 10 in the data center 3 are grouped for each network service to be provided.
  • In FIG. 1, server groups 7₁ to 7₃ exist. The server groups 7₁ and 7₂ each provide a separate network service as Web servers, while the server group 7₃ consists of unused servers that are not assigned to any network service.
  • Each of the servers 10 in the data center is connected to the LAN 6 in the center.
  • The LAN 6 in the center can be either wired or wireless.
  • A request sent from the user terminal 1 is processed by a server 10 belonging to the server group that provides the network service corresponding to the request, and the network service is provided by returning the result to the user terminal.
  • For example, if the server group 7₁ provides the homepage content of company A, when a user terminal sends the address of the company A homepage as a browsing request (http request), a server of the server group 7₁ sends the content corresponding to that address back to the user terminal that sent the request.
  • A load balancer 4 is provided at a stage preceding the servers in the data center, and the load balancer 4 is connected to the LAN 6 in the center and to the network 2.
  • The load balancer 4 determines the server group corresponding to a request from a user terminal and distributes requests among the servers belonging to that server group so that the load is not concentrated on a specific server.
  • the load balancer 4 counts the number of requests (http requests) transmitted via the network 2 at regular time intervals for each server group 7 and accumulates the information as statistical information.
  • a resource allocation control device 5 for controlling the configuration of the server group 7 by adding or deleting servers 10 belonging to the server group 7 is connected to the LAN 6 in the center.
  • a configuration including a gateway, a router, a firewall, and the like between the load distribution device 4 and the network 2 is also possible.
  • a storage device such as a disk array may be connected to the outside of the server 10 if the hard disk provided inside the server is insufficient.
  • FIG. 2 is a block diagram showing a configuration example of the load distribution device 4.
  • the CPU 21 executes the control in the load distribution device 4.
  • a ROM (Read Only Memory) 22 stores a program executed when the load distribution device 4 is initialized, data necessary for the initialization, and a control program transferred to the RAM 23 at the time of initialization.
  • FIG. 3 is a block diagram showing a configuration example of a mobile terminal such as a mobile phone or a PDA used as the user terminal 1.
  • the CPU 31 executes control in the mobile terminal.
  • In the ROM 32, a program executed when the mobile terminal is initialized, data necessary for the initialization, and a control program transferred to the RAM 34 at the time of initialization are recorded.
  • The RAM 34 stores the control program and data such as work results produced when the control program is executed.
  • The communication device 35 has an interface for connecting to the network 2 or the LAN 6 in the center, and enables data transmission and reception with devices connected via the network 2 or the LAN 6 in the center.
  • the input device 36 is a keypad, a pen-type input device, or the like, and is used by a user for inputting various commands and data.
  • The display device 33 is a liquid crystal screen or the like, and displays control results of the CPU 31 and the like to the user. These components are connected by connecting lines 37 as shown in FIG. 3.
  • FIG. 4 is a block diagram showing a configuration example of the resource allocation control device 5.
  • the CPU 41 executes the control in the resource allocation control device 5.
  • the ROM 42 stores a program executed when the resource allocation control device 5 is initialized and data necessary for the initialization.
  • the hard disk 46 stores OS (Operating System) data for controlling the resource allocation control device 5.
  • the RAM 45 stores data such as a work result when the OS is executed.
  • The communication device 48 has an interface for connecting to the network 2 and the LAN 6 in the center, and enables data transmission and reception to and from devices connected through the network 2 and the LAN 6 in the center.
  • The input device 47 is a keyboard, a mouse, or the like, and is used by a user to input various commands and data.
  • the display device 43 is a liquid crystal monitor, a CRT, or the like, and displays a control result of the CPU to a user.
  • The optical drive 44 is used to write data to and read data from media such as a CD, DVD, or MO.
  • Since the administrator can log in to the resource allocation control device 5 from a server 10 connected via the network and control the resource allocation control device 5 remotely, a configuration without the input device 47 and the display device 43 may also be used.
  • the configuration of the PC used as the server 10 and the user terminal 1 is the same as that shown in FIG.
  • FIG. 5 is a diagram illustrating a data configuration example of the server group configuration information stored in the RAM 23 of the load distribution device 4.
  • In the server group configuration information, a server group name 51, a representative IP address 52, a response time threshold 53, and, for each belonging server, a server name 54, an IP address 55, and a distribution ratio 56 are stored for each server group.
  • the server group name 51 is a management name for specifying the server group 7.
  • The representative IP address 52 is a number consisting of 32 bits for IPv4 or 128 bits for IPv6, and is the IP address disclosed to the outside for the network service provided by the server group 7. That is, to use the network service provided by the server group 7₁, the user terminal 1 sends a request (http request) to the representative IP address 52 of the server group 7₁, and to use the network service provided by the server group 7₂, the user terminal 1 sends a request to the representative IP address 52 of the server group 7₂.
  • As the representative IP address 52, a global IP address is used, but a private IP address may also be used if the network service is intended for a limited organization.
  • The response time threshold 53 is one of the service levels required by the xSP when contracting with the data center. The data center operator assigns servers 10 to the server group 7 providing the network service and manages the data center 3 so that the response time experienced by the user terminal 1 does not exceed the response time threshold 53.
  • the server name 54 of the belonging server information is a management name for specifying the server 10 belonging to each server group 7.
  • the IP address 55 is the IP address of the server corresponding to the server name 54.
  • As the IP address 55 a private IP address is used, but a global IP address can also be used if there is a surplus of global IP addresses.
  • FIG. 5 shows stored information about two server groups.
  • For the server group A, the representative IP address 52 is GIP1, the response time threshold 53 is ST1, the server names 54 are WEB-A, WEB-B, WEB-C, and WEB-D, the IP addresses 55 are PIP1, PIP2, PIP3, and PIP4, and the distribution ratios 56 are 0.5, 0.3, 0.1, and 0.1, respectively.
  • FIG. 5 also stores the representative IP address 52 of the server group B and information about the three servers belonging to the server group B.
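  • For illustration only, the server group configuration information of FIG. 5 could be held in memory as follows (the field names and the concrete threshold value are assumptions, since the figure itself is not reproduced here):

```python
from dataclasses import dataclass, field

@dataclass
class BelongingServer:
    server_name: str            # server name 54, e.g. "WEB-A"
    ip_address: str             # IP address 55, e.g. "PIP1"
    distribution_ratio: float   # distribution ratio 56

@dataclass
class ServerGroupConfig:
    server_group_name: str             # server group name 51, e.g. "A"
    representative_ip_address: str     # representative IP address 52, e.g. "GIP1"
    response_time_threshold_s: float   # response time threshold 53 (ST1)
    servers: list = field(default_factory=list)

# The server group A of FIG. 5, with a placeholder value for ST1.
group_a = ServerGroupConfig(
    server_group_name="A",
    representative_ip_address="GIP1",
    response_time_threshold_s=1.0,
    servers=[
        BelongingServer("WEB-A", "PIP1", 0.5),
        BelongingServer("WEB-B", "PIP2", 0.3),
        BelongingServer("WEB-C", "PIP3", 0.1),
        BelongingServer("WEB-D", "PIP4", 0.1),
    ],
)
```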
  • FIG. 6 is a diagram illustrating an example of a data configuration of statistical information stored in the RAM 23 of the load distribution device 4.
  • a value obtained by totaling the number of arriving requests at fixed time intervals is stored for each server group.
  • In FIG. 6, the number of arriving requests counted per second is stored; for example, it is recorded that, during a certain time 61, R11 requests arrived at the server group A and R21 requests arrived at the server group B.
  • The latest data is added at the bottom of FIG. 6. In this way, the numbers of arrival requests 62 for the last n seconds (n is a natural number), counted back from the current time, are stored.
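  • A minimal sketch of how the load balancer could keep this statistical information, assuming a fixed history length (the length and the names are illustrative):

```python
from collections import deque

HISTORY_SECONDS = 300  # assumed value of n; the embodiment later uses 300 seconds

# One fixed-length buffer of per-second arrival counts per server group;
# appending a new sample automatically discards the oldest one.
arrival_counts: dict = {
    "A": deque(maxlen=HISTORY_SECONDS),
    "B": deque(maxlen=HISTORY_SECONDS),
}

def record_second(group_name: str, request_count: int) -> None:
    """Called once per second by the load balancer for each server group."""
    arrival_counts[group_name].append(request_count)
```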
  • FIG. 7 is a diagram showing a data configuration example of the data center configuration information stored in the RAM 45 of the resource allocation control device 5.
  • In the data center configuration information, the server name 71, IP address 72, CPU clock speed 73, and belonging server group name 74 of each server 10, including unused servers not assigned to any network service, are stored.
  • the server name 71 and the IP address 72 are the same as the server name 54 and the IP address 55 in FIG.
  • the CPU clock speed 73 stores a value obtained by dividing the clock frequency of the CPU 41 mounted on the server 10 by the reference clock frequency.
  • In this example, the reference clock frequency is 1 GHz. From the CPU clock speed 73, it can be seen how many times the reference clock frequency the clock frequency of the server 10 is. For a server 10 having a plurality of CPUs 41, a value obtained by dividing the total of the clock frequencies by the reference clock frequency is stored.
  • the belonging server group name 74 specifies the server group 7 to which the server 10 of each server name 71 belongs.
  • Information on unused servers is also stored in the data center configuration information.
  • In FIG. 7, it can be seen that the four servers whose server names 71 are WEB-H, WEB-I, WEB-J, and WEB-K are unused servers.
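  • Likewise, a data center configuration entry of FIG. 7 could be sketched as follows (field names are assumptions):

```python
from dataclasses import dataclass

@dataclass
class DataCenterEntry:
    server_name: str        # server name 71, e.g. "WEB-H"
    ip_address: str         # IP address 72
    cpu_clock_speed: float  # CPU clock speed 73: clock frequency / 1 GHz reference
    server_group_name: str  # belonging server group name 74, or "unused"

def unused_servers(entries: list) -> list:
    """Servers that can be added to a server group on demand."""
    return [e for e in entries if e.server_group_name == "unused"]
```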
  • FIG. 8 is a diagram showing an example of the data configuration of a table, stored in the RAM 45 of the resource allocation control device 5, in which the number of requests processed per second by a server equipped with a CPU of the reference clock frequency (1 GHz) is stored for each piece of Web server software 81.
  • From FIG. 8, it can be seen, for example, that when application A1 is used as the Web server software, C1 requests are processed per second as a result of sending a plurality of http requests to a server equipped with a CPU of the reference clock frequency (1 GHz), and that when application An is used, Cn requests are processed per second. These data are measured in advance and input to the resource allocation control device beforehand.
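  • The table of FIG. 8 and the CPU clock speed 73 together give the effective capacity f_i × μ of a server, which the later calculations rely on. A sketch, with made-up software names and request rates:

```python
# Requests processed per second by a server at the reference clock (1 GHz),
# measured in advance for each piece of Web server software 81.
# The software names and rates below are illustrative placeholders.
REQUESTS_PER_SECOND_AT_REFERENCE = {
    "application_A1": 150.0,   # C1 in the figure
    "application_An": 90.0,    # Cn in the figure
}

def effective_capacity(software: str, cpu_clock_speed: float) -> float:
    """Requests/s one server can handle: f_i x mu in the notation used below."""
    return cpu_clock_speed * REQUESTS_PER_SECOND_AT_REFERENCE[software]

# Example: a 2 GHz server (clock speed factor 2.0) running application_A1.
print(effective_capacity("application_A1", 2.0))   # 300.0 requests/s
```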
  • FIG. 9 is a flowchart illustrating the server group new creation processing.
  • the resource allocation control device 5 receives a server group new creation request (S91).
  • New creation of a server group 7 is initiated by the operation manager; here, a command input by the operation manager through the input device 47 provided in the resource allocation control device 5 is received. The name of the newly created server group, the initial number of servers to be assigned, and the response time threshold are also received as command arguments.
  • Next, the name of the newly created server group, the initial number of servers to be allocated, and the response time threshold are stored (S92).
  • In step S92, the information received as the command arguments is temporarily stored in the RAM 45 provided in the resource allocation control device 5.
  • Next, the data center configuration information is updated so that servers equal in number to the initial number received in step S91 are added to the newly created server group (S93).
  • In step S93, servers corresponding to the initial number received in step S91 are selected from the entries whose belonging server group name 74 is "unused" in the data center configuration information of FIG. 7, and the belonging server group name 74 field of each selected server is changed to the name of the newly created server group received in step S91.
  • the representative IP address to be assigned to the newly created server group and the distribution rate of each server belonging to the server group are determined (S94). The representative IP address and the distribution ratio are set in the load distribution device 4, but are determined by the resource allocation control device 5.
  • The representative IP address is determined by selecting an arbitrary address from a set of unused IP addresses (reserved for exclusive use as representative IP addresses) stored in the hard disk of the resource allocation control device 5.
  • The method of determining the initial values of the distribution rates is not particularly limited as long as the sum of the distribution rates is 1; for example, 1 may be divided by the initial number received in step S91 and the resulting value used as the distribution rate of every server.
  • After step S94, the new configuration information is transmitted to the load balancer (S95), and the server group new creation processing ends.
  • In step S95, the name of the newly created server group, the representative IP address, the response time threshold stored in step S92, the server names of the initial number of servers selected in step S93, and the distribution rates determined in step S94 are transmitted to the load balancer by the resource allocation control device.
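  • A compact sketch of steps S91 to S95, re-using the illustrative ServerGroupConfig, BelongingServer, and DataCenterEntry classes sketched above (the helper names and the handling of the representative IP pool are assumptions):

```python
def create_server_group(name: str,
                        initial_count: int,
                        threshold_s: float,
                        entries: list,
                        free_representative_ips: list) -> "ServerGroupConfig":
    # S93: take the requested number of servers from the "unused" pool and
    # re-label them with the new group name in the data center configuration.
    chosen = [e for e in entries if e.server_group_name == "unused"][:initial_count]
    for entry in chosen:
        entry.server_group_name = name

    # S94: pick an arbitrary unused representative IP address and give every
    # server the same initial distribution ratio (1 / initial_count).
    ratio = 1.0 / initial_count
    group = ServerGroupConfig(
        server_group_name=name,
        representative_ip_address=free_representative_ips.pop(),
        response_time_threshold_s=threshold_s,
        servers=[BelongingServer(e.server_name, e.ip_address, ratio) for e in chosen],
    )
    # S95: in the patent, this new configuration is then sent to the load balancer.
    return group
```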
  • The processing in FIG. 9 is executed when a new server group is created. To determine whether the servers belonging to an existing server group are optimal for providing its network service, the resource allocation control device 5 performs server allocation adjustment processing.
  • In this embodiment, the CPU clock frequency is used as a numerical value for measuring the processing capacity of the server group, and the processing capacity of the server group is adjusted simply by increasing or decreasing the number of servers.
  • FIGS. 10 and 11 are flowcharts illustrating the server allocation adjustment processing.
  • The resource allocation control device 5 periodically executes the server allocation adjustment processing to determine whether the servers allocated to the server group providing each network service are optimal. It is assumed that the table data of FIG. 8 has been input before this processing.
  • First, the resource allocation control device 5 refers to the belonging server group name 74 in the data center configuration information of FIG. 7 and selects one server group name other than "unused" (S101).
  • Next, the number of arrival requests, 60 seconds in the future, of the server group selected in step S101 is predicted (S102).
  • In step S102, first, the resource allocation control device 5 requests the load balancer 4 to provide the numbers of arrival requests for the latest 300 seconds of the server group selected in step S101.
  • The load balancer 4 refers to the statistical information of FIG. 6 and sends the numbers of arrival requests 62 for the latest 300 seconds of the corresponding server group to the resource allocation control device 5. Then, the number of arrival requests 60 seconds from now is predicted using the least squares method based on the 300 seconds of arrival request counts obtained.
  • FIG. 12 is a diagram for explaining a prediction method using the least squares method.
  • In FIG. 12, time (seconds) is taken on the horizontal axis and the number of arrival requests is taken on the vertical axis, and the data for 300 seconds are plotted on the coordinate plane at 1-second intervals. On this coordinate plane, the straight line that minimizes the sum of the squares of the vertical-axis distances from the plotted points is obtained by the least squares method.
  • In step S102, the number of arrival requests 60 seconds ahead is predicted from this straight line.
  • Although the prediction horizon is set to 60 seconds here, the present invention is not limited to this, and the horizon can be determined according to the policy of the operation manager.
  • Likewise, although the latest 300 seconds of data are used to estimate the number of arrival requests 60 seconds ahead, the present invention is not limited to this; as a guide, data covering about five times the difference between the predicted future time and the current time are used.
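  • A minimal sketch of the prediction of step S102: fit a straight line to the last 300 one-second arrival counts by least squares and read off its value 60 seconds ahead (the names and sample numbers are illustrative):

```python
def predict_arrivals(counts: list, horizon_s: float = 60.0) -> float:
    """Least-squares straight-line fit to (t, count), extrapolated horizon_s ahead."""
    n = len(counts)                       # e.g. 300 samples, one per second
    times = range(n)
    mean_t = sum(times) / n
    mean_c = sum(counts) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    cov_tc = sum((t - mean_t) * (c - mean_c) for t, c in zip(times, counts))
    slope = cov_tc / var_t
    intercept = mean_c - slope * mean_t
    # Value of the fitted line "horizon_s" seconds after the newest sample.
    predicted = slope * (n - 1 + horizon_s) + intercept
    return max(predicted, 0.0)            # a request count cannot be negative

# Example: load rising by 0.2 requests/s every second over the last 300 seconds.
history = [100 + 0.2 * t for t in range(300)]
print(predict_arrivals(history))          # about 171.8 requests arriving per second
```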
  • Next, based on the number of arrival requests predicted in step S102, the average response time T, 60 seconds ahead, per server belonging to the server group selected in step S101 is calculated (S103).
  • Here, ρ is the utilization rate of a service window in queuing theory, that is, the probability that the window is busy at an arbitrary time.
  • In the Poisson model, the queue does not overflow when ρ < 1, which means that no matter how much time passes, the queue length remains below a certain length.
  • Here, the waiting time means the average response time T of the servers belonging to the server group selected in step S101. Assuming that the distribution ratios are set so that the response times of the servers belonging to the server group are equal to one another (see step S109 described later), the average response time T per server is given by T = n / (Σ_{i=1..n} f_i·μ − R) ... (A), where n is the number of servers belonging to the server group, f_i is the relative clock magnification of the i-th server, μ is the number of requests processed per second at the reference clock frequency by the Web server software, and R is the predicted number of arrival requests per second for the entire server group.
  • The response time of the i-th server belonging to the server group selected in step S101 can be derived as follows. Regarding the i-th server of the server group as one window in queuing theory, the number of requests processed per second at that window is f_i × μ, and requests arrive at that window at a rate obtained by multiplying the number of arrival requests R for the entire server group by the distribution ratio r_i of the i-th server. Therefore, the response time T_i of the i-th server belonging to the server group becomes T_i = 1 / (f_i·μ − r_i·R) ... (B).
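  • The following sketch implements equations (A) and (B) as reconstructed above, treating each server as an M/M/1 window with service rate f_i·μ and arrival rate r_i·R; this reading of the equations, and all names, are assumptions:

```python
def per_server_response_time(f_i: float, mu: float, r_i: float, big_r: float) -> float:
    """Equation (B): mean response time of the i-th server, or inf if overloaded."""
    margin = f_i * mu - r_i * big_r
    return 1.0 / margin if margin > 0 else float("inf")

def group_average_response_time(f: list, mu: float, big_r: float) -> float:
    """Equation (A): predicted average response time per server of the group."""
    margin = sum(f) * mu - big_r          # the "denominator" discussed in the text
    return len(f) / margin if margin > 0 else float("inf")

# Example: four servers with relative clock factors f, Web server software that
# handles mu = 100 requests/s at the reference clock, and a predicted group
# arrival rate R = 500 requests/s.
f = [2.0, 1.5, 1.0, 1.0]
print(group_average_response_time(f, mu=100.0, big_r=500.0))   # 4 / (550 - 500) = 0.08 s
```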
  • Next, it is determined whether or not the response time T calculated in step S103 satisfies 0 < T ≤ Tp with respect to the preset response time threshold Tp (S104). If the response time T falls within this range (0 < T ≤ Tp) in step S104, the service level guaranteed in the contract with the xSP is satisfied, but the allocated servers may be excessive; that is, there may be room for reducing the number of servers.
  • If the response time T falls within the predetermined range in step S104, one arbitrary server belonging to the server group is selected (S105). Then, assuming a configuration in which the server selected in step S105 is excluded from the server group selected in step S101, the future response time T is calculated again for the assumed new configuration (S106).
  • The calculation in step S106 can be performed using equation (A), as in step S103.
  • Next, it is determined whether the response time T calculated in step S106 satisfies 0 < T ≤ Tp with respect to the response time threshold Tp (S107).
  • Since the number of servers is reduced by one, the value of the denominator of equation (A) becomes smaller, and the response time T becomes larger than the value calculated previously in step S103. Therefore, if even this increased value satisfies 0 < T ≤ Tp, it is safe to delete one server.
  • If the determination in step S107 is affirmative, the server selected in step S105 is actually moved to the unused servers (S108), and the number of servers in the server group is reduced.
  • In step S108, the belonging server group name 74 corresponding to the server selected in step S105 is updated to "unused" in the data center configuration information of FIG. 7, which means that the server is removed from the configuration of the server group selected in step S101.
  • After step S108, the process returns to step S105 to determine whether there is room for further server reduction.
  • If the response time T does not satisfy 0 < T ≤ Tp in step S107, it is known that the current configuration is the minimum necessary; the configuration is therefore not changed, the process proceeds to the next processing (FIG. 11), and the distribution ratios are calculated (S109).
  • In step S109 of FIG. 11, the distribution rates of the servers belonging to the server group are calculated by the method described below so that the response times of the servers belonging to the same server group become equal to one another. If the response times of the servers belonging to the same server group are not uniform, the response times of some servers may exceed the response time threshold even when the average response time over the entire server group is at or below the threshold. It is therefore necessary to control the distribution ratio of each server so that the response times of the servers belonging to the same server group are equal to one another.
  • r_i = (1/n) · (1 − (μ/R) · Σ_{j=1..n} f_j) + (μ/R) · f_i   ... (C)
  • Here, n is the number of servers belonging to the server group selected in step S101.
  • μ is the average number of requests processed per second, at the reference clock frequency, by the Web server software used by the servers belonging to the server group (the value stored in the table of FIG. 8). R is the number of arrival requests, 60 seconds ahead, for the entire server group selected in step S101, that is, the value predicted in step S102.
  • f_i is the relative clock magnification of the i-th server among the n servers belonging to the server group selected in step S101; for this value, the CPU clock speed 73 in the data center configuration information of FIG. 7 is referred to.
  • The derivation of equation (C) is as follows. From equation (B), the response time of the i-th server is T_i = 1 / (f_i·μ − r_i·R). Setting all of the T_i equal to a common value T and using the condition that the distribution ratios sum to 1, that is, Σ r_i = 1, gives T = n / (Σ_{j=1..n} f_j·μ − R); substituting this back into f_i·μ − r_i·R = 1/T and solving for r_i yields equation (C).
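  • A sketch of the distribution-rate calculation of step S109 using equation (C) as reconstructed above (variable names are illustrative):

```python
def distribution_ratios(f: list, mu: float, big_r: float) -> list:
    """Equation (C): r_i = (1/n)(1 - (mu/R) * sum(f)) + (mu/R) * f_i, chosen so
    that every server sees the same response time and the ratios sum to 1."""
    n = len(f)
    scale = mu / big_r
    return [(1.0 / n) * (1.0 - scale * sum(f)) + scale * f_i for f_i in f]

f = [2.0, 1.5, 1.0, 1.0]
r = distribution_ratios(f, mu=100.0, big_r=500.0)
print(r, sum(r))   # [0.375, 0.275, 0.175, 0.175], summing to 1; each server then
                   # keeps the same margin f_i*mu - r_i*R = 12.5 requests/s,
                   # i.e. the same response time of 0.08 s.
```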
  • If the response time T does not fall within the range 0 < T ≤ Tp in step S104 of FIG. 10, one server belonging to the unused servers is selected (S112). Then, the server selected in step S112 is added to the server group (S113).
  • The response time fails to fall within the predetermined range in step S104 either because the number of arrival requests exceeds the processing capacity of the server group, that is, the denominator of equation (A) is negative, or because the number of arrival requests is within the processing capacity but the load is so high that the response time exceeds the required response time. In either case, it is necessary to add servers and increase the processing capacity.
  • Next, the future response time T is calculated again for the configuration to which the server has been added (S114), and it is determined whether the response time calculated in step S114 falls within the predetermined range (0 < T ≤ Tp) (S115). If, even after one server has been added, the response time does not yet fall within the prescribed range in step S115, the service level guaranteed in the contract with the xSP is still not satisfied, so the process returns to step S112 to add a further server and the processing continues. If the value falls within the predetermined range in step S115, the flow advances to step S109 to calculate the distribution rates.
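  • Tying the steps together, one pass of the adjustment processing (S101 to S115) could be sketched as follows, re-using predict_arrivals, group_average_response_time, and distribution_ratios from the sketches above (all names are illustrative, not the patent's own code):

```python
def adjust_allocation(group_f: list, unused_f: list, mu: float,
                      history: list, threshold_s: float) -> list:
    """One pass over a single server group; returns the new distribution ratios."""
    big_r = predict_arrivals(history)                         # S102
    t = group_average_response_time(group_f, mu, big_r)       # S103, equation (A)

    if 0 < t <= threshold_s:
        # S105-S108: remove servers one at a time while the threshold still holds.
        while len(group_f) > 1:
            t2 = group_average_response_time(group_f[:-1], mu, big_r)   # S106
            if 0 < t2 <= threshold_s:                                   # S107
                unused_f.append(group_f.pop())                          # S108
            else:
                break
    else:
        # S112-S115: add unused servers until the threshold is met (or none remain).
        while unused_f and not (0 < t <= threshold_s):
            group_f.append(unused_f.pop())                              # S112, S113
            t = group_average_response_time(group_f, mu, big_r)         # S114

    # S109: recompute the distribution ratios so that all servers respond equally fast.
    return distribution_ratios(group_f, mu, big_r)
```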
  • It is also possible to configure the resource allocation control device 5 to execute the server group new creation processing of FIG. 9 as interrupt processing.
  • As described above, according to the present invention, servers in the data center can be allocated to each network service on the load balancer automatically, in real time, and without manual intervention. It is also possible to monitor fluctuations in the amount of requests arriving at each network service, predict the amount of requests after a certain period of time, and control the allocation of servers to the network service in accordance with the magnitude of the predicted amount of requests.
  • the burden on the operation manager can be reduced, and operation by fewer operation managers is possible. In addition, it can be operated by inexperienced operation managers.
  • The number of servers allocated to the network service is determined so that, when traffic of the amount indicated by the predicted request value arrives at the network service, the average response time to the user terminals is equal to or less than the predetermined response time threshold.
  • In the above embodiment, the server application is a Web server and a request from the user terminal is an http request.
  • However, the present invention can also be applied to a case where another application program runs on the server, the user terminal sends a request to that application program, and the server returns a reply to the request to the user terminal.
  • the CPU clock frequency is used as a numerical value for measuring the processing capacity of the server group, and the number of servers is simply increased or decreased in order to adjust the processing capacity of the server group.
  • The present invention is also applicable to a case where computer resources such as a CPU, a memory, and a hard disk are individually increased or decreased in quantity.
  • Although the reference clock frequency is set to 1 GHz in the above embodiment, the reference clock frequency is not limited thereto.
  • The present invention is applicable as long as the average number of requests processed per second for each application, measured on a server whose clock frequency is taken as the reference clock frequency, is collected beforehand and input to the resource allocation control device in advance in the form shown in FIG. 8.
  • As described above, the present invention makes it possible to allocate servers in a data center to the respective network services on the load distribution apparatus automatically, in real time, and without manual intervention. Fluctuations in the amount of requests arriving at each network service are monitored, the amount of requests after a certain period of time is predicted, and the allocation of servers to the network service can be controlled in accordance with the magnitude of the predicted amount of requests.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention concerns automatically allocating the servers of a data center to network services through a load balancer, in real time and without any manual intervention. The variation in the amount of requests arriving at each network service is monitored, and the value of the amount of requests after a predetermined elapsed time is predicted. According to the magnitude of the predicted value of the amount of requests, the amount of servers allocated to the network service can be controlled. The amount of servers allocated to the network service is determined so that the average of the response times to the user terminals is at or below a response time threshold predefined by the operation manager when traffic of the amount indicated by the predicted value of the amount of requests arrives at the network service. The server group can comprise the minimum required number of servers needed to operate the network service, because whether the response times based on the predicted value fall within a predetermined range is evaluated each time a server is added or removed.
PCT/JP2003/004679 2003-04-14 2003-04-14 Procede de commande d'affectation de serveur WO2004092971A1 (fr)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2004570858A JP3964909B2 (ja) 2003-04-14 2003-04-14 サーバ割当制御方法
PCT/JP2003/004679 WO2004092971A1 (fr) 2003-04-14 2003-04-14 Procede de commande d'affectation de serveur
US11/099,538 US20050193113A1 (en) 2003-04-14 2005-04-06 Server allocation control method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2003/004679 WO2004092971A1 (fr) 2003-04-14 2003-04-14 Procede de commande d'affectation de serveur

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US11/099,538 Continuation US20050193113A1 (en) 2003-04-14 2005-04-06 Server allocation control method

Publications (1)

Publication Number Publication Date
WO2004092971A1 true WO2004092971A1 (fr) 2004-10-28

Family

ID=33193209

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2003/004679 WO2004092971A1 (fr) 2003-04-14 2003-04-14 Procede de commande d'affectation de serveur

Country Status (2)

Country Link
JP (1) JP3964909B2 (fr)
WO (1) WO2004092971A1 (fr)

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7664859B2 (en) 2005-03-30 2010-02-16 Hitachi, Ltd. Resource assigning management apparatus and resource assigning method
JP2010061261A (ja) * 2008-09-02 2010-03-18 Fujitsu Ltd 認証システムおよび認証方法
US7693995B2 (en) 2005-11-09 2010-04-06 Hitachi, Ltd. Arbitration apparatus for allocating computer resource and arbitration method therefor
JP2010191603A (ja) * 2009-02-17 2010-09-02 Kddi Corp サービス提供サーバ
JP2011076469A (ja) * 2009-09-30 2011-04-14 Nomura Research Institute Ltd 負荷管理装置、情報処理システムおよび負荷管理方法
JP2011204128A (ja) * 2010-03-26 2011-10-13 Nomura Research Institute Ltd 運用管理装置および運用管理方法
JP2011204110A (ja) * 2010-03-26 2011-10-13 Nomura Research Institute Ltd 情報処理システムおよび情報処理方法
JP2011210225A (ja) * 2010-07-14 2011-10-20 Nomura Research Institute Ltd 情報処理システムおよび情報処理方法
JP2012053899A (ja) * 2011-10-26 2012-03-15 Nomura Research Institute Ltd 運用管理装置および情報処理システム
JP2012518842A (ja) * 2009-02-23 2012-08-16 マイクロソフト コーポレーション エネルギーを意識したサーバ管理
US8839254B2 (en) 2009-06-26 2014-09-16 Microsoft Corporation Precomputation for data center load balancing
US8849469B2 (en) 2010-10-28 2014-09-30 Microsoft Corporation Data center system that accommodates episodic computation
JP2015095149A (ja) * 2013-11-13 2015-05-18 富士通株式会社 管理プログラム、管理方法、および管理装置
US9063738B2 (en) 2010-11-22 2015-06-23 Microsoft Technology Licensing, Llc Dynamically placing computing jobs
WO2015092873A1 (fr) * 2013-12-18 2015-06-25 株式会社日立製作所 Système de traitement d'informations et procédé de traitement d'informations
US9450838B2 (en) 2011-06-27 2016-09-20 Microsoft Technology Licensing, Llc Resource management for cloud computing platforms
US9595054B2 (en) 2011-06-27 2017-03-14 Microsoft Technology Licensing, Llc Resource management for cloud computing platforms
US9933804B2 (en) 2014-07-11 2018-04-03 Microsoft Technology Licensing, Llc Server installation as a grid condition sensor
US10234835B2 (en) 2014-07-11 2019-03-19 Microsoft Technology Licensing, Llc Management of computing devices using modulated electricity
JP2020102189A (ja) * 2018-12-21 2020-07-02 北京百度网▲訊▼科技有限公司Beijing Baidu Netcom Science And Technology Co., Ltd. データ処理用の方法、装置及びシステム
JPWO2020261399A1 (fr) * 2019-06-25 2020-12-30
CN113556372A (zh) * 2020-04-26 2021-10-26 浙江宇视科技有限公司 数据传输方法、装置、设备及存储介质

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5928699B2 (ja) * 2012-03-05 2016-06-01 日本電気株式会社 ジョブ割当方法、装置及びプログラム

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6230183B1 (en) * 1998-03-11 2001-05-08 International Business Machines Corporation Method and apparatus for controlling the number of servers in a multisystem cluster
US6466980B1 (en) * 1999-06-17 2002-10-15 International Business Machines Corporation System and method for capacity shaping in an internet environment

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6230183B1 (en) * 1998-03-11 2001-05-08 International Business Machines Corporation Method and apparatus for controlling the number of servers in a multisystem cluster
US6466980B1 (en) * 1999-06-17 2002-10-15 International Business Machines Corporation System and method for capacity shaping in an internet environment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Edwin R. Lassettre et al., "HotRod: An Autonomic System for Dynamic Surge Protection", [online], IBM Research Division, 17 March, 2003 [retrieved on 30 May, 2003], Retrieved from the Internet: <URL: http://domino.watson.ibm.com/library/CyberDig.nsf/0/485bf8b1f6ef56fd85256cee0064cd4f?OpenDocument> *

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7664859B2 (en) 2005-03-30 2010-02-16 Hitachi, Ltd. Resource assigning management apparatus and resource assigning method
US7693995B2 (en) 2005-11-09 2010-04-06 Hitachi, Ltd. Arbitration apparatus for allocating computer resource and arbitration method therefor
JP2010061261A (ja) * 2008-09-02 2010-03-18 Fujitsu Ltd 認証システムおよび認証方法
JP2010191603A (ja) * 2009-02-17 2010-09-02 Kddi Corp サービス提供サーバ
JP2012518842A (ja) * 2009-02-23 2012-08-16 マイクロソフト コーポレーション エネルギーを意識したサーバ管理
KR101624765B1 (ko) * 2009-02-23 2016-05-26 마이크로소프트 테크놀로지 라이센싱, 엘엘씨 에너지-인식 서버 관리
US8839254B2 (en) 2009-06-26 2014-09-16 Microsoft Corporation Precomputation for data center load balancing
JP2011076469A (ja) * 2009-09-30 2011-04-14 Nomura Research Institute Ltd 負荷管理装置、情報処理システムおよび負荷管理方法
JP2011204128A (ja) * 2010-03-26 2011-10-13 Nomura Research Institute Ltd 運用管理装置および運用管理方法
JP2011204110A (ja) * 2010-03-26 2011-10-13 Nomura Research Institute Ltd 情報処理システムおよび情報処理方法
JP2011210225A (ja) * 2010-07-14 2011-10-20 Nomura Research Institute Ltd 情報処理システムおよび情報処理方法
US9886316B2 (en) 2010-10-28 2018-02-06 Microsoft Technology Licensing, Llc Data center system that accommodates episodic computation
US8849469B2 (en) 2010-10-28 2014-09-30 Microsoft Corporation Data center system that accommodates episodic computation
US9063738B2 (en) 2010-11-22 2015-06-23 Microsoft Technology Licensing, Llc Dynamically placing computing jobs
US9450838B2 (en) 2011-06-27 2016-09-20 Microsoft Technology Licensing, Llc Resource management for cloud computing platforms
US9595054B2 (en) 2011-06-27 2017-03-14 Microsoft Technology Licensing, Llc Resource management for cloud computing platforms
US10644966B2 (en) 2011-06-27 2020-05-05 Microsoft Technology Licensing, Llc Resource management for cloud computing platforms
JP2012053899A (ja) * 2011-10-26 2012-03-15 Nomura Research Institute Ltd 運用管理装置および情報処理システム
JP2015095149A (ja) * 2013-11-13 2015-05-18 富士通株式会社 管理プログラム、管理方法、および管理装置
US10225333B2 (en) 2013-11-13 2019-03-05 Fujitsu Limited Management method and apparatus
WO2015092873A1 (fr) * 2013-12-18 2015-06-25 株式会社日立製作所 Système de traitement d'informations et procédé de traitement d'informations
US10234835B2 (en) 2014-07-11 2019-03-19 Microsoft Technology Licensing, Llc Management of computing devices using modulated electricity
US9933804B2 (en) 2014-07-11 2018-04-03 Microsoft Technology Licensing, Llc Server installation as a grid condition sensor
JP2020102189A (ja) * 2018-12-21 2020-07-02 北京百度网▲訊▼科技有限公司Beijing Baidu Netcom Science And Technology Co., Ltd. データ処理用の方法、装置及びシステム
US11064053B2 (en) 2018-12-21 2021-07-13 Beijing Baidu Netcom Science And Technology Co., Ltd. Method, apparatus and system for processing data
US11277498B2 (en) 2018-12-21 2022-03-15 Beijing Baidu Netcom Science And Technology Co., Ltd. Method, apparatus and system for processing data
JPWO2020261399A1 (fr) * 2019-06-25 2020-12-30
WO2020261399A1 (fr) * 2019-06-25 2020-12-30 日本電信電話株式会社 Dispositif de commande de serveur, procédé de commande de serveur et programme
JP7173340B2 (ja) 2019-06-25 2022-11-16 日本電信電話株式会社 サーバ制御装置、サーバ制御方法、およびプログラム
CN113556372A (zh) * 2020-04-26 2021-10-26 浙江宇视科技有限公司 数据传输方法、装置、设备及存储介质
CN113556372B (zh) * 2020-04-26 2024-02-20 浙江宇视科技有限公司 数据传输方法、装置、设备及存储介质

Also Published As

Publication number Publication date
JPWO2004092971A1 (ja) 2006-07-06
JP3964909B2 (ja) 2007-08-22

Similar Documents

Publication Publication Date Title
WO2004092971A1 (fr) Procede de commande d'affectation de serveur
US20050193113A1 (en) Server allocation control method
Baek et al. Managing fog networks using reinforcement learning based load balancing algorithm
US11051210B2 (en) Method and system for network slice allocation
CN107743100B (zh) 一种基于业务预测的在线自适应网络切片虚拟资源分配方法
Zhang et al. Workload-aware load balancing for clustered web servers
US7680897B1 (en) Methods and systems for managing network traffic
EP1499152B1 (fr) Procédé et dispositif d'attribution adaptive et en ligne dans des réseaux hiérarchiques superposés
KR101947354B1 (ko) 경합하는 어플리케이션들 간에 네트워크 대역폭을 배분하기 위한 시스템
US8352951B2 (en) Method and apparatus for utility-based dynamic resource allocation in a distributed computing system
CN113364850B (zh) 软件定义云边协同网络能耗优化方法和系统
US20080215742A1 (en) METHOD AND APPARATUS FOR DYNAMICALLY ADJUSTING RESOURCES ASSIGNED TO PLURALITY OF CUSTOMERS, FOR MEETING SERVICE LEVEL AGREEMENTS (SLAs) WITH MINIMAL RESOURCES, AND ALLOWING COMMON POOLS OF RESOURCES TO BE USED ACROSS PLURAL CUSTOMERS ON A DEMAND BASIS
EP0384339A2 (fr) Courtier pour la sélection de serveur de réseau d'ordinateur
CN111711666B (zh) 一种基于强化学习的车联网云计算资源优化方法
EP3371938B1 (fr) D'attribution adaptative de ressources guidés par l'abonné pour surveillance basée sur la poussée
CN112822050A (zh) 用于部署网络切片的方法和装置
US20230110415A1 (en) Dynamic connection capacity management
JP2009244945A (ja) 負荷分散プログラム、負荷分散方法、負荷分散装置およびそれを含むシステム
Guha Roy et al. Service aware resource management into cloudlets for data offloading towards IoT
CN111143036A (zh) 一种基于强化学习的虚拟机资源调度方法
CN112219191A (zh) 数据中心中的服务和服务器的自配置
US11838389B2 (en) Service deployment method and scheduling apparatus
JP3545931B2 (ja) 呼制御スケジューリング方法
Li et al. Profit driven service provisioning in edge computing via deep reinforcement learning
JP7359222B2 (ja) 通信管理装置及び通信管理方法

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP US

WWE Wipo information: entry into national phase

Ref document number: 2004570858

Country of ref document: JP

WWE Wipo information: entry into national phase

Ref document number: 11099538

Country of ref document: US