CN111371866B - Method and device for processing service request - Google Patents

Method and device for processing service request

Info

Publication number
CN111371866B
CN111371866B
Authority
CN
China
Prior art keywords
cache server
service request
server
load
cache
Prior art date
Legal status
Active
Application number
CN202010120697.1A
Other languages
Chinese (zh)
Other versions
CN111371866A (en)
Inventor
张瑶
Current Assignee
Xiamen Wangsu Co Ltd
Original Assignee
Xiamen Wangsu Co Ltd
Priority date
Filing date
Publication date
Application filed by Xiamen Wangsu Co Ltd filed Critical Xiamen Wangsu Co Ltd
Priority to CN202010120697.1A priority Critical patent/CN111371866B/en
Publication of CN111371866A publication Critical patent/CN111371866A/en
Application granted granted Critical
Publication of CN111371866B publication Critical patent/CN111371866B/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/50 Network services
    • H04L 67/56 Provisioning of proxy services
    • H04L 67/563 Data redirection of data network streams
    • H04L 67/568 Storing data temporarily at an intermediate stage, e.g. caching
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L 67/1004 Server selection for load balancing
    • H04L 67/1008 Server selection for load balancing based on parameters of servers, e.g. available memory or workload

Abstract

The invention discloses a method and a device for processing a service request, belonging to the technical field of data transmission. The method comprises the following steps: a first cache server receives a service request from a user terminal and queries its local load state; if the first cache server is in an overload state, it determines a second cache server according to the load states of the standby cache servers; the first cache server then triggers the second cache server to acquire the service request, so that the second cache server processes it. The invention can relieve the load pressure on an overloaded machine in time and reduce the impact of machine overload on service quality.

Description

Method and device for processing service request
Technical Field
The present invention relates to the field of data transmission technologies, and in particular, to a method and an apparatus for processing a service request.
Background
With the continuous development of Internet technology, network environments are increasingly complex, the number of users keeps growing, and quality requirements for network access are ever higher. To improve user experience and competitiveness, major network operators deploy large numbers of cache servers in their network architectures, or choose to cooperate with cloud service providers, so that accelerated network access is achieved through the cache servers.
In the above process, a service request sent by a user terminal may first be transmitted to a redirection server, which distributes received service requests to different cache servers according to a preset redirection policy (e.g., redirection by URL or file name). If a cache server has the corresponding request-processing capability, it can respond to the service request directly, so that the request need not be forwarded on to the source station.
In the process of implementing the invention, the inventor finds that the prior art has at least the following problems:
The redirection server periodically probes the load state of the cache servers in working state, for example once every 3-5 minutes. When a cache server becomes overloaded, two problems arise. On one hand, because the probing is insufficiently real-time, the redirection server may take a long time (3-5 minutes) to perceive the overload state, and until it perceives the latest state it continues to redirect user service requests to the overloaded cache server, which hurts response efficiency. On the other hand, if a user terminal has established a long-lived TCP connection with the cache server, it will keep sending service requests over that connection, and those requests cannot be peeled off; when the cache server is overloaded, they cannot be answered promptly, and user experience suffers.
Disclosure of Invention
In order to solve the problems in the prior art, embodiments of the present invention provide a method and an apparatus for processing a service request. The technical scheme is as follows:
in a first aspect, a method for processing a service request is provided, where the method includes:
a first cache server receives a service request from a user terminal and queries the local load state;
if the first cache server is in an overload state, determining a second cache server according to the load state of the standby cache server;
and the first cache server triggers the second cache server to acquire the service request so as to enable the second cache server to process the service request.
In a second aspect, an apparatus for processing a service request is provided, the apparatus comprising:
a request receiving module, configured to receive a service request of a user terminal;
the load balancing module is used for querying the local load state, and determining a second cache server according to the load state of the standby cache server if the local machine is in an overload state;
and the data sending module is used for triggering the second cache server to acquire the service request so as to enable the second cache server to process the service request.
In a third aspect, a cache server is provided, which includes a processor and a memory, where at least one instruction, at least one program, a set of codes, or a set of instructions is stored in the memory, and the at least one instruction, the at least one program, the set of codes, or the set of instructions is loaded and executed by the processor to implement the method for processing a service request according to the first aspect.
In a fourth aspect, there is provided a computer readable storage medium having stored therein at least one instruction, at least one program, set of codes, or set of instructions, which is loaded and executed by a processor to implement the method of handling service requests according to the first aspect.
The technical scheme provided by the embodiment of the invention has the following beneficial effects:
in the embodiment of the invention, a first cache server receives a service request from a user terminal and queries the local load state; if the first cache server is in an overload state, it determines a second cache server according to the load states of the standby cache servers; the first cache server then triggers the second cache server to acquire the service request, so that the second cache server processes it. Compared with load probing performed only by the redirection server, on one hand, the cache server probes its own load state and can shorten the probe interval to the second level, which improves the real-time performance and accuracy of the probe results, allows the load pressure of an overloaded machine to be perceived in time, and lets a more suitable cache server be chosen for the service request, reducing the impact of machine overload on service quality; on the other hand, for scenarios with long-lived TCP connections, the cache server that receives the service request performs load balancing itself, so service requests arriving over the long TCP connection can be effectively transferred and answered promptly, improving user experience.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a network framework diagram for processing service requests according to an embodiment of the present invention;
fig. 2 is a flowchart of a method for processing a service request according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of an apparatus for processing a service request according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of an apparatus for processing a service request according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a cache server according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
The embodiment of the invention provides a method for processing a service request, which can be applied to a cache server, and a specific network framework is shown in fig. 1.
The redirection server may be configured to direct service requests from user terminals; specifically, it may distribute a service request to a corresponding cache server based on a preset distribution policy. The redirection server may direct a user terminal's service request in at least the following three ways:
The first way: a service request sent by the user terminal to a target source station may be obtained by mirroring, a target cache server is selected based on the service request and a preset distribution policy, and the address of that cache server is fed back to the user terminal via a 302 redirect, so that the user terminal requests the resource from the target cache server using the address in the 302 redirect. It is worth noting that the redirection server and the user terminal generally belong to the same intranet, so the redirection server can respond before the target source station does when it obtains the mirrored request; at the same time, so that the user terminal accepts the response, the redirection server can disguise itself as the target source station by rewriting the response's source IP to the destination IP of the user's request before sending the 302 redirect to the user terminal;
The second way: after NAT (Network Address Translation) processing by a transparent proxy, the service request may be forwarded directly to a target cache server (not shown) selected based on a preset distribution policy;
The third way: a DNS resolution request for the target domain name sent by the user terminal may be obtained by mirroring, a target cache server is selected based on a preset distribution policy, and the address of the target cache server is returned to the user terminal, so that the user terminal constructs a service request based on that address and sends it to the target cache server.
The preset distribution policy may include distribution based on requested content (for example, by URL or file name), distribution based on source IP, destination IP, or target domain name, or distribution based on application type, which the present invention does not limit.
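As one concrete (and purely illustrative) instance of a content-based distribution policy, the redirection server could hash the requested URL so the same resource always lands on the same cache server; the patent leaves the policy open, so the function name and hashing choice below are assumptions:

```python
import hashlib

def pick_cache_server(url: str, servers: list) -> str:
    """Content-based distribution sketch: hash the requested URL so the
    same resource is always directed to the same cache server, maximizing
    cache-hit rate. (Illustrative; the patent does not fix the policy.)"""
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]

servers = ["192.168.3.1", "192.168.3.2", "192.168.3.3"]
# The same URL always maps to the same server, so its cached copy is reused.
assert pick_cache_server("/video/a.mp4", servers) == pick_cache_server("/video/a.mp4", servers)
```

A URL- or file-name keyed policy like this is what makes it worthwhile for each cache server to keep the resource locally, since repeated requests for the same content converge on the same machine.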
Furthermore, the redirection server may periodically probe each cache server to obtain its service state, and select the target cache server based on the preset distribution policy and those service states. The service state may include whether the server is alive, whether it is overloaded, and so on.
The cache server may be installed in a network service architecture of a network operator or a service provider, or provided by a cloud service provider, and may be used to cache a resource file requested by a user, so as to implement a quick response to a service request and save an extranet access resource.
An active-standby relationship can be established between cache servers; cache servers that are active and standby for each other probe each other's load state and achieve load balancing according to the probe results. The active/standby cache servers may be cache servers storing the same class of resources. Preferably, to avoid loops, only a one-way active-standby relationship is allowed between any two cache servers: for example, if cache servers B and C are standby servers of cache server A, then A is not set as a standby server of B or C.
After receiving a service request from a user terminal, if the local load is normal, the cache server responds to the request by looking up its cached resource files; if the requested resource is not cached, the cache server can fetch it from the source station, respond to the user terminal, and cache the fetched resource for subsequent requests. If the local machine is overloaded, the cache server can select one of its standby servers and trigger that server to respond to the service request; specifically, it can feed the standby server's address back to the user terminal, so that the terminal requests the resource from the selected standby server directly. It should be noted that if the redirection server directed the service request to the cache server in the second way, the cache server may feed the standby server's address back to the user terminal in the following two ways:
First, the standby server's address can be fed back to the user terminal directly, but the cache server must simultaneously disguise itself as the target server of the user's request, by rewriting the source IP of the response to the destination IP requested by the user, so that the user terminal accepts the response;
Second, the standby server's address may be fed back to the redirection server, so that the redirection server redirects service requests to the standby server.
The cache server may include a processor, a memory, and a transceiver, where the processor may be configured to perform processing of a service request in the following process, the memory may be configured to store data required and generated in the following process, and the transceiver may be configured to receive and transmit related data in the following process.
The processing flow shown in fig. 2 will be described in detail below with reference to specific embodiments; the contents may be as follows:
step 201, a first cache server receives a service request of a user terminal, and queries a local load state.
In implementation, after the user terminal generates the service request, it may send the request toward the corresponding service server. The redirection server may direct the service request to the first cache server in any of the ways described above. The first cache server then receives the service request and queries the local load state, to determine whether the local machine currently has sufficient device resources to process it.
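The overload judgment in step 201 can be sketched as a simple threshold check over the monitored load parameters. The parameter names and threshold values below are assumptions for illustration; the patent does not specify them:

```python
# Hypothetical per-parameter overload thresholds; a real deployment
# would tune these per machine and workload.
OVERLOAD_THRESHOLDS = {"cpu": 0.85, "memory": 0.90, "bandwidth": 0.80, "io": 0.90}

def is_overloaded(load_state: dict) -> bool:
    """The machine is considered overloaded as soon as any monitored
    parameter exceeds its threshold."""
    return any(load_state[k] > OVERLOAD_THRESHOLDS[k] for k in OVERLOAD_THRESHOLDS)

assert is_overloaded({"cpu": 0.95, "memory": 0.40, "bandwidth": 0.30, "io": 0.20})
assert not is_overloaded({"cpu": 0.50, "memory": 0.40, "bandwidth": 0.30, "io": 0.20})
```

Because this check runs on the machine receiving the request, it can be evaluated on every request at negligible cost, which is what enables the second-level responsiveness claimed later.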
Step 202, if the first cache server is in an overload state, determining a second cache server according to the load state of the standby cache server.
In implementation, if the first cache server detects that the local machine is in an overload state, it may obtain the load states of its corresponding standby cache servers and select an available standby cache server from them as the second cache server.
Further, when selecting the second cache server, the future load state of each standby cache server can be predicted from its current and historical load states, and the cache server with the best predicted future load is selected as the second cache server.
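The patent only says that current and historical load are combined to judge future load; one minimal illustrative choice is a linear trend extrapolation (last sample plus the average recent change). The function below is a sketch under that assumption:

```python
def predict_future_load(history: list) -> float:
    """Extrapolate the next load sample from recent history using a
    simple linear trend. A rising history predicts a higher future
    load than a falling one, so a server whose load is dropping is
    preferred even if its current load equals another's."""
    if len(history) < 2:
        return history[-1]
    deltas = [b - a for a, b in zip(history, history[1:])]
    return history[-1] + sum(deltas) / len(deltas)

# Rising load (0.2 -> 0.6) extrapolates above falling load (0.6 -> 0.2).
assert predict_future_load([0.2, 0.4, 0.6]) > predict_future_load([0.6, 0.4, 0.2])
```

Any smoother predictor (exponential moving average, regression over a window) would fit the same role; the point is that trend, not just the instantaneous value, drives the choice.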
In the network architecture of this embodiment, each cache server may establish association relationships with several other cache servers, the associated cache servers acting as active and standby for one another. If the first cache server is in a normal state, it can process and respond to the service request directly. Of course, in another embodiment, the standby cache servers may instead be servers dedicated to providing backup service for service requests, distinct from conventional cache servers and belonging to a backup cluster.
Optionally, the second cache server may be selected according to the performance requirements of each service request; correspondingly, the processing in step 202 may be as follows: the first cache server determines the second cache server according to the load states of the standby cache servers and the load-demand quota corresponding to the service request.
In implementation, when selecting the second cache server, the first cache server may first determine the load-demand quota corresponding to the service request. It may then select, according to the probed load states, a standby cache server that can meet that quota as the second cache server. For example, if the service request has a high load-demand quota for CPU and memory but a low one for bandwidth and IO, a standby cache server with low CPU and memory load may be selected as the second cache server.
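The quota-based selection just described can be sketched as follows; server names, the load parameters, and the numbers are illustrative assumptions, and free capacity is modeled as 1.0 minus current load:

```python
def select_second_server(backups: dict, demand: dict):
    """Pick a standby cache server whose free capacity covers the
    request's load-demand quota on every required parameter.
    Returns None when no standby server qualifies."""
    for name, load in backups.items():
        if all(1.0 - load[param] >= need for param, need in demand.items()):
            return name
    return None

backups = {
    "B": {"cpu": 0.7, "memory": 0.5},   # hypothetical standby server loads
    "C": {"cpu": 0.2, "memory": 0.3},
}
# A CPU/memory-heavy request needing 0.5 free CPU and 0.4 free memory
# skips the busy server B and lands on C.
assert select_second_server(backups, {"cpu": 0.5, "memory": 0.4}) == "C"
```

Parameters the request does not care about (bandwidth, IO in the example above) are simply left out of the demand dict and are not checked.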
Step 203, the first cache server triggers the second cache server to obtain the service request, so that the second cache server processes the service request.
In implementation, after the first cache server determines the second cache server, the second cache server may be triggered to acquire the service request of the user terminal, so that the second cache server may complete processing and responding of the service request after acquiring the service request.
In this embodiment, the load states of the servers may be recorded in table form so that probe results can be queried quickly. The corresponding processing may be as follows: the first cache server periodically probes the local load state and the load states of the standby cache servers, and updates the server-available identification table according to the probe results.
In implementation, while providing service, the first cache server may periodically probe the local load state and the load states of the corresponding standby cache servers. Meanwhile, the first cache server may maintain a server-available identification table (abbreviated T-Enable), whose structure may be as shown in Table 1:
TABLE 1

IP address                     Bandwidth redundancy flag bit
127.0.0.1 (local machine)      0
192.168.3.1 (cache server B)   1
192.168.3.2 (cache server C)   0
The server identifier is used to identify each server; it may be the local IP address and the IP addresses of the standby cache servers, or a hostname or other identity. The available-state identifier indicates the server's current state, that is, whether the server was judged overloaded based on the most recent probe result: 1 means the load is normal and the server is available, and 0 means the server is overloaded and unavailable.
Further, the value of each load parameter, and a priority for each standby cache server derived from those parameters, may also be recorded in the server-available identification table: the lower a standby server's current load, the higher its priority, and the sooner it is selected as the second cache server.
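An in-memory form of the T-Enable table, mirroring Table 1 plus the optional per-parameter loads and priority, might look like the dict below; the field names and timestamp column are assumptions added for illustration:

```python
import time

# Illustrative T-Enable table keyed by server IP. "available" mirrors
# the flag bit in Table 1; "load" and "priority" are the optional extra
# columns; "updated" records when the row was last probed.
t_enable = {
    "127.0.0.1":   {"available": 0, "load": {"cpu": 0.9, "bandwidth": 0.95}, "priority": 0, "updated": time.time()},
    "192.168.3.1": {"available": 1, "load": {"cpu": 0.3, "bandwidth": 0.40}, "priority": 2, "updated": time.time()},
    "192.168.3.2": {"available": 0, "load": {"cpu": 0.8, "bandwidth": 0.90}, "priority": 1, "updated": time.time()},
}

def lookup(ip: str) -> bool:
    """O(1) availability check; the table exists precisely so that no
    fresh probe is needed on the request path."""
    return t_enable[ip]["available"] == 1

assert lookup("192.168.3.1") and not lookup("127.0.0.1")
```

The table is written only by the periodic probers and read on every request, which keeps per-request overhead to a single dictionary lookup.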
Optionally, different probing frequencies may be adopted for the local machine and the standby cache servers. The corresponding processing may be as follows: the first cache server probes the local load state on a short period and updates the server-available identification table with the latest result; it sends load probe requests to the standby cache servers on a long period and updates the table according to the request responses they feed back.
In an implementation, the first cache server may probe the local load status and the load status of the standby cache server according to different probing frequencies.
First, for the local machine, probing is convenient and efficient, so a higher frequency can be used: the first cache server can probe the local load state on a short period (e.g., 1 s) and update its own entry in the server-available identification table with the latest result. Specifically, the cache server can determine the bandwidth load by reading the real-time traffic of the local network card, and can synchronously collect other local state information, including but not limited to CPU utilization and IO access rate.
Second, for the standby cache servers, cross-machine probing consumes device processing resources, so the frequency should not be too high: the first cache server can send a load probe request to each standby cache server on a long period (e.g., 3 minutes), and on receiving the response fed back for that request, update the corresponding standby server's entry in the server-available identification table. The load probe request may be specially constructed: its header may use a special format with a marker such as "checknet" to distinguish it from ordinary service requests. With this processing, high-frequency local probing keeps the local load state timely and accurate, while low-frequency cross-machine probing saves device processing resources.
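The two probing loops with their different periods can be sketched as below. The 1 s and 180 s periods come from the examples in the text; the callback names (`read_local_load`, `send_checknet`) are hypothetical stand-ins for reading the network card and issuing the "checknet" request:

```python
import threading

LOCAL_PERIOD = 1      # seconds: cheap local probe, high frequency
REMOTE_PERIOD = 180   # seconds: cross-machine "checknet" probe, low frequency

def probe_local(table, read_local_load):
    """One short-period tick: refresh the local row from real-time
    readings (network-card traffic, CPU utilization, IO rate, ...)."""
    table["127.0.0.1"] = read_local_load()

def probe_backups(table, send_checknet, backups):
    """One long-period tick: issue a 'checknet' probe to each standby
    server and record whatever it answers."""
    for ip in backups:
        table[ip] = send_checknet(ip)

def run_probers(table, read_local_load, send_checknet, backups, stop):
    """Run the two ticks on independent timers so the frequencies
    never interfere (sketch; error handling omitted)."""
    def local_loop():
        while not stop.is_set():
            probe_local(table, read_local_load)
            stop.wait(LOCAL_PERIOD)
    def remote_loop():
        while not stop.is_set():
            probe_backups(table, send_checknet, backups)
            stop.wait(REMOTE_PERIOD)
    for target in (local_loop, remote_loop):
        threading.Thread(target=target, daemon=True).start()
```

Keeping the two loops separate is what lets the local row stay second-fresh while the remote rows age up to one long period.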
Optionally, a cache server may answer a received load probe request with the local load state it has already probed; the corresponding processing may be as follows: when the first cache server receives a load probe request, it feeds back the local load state recorded in its server-available identification table, which speeds up the probe-response process.
In implementation, the first cache server may itself serve as a standby cache server for other cache servers while probing its own standbys. It can therefore receive load probe requests sent by other cache servers, query the locally maintained server-available identification table, obtain the local load state recorded there, and feed it back to the sender of the probe request. Because the local load state in the table is refreshed by high-frequency local probing, its accuracy is effectively ensured; and because probe requests are answered from the table, accuracy and timeliness of probe results are preserved as far as possible even under low-frequency cross-machine probing.
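Answering a "checknet" probe from the table rather than by measuring on demand can be sketched as below; the request/response shapes are assumptions:

```python
def handle_probe(table: dict, request: dict) -> dict:
    """Answer a 'checknet' load probe straight from the locally
    maintained table. The local row is refreshed every second by the
    short-period prober, so this stays accurate while keeping the
    probe response fast and cheap."""
    if request.get("type") == "checknet":
        return {"load": table["127.0.0.1"]}
    return {"error": "not a probe request"}

table = {"127.0.0.1": {"cpu": 0.42}}
assert handle_probe(table, {"type": "checknet"}) == {"load": {"cpu": 0.42}}
```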
Optionally, a cache server may achieve load balancing by querying the server-available identification table; accordingly, the processing in step 202 may be as follows: the first cache server selects, according to the server-available identification table, a second cache server whose load state is available.
In implementation, the first cache server periodically probes the load states of all standby cache servers and updates the server-available identification table with the results. If it then finds the local machine overloaded and a received service request must be transferred to another cache server, it can query the locally maintained table and select as the second cache server one whose load state is available. Further, if the table also records priorities, the standby cache server with the highest priority may be selected as the second cache server.
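The table-driven selection of step 202 reduces to a filter-and-max over the table rows; the row layout below matches the T-Enable sketch (an "available" flag plus a "priority" derived from current load) and is otherwise illustrative:

```python
def select_from_t_enable(table: dict):
    """Pick the available standby server with the highest recorded
    priority (lower current load means higher priority). The local
    row is excluded, and None is returned when every standby is
    overloaded."""
    candidates = [(row["priority"], ip) for ip, row in table.items()
                  if ip != "127.0.0.1" and row["available"] == 1]
    if not candidates:
        return None
    return max(candidates)[1]

table = {
    "127.0.0.1":   {"available": 0, "priority": 0},
    "192.168.3.1": {"available": 1, "priority": 2},
    "192.168.3.2": {"available": 1, "priority": 1},
}
assert select_from_t_enable(table) == "192.168.3.1"
```

When `None` comes back, the request would have to be handled locally anyway (or escalated), since no standby server is in an available state.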
Optionally, there are various ways in which the first cache server may trigger the second cache server to obtain the service request. Two possible ways are given below:
The first way: the first cache server sends the network address of the second cache server to the user terminal, so that the user terminal sends the service request to the second cache server using that network address.
In implementation, after determining the second cache server, the first cache server may send the second cache server's network address to the user terminal via a 302 redirect. After obtaining the network address from the 302 redirect, the user terminal can establish a communication connection with the second cache server according to that address and send the service request over that connection. Thus, when the first cache server cannot process the current service request, it can immediately feed the 302 redirect back to the user terminal, responding quickly, and the user terminal then requests the resource from the second cache server directly without continuing to occupy the first cache server's resources, effectively relieving its load.
The second way: the first cache server forwards the service request to the second cache server.
In implementation, after determining the second cache server, the first cache server may forward the service request to it over the communication connection maintained between the cache servers.
It should be noted that after the second cache server receives the service request, it also performs the above method to judge whether the local machine can process it, ensuring that a cache server in good load condition is found to answer the request.
It should also be noted that although both ways let the second cache server obtain the service request, the device processing resources consumed by the two ways differ. Compared with the second way, the first way, in which the terminal establishes a connection with the second cache server directly, effectively relieves the bandwidth load of the first cache server; but because the terminal must re-initiate a connection to the second cache server, processing the service request takes longer and consumes more terminal resources, degrading the user experience of the service. Therefore, if the first cache server's bottleneck is not bandwidth, the second way can be chosen: the first cache server forwards the service request directly to the second cache server, shortening the processing time so that the user side perceives nothing and user experience improves. If the first cache server's load bottleneck is bandwidth, the first way can be chosen: the terminal connects to the second cache server directly, the first cache server no longer needs to forward the service request, and its bandwidth load is relieved.
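The bottleneck-driven choice between the two ways can be sketched as a one-line decision; the load-dict shape and the rule "bandwidth bottleneck means redirect, anything else means forward" follow the reasoning above, while the names are assumptions:

```python
def choose_transfer_mode(local_load: dict) -> str:
    """If the local bottleneck is bandwidth, answer with a 302 redirect
    so the terminal talks to the second server directly (way one);
    otherwise forward over the existing inter-server connection so the
    user side notices nothing (way two)."""
    bottleneck = max(local_load, key=local_load.get)  # most loaded parameter
    return "302_redirect" if bottleneck == "bandwidth" else "forward"

assert choose_transfer_mode({"cpu": 0.6, "bandwidth": 0.95}) == "302_redirect"
assert choose_transfer_mode({"cpu": 0.9, "bandwidth": 0.5}) == "forward"
```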
Optionally, when the first cache server becomes overloaded or its load recovers, it promptly notifies the redirection server to suspend or resume sending it service requests. The corresponding processing may be as follows: when the first cache server is in an overload state, it sends a service suspension notification to the redirection server, so that the redirection server suspends distributing service requests to the first cache server; when the load of the first cache server returns to normal, it sends a service restart notification to the redirection server, so that the redirection server resumes distributing service requests to the first cache server.
In implementation, if the first cache server detects that the local machine is in an overload state, it may send a service suspension notification to the redirection server. Upon receiving the notification, the redirection server may mark the first cache server as unavailable and suspend distributing service requests to it, so that the first cache server receives no new service requests. When the first cache server detects that its load has returned to normal, it may send a service restart notification to the redirection server; upon receiving it, the redirection server may mark the first cache server as available and again distribute service requests to it. With this processing, the cache server actively informs the redirection server to suspend or resume its external service. This avoids the situation where, relying only on the redirection server's own probing, the redirection server fails to learn of the cache server's state change in time, so the cache server keeps receiving service requests, remains overloaded for a long time, and cannot guarantee service quality.
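The notification logic above is edge-triggered: messages are sent only on state transitions, not on every load sample. A hedged sketch (class and message names are assumptions, not from the patent):

```python
# Edge-triggered overload notifier: the cache server tracks its previous
# state and notifies the redirection server only when the state changes,
# so the redirector never has to discover changes by probing.

class OverloadNotifier:
    def __init__(self, send):
        self.send = send          # callable delivering a message to the redirection server
        self.overloaded = False   # last known local state

    def on_load_sample(self, is_overloaded):
        if is_overloaded and not self.overloaded:
            self.send("suspend_service")   # entering overload: stop sending me requests
        elif not is_overloaded and self.overloaded:
            self.send("restart_service")   # load recovered: resume sending me requests
        self.overloaded = is_overloaded
```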
Optionally, after detecting that the local machine is in an overload state, the cache server may actively notify the other cache servers associated with it. The corresponding processing may be as follows: the first cache server records all source servers corresponding to the load probe requests received within a preset duration; if the first cache server is in an overload state, it sends a service suspension notification to all of those source servers.
In implementation, the first cache server may act as a backup server for other cache servers during operation, so it may continuously receive load probe requests from them. In this process, the first cache server may record all source servers corresponding to the load probe requests received within a preset duration; the first cache server is the standby cache server of these source servers. The preset duration can be set according to the cross-server probe period, and is preferably slightly shorter than that period, so that the local state is actively reported before some source servers send their next probe requests, saving resources. Then, when the first cache server detects that the local machine is in an overload state, it may send a service suspension notification to all recorded source servers, so that they mark the service state of the first cache server as unavailable. This prevents other cache servers from redirecting service requests to the first cache server while it is overloaded, which would cause a service request to be redirected repeatedly.
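The bookkeeping above amounts to a sliding window over probe arrivals. A minimal sketch, assuming a preset window slightly shorter than the cross-server probe period (all names and the window length are illustrative; the injectable `clock` just makes the sketch testable):

```python
import time

# Record which source servers probed us recently; on overload, notify
# exactly those servers so they mark us unavailable before their next probe.

class SourceServerRegistry:
    def __init__(self, window_seconds=30.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock
        self.last_seen = {}       # source server id -> timestamp of its last probe

    def on_probe(self, source_id):
        self.last_seen[source_id] = self.clock()

    def active_sources(self):
        """Source servers that probed us within the preset duration."""
        now = self.clock()
        return [s for s, t in self.last_seen.items() if now - t <= self.window]

    def notify_overload(self, send):
        # send(source_id, message) delivers the suspension notification
        for source in self.active_sources():
            send(source, "suspend_service")
```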
Of course, in another embodiment, the server identifiers of all the source servers may be written directly into the configuration file of the first cache server, so that when the first cache server is overloaded, all the source servers can be determined from the configuration file. It can be understood that, compared with writing the configuration file directly, determining all source servers through load probe requests copes better with changing device states in the network architecture, improving the accuracy and real-time performance of the source-server statistics.
In the embodiment of the present invention, a first cache server receives a service request from a user terminal and queries the load state of the local machine; if the first cache server is in an overload state, it determines a second cache server according to the load state of the standby cache servers; the first cache server then triggers the second cache server to obtain the service request, so that the second cache server processes it. Compared with relying solely on the redirection server for load probing, on the one hand, because each cache server probes its own load state, the probing interval can be shortened to the second level, improving the real-time performance and accuracy of the probing result; the load pressure on an overloaded machine can be relieved by suspending reception or forwarding in time, a more suitable cache server is selected for the service request, and the impact of machine overload on service quality is reduced. On the other hand, in scenarios with a long-lived TCP connection, the cache server that receives the service request performs the load balancing itself, so service requests transmitted over the long TCP connection can be effectively transferred and responded to promptly, improving the user experience.
Based on the same technical concept, an embodiment of the present invention further provides a device for processing a service request, as shown in fig. 3, where the device includes:
a request receiving module 301, configured to receive a service request of a user terminal;
the load balancing module 302 is configured to query a local load state, and determine a second cache server according to the load state of the standby cache server if the local load state is in an overload state;
a data sending module 303, configured to trigger the second cache server to obtain the service request, so that the second cache server processes the service request.
Optionally, as shown in fig. 4, the apparatus further includes a load detection module 304, configured to:
detecting the load state of the local computer in a short period, and updating the available identification table of the server according to the latest detection result;
and sending a load detection request to the standby cache server in a long period, and updating a server available identification table according to a request response fed back by the standby cache server.
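The load detection module's two cadences — a short period for the local check and a long period for probing backups — both feed one availability table. A minimal sketch under those assumptions (class name, tick-based scheduling, and the concrete periods are illustrative):

```python
# Dual-period probing: check the local load every local_period ticks and
# probe backup cache servers every backup_period ticks, recording both
# results in the server available identification table.

class AvailabilityTable:
    def __init__(self, local_period=1, backup_period=5):
        self.local_period = local_period      # short period (e.g. seconds)
        self.backup_period = backup_period    # long period (e.g. seconds)
        self.table = {}                       # server id -> "available" / "unavailable"

    def tick(self, now, check_local, probe_backups):
        """now: integer tick count.
        check_local() -> bool: is the local machine under normal load?
        probe_backups() -> dict of backup server id -> bool (probe response OK?).
        """
        if now % self.local_period == 0:
            self.table["local"] = "available" if check_local() else "unavailable"
        if now % self.backup_period == 0:
            for server_id, ok in probe_backups().items():
                self.table[server_id] = "available" if ok else "unavailable"
```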
Optionally, the data sending module 303 is specifically configured to:
sending the network address of the second cache server to the user terminal, so that the user terminal sends the service request to the second cache server according to the network address;
or forwarding the service request to the second cache server.
Fig. 5 is a schematic structural diagram of a cache server according to an embodiment of the present invention. The cache server 500 may vary widely in configuration or performance and may include one or more central processors 522 (e.g., one or more processors) and memory 532, one or more storage media 530 (e.g., one or more mass storage devices) storing applications 542 or data 544. Memory 532 and storage media 530 may be, among other things, transient storage or persistent storage. The program stored on the storage medium 530 may include one or more modules (not shown), each of which may include a sequence of instructions operating on the cache server 500. Still further, the central processor 522 may be configured to communicate with the storage medium 530 to execute a series of instruction operations in the storage medium 530 on the cache server 500.
The cache server 500 may also include one or more power supplies 529, one or more wired or wireless network interfaces 550, one or more input-output interfaces 558, one or more keyboards 556, and/or one or more operating systems 541, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, etc.
Cache server 500 may include memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for processing the service request as described above.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (14)

1. A method of processing a service request, the method comprising:
a first cache server receives a service request of a user terminal and inquires the load state of the local machine;
if the first cache server is in an overload state, determining a second cache server according to the load state of the standby cache server;
the first cache server triggers the second cache server to acquire the service request so that the second cache server processes the service request;
the triggering, by the first cache server, the second cache server to obtain the service request includes:
and the first cache server forwards the service request to the second cache server.
2. The method of claim 1, further comprising:
and the first cache server periodically detects the load state of the local cache server and the load state of the standby cache server, and updates the available server identification table according to the detection result.
3. The method of claim 2, wherein the first cache server periodically probes the local load status and the load status of the standby cache server, and updates the server available identifier table according to the probing result, comprising:
the first cache server detects the load state of the local machine in a short period, and updates the available identification table of the server according to the latest detection result;
and the first cache server sends a load detection request to the standby cache server in a long period, and updates the available server identification table according to a request response fed back by the standby cache server.
4. The method of claim 3, further comprising:
and when a load detection request is received, the first cache server feeds back the local load state recorded in the server available identification table.
5. The method of claim 2, wherein the determining, by the first cache server, the second cache server according to the load status of the backup cache server comprises:
and the first cache server selects a second cache server with a load state as an available state according to the server available identification table.
6. The method of claim 1, wherein the triggering, by the first cache server, the second cache server to obtain the service request comprises:
and the first cache server sends the network address of the second cache server to the user terminal, so that the user terminal sends the service request to the second cache server according to the network address.
7. The method of claim 1, further comprising:
when the first cache server is in an overload state, the first cache server sends a service suspension notice to a redirection server, so that the redirection server suspends the redirection of the service request to the first cache server;
when the load of the first cache server is recovered to be normal, the first cache server sends a service restart notification to a redirection server, so that the redirection server redirects a service request to the first cache server again.
8. The method of claim 1, wherein the determining, by the first cache server, the second cache server according to the load status of the backup cache server comprises:
and the first cache server determines a second cache server according to the load state of the standby cache server and the load demand limit corresponding to the service request.
9. The method of claim 1, further comprising:
the first cache server records all source servers corresponding to the load detection requests received within a preset time length;
and if the first cache server is in an overload state, sending a service suspension notification to all the source servers.
10. An apparatus for processing a service request, the apparatus comprising:
a request receiving module, configured to receive a service request of a user terminal;
the load balancing module is used for inquiring the load state of the local computer, and determining a second cache server according to the load state of the standby cache server if the local computer is in an overload state;
the data sending module is used for triggering the second cache server to acquire the service request so that the second cache server processes the service request; the triggering, by the first cache server, the second cache server to obtain the service request includes: and the first cache server forwards the service request to the second cache server.
11. The apparatus of claim 10, further comprising a load detection module to:
detecting the load state of the local computer in a short period, and updating the available identification table of the server according to the latest detection result;
and sending a load detection request to the standby cache server in a long period, and updating a server available identification table according to a request response fed back by the standby cache server.
12. The apparatus of claim 10, wherein the data sending module is specifically configured to:
sending the network address of the second cache server to the user terminal, so that the user terminal sends the service request to the second cache server according to the network address;
or forwarding the service request to the second cache server.
13. A cache server, comprising a processor and a memory, the memory having at least one instruction, at least one program, a set of codes, or a set of instructions stored therein, the at least one instruction, the at least one program, the set of codes, or the set of instructions being loaded and executed by the processor to implement the method of:
receiving a service request of a user terminal, and inquiring the load state of the local machine;
if the machine is in an overload state, determining a second cache server according to the load state of the standby cache server;
triggering the second cache server to acquire the service request so that the second cache server processes the service request;
the triggering, by the first cache server, the second cache server to obtain the service request includes:
and the first cache server forwards the service request to the second cache server.
14. A computer readable storage medium having stored therein at least one instruction, at least one program, set of codes, or set of instructions, which is loaded and executed by a processor to implement a method comprising:
receiving a service request of a user terminal, and inquiring the load state of the local machine;
if the local computer is in an overload state, determining a second cache server according to the load state of the standby cache server;
triggering the second cache server to acquire the service request so that the second cache server processes the service request;
the triggering, by the first cache server, the second cache server to obtain the service request includes:
and the first cache server forwards the service request to the second cache server.
CN202010120697.1A 2020-02-26 2020-02-26 Method and device for processing service request Active CN111371866B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010120697.1A CN111371866B (en) 2020-02-26 2020-02-26 Method and device for processing service request


Publications (2)

Publication Number Publication Date
CN111371866A CN111371866A (en) 2020-07-03
CN111371866B true CN111371866B (en) 2023-03-21

Family

ID=71211174

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010120697.1A Active CN111371866B (en) 2020-02-26 2020-02-26 Method and device for processing service request

Country Status (1)

Country Link
CN (1) CN111371866B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113961349B (en) * 2021-10-28 2022-09-06 中国西安卫星测控中心 QPID cluster control method and system
CN114024915B (en) * 2021-10-28 2023-06-16 北京锐安科技有限公司 Traffic migration method, device and system, electronic equipment and storage medium
CN116366657B (en) * 2023-05-31 2023-08-04 天翼云科技有限公司 Data request scheduling method and system of cache server

Citations (3)

Publication number Priority date Publication date Assignee Title
CN102301682A (en) * 2011-04-29 2011-12-28 华为技术有限公司 Method and system for network caching, domain name system redirection sub-system thereof
CN106534051A (en) * 2015-09-11 2017-03-22 阿里巴巴集团控股有限公司 Access request processing method and access request processing device
CN108123888A (en) * 2016-11-29 2018-06-05 中兴通讯股份有限公司 Load-balancing method, the apparatus and system of message

Family Cites Families (18)

Publication number Priority date Publication date Assignee Title
US6886035B2 (en) * 1996-08-02 2005-04-26 Hewlett-Packard Development Company, L.P. Dynamic load balancing of a network of client and server computer
US20130103785A1 (en) * 2009-06-25 2013-04-25 3Crowd Technologies, Inc. Redirecting content requests
CN101764993A (en) * 2010-01-14 2010-06-30 中山大学 Cooperation caching method in video-on-demand system and video-on-demand system
US9307044B2 (en) * 2012-03-28 2016-04-05 At&T Intellectual Property I, L.P. System and method for routing content based on real-time feedback
CN102882939B (en) * 2012-09-10 2015-07-22 北京蓝汛通信技术有限责任公司 Load balancing method, load balancing equipment and extensive domain acceleration access system
EP2879343A1 (en) * 2013-11-29 2015-06-03 Nederlandse Organisatie voor toegepast- natuurwetenschappelijk onderzoek TNO System for protection against DDos attacks
CN103973788A (en) * 2014-05-08 2014-08-06 浪潮电子信息产业股份有限公司 Load balancing method based on transmission widespread network architecture
CN104320487B (en) * 2014-11-11 2018-03-20 网宿科技股份有限公司 The HTTP scheduling system and method for content distributing network
CN106487607A (en) * 2015-08-28 2017-03-08 中国电信股份有限公司 A kind of reorientation method based on cache server status and Redirectional system
CN105847381A (en) * 2016-04-18 2016-08-10 乐视控股(北京)有限公司 Scheduling method and device for content server
CN108322502A (en) * 2017-12-22 2018-07-24 杭州大搜车汽车服务有限公司 Method, gateway system and storage medium for equalization server load
CN109308223A (en) * 2018-09-17 2019-02-05 平安科技(深圳)有限公司 A kind of response method and equipment of service request
CN109246229B (en) * 2018-09-28 2021-08-27 网宿科技股份有限公司 Method and device for distributing resource acquisition request
CN110336848B (en) * 2019-04-23 2022-12-20 网宿科技股份有限公司 Scheduling method, scheduling system and scheduling equipment for access request
CN109995881B (en) * 2019-04-30 2021-12-14 网易(杭州)网络有限公司 Load balancing method and device of cache server
CN110442432B (en) * 2019-08-22 2022-04-05 北京三快在线科技有限公司 Service processing method, system, device, equipment and storage medium
CN110535948A (en) * 2019-08-30 2019-12-03 北京东软望海科技有限公司 Acquisition methods, device, electronic equipment and the computer readable storage medium of data
CN110769040B (en) * 2019-10-10 2022-07-29 北京达佳互联信息技术有限公司 Access request processing method, device, equipment and storage medium

Patent Citations (3)

Publication number Priority date Publication date Assignee Title
CN102301682A (en) * 2011-04-29 2011-12-28 华为技术有限公司 Method and system for network caching, domain name system redirection sub-system thereof
CN106534051A (en) * 2015-09-11 2017-03-22 阿里巴巴集团控股有限公司 Access request processing method and access request processing device
CN108123888A (en) * 2016-11-29 2018-06-05 中兴通讯股份有限公司 Load-balancing method, the apparatus and system of message

Also Published As

Publication number Publication date
CN111371866A (en) 2020-07-03


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant