Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
To reduce the working pressure on microservices, a load-balanced API (Application Programming Interface) gateway is currently usually implemented by combining an Nginx service with a Zuul (Spring Cloud API Gateway) gateway. The Zuul gateway intercepts requests to the microservices and processes them uniformly (for example, rejecting a request that carries no Token or whose Token fails to parse), and the Nginx service load balances the Zuul gateways to even out their workload. In practice, the Zuul gateway intercepts requests to the microservices and, combined with JWT (JSON Web Token), implements stateless identity authentication, so that a request that carries no Token in its Cookie or whose Token fails to parse is rejected. This reduces the pressure on the services but increases the workload of the gateway, so the Nginx service distributes requests evenly across the gateway cluster through load balancing to relieve the heavier gateway workload.
On the basis of fig. 1, an embodiment of the present invention further provides an interaction schematic diagram of a load balancing server. As shown in fig. 2, the back-end service copies are first registered in a registry; when the load balancing server receives a call request sent by an application service of a client, it determines a callable back-end service copy through the API gateway. However, this approach has the following problems:
(1) The deployment complexity is high. As shown in fig. 2, common load balancing requires adding servers (i.e., load balancing servers) in a load balancing layer, and load balancing must be able to fail over when one of the load balancing servers goes down. Each back-end service copy therefore needs at least 2 load balancing servers, and every additional back-end service copy requires at least 2 more servers to support load balancing. For an industrial business system with limited hardware resources, if there are 10 back-end servers and each needs to be load balanced, 20 additional load balancing servers are required, which is too high a hardware requirement.
(2) The real-time performance is poor. Under high pressure, a back-end server rejects service, reports errors, or lets requests accumulate indefinitely in its processing queue. The load balancing server then triggers a retry mechanism only after a timeout period (for example, 5 s) to switch to the next healthy back-end server. For the client, more than 5 seconds have already elapsed; if the request controls equipment, requires a real-time response, and is related to production safety, serious consequences may follow.
(3) The load balancing algorithm is not fine-grained enough. Current mainstream schemes use random allocation or polling, so a request may be allocated to a heavily loaded back-end node and the load cannot be finely balanced.
In order to alleviate the above problems, the embodiments of the present invention provide a service invocation method, device and system, which can effectively reduce the hardware requirement for invoking backend services, and can also improve the real-time performance for invoking backend services.
To facilitate understanding of the present embodiment, a service invoking method disclosed in the present embodiment is first described in detail. The method is applied to a client, and the client is used to invoke a plurality of back-end service copies in a server. Referring to the flowchart of the service invoking method shown in fig. 3, the method mainly includes the following steps S302 to S308:
Step S302, obtain the priority information of each back-end service copy in the server.
In one embodiment, the priority information of each back-end service copy can be configured in advance, and the configured priority information is stored in a designated area, so that the client can conveniently obtain the priority information of each back-end service copy in the server. In another embodiment, the priority information of each backend service copy may be configured according to the running state or the health state of each backend service copy, but since the running state or the health state of each backend service copy changes in real time, the priority information of each backend service copy may be updated in real time, and the updated priority information may be stored in a designated area.
Step S304, if receiving the calling request sent by the client, calling the first target service copy from each back-end service copy according to the priority information.
The call request instructs the client to determine and call the first target service copy from the back-end service copies. In one embodiment, the back-end service copies may be considered in order of priority from high to low to determine the first target service copy, which is then invoked.
Step S306, determine whether the first target service copy is successfully invoked.
Generally, when the call of the back-end service copy is successful, the back-end service copy can be used to execute the service corresponding to the call request, and when the call of the back-end service copy fails, the service corresponding to the call request cannot be executed. Thus, in one embodiment, if the first target service copy performs the service corresponding to the invocation request, it may be determined that the first target service copy was invoked successfully; if the first target service copy does not execute the service corresponding to the invocation request, it may be determined that the first target service copy invocation failed.
Step S308, if not, the priority information of each back-end service copy is updated, and a second target service copy is continuously determined from each back-end service copy according to the updated priority information until the second target service copy is successfully called, so that the service corresponding to the calling request is executed through the second target service copy.
In an embodiment, the health state of the first target service copy may be determined first. When the health state of the first target service copy is abnormal, its priority is lowered, and the priorities of the other back-end service copies are correspondingly raised, so that the priority information of each back-end service copy is updated. The second target service copy is then determined according to the updated priority information, from high to low, until a second target service copy executes the service corresponding to the call request.
The service calling method provided by the embodiment of the invention includes the steps of firstly obtaining priority information of each back-end service copy in a service end, calling a first target service copy from each back-end service copy according to the priority information if a calling request sent by a client is received, judging whether the first target service copy is successfully called or not, updating the priority information of each back-end service copy if the first target service copy is not successfully called, continuously determining a second target service copy from each back-end service copy according to the updated priority information until the second target service copy is successfully called, and executing a service corresponding to the calling request through the second target service copy. In addition, the embodiment of the invention can determine a first target service copy according to the priority level of each back-end service copy when receiving a call request, and timely updates the priority information of each back-end service copy when the first target service copy cannot be successfully called so as to determine a second target service copy and call the second target service copy.
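The flow of steps S302 to S308 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function name `invoke_with_failover`, the `call` callback, the reduction value of 10, and the rule that each copy is tried at most once per request (added here so the loop always terminates) are all hypothetical.

```python
from typing import Callable, Dict, Optional

PENALTY = 10  # assumed reduction value applied after a failed call


def invoke_with_failover(
    weights: Dict[str, int],
    call: Callable[[str], bool],
) -> Optional[str]:
    """Try copies from highest to lowest weight until one call succeeds.

    `weights` maps a copy's address to its priority (step S302).
    `call` is a hypothetical function returning True on a successful call.
    Returns the address of the copy that served the request, or None.
    """
    tried = set()
    while len(tried) < len(weights):
        # Steps S304/S308: pick the untried copy with the highest weight.
        candidates = {a: w for a, w in weights.items() if a not in tried}
        target = max(candidates, key=candidates.get)
        if call(target):            # Step S306: was the call successful?
            return target
        weights[target] -= PENALTY  # Step S308: lower the failed copy's priority.
        tried.add(target)
    return None
```

Returning `None` when every copy has failed is one possible design choice; a real client might instead raise an error or wait before retrying.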
In one embodiment, the client comprises a load balancing component, and the server comprises a cache center. As a specific implementation of acquiring the priority information of each back-end service copy in the server, the cache center can be accessed through the load balancing component to obtain that priority information. In specific implementation, the priority information may carry the IP (Internet Protocol) information, port information, or address information of the back-end service copy, so that once the client has determined the first target service copy, it can call that copy according to the IP information, port information, or address information.
In an embodiment, the server further includes a thread pool and/or a processing queue. To facilitate understanding of step S308, an embodiment of the present invention provides a specific implementation for updating the priority information of each back-end service copy; see the following steps a to b:
Step a, check the health state of the first target service copy through a health check policy preset in the load balancing component. The health check policy includes at least one of: comparing the current load of the first target service copy with a preset load threshold, comparing the current occupancy ratio of the thread pool of the first target service copy with a preset occupancy ratio threshold, and comparing the current capacity of the processing queue of the first target service copy with a preset capacity threshold. In a specific embodiment, the health state of the first target service copy may be checked in the following modes:
Mode 1: if the current load of the first target service copy is greater than the preset load threshold, determine that the health state of the first target service copy is abnormal. In an embodiment, one or more of the CPU (Central Processing Unit) load, MEM (Memory) load, or IO (disk) load of the first target service copy may be detected, and the current load of each of the CPU, MEM, and IO is compared with its corresponding preset load threshold.
Mode 2: if the current occupancy ratio of the thread pool of the first target service copy is greater than the preset occupancy ratio threshold, determine that the health state of the first target service copy is abnormal. For example, if the preset occupancy ratio threshold is 80%, the health state of the first target service copy is determined to be abnormal when the percentage of active threads in its thread pool (i.e., the current occupancy ratio) reaches 80%.
Mode 3: if the current capacity of the processing queue of the first target service copy is greater than the preset capacity threshold, determine that the health state of the first target service copy is abnormal. For example, if the preset capacity threshold is 80%, the health state of the first target service copy is determined to be abnormal when the current capacity of its processing queue reaches 80%.
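The three checks can be combined in a single predicate, sketched below. The `ReplicaMetrics` structure, its field names, and the 0.9 load threshold are assumptions made for illustration; the 80% values come from the examples above. The sketch uses a strict "greater than" comparison, following the rule as stated for each check.

```python
from dataclasses import dataclass


@dataclass
class ReplicaMetrics:
    """Hypothetical snapshot of one back-end service copy."""
    load: float            # current load as a fraction, 0.0 to 1.0 (e.g. CPU)
    active_threads: int    # threads currently busy in the thread pool
    pool_size: int         # total threads in the thread pool
    queued: int            # requests waiting in the processing queue
    queue_capacity: int    # maximum length of the processing queue


def is_abnormal(m: ReplicaMetrics,
                load_threshold: float = 0.9,       # assumed value
                ratio_threshold: float = 0.8,      # 80%, as in the example
                capacity_threshold: float = 0.8) -> bool:
    """Return True if any one of the three checks marks the copy abnormal."""
    pool_ratio = m.active_threads / m.pool_size
    queue_fill = m.queued / m.queue_capacity
    return (m.load > load_threshold
            or pool_ratio > ratio_threshold
            or queue_fill > capacity_threshold)
```

Because the policy "includes at least one of" the comparisons, a deployment could equally enable only a subset of these checks.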
In another embodiment, the 503 status code may be fed back to the client in time to inform the client of the health status anomaly of the first target service copy.
Step b, if the health state of the first target service copy is abnormal, update the priority information of each back-end service copy. The priority information may include a weight value. In a specific embodiment, the priority information may be updated as follows: if the health state of the first target service copy is abnormal, the weight value of the first target service copy is updated to the difference between its current weight value and a preset reduction value. For example, the same initial weight value is preset for each back-end service copy. If the health state of a back-end service copy is abnormal, that copy cannot successfully process the call request at this time; assuming the preset reduction value is "10", the difference between the weight value of the back-end service copy and "10" is calculated and used as the updated weight value of that copy.
In addition, if the first target service copy is successfully called, the weight value of the first target service copy is updated by using the sum of the weight value of the first target service copy and the preset added value. For example, the preset added value is "1", and when the first target service copy is successfully called, the sum of the weight value of the first target service copy and "1" is calculated, and the sum is used as the updated weight value of the first target service copy.
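The two update rules can be written as one small function. This is only a sketch: the function name `update_weight` is hypothetical, and the floor at 0 is an assumption added here so that the weight cannot become negative; the text does not specify a lower bound.

```python
DECREMENT = 10  # preset reduction value from the example
INCREMENT = 1   # preset added value from the example


def update_weight(weight: int, success: bool) -> int:
    """Return the new weight of a copy after one call attempt."""
    if success:
        return weight + INCREMENT
    # Assumed floor: never let the weight drop below 0.
    return max(0, weight - DECREMENT)
```

With an assumed initial weight of 100, one successful call gives 101 and one failed call gives 90.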
To provide overload protection for the back-end service copies, the embodiment of the invention can also refuse to call the first target service copy within a preset time interval when the number of call failures of the first target service copy is greater than a preset number. For example, if a back-end service copy times out on 3 consecutive call requests, that is, its weight value is reduced by 10 three times in a row, the pressure on that copy can be considered too high, and a policy may be set: for the following period (e.g., 1 second), the client no longer assigns call requests to that back-end service copy. If the weight value of a back-end service copy drops to 0, or the copy fails to process call requests 4 or more consecutive times, the preset time interval may be set to a larger value. Based on experience, a back-end service copy undergoing a full GC can process call requests normally again after 1 minute, so a policy may be set, for example: for the following period (e.g., 1 minute), the client no longer assigns call requests to that back-end service copy.
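A per-copy guard implementing this pause policy might look like the following sketch. The class name `CallGuard`, its methods, and the injectable `clock` parameter are hypothetical; the thresholds (3 consecutive failures for a 1 second pause, 4 or more failures or a weight of 0 for a 1 minute pause) are taken from the example above.

```python
import time


class CallGuard:
    """Tracks consecutive failures of one copy and pauses calls to it."""

    def __init__(self, short_pause=1.0, long_pause=60.0, clock=time.monotonic):
        self.short_pause = short_pause  # seconds, after 3 consecutive failures
        self.long_pause = long_pause    # seconds, after 4+ failures or weight 0
        self.clock = clock              # injectable time source
        self.failures = 0
        self.blocked_until = 0.0

    def allowed(self) -> bool:
        """True if the client may currently assign requests to this copy."""
        return self.clock() >= self.blocked_until

    def record(self, success: bool, weight: int) -> None:
        """Record one call result together with the copy's updated weight."""
        if success:
            self.failures = 0
            return
        self.failures += 1
        if weight <= 0 or self.failures >= 4:
            self.blocked_until = self.clock() + self.long_pause
        elif self.failures >= 3:
            self.blocked_until = self.clock() + self.short_pause
```

Passing the clock in as a parameter is a design choice made here so that the policy can be exercised without real waiting.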
The prior art uses server-side load balancing, whereas the embodiment of the present invention uses client-side load balancing. To clarify the difference: (1) with server-side load balancing, after the call request reaches the server side, the load balancing server judges the health state of the back-end service copies and decides how to allocate traffic; (2) with client-side load balancing, before the client sends a call request, it judges the health state of each back-end service copy in advance, determines a healthy, available back-end service copy as the target service copy, and then connects and sends the request directly to that target service copy. If a server-side load balancing architecture is adopted, multiple load balancing services must be deployed so that the cluster remains available when one or more of them go down. This is acceptable in an internet architecture, but in an industrial environment with limited hardware the cost is an extra burden or even unacceptable. With the client-side architecture, a load balancing SDK (Software Development Kit) component is embedded in the client, so that when a back-end service copy is called, the SDK can dispatch the call request directly according to the health state of the back-end services that it has learned itself.
To facilitate understanding of the service invocation method, an interaction diagram of the client and the server is provided in the embodiment of the present invention, as shown in fig. 4, the client includes an application service, and the application service includes a load balancing component (which may also be referred to as a backpressure-sensitive load balancing client component); the server side comprises a cache center, a registration center and a plurality of back-end service copies, wherein each back-end service copy comprises a back-end component and a thread pool or a processing queue; the cache center is used for storing priority information of each back-end service copy, the registration center is used for storing IP information and port information of the back-end service copies, the back-end components are used for being called by the client to execute corresponding services, the thread pool or the processing queue is used for detecting health scores of the back-end service copies to obtain the priority information, and the priority information can be a weight value.
Based on fig. 4 above, the embodiment of the present invention further explains the interaction process between the client and the server; see (1) to (5) below:
(1) The server registers the IP and the port of the back-end service with the registration center.
(2) The client sends a service invocation request through the application service. In a specific implementation, the application service initiates a call request, and accesses a back-end service copy through a backpressure-sensitive load balancing client component (hereinafter referred to as a client SDK).
(3) The client determines the backend service copy with the largest weight value from the cache center through a backpressure-sensitive load balancing client component in the application service. During specific implementation, the client SDK determines address information of a healthiest back-end service copy from the cache center, initiates a call to the back-end service copy, performs an access attempt, and if the back-end service copy is successfully called, executes a corresponding service by using the back-end service copy. The health degree of the back-end service copy can be represented by a weight value, for example, a larger weight value indicates that the back-end service copy is healthier.
(4) If the back-end service copy is not successfully called, the backpressure-sensitive load balancing client component judges whether the health state of the back-end service copy is abnormal. In one embodiment, the back-end service copy uses the service SDK with a customized health check policy, so that the back-end service copy can predict in advance whether its processing will block, which improves the sensitivity of backpressure feedback. The health check policy may include comparing the current load of the first target service copy with a preset load threshold, comparing the current occupancy ratio of the thread pool of the first target service copy with a preset occupancy ratio threshold, and comparing the current capacity of the processing queue of the first target service copy with a preset capacity threshold; for the specific implementation, refer to the foregoing first to third modes, which are not described again here. After the back-end service copy has been checked by the health check policy, if its health state is abnormal, status code 503 can be fed back immediately to the client SDK to tell the client SDK to downgrade that back-end service copy. If the health state of the back-end service copy is normal, the back-end service copy proceeds to the related business processing. It should be noted that the logic of this step is the core of backpressure-sensitive detection.
In one implementation, the health check policy can be embedded in the server-side SDK; this is somewhat invasive, but gives better health check performance. Alternatively, the server-side SDK may only provide an instrumentation point that exposes a health state detection service, and a daemon monitors that service; when a call request sent by the client SDK is received, the daemon communicates with the client SDK over a protocol and feeds the health check result back to the client SDK.
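On the server side, the check in step (4) amounts to a gate placed in front of the business logic, as in the rough sketch below. The function `handle_request` and its two callbacks are hypothetical names, and the response is modeled as a plain (status code, body) pair rather than any particular HTTP framework's response type.

```python
from typing import Callable, Tuple


def handle_request(is_abnormal: Callable[[], bool],
                   business: Callable[[], str]) -> Tuple[int, str]:
    """Run the health check first; only healthy copies do business work."""
    if is_abnormal():
        # Reply immediately so the client SDK can downgrade this copy
        # and switch to another one without waiting for a timeout.
        return 503, "replica overloaded"
    return 200, business()
```

Because the gate returns before any business processing starts, a rejected request costs the client only the round trip, not a timeout period.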
(5) When the health state of a back-end service copy is abnormal, the weight value of each back-end service copy is updated and the updated weight values are stored in the cache center in the server; the client SDK then continues to determine the back-end service copy with the highest weight value from the cache center until a back-end service copy is successfully called. In a specific embodiment, the weight values may be updated as dynamic weights, where a weight value can be understood as a comprehensive score reflecting the health of a back-end service copy, and the back-end service instances are scored and ranked according to the results of the attempted requests. This process can be implemented with a load balancing algorithm. Specifically, using an adaptive load balancing algorithm to score the back-end service copies and dispatch load to them means that, whether the system is idle, stable, or busy, the algorithm automatically evaluates the service capacity of the system and distributes traffic reasonably, so that the whole system maintains good performance and avoids starvation, overload, and downtime. Adaptive load balancing algorithms matter in the new generation of industrial intelligent systems because the new requirement is no longer single-machine automation but system-level automation across equipment, and the trend is that the more equipment or signals a product spans, the more intelligent it is. The required signal processing capacity therefore exceeds the computing capacity of a single industrial control computer, and distributed computing is needed.
Internet and e-commerce platforms have ready-made solutions to such challenges, but they all rely on large amounts of supporting hardware; in an industrial environment, the software must instead be improved to make full use of the available hardware resources. No mature off-the-shelf scheme currently exists, so one must be customized along the lines of an internet platform in order to distribute and schedule traffic well, guarantee stability, pursue maximum performance, and handle traffic peaks in scenarios such as signal peaks. The embodiment of the invention exemplarily provides a method for updating the weight value using an adaptive load balancing algorithm: (I) the processing capacity of each service (i.e., back-end service copy) is identified by a dynamic weight, and the initial processing capacity is assumed to be the same by default, that is, each service is allocated the same probability (i.e., weight value); (II) whenever a service successfully processes a request, its processing capacity is considered sufficient and its weight is dynamically increased by 1; (III) whenever a service times out while processing a request, its processing capacity is considered possibly insufficient and its weight is dynamically decreased by 10.
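To show how asymmetric these rules are, the sketch below replays a sequence of request outcomes against one weight. The helper `replay` and the initial weight of 100 are assumptions for illustration only; the gain of 1 and penalty of 10 are the values given above.

```python
from typing import Iterable, List


def replay(outcomes: Iterable[bool], initial: int = 100,
           gain: int = 1, penalty: int = 10) -> List[int]:
    """Return the weight after each outcome (True = success, False = timeout)."""
    weight = initial
    history = [weight]
    for ok in outcomes:
        weight = weight + gain if ok else weight - penalty
        history.append(weight)
    return history
```

Under these values, `replay([False] + [True] * 10)` ends back at the initial weight: a single timeout takes ten successful requests to recover from, so a copy that times out falls behind its peers quickly and regains priority only gradually.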
In specific implementation, after the cooperating client and server algorithms have been designed, an LB SDK (encapsulation component) needs to be developed for each programming language, so that each service does not have to implement the algorithm specification itself, and the performance of the algorithm is improved. In one embodiment, the encapsulation component may be developed in programming languages such as Java, Go, and Python, so that the internal implementation details are transparent to the service and the service layer can quickly benefit from the better performance.
In summary, the service calling method provided by the embodiment of the invention has better real-time performance and is preventive. The embodiment of the invention realizes load balancing through backpressure sensitivity: the current pressure on a back-end service copy is the backpressure, and backpressure-sensitive prediction feeds back the pressure trend of the back-end service copies. If the current pressure is found to be high, the client switches as quickly as possible to another back-end service copy under lower pressure; if all back-end service copies are under high pressure, the back-end service cluster needs to be scaled out horizontally. Because the health state is judged before the service processes the business request, the judgment logic may be after the fact (e.g., when the queue is full) or preventive (e.g., when 80% of the threads in the thread pool are active). If this logic is written accurately enough, the back-end service copy can keep the wasted time at the millisecond level, allowing the client to dispatch the request quickly to the next healthy back-end service copy. Without this health judgment step, the back-end service copy would backlog requests in its processing queue, the processing time could run out of control, and the back-end service copy could even crash when its processing queue overflows. In addition, the embodiment of the invention also has the advantage of requiring few resources.
Because the embodiment of the invention is based on the design of client load balancing, the method does not need to deploy any component to the server, so that additional hardware investment except for the back-end service copies is not needed, and the back-end service copies can be deployed by tens, hundreds or more as required, thereby better relieving the bottleneck problem of hardware performance of an industrial control scene.
For the service invoking method provided by the foregoing embodiment, an embodiment of the present invention provides a service invoking device, which is applied to a client, where the client is configured to invoke a plurality of backend service copies in a server, and referring to a schematic structural diagram of the service invoking device shown in fig. 5, the device mainly includes the following components:
The weight obtaining module 502 is configured to obtain priority information of each back-end service copy in the server.
The first calling module 504 is configured to, if a call request sent by a client is received, call a first target service copy from the back-end service copies according to the priority information.
A determining module 506, configured to determine whether the first target service copy is successfully invoked.
The second invoking module 508 is configured to, if the determination result of the determining module is negative, update the priority information of each back-end service copy, and continue to determine a second target service copy from each back-end service copy according to the updated priority information until the second target service copy is successfully invoked, so as to execute a service corresponding to the invocation request through the second target service copy.
In addition, the embodiment of the invention can determine the first target service copy according to the priority level of each back-end service copy when receiving the call request, and promptly update the priority information of each back-end service copy when the first target service copy cannot be successfully called, so as to determine the second target service copy and call it.
In one embodiment, the client comprises a load balancing component, and the server comprises a cache center; the cache center is used for storing the priority information of each back-end service copy; the weight obtaining module 502 is further configured to: access the cache center through the load balancing component to obtain the priority information of each back-end service copy in the server.
In one embodiment, the server further comprises a thread pool and/or a processing queue; the second invoking module 508 is further configured to: check the health state of the first target service copy through a health check policy preset in the load balancing component, wherein the health check policy includes at least one of: comparing the current load of the first target service copy with a preset load threshold, comparing the current occupancy ratio of the thread pool of the first target service copy with a preset occupancy ratio threshold, and comparing the current capacity of the processing queue of the first target service copy with a preset capacity threshold; and, if the health state of the first target service copy is abnormal, update the priority information of each back-end service copy.
In an embodiment, the second invoking module 508 is further configured to: if the current load of the first target service copy is greater than the preset load threshold, determine that the health state of the first target service copy is abnormal; or, if the current occupancy ratio of the thread pool of the first target service copy is greater than the preset occupancy ratio threshold, determine that the health state of the first target service copy is abnormal; or, if the current capacity of the processing queue of the first target service copy is greater than the preset capacity threshold, determine that the health state of the first target service copy is abnormal.
In one embodiment, the priority information includes a weight value; the second invoking module 508 is further configured to: if the health state of the first target service copy is abnormal, update the weight value of the first target service copy to the difference between its weight value and a preset reduction value.
In an embodiment, the apparatus further includes an update module configured to: if the first target service copy is successfully called, update the weight value of the first target service copy to the sum of its weight value and the preset added value.
In an embodiment, the apparatus further includes a reject call module configured to: if the number of call failures of the first target service copy is greater than the preset number, refuse to call the first target service copy within a preset time interval.
The device provided by the embodiment of the present invention has the same implementation principle and technical effect as the foregoing method embodiments. For brevity, where the device embodiment does not mention something, reference may be made to the corresponding content in the method embodiments.
The embodiment of the invention provides a service calling system, which particularly comprises a server and a client; the client performs the method of any of the above embodiments.
Fig. 6 is a schematic structural diagram of a service invocation system according to an embodiment of the present invention, where the service invocation system 100 includes: a processor 60, a memory 61, a bus 62 and a communication interface 63, wherein the processor 60, the communication interface 63 and the memory 61 are connected through the bus 62; the processor 60 is arranged to execute executable modules, such as computer programs, stored in the memory 61.
The memory 61 may include a high-speed random access memory (RAM) and may also include a non-volatile memory, such as at least one disk memory. The communication connection between the network element of the system and at least one other network element is implemented through at least one communication interface 63 (which may be wired or wireless), and may use the Internet, a wide area network, a local area network, a metropolitan area network, or the like.
The bus 62 may be an ISA bus, a PCI bus, an EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 6, but that does not indicate only one bus or one type of bus.
The memory 61 is configured to store a program, and the processor 60 executes the program after receiving an execution instruction. The method performed by the apparatus defined by the process flow disclosed in any of the foregoing embodiments of the present invention may be applied to the processor 60 or implemented by the processor 60.
The processor 60 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 60 or by instructions in the form of software. The processor 60 may be a general-purpose processor, such as a Central Processing Unit (CPU) or a Network Processor (NP); it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor 60 may implement or perform the various methods, steps, and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present invention may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an erasable programmable read-only memory, or a register. The storage medium is located in the memory 61, and the processor 60 reads the information in the memory 61 and completes the steps of the method in combination with its hardware.
The computer program product provided in the embodiment of the present invention includes a computer-readable storage medium storing program code, where the instructions included in the program code may be used to execute the method described in the foregoing method embodiments. For the specific implementation, reference may be made to the foregoing method embodiments, and details are not repeated here.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
Finally, it should be noted that the above-mentioned embodiments are only specific embodiments of the present invention, which are used to illustrate the technical solutions of the present invention and not to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person skilled in the art can, within the technical scope disclosed herein, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features. Such modifications, changes or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present invention, and they shall be construed as falling within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.