CN109194753B

CN109194753B - Method for processing event in service grid

Info

Publication number: CN109194753B
Application number: CN201811057619.0A
Authority: CN
Inventors: 李强; 王凤琴
Original assignee: Sichuan Hongwei Technology Co Ltd
Current assignee: Sichuan Hongwei Technology Co Ltd
Priority date: 2018-09-11
Filing date: 2018-09-11
Publication date: 2021-09-17
Anticipated expiration: 2038-09-11
Also published as: CN109194753A

Abstract

The invention discloses a method for processing events in a service grid, comprising the following steps: A. the service grid applies dynamic routing rules to determine the service desired by the requester; B. the service grid retrieves a pool of instances from the service discovery endpoint and finds the correct destination address; C. the service grid selects the instance most likely to return a fast response based on various factors; D. the service grid attempts to send the request to the instance and records the latency and response type of the result; E. if the instance fails, the service grid retries the request on another instance; F. if the instance consistently returns errors, the service grid removes it from the load balancing pool; G. if the expiration time of the request has passed, the service grid proactively fails the request; H. the service grid captures the above behavior data in the form of metrics and distributed tracing and sends the data to the measurement system. The method addresses event processing in service grid technology and facilitates the rapid practical adoption of that technology.

Description

Method for processing event in service grid

Technical Field

The invention relates to the technical field of distributed systems of computer software, in particular to a method for processing events in a service grid.

Background

With the increasing penetration of the mobile internet, more and more companies are turning to the internet and migrating customer-facing services and businesses online. As these online services develop and the number of clients grows, the architecture of the service platform also changes and evolves. From the early monolithic application architecture to the micro-service architecture, clustering and distribution have become standard technology.

Service grid technology is currently one of the most productive leading-edge technologies. In building the basic platform of a micro-service architecture, it can further improve overall system performance and reduce research, development and operation costs. The service grid is a dedicated infrastructure layer for handling service-to-service communication; it is responsible for securely and reliably delivering requests through the complex service topology that makes up a modern cloud-native application.

The service grid is a network model, an abstraction layer above TCP/IP. It assumes that the underlying L3/L4 network exists and is capable of transferring bytes from one point to another. The service grid also assumes that this network, like other aspects of the environment, is unreliable, so the service grid must also be able to handle network failures. Event handling is one of the basic functions of the service grid and is also very difficult to design. Enterprises face this technical challenge when building a service grid.

Disclosure of Invention

The invention aims to overcome the shortcomings described in the background and provides a method for processing events in a service grid. The method is suitable for systems with a clustered and/or distributed design. It addresses event processing in service grid technology and facilitates the rapid adoption of that technology, which in turn can greatly reduce the operation and service costs of enterprises, strengthen the soft infrastructure of the business, further improve system efficiency, and improve the reliability and stability of the whole system.

In order to achieve the technical effects, the invention adopts the following technical scheme:

A method of event handling in a service grid, comprising the following steps:

A. the service grid applies a dynamic routing rule to determine the service required by the requester;

the routing rule is a mapping relation established between routing records of a routing table and back-end micro-services; the dynamic routing rule means that the service grid can automatically build its service routing table from the routing information exchanged among routers and can automatically adjust that table in a timely manner as links and nodes change;

the routing determination includes determining whether the request should be routed to a micro-service endpoint in the production environment or to one in a software lifecycle phase environment; the routing rule is dynamically configurable and can be applied globally or to any traffic slice;

B. the service grid retrieves the instance pool from the service discovery endpoint to find the correct micro-service endpoint address;

specifically, the service discovery endpoint is a micro-service endpoint that provides service registration and service discovery functions and opens a corresponding service port; the instance pool is a pool-like set composed of all available micro-service instances, and its mechanism is similar to that of a database connection pool; the service discovery endpoint provides an instance pool for retrieval, which can be used to find a suitable micro-service endpoint;

C. based on the influencing factors, including the observed latency of the most recent request, the service grid selects the micro-service endpoint instance most likely to return a fast response;

D. the service grid tries to send a service request to the microservice endpoint instance and records the delay and response type of the result;

E. if the micro-service endpoint instance fails, the service grid retries the service request on another micro-service endpoint instance;

F. if the micro-service endpoint instance fails multiple times, the service grid removes it from the load balancing pool; that is, if the selected micro-service endpoint instance keeps returning errors within a given period of time, for example if it returns an error more than 2 times within 1 minute, the service grid considers the service to be down rather than suffering a transient fault, and in that case removes the endpoint instance from the load balancing pool;

G. if the service request reaches its expiration time, the service grid actively fails the service request;

each request from the requester contains the expiration time of the request, and if the expiration time is reached before response data is obtained from the service endpoint instance, the service grid actively fails the request;

H. the service grid captures the behavior data of each step in the form of metrics and distributed tracing and sends the data to a measurement system;

the measurement system can be a monitoring and measurement system such as Prometheus, Sysdig, Graphite, StatsD, or Sensu.

Further, the instance pool in the step B is a single instance pool or a set of multiple instance pools.

Further, the influencing factors in the step C include, but are not limited to, historical records, measurement data of request time and response time, and load conditions of the micro-service endpoint, and other factors that may influence the response speed of the micro-service endpoint instance may also be considered in practice.

Further, the micro-service endpoint instance most likely to return a fast response in the step C may be one micro-service endpoint instance or multiple micro-service endpoint instances.

Further, the faults occurring in the micro-service endpoint instance include transient faults, service downtime, and network faults, and in practice, the faults that may occur in the micro-service endpoint instance are not limited to the above faults.

Further, in the step E, when the service grid retries the service request on another micro-service endpoint instance, the retried request is limited to idempotent HTTP request methods; a request using a non-idempotent HTTP method cannot be retried.

Further, the step F further includes the service grid placing the micro-service endpoint instance removed from the load balancing pool into a to-be-recovered pool, where it is retried periodically; once the endpoint instance returns to normal during a retry, it can be removed from the to-be-recovered pool and added back to the load balancing pool.

Compared with the prior art, the invention has the following beneficial effects:

The method for processing events in a service grid addresses the problem described in the background, namely event processing in service grid technology. It facilitates the rapid adoption of service grid technology, through which the operation and service costs of an enterprise can be greatly reduced, the soft infrastructure of the business strengthened, system efficiency further improved, and the reliability and stability of the whole system improved.

Drawings

FIG. 1 is a flow diagram of a method of event handling in a services grid in accordance with the present invention.

Detailed Description

The invention will be further elucidated and described with reference to the embodiments of the invention described hereinafter.

Example:

As shown in FIG. 1, a method for processing events in a service grid, which may be applied to a micro-service architecture and to data centers such as distributed computing platforms and cloud computing platforms, mainly includes the following steps:

Step 101, the service grid applies dynamic routing rules to determine the service desired by the requester.

Specifically, the routing rule refers to a mapping relationship established between a routing record of a routing table and a back-end microservice. The dynamic routing rule is specifically that the service grid can automatically establish a service routing table of the service grid according to specific routing information exchanged among routers, and can timely and automatically adjust the service routing table according to the actual condition of the change of links and nodes.

The routing determination includes determining whether to route to a micro-service endpoint in the production environment or to one in a software lifecycle phase environment, and whether to route to a micro-service endpoint in the local data center or to one on a cloud host of a cloud service provider.

Preferably, the routing rules in this embodiment are dynamically configurable, and may be applied globally or to any traffic slice.
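
By way of illustration only, the dynamically configurable routing rules of step 101 might be sketched as follows. The names `RouteRule`, `RoutingTable` and `match_all`, the environment labels, and the header-based traffic slice are assumptions made for this sketch and are not part of the disclosed method.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Request = Dict[str, str]  # hypothetical request shape: a flat map of headers


def match_all(request: Request) -> bool:
    """Default traffic slice: the rule applies globally."""
    return True


@dataclass
class RouteRule:
    """Maps matching requests to a back-end micro-service and an environment."""
    service: str
    environment: str  # e.g. "production" or "staging" (assumed labels)
    matches: Callable[[Request], bool] = match_all


@dataclass
class RoutingTable:
    rules: List[RouteRule] = field(default_factory=list)

    def add_rule(self, rule: RouteRule, first: bool = False) -> None:
        # Rules can be added while the grid is running (dynamic configuration).
        if first:
            self.rules.insert(0, rule)
        else:
            self.rules.append(rule)

    def resolve(self, request: Request) -> Optional[RouteRule]:
        # The first matching rule wins; a global rule acts as the fallback.
        for rule in self.rules:
            if rule.matches(request):
                return rule
        return None
```

In this sketch a global rule is registered once, and a rule for a narrower traffic slice is inserted ahead of it with `first=True` so that it takes precedence for matching requests.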

Step 102, the service grid retrieves the pool of instances from the service discovery endpoint to find the correct destination address.

Specifically, the service discovery endpoint is a micro-service endpoint that provides functions of service registration and service discovery and opens a corresponding service port.

The instance pool is a pool-like collection composed of all available micro-service instances, and its mechanism is similar to that of a database connection pool. The service discovery endpoint provides an instance pool that can be retrieved and used to find the appropriate micro-service endpoint. In practice, the instance pool may be a single instance pool or a set consisting of a plurality of instance pools.
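
As a rough illustration of step 102, the following sketch models a service discovery endpoint as an in-memory registry. The class name `ServiceDiscovery`, its method names, and the address format are hypothetical; a real deployment would query a separate registry service over the network.

```python
from typing import Dict, List, Set


class ServiceDiscovery:
    """In-memory stand-in for a service discovery endpoint (hypothetical)."""

    def __init__(self) -> None:
        self._pools: Dict[str, Set[str]] = {}

    def register(self, service: str, address: str) -> None:
        # Service registration: an instance announces its address.
        self._pools.setdefault(service, set()).add(address)

    def deregister(self, service: str, address: str) -> None:
        # Removing an unknown service or address is a no-op.
        self._pools.get(service, set()).discard(address)

    def instance_pool(self, service: str) -> List[str]:
        # Service discovery: return every currently registered instance.
        return sorted(self._pools.get(service, set()))
```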

Step 103, the service grid selects the instance most likely to return a quick response based on various influencing factors.

The service grid selects the instances that are most likely to return a quick response based on various influencing factors, including the observed delay of the most recent request. Specifically, the various influencing factors include historical records, measurement data of request time and response time, load conditions of micro-service endpoints and other information.

In practice, the fast response instance may be one micro-service endpoint instance or multiple micro-service endpoint instances.
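
One possible way to combine observed latency and load in step 103 is sketched below. The exponentially weighted moving average and the scoring formula are choices made for this sketch, not a scheme prescribed by the method; `LatencyTracker` and its fields are hypothetical names.

```python
from typing import Dict, List


class LatencyTracker:
    """Keeps an exponentially weighted moving average of latency per instance."""

    def __init__(self, alpha: float = 0.3) -> None:
        self.alpha = alpha  # weight given to the most recent observation
        self.ewma: Dict[str, float] = {}
        self.in_flight: Dict[str, int] = {}  # current load per instance

    def observe(self, instance: str, latency_s: float) -> None:
        previous = self.ewma.get(instance)
        if previous is None:
            self.ewma[instance] = latency_s
        else:
            self.ewma[instance] = self.alpha * latency_s + (1 - self.alpha) * previous

    def score(self, instance: str) -> float:
        # Lower is better: expected latency scaled by the current load.
        # Instances without history score 0.0 so they are tried at least once.
        return self.ewma.get(instance, 0.0) * (1 + self.in_flight.get(instance, 0))

    def pick(self, pool: List[str]) -> str:
        return min(pool, key=self.score)
```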

Step 104, the service grid attempts to send a request to the instance and records the latency and response type of the result.

The service grid attempts to send a request to the microservice endpoint instance selected in the previous step and records the latency and response type of the response result.
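
The recording described in step 104 might look like the following sketch. The `send` callable is a hypothetical transport that returns an HTTP status code, and the response type labels are assumptions chosen for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CallRecord:
    instance: str
    latency_s: float
    response_type: str  # assumed labels, see classify() below


def classify(status: int) -> str:
    if status >= 500:
        return "server_error"
    if status >= 400:
        return "client_error"
    return "success"


def send_and_record(instance: str, send: Callable[[str], int],
                    log: List[CallRecord]) -> CallRecord:
    """Call `send` for one instance and append latency and response type to `log`."""
    start = time.monotonic()
    try:
        response_type = classify(send(instance))
    except Exception:
        # Transport-level failure: no status code was received at all.
        response_type = "exception"
    record = CallRecord(instance, time.monotonic() - start, response_type)
    log.append(record)
    return record
```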

Step 105, if the instance fails, the service grid retries the request on another instance.

If the microservice endpoint instance selected in the previous step fails, the service grid will retry the request on another instance. The fault occurring in the instance may be a transient fault, or may be a service outage, a network fault, or the like.

Specifically, retries are limited to idempotent HTTP request methods; a request using a non-idempotent HTTP method cannot be retried.
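
The idempotency restriction of step 105 can be sketched as follows. The set of idempotent methods follows the HTTP semantics specification (RFC 9110); the function name `call_with_retry`, the `send` transport, and the rule that any status of 500 or above counts as a failure are assumptions of this sketch.

```python
from typing import Callable, List

# Methods defined as idempotent by the HTTP semantics specification (RFC 9110).
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"})


def call_with_retry(method: str, pool: List[str], send: Callable[[str], int]) -> int:
    """Try instances in order, moving to the next one only if a retry is safe.

    `send` is a hypothetical transport that returns an HTTP status code
    or raises ConnectionError.
    """
    if not pool:
        raise ValueError("empty instance pool")
    retry_allowed = method.upper() in IDEMPOTENT_METHODS
    last_error: Exception = RuntimeError("no attempt made")
    for instance in pool:
        try:
            status = send(instance)
            if status < 500:
                return status
            last_error = RuntimeError(f"{instance} returned {status}")
        except ConnectionError as exc:
            last_error = exc
        if not retry_allowed:
            break  # a non-idempotent request must not be replayed
    raise last_error
```

With this sketch a failed `GET` moves on to the next instance in the pool, whereas a failed `POST` surfaces the error after the first attempt.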

Step 106, if the instance always returns an error, the service grid removes it from the load balancing pool.

If the selected micro-service endpoint instance keeps returning errors within a given period of time, for example more than 2 error responses within 1 minute (the time window and the error count can be set according to actual conditions), the service grid considers the service to be down rather than suffering a transient fault. In this case, the service grid removes the endpoint instance from the load balancing pool.

Preferably, in this embodiment, the service grid places the culled endpoint instance in another dedicated pool, the to-be-restored pool, for periodic retries at a later time. Once the endpoint instance returns to normal, it may be removed from this special pool again and rejoined to the load balancing pool.
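
The eviction and recovery behavior of step 106 might be sketched with a sliding error window, as below. The defaults mirror the example given above (more than 2 errors within 1 minute); the class name `LoadBalancingPool`, its methods, and the `is_healthy` probe are hypothetical.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Iterable, Set


class LoadBalancingPool:
    """Evicts instances that exceed `max_errors` errors within `window_s` seconds."""

    def __init__(self, instances: Iterable[str], max_errors: int = 2,
                 window_s: float = 60.0) -> None:
        self.active: Set[str] = set(instances)
        self.recovering: Set[str] = set()  # the to-be-recovered pool
        self.max_errors = max_errors
        self.window_s = window_s
        self._errors: Dict[str, Deque[float]] = defaultdict(deque)

    def record_error(self, instance: str, now: float) -> None:
        errors = self._errors[instance]
        errors.append(now)
        # Drop errors that have fallen out of the sliding window.
        while errors and now - errors[0] > self.window_s:
            errors.popleft()
        if len(errors) > self.max_errors and instance in self.active:
            self.active.remove(instance)
            self.recovering.add(instance)

    def probe_recovering(self, is_healthy: Callable[[str], bool]) -> None:
        # Intended to be called periodically; healthy instances rejoin the pool.
        for instance in list(self.recovering):
            if is_healthy(instance):
                self.recovering.remove(instance)
                self._errors[instance].clear()
                self.active.add(instance)
```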

Step 107, if the expiration time of the request has passed, the service grid proactively fails the request.

Each request from the requester contains the expiration time of the request. If that time passes and no response data has been obtained from the service endpoint instance, the service grid proactively fails the request.
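
A minimal sketch of the deadline handling in step 107 follows. It assumes the expiration time is carried as an absolute value on the same clock the grid uses; `call_with_deadline`, `DeadlineExceeded`, and the `attempt` callable are hypothetical names introduced for illustration.

```python
import time
from typing import Callable, Optional


class DeadlineExceeded(Exception):
    """Raised when a request's expiration time passes without a response."""


def call_with_deadline(
    deadline: float,
    attempt: Callable[[float], Optional[str]],
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Keep attempting until a response arrives or the deadline passes.

    `attempt` receives the remaining time budget in seconds and returns a
    response, or None if it did not obtain one.
    """
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            # Fail proactively instead of waiting on an unresponsive instance.
            raise DeadlineExceeded("request expired before a response arrived")
        response = attempt(remaining)
        if response is not None:
            return response
```

Passing the remaining budget to each attempt lets the transport bound its own wait, so retries cannot extend the request beyond the expiration time set by the requester.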

Step 108, the service grid captures the behavior data in the form of metrics and distributed tracing, and sends the data to the measurement system.

The service grid captures the behavior data of each step in the form of metrics and distributed tracing and sends the data to the measurement system.

Specifically, the measurement system may be a monitoring measurement system such as Prometheus, Sysdig, Graphite, StatsD, Sensu, or the like.
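
As one hedged example of step 108, the sketch below formats a single call as StatsD plain-text metrics and sends them over UDP. It covers only the metrics part, not distributed tracing. The metric naming scheme and the default host are assumptions; the `name:value|type` line format and UDP port 8125 are the usual StatsD conventions.

```python
import socket
from typing import List


def statsd_lines(service: str, instance: str, latency_s: float,
                 response_type: str) -> List[str]:
    """Format one call as StatsD plain-text metrics (`name:value|type`).

    `instance` is assumed to be a label without dots or colons, since
    those characters are separators in the StatsD line format.
    """
    prefix = f"grid.{service}.{instance}"  # hypothetical naming scheme
    return [
        f"{prefix}.latency:{latency_s * 1000:.0f}|ms",  # timer, in milliseconds
        f"{prefix}.{response_type}:1|c",                # counter per response type
    ]


def emit(lines: List[str], host: str = "127.0.0.1", port: int = 8125) -> None:
    # Fire-and-forget UDP datagram; the host is an assumption of this sketch.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto("\n".join(lines).encode("ascii"), (host, port))
```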

Therefore, the method for processing events in a service grid addresses event processing in service grid technology and facilitates the rapid adoption of that technology, through which the operation and service costs of enterprises can be greatly reduced, the soft infrastructure of the business strengthened, system efficiency further improved, and the reliability and stability of the whole system improved.

It will be understood that the above embodiments are merely exemplary embodiments taken to illustrate the principles of the present invention, which is not limited thereto. It will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the spirit and substance of the invention, and these modifications and improvements are also considered to be within the scope of the invention.

Claims

1. A method for event handling in a service grid, comprising the steps of:

C. the service grid selects the micro-service endpoint instance most likely to return a fast response according to the influencing factors; the influencing factors in the step C comprise historical records, measurement data of request time and response time, and load conditions of micro-service endpoints;

E. if the micro-service endpoint instance fails, the service grid retries the service request on another micro-service endpoint instance; when the service grid retries the service request on another micro-service endpoint instance in the step E, the retried request is limited to idempotent HTTP request methods;

F. if the micro-service endpoint instance fails for multiple times within a given period of time, the service grid removes the micro-service endpoint instance from the load balancing pool;

H. the service grid captures the behavior data of each step in the form of metrics and distributed tracing and sends the data to the measurement system.

2. The method of claim 1, wherein the instance pool in step B is a single instance pool or a collection of multiple instance pools.

3. The method of claim 1, wherein the micro-service endpoint instance most likely to return a fast response in step C may be one micro-service endpoint instance or a plurality of micro-service endpoint instances.

4. The method of claim 1, wherein the failures of the microservice endpoint instances include transient failures, service outages, and network failures.

5. The method of claim 1, wherein the step F further comprises the service grid placing the micro-service endpoint instances removed from the load balancing pool into the pool to be restored.