CN116743844A

CN116743844A - Service calling device and method of distributed system

Info

Publication number: CN116743844A
Application number: CN202310959441.3A
Authority: CN
Inventors: 时丹
Original assignee: China Everbright Bank Co Ltd
Current assignee: China Everbright Bank Co Ltd
Priority date: 2023-08-01
Filing date: 2023-08-01
Publication date: 2023-09-12

Abstract

An embodiment of the invention provides a service calling device and a service calling method for a distributed system. The device comprises a plurality of data centers, each of which comprises a plurality of registries, a plurality of micro-services and a plurality of gateways. The registries register the service information of the corresponding data center; the gateways communicate with the application systems of the distributed system and call micro-services according to the service list of the registry. The gateways and the micro-services are provided with a multi-center monitoring module and a service list multi-level cache module: the multi-center monitoring module monitors the state of the service information of the plurality of data centers, and the service list multi-level cache module caches the service list of the registry. The invention addresses the problems in the related art that the centers of a distributed system are isolated from one another, availability is poor and emergency mechanisms are incompletely considered, and it thereby improves the coordination and availability of multiple registries and provides handling for emergency situations.

Description

Service calling device and method of distributed system

Technical Field

The embodiment of the invention relates to the field of distributed system architecture, in particular to a service calling device and method of a distributed system.

Background

With the development of internet technology, application systems have gradually adopted distributed architectures. In a distributed micro-service architecture, the registration and discovery component plays a core role in service registration, service discovery, service invocation and configuration management, so its high availability is important to the stability of the distributed system.

In existing distributed systems based on registries such as ZooKeeper and Eureka, services are registered with the registry of their own center, and the whole service call chain runs within the current data center from beginning to end; that is, each center is a relatively independent environment. When a service in a center becomes abnormal, the whole center is marked unavailable and its transactions are switched to a standby center or to other centers, which wastes resources and distributes load unreasonably. A distributed system based on Consul supports a multi-center mode, but native Consul's multi-center support is limited to the deployment layer: micro-services still obtain only the service list and configuration of their own center, and there is no cross-center access strategy, so the multi-center capability is not fully exploited.

In existing Spring Cloud micro-services, the service list cache used by LoadBalancer-based load balancing relies on an expiration mechanism and is valid only until it expires. On a service call, the load balancing cache is consulted first; when it has expired, the registry is accessed to obtain the service list. If the cache expiration time is configured too long, updates are not timely and calls may reach unavailable services; if it is configured too short, the client interacts with the registry too frequently. Most importantly, when the registry is down there is no degraded service list cache to fall back on, so services cannot be called while the registry is down. In addition, once the registry is down the service list stops updating, so a service that has become unavailable is still called repeatedly and the calls fail.

In summary, existing distributed systems suffer from isolated centers, poor availability and incompletely considered emergency mechanisms.

Disclosure of Invention

Embodiments of the invention provide a service calling device and a service calling method for a distributed system, which at least solve the problems in the related art that the centers of a distributed system are isolated from one another, availability is poor and emergency mechanisms are incompletely considered.

According to one embodiment of the invention, a service calling device of a distributed system is provided. The device comprises a plurality of data centers, each of which comprises a plurality of registries, a plurality of micro-services and a plurality of gateways. The registries register the service information of the corresponding data center; the gateways communicate with the application systems of the distributed system and call the micro-services according to the service list of the registry. The gateways and the micro-services are provided with a multi-center module and a service list multi-level cache module: the multi-center module monitors the states of the plurality of registries of the plurality of data centers, and the service list multi-level cache module caches the service list of the registry.

According to another embodiment of the present invention, a service invocation method of a distributed system is provided, implemented using the service invocation apparatus described above. In the method, a client acquires the name list of the plurality of data centers; according to the name of the micro-service to be called by the client and a preset priority order of the registries, the client acquires a service list from the registry of the corresponding data center and calls the micro-service according to that service list. When the service list is acquired, it is acquired from the service list multi-level cache module according to a preset priority order.

According to a further embodiment of the invention, there is also provided a computer readable storage medium having stored therein a computer program, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.

According to a further embodiment of the invention, there is also provided an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.

The invention provides a service calling device of a distributed system, which comprises a plurality of data centers, each comprising a plurality of registries, a plurality of micro-services and a plurality of gateways. The registries register the service information of the corresponding data center; the gateways communicate with the application systems of the distributed system and call micro-services according to the service list of the registry. The gateway is provided with a multi-center module and a service list multi-level cache module: the multi-center module monitors the states of the plurality of registries of the plurality of data centers, and the service list multi-level cache module caches the service lists of the registry in its different states. The invention addresses the problems in the related art that the centers of a distributed system are isolated from one another, availability is poor and emergency mechanisms are incompletely considered, and it thereby improves the coordination and availability of multiple registries and provides handling for emergency situations.

Drawings

FIG. 1 is a deployment architecture diagram based on Zookeeper or Eureka;

FIG. 2 is a Consul-based multi-center deployment architecture diagram;

FIG. 3 is a service invocation schematic diagram of a distributed system according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a service invocation apparatus of a distributed system according to an embodiment of the present invention;

FIG. 5 is a flow chart of a micro service call according to an embodiment of the present invention;

FIG. 6 is a schematic flow chart of an initialization process for a service list when a registry is down, according to an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of an emergency escape module according to an embodiment of the present invention;

FIG. 8 is a schematic diagram of a communication principle of an emergency escape module according to an embodiment of the present invention;

FIG. 9 is a schematic diagram of an emergency address invocation according to an embodiment of the present invention.

Detailed Description

Embodiments of the present invention will be described in detail below with reference to the accompanying drawings in conjunction with the embodiments.

It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.

FIG. 1 is a deployment architecture diagram based on ZooKeeper or Eureka. The system comprises two independent centers, center 1 and center 2, and the services of each center are registered within that center. When a service of center 1, such as service1, becomes abnormal, center 1 is made unavailable, which increases the traffic and access pressure on center 2.

When the registry is ZooKeeper or Eureka, each service is registered with the registry cluster of the center where it is located, and the transaction call chain runs within the current center from beginning to end; that is, each center is a relatively independent environment. When all instances of a service in a center are abnormal, the whole center is marked unavailable and its transactions are taken over by a standby center or other centers. This wastes resources and distributes load unreasonably; the traffic and access pressure on the other centers can multiply, which brings a risk of service failure and service congestion. When the registry is ZooKeeper, the service list can be acquired and updated through ZooKeeper's event listening mechanism, but when the registry is down the service list is no longer updated. Moreover, this scheme depends strongly on ZooKeeper's event listening mechanism and is not applicable to systems that cannot use ZooKeeper as the registry.

FIG. 2 is a Consul-based multi-center deployment architecture diagram. The system comprises two centers, center 1 and center 2. At the network level, the Consul nodes within each center communicate over a LAN gossip pool (LGP), and the Consul clusters of the different centers establish communication over a WAN gossip pool (WGP). However, the services of each center still obtain only the service list and configuration of their own center, and no cross-center support or design is provided.

Native Consul supports a multi-center mode, but the support is limited to the deployment layer: each micro-service still accesses only the registry of its own center, for example to obtain that center's service list and configuration. There is no cross-center access strategy and no high-availability design built on cross-center access. When the registry is Consul, the service list is obtained in pull mode by calling the Consul interface. In existing Spring Cloud micro-services, the service list cache is the load balancing cache, which relies on an expiration mechanism and is valid only until it expires; on a service call the load balancing cache is consulted first, and when it has expired the registry is accessed to obtain the service list. Consequently, when the registry is down there is no degraded service list cache to fall back on, so services cannot be called while the Consul registry is down. In addition, once the registry is down the service list stops updating, so a service that has become unavailable is still called repeatedly and the calls fail.

To restate the problems: in a distributed system based on a registry such as ZooKeeper or Eureka, services are registered with the registry of their own center and the whole service call chain runs within the current data center from beginning to end, so each center is a relatively independent environment. When a service in a center becomes abnormal, the whole center is marked unavailable and its transactions are switched to a standby center or to other centers, which wastes resources and distributes load unreasonably.

A distributed system based on Consul supports a multi-center mode, but native Consul's multi-center support is limited to the deployment layer: micro-services still obtain only the service list and configuration of their own center, and there is no cross-center access strategy, so the multi-center capability is not fully exploited.

In existing Spring Cloud micro-services, the service list cache used by LoadBalancer-based load balancing relies on an expiration mechanism and is valid only until it expires. On a service call the load balancing cache is consulted first, and when it has expired the registry is accessed to obtain the service list. If the cache expiration time is configured too long, updates are not timely and calls may reach unavailable services; if it is configured too short, the client interacts with the registry too frequently. Most importantly, when the registry is down there is no degraded service list cache to fall back on, so services cannot be called while the registry is down.

In the prior art, once the registry is down the service list stops updating, so a service that has become unavailable is still called repeatedly and the calls fail.

To solve the above problems, an embodiment of the present invention provides a service invocation apparatus of a distributed system. The apparatus comprises a plurality of data centers, each of which comprises a plurality of registries, a plurality of micro-services and a plurality of gateways. The registries register the service information of the corresponding data center; the gateways communicate with the application systems of the distributed system and call micro-services according to the service list of the registry. The gateways and the micro-services are provided with a multi-center module and a service list multi-level cache module: the multi-center module monitors the state of the service information of the plurality of registries of the plurality of data centers, and the service list multi-level cache module caches the service lists of the registry in its different states.

In one exemplary embodiment, the registry may be a Consul registry and the micro-service may be a Spring Cloud micro-service. In an actual implementation, the types of the registry and the micro-service may be chosen according to the actual situation and are not limited herein.

In one exemplary embodiment, the multi-center module further comprises: a listener for monitoring the states of the service lists of the plurality of registries, a connector for connecting the registries with clients, and a filter for filtering the service lists acquired from the registries according to access rules.

In one exemplary embodiment, the service list multi-level cache module comprises a load balancing cache unit, a local service list cache unit and a local file cache unit, the priority of the load balancing cache unit being higher than that of the local service list cache unit. The load balancing cache unit caches the service list of a normally available registry, and the validity period of the load balancing cache is controlled by configuring a time to live (TTL). The local service list cache unit allows the data center to obtain the service list from the local cache when the connection to the registry is abnormal. The local file cache unit is used to initialize the local service list cache unit when the distributed system is restarted while the connection to the registry is abnormal.

In an exemplary embodiment, the gateway is further provided with an emergency escape module, which provides a proxy address and an emergency address for the distributed system and, when the proxy address is abnormal, directly notifies the data center to use the emergency address for service calls.

It should be noted that each of the above modules may be implemented by software or hardware, and for the latter, it may be implemented by, but not limited to: the modules are all located in the same processor; alternatively, the above modules may be located in different processors in any combination.

In another embodiment of the present invention, there is also provided a service invocation method of a distributed system, implemented by using the service invocation apparatus, including:

a client acquires the name list of the data centers; according to the call address of the client and a preset priority order of the registries, the client acquires a service list from the registry of the corresponding data center and calls the micro-service according to that service list. When the service list is acquired, it is acquired from the service list multi-level cache module according to a preset priority order.

In one exemplary embodiment, the method further comprises: monitoring the states of the data centers and the registries through the multi-center monitoring module; and, when the proxy service of the current system is abnormal, obtaining the emergency address through the emergency escape module and replacing the current proxy address with the emergency address for communication and micro-service data transmission.

In one exemplary embodiment, the method further comprises: performing call feedback monitoring on the service list of the registry through the multi-center monitoring module. When the registry is normally available, a micro-service in the service list whose call result is abnormal is marked and removed; if the abnormal call result was caused by an abnormal communication network connection, monitoring continues until the network connection returns to normal, and the removed micro-service is then restored. When the connection to the registry is abnormal, a preset number of access failures is set for the micro-services of each service list, and a micro-service is removed or its calls are suspended when its actual number of failed accesses exceeds the preset number.

From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, but of course also by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present invention.

Embodiments of the present invention also provide a computer readable storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.

In one exemplary embodiment, the computer readable storage medium may include, but is not limited to: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium capable of storing a computer program.

An embodiment of the invention also provides an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.

In an exemplary embodiment, the electronic apparatus may further include a transmission device connected to the processor, and an input/output device connected to the processor.

For specific examples of this embodiment, reference may be made to the examples described in the foregoing embodiments and exemplary implementations, which are not repeated here.

It will be appreciated by those skilled in the art that the modules or steps of the invention described above may be implemented on a general purpose computing device, and may be concentrated on a single computing device or distributed across a network of computing devices. They may be implemented in program code executable by computing devices, so that they can be stored in a storage device and executed by the computing devices. In some cases the steps shown or described may be performed in an order different from that described herein; alternatively, the modules or steps may each be fabricated as an individual integrated circuit module, or several of them may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.

In order to enable those skilled in the art to better understand the technical solutions of the present invention, the following description is provided with reference to specific exemplary embodiments.

Scene embodiment one

FIG. 3 is a service invocation schematic diagram of a distributed system according to an embodiment of the present invention. As shown in FIG. 3, the system is composed of multiple centers, each comprising a Consul registry, Spring Cloud based micro-services and a gateway. Each service registers its own service information with the registry, the registry checks the health of the registered services through health checks, and services obtain the service list from the registry in order to call one another.

In an actual implementation, the micro-service and the gateway also comprise a multi-center module, a service list multi-level cache module and an emergency escape module. FIG. 4 is a schematic structural diagram of a service invocation apparatus of a distributed system according to an embodiment of the present invention. As shown in FIG. 4, the multi-center module includes a multi-center list acquisition listener, a multi-center access client connector (i.e., the multi-center access client connector in FIG. 4), and a multi-center access rule filter (i.e., the service list priority filter in FIG. 4).

The multi-center list acquisition listener supports loading the multi-center names from configuration and dynamically monitors centers going online and offline. The multi-center access client connector initializes multi-center access clients based on the multi-center list and communicates with the registries. The multi-center access rule filter filters the acquired service list according to the configured center priorities and returns the service list of the highest-priority center.

Multi-center listening module:

Multi-center list acquisition listener: enhances the multi-center function of the Consul registry. It loads the multi-center name list from configuration and obtains the service list of each center by center name. It monitors changes to the set of centers in real time; when a center is added or removed, the service list of that center is added or deleted accordingly, which realizes dynamic online and offline switching of centers.

Multi-center access client connector: initializes multi-center access clients based on the multi-center list. The clients communicate with the registries and can obtain the multi-center service lists, configuration and so on.

Service list multi-center priority filter: for calls between services, a nearby access strategy applies by default, so the service list of the local center takes priority. The priority of the centers can also be customized, and the service list is filtered according to that priority. When a service of the local center is unavailable, the transaction link is kept working by accessing the service in another center.
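The priority filtering described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `local_first` and `filter_by_priority`, the center names `dc1` and `dc2`, and the representation of an instance as a `host:port` string are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def local_first(local_center: str, centers: Sequence[str]) -> List[str]:
    """Default 'nearby access' order: the local center first, then the rest."""
    return [local_center] + [c for c in centers if c != local_center]


def filter_by_priority(
    instances_by_center: Dict[str, List[str]],
    priority: Sequence[str],
) -> List[str]:
    """Return the instances of the highest-priority center that has any."""
    for center in priority:
        instances = instances_by_center.get(center, [])
        if instances:
            return list(instances)
    return []  # no center currently offers the service


# Hypothetical data: the local center dc1 has no healthy instance left.
lists = {"dc1": [], "dc2": ["10.0.2.1:8080", "10.0.2.2:8080"]}
order = local_first("dc1", ["dc1", "dc2"])
print(filter_by_priority(lists, order))
```

In this sketch a customized priority is simply a different ordering passed as `priority`; falling through to the next center is what keeps the transaction link working when the local center has no instance.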

As shown in FIG. 4, the service list multi-level cache module is composed of a load balancing cache L1, a local service list cache L2 (i.e., the service list memory cache L2 in FIG. 4) and a local file cache L3 (i.e., the service list file cache L3 in FIG. 4). The cache levels can be switched on and off and combined through configuration, and a configurable access failure exception policy is supported. The load balancing cache L1 caches the service list available from Consul, with a configurable expiration time. The service list memory cache L2 provides the degradation strategy for the service list after the Consul registry goes down. The service list file cache L3 is used to initialize the service list when the Consul registry is unavailable and the service is restarted. The access failure exception policy marks the service information of failed calls and supports modifying the service states in the load balancing cache L1, the service list memory cache L2 and the service list file cache L3 according to the configured failure policy. Specifically, the micro-service and the gateway are based on Spring Cloud, and their service list cache is this multi-level cache of L1, L2 and L3.

Load balancing cache L1: supports the native mode. The validity period of the load balancing cache L1 is controlled by configuring a TTL expiration time, and within that period the cached list data is used preferentially. The load balancing cache can be turned on and off with a switch. When it is on, the service list in the load balancing cache L1 is consulted first; if the service list is not present in L1, it is obtained from the registry through the multi-center client and written into L1. When it is off, the service list is obtained directly from the registry through the multi-center client, and the cache is updated at the same time.
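The behaviour of the L1 cache (TTL expiry plus an on/off switch) can be sketched in Python as follows. This is an illustrative sketch under assumptions, not Spring Cloud's or the patent's code: the class name `LoadBalancerCache`, the `fetch` callback standing in for the multi-center client, and the injectable `clock` are all hypothetical.

```python
import time
from typing import Callable, Dict, List, Tuple


class LoadBalancerCache:
    """L1 sketch: a TTL-bound service list cache with an on/off switch."""

    def __init__(
        self,
        fetch: Callable[[str], List[str]],   # stand-in for the multi-center client
        ttl_seconds: float,
        enabled: bool = True,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._enabled = enabled
        self._clock = clock
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def get(self, service: str) -> List[str]:
        now = self._clock()
        if self._enabled:
            entry = self._entries.get(service)
            if entry is not None and entry[0] > now:
                return entry[1]              # still within its TTL
        # Cache miss, expired entry, or switch off: go to the registry,
        # then update the cache in every case.
        instances = self._fetch(service)
        self._entries[service] = (now + self._ttl, instances)
        return instances


cache = LoadBalancerCache(lambda name: ["10.0.1.5:8080"], ttl_seconds=30)
print(cache.get("orders"))
```

The TTL trade-off described in the background shows up directly in `ttl_seconds`: a large value delays updates, while a small value sends almost every call to the registry.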

Service list memory cache L2:

Initialization: when a service is called, its service name is added to the monitored list set and a scheduled refresh of its service list is started; the latest service list is fetched from the registry periodically and the service list memory cache L2 is updated.

Use: when a service list is requested for a call and the connection to the Consul registry is abnormal, the data in the service list memory cache L2 is used as the degradation logic for an abnormal registry.
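The L2 degradation logic can be sketched in Python as below. The sketch makes several assumptions that are not in the source: the exception `RegistryUnavailable`, the class `MemoryServiceListCache`, and a `refresh()` method that represents one tick of the scheduled refresh (a real implementation would drive it from a timer).

```python
from typing import Callable, Dict, List, Set


class RegistryUnavailable(Exception):
    """Raised by the (hypothetical) registry client when the connection fails."""


class MemoryServiceListCache:
    """L2 sketch: remembers the last list fetched for each watched service."""

    def __init__(self, fetch: Callable[[str], List[str]]) -> None:
        self._fetch = fetch
        self._watched: Set[str] = set()
        self._lists: Dict[str, List[str]] = {}

    def refresh(self) -> None:
        """One tick of the scheduled refresh; keeps old data on failure."""
        for service in sorted(self._watched):
            try:
                self._lists[service] = list(self._fetch(service))
            except RegistryUnavailable:
                continue  # registry down: keep the last known list

    def get(self, service: str) -> List[str]:
        self._watched.add(service)  # first call starts monitoring the name
        try:
            instances = list(self._fetch(service))
        except RegistryUnavailable:
            # Degradation: serve the last list held in memory.
            return list(self._lists.get(service, []))
        self._lists[service] = instances
        return instances


registry_up = [True]


def fetch(name: str) -> List[str]:
    if not registry_up[0]:
        raise RegistryUnavailable(name)
    return ["10.0.1.5:8080", "10.0.1.6:8080"]


l2 = MemoryServiceListCache(fetch)
l2.get("orders")
registry_up[0] = False
print(l2.get("orders"))  # served from memory while the registry is down
```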

Service list file cache L3: synchronized asynchronously with the service list memory cache L2, so that the service list is not lost when the service restarts. It is mainly used in the scenario where the Consul registry is unavailable and the service has to be restarted. At startup, the content of the service list file cache L3 is loaded into the local service list cache L2 as the initial service list memory cache.
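A file-backed snapshot of this kind can be sketched in Python as follows. The JSON format, the file name `service-list.json` and the function names are assumptions; the source does not specify a file format, and for brevity the sketch writes synchronously, whereas the text describes asynchronous synchronization with L2.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Dict, List

ServiceLists = Dict[str, List[str]]


def save_snapshot(path: Path, lists: ServiceLists) -> None:
    """Persist the L2 content; write a temp file, then rename it into place."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(lists, sort_keys=True), encoding="utf-8")
    os.replace(tmp, path)  # avoids leaving a half-written snapshot behind


def load_snapshot(path: Path) -> ServiceLists:
    """Load the snapshot at startup; a missing file yields an empty cache."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as directory:
    snapshot = Path(directory) / "service-list.json"
    save_snapshot(snapshot, {"orders": ["10.0.1.5:8080"]})
    restored = load_snapshot(snapshot)  # what L2 would be seeded with on restart
    print(restored)
```

Writing to a temporary file and renaming is a design choice of this sketch: a crash during the write then leaves the previous snapshot intact, which matters because L3 is only read when the registry cannot be reached.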

As shown in FIG. 4, an emergency escape module is further provided for the docking between systems. It communicates through a proxy, listens on the configuration of the registry, synchronizes configuration across the multiple centers, and receives and issues emergency notifications, thereby ensuring both network isolation between systems and normal transactions under emergency conditions. The emergency escape mechanism, together with the service list multi-level cache, is compatible with micro-services and gateways and forms a multi-center high-availability system. When systems are docked, they communicate through the proxy, which realizes cross-system service integration. To reduce the impact on inter-system calls when the proxy is abnormal, an emergency escape switching mechanism is provided. The calling party configures, in its own configuration center, the address (proxy address), the emergency address (calling party service address) and the status of the called party. When an emergency occurs, the emergency switching notification carries two pieces of information: the service name and the service status. After each system receives the notification, it updates the system status in its configuration according to the system name. When the proxy used for a call is abnormal, the related transactions switch from the access address to the emergency address. When the proxy of the current service is abnormal, all outbound transactions of the current service use emergency addresses.
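The address switching can be sketched in Python as below. This is only one reading of the mechanism: the status values `"normal"` and `"emergency"`, the class names `CalleeRoute` and `EmergencyRouter`, the example addresses, and the assumption that the emergency address is a direct address that bypasses the proxy are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CalleeRoute:
    proxy_address: str        # normal path, through the proxy
    emergency_address: str    # fallback path used in an emergency
    status: str = "normal"    # "normal" or "emergency" (assumed values)


class EmergencyRouter:
    """Sketch of the per-callee configuration held by the calling party."""

    def __init__(self, routes: Dict[str, CalleeRoute]) -> None:
        self._routes = routes
        self._local_proxy_abnormal = False

    def on_notification(self, system_name: str, status: str) -> None:
        """Apply an emergency switching notification (name + status)."""
        route = self._routes.get(system_name)
        if route is not None:
            route.status = status

    def set_local_proxy_abnormal(self, abnormal: bool) -> None:
        self._local_proxy_abnormal = abnormal

    def resolve(self, system_name: str) -> str:
        route = self._routes[system_name]
        # Own proxy abnormal: every outbound call uses its emergency address.
        if self._local_proxy_abnormal or route.status == "emergency":
            return route.emergency_address
        return route.proxy_address


router = EmergencyRouter({
    "billing": CalleeRoute("http://proxy.internal/billing", "http://10.0.3.7:8080"),
    "ledger": CalleeRoute("http://proxy.internal/ledger", "http://10.0.3.9:8080"),
})
router.on_notification("billing", "emergency")
print(router.resolve("billing"), router.resolve("ledger"))
```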

As shown in FIG. 4, an access failure exception policy is also provided. The service list caches at each level are not strongly consistent in real time with the actual state of the services, so a service may already be offline while the load balancing strategy still selects it from the service list cache, leading to timeouts or even connection exhaustion. Based on the load balancing life cycle, an access failure exception policy is therefore provided, which supports strategically suspending access to abnormal services both when the registry is normal and when it is down. When the registry is available, a service whose call result is a non-business exception is removed from the service list multi-level cache; if the service was only temporarily abnormal, it is restored at the next health check and serves requests normally again, while the service congestion that would result from continuing to access a failing service is avoided. When the registry is down, a configurable access failure policy (structures) and a configurable number of access failures (counts) are supported. The failure policies comprise a direct removal policy (remove) and a suspended call policy (suspension), and for the suspended call policy the suspension time (suspension) can be configured. Removal policy: when the number of failed calls to an instance in the service list is greater than (counts), the service list caches at every level remove the instance. Suspended call policy: when the number of failed calls to an instance in the service list is greater than (counts), the state of the service instance is set so that calls to it stop for the (suspended) time.
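The two registry-down failure policies can be sketched in Python as follows. The field names `strategy`, `counts` and `suspend_seconds`, the class names, and the choice to reset the failure counter on a successful call are assumptions for illustration; the source names the options only in translated form.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class FailurePolicy:
    strategy: str = "remove"        # "remove" or "suspend" (hypothetical names)
    counts: int = 3                 # failures tolerated before the policy applies
    suspend_seconds: float = 30.0   # only used by the "suspend" strategy


class FailureTracker:
    """Counts failed calls per instance and hides instances per the policy."""

    def __init__(self, policy: FailurePolicy,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._policy = policy
        self._clock = clock
        self._failures: Dict[str, int] = {}
        self._removed: Set[str] = set()
        self._suspended_until: Dict[str, float] = {}

    def record_success(self, instance: str) -> None:
        self._failures.pop(instance, None)   # assumption: success resets the count

    def record_failure(self, instance: str) -> None:
        failures = self._failures.get(instance, 0) + 1
        self._failures[instance] = failures
        if failures <= self._policy.counts:
            return
        if self._policy.strategy == "remove":
            self._removed.add(instance)
        else:
            until = self._clock() + self._policy.suspend_seconds
            self._suspended_until[instance] = until

    def callable_instances(self, instances: List[str]) -> List[str]:
        now = self._clock()
        return [
            i for i in instances
            if i not in self._removed
            and self._suspended_until.get(i, float("-inf")) <= now
        ]


tracker = FailureTracker(FailurePolicy(strategy="remove", counts=1))
tracker.record_failure("10.0.1.5:8080")
tracker.record_failure("10.0.1.5:8080")
print(tracker.callable_instances(["10.0.1.5:8080", "10.0.1.6:8080"]))
```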

In the service calling device of the distributed system described above, multi-center monitoring is implemented by the multi-center module, which comprises a multi-center list acquisition monitor, a multi-center access client connector and a multi-center access rule filter.

The multi-center list acquisition monitor supports loading the multi-center names from configuration and dynamically monitors centers going online and offline. The multi-center access client connector initializes the multi-center access clients from the multi-center list; these clients communicate with the registries. The multi-center access rule filter applies the configured center priorities to the acquired service lists and returns the service list of the highest-priority center.

(1) Multi-center list acquisition: when the service starts, the multi-center name list explicitly specified in the configuration is read first; if none is configured, the multi-center names are obtained from the registry through the registry client.

(2) Multi-center list listening: a center list scheduler is initialized that periodically pulls the center list and compares it with the current one; when the two differ, a multi-center change event is triggered, and the center list is updated once the event is received.

(3) Multi-center access client connector: the registry access client is modified so that, at startup, access clients for all centers are initialized from the center names obtained in steps (1) and (2); these clients are used to acquire service lists.

(4) Multi-center access rule filter: service lists are acquired from the registries through the multi-center access clients, filtered according to the multi-center priority, and the available services are returned.
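
As an illustration only, the listening step (2) and the filtering step (4) might be sketched as follows in Python. This is not the embodiment's Spring Cloud code: the names `CenterListWatcher`, `pull_centers`, `on_change` and `filter_by_priority` are hypothetical, and the scheduler is reduced to a single `poll_once` tick that a timer would call periodically.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CenterListWatcher:
    """Holds the current center list and fires a change event when a pull differs."""
    pull_centers: Callable[[], List[str]]      # hypothetical: asks the registry for center names
    on_change: Callable[[List[str]], None]     # hypothetical listener for the change event
    centers: List[str] = field(default_factory=list)

    def poll_once(self) -> bool:
        """One scheduler tick: pull the list, compare it, update on difference."""
        latest = self.pull_centers()
        if sorted(latest) == sorted(self.centers):
            return False                       # nothing came online or went offline
        self.centers = list(latest)
        self.on_change(self.centers)
        return True


def filter_by_priority(lists_by_center: Dict[str, List[str]],
                       priority: List[str]) -> List[str]:
    """Return the instances of the highest-priority center that has any."""
    for center in priority:
        instances = lists_by_center.get(center, [])
        if instances:
            return instances
    return []
```

Calling `filter_by_priority({"dc1": [], "dc2": ["10.0.2.1:8080"]}, ["dc1", "dc2"])` falls through the empty `dc1` list and returns the `dc2` instances, which is the cross-center fallback the filter is meant to provide.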

FIG. 5 is a flow chart of micro service invocation according to an embodiment of the present invention, and as shown in FIG. 5, the micro service invocation process based on the multi-level cache module mainly includes:

(1) When the @EnableFeignClients switch is turned on, each @FeignClient annotation is encapsulated as a BeanDefinition, producing a factory instance FeignClientFactoryBean.

(2) Through Spring's initialization of the FactoryBean, a dynamic proxy class FeignInvocationHandler is constructed.

(3) When a service call is initiated, the invoke method of the generated proxy class is executed.

(4) The request parameters are constructed, and the calling client FeignBlockingLoadBalancerClient is generated and invoked.

(5) The load balancing policy is obtained from the load balancing factory; the default is a round-robin (polling) policy.

(6) The service list is obtained from a service list provider. The provider relies on the multi-center access client connector MultiConsulclient and the multi-center access rule filter built by the multi-center monitoring module. The service list is held in a multi-level cache comprising a load balancing cache L1 (optional), a service list memory cache L2 and a service list file cache L3 (optional). The cache levels can be switched on and off and combined through configuration, and the service list provider is assembled accordingly.

Load balancing cache L1: when the load balancing cache is enabled, the service list is first read from the load balancing cache. If this succeeds, the list is used directly; if it fails, the multi-center access client calls the registry to obtain the service list, the load balancing cache L1 and the service list local cache L2 are updated, and the service list file cache L3 is updated asynchronously. When the load balancing cache is disabled, the multi-center access client calls the registry directly to obtain the service list, the service list local cache L2 is updated, and the file cache L3 is updated asynchronously.

(7) The service list multi-center priority filter filters the acquired service list. By default it follows the nearest-access principle and returns the service list of the local center; when the local center's service list is empty, it returns the service list of the highest-priority center according to the configured access priority order.

(8) One service instance is selected from the service list returned in step (7) according to the load balancing policy and used as the request address of the called party.

(9) The service address is requested over HTTP and the access result is obtained.

(10) Based on the load balancing life cycle, an access-exception service rejection policy is provided, which supports strategically suspending access to abnormal services both when the registry is normal and when it is down.

When the registry is available, a service whose call result is a non-business exception is marked and removed from the service list multi-level cache. If the exception was only temporary, the service is restored at the next health check and serves requests normally again, while service blockage caused by repeatedly accessing a failed service is avoided.

When the registry is down, the call completion handling is modified according to the configured call failure policy: whenever a call to a service fails, the failure count and the service instance are recorded. Access failure policy (strategy): either a direct removal policy (remove) or a suspend-call policy (suspend); the suspend-call policy has a configurable suspension time. Access failure count (counts): the threshold at which the policy takes effect. Removal policy: when the number of failed calls to an instance in a service list is greater than counts, the service list caches at all levels remove the instance. Suspend-call policy: when the number of failed calls to an instance in a service list is greater than counts, the instance is marked so that calls to it stop for the configured suspension time.
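
The two failure policies can be illustrated with a minimal Python sketch, under stated assumptions. The parameter names `strategy` and `counts` and the values `remove` and `suspend` follow the text; `suspend_seconds`, the default values, the class name `FailurePolicy` and the use of a plain list as a stand-in for the cached service list are assumptions made for illustration. The policy fires when the failure count is strictly greater than `counts`, as the two policy descriptions state.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FailurePolicy:
    strategy: str = "remove"          # "remove" or "suspend"
    counts: int = 3                   # failure threshold; the default is an assumption
    suspend_seconds: float = 30.0     # hypothetical name for the suspension time
    failures: Dict[str, int] = field(default_factory=dict)
    suspended_until: Dict[str, float] = field(default_factory=dict)

    def record_failure(self, instance: str, cache: List[str],
                       now: Optional[float] = None) -> None:
        """Record one failed call and apply the policy once counts is exceeded."""
        now = time.monotonic() if now is None else now
        self.failures[instance] = self.failures.get(instance, 0) + 1
        if self.failures[instance] <= self.counts:
            return
        if self.strategy == "remove":
            if instance in cache:
                cache.remove(instance)            # drop the instance from the cached list
        else:
            self.suspended_until[instance] = now + self.suspend_seconds

    def callable_instances(self, cache: List[str],
                           now: Optional[float] = None) -> List[str]:
        """Cached instances that are not currently suspended."""
        now = time.monotonic() if now is None else now
        return [i for i in cache if self.suspended_until.get(i, 0.0) <= now]
```

The `now` argument exists only so that the timing can be driven explicitly; in normal use both methods read the monotonic clock themselves.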

Fig. 6 is a schematic flow chart of the initialization of the service list when the registry is down. As shown in fig. 6, the initialization of the service list based on the service list local cache L2 and the service list file cache L3 mainly includes the following steps:

1. When the micro service starts, the application-start-success event is listened for, and the service list is initialized from the service list file cache L3 into the service list memory cache L2.

2. When a service is called, the name of the called service is added to the local service list monitoring set. At the same time a scheduler is started to watch for changes to the service list of that service name; when the list changes, the local service list cache L2 is updated and the file cache L3 is updated asynchronously.
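
The interplay of the three cache levels can be sketched as follows. This is a simplified Python illustration, not the embodiment's code: the class `ServiceListCache`, the `fetch` callable (assumed to raise `ConnectionError` when the registry is down), the JSON file format and the `last_write` attribute are all assumptions, and expiry of L1 entries is omitted.

```python
import json
import os
import threading
from typing import Callable, Dict, List, Optional


class ServiceListCache:
    """L1 (optional) -> registry -> L2, with L3 as an on-disk copy of L2."""

    def __init__(self, fetch: Callable[[str], List[str]], file_path: str,
                 l1_enabled: bool = True) -> None:
        self.fetch = fetch                      # hypothetical registry call
        self.file_path = file_path              # location of the L3 file cache
        self.l1_enabled = l1_enabled
        self.l1: Dict[str, List[str]] = {}      # load balancing cache
        self.l2: Dict[str, List[str]] = {}      # in-memory service list cache
        self.last_write: Optional[threading.Thread] = None
        self._file_lock = threading.Lock()

    def load_from_file(self) -> None:
        """Startup step: initialise L2 from L3 if a file cache exists."""
        if os.path.exists(self.file_path):
            with open(self.file_path, encoding="utf-8") as fh:
                self.l2 = json.load(fh)

    def _write_file(self, snapshot: Dict[str, List[str]]) -> None:
        with self._file_lock:
            with open(self.file_path, "w", encoding="utf-8") as fh:
                json.dump(snapshot, fh)

    def get(self, service: str) -> List[str]:
        if self.l1_enabled and service in self.l1:
            return self.l1[service]             # L1 hit: use the list directly
        try:
            instances = self.fetch(service)     # L1 miss or L1 disabled: ask the registry
        except ConnectionError:
            return self.l2.get(service, [])     # registry down: fall back to L2
        if self.l1_enabled:
            self.l1[service] = instances
        self.l2[service] = instances
        # L3 is refreshed in a background thread so the call path is not blocked.
        self.last_write = threading.Thread(
            target=self._write_file, args=(dict(self.l2),))
        self.last_write.start()
        return instances
```

In this sketch a restarted process whose registry is unreachable calls `load_from_file()` once at startup and can then keep serving the last list that was written to disk.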

Fig. 7 is a schematic structural diagram of an emergency escape module according to an embodiment of the present invention. As shown in fig. 7, the micro services, the gateway and the registry form a micro-service architecture system, and communication with other systems is managed by a management platform. The systems communicate through proxies, which requires no intrusion into the systems themselves.

Fig. 8 is a schematic diagram of the communication principle of the emergency escape module according to an embodiment of the present invention. As shown in fig. 8, to improve the availability of inter-system calls, an emergency switching mechanism is implemented on the basis of registry configuration monitoring, multi-center configuration synchronization, and the receiving and issuing of emergency notifications. It ensures that systems communicate through emergency addresses when the distributed proxies fail on a large scale, and that network isolation and normal inter-system transactions are maintained under emergency conditions. The calling party configures the address (proxy address), the emergency address (domain name) and the status of each called party in the calling party's database. The management platform performs health checks and, when an abnormality is identified, issues an emergency switching notification containing two pieces of information: the service name and the service status. After receiving the notification, each system updates its database so that the configuration for the affected system status is switched to emergency. The multi-center daemon periodically synchronizes the database data into the Consul configuration of the registry and from there into the environment variables of the service instances in each center.

FIG. 9 is a schematic diagram of an emergency address call according to an embodiment of the present invention. As shown in FIG. 9, at call time, if the status of the current service is normal and the status of the called party is emergency, transactions with that called party use the emergency address; if the status of the current service is emergency, all transactions from the current service to other systems use emergency addresses. This achieves high availability of inter-system calls.
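
The address selection rule of FIG. 9, together with the handling of an emergency switching notification, might look like the following Python sketch. The names `CalleeConfig`, `EmergencyRouter`, `on_notification` and `resolve`, the status strings, and the in-memory dictionary standing in for the calling party's configuration are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict

NORMAL = "normal"          # status values are assumed spellings
EMERGENCY = "emergency"


@dataclass
class CalleeConfig:
    proxy_address: str                 # address used under normal conditions
    emergency_address: str             # address used after an emergency switch
    status: str = NORMAL


class EmergencyRouter:
    def __init__(self, current_service: str,
                 callees: Dict[str, CalleeConfig]) -> None:
        self.current_service = current_service
        self.current_status = NORMAL
        self.callees = callees

    def on_notification(self, service_name: str, service_status: str) -> None:
        """Apply an emergency switching notification (service name, service status)."""
        if service_name == self.current_service:
            self.current_status = service_status
        elif service_name in self.callees:
            self.callees[service_name].status = service_status

    def resolve(self, callee: str) -> str:
        """Pick the address for a call to `callee` according to both statuses."""
        cfg = self.callees[callee]
        if self.current_status == EMERGENCY or cfg.status == EMERGENCY:
            return cfg.emergency_address
        return cfg.proxy_address
```

Keeping the decision in a single `resolve` function means that a status change received through a notification takes effect on the next call without touching the calling code.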

In summary, the above embodiments of the present invention provide a service calling device and method for a distributed system. By supporting multi-center list acquisition and listening, a multi-level cached service list and an emergency escape mechanism, they implement a highly available multi-center registration and discovery technology, and a system realizing it, under a financial distributed micro-service architecture. Specifically: 1. A Consul-based multi-center high-availability distributed system design is realized, with multi-center service list acquisition and priority configuration. When no instance of a service is available in the local center, services in other centers are called transparently, so that nearby access is preserved, network delay is reduced and the transaction success rate is improved. 2. The service list cache, a key part of the design, consists of multiple cache levels, supports switch configuration and combination, and supports configurable access failure policies. When the registry is available, consistency between the cache and the registry's service list is improved; when the registry is down, calls between systems are unaffected; when a call to an instance fails, the failure policy modifies the state of the instance, reducing the impact of unavailable services on the system call chain. 3. For inter-system communication, a proxy-based emergency switching escape mechanism is implemented on the basis of registry configuration monitoring and multi-center configuration synchronization. Failover is transparent to the systems, network isolation is ensured, inter-system transactions remain normal under emergency conditions, and availability between systems is improved.

By comparison, the prior art has not studied multi-center deployment in depth and cannot provide a multi-center mode. The services of each center are registered only with that center's registry, and the whole service call chain runs from start to finish within the current data center, so each center is a relatively independent environment. When one service in a center becomes abnormal, the entire center is treated as unavailable and its transactions are switched to a standby center or other centers, which wastes resources and distributes load unreasonably. The invention provides a Consul-based multi-center implementation that supports multi-center list acquisition and listening, multi-center configuration, and dynamic online and offline of centers. When an instance of a service in the local center is unavailable, services in other centers are called transparently, so that nearby access keeps delay low and the transaction response rate and success rate are improved. In the prior art, load balancing based on Spring Cloud LoadBalancer provides no local service list, so calls between services fail when the registry is down. The Dubbo distributed architecture provides a local service list, but the list cannot be updated while the registry is down. The invention provides a multi-level cached service list that supports switch configuration and combination and supports configurable failure policies. When the registry is available, consistency between the cache and the registry's service list is improved; when the registry is down, calls between systems are unaffected; when a call to an instance fails, the call failure policy modifies the state of the instance, reducing the impact of unavailable services on the system call chain.

In the prior art, when systems call each other, either no isolation or absolute isolation is adopted, and emergency mechanisms are not fully considered.

The invention provides an emergency switching mechanism for inter-system docking. Based on registry configuration monitoring and multi-center configuration synchronization, it performs failover transparently to the systems, ensures network isolation and normal inter-system transactions under emergency conditions, and improves availability between systems.

The above description covers only the preferred embodiments of the present invention and is not intended to limit it; those skilled in the art may make various modifications and variations. Any modification, equivalent replacement or improvement made within the principle of the present invention shall fall within its protection scope.

Claims

1. A service calling device of a distributed system is characterized by comprising a plurality of data centers, wherein each data center comprises a plurality of registries, a plurality of micro services and a plurality of gateways,

the registration center is used for registering corresponding service information under the data center; the gateway communicates with each application system of the distributed system, and calls the micro services according to the service list of the registry;

the gateway and the microservice are provided with a multi-center module and a service list multi-level cache module, wherein the multi-center module is used for monitoring states of a plurality of registries of a plurality of data centers; and the service list multi-level caching module caches the service list of the registry.

2. The device of claim 1, wherein the registry is a Consul registry; the microservices are SpringCloud microservices.

3. The device of claim 1, wherein the multi-center module further comprises: a monitor for monitoring the service states of the registries, a connector for connecting clients to the registries, and a filter for filtering, according to access rules, the service lists acquired from the registries.

4. The device of claim 1, wherein the service list multi-level caching module comprises: a load balancing cache unit, a local service list cache unit and a local file cache unit, wherein the priority of the load balancing cache unit is greater than that of the local service list cache unit;

the load balancing caching unit is used for caching the service list of the normally available registry and controlling the effective time of load balancing caching by configuring time-to-live TTL;

the local service list caching unit is used for calling the service list from a local cache for the data center when the connection of the registration center is abnormal;

and the local file caching unit is used for initializing the local service list caching unit after the registry is abnormally connected and the distributed system is restarted.

5. The device according to claim 1, wherein the gateway is further provided with an emergency escape module, the emergency escape module being configured to provide a proxy address and an emergency address for the distributed system and, when the proxy address is abnormal, to directly notify the data center to enable the emergency address for service calls.

6. A service invocation method of a distributed system, characterized in that it is implemented by using the service invocation device according to any one of claims 1 to 5, comprising:

a client acquires the name list of a plurality of data centers, acquires, according to the name of the micro-service to be called by the client, a service list from the registry of the corresponding data center in a preset priority order of the registries, and performs the micro-service call according to the service list, wherein the service list is acquired from the service list multi-level cache module according to the preset priority order.

7. The method as recited in claim 6, further comprising:

monitoring states of the data center and the registration center through a multi-center monitoring module, acquiring an emergency address through an emergency escape module under the condition that the state of proxy service of a current system is abnormal, and replacing the current proxy address by the emergency address for communication and micro-service data transmission.

8. The method as recited in claim 6, further comprising:

performing call feedback monitoring on the service list of the registry through the multi-center monitoring module; when the registry is normally available, marking and removing from the service list any micro-service whose call result is abnormal, wherein, when the abnormal call result is caused by an abnormal communication network connection, monitoring continues until the connection returns to normal and the removed micro-service is then restored; and, when the connection to the registry is abnormal, setting a preset access failure count for the micro-services of each service list, and removing or suspending calls to a micro-service when its actual failed access count is greater than the preset access failure count.

9. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program, wherein the computer program is arranged to execute the method of any of the claims 6 to 8 when run.

10. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to run the computer program to perform the method of any of the claims 6 to 8.