CN109788068B

CN109788068B - Heartbeat state information reporting method, device and equipment and computer storage medium

Info

Publication number: CN109788068B
Application number: CN201910113747.0A
Authority: CN
Inventors: 陈路远; 袁浩; 王军
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2019-02-14
Filing date: 2019-02-14
Publication date: 2020-11-03
Anticipated expiration: 2039-02-14
Also published as: CN109788068A

Abstract

The invention discloses a method, an apparatus, and a device for reporting heartbeat state information, and a computer storage medium, which belong to the technical field of computers and are used for accurately reporting heartbeat state information to a scheduling platform. The method comprises the following steps: when the heartbeat state information reporting time is reached, the service providing device invokes a monitoring process to traverse the running state information of the monitored processes included in each module of the service providing device; when the monitoring process determines, according to the traversal result, that all monitored processes have successfully reported their running state information and that the running state information indicates that the processes are running normally, the service providing device invokes the monitoring process to report heartbeat state information indicating normal operation of the service providing device to the scheduling platform, so that when a service invoking device requests a list of available service providing devices from the scheduling platform, the scheduling platform sends a list that includes the service providing device to the service invoking device.

Description

Heartbeat state information reporting method, device and equipment and computer storage medium

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a method, an apparatus, and a device for reporting heartbeat status information, and a computer storage medium.

Background

A massive service scenario is a service scenario with a very large number of user requests. For example, Douyin short video, WeChat, and QQ all have large user bases, so the volume of requests generated by their users is very large, and balanced load scheduling is very important in such scenarios. For a massive service scenario, a server cluster is generally used to provide services for users. Each server in the cluster serves as a service providing device, and information such as the Internet Protocol (IP) address and port number of each server is registered with a scheduling platform so that the server can provide services for service invoking devices. Generally speaking, the scheduling platform is responsible for answering queries from a service invoking device for the list of available service providing devices; the service invoking device can then select one service providing device from the obtained list and initiate a service invocation to it.

Therefore, the service providing device needs to accurately report the available state of the service providing device to the scheduling platform, so that the scheduling platform can correctly know which service providing devices are available, and then can correctly provide the list of the available service providing devices for the service invoking device.

Disclosure of Invention

The embodiments of the invention provide a method, an apparatus, and a device for reporting heartbeat state information, and a computer storage medium, which are used for accurately reporting heartbeat state information to a scheduling platform.

In one aspect, a method for reporting heartbeat state information is provided, which is applied to a service providing device, wherein the service providing device is used for providing service for a service invoking device, and the method comprises the following steps:

when the heartbeat state information reporting time is reached, the service providing equipment calls a monitoring process to traverse the running state information of the monitored process included by each module in the service providing equipment;

when the monitoring process determines that all monitored processes successfully report the running state information according to the traversal result and the running state information indicates that the running state of the processes is normal, the service providing equipment calls the monitoring process to report the heartbeat state information indicating the normal running of the service providing equipment to a scheduling platform, so that when the service calling equipment requests a providable service equipment list from the scheduling platform, the scheduling platform sends the providable service equipment list comprising the service providing equipment to the service calling equipment.

In one aspect, a heartbeat state information reporting apparatus is provided, which is applied to a service providing device, wherein the service providing device is used for providing service for a service invoking device, and the apparatus comprises:

the monitoring unit is used for calling a monitoring process to traverse the running state information of the monitored process included by each module in the service providing equipment when the heartbeat state information reporting time is reached;

and the heartbeat reporting unit is used for reporting the heartbeat state information indicating the normal operation of the service providing equipment to a scheduling platform by the service providing equipment when the monitoring process determines that all monitored processes successfully report the operation state information according to the traversal result and the operation state information indicates that the process operation state is normal, so that when the service calling equipment requests a service equipment list which can be provided from the scheduling platform, the scheduling platform sends the service equipment list which comprises the service providing equipment to the service calling equipment.

In one aspect, a computer device is provided, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the above aspect.

In one aspect, a computer-readable storage medium is provided,

the computer readable storage medium has stored therein computer instructions which, when run on a computer, cause the computer to perform the method of the above aspect.

In the embodiments of the invention, the service providing device can invoke the monitoring process to traverse the running state information of the monitored processes included in each module of the device. Only when it is determined that all monitored processes have successfully reported their running state information, and that the running state information indicates that the processes are running normally, does the service providing device report heartbeat state information indicating its normal operation to the scheduling platform. The heartbeat state information reported to the scheduling platform is therefore based on the normal operation of all monitored processes, which avoids the situation in which heartbeat state information is reported as normal even though some process has failed. As a result, every device in the list of available service providing devices that the scheduling platform provides to the service invoking device is actually available, which reduces the likelihood of unsuccessful service invocations and improves the user experience.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.

Fig. 1 is a schematic architecture diagram of a service providing device in the prior art;

Fig. 2 is a schematic view of an application scenario provided in an embodiment of the present invention;

Fig. 3 is a schematic flowchart of an initialization process according to an embodiment of the present invention;

Fig. 4 is a schematic flowchart of a monitored process reporting its own running state information according to an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of a shared memory according to an embodiment of the present invention;

Fig. 6 is a schematic flowchart of reporting heartbeat state information by a monitoring process according to an embodiment of the present invention;

Fig. 7 is a schematic flowchart of a service invocation according to an embodiment of the present invention;

Fig. 8 is a schematic diagram of a display of a list of available service providing devices according to an embodiment of the present invention;

Fig. 9 is a schematic structural diagram of a heartbeat state information reporting apparatus according to an embodiment of the present invention;

Fig. 10 is a schematic structural diagram of a computer device according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention. The embodiments and features of the embodiments of the present invention may be arbitrarily combined with each other without conflict. Also, while a logical order is shown in the flow diagrams, in some cases, the steps shown or described may be performed in an order different than here.

In order to facilitate understanding of the technical solutions provided by the embodiments of the present invention, some key terms used in the embodiments of the present invention are explained first:

Heartbeat state information: also referred to as a heartbeat packet; it is used by a device to notify other devices of its operating status. Generally speaking, successful reporting of heartbeat state information indicates that the device is operating normally, and unsuccessful reporting indicates that the device may have failed. The heartbeat state information may carry a flag tag. For example, its format may be "IP address + flag tag", where the flag tag indicates whether the device is healthy. When the flag tag is 1, the device may be indicated to be healthy, that is, operating normally, and when the flag tag is 0, the device may be indicated to be unhealthy, that is, abnormal. Alternatively, the opposite convention may be used, with 1 indicating an unhealthy device and 0 indicating a healthy device.
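As a minimal sketch of the "IP address + flag tag" format described above, the following C++ fragment builds a heartbeat message as text. The names `Heartbeat` and `encode_heartbeat` and the `ip|flag` layout are illustrative assumptions; the text states only that the message carries an IP address and a flag tag and that either flag polarity may be chosen. Here, 1 is taken to mean healthy.

```cpp
#include <cassert>
#include <string>

// Hypothetical heartbeat record: the text specifies only "IP address + flag tag".
struct Heartbeat {
    std::string ip;  // IP address of the service providing device
    bool healthy;    // true if the device is operating normally
};

// Encodes the record as "<ip>|<flag>", where flag 1 = healthy and 0 = unhealthy.
// The separator and the flag polarity are assumptions made for this sketch.
std::string encode_heartbeat(const Heartbeat& hb) {
    assert(!hb.ip.empty());  // a heartbeat without an address cannot be attributed
    return hb.ip + "|" + (hb.healthy ? "1" : "0");
}
```

With the address used in Fig. 1, a healthy device would produce the string `10.60.100.99|1` under these assumptions.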

Module: generally, a master module and a plurality of slave modules may be deployed on a service providing device, and the master module and the slave modules depend on one another at run time. Each module may include a plurality of processes, and each process may, for example, be responsible for user requests in a fixed number segment. The modules cooperate to provide a service. If one of the processes hangs, the other processes can still serve requests correctly, but requests routed to the hung process fail. The service providing device can therefore be considered to be in an available state, that is, operating normally, only if all processes of all modules run normally. For example, if module 1, module 2, and module 3 are deployed on one device, the device can provide normal service externally only when all three modules are normal at the same time.

Normal operation: in the embodiments of the present invention, for a device, normal operation means that the device is in an available state, that is, all processes of all modules in the device operate normally. For a process, normal operation (or being alive) likewise means that the process is in an available state, that is, the process has not failed through problems such as hanging, dying, or entering an endless loop.

Shared memory: the shared memory is a storage space for storing the running state information of the processes to be monitored in all modules of the service providing device. Every process can access the shared memory and can locate, through a certain rule, the memory area of the module to which the process belongs.
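The passage above does not state the rule by which a process locates its module's memory area. One possible rule, shown purely as an assumption, is to treat the shared memory as a flat array of state slots and compute the slot from the module index and a per-module process index. The constants `kMaxModules` and `kMaxProcsPerModule` and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Assumed limits for this sketch. The text says only that a module index may be
// an integer between 1 and 1024; the per-module capacity is invented here.
constexpr std::size_t kMaxModules = 1024;
constexpr std::size_t kMaxProcsPerModule = 64;

// Returns the slot of process `proc_index` (0-based) of module `mod_index`
// (1-based) inside a flat array of kMaxModules * kMaxProcsPerModule slots.
inline std::size_t slot_of(std::size_t mod_index, std::size_t proc_index) {
    assert(mod_index >= 1 && mod_index <= kMaxModules);
    assert(proc_index < kMaxProcsPerModule);
    return (mod_index - 1) * kMaxProcsPerModule + proc_index;
}

// First slot of the memory area that belongs to a module.
inline std::size_t module_area_begin(std::size_t mod_index) {
    return slot_of(mod_index, 0);
}
```

Because module indexes are pre-allocated and unique, a rule of this shape gives every module a disjoint area without any coordination between processes at run time.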

In addition, the term "and/or" herein describes only an association relationship between associated objects and indicates that three relationships may exist. For example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" in this document generally indicates that the objects before and after it are in an "or" relationship unless otherwise specified.

At present, a scheduling platform generally determines whether a service providing device can serve normally through the heartbeat packets reported by that device. When the service providing device is a single-service deployment, that is, only one service is deployed on a given IP address, simple heartbeat reporting meets the requirement. In a scenario where a service providing device is composed of a plurality of modules deployed on the same IP address, with each module running a plurality of processes, the conventional heartbeat reporting method generally selects one fixed process from one module to report the heartbeat packet of the device; if the scheduling platform does not receive a heartbeat packet from the service providing device within a preset time interval, the device is considered unavailable. With this reporting mode, however, a heartbeat packet may still be reported even though the service providing device has actually failed.

Please refer to Fig. 1, which is a schematic diagram of the architecture of a service providing device. The IP address of the service providing device is 10.60.100.99, and the device includes a module A and a module B. Module A includes N processes, namely process 1 to process N in module A shown in Fig. 1, and module B includes M processes, namely process 1 to process M in module B shown in Fig. 1. The heartbeat packet of the device is reported by process 1 in module A. In practice, it may happen that process 1 in module A reports a heartbeat packet while process 2 in module A or a process in module B is in fact unable to operate normally. In this case the device should not continue to provide services, but because process 1 in module A has reported a heartbeat packet, the scheduling platform still considers the device available and may schedule it. Scheduling then fails, and the user experience suffers.

After the inventor analyzes the prior art, the inventor finds that the heartbeat packet reported by the selected reporting process in the prior art can only indicate that the process is available, but cannot indicate that all processes of all modules are available, so that the problem that the equipment condition acquired by the scheduling platform is inconsistent with the actual condition of the equipment is caused. In view of this, in order to make the device condition obtained by the scheduling platform consistent with the actual condition of the device, the accuracy of the heartbeat packet reported by the service providing device is very important.

In view of the above analysis and consideration, an embodiment of the present invention provides a method for reporting heartbeat status information, in which a service providing device may invoke a monitoring process to traverse operation status information of monitored processes included in each module of the device, and then only when it is determined that all monitored processes successfully report the operation status information and the operation status information indicates that the process operation status is normal, the service providing device reports heartbeat status information indicating normal operation of the service providing device to a scheduling platform, so that the heartbeat status information reported to the scheduling platform by the service providing device is based on normal operation of all monitored processes, thereby avoiding a situation that some process faults occur but the heartbeat status information is normally reported, and so that all devices in a list of providable service devices provided by the service invoking device by the scheduling platform are available, therefore, the possibility of unsuccessful service call of the service call device is reduced, and the use experience of a user is improved.
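The decision rule described above can be sketched as follows, assuming the monitoring process has already collected one record per monitored process. The names `ProcState`, `all_processes_healthy`, and `report_heartbeat_if_healthy` are hypothetical, and the sender is passed in as a callback because this passage does not describe how the heartbeat is transmitted. The sketch simply skips reporting when any process is unhealthy.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Running state of one monitored process as seen by the monitoring process.
struct ProcState {
    bool reported;  // the process reported its running state information
    bool normal;    // the reported state says the process is running normally
};

// Returns true only if every monitored process has reported and is normal.
bool all_processes_healthy(const std::vector<ProcState>& states) {
    for (const ProcState& s : states) {
        if (!s.reported || !s.normal) return false;
    }
    return true;
}

// Sends the "normal" heartbeat only when all monitored processes are healthy.
// Returns whether the heartbeat was sent.
bool report_heartbeat_if_healthy(const std::vector<ProcState>& states,
                                 const std::function<void()>& send_heartbeat) {
    assert(static_cast<bool>(send_heartbeat));  // a sender must be supplied
    if (!all_processes_healthy(states)) return false;
    send_heartbeat();
    return true;
}
```

The point of the rule is that a single failed or silent process is enough to suppress the heartbeat, in contrast to the prior-art approach in which one fixed process speaks for the whole device.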

In addition, in the method, the monitored process can also automatically report the running state information of the monitored process to the designated memory area, so that the monitoring process can periodically query the running state of each process in the designated memory area, and further determine whether the heartbeat state information needs to be reported, and further ensure the accuracy of the heartbeat state information reported by the monitoring process.

In the embodiment of the invention, when the conditions of process restart or configuration file change and the like occur, the memory for storing the running state information is emptied, so that the problem of data errors caused by conflicts among storage spaces for storing the running state information of each monitored process is avoided.

After the design idea of the embodiment of the present invention is introduced, some simple descriptions are provided below for application scenarios to which the technical solution of the embodiment of the present invention can be applied, and it should be noted that the application scenarios described below are only used for illustrating the embodiment of the present invention and are not limited. In the specific implementation process, the technical scheme provided by the embodiment of the invention can be flexibly applied according to actual needs.

Fig. 2 shows an application scenario to which the technical solution in the embodiment of the present invention is applicable, where the application scenario may include a service providing device 10, a scheduling platform 20, and a service invoking device 30.

The service providing device 10 may be, for example, one of the servers in a server cluster. When the service providing device 10 is put into use, registration may be performed in the scheduling platform 20, and the registered information may include information such as an IP address and a port number of the service providing device 10, for example.

The scheduling platform 20 may receive heartbeat status information of each service providing device 10 to determine whether each service providing device 10 is available and provide a list of available service providing devices to the service invocation device 30, and the service invocation device 30 may select an available service providing device 10 and initiate a remote service invocation.

The service invoking device 30 may be a user terminal, such as a mobile phone, a laptop, a tablet computer (PAD), or a personal computer (PC), or may be another server, such as another service providing device 10.

Of course, the method provided in the embodiment of the present invention is not limited to be used in the application scenario shown in fig. 2, and may also be used in other possible application scenarios, which is not limited in the embodiment of the present invention. The functions that can be implemented by each device in the application scenario shown in fig. 2 will be described in the following method embodiments, and will not be described in detail herein.

In the embodiments of the invention, in order to provide a general solution, any module or service only needs to configure in advance the module IDs and the numbers of processes to be monitored; unified detection of all the modules and processes it depends on can then be achieved.

Specifically, when each process is initialized, a configuration file is required to be loaded, where the configuration file is used to indicate configuration information of each module and a process that needs to monitor operating state information, where the configuration information of each module includes, for example, a module ID and a dependency relationship between each module. The format of the configuration file may be as follows:

In the embodiments of the present invention, each module needs to configure at least two pieces of information. The first is the module index, that is, the information corresponding to the field "mod_index" in the configuration file. The module index may be, for example, an integer between 1 and 1024, and the indexes of the modules are allocated before the device is put into use so that module indexes do not collide. The second is the number of processes that the module needs to have monitored, that is, the information corresponding to the field "proc_count" in the configuration file.

In the configuration file, the field "master_modem" indicates that the module following the field is the master module. For example, "master_modem": {"mod_index": 1, "proc_count": 7} in the configuration file indicates that the master module is the module with index 1 and that the number of processes to be monitored in that module is 7.

In practical applications, the monitoring process may be selected from the master module. Since the other processes in a module are generally created by the master process through the fork function, the master process of the master module may be selected as the monitoring process. The monitoring process needs to monitor the running state information of the processes in its own module as well as that of the processes in the slave modules.

In some cases, the running states of certain modules, or even certain processes, may not need to be monitored. To make it easy to turn the monitoring function on and off, the embodiments of the present invention provide a corresponding function switch. For example, in the configuration file, the field "enable_slave_mod_check" controls the monitoring function: when its value is "1", the monitoring function is turned on, that is, the running states of the processes of the subsequent modules all need to be monitored; when its value is "0", the monitoring function is turned off, that is, the running states of the processes of the subsequent modules do not need to be monitored. The monitoring function can similarly be turned on and off for an individual process. Of course, the opposite convention may also be used, with "0" turning the monitoring function on and "1" turning it off, which is not limited in the embodiments of the present invention.

For a module whose running state monitoring function is turned off, all processes of the module should be deleted from the monitored process list; for a process whose running state monitoring function is turned off, that process should be deleted from the monitored process list.

In the configuration file, in addition to the information related to the master module, the information of the slave modules that the master module depends on needs to be configured. For example, the field "rpt_slave_mode" indicates the slave module list, that is, the content following the field "rpt_slave_mode" is the list of slave modules. For each slave module, the module index "mod_index" and the number of processes to be monitored "proc_count" also need to be configured; see the description of the corresponding fields of the master module, which is not repeated here.

In the configuration file, the field "rpt_modulecfg_list" is the top-level entry of the configuration. In other words, one configuration file may contain the configuration information of a plurality of groups of master and slave modules, so that the dependency relationships of all master and slave modules can be obtained from a single configuration file. For example, the configuration file may include the configuration information of two groups of master-slave modules. In the first group, the index of the master module is 1, the number of processes to be monitored in the master module is 7, the indexes of the slave modules that the master module depends on are 2 and 3, and the numbers of processes to be monitored in module 2 and module 3 are 5 and 2, respectively. In the second group, the index of the master module is 4, the number of processes to be monitored in the master module is 4, the indexes of the slave modules that the master module depends on are 5 and 6, and the numbers of processes to be monitored in module 5 and module 6 are 6 and 8, respectively.
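To make the two-group example concrete, the configuration can be pictured as the following in-memory structures. This is a sketch under stated assumptions rather than the actual file format: the struct names are hypothetical, the member names merely mirror the field names mentioned in the text, and placing the `enable_slave_mod_check` switch at group level is a guess, since the text does not say where in the file it appears.

```cpp
#include <cassert>
#include <vector>

// One module entry, mirroring the fields "mod_index" and "proc_count".
struct ModuleCfg {
    int mod_index;   // pre-allocated module index, e.g. an integer in 1..1024
    int proc_count;  // number of processes of this module to be monitored
};

// One group of master and slave modules (hypothetical grouping structure).
struct ModuleGroupCfg {
    ModuleCfg master;               // the master module of the group
    std::vector<ModuleCfg> slaves;  // slave modules the master depends on
    bool enable_slave_mod_check;    // switch: monitor the slave modules or not
};

// Total number of processes the monitoring process of a group has to check.
int monitored_process_count(const ModuleGroupCfg& g) {
    int total = g.master.proc_count;
    if (g.enable_slave_mod_check) {
        for (const ModuleCfg& s : g.slaves) {
            assert(s.proc_count >= 0);  // a negative count would be a config error
            total += s.proc_count;
        }
    }
    return total;
}

// The two groups from the example in the text, with monitoring switched on.
std::vector<ModuleGroupCfg> example_config() {
    return {
        {{1, 7}, {{2, 5}, {3, 2}}, true},
        {{4, 4}, {{5, 6}, {6, 8}}, true},
    };
}
```

Under these assumptions the first group has 7 + 5 + 2 = 14 processes to check and the second has 4 + 6 + 8 = 18; switching off slave monitoring for the first group leaves only the 7 processes of its master module.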

Of course, other possible configuration information may also be included in the configuration file, and this is not limited in this embodiment of the present invention.

When the service providing device is used for the first time, the service providing device needs to be initialized, and since the initialization process is the same for each module or process, the process 1 of the main module is taken as an example below to introduce the initialization process, and other processes may refer to the following description and will not be described again. Fig. 3 is a schematic flow chart of an initialization process.

Step 301: the master module process 1 obtains the configuration file from the configuration center.

In the embodiments of the invention, the configuration center is used for storing the configuration file and providing a read service for it. In a specific application, the configuration center may be deployed centrally on a separate storage device; when a service providing device needs to be initialized, it may invoke a corresponding process to read the configuration file from that storage device. The storage device may be, for example, a fixedly installed device or a removable storage medium such as a USB flash drive or a portable hard disk. Alternatively, the configuration center may be deployed on each service providing device; when the service providing device needs to be initialized, it may invoke a corresponding process to read the configuration file from a designated storage path on the device itself.

Specifically, taking the primary module process 1 as an example, when acquiring the configuration file, the primary module process 1 may call an Application Programming Interface (API) for acquiring the configuration file, so as to acquire the configuration file from the specified storage path. For example, the API for obtaining the configuration file may be as follows:

int init_module(const std::string& config_path)

The parameter of this API is the storage path of the configuration file, so the configuration file can be obtained from that path once the API is called.
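The text gives only the signature of this API. As a rough illustration of what a function with this signature might do for step 301, the sketch below reads the whole file into memory. The body, the global `g_config_text`, and the convention of returning 0 on success and -1 on failure are all assumptions and are not taken from the source.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Raw configuration text after loading (hypothetical holder for this sketch).
static std::string g_config_text;

// Sketch of init_module: reads the entire file found at config_path.
// Returning 0 on success and -1 on failure is an assumed convention.
int init_module(const std::string& config_path) {
    assert(!config_path.empty());  // the caller must supply a storage path
    std::ifstream in(config_path);
    if (!in) return -1;            // file missing or not readable
    std::ostringstream buf;
    buf << in.rdbuf();             // copy the file contents into the buffer
    g_config_text = buf.str();
    return 0;
}
```

Parsing the loaded text into master-slave module configuration would follow as a separate step and is not shown here.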

Step 302: the master module process 1 parses the configuration file to obtain master-slave module configuration information.

In the embodiment of the present invention, after the master module process 1 obtains the configuration file, the configuration file may be analyzed, so as to obtain the configuration information of the master module and the slave module.

Specifically, since the configuration file may include the configuration information of a plurality of groups of master-slave modules, the master module process 1 may obtain only the configuration information of the master-slave modules of the device on which it runs.

Step 303: master module process 1 loads the shared memory.

In the embodiments of the present invention, during initialization the monitoring process needs to know where to check the running state information of each monitored process, and each monitored process needs to know where to report its own running state information. Each process therefore needs to load the shared memory, which is the storage space for the running state information.

Step 304: master module process 1 empties the shared memory.

In the embodiments of the invention, other data may exist in the shared memory when it is first used. To avoid interference from such data, the shared memory can be emptied after it is loaded. In practice, one or more processes may be selected to empty the shared memory. For example, the master module process 1 may be selected to empty the shared memory while the other processes do not, or each module may select one process to empty the memory area corresponding to that module.

In the embodiment of the present invention, in the specific use process of the service providing device, the modules and processes of the service providing device may be adjusted, and further, the configuration file may be updated, for example, the number of processes of a certain module may be adjusted from 3 to 5. Therefore, each process in the service providing device needs to periodically detect whether the configuration file is updated, for example, the configuration file may be checked every 2 minutes (min), and when the configuration file is updated, all processes need to reload the updated configuration file. Specifically, the following APIs may be called to update the configuration file:

int update_cfg(const std::string& config_path)

Since the module indexes and the numbers of processes to be monitored may change when the configuration file is updated, the shared memory may be emptied after the updated configuration file is loaded, in order to avoid interference from data previously stored in the shared memory. As before, one or more processes may be selected to empty the shared memory. For example, the master module process 1 may be selected to empty the shared memory while the other processes do not, or each module may select one process to empty the memory area corresponding to that module.
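The periodic check and the clear-on-update behavior can be sketched as follows. This is an illustration only: the type `ConfigWatcher` and its members are hypothetical, change detection by comparing the configuration text is an assumed technique (the text does not say how an update is detected), and a plain vector stands in for the shared memory.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <vector>

// Tracks the configuration currently in effect and a stand-in for the state slots.
struct ConfigWatcher {
    using Clock = std::chrono::steady_clock;

    std::string loaded_text;             // configuration text currently in effect
    std::vector<int> state_slots;        // stand-in for the shared memory slots
    Clock::time_point last_check;        // time of the most recent check
    std::chrono::seconds interval{120};  // the text's example: every 2 minutes

    // Returns true if another check is due at time `now`.
    bool check_due(Clock::time_point now) const {
        assert(interval.count() > 0);
        return now - last_check >= interval;
    }

    // Applies `current_text` if it differs from the loaded configuration. On a
    // change, all slots are zeroed so that entries written under the old layout
    // cannot be misread. Returns true if the configuration changed.
    bool apply_if_changed(const std::string& current_text, Clock::time_point now) {
        last_check = now;
        if (current_text == loaded_text) return false;
        loaded_text = current_text;
        state_slots.assign(state_slots.size(), 0);
        return true;
    }
};
```

Clearing only on an actual change matters: clearing on every check would wipe valid running state information that the monitored processes have already written.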

Similarly, when a process is restarted, the process may fill abnormal data into the shared memory, and in order to avoid interference of dirty data on subsequent processes, the shared memory may be emptied when the process is restarted.
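The two clearing strategies mentioned above can be sketched as follows. This is only a minimal illustration: the function names `clear_all` and `clear_module` are hypothetical, the 1024-byte area per module is the example size used later in this description, and an all-zero area is assumed to mean "empty".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Example per-module area size from the description; the name is hypothetical.
constexpr std::size_t kModuleAreaBytes = 1024;

// Variant 1: one designated process (e.g. main module process 1) wipes
// the areas of all modules in the shared memory.
inline void clear_all(unsigned char* shm, std::size_t module_count) {
    assert(shm != nullptr);
    std::memset(shm, 0, module_count * kModuleAreaBytes);
}

// Variant 2: one process per module wipes only the area of its own module.
inline void clear_module(unsigned char* shm, std::uint32_t mod_index) {
    assert(shm != nullptr);
    const std::size_t offset =
        static_cast<std::size_t>(mod_index) * kModuleAreaBytes;
    std::memset(shm + offset, 0, kModuleAreaBytes);
}
```

Either variant would be invoked after the configuration file is loaded or reloaded, or after a process restart.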

In the embodiment of the invention, during the specific use of the service providing device, each monitored process can actively report its own running state information to the shared memory, so that the monitoring process can determine whether to report the heartbeat state information based on the content stored in the shared memory. Since the reporting procedure of each monitored process is similar, the following description takes one monitored process as an example. Please refer to fig. 4, which is a schematic flow chart of a monitored process reporting its running state information.

Step 401: the service providing device calls the monitored process to determine whether the reporting time of the running state information has arrived.

In the embodiment of the invention, the running state information is used for indicating the running state of the monitored process. The format of the running state information may be, for example, a process ID + a flag tag, where the flag tag indicates the running state of the monitored process. For example, a flag tag value of 1 may indicate that the monitored process is running normally, and a value of 0 may indicate that the monitored process is running abnormally or has not reported its running state information; alternatively, a value of 0 may indicate that the monitored process is running normally, and a value of 1 may indicate that the monitored process is running abnormally or has not reported its running state information.

Generally speaking, if the monitored process can report its running state information, the monitored process is alive, that is, running normally; if the monitored process fails to report its running state information in time, the monitored process is not running normally, and an abnormal situation such as a hang or a crash may have occurred. However, in some cases the monitored process may be partially abnormal but still able to report running state information, in which case the reported running state information indicates that the process cannot run normally. For example, if a flag tag value of 0 indicates that the monitored process runs abnormally, the flag tag value reported by the process is 0.

In the embodiment of the present invention, the reporting of the running state information by the monitored process may be performed periodically, for example, the reporting may be performed every 1 second(s), and of course, the interval duration may be specifically set according to an actual situation, which is not limited in the embodiment of the present invention. In practical application, the monitored process may start timing after the last report of the running state information is completed, and when the timing duration is greater than or equal to the interval duration, it is determined that the reporting time has arrived.
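The timing check described above can be sketched as a small helper. This is a minimal illustration only; the class name `ReportTimer` and its methods are hypothetical, and a monotonic clock (`std::chrono::steady_clock`) is assumed so that wall-clock adjustments do not disturb the interval.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: decides whether the reporting time has arrived.
class ReportTimer {
public:
    using Clock = std::chrono::steady_clock;

    ReportTimer(Clock::duration interval, Clock::time_point start)
        : interval_(interval), last_(start) {
        assert(interval > Clock::duration::zero());
    }

    // True once the elapsed time since the last completed report is
    // greater than or equal to the interval duration.
    bool due(Clock::time_point now) const { return now - last_ >= interval_; }

    // Called after a report is completed to restart timing.
    void restart(Clock::time_point now) { last_ = now; }

private:
    Clock::duration interval_;
    Clock::time_point last_;
};
```

A monitored process could, for instance, construct `ReportTimer(std::chrono::seconds(1), ReportTimer::Clock::now())`, poll `due(...)` in its main loop, and call `restart(...)` once the report has been written.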

If the determination result in step 401 is no, the procedure returns to step 401 and waits until the reporting time arrives.

Step 402: if the determination result in step 401 is yes, the service providing device invokes the monitored process to locate the memory area corresponding to the module in which the monitored process is located.

In the embodiment of the invention, in order to facilitate the monitoring process to acquire the running state of the monitored process, the shared memory is reserved for storing the running state information of the monitored process, and in order to facilitate the management of each module, a corresponding memory area can be allocated to each module.

Fig. 5 is a schematic structural diagram of a shared memory. The service providing device comprises N modules, each module corresponds to one memory area, and each memory area comprises a plurality of grids; every grid has the same byte length and is used for storing the running state information of one process. For example, 1024 bytes may be allocated to each module, so that the memory area corresponding to a module begins at (start position of the shared memory + module index × 1024) and spans 1024 bytes. Each grid is 5 bytes long: the first 4 bytes store the process ID, and the last 1 byte stores the flag tag.
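The 5-byte grid layout in this example can be sketched as follows. The sizes come from the example above; the names `SlotRecord`, `write_slot` and `read_slot` are hypothetical, and the process ID is assumed to be stored in the host's native byte order, which is sufficient because all processes run on the same device.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kModuleAreaBytes = 1024;  // bytes reserved per module
constexpr std::size_t kSlotBytes = 5;           // 4-byte process ID + 1-byte flag
constexpr std::size_t kSlotsPerModule = kModuleAreaBytes / kSlotBytes;

// Decoded view of one grid (hypothetical type).
struct SlotRecord {
    std::uint32_t pid;   // process ID, bytes 0..3 of the grid
    std::uint8_t flag;   // flag tag, byte 4 of the grid
};

// Serializes a record into the 5 bytes starting at `slot`.
inline void write_slot(unsigned char* slot, const SlotRecord& rec) {
    assert(slot != nullptr);
    std::memcpy(slot, &rec.pid, sizeof(rec.pid));
    slot[4] = rec.flag;
}

// Reads the 5 bytes starting at `slot` back into a record.
inline SlotRecord read_slot(const unsigned char* slot) {
    assert(slot != nullptr);
    SlotRecord rec{0, 0};
    std::memcpy(&rec.pid, slot, sizeof(rec.pid));
    rec.flag = slot[4];
    return rec;
}
```

Because 5-byte grids are not 4-byte aligned, the sketch copies the process ID with `std::memcpy` rather than casting the pointer. With these sizes, one module area holds 204 grids.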

In practical applications, the length of the memory area allocated to each module is generally fixed, so the monitored process only needs to know the offset of the initial position of the module in which it is located. The monitored process can call the initial position acquisition interface with the index of the module in which it is located, so as to acquire the initial position of that module in the memory.

Specifically, when reporting the running state information of the monitored process, the monitored process may call the following API:

bool report_current_load(uint32_t mod_index)

When the API is called, the index of the module in which the monitored process is located is passed as a parameter, so that the offset of the module's initial position in the shared memory can be calculated, and the memory area corresponding to the module can then be located according to the initial position and the length of the storage space set for each module.
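Locating a module's memory area from its index can be sketched as below. This is not the actual implementation of `report_current_load`; the helper `locate_module_area` and the type `MemoryArea` are hypothetical, and the offset is assumed to be module index × 1024 bytes from the start of the shared memory, which is one reading of the 1024-byte example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kModuleAreaBytes = 1024;  // fixed length per module (example value)

// Hypothetical description of one module's area inside the shared memory.
struct MemoryArea {
    unsigned char* begin;  // initial position of the module's area
    std::size_t length;    // length of the storage space set for each module
};

// Computes the initial position from the module index and returns the area.
inline MemoryArea locate_module_area(unsigned char* shm_base,
                                     std::size_t shm_size,
                                     std::uint32_t mod_index) {
    const std::size_t offset =
        static_cast<std::size_t>(mod_index) * kModuleAreaBytes;
    assert(offset + kModuleAreaBytes <= shm_size);  // area must fit in the mapping
    return MemoryArea{shm_base + offset, kModuleAreaBytes};
}
```

Since the area length is fixed, the module index is the only input a process needs in order to find its own area.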

Step 403: the service providing device calls the monitored process to traverse the memory area corresponding to the module in which the monitored process is located, so as to determine whether the process ID of the monitored process is stored in the memory area.

In the embodiment of the present invention, after the monitored process locates the memory area corresponding to the module in which it is located, it may traverse each grid in the memory area to determine whether its own process ID is stored in the memory area.

For example, if a module includes N processes and all of them report running state information normally, N grids should store running state information, and the monitored process can find the grid storing its own process ID among these N grids during traversal. However, in actual operation, not all processes may report normally, so the monitored process may traverse the memory area sequentially from grid 1 to grid N to determine whether its own process ID is stored in the memory area.

Specifically, during traversal, when a traversed grid stores running state information, the monitored process compares the process ID stored in that grid with its own process ID. If they are the same, it is determined that the process ID of the monitored process is stored in the memory area; otherwise, the next grid is checked, until a grid whose stored content is empty is reached.

Step 404: if the determination result in step 403 is yes, the service providing device invokes the monitored process, and updates the running state information stored in the storage location where the process ID of the monitored process is located.

In the embodiment of the invention, if the monitored process finds its own process ID in an existing grid, the monitored process has successfully reported running state information before, and only the running state information stored in that grid, namely the flag tag, needs to be updated.

Specifically, for a given process, if its process ID is already present in a grid during traversal, the flag tag can be updated directly without occupying a blank storage location, which prevents the same process from occupying multiple grids.

Step 405: if the determination result in the step 403 is negative, the service providing device invokes the monitored process, and writes the running state information of the monitored process into the storage location with empty storage content in the memory area.

In the embodiment of the present invention, if the process ID of the monitored process is not stored in any grid, the monitored process has not successfully reported running state information before, and its running state information may be written into a storage location in the memory area whose stored content is empty.
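Steps 403 to 405 can be sketched together as a single routine. This is a minimal illustration, not the actual body of `report_current_load`: the function name `report_state` is hypothetical, the 1024-byte area and 5-byte grid sizes are the example values from the description, and a grid whose process ID bytes are all zero is assumed to be empty (so process ID 0 is reserved).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kModuleAreaBytes = 1024;
constexpr std::size_t kSlotBytes = 5;  // 4-byte process ID + 1-byte flag tag
constexpr std::size_t kSlotsPerModule = kModuleAreaBytes / kSlotBytes;

// Writes (pid, flag) into the module area starting at `area`.
// Returns false only if neither the process's grid nor an empty grid exists.
inline bool report_state(unsigned char* area, std::uint32_t pid,
                         std::uint8_t flag) {
    assert(area != nullptr);
    assert(pid != 0);  // 0 means "empty grid" in this sketch
    for (std::size_t i = 0; i < kSlotsPerModule; ++i) {
        unsigned char* slot = area + i * kSlotBytes;
        std::uint32_t stored = 0;
        std::memcpy(&stored, slot, sizeof(stored));
        if (stored == pid) {  // step 404: own grid found, update the flag only
            slot[4] = flag;
            return true;
        }
        if (stored == 0) {    // step 405: first empty grid, claim it
            std::memcpy(slot, &pid, sizeof(pid));
            slot[4] = flag;
            return true;
        }
    }
    return false;
}
```

Because the search stops at the first grid that is either the process's own or empty, a process that has already claimed a grid never occupies a second one.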

Specifically, because multiple processes may run simultaneously, several processes may each find that their own process ID is not stored in the memory area and then write their running state information into the same blank grid. The information written first is overwritten by the information written later, so the monitoring process may fail to obtain the running state information of the first process during traversal and may determine that the current device is unavailable. However, such a misjudgment is only short-lived: in its next traversal, the process whose information was overwritten cannot find a grid storing its own process ID, so it stores its running state information in a blank grid. After the short-lived misjudgment, each process therefore finds its own grid, the monitoring process can obtain the running state information of all processes during traversal, and the service providing device is judged to be available.

The misjudgment usually occurs right after the shared memory is emptied, when multiple processes compete for grids and grid conflicts occur. When the misjudgment occurs, the scheduling platform judges the service providing device to be unavailable only for a short time and does not schedule it during that time. Once each process has its own grid, the misjudgment is corrected, and the scheduling platform can judge the device to be available again.
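The transient grid conflict and its recovery can be replayed deterministically with a simplified model. The sketch below is illustrative only: `Slot`, `choose_slot` and `commit` are hypothetical names, a process ID of 0 is assumed to mark an empty grid, and the report is split into a "choose" phase and a "write" phase so that the interleaving of two processes can be reproduced by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified grid model; pid == 0 means the grid is empty (assumption).
struct Slot {
    std::uint32_t pid;
    std::uint8_t flag;
};

// Phase 1 of a report: the grid already holding `pid`, else the first empty
// grid. Returns area.size() if neither exists.
inline std::size_t choose_slot(const std::vector<Slot>& area,
                               std::uint32_t pid) {
    for (std::size_t i = 0; i < area.size(); ++i) {
        if (area[i].pid == pid || area[i].pid == 0) return i;
    }
    return area.size();
}

// Phase 2 of a report: write the running state into the chosen grid.
inline void commit(std::vector<Slot>& area, std::size_t index,
                   std::uint32_t pid) {
    assert(index < area.size());
    area[index] = Slot{pid, 1};
}
```

If processes 11 and 22 both run `choose_slot` on an emptied area before either calls `commit`, both pick grid 0 and the later write overwrites the earlier one. In the next round, process 11 no longer finds its ID, so `choose_slot` returns the next empty grid and the conflict resolves itself.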

In the embodiment of the present invention, after step 404 or step 405 is completed, the reporting of the current running state information is finished; timing may then be restarted, and the procedure returns to step 401 to wait for the next reporting time of the running state information.

In the embodiment of the present invention, during the specific use of the service providing device, the monitoring process may determine whether to report the heartbeat state information based on the content stored in the shared memory. Please refer to fig. 6, which is a schematic flow diagram of the monitoring process reporting heartbeat state information.

Step 601: the service providing device calls the monitoring process to determine whether the reporting time of the heartbeat state information has arrived.

In the embodiment of the invention, the heartbeat state information is used for indicating whether the service providing device is available. The format of the heartbeat state information may be, for example, an IP address + a flag tag, where the flag tag indicates whether the service providing device is healthy. For example, a flag tag value of 1 may indicate that the device is healthy, that is, available, and a value of 0 may indicate that the device is unhealthy, that is, unavailable; alternatively, a value of 1 may indicate that the device is unhealthy, and a value of 0 may indicate that the device is healthy.

In the embodiment of the present invention, the monitoring process may report the heartbeat state information periodically, for example, every 3 s; of course, the interval duration may be set according to the actual situation, which is not limited in the embodiment of the present invention. In practical application, the monitoring process may start timing after the last heartbeat reporting round is completed, and when the timed duration is greater than or equal to the interval duration, it is determined that the reporting time has arrived.

In order to ensure that the monitored process can have enough time to write the running state information of the monitored process into the shared memory, the interval time for reporting the heartbeat state information by the monitoring process can be set to be longer than the interval time for reporting the running state information by the monitored process.

If the determination result in step 601 is no, the procedure returns to step 601 and waits until the reporting time arrives.

Step 602: if the determination result in step 601 is yes, the service providing device invokes the monitoring process to traverse the running state information of the monitored process included in each module.

In the embodiment of the invention, the monitoring process can acquire the modules to be monitored and the process number list of each module to be monitored from the configuration file. For each module, the monitoring process can also find a memory area corresponding to the module, so as to traverse each grid in the memory area and check whether the running state information is stored in each grid. Because the manner in which the monitoring process acquires the memory region corresponding to each module may be the same as that of the monitored process, the description of the monitored process part may be referred to for the acquisition of the memory region corresponding to each module by the monitoring process, and redundant description is not repeated here.

When the reporting time of the heartbeat state information arrives, the monitoring process can call the following API to perform the traversal and to confirm whether the heartbeat state information is to be reported:

bool check_and_resert()

Specifically, when the monitoring process traverses the shared memory, the following three situations may occur. The following takes the case in which a flag tag value of 1 indicates that the process runs normally as an example:

(1) if the stored content of the grid is empty, a process may not have reported yet, and the next grid is checked until the traversal is completed;

(2) if the stored content of the grid is not empty but the flag tag is 0, the process has either not reported in this round or has reported that it is abnormal; the running state corresponding to the process ID is recorded, and the next grid is checked until the traversal is completed;

(3) if the stored content of the grid is not empty and the flag tag is 1, the process has reported and is running normally; the monitoring process records the running state of the process and resets the flag tag to the unreported state, that is, resets the flag tag to 0 or clears it, and then checks the next grid until the traversal is completed.
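The three cases can be sketched as one pass over a module's memory area. This is an illustration under stated assumptions rather than the actual `check_and_resert()` implementation: the names `ScanResult` and `scan_and_reset` are hypothetical, the area and grid sizes are the example values from the description, an all-zero process ID is assumed to mean an empty grid, and a flag tag of 1 is taken to mean "normal".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kModuleAreaBytes = 1024;
constexpr std::size_t kSlotBytes = 5;  // 4-byte process ID + 1-byte flag tag
constexpr std::size_t kSlotsPerModule = kModuleAreaBytes / kSlotBytes;

// Hypothetical summary of one traversal of a module's area.
struct ScanResult {
    std::size_t normal;    // case (3): reported and running normally
    std::size_t abnormal;  // case (2): not reported this round, or abnormal
};

inline ScanResult scan_and_reset(unsigned char* area) {
    assert(area != nullptr);
    ScanResult result{0, 0};
    for (std::size_t i = 0; i < kSlotsPerModule; ++i) {
        unsigned char* slot = area + i * kSlotBytes;
        std::uint32_t pid = 0;
        std::memcpy(&pid, slot, sizeof(pid));
        if (pid == 0) continue;  // case (1): empty grid, keep going
        if (slot[4] == 0) {      // case (2): flag tag is 0
            ++result.abnormal;
            continue;
        }
        ++result.normal;         // case (3): flag tag is 1
        slot[4] = 0;             // reset to the unreported state
    }
    return result;
}
```

Resetting the flag after each read means a process that stops reporting is counted under case (2) in the following round, even though its grid is still occupied.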

Step 603: the service providing device calls the monitoring process to determine whether to report the heartbeat state information.

In the embodiment of the invention, after the monitoring process completes the traversal, it knows from the traversal result whether the processes of each module are alive or dead, and determines accordingly whether to report the heartbeat state information. Specifically, when all monitored processes are running normally, that is, every monitored process has reported running state information and the information indicates that the process runs normally, the heartbeat state information is reported; when any monitored process is abnormal, that is, it has not reported or its reported running state information indicates an abnormality, the heartbeat state information is not reported.

Of course, in practical application, the monitoring process may instead always report, generating the heartbeat state information according to the alive-or-dead status of the processes: for example, when all monitored processes are running normally, heartbeat state information indicating that the service providing device is available is reported, and when any monitored process is running abnormally, heartbeat state information indicating that the service providing device is unavailable is reported. In the embodiment of the present invention, the first reporting manner, that is, determining whether to report the heartbeat state information according to the alive-or-dead status, is mainly used as an example in the following.
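Both reporting manners reduce to the same check over the traversal result, which can be sketched as follows. The types `ModuleStatus` and `Heartbeat` and both function names are hypothetical; the expected process count per module is assumed to come from the configuration file, and a flag tag of 1 is taken to mean "available".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical per-module summary produced by one traversal.
struct ModuleStatus {
    std::size_t expected;  // monitored processes configured for the module
    std::size_t normal;    // processes found reported and running normally
};

// First manner: report a heartbeat only if this returns true.
inline bool all_processes_normal(const std::vector<ModuleStatus>& modules) {
    for (const ModuleStatus& m : modules) {
        if (m.normal < m.expected) return false;
    }
    return true;
}

// Heartbeat state information in the "IP address + flag tag" format.
struct Heartbeat {
    std::string ip;
    std::uint8_t flag;  // 1 = available, 0 = unavailable (one of the two conventions)
};

// Second manner: always report, and encode availability in the flag tag.
inline Heartbeat make_heartbeat(const std::string& ip,
                                const std::vector<ModuleStatus>& modules) {
    assert(!ip.empty());
    const bool ok = all_processes_normal(modules);
    return Heartbeat{ip, static_cast<std::uint8_t>(ok ? 1 : 0)};
}
```

In the first manner, silence itself signals a problem, so the scheduling platform needs a timeout; in the second manner, the platform can react to an explicit "unavailable" flag without waiting.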

Step 604: if the determination result in step 603 is yes, the service providing device invokes the monitoring process to report the heartbeat state information to the scheduling platform.

In the embodiment of the invention, if the monitoring process successfully reports the heartbeat state information to the scheduling platform, the service providing equipment is indicated to be normally operated, namely available, otherwise, the service providing equipment is indicated to be unavailable.

If the determination result in step 603 is no, the monitoring process does not report the heartbeat state information and waits for the next monitoring round. When the service providing device is abnormal, maintenance personnel can generally repair the affected module or process to restore normal service; after the repair, the service providing device becomes available again, and the monitoring process can again report the heartbeat state information of the device.

In the embodiment of the present invention, after step 604 is completed, that is, after the current heartbeat state information has been reported, or after it is determined that the heartbeat state information is not to be reported, timing may be restarted to wait for the next heartbeat reporting time, and step 601 is then performed again.

In the embodiment of the present invention, please refer to fig. 7, which is a schematic flow chart of service invocation.

Step 701: the service providing device reports the heartbeat state information to the scheduling platform.

Step 702: the scheduling platform adds each service providing device that successfully reports heartbeat state information to the service providing device list, and deletes from the list each service providing device whose report has timed out.

Specifically, for a service providing device that successfully reports heartbeat state information but does not yet exist in the service providing device list, the scheduling platform may add it to the list; for a service providing device that already exists in the list but has not reported heartbeat state information before the timeout, the device is regarded as unavailable and should be deleted from the list.
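The list maintenance in step 702 can be sketched on the scheduling platform side as follows. This is only an illustration: the class `ProviderList`, its methods, and the use of an "IP:port" string as the device key are hypothetical, and the timeout value is an assumption, since the description does not fix one.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <vector>

// Hypothetical service providing device list kept by the scheduling platform.
class ProviderList {
public:
    using Clock = std::chrono::steady_clock;

    explicit ProviderList(Clock::duration timeout) : timeout_(timeout) {
        assert(timeout > Clock::duration::zero());
    }

    // A heartbeat adds an unknown device or refreshes a known one.
    void on_heartbeat(const std::string& address, Clock::time_point now) {
        last_seen_[address] = now;
    }

    // Deletes every device whose last heartbeat is older than the timeout.
    void expire(Clock::time_point now) {
        for (auto it = last_seen_.begin(); it != last_seen_.end();) {
            if (now - it->second > timeout_) {
                it = last_seen_.erase(it);
            } else {
                ++it;
            }
        }
    }

    // Devices currently regarded as available, in key order.
    std::vector<std::string> available() const {
        std::vector<std::string> out;
        for (const auto& entry : last_seen_) out.push_back(entry.first);
        return out;
    }

private:
    Clock::duration timeout_;
    std::map<std::string, Clock::time_point> last_seen_;
};
```

The timeout would normally be chosen to be longer than the heartbeat reporting interval, so that a single delayed heartbeat does not remove a healthy device from the list.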

Step 703: the service calling device sends an acquisition request for the providable service device list to the scheduling platform.

The acquisition request is used to request the providable service device list from the scheduling platform.

Step 704: the scheduling platform sends the providable service device list to the service calling device.

Referring to fig. 8, a schematic diagram of a list of available service devices is shown, where the list of available service devices may include information such as IP addresses and port numbers of the available service devices.

Step 705: the service invocation device selects one of the service providing devices from the list of providable service devices.

Step 706: the service calling device initiates a service calling request to the selected service providing device.

Step 707: the service providing device returns a service response to the service invoking device.

To sum up, in the embodiment of the present invention, the service providing device may invoke the monitoring process to traverse the running state information of the monitored processes included in each module of the device. Only when it is determined that all monitored processes have successfully reported running state information, and that information indicates that the processes are running normally, does the service providing device report to the scheduling platform heartbeat state information indicating that it is running normally. The heartbeat state information reported to the scheduling platform is therefore based on the normal running of all monitored processes, which avoids the situation in which some processes have failed but the heartbeat state information is still reported as normal. As a result, all devices in the providable service device list that the scheduling platform provides to the service invoking device are available, which reduces the possibility of failed service invocations and improves the user experience.

Referring to fig. 9, based on the same inventive concept, an embodiment of the present invention further provides a heartbeat status information reporting apparatus 90, which is applied to a service providing device, where the service providing device is configured to provide a service for a service invoking device, and the apparatus includes:

the monitoring unit 901 is configured to invoke a monitoring process to traverse the running state information of the monitored process included in each module in the service providing device when the heartbeat state information reporting time arrives;

a heartbeat reporting unit 902, configured to, when the monitoring process determines according to the traversal result that all monitored processes have successfully reported running state information and the running state information indicates that the running state of the processes is normal, invoke the monitoring process to report heartbeat state information indicating that the service providing device is running normally to the scheduling platform, so that when the service invoking device requests the providable service list from the scheduling platform, the scheduling platform sends the providable service list including the service providing device to the service invoking device.

Optionally, the apparatus further includes an operation status reporting unit 903, configured to:

when the reporting time of the running state information is reached, calling the monitored process to traverse the memory area corresponding to the module where the monitored process is located;

if the monitored process confirms that the process identification ID of the monitored process is stored in the memory area, calling the monitored process to update the running state information in the storage position where the process ID of the monitored process is located; or,

and if the monitored process confirms that the process identification ID of the monitored process is not stored in the memory area, calling the monitored process to write the running state information of the monitored process into a storage position with empty storage content in the memory area.

Optionally, the running state reporting unit 903 is further configured to:

the monitored process calls an initial position acquisition interface according to the index of the module in which it is located, to acquire the initial position, in the memory, of the module in which the monitored process is located;

and acquiring a memory area corresponding to the module of the monitored process according to the initial position and the length of the storage space set for each module.

Optionally, the monitoring unit 901 is specifically configured to:

when the monitoring process traverses to a storage position whose stored content is not empty and the running state information stored in the storage position indicates that the running state of the process is normal, determining that the running state of the process corresponding to the process ID stored in the storage position is normal; and,

and the monitoring process resets the running state information stored in the storage position to an unreported state.

Optionally, the apparatus further includes a configuration update unit 904 and a memory cleaning unit 905;

the configuration update unit 904 is configured to invoke all processes to determine whether the configuration file is updated, where the configuration file includes configuration information indicating each module and each process whose running state information needs to be monitored; and, when it is confirmed that the configuration file is updated, to invoke all the processes to load the updated configuration file;

the memory cleaning unit 905 is configured to invoke a part of or all processes to empty a memory area storing running state information.

Optionally, the memory cleaning unit 905 is further configured to:

and when at least one process is determined to be restarted, calling a part of or all the processes to empty the memory area for storing the running state information.

Optionally, the monitoring unit 901 is further configured to:

when it is confirmed that a module has turned off the running state monitoring function, deleting all monitored processes included in the module from the monitored process list; or,

when it is confirmed that a monitored process has turned off the running state monitoring function, deleting the monitored process from the monitored process list.

The apparatus may be configured to execute the methods shown in the embodiments shown in fig. 3 to fig. 8, and therefore, for functions and the like that can be realized by each functional module of the apparatus, reference may be made to the description of the embodiments shown in fig. 3 to fig. 8, which is not repeated here. Although the operation status reporting unit 903, the configuration updating unit 904, and the memory cleaning unit 905 are shown in fig. 9, it should be noted that the operation status reporting unit 903, the configuration updating unit 904, and the memory cleaning unit 905 are not essential functional units, and are shown by dashed lines in fig. 9.

Referring to fig. 10, based on the same technical concept, an embodiment of the present invention further provides a computer apparatus 100, which may include a memory 1001 and a processor 1002.

The memory 1001 is used for storing computer programs executed by the processor 1002. The memory 1001 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to use of the computer device, and the like. The processor 1002 may be a Central Processing Unit (CPU), a digital processing unit, or the like. The embodiment of the present invention does not limit the specific connection medium between the memory 1001 and the processor 1002. In fig. 10, the memory 1001 and the processor 1002 are connected by a bus 1003, the bus 1003 is shown by a thick line in fig. 10, and the connection manner between other components is only schematically illustrated and is not limited. The bus 1003 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 10, but this is not intended to represent only one bus or type of bus.

The memory 1001 may be a volatile memory, such as a random-access memory (RAM); the memory 1001 may also be a non-volatile memory, such as, but not limited to, a read-only memory (ROM), a flash memory, a hard disk drive (HDD) or a solid-state drive (SSD); or the memory 1001 may be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 1001 may also be a combination of the above memories.

The processor 1002 is configured to execute the method performed by the device in the embodiments shown in fig. 3 to fig. 8 when invoking the computer program stored in the memory 1001.

In some possible embodiments, various aspects of the methods provided by the present invention may also be implemented in the form of a program product including program code for causing a computer device to perform the steps of the methods according to various exemplary embodiments of the present invention described above in this specification when the program product is run on the computer device, for example, the computer device may perform the methods performed by the devices in the embodiments shown in fig. 3-8.

The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

The program product of the method of embodiments of the present invention may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a computing device. However, the program product of the present invention is not limited in this regard and, in the present document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device over any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., over the internet using an internet service provider).

It should be noted that although several units or sub-units of the apparatus are mentioned in the above detailed description, such division is merely exemplary and not mandatory. Indeed, according to embodiments of the invention, the features and functions of two or more of the units described above may be embodied in one unit. Conversely, the features and functions of one unit described above may be further divided among, and embodied by, a plurality of units.

Moreover, while the operations of the method of the invention are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step for execution, and/or one step may be broken down into multiple steps for execution.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims

1. A method for reporting heartbeat state information, applied to service providing equipment, wherein the service providing equipment is used for providing a service for service invoking equipment, the method comprising the following steps:

when the heartbeat state information reporting time is reached, the service providing equipment calls a monitoring process to traverse the running state information of the monitored process included by all modules in the service providing equipment, the service providing equipment calls the monitored process to traverse a memory region corresponding to the module where the monitored process is located, if the monitored process confirms that the process identification ID of the monitored process is stored in the memory region, the service providing equipment calls the monitored process and updates the running state information in the storage position where the process ID of the monitored process is located; wherein the all modules comprise a master module and a plurality of slave modules, the master module and the plurality of slave modules operating dependently;

2. The method of claim 1, wherein the method further comprises:

if the monitored process confirms that the process identification ID of the monitored process is not stored in the memory region, the service providing equipment calls the monitored process and writes the running state information of the monitored process into a storage position with empty storage content in the memory region.
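The update-or-write behaviour recited in claims 1 and 2 can be illustrated with a minimal sketch. It is not the patented implementation: the memory region is modelled as a plain Python list of slots rather than real shared memory, and the names `Slot`, `report_state`, `STATE_NORMAL` and `STATE_ABNORMAL` are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical state codes; the claims only speak of "running state information".
STATE_NORMAL = 0
STATE_ABNORMAL = 1


@dataclass
class Slot:
    pid: int    # process ID of the monitored process
    state: int  # most recently reported running state


def report_state(region: List[Optional[Slot]], pid: int, state: int) -> int:
    """Record `state` for `pid` in `region` and return the slot index used.

    `region` stands in for the memory region of the module that owns the
    process; `None` marks a storage position with empty content.
    """
    first_empty = None
    for i, slot in enumerate(region):
        if slot is None:
            if first_empty is None:
                first_empty = i  # remember the first free position
        elif slot.pid == pid:
            slot.state = state   # PID already stored: update in place
            return i
    if first_empty is None:
        # Behaviour for a full region is not described in the claims;
        # raising is an assumption of this sketch.
        raise RuntimeError("memory region is full")
    region[first_empty] = Slot(pid, state)  # PID not stored: use an empty position
    return first_empty
```

Scanning the whole region before writing ensures that a process never occupies two slots, even if an empty position precedes the slot that already holds its PID.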

3. The method as claimed in claim 1, wherein before the service providing device calls the monitored process to traverse the memory region corresponding to the module where the monitored process is located, the method further comprises:

4. The method of claim 1, wherein the calling of the monitoring process by the service providing device to traverse the running state information of the monitored process included in all modules in the service providing device comprises:

the monitoring process traverses to a storage position with non-empty storage content, and when the running state information stored in the storage position indicates that the running state of the process is normal, the process corresponding to the process ID stored in the storage position is determined to run normally; and
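The traversal step of claim 4 can be sketched as follows. This is a sketch under stated assumptions, not the claimed implementation: each slot is modelled as a `(pid, state)` tuple, `STATE_NORMAL` is a hypothetical encoding, and the overall decision in `heartbeat_is_normal` follows the condition stated in the abstract (every monitored process has reported and every reported state is normal). The set `monitored_pids` is an assumed representation of the monitored process list.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple

STATE_NORMAL = 0  # hypothetical encoding of a normal running state

Slot = Tuple[int, int]  # (process ID, running state)


def traverse(regions: Sequence[List[Optional[Slot]]]) -> Dict[int, bool]:
    """Visit every non-empty storage position and map each PID to 'is normal'."""
    result: Dict[int, bool] = {}
    for region in regions:
        for slot in region:
            if slot is None:
                continue  # empty storage position: nothing was reported here
            pid, state = slot
            result[pid] = state == STATE_NORMAL
    return result


def heartbeat_is_normal(
    regions: Sequence[List[Optional[Slot]]], monitored_pids: Set[int]
) -> bool:
    """True only if every monitored PID reported and its state is normal."""
    seen = traverse(regions)
    # A PID that is missing from `seen` never reported, so it counts as not normal.
    return all(seen.get(pid, False) for pid in monitored_pids)
```

With a master region `[(1, 0), None]` and a slave region `[None, (2, 0), (3, 1)]`, the check passes for the monitored set `{1, 2}` and fails for `{1, 2, 3}`, because process 3 reported a state other than normal.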

5. The method of any one of claims 1 to 4, further comprising:

the service providing equipment calls all processes to confirm whether a configuration file is updated, wherein the configuration file is used for indicating configuration information of each module and the processes whose running state information needs to be monitored;

when confirming that the configuration file is updated, the service providing equipment calls all processes to load the updated configuration file; and

and the service providing equipment calls part or all processes to empty the memory area for storing the running state information.
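The reload-and-clear sequence of claim 5 might look like the sketch below. Several details are assumptions that the claims do not specify: the update is detected by comparing the file's modification time, the configuration file is JSON, and the memory area is modelled as a Python list in which `None` marks an empty position. The class name `ConfigWatcher` is hypothetical.

```python
import json
import os
from typing import List, Optional


class ConfigWatcher:
    """Reloads a configuration file when it changes and clears stale state."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.mtime_ns = os.stat(path).st_mtime_ns  # assumed change indicator
        self.config = self._load()

    def _load(self) -> dict:
        with open(self.path, encoding="utf-8") as f:
            return json.load(f)  # JSON is an assumption of this sketch

    def refresh(self, region: List[Optional[tuple]]) -> bool:
        """Reload the file and empty `region` if the file changed.

        Returns True when a reload happened, False otherwise.
        """
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns == self.mtime_ns:
            return False  # configuration unchanged: keep the stored states
        self.mtime_ns = mtime_ns
        self.config = self._load()
        # State written under the old configuration may refer to processes
        # that are no longer monitored, so every position is emptied.
        for i in range(len(region)):
            region[i] = None
        return True
```

Clearing the area in place, rather than rebinding the name to a new list, keeps any other holder of the same list object looking at the emptied storage.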

6. The method of any one of claims 1 to 4, further comprising:

and when the service providing equipment confirms that at least one process is restarted, calling part or all processes to empty the memory area for storing the running state information.

7. The method of any one of claims 1 to 4, further comprising:

when the service providing equipment confirms that the module closes the monitoring running state function, all monitored processes included by the module are deleted from the monitored process list; or,

and the service providing equipment deletes the monitored process from the monitored process list when the monitored process closes the monitoring running state function.
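The two deletion cases of claim 7 (a whole module, or a single process, turning off the monitoring function) can be sketched with a small in-memory structure. The class `MonitoredList` and its method names are hypothetical, and identifying modules and processes by strings is an assumption of this sketch.

```python
from typing import Dict, Set, Tuple


class MonitoredList:
    """Monitored process list, grouped by the module each process belongs to."""

    def __init__(self) -> None:
        self._by_module: Dict[str, Set[str]] = {}

    def add(self, module: str, process: str) -> None:
        self._by_module.setdefault(module, set()).add(process)

    def disable_module(self, module: str) -> None:
        """The module turned monitoring off: drop all of its processes."""
        self._by_module.pop(module, None)

    def disable_process(self, module: str, process: str) -> None:
        """A single process turned monitoring off: drop only that process."""
        processes = self._by_module.get(module)
        if processes is None:
            return
        processes.discard(process)
        if not processes:
            del self._by_module[module]  # no monitored processes remain

    def processes(self) -> Set[Tuple[str, str]]:
        """All (module, process) pairs that are still monitored."""
        return {(m, p) for m, ps in self._by_module.items() for p in ps}
```

Removing entries from the list, rather than flagging them, means a later traversal would not treat a process that deliberately stopped reporting as a failure.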

8. A heartbeat state information reporting device, applied to service providing equipment, wherein the service providing equipment is used for providing a service for service invoking equipment, the device comprising:

the monitoring unit is used for calling a monitoring process to traverse the running state information of the monitored process included by all modules in the service providing equipment when the heartbeat state information reporting time is reached, calling the monitored process to traverse a memory area corresponding to the module where the monitored process is located, and calling the monitored process to update the running state information in the storage position where the process ID of the monitored process is located if the monitored process confirms that the process identification ID of the monitored process is stored in the memory area; wherein the all modules comprise a master module and a plurality of slave modules, and the master module and the plurality of slave modules run dependently;

9. A computer device, comprising:

at least one processor; and

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.

10. A computer-readable storage medium, characterized in that,

the computer readable storage medium has stored therein computer instructions which, when run on a computer, cause the computer to perform the method of any one of claims 1-7.