CN115396341A - Service stability evaluation method and device, storage medium and electronic device - Google Patents

Service stability evaluation method and device, storage medium and electronic device


Publication number
CN115396341A
Authority
CN
China
Prior art keywords
target
service
stability
evaluation
change
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210980612.6A
Other languages
Chinese (zh)
Other versions
CN115396341B (en)
Inventor
孙宏远
陈存利
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Du Xiaoman Technology Beijing Co Ltd
Original Assignee
Du Xiaoman Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Du Xiaoman Technology Beijing Co Ltd
Priority to CN202210980612.6A
Publication of CN115396341A
Application granted
Publication of CN115396341B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0817 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/04 Processing captured monitoring data, e.g. for logfile generation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Data Mining & Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a service stability evaluation method and device, a storage medium and an electronic device. The method comprises the following steps: detecting target change information of a target micro-service during operation, wherein the target change information represents a change in the stability of the target micro-service; identifying the target change information to obtain a target identification result, wherein the target identification result represents classification information corresponding to the target change information; determining at least one target observation index corresponding to the target identification result; and quantitatively evaluating the target micro-service based on the at least one target observation index to obtain a stability evaluation result corresponding to the target micro-service. The invention solves the technical problems in the related art that service stability is evaluated at too high a level and too infrequently.

Description

Service stability evaluation method and device, storage medium and electronic device
Technical Field
The present invention relates to the field of information processing, and in particular, to a method and an apparatus for evaluating service stability, a storage medium, and an electronic apparatus.
Background
In the related art, the evaluation method for service stability is mainly used for evaluating the system level, but quantitative evaluation is not performed for the stability of the low-level micro-service, so that the evaluation level of the related art is too high, and the evaluation result cannot directly guide the responsible team of each micro-service to avoid the related stability risk. In addition, the evaluation for service stability in the related art generally requires a relevant evaluation organization to perform field evaluation, which takes a long time to evaluate, resulting in low evaluation frequency. Moreover, since the online service is continuously updated and the stability of the online service is continuously changed, the construction of the service stability cannot be effectively guided according to the last evaluation result in the interval time of two evaluations. Therefore, the related art has the problems of high evaluation level of service stability, low evaluation frequency and the like.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiment of the invention provides a method and a device for evaluating service stability, a storage medium and an electronic device, which are used for at least solving the technical problems of high service stability evaluation level and low evaluation frequency caused by the related technology.
According to an embodiment of the present invention, there is provided a method for evaluating service stability, including:
detecting target change information of a target micro service in the running process, wherein the target change information is used for representing the stability change condition of the target micro service; identifying the target change information to obtain a target identification result, wherein the target identification result is used for representing classification information corresponding to the target change information; determining at least one target observation index corresponding to the target identification result; and carrying out quantitative evaluation on the target micro service based on at least one target observation index to obtain a stability evaluation result corresponding to the target micro service.
Optionally, identifying the target change information, and obtaining a target identification result includes: acquiring a target classification rule; and identifying the target change information by using a target classification rule to obtain a target identification result.
Optionally, identifying the target change information by using the target classification rule to obtain the target identification result includes: identifying the target change information within a first classification range and determining a first identification result, wherein the first identification result indicates the change major class corresponding to the target change information;
determining a second classification range according to the first identification result, wherein the second classification range comprises a plurality of change subclasses corresponding to the first identification result;
and identifying the target change information within the second classification range and determining a second identification result, wherein the second identification result indicates the change subclass and the change reason corresponding to the target change information.
Optionally, the first classification range includes a first change, and determining at least one target observation index corresponding to the target identification result includes: in response to the second identification result being a change subclass contained in the first change, determining that the at least one target observation index includes at least one of: latest on-line list status, engineering availability and service availability, wherein the first change represents a stability change caused by a user's operation.
Optionally, the first classification range includes a second change, and determining at least one target observation index corresponding to the target identification result includes: in response to the second identification result being a change subclass contained in the second change, determining at least one parameter index associated with the change subclass as a target observation index, wherein the second change represents a stability change caused by a system abnormality.
Optionally, performing quantitative evaluation on the target micro service based on at least one target observation index, and obtaining a stability evaluation result includes:
acquiring a target evaluation rule, wherein the target evaluation rule is used for recording a mapping relation between at least one target observation index and an evaluation standard;
calculating an evaluation value corresponding to the target micro-service, wherein the evaluation value represents a quantitative value corresponding to the at least one target observation index during operation of the target micro-service; and determining the stability evaluation result based on the evaluation value.
According to an embodiment of the present invention, there is also provided an apparatus for evaluating service stability, including:
a detection module, configured to detect target change information of a target micro-service during operation, wherein the target change information represents a change in the stability of the target micro-service;
an identification module, configured to identify the target change information to obtain a target identification result, wherein the target identification result represents classification information corresponding to the target change information;
a determining module, configured to determine at least one target observation index corresponding to the target identification result;
and an evaluation module, configured to quantitatively evaluate the target micro-service based on the at least one target observation index to obtain a stability evaluation result corresponding to the target micro-service.
Optionally, the apparatus for evaluating service stability further includes an obtaining module, configured to obtain the target classification rule; and the identification module is also used for identifying the target change information by using the target classification rule to obtain a target identification result.
Optionally, the identification module is further configured to: identify the target change information within a first classification range and determine a first identification result, wherein the first identification result indicates the change major class corresponding to the target change information; determine a second classification range according to the first identification result, wherein the second classification range comprises a plurality of change subclasses corresponding to the first identification result; and identify the target change information within the second classification range and determine a second identification result, wherein the second identification result indicates the change subclass and the change reason corresponding to the target change information.
Optionally, the determining module is further configured to: in response to the second identification result being a change subclass contained in the first change, determine that the at least one target observation index includes at least one of: latest on-line list status, engineering availability and service availability, wherein the first change represents a stability change caused by a user's operation.
Optionally, the determining module is further configured to: in response to the second identification result being a change subclass contained in a second change, determine at least one parameter index associated with the change subclass as a target observation index, wherein the second change represents a stability change caused by a system abnormality.
Optionally, the obtaining module is further configured to: and acquiring a target evaluation rule, wherein the target evaluation rule is used for recording the mapping relation between at least one target observation index and the evaluation standard.
Optionally, the apparatus for evaluating service stability further includes: and the calculation module is used for calculating an evaluation value corresponding to the target micro service, wherein the evaluation value is used for representing a quantitative value corresponding to at least one target observation index in the running process of the target micro service.
Optionally, the determining module is further configured to: determine the stability evaluation result based on the evaluation value.
According to an embodiment of the present invention, there is further provided a non-volatile storage medium having a computer program stored therein, wherein the computer program is configured to execute the method for evaluating service stability in any one of the above methods when running.
There is further provided, according to an embodiment of the present invention, a processor, configured to execute the program, where the program is configured to execute the method for evaluating service stability in any one of the above-mentioned embodiments when running.
There is further provided, according to an embodiment of the present invention, an electronic apparatus including a memory and a processor, where the memory stores a computer program, and the processor is configured to execute the computer program to perform the method for evaluating service stability in any one of the above.
In the embodiment of the invention, the target change information is identified by detecting the target change information of the target micro-service in the operation process to obtain the target identification result, and then at least one target observation index corresponding to the target identification result is determined, and the target micro-service is quantitatively evaluated based on the at least one target observation index to obtain the stability evaluation result corresponding to the target micro-service, so that the purposes of reducing the evaluation level and improving the evaluation frequency are achieved, the technical effect of quantitatively evaluating the stability of a single micro-service is realized, and the technical problems of high service stability evaluation level and low evaluation frequency in the related technology are solved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention and do not constitute a limitation of the invention. In the drawings:
FIG. 1 is a flow chart of a method for evaluating service stability according to one embodiment of the present invention;
FIG. 2 is a block diagram of an apparatus for evaluating service stability according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In the related art, the evaluation method for service stability is mainly used for evaluating the service stability at a system level, and comprises the step of evaluating the capability of guaranteeing the stable operation of a system, namely when the system fails, troubleshooting is carried out and the operation of the system is recovered. However, the stability of the low-level micro-service is not quantitatively evaluated, so that the evaluation level of the related technology is too high, and the evaluation result cannot directly guide the responsible team of each micro-service to avoid the related stability risk. In addition, in the related art, the evaluation for the service stability generally requires a related evaluation organization to perform field evaluation, and the evaluation takes a long time, so that the related art cannot perform daily or real-time evaluation on the service, thereby resulting in low evaluation frequency. Moreover, the online service is continuously updated, and the stability of the online service also changes continuously, so that the construction of the service stability cannot be effectively guided according to the last evaluation result in the interval time of two evaluations.
Therefore, the related art has the problems of high evaluation level of service stability, low evaluation frequency and the like. Therefore, the present application proposes an evaluation method of service stability to solve the above technical problem.
In accordance with an embodiment of the present invention, there is provided an embodiment of a method for evaluating service stability, it should be noted that the steps illustrated in the flowchart of the accompanying drawings may be performed in a computer system such as a set of computer executable instructions, and that while a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different than that herein.
The method embodiments may be performed in an electronic device or similar computing device that includes a memory and a processor. Taking a computer terminal as an example, the computer terminal may include one or more processors (which may include, but are not limited to, processing devices such as Central Processing Units (CPUs), Graphics Processing Units (GPUs), Digital Signal Processing (DSP) chips, microcontroller units (MCUs), programmable logic devices (FPGAs), Neural Network Processors (NPUs), Tensor Processors (TPUs), Artificial Intelligence (AI) type processors, etc.) and a memory for storing data. Optionally, the computer terminal may further include a transmission device for communication functions, an input/output device, and a display device. It will be appreciated by persons skilled in the art that the above description of the architecture is illustrative only and is not intended to limit the architecture of the computer terminal described above. For example, the computer terminal may also include more or fewer components than described above, or have a different configuration than described above.
The memory may be used to store computer programs, for example, software programs and modules of application software, such as a computer program corresponding to the method for evaluating service stability in the embodiment of the present invention, and the processor executes various functional applications and data processing by running the computer program stored in the memory, that is, implements the method for evaluating service stability described above. The memory may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory located remotely from the processor, and these remote memories may be connected to the mobile terminal through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the mobile terminal. In one example, the transmission device includes a Network adapter (NIC), which can be connected to other Network devices through a base station so as to communicate with the internet. In one example, the transmission device may be a Radio Frequency (RF) module, which is used for communicating with the internet in a wireless manner.
The display device may be, for example, a touch screen type Liquid Crystal Display (LCD) or a touch display (also referred to as a "touch screen" or "touch display screen"). The liquid crystal display may enable a user to interact with a user interface of the mobile terminal. In some embodiments, the mobile terminal has a Graphical User Interface (GUI) with which a user can interact through finger contacts and/or gestures on a touch-sensitive surface. The human-machine interaction functions optionally include: creating web pages, drawing, word processing, making electronic documents, games, video conferencing, instant messaging, emailing, call interfacing, playing digital video, playing digital music, and/or web browsing. Executable instructions for performing these human-machine interaction functions are configured/stored in one or more processor-executable computer program products or readable storage media.
In this embodiment, a method for evaluating service stability running on the computer terminal is provided, and fig. 1 is a flowchart of a method for evaluating service stability according to an embodiment of the present invention, as shown in fig. 1, the flowchart includes the following steps:
Step S12, detecting target change information of the target micro-service during operation, wherein the target change information represents a change in the stability of the target micro-service;
Step S14, identifying the target change information to obtain a target identification result, wherein the target identification result represents classification information corresponding to the target change information;
Step S16, determining at least one target observation index corresponding to the target identification result;
Step S18, quantitatively evaluating the target micro-service based on the at least one target observation index to obtain a stability evaluation result corresponding to the target micro-service.
Specifically, the classification information corresponding to the target change information indicates the classification of the stability change, and may include a change major class, a change subclass, and a stability deterioration cause. The target observation indexes can be divided into two categories: artificial change observation indexes and passive change observation indexes. Each target observation index corresponds to a scoring standard; different observation indexes correspond to different scores, and the higher the score, the worse the stability. The artificial change observation indexes may include the latest on-line list status, engineering availability, service availability and the like, and the passive change observation indexes may include at least one parameter index associated with a change subclass.
The stability change of the target micro-service during operation is detected, and the corresponding change major class, change subclass and stability deterioration cause are identified. The corresponding target observation indexes are then determined according to the identification result, the stability of the single micro-service is scored according to the correspondence table between observation indexes and scoring standards, and finally the scores corresponding to all target observation indexes are accumulated to obtain the stability evaluation result of the target micro-service. The higher the score of the evaluation result, the worse the stability of the target micro-service.
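The accumulation of scores described above can be sketched as follows. This is only a minimal illustration: the index names, the thresholds, and the point values in `SCORING_TABLE` are hypothetical placeholders, since the scoring standards themselves are not given in this passage.

```python
from typing import Callable, Dict, Mapping


# Hypothetical scoring standards: each function converts an observed value
# into points, where more points mean worse stability.
def availability_points(value: float) -> int:
    if value >= 0.999:
        return 0
    if value >= 0.99:
        return 5
    return 10


def utilization_points(value: float) -> int:
    if value < 0.7:
        return 0
    if value < 0.9:
        return 4
    return 8


# Assumed correspondence table between observation indexes and scoring standards.
SCORING_TABLE: Dict[str, Callable[[float], int]] = {
    "service_availability": availability_points,
    "engineering_availability": availability_points,
    "cpu_utilization": utilization_points,
    "memory_utilization": utilization_points,
}


def evaluate_stability(observed: Mapping[str, float]) -> int:
    """Accumulate the points of every target observation index that was observed."""
    return sum(SCORING_TABLE[name](value) for name, value in observed.items())


score = evaluate_stability({"service_availability": 0.995, "cpu_utilization": 0.95})
print(score)
```

Because the result is a plain sum, it can be recomputed whenever new observations arrive, which fits the stated goal of evaluating a single micro-service frequently.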
Based on the steps S12 to S18, the target change information is identified by detecting the target change information of the target micro service in the operation process to obtain a target identification result, and then at least one target observation index corresponding to the target identification result is determined, and the target micro service is quantitatively evaluated based on the at least one target observation index, so as to obtain a stability evaluation result corresponding to the target micro service, thereby achieving the purposes of reducing an evaluation level and improving evaluation frequency, and further achieving the technical effect of quantitatively evaluating the stability of a single micro service, and further solving the technical problems of high service stability evaluation level and low evaluation frequency caused in the related technology.
Optionally, in step S14, identifying the target change information to obtain a target identification result includes:
Step S141, acquiring a target classification rule;
Step S142, identifying the target change information by using the target classification rule to obtain the target identification result.
Specifically, the target classification rule indicates the correspondence between the classifications and the causes of micro-service stability changes, and can be explained with the micro-service stability change classification and cause correspondence table shown in Table 1. As shown in Table 1, the change major classes affecting micro-service stability are artificial change and passive change. An artificial change is a stability change caused by a user's operation, for example, a service being manually stopped online or an operation violating the specification; a passive change is a stability change caused by a system failure, for example, a machine hardware failure.
As shown in Table 1, the artificial change class includes three change subclasses: service on-line and code change, configuration change, and instance capacity expansion/reduction. The service on-line and code change subclass includes three stability deterioration causes: code logic error, unreasonable exception handling, and non-compliance with development specifications. The configuration change subclass includes one stability deterioration cause: incorrect or unreasonable configuration. The instance capacity expansion/reduction subclass includes one stability deterioration cause: the traffic carried by instances not matching expectations.
The passive change class includes four change subclasses: traffic change, service capacity change, other service failure, and machine change. The traffic change subclass includes five stability deterioration causes: traffic decrease due to failure of the external network channel of an external network service; traffic decrease due to failure of the internal network channel of an internal network service; traffic increase due to attack traffic; traffic increase due to campaign operation or peak periods; and traffic increase due to abnormal upstream calls to an internal network service. The service capacity change subclass includes three stability deterioration causes: an increase in the average service response time, an increase in CPU (Central Processing Unit) utilization within the service package, and an increase in memory utilization within the service package. The other service failure subclass includes four stability deterioration causes: database problems (Structured Query Language (SQL) statements, large transactions, and connections being used up); cache middleware problems (large keys, cache penetration, cache breakdown, cache avalanche); message queue middleware problems (message loss, repeated consumption); and downstream or third-party service failures. The machine change subclass includes two stability deterioration causes: basic environment upgrade/downgrade and machine hardware failure.
Specifically, the micro-service stability change classification and cause correspondence table may be used to identify the target change information of a single micro-service, so as to obtain a target identification result that includes the change major class, the change subclass and the stability deterioration cause corresponding to the target change information. For example, the table is used to identify the target change information of a single micro-service 1, and the target identification result shows that the change major class is artificial change, the change subclass is service on-line and code change, and the stability deterioration causes are code logic error and unreasonable exception handling. For another example, the table is used to identify the target change information of a single micro-service 2, and the target identification result shows that the change major class is passive change, the change subclass is service capacity change, and the stability deterioration causes are increased CPU utilization within the service package and increased memory utilization within the service package.
Based on the above steps S141 to S142, by obtaining the target classification rule and identifying the target change information by using the target classification rule to obtain the target identification result, the specific major and minor categories and deterioration reasons corresponding to the target change information can be determined, so as to evaluate the stability of a single micro service with respect to the target identification result.
TABLE 1 micro-service stability variation Classification and reason correspondence Table
Figure BDA0003800276200000081
Optionally, in step S142, recognizing the target change information by using the target classification rule, and obtaining a target recognition result includes:
step S1421, identifying the target change information in the first classification range, and determining a first identification result, where the first identification result is used to indicate a change major class corresponding to the target change information;
step S1422, determining a second classification range according to the first recognition result, wherein the second classification range includes a plurality of variation subclasses corresponding to the first recognition result;
step S1423, identifying the target change information in the second classification range, and determining a second identification result, where the second identification result is used to indicate the change subclass and the change reason corresponding to the target change information.
Specifically, the first classification range represents the major classes of changes that affect micro-service stability, namely artificial changes and passive changes. The target change information is identified within these major classes to determine the change major class to which it corresponds.
Once the change major class corresponding to the target change information is known from the first identification result, the range of candidate change subclasses can be determined. For example, if the first identification result shows that the change major class is artificial change, the range of candidate change subclasses is service online and code change, configuration change, and instance scale-out/scale-in; that is, the change subclass corresponding to the target change information is at least one of these three.
After the range of candidate change subclasses has been determined, the target change information is identified again within that range, so that the change subclass and the change reason can be determined. For example, when the candidate range is service online and code change, configuration change, and instance scale-out/scale-in, identifying the target change information may determine that the change subclass is service online and code change and that the change reason is a code logic error.
Based on the above steps S1421 to S1423, the target change information is identified in the first classification range, the first identification result is determined, the second classification range is determined according to the first identification result, the target change information is further identified in the second classification range, and the second identification result is determined, so that the change major category, the change minor category, and the change reason corresponding to the target change information can be obtained, and the stability of a single microservice can be evaluated in detail with respect to the change major category, the change minor category, and the change reason.
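As an illustration only, the two-stage identification of steps S1421 to S1423 can be sketched as follows. The subclass names come from the description; the `ChangeInfo` record, its fields, and the `identify` function are hypothetical, since the description does not define how the change information is encoded.

```python
from dataclasses import dataclass

# Major classes and their subclasses, as listed in the description.
TAXONOMY = {
    "artificial": [
        "service online and code change",
        "configuration change",
        "instance scale-out/scale-in",
    ],
    "passive": [
        "traffic change",
        "service capacity change",
        "other service failure",
        "machine change",
    ],
}


@dataclass
class ChangeInfo:
    # Hypothetical fields: the description does not specify the record layout.
    initiated_by_user: bool
    subclass_hint: str
    reason: str


def identify(change: ChangeInfo):
    """Return (major class, subclass, reason) for one change record."""
    # Step S1421: decide the major class within the first classification range.
    major = "artificial" if change.initiated_by_user else "passive"
    # Step S1422: the second classification range holds only that class's subclasses.
    second_range = TAXONOMY[major]
    # Step S1423: the subclass must be found inside the second range.
    if change.subclass_hint not in second_range:
        raise ValueError(f"{change.subclass_hint!r} is not a subclass of {major!r}")
    return major, change.subclass_hint, change.reason
```

Restricting the second lookup to the subclasses of the already identified major class is what keeps the second identification result consistent with the first.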
Optionally, in step S16, the first classification range includes a first change, and determining at least one target observation indicator corresponding to the target recognition result includes:
step S161, in response to the second recognition result being a change subclass included in the first change, determining that the at least one target observation index includes at least one of: the latest online ticket status, engineering availability, and business availability, wherein the first change is used to represent a stability change caused by a user's preset.
Specifically, the first change represents an artificial change. When the micro-service stability changes because of a user's preset, at least one of the latest online ticket status, engineering availability, and business availability may be determined as the target observation index for evaluating the micro-service stability. For example, when a micro-service is being brought online and the rollout is manually terminated, so that new and old versions of the micro-service coexist and its stability changes, the latest online ticket status can be selected from these three indexes as the target observation index.
Based on the above step S161, by determining at least one target observation index in response to the second recognition result being a change subclass included in the first change, it is possible to determine the target observation index for evaluating the stability of a micro-service affected by the user's preset, so that a single micro-service can be evaluated in detail using the target observation index corresponding to the change subclass within the artificial change class.
Optionally, in step S16, the first classification range includes the second variation, and determining at least one target observation indicator corresponding to the target recognition result includes:
step S162, in response to the second recognition result being a change subclass included in the second change, determining at least one parameter index associated with the change subclass as the target observation index, wherein the second change is used to represent a stability change caused by a system anomaly.
Specifically, the second change represents a passive change. The parameter indexes associated with the change subclasses of the passive change class may include the packet loss rate, the round-trip time (RTT), the bandwidth utilization, the number of unhealthy external network access points, the number of hijacked domain names, and the like.
When the second recognition result is a change subclass included in the passive change major class, at least one parameter index associated with that subclass may be determined as the target observation index for evaluating micro-service stability. For example, when the second recognition result is the traffic change subclass of the passive change major class, the packet loss rate and the RTT associated with the traffic change subclass may be determined as the target observation indexes.
Based on the above step S162, by determining at least one parameter index associated with the change subclass as the target observation index in response to the second recognition result being the change subclass included in the second change, it is possible to determine the target observation index for evaluating the stability of the micro-service affected by the system anomaly, so as to perform a detailed evaluation of a single micro-service using the target observation index corresponding to the change subclass in the passive change.
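A minimal sketch of how steps S161 and S162 might select observation indexes is given below. The index names are taken from the description; the dictionary layout and the function `target_indicators` are assumptions made for illustration, and only two passive subclasses are filled in.

```python
# Indexes named in the description for the artificial change class (step S161).
ARTIFICIAL_INDICATORS = [
    "latest online ticket status",
    "engineering availability",
    "business availability",
]

# Indexes named in the description for passive subclasses (step S162).
# Only two subclasses are listed here; the others would be added the same way.
PASSIVE_INDICATORS = {
    "traffic change": [
        "packet loss rate",
        "RTT",
        "bandwidth utilization",
        "unhealthy external network access points",
        "hijacked domain names",
    ],
    "service capacity change": ["micro-service capacity water level"],
}


def target_indicators(major_class: str, subclass: str) -> list:
    """Return the candidate observation indexes for an identification result."""
    if major_class == "artificial":
        # Every artificial subclass draws from the same three indexes.
        return list(ARTIFICIAL_INDICATORS)
    if major_class == "passive":
        # Passive subclasses each have their own associated parameter indexes.
        return list(PASSIVE_INDICATORS[subclass])
    raise ValueError(f"unknown major class: {major_class!r}")
```

A caller would then keep at least one of the returned indexes as the target observation index.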
Optionally, in step S18, performing quantitative evaluation on the target microservice based on at least one target observation indicator, and obtaining the stability evaluation result includes:
step S181, obtaining a target evaluation rule, wherein the target evaluation rule is used for recording a mapping relation between at least one target observation index and an evaluation standard;
step S182, calculating an evaluation value corresponding to the target micro-service, wherein the evaluation value is used for representing a quantitative value corresponding to the at least one target observation index during the running of the target micro-service;
step S183, determining the stability evaluation result based on the evaluation value.
Specifically, the target evaluation rule is used for representing a mapping relationship between at least one target observation index and the evaluation standard, and different variation subclasses have different corresponding relationships between the observation indexes and the scoring standards. Based on the mapping relation between the target observation index and the evaluation standard, the evaluation value corresponding to the target micro-service can be calculated to obtain the quantitative value corresponding to the target observation index, and then the stability evaluation result is obtained according to the evaluation value, so that the quantitative evaluation of the stability of the target micro-service is realized.
Based on the steps S181 to S183, the stability of the target micro service can be quantitatively evaluated by obtaining the target evaluation rule, calculating the evaluation value corresponding to the target micro service, and obtaining the stability evaluation result according to the evaluation value.
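One possible way to hold a target evaluation rule as data is sketched below. The two sample rules reuse scoring criteria stated later in the description for external network access points; the names `RULES` and `evaluate` and the dictionary format are illustrative assumptions, not the claimed implementation.

```python
# Step S181: a target evaluation rule maps each observation index to an
# evaluation standard, expressed here as a function from value to points.
RULES = {
    # Peak day-level bandwidth utilization above 50%: 10 points.
    "bandwidth_peak_pct": lambda v: 10 if v > 50 else 0,
    # One or more unhealthy external network access points: 10 points.
    "unhealthy_access_points": lambda v: 10 if v >= 1 else 0,
}


def evaluate(observations: dict, rules: dict = RULES) -> dict:
    """Step S182: score every observed index that has a rule, then total."""
    breakdown = {
        name: rules[name](value)
        for name, value in observations.items()
        if name in rules
    }
    # Step S183 works from this total; a higher value means worse stability.
    return {"breakdown": breakdown, "evaluation_value": sum(breakdown.values())}
```

Keeping the rules in a table means that a different change subclass can be evaluated by swapping the table rather than the code.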
The mapping relationship between the target observation indexes and the evaluation standards can be represented by a target observation index and evaluation standard correspondence table, and different variation subclasses correspond to different target observation index and evaluation standard correspondence tables.
For service online and code change events, corresponding online tickets are recorded. An online ticket has two states, final and non-final: the final state indicates that the rollout has completed, and the non-final state indicates that it has not. The final state is further divided into success and termination, where success means that all instances were deployed successfully and termination means that the rollout was stopped by manual intervention.
Generally, bringing a micro-service online cannot be completed in a short time; processes such as gray-scale release and staged release are usually required so that one can gradually observe whether the rollout behaves as expected, but the rollout usually takes no more than 3 days. Therefore, when a service has an online ticket that stays in a non-final state for a long time, new and old versions coexist, uncertainty is introduced, and the stability of the micro-service is affected. In addition, when the final state of the latest online ticket is termination, the last rollout was stopped by manual intervention and the service was neither rolled back to the previous version nor fully updated to the new version, so new and old versions may again coexist and affect the stability of the micro-service.
Therefore, the effect of a service online and code change can be evaluated using the status of the service's latest online ticket.
In addition, the impact of bringing a service online can be evaluated using service availability indexes, which are divided into two types: service engineering availability and service business availability. Engineering availability judges the running state of the server from the HTTP (Hyper Text Transfer Protocol) status code returned by the service: when the HTTP status code is greater than or equal to 499, the request is considered failed. With the number of failed requests defined as PVLost and the total number of requests defined as PV, the service engineering availability (Engineering Availability) is calculated as shown in formula (1):
$$\text{Engineering Availability} = \frac{PV - PVLost}{PV} \times 100\% \tag{1}$$
However, even when the HTTP response is normal, the business result returned may still be wrong because of faulty code logic; such cases can be evaluated with service business availability. With the number of requests that return a business error defined as BPVLost and the total number of requests defined as PV, the service business availability (Business Availability) is calculated as shown in formula (2):
$$\text{Business Availability} = \frac{PV - BPVLost}{PV} \times 100\% \tag{2}$$
Therefore, a change in the service online and code change subclass can be evaluated using the latest online ticket status of the service, the service engineering availability, and the service business availability; the specific observation indexes and scoring criteria are shown in Table 2. Table 2 is the correspondence table between observation indexes and scoring criteria for the service online and code change subclass of the artificial change class.
Table 2 service on-line and code change subclass observation index and scoring standard correspondence table
Figure BDA0003800276200000122
As shown in Table 2, the observation indexes include the latest online ticket status of the service, service engineering availability, and service business availability. The scoring criteria for the latest online ticket status are: 5 points are recorded when the latest online ticket is in the final state and that state is termination; 5 points are recorded when the latest online ticket is in a non-final state and has remained unfinished for more than 3 days. The scoring criteria for service engineering availability are: 5 points are recorded when the day-level engineering availability is between 99.91% and 99.95%; 10 points when it is between 99.51% and 99.9%; and 15 points when it is less than 99.5%. The scoring criteria for service business availability are: 5 points are recorded when the day-level business availability is between 99.91% and 99.95%; 10 points when it is between 99.51% and 99.9%; and 15 points when it is less than 99.5%.
Using Table 2, the service stability affected by a change in the service online and code change subclass can be evaluated in detail, and the evaluation can be performed at day-level granularity or finer. For example, when the latest online ticket is in the final state and that state is termination, 5 points are recorded. For another example, when the number of failed requests PVLost in one day is 6 and the total number of requests PV is 1000, formula (1) gives a day-level engineering availability of (1000 − 6) / 1000 = 99.4%, which is less than 99.5%, so the evaluation value according to Table 2 is 15 points.
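The availability calculation and the Table 2 scoring can be sketched as follows. The availability expression, (PV − lost) / PV, is the form that reproduces the 99.4% worked example above. The function names are hypothetical, and because the scoring bands in Table 2 leave small gaps (for instance between 99.9% and 99.91%), the sketch treats the bands as contiguous, which is an assumption.

```python
def availability(total: int, lost: int) -> float:
    """Availability in percent, (PV - lost) / PV * 100.

    Pass PVLost for engineering availability or BPVLost for business availability.
    """
    if total <= 0:
        raise ValueError("total request count must be positive")
    return (total - lost) / total * 100.0


def availability_points(pct: float) -> int:
    """Table 2 bands, treated as contiguous (assumption)."""
    if pct > 99.95:
        return 0   # above every scored band
    if pct > 99.90:
        return 5   # 99.91% to 99.95%
    if pct > 99.50:
        return 10  # 99.51% to 99.9%
    return 15      # below 99.5%


def ticket_points(is_final: bool, terminated: bool, days_open: float) -> int:
    """Score the latest online ticket status per Table 2."""
    if is_final:
        # A final state of termination means the rollout was stopped manually.
        return 5 if terminated else 0
    # A non-final ticket is scored only once it has been open for over 3 days.
    return 5 if days_open > 3 else 0


# Worked example from the description: PVLost = 6, PV = 1000.
example_pct = availability(1000, 6)
example_points = availability_points(example_pct)
```

With the example values, `example_pct` is approximately 99.4 and `example_points` is 15, matching the text.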
In addition, since the impact of a configuration change on service stability is similar to that of bringing a service online, the micro-service stability affected by a change in the configuration change subclass can also be evaluated with Table 2. Since instance scale-out and scale-in affect service stability through the traffic received by each instance deviating from expectations, the micro-service stability affected by a change in the instance scale-out/scale-in subclass can also be evaluated with Table 2.
By acquiring the service on-line and code change subclass observation indexes and the scoring standard corresponding table, the evaluation value corresponding to the target micro-service influenced by the service on-line and code change can be calculated, and then the stability evaluation result is obtained according to the evaluation value, so that the day-level quantitative evaluation of the stability of the target micro-service is realized.
In addition to the evaluation of the stability of the micro-services affected by the artificial changes in table 2, the stability of the micro-services affected by the passive changes may also be evaluated using other corresponding tables of observation indicators and scoring criteria. Table 3 shows a correspondence relationship between the observation index and the score criterion for the traffic variation subclass.
As shown in Table 3, the observation indexes of the traffic change subclass include the packet loss rate of the micro-service instance Internet Protocol (IP) address, the round-trip time (RTT) of the micro-service instance IP, the bandwidth utilization of the external network access points of an external network service, the number of unhealthy external network access points of an external network service, and the number of domain name hijackings of a service accessed by domain name.
The scoring criteria for the packet loss rate of the micro-service instance IP may include the following levels. Within the same machine room: 5 points are recorded when the day-level packet loss rate is between 0.069% and 0.347%; 10 points when it is between 0.348% and 0.69%; and 15 points when it is more than 0.691%. Across machine rooms in the same region: 5 points are recorded when the day-level packet loss rate is between 0.069% and 0.347%; 10 points when it is between 0.348% and 0.69%; and 15 points when it is more than 0.691%. Across regions and machine rooms: 5 points are recorded when the day-level packet loss rate is between 0.069% and 0.347%; 10 points when it is between 0.348% and 0.69%; and 15 points when it is more than 0.691%.
The scoring criteria for the RTT of the micro-service instance IP may include the following levels. Within the same machine room: 10 points are recorded when the day-level average RTT is between 0.3 milliseconds (ms) and 0.6 ms; 20 points when it is more than 0.61 ms. Across machine rooms in the same region: 5 points are recorded when the day-level average RTT is between 4 ms and 5 ms; 10 points when it is more than 5.1 ms. Across regions and machine rooms: 2 points are recorded when the day-level average RTT is between 40 ms and 50 ms; 5 points when it is more than 51 ms.
The scoring criterion for the bandwidth utilization of the external network access points may include recording 10 points when the peak day-level bandwidth utilization is greater than 50%.
The scoring criterion corresponding to the number of unhealthy external network access points of the external network service may include scoring 10 points when the number of unhealthy external network access points is greater than or equal to 1.
The scoring criteria corresponding to the domain name hijacking number of the domain name access service may include that when the domain name hijacking number is greater than or equal to 1, 10 points are counted.
Table 3 correspondence table between traffic variation subclass observation index and scoring standard
Figure BDA0003800276200000141
Using Table 3, the service stability affected by a change in the traffic change subclass can be evaluated in detail, and the evaluation can be performed at day-level granularity or finer. For example, the day-level packet loss rate across machine rooms in the same region is 0.1%, so 5 points are recorded; the day-level packet loss rate across regions and machine rooms is 0.35%, so 10 points are recorded; accumulating the scores gives an evaluation value of 15 points for the service. For another example, the day-level average RTT within the same machine room is 0.45 ms, so 10 points are recorded; the peak day-level bandwidth utilization is 60%, so 10 points are recorded; the number of unhealthy external network access points is 1, so 10 points are recorded; accumulating all the scores gives an evaluation value of 30 points for the service.
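The Table 3 scoring for the traffic change subclass can be sketched as below. The thresholds are those stated in the description; the scope keys, the function names, and the choice to treat each band as starting at its stated lower bound are illustrative assumptions.

```python
# (lower bound, points), ordered from the highest band down.
LOSS_BANDS = [(0.691, 15), (0.348, 10), (0.069, 5)]  # packet loss rate in %
RTT_BANDS = {  # day-level average RTT in ms, per network scope
    "same_room": [(0.61, 20), (0.3, 10)],
    "same_region_cross_room": [(5.1, 10), (4.0, 5)],
    "cross_region_cross_room": [(51.0, 5), (40.0, 2)],
}


def band_points(value: float, bands) -> int:
    """Return the points of the highest band whose lower bound is reached."""
    for lower, points in bands:
        if value >= lower:
            return points
    return 0


def score_traffic(loss=None, rtt=None, bandwidth_peak_pct=0.0,
                  unhealthy_access_points=0, hijacked_domains=0) -> int:
    """Accumulate Table 3 points; loss and rtt map a scope name to a value."""
    total = sum(band_points(v, LOSS_BANDS) for v in (loss or {}).values())
    total += sum(band_points(v, RTT_BANDS[scope]) for scope, v in (rtt or {}).items())
    total += 10 if bandwidth_peak_pct > 50 else 0
    total += 10 if unhealthy_access_points >= 1 else 0
    total += 10 if hijacked_domains >= 1 else 0
    return total


# The two worked examples from the description.
example_1 = score_traffic(loss={"same_region_cross_room": 0.1,
                                "cross_region_cross_room": 0.35})
example_2 = score_traffic(rtt={"same_room": 0.45}, bandwidth_peak_pct=60,
                          unhealthy_access_points=1)
```

Here `example_1` evaluates to 15 and `example_2` to 30, in line with the two examples in the text.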
By obtaining the corresponding relation table of the flow variation subclass observation indexes and the grading standards, the evaluation value corresponding to the target micro-service influenced by the flow variation can be calculated, and then the stability evaluation result is obtained according to the evaluation value, so that the day-level quantitative evaluation of the stability of the target micro-service can be realized.
For the service capacity change subclass, what is usually evaluated is the ability of the system to carry traffic. Similar to the capacity of a container to hold liquid, service capacity is defined as a quantification of the system's traffic-carrying ability. For general services, the capacity model is built on the relationship among traffic (Drivers), availability and quality of service (QoS), and resources (Resources). An increase in Drivers brings resource consumption and QoS degradation. For CPU-intensive services, the service capacity water level is defined as shown in formula (3), where CPU usage represents the resource usage; QPS is the number of requests per second and represents the traffic; and the average response time is the mean time the service takes to process all requests and represents availability and quality of service.
Figure BDA0003800276200000151
If the traffic carried by a micro-service exceeds what the service can bear, the capacity water level rises. In a typical architecture with two machine rooms in the same region, each machine room of a micro-service must be able to absorb the traffic of the other machine room if that room stops serving; that is, the water level of a single machine room must stay below 50% at all times.
The correspondence between the observation index for the subclass of service capacity variation and the scoring criterion is shown in table 4. The observation index of the service capacity variation subclass comprises a micro service capacity water level, and the scoring standard comprises that when the water level of a single machine room exceeds 50% at any time, 20 points are recorded.
Table 4 correspondence table between service capacity change subclass observation index and scoring standard
Observation index | Scoring criteria
Micro-service capacity water level | The water level of a single machine room exceeds 50% at any time: 20 points recorded
By obtaining the corresponding relation table of the observation indexes of the subclasses of the service capacity variation and the scoring standards, the evaluation value corresponding to the target micro service influenced by the service capacity variation can be calculated, and then the stability evaluation result is obtained according to the evaluation value, so that the quantitative evaluation of the stability of the target micro service can be realized.
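The 50% rule can be illustrated with a short sketch. The scoring function follows Table 4. The failover projection assumes that the failed room's traffic is spread evenly over the surviving rooms and that the water level scales linearly with traffic; both are simplifying assumptions introduced here, and the function names are hypothetical.

```python
def level_after_failover(level: float, rooms: int = 2) -> float:
    """Projected water level of a surviving room after one room stops serving.

    Assumes an even traffic split and a water level linear in traffic.
    """
    if rooms < 2:
        raise ValueError("failover needs at least two machine rooms")
    return level * rooms / (rooms - 1)


def capacity_points(levels) -> int:
    """Table 4: 20 points if any single-room water level sample exceeds 50%."""
    return 20 if any(level > 0.5 for level in levels) else 0
```

Under these assumptions, a room running at a 40% water level would rise to about 80% after its peer fails, while a room already above 50% would exceed full capacity, which is the situation the 20-point penalty flags.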
Faults in the other service failure subclass of the passive change major class can be divided into two types: downstream problems caused by unreasonable use, such as slow database queries or exhausted connections, and faults of the downstream service itself. The correspondence between the observation indexes and the scoring criteria for the other service failure subclass is shown in Table 5.
As shown in Table 5, the observation indexes of the other service failure subclass may include, for the database: the number of slow Structured Query Language (SQL) queries, the number of large transactions, the current number of connections as a percentage of the total number of connections, the maximum CPU utilization, and the maximum memory utilization; for the cache middleware: the number of large keys, the number of slow queries, the cache hit rate, the maximum CPU utilization, and the maximum memory utilization; and for the message queue middleware: the lag of a single topic (topic lag), the maximum CPU utilization, the maximum memory utilization, the input/output (I/O) utilization of each disk partition, and the utilization of each disk partition.
The scoring criteria for the other service failure subclass may include the following. For the database: 2 points are recorded for each slow SQL query; 5 points for each large transaction; 10 points when the current number of connections is between 40% and 79% of the total number of connections, and 20 points when it is more than 80%; 10 points when the maximum CPU utilization is between 40% and 79%, and 20 points when it is more than 80%; 10 points when the maximum memory utilization is between 40% and 79%, and 20 points when it is more than 80%. For the cache middleware: 2 points are recorded for each large key; 2 points for each slow query; 10 points when the cache hit rate is between 40% and 60%, and 20 points when it is less than 39%; 10 points when the maximum CPU utilization is between 40% and 79%, and 20 points when it is more than 80%; 10 points when the maximum memory utilization is between 40% and 79%, and 20 points when it is more than 80%. For the message queue middleware: 5 points are recorded when the lag of a single topic keeps increasing for 30 minutes; 10 points when the maximum CPU utilization is between 40% and 79%, and 20 points when it is more than 80%; 10 points when the maximum memory utilization is between 40% and 79%, and 20 points when it is more than 80%; 10 points when the I/O utilization of a disk partition is between 40% and 79%, and 20 points when it is more than 80%; 10 points when the utilization of a disk partition is between 40% and 79%, and 20 points when it is more than 80%.
Table 5 corresponding relationship table between other service fault subclasses observation index and scoring standard
Figure BDA0003800276200000161
Figure BDA0003800276200000171
Using Table 5, the service stability affected by a change in the other service failure subclass can be evaluated in detail. For example, five slow SQL queries appear in the database, so 10 points are recorded; the maximum CPU utilization of the database is 82%, so 20 points are recorded; the maximum CPU utilization of the cache middleware is 83%, so 20 points are recorded; accumulating the scores gives a service evaluation value of 50 points.
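A sketch of the Table 5 scoring follows. The point values are those in the description; the function and parameter names are hypothetical, and the utilization bands are treated as contiguous (40% up to 80% scores 10, 80% and above scores 20), which is an assumption because the description leaves the boundaries between bands unspecified.

```python
def utilization_points(pct: float) -> int:
    """Shared band for connection share, CPU, memory, disk and I/O utilization."""
    if pct >= 80:
        return 20
    if pct >= 40:
        return 10
    return 0


def hit_rate_points(pct: float) -> int:
    """Cache hit rate: lower is worse (40% to 60% scores 10, below that 20)."""
    if pct < 40:
        return 20
    if pct <= 60:
        return 10
    return 0


def score_other_faults(slow_sql=0, large_transactions=0, big_keys=0,
                       cache_slow_queries=0, utilizations=(),
                       cache_hit_rate=None, topic_lag_growing_30min=False) -> int:
    """Accumulate Table 5 points for database, cache and message queue metrics."""
    total = 2 * slow_sql + 5 * large_transactions        # per-occurrence database items
    total += 2 * big_keys + 2 * cache_slow_queries       # per-occurrence cache items
    total += sum(utilization_points(u) for u in utilizations)
    if cache_hit_rate is not None:
        total += hit_rate_points(cache_hit_rate)
    if topic_lag_growing_30min:
        total += 5
    return total


# Worked example: five slow SQL queries, database CPU 82%, cache CPU 83%.
example = score_other_faults(slow_sql=5, utilizations=[82, 83])
```

The value of `example` is 50, which agrees with the accumulated score in the text.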
By obtaining the corresponding relation table of the observation indexes of the subclasses of other service faults and the scoring standard, the evaluation value corresponding to the target micro service influenced by other service faults can be calculated, and then the stability evaluation result is obtained according to the evaluation value, so that the quantitative evaluation of the stability of the target micro service can be realized.
The correspondence between the observation index for the machine change subclass and the scoring criterion is shown in table 6. The machine is the bottom layer of the system, and the change of kernel parameters and the failure of hardware can cause serious influence on the stability of service. The hardware fault can be evaluated according to a system log (syslog) and an out-of-band log of a machine where the service is located. The utilization rate of a CPU of the whole machine, the utilization rate of a memory of the whole machine, the utilization rate of a disk, the utilization rate of an I/O of the disk and the utilization rate of a bandwidth of a network card are also important indexes for evaluating the performance of the machine.
If the machine is down or the data cannot be recovered, whether disaster recovery and backup exist needs to be judged. If the backup data of the important data (or the original important data information) generated by the application system exists, the service can continue to work normally after the backup data is restored. For general microservices, day-level data backup needs to be guaranteed.
As shown in Table 6, the observation indexes of the machine change subclass may include the number of hardware faults of the machine on which the service runs; the maximum CPU utilization, the maximum memory utilization, the I/O utilization of each disk partition, the utilization of each disk partition, and the maximum bandwidth utilization of the network card of that machine; and the number of day-level backups of the service data.
The scoring criteria for the machine change subclass may include: 10 points are recorded for each machine hardware fault alarm; 10 points when the maximum CPU utilization of the machine is between 40% and 79%, and 20 points when it is more than 80%; 10 points when the maximum memory utilization of the machine is between 40% and 79%, and 20 points when it is more than 80%; 10 points when the I/O utilization of a disk partition of the machine is between 40% and 79%, and 20 points when it is more than 80%; 10 points when the utilization of a disk partition of the machine is between 40% and 79%, and 20 points when it is more than 80%; 10 points when the maximum bandwidth utilization of the machine's network card is between 40% and 79%, and 20 points when it is more than 80%; and 10 points when the number of day-level backups of the service data is less than 1.
Table 6 table of correspondence between observation indexes of subclasses of machine variations and scoring standards
Figure BDA0003800276200000181
Changes in the machine change subclass can be evaluated in detail using the observation indexes and scoring criteria shown in Table 6. For example, three machine hardware fault alarms occur, so 30 points are recorded; the maximum CPU utilization of the machine is 50%, so 10 points are recorded; the maximum memory utilization of the machine is 49%, so 10 points are recorded; accumulating the scores gives an evaluation value of 50 points for the micro-service.
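The machine change example can be reproduced with the sketch below. Point values are taken from the description; the function names are hypothetical, the utilization bands are treated as contiguous, and the worked example is assumed to have at least one day-level backup, since the text does not score that item.

```python
def machine_utilization_points(pct: float) -> int:
    """CPU, memory, disk, disk I/O and network card bands from Table 6."""
    if pct >= 80:
        return 20
    if pct >= 40:
        return 10
    return 0


def score_machine(hardware_alarms: int, utilizations, daily_backups: int) -> int:
    """Accumulate Table 6 points for one machine."""
    total = 10 * hardware_alarms                      # 10 points per fault alarm
    total += sum(machine_utilization_points(u) for u in utilizations)
    total += 10 if daily_backups < 1 else 0           # missing day-level backup
    return total


# Worked example: 3 fault alarms, CPU 50%, memory 49%, one backup assumed.
example = score_machine(3, [50, 49], daily_backups=1)
```

The resulting `example` value is 50, the same total as in the text.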
By obtaining the corresponding relation table of the machine change subclass observation indexes and the scoring standards, the evaluation value corresponding to the target micro service influenced by the machine change can be calculated, and then the stability evaluation result is obtained according to the evaluation value, so that the day-level quantitative evaluation of the target micro service stability is realized.
An evaluation value corresponding to the target micro service is calculated according to the obtained target evaluation rule, and the evaluation result of the micro service stability is obtained from the evaluation value. The higher the evaluation value, the worse the stability of the micro service. Because a single micro service can be quantitatively evaluated at day granularity, the evaluation takes little time, and the stability of the micro service can therefore be expressed by the fluctuation of the evaluation value over a period of time.
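One possible way to express the fluctuation of day-level evaluation values is sketched below. The text does not say how fluctuation is measured, so the use of the range and the population standard deviation is an assumption, and the function name `score_fluctuation` is hypothetical.

```python
from statistics import mean, pstdev


def score_fluctuation(daily_scores: list[float]) -> dict[str, float]:
    """Summarize a series of day-level evaluation values.

    Assumption: fluctuation is summarized by the range and the population
    standard deviation; the source does not prescribe a specific measure.
    """
    if not daily_scores:
        raise ValueError("at least one daily evaluation value is required")
    return {
        "mean": mean(daily_scores),
        "range": max(daily_scores) - min(daily_scores),
        "stdev": pstdev(daily_scores),
    }


# Four hypothetical daily evaluation values for one micro service.
summary = score_fluctuation([50, 30, 30, 50])
```

Under this sketch, a higher mean would indicate worse stability on average, and a larger range or standard deviation would indicate stronger day-to-day fluctuation.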
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
In this embodiment, a device for evaluating service stability is further provided, and the device is used to implement the foregoing embodiments and preferred embodiments, which have already been described and are not described again. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or a combination of software and hardware is also possible and contemplated.
Fig. 2 is a block diagram of an apparatus for evaluating service stability according to an embodiment of the present invention, and as shown in fig. 2, the apparatus for evaluating service stability includes:
the detection module 201 is configured to detect target change information of a target micro-service in an operation process, where the target change information is used to indicate a stability change condition of the target micro-service;
the identification module 202 is configured to identify the target change information to obtain a target identification result, where the target identification result is used to represent classification information corresponding to the target change information;
the determining module 203 is configured to determine at least one target observation indicator corresponding to the target recognition result;
the evaluation module 204 is configured to perform quantitative evaluation on the target micro service based on at least one target observation index to obtain a stability evaluation result corresponding to the target micro service.
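The cooperation of the four modules listed above can be sketched as a simple pipeline. This is a hedged illustration, not the apparatus itself: the class name `StabilityEvaluator`, the callable signatures, and the stub implementations are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Sequence


@dataclass
class StabilityEvaluator:
    """Chains the four modules; each callable stands in for one module."""
    detect: Callable[[str], Mapping]               # detection module 201
    identify: Callable[[Mapping], str]             # identification module 202
    determine: Callable[[str], Sequence[str]]      # determining module 203
    evaluate: Callable[[str, Sequence[str]], int]  # evaluation module 204

    def run(self, service: str) -> int:
        change_info = self.detect(service)         # target change information
        recognition = self.identify(change_info)   # target identification result
        indicators = self.determine(recognition)   # target observation indicators
        return self.evaluate(service, indicators)  # stability evaluation result


# Hypothetical stubs, used only to show the data flow between modules.
evaluator = StabilityEvaluator(
    detect=lambda service: {"service": service, "kind": "machine"},
    identify=lambda info: info["kind"],
    determine=lambda result: ["max_cpu_utilization"] if result == "machine" else [],
    evaluate=lambda service, indicators: 10 * len(indicators),
)
result = evaluator.run("order-service")
```

Passing the modules in as callables keeps each stage replaceable, which mirrors the statement that the modules may be implemented in software, hardware, or a combination of both.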
Optionally, the apparatus for evaluating service stability further includes: an obtaining module 205, configured to obtain a target classification rule; the identification module 202 is further configured to: identify the target change information by using the target classification rule to obtain the target identification result.
Optionally, the identification module 202 is further configured to: identify the target change information in a first classification range and determine a first identification result, wherein the first identification result is used for indicating a large change class corresponding to the target change information; determine a second classification range according to the first identification result, wherein the second classification range comprises a plurality of change subclasses corresponding to the first identification result; and identify the target change information in the second classification range and determine a second identification result, wherein the second identification result is used for indicating a change subclass and a change reason corresponding to the target change information.
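The two-stage identification can be sketched as a lookup over a nested rule: the first match fixes the large change class, and only the subclasses of that class are searched afterwards. Everything concrete here is an assumption made for illustration, including the rule layout, the `planned` and `kind` fields of the change information, and the placement of a `machine` subclass under the second change. The change reason is omitted for brevity.

```python
from typing import Callable, Mapping, Optional, Tuple

Predicate = Callable[[Mapping], bool]
# large change class -> (class predicate, {change subclass -> subclass predicate})
Rule = Mapping[str, Tuple[Predicate, Mapping[str, Predicate]]]


def classify(change_info: Mapping, rule: Rule) -> Optional[Tuple[str, Optional[str]]]:
    """Return (large change class, change subclass), or None if no class matches."""
    for major, (is_major, subclasses) in rule.items():
        if not is_major(change_info):
            continue
        # The first identification result fixes the second classification range.
        for subclass, is_subclass in subclasses.items():
            if is_subclass(change_info):
                return major, subclass
        return major, None  # class identified, but no subclass matched
    return None


# Hypothetical classification rule; field and class names are illustrative only.
EXAMPLE_RULE: Rule = {
    "first_change": (
        lambda c: bool(c.get("planned")),
        {"release": lambda c: c.get("kind") == "release"},
    ),
    "second_change": (
        lambda c: not c.get("planned"),
        {"machine": lambda c: c.get("kind") == "machine"},
    ),
}

identified = classify({"planned": False, "kind": "machine"}, EXAMPLE_RULE)
```

Restricting the second search to the subclasses of the already identified class avoids testing subclass predicates that cannot apply.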
Optionally, the determining module 203 is further configured to: in response to the second identification result being a change subclass included in the first change, determine that the at least one target observation indicator includes at least one of: latest on-line list status, engineering availability, and service availability, wherein the first change is used to represent a stability change caused by a user's preset.
Optionally, the determining module 203 is further configured to: in response to the second identification result being a change subclass included in the second change, determine at least one parameter index associated with the change subclass as the target observation indicator, wherein the second change is used to represent a stability change caused by a system abnormality.
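The two branches of the determining module can be sketched as a small selection function. The identifiers below are hypothetical, and listing the Table 6 indicators under a `machine` subclass of the second change is an assumption made only for this illustration.

```python
from typing import Optional, Tuple

# Indicators named in the text for change subclasses of the first change.
FIRST_CHANGE_INDICATORS: Tuple[str, ...] = (
    "latest_online_list_status",
    "engineering_availability",
    "service_availability",
)

# Hypothetical mapping from a second-change subclass to its parameter indexes.
SUBCLASS_INDICATORS = {
    "machine": (
        "hardware_fault_count",
        "max_cpu_utilization",
        "max_memory_utilization",
        "disk_partition_io_utilization",
        "disk_partition_utilization",
        "max_nic_bandwidth_utilization",
        "daily_backup_count",
    ),
}


def target_indicators(major: str, subclass: Optional[str]) -> Tuple[str, ...]:
    """Select the target observation indicators for an identification result."""
    if major == "first_change":
        return FIRST_CHANGE_INDICATORS
    if major == "second_change":
        # Parameter indexes associated with the identified change subclass.
        return SUBCLASS_INDICATORS.get(subclass or "", ())
    raise ValueError(f"unknown large change class: {major}")


selected = target_indicators("second_change", "machine")
```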
Optionally, the obtaining module 205 is further configured to: acquire a target evaluation rule, wherein the target evaluation rule is used for recording the mapping relation between the at least one target observation indicator and the evaluation standard.
Optionally, the apparatus for evaluating service stability further includes a calculating module 206, configured to calculate an evaluation value corresponding to the target micro service, where the evaluation value is used to represent a quantitative value corresponding to the at least one target observation indicator in the operation process of the target micro service.
Optionally, the determining module 203 is further configured to: determine the stability evaluation result based on the evaluation value.
It should be noted that the above modules may be implemented by software or hardware. For the latter, this may be achieved in, but is not limited to, the following ways: the modules are all located in the same processor; or the modules are located in different processors in any combination.
Embodiments of the present invention also provide a non-volatile storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
Optionally, in the present embodiment, the storage medium may be configured to store a computer program for executing the following steps:
S1, detecting target change information of a target micro service in the operation process, wherein the target change information is used for representing the stability change condition of the target micro service;
S2, identifying the target change information to obtain a target identification result, wherein the target identification result is used for representing classification information corresponding to the target change information;
S3, determining at least one target observation index corresponding to the target identification result;
S4, carrying out quantitative evaluation on the target micro service based on the at least one target observation index to obtain a stability evaluation result corresponding to the target micro service.
Optionally, in this embodiment, the storage medium may include, but is not limited to: various media capable of storing computer programs, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Embodiments of the present invention also provide a processor for executing a program, wherein the program is configured to perform the steps in any of the above method embodiments when executed.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
S1, detecting target change information of a target micro service in the operation process, wherein the target change information is used for representing the stability change condition of the target micro service;
S2, identifying the target change information to obtain a target identification result, wherein the target identification result is used for representing classification information corresponding to the target change information;
S3, determining at least one target observation index corresponding to the target identification result;
S4, carrying out quantitative evaluation on the target micro service based on the at least one target observation index to obtain a stability evaluation result corresponding to the target micro service.
Embodiments of the present invention further provide an electronic device, including a memory and a processor, where the memory stores a computer program, and the processor is configured to execute the computer program to perform the steps in any one of the above method embodiments.
Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments and optional implementation manners, and this embodiment is not described herein again.
The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis, and reference may be made to the related description of other embodiments for parts that are not described in detail in a certain embodiment.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, a division of a unit may be a division of a logic function, and an actual implementation may have another division, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or may not be executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the related art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and various other media capable of storing program code.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that it is obvious to those skilled in the art that various modifications and improvements can be made without departing from the principle of the present invention, and these modifications and improvements should also be considered as the protection scope of the present invention.

Claims (10)

1. A method for evaluating service stability, comprising:
detecting target change information of a target micro service in the running process, wherein the target change information is used for representing the stability change condition of the target micro service;
identifying the target change information to obtain a target identification result, wherein the target identification result is used for representing classification information corresponding to the target change information;
determining at least one target observation index corresponding to the target identification result;
and quantitatively evaluating the target micro service based on the at least one target observation index to obtain a stability evaluation result corresponding to the target micro service.
2. The method of claim 1, wherein identifying the target change information and obtaining the target identification result comprises:
acquiring a target classification rule;
and identifying the target change information by using the target classification rule to obtain the target identification result.
3. The method of claim 2, wherein identifying the target change information by using the target classification rule, and obtaining the target identification result comprises:
identifying the target change information in a first classification range, and determining a first identification result, wherein the first identification result is used for representing a large change class corresponding to the target change information;
determining a second classification range according to the first identification result, wherein the second classification range comprises a plurality of change subclasses corresponding to the first identification result;
and identifying the target change information in the second classification range, and determining a second identification result, wherein the second identification result is used for indicating a change subclass and a change reason corresponding to the target change information.
4. The method of claim 3, wherein the first classification range includes a first change, and wherein determining the at least one target observation index corresponding to the target identification result includes:
in response to the second identification result being a change subclass contained in the first change, determining that the at least one target observation index comprises at least one of: latest on-line list status, engineering availability, and service availability, wherein the first change is used for representing a stability change caused by a user's preset.
5. The method of claim 3, wherein the first classification range includes a second change, and wherein determining the at least one target observation index corresponding to the target identification result includes:
in response to the second identification result being a change subclass contained in the second change, determining at least one parameter index associated with the change subclass as the target observation index, wherein the second change is used for representing a stability change caused by a system abnormality.
6. The method of claim 1, wherein quantitatively evaluating the target micro service based on the at least one target observation index to obtain the stability evaluation result comprises:
acquiring a target evaluation rule, wherein the target evaluation rule is used for recording a mapping relation between the at least one target observation index and an evaluation standard;
calculating an evaluation value corresponding to the target micro service, wherein the evaluation value is used for representing a quantitative value corresponding to the at least one target observation index in the running process of the target micro service;
determining the stability evaluation result based on the evaluation value.
7. An apparatus for evaluating service stability, comprising:
the detection module is used for detecting target change information of a target micro service in the operation process, wherein the target change information is used for representing the stability change condition of the target micro service;
the identification module is used for identifying the target change information to obtain a target identification result, wherein the target identification result is used for representing classification information corresponding to the target change information;
the determining module is used for determining at least one target observation index corresponding to the target recognition result;
and the evaluation module is used for carrying out quantitative evaluation on the target micro service based on the at least one target observation index to obtain a stability evaluation result corresponding to the target micro service.
8. A non-volatile storage medium, characterized in that a computer program is stored in the storage medium, wherein the computer program is arranged to execute the method for assessing service stability as claimed in any one of claims 1 to 6 when running.
9. A processor for running a program, wherein the program is configured to execute the method for assessing service stability of any one of claims 1 to 6 when running.
10. An electronic device comprising a memory and a processor, wherein the memory stores a computer program, and the processor is configured to execute the computer program to perform the method for evaluating service stability as claimed in any one of claims 1 to 6.
CN202210980612.6A 2022-08-16 2022-08-16 Service stability evaluation method and device, storage medium and electronic device Active CN115396341B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210980612.6A CN115396341B (en) 2022-08-16 2022-08-16 Service stability evaluation method and device, storage medium and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210980612.6A CN115396341B (en) 2022-08-16 2022-08-16 Service stability evaluation method and device, storage medium and electronic device

Publications (2)

Publication Number Publication Date
CN115396341A true CN115396341A (en) 2022-11-25
CN115396341B CN115396341B (en) 2023-12-05

Family

ID=84119855

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210980612.6A Active CN115396341B (en) 2022-08-16 2022-08-16 Service stability evaluation method and device, storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN115396341B (en)


Also Published As

Publication number Publication date
CN115396341B (en) 2023-12-05


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant