CN115396341B - Service stability evaluation method and device, storage medium and electronic device

Info

Publication number
CN115396341B
Authority
CN
China
Prior art keywords
target
service
change
stability
evaluation
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210980612.6A
Other languages
Chinese (zh)
Other versions
CN115396341A (en)
Inventor
孙宏远
陈存利
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Du Xiaoman Technology Beijing Co Ltd
Original Assignee
Du Xiaoman Technology Beijing Co Ltd
Application filed by Du Xiaoman Technology Beijing Co Ltd
Priority to CN202210980612.6A
Publication of CN115396341A
Application granted
Publication of CN115396341B

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00: Arrangements for monitoring or testing data switching networks
    • H04L43/08: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0817: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
    • H04L43/04: Processing captured monitoring data, e.g. for logfile generation

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Data Mining & Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a service stability evaluation method and device, a storage medium, and an electronic device. The method comprises: detecting target change information of a target micro-service during operation, wherein the target change information represents the stability change of the target micro-service; identifying the target change information to obtain a target identification result, wherein the target identification result represents classification information corresponding to the target change information; determining at least one target observation index corresponding to the target identification result; and quantitatively evaluating the target micro-service based on the at least one target observation index to obtain a stability evaluation result corresponding to the target micro-service. The invention solves the technical problems in the related art that service stability is evaluated at too high a level and too infrequently.

Description

Service stability evaluation method and device, storage medium and electronic device
Technical Field
The present invention relates to the field of information processing, and in particular, to a service stability evaluation method, a service stability evaluation device, a storage medium, and an electronic device.
Background
In the related art, the evaluation method for service stability is mainly used for performing system-level evaluation, but quantitative evaluation is not performed for the stability of the low-level micro-service, so that the evaluation level of the related art is too high, and the evaluation result cannot directly guide the responsible team of each micro-service to avoid the related stability risk. Furthermore, the evaluation of service stability in the related art generally requires an on-site evaluation by a related evaluation organization, which takes a long time, resulting in low evaluation frequency. In addition, since the online service is continuously updated and the stability thereof is continuously changed, the construction of the service stability cannot be effectively guided according to the last evaluation result within the interval time of two evaluations. As can be seen from the above, the related art has problems such as high evaluation level and low evaluation frequency of service stability.
In view of the above problems, no effective solution has been proposed at present.
Disclosure of Invention
Embodiments of the invention provide a service stability evaluation method and device, a storage medium, and an electronic device, to at least solve the technical problems in the related art that service stability is evaluated at too high a level and too infrequently.
According to one embodiment of the present invention, there is provided a service stability evaluation method, including:
detecting target change information of a target micro-service during operation, wherein the target change information represents the stability change of the target micro-service; identifying the target change information to obtain a target identification result, wherein the target identification result represents classification information corresponding to the target change information; determining at least one target observation index corresponding to the target identification result; and quantitatively evaluating the target micro-service based on the at least one target observation index to obtain a stability evaluation result corresponding to the target micro-service.
Optionally, identifying the target change information to obtain the target identification result includes: obtaining a target classification rule; and identifying the target change information by using the target classification rule to obtain the target identification result.
Optionally, identifying the target change information by using the target classification rule to obtain the target identification result includes: identifying the target change information within a first classification range and determining a first identification result, wherein the first identification result represents the change major class corresponding to the target change information;
determining a second classification range according to the first identification result, wherein the second classification range includes a plurality of change subclasses corresponding to the first identification result;
and identifying the target change information within the second classification range and determining a second identification result, wherein the second identification result represents the change subclass and the change reason corresponding to the target change information.
Optionally, the first classification range includes a first change, and determining at least one target observation index corresponding to the target identification result includes: in response to the second identification result being a change subclass included in the first change, determining that the at least one target observation index includes at least one of: the latest online order status, engineering availability, and service availability, wherein the first change represents a stability change caused by a preset made by a user.
Optionally, the first classification range includes a second change, and determining at least one target observation index corresponding to the target identification result includes: in response to the second identification result being a change subclass included in the second change, determining at least one parameter index associated with that change subclass as the target observation index, wherein the second change represents a stability change caused by a system anomaly.
Optionally, quantitatively evaluating the target micro-service based on the at least one target observation index to obtain the stability evaluation result includes:
acquiring a target evaluation rule, wherein the target evaluation rule records a mapping between the at least one target observation index and an evaluation standard;
calculating an evaluation value corresponding to the target micro-service, wherein the evaluation value represents the quantized value of the at least one target observation index during operation of the target micro-service; and determining the stability evaluation result based on the evaluation value.
According to one embodiment of the present invention, there is also provided an apparatus for evaluating service stability, including:
the detection module is used for detecting target change information of the target micro-service in the running process, wherein the target change information is used for representing the stability change condition of the target micro-service;
the identification module is used for identifying the target change information to obtain a target identification result, wherein the target identification result is used for representing classification information corresponding to the target change information;
the determining module is used for determining at least one target observation index corresponding to the target identification result;
and the evaluation module is used for quantitatively evaluating the target micro-service based on at least one target observation index to obtain a stability evaluation result corresponding to the target micro-service.
Optionally, the evaluation device of service stability further comprises an acquisition module, configured to acquire a target classification rule; and the identification module is also used for identifying the target change information by utilizing the target classification rule to obtain a target identification result.
Optionally, the identification module is further configured to: identify the target change information within a first classification range and determine a first identification result, wherein the first identification result represents the change major class corresponding to the target change information; determine a second classification range according to the first identification result, wherein the second classification range includes a plurality of change subclasses corresponding to the first identification result; and identify the target change information within the second classification range and determine a second identification result, wherein the second identification result represents the change subclass and the change reason corresponding to the target change information.
Optionally, the determining module is further configured to: in response to the second identification result being a change subclass included in the first change, determine that the at least one target observation index includes at least one of: the latest online order status, engineering availability, and service availability, wherein the first change represents a stability change caused by a preset made by a user.
Optionally, the determining module is further configured to: in response to the second identification result being a change subclass included in the second change, determine at least one parameter index associated with that change subclass as the target observation index, wherein the second change represents a stability change caused by a system anomaly.
Optionally, the obtaining module is further configured to: and obtaining a target evaluation rule, wherein the target evaluation rule is used for recording the mapping relation between at least one target observation index and the evaluation standard.
Optionally, the service stability evaluation device further includes: and the calculation module is used for calculating an evaluation value corresponding to the target micro-service, wherein the evaluation value is used for representing a quantized value corresponding to at least one target observation index in the running process of the target micro-service.
Optionally, the determining module is further configured to: and determining a stability evaluation result based on the evaluation value.
According to an embodiment of the present invention, there is also provided a nonvolatile storage medium in which a computer program is stored, wherein the computer program is configured to execute the method of evaluating service stability in any one of the above-described aspects when run.
According to an embodiment of the present invention, there is also provided a processor for running a program, wherein the program is configured to execute the method of evaluating service stability in any one of the above-mentioned aspects at runtime.
According to one embodiment of the present application, there is also provided an electronic device including a memory in which a computer program is stored, and a processor configured to run the computer program to perform the method of evaluating service stability in any one of the above.
In the embodiments of the application, target change information of a target micro-service is detected during operation and identified to obtain a target identification result; at least one target observation index corresponding to the target identification result is then determined, and the target micro-service is quantitatively evaluated based on the at least one target observation index to obtain the corresponding stability evaluation result. This lowers the evaluation level and raises the evaluation frequency, achieves the technical effect of quantitatively evaluating the stability of a single micro-service, and thereby solves the technical problems in the related art that service stability is evaluated at too high a level and too infrequently.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a flowchart of a service stability evaluation method according to one embodiment of the present invention;
FIG. 2 is a block diagram of a service stability evaluation apparatus according to one embodiment of the present invention.
Detailed Description
To help those skilled in the art better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Service stability evaluation methods in the related art mainly evaluate stability at the system level, which includes evaluating the capability to keep the system running stably, that is, the capability to troubleshoot a fault and restore operation when the system fails. However, the stability of the low-level micro-services is not quantitatively evaluated, so the evaluation level of the related art is too high, and the evaluation result cannot directly guide the team responsible for each micro-service in avoiding the related stability risks. In addition, service stability evaluation in the related art generally requires an on-site evaluation by an evaluation organization, which takes a long time, so the related art cannot evaluate a service daily or in real time, and the evaluation frequency is low. Moreover, because an online service is continuously updated and its stability continuously changes, the last evaluation result cannot effectively guide stability work during the interval between two evaluations.
As can be seen from the above, the related art has problems of high evaluation level, low evaluation frequency, and the like of service stability. Therefore, the present application proposes a service stability evaluation method to solve the above technical problems.
According to an embodiment of the present invention, an embodiment of a service stability evaluation method is provided. It should be noted that the steps shown in the flowchart of the drawings may be performed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in a different order.
The method embodiments may be performed in an electronic device or similar computing device that includes a memory and a processor. Taking a computer terminal as an example, the computer terminal may include one or more processors (which may include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processing (DSP) chip, a microcontroller unit (MCU), a programmable logic device such as a field-programmable gate array (FPGA), a neural network processing unit (NPU), a tensor processing unit (TPU), an artificial intelligence (AI) processor, etc.) and a memory for storing data. Optionally, the computer terminal may further include a transmission device for communication, an input-output device, and a display device. Those of ordinary skill in the art will appreciate that this structural description is illustrative only and does not limit the structure of the computer terminal. For example, the computer terminal may include more or fewer components than described above, or have a different configuration.
The memory may be used to store computer programs, for example, software programs and modules of application software, such as the computer program corresponding to the service stability evaluation method in the embodiments of the present invention. The processor executes various functional applications and data processing by running the computer program stored in the memory, thereby implementing the service stability evaluation method described above. The memory may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory located remotely from the processor, and the remote memory may be connected to the computer terminal through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device is used to receive or transmit data via a network. Specific examples of the network may include a wireless network provided by the communication provider of the computer terminal. In one example, the transmission device includes a network interface controller (NIC), which can connect to other network devices through a base station to communicate with the Internet. In another example, the transmission device may be a radio frequency (RF) module, which communicates with the Internet wirelessly.
The display device may be, for example, a touch-screen liquid crystal display (LCD), also referred to as a "touch screen" or "touch display screen". The liquid crystal display enables a user to interact with the user interface of the computer terminal. In some embodiments, the computer terminal has a graphical user interface (GUI), and the user may interact with the GUI through finger contacts and/or gestures on the touch-sensitive surface. The human-machine interaction functions optionally include creating web pages, drawing, word processing, making electronic documents, games, video conferencing, instant messaging, sending and receiving e-mail, call interfaces, playing digital video, playing digital music, and/or web browsing. Executable instructions for performing these human-machine interaction functions are configured/stored in a computer program product or readable storage medium executable by one or more processors.
This embodiment provides a service stability evaluation method that runs on the computer terminal. FIG. 1 is a flowchart of a service stability evaluation method according to an embodiment of the present invention. As shown in FIG. 1, the flow includes the following steps:
Step S12, detecting target change information of a target micro-service during operation, wherein the target change information represents the stability change of the target micro-service;
Step S14, identifying the target change information to obtain a target identification result, wherein the target identification result represents classification information corresponding to the target change information;
Step S16, determining at least one target observation index corresponding to the target identification result;
Step S18, quantitatively evaluating the target micro-service based on the at least one target observation index to obtain a stability evaluation result corresponding to the target micro-service.
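The four steps above can be sketched as a small pipeline. This is only an illustrative skeleton, not the patent's implementation: the `Identification` class, the `evaluate_service` function, and all four stub callables are hypothetical names introduced here, and the stub return values are made up.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Identification:
    """Classification information for one piece of change information."""
    major_class: str  # e.g. "artificial" or "passive"
    subclass: str     # e.g. "configuration change"
    reason: str       # stability deterioration reason


def evaluate_service(
    service: str,
    detect: Callable[[str], dict],
    identify: Callable[[dict], Identification],
    select_indexes: Callable[[Identification], List[str]],
    score: Callable[[str, List[str]], int],
) -> int:
    """Run steps S12 to S18 for one micro-service and return its score."""
    change_info = detect(service)        # Step S12: detect change information
    result = identify(change_info)       # Step S14: classify the change
    indexes = select_indexes(result)     # Step S16: pick observation indexes
    return score(service, indexes)       # Step S18: quantitative evaluation


# Stub callables, purely illustrative (assumed data, not from the patent).
def detect_stub(service: str) -> dict:
    return {"service": service, "event": "configuration pushed"}


def identify_stub(change_info: dict) -> Identification:
    return Identification("artificial", "configuration change",
                          "erroneous or unreasonable configuration")


def select_stub(result: Identification) -> List[str]:
    return ["service availability"]


def score_stub(service: str, indexes: List[str]) -> int:
    return 2 * len(indexes)


total = evaluate_service("order-service", detect_stub, identify_stub,
                         select_stub, score_stub)
print(total)  # 2
```

Passing the four stages as callables keeps each step replaceable, which matches the way the later steps (S141, S142, and so on) refine individual stages.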
Specifically, the classification information corresponding to the target change information indicates how the stability change is classified, and may include a change major class, a change subclass, and a stability deterioration reason. The target observation indexes fall into two main types: artificial-change observation indexes and passive-change observation indexes. Each observation index is associated with a scoring standard; different observation indexes yield different scores, and a higher score indicates worse stability. The artificial-change observation indexes may include the latest online order status, engineering availability, service availability, and the like; the passive-change observation indexes may include at least one parameter index associated with the change subclass.
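As a rough illustration of how observation indexes might be selected from the identification result, the sketch below returns the three artificial-change indexes named in the text for any artificial change, and looks up subclass-specific indexes for passive changes. The text does not say which parameter indexes belong to which passive subclass, so the contents of `PASSIVE_INDEXES_BY_SUBCLASS` are assumptions, as are all identifier names.

```python
from typing import Dict, List

# Indexes named in the text for artificial (user-initiated) changes.
ARTIFICIAL_INDEXES: List[str] = [
    "latest online order status",
    "engineering availability",
    "service availability",
]

# Hypothetical association between passive subclasses and parameter indexes.
PASSIVE_INDEXES_BY_SUBCLASS: Dict[str, List[str]] = {
    "traffic change": ["requests per second"],
    "service capacity change": ["average response time", "cpu usage", "memory usage"],
    "other service faults": ["downstream error rate"],
    "machine change": ["host health"],
}


def select_observation_indexes(major_class: str, subclass: str) -> List[str]:
    """Return the observation indexes for an identified change."""
    if major_class == "artificial":
        return list(ARTIFICIAL_INDEXES)
    if major_class == "passive":
        # Unknown passive subclasses yield no indexes rather than an error.
        return list(PASSIVE_INDEXES_BY_SUBCLASS.get(subclass, []))
    raise ValueError(f"unknown change major class: {major_class!r}")


print(select_observation_indexes("passive", "service capacity change"))
```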
The stability change of the target micro-service during operation is detected, and the corresponding change major class, change subclass, and stability deterioration reason are identified. The corresponding target observation indexes are then determined from the identification result, and the stability of the single micro-service is scored according to the correspondence table between observation indexes and scoring standards. Finally, the scores of all target observation indexes are accumulated to obtain the stability evaluation result of the target micro-service; the higher the score, the worse the stability of the target micro-service.
Based on steps S12 to S18, target change information of the target micro-service is detected during operation and identified to obtain a target identification result; at least one target observation index corresponding to the target identification result is then determined, and the target micro-service is quantitatively evaluated based on the at least one target observation index to obtain the corresponding stability evaluation result. This lowers the evaluation level and raises the evaluation frequency, achieves the technical effect of quantitatively evaluating the stability of a single micro-service, and thereby solves the technical problems in the related art that service stability is evaluated at too high a level and too infrequently.
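The accumulation of per-index scores can be sketched as follows. The text only states that each observation index has a scoring standard and that a higher total means worse stability; the thresholds and point values below are invented for illustration, and `SCORING_TABLE` and `stability_score` are hypothetical names.

```python
from typing import Callable, Dict


def availability_score(value: float) -> int:
    """Assumed scoring standard: lower availability gives a higher score."""
    if value >= 0.9999:
        return 0
    if value >= 0.999:
        return 2
    return 5


def cpu_usage_score(value: float) -> int:
    """Assumed scoring standard: higher CPU usage gives a higher score."""
    if value < 0.70:
        return 0
    if value < 0.90:
        return 1
    return 3


# Correspondence between observation indexes and scoring standards (assumed).
SCORING_TABLE: Dict[str, Callable[[float], int]] = {
    "service availability": availability_score,
    "cpu usage": cpu_usage_score,
}


def stability_score(observations: Dict[str, float]) -> int:
    """Accumulate the scores of all observed indexes; higher means less stable."""
    return sum(SCORING_TABLE[index](value) for index, value in observations.items())


score = stability_score({"service availability": 0.9995, "cpu usage": 0.85})
print(score)  # 3
```

Because the total is a plain sum, such a score could be recomputed whenever new observations arrive, which is consistent with the stated aim of raising the evaluation frequency.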
Optionally, in step S14, identifying the target change information to obtain the target identification result includes:
Step S141, obtaining a target classification rule;
Step S142, identifying the target change information by using the target classification rule to obtain the target identification result.
Specifically, the target classification rule represents the correspondence between micro-service stability change classifications and their reasons, and can be illustrated by the correspondence table shown in Table 1. As shown in Table 1, the change major classes affecting micro-service stability are of two kinds: artificial changes and passive changes. An artificial change is a stability change caused by a preset made by a user, for example, a manual service launch or an operation that violates a specification; a passive change is a stability change caused by a system failure, for example, a machine hardware failure.
As shown in Table 1, the artificial change major class includes three change subclasses: service online and code change, configuration change, and instance capacity expansion/contraction. The service online and code change subclass includes three stability deterioration reasons: code logic errors, unreasonable exception handling, and non-compliance with development specifications. The configuration change subclass includes one stability deterioration reason: an erroneous or unreasonable configuration. The instance capacity expansion/contraction subclass includes one stability deterioration reason: unexpected instances receiving traffic.
The passive change major class includes four change subclasses: traffic change, service capacity change, other service faults, and machine change. The traffic change subclass includes five stability deterioration reasons: traffic decrease due to external-network access failure of an external-network service; traffic decrease due to internal-network access failure of an internal-network service; traffic increase due to attack traffic; traffic increase due to a promotional campaign or a peak period; and traffic increase due to abnormal calls from an upstream internal-network service. The service capacity change subclass includes three stability deterioration reasons: an increase in the average service response time, an increase in central processing unit (CPU) usage within the service package, and an increase in memory usage within the service package. The other service faults subclass includes four stability deterioration reasons: database problems (slow Structured Query Language (SQL) queries, large transactions, and exhausted connections); cache middleware problems (large keys, cache penetration, cache breakdown, and cache avalanche); message queue middleware problems (message loss and repeated consumption); and downstream or third-party service failures. The machine change subclass includes two stability deterioration reasons: basic environment upgrades/downgrades and machine hardware failures.
Specifically, the correspondence table of micro-service stability change classifications and reasons can be used to identify the target change information of a single micro-service to obtain a target identification result, which includes the change major class, change subclass, and stability deterioration reason corresponding to the target change information. For example, identifying the target change information of single micro-service 1 with the table yields a target identification result in which the change major class is artificial change, the change subclass is service online and code change, and the stability deterioration reasons are a code logic error and unreasonable exception handling. As another example, identifying the target change information of single micro-service 2 with the table yields a target identification result in which the change major class is passive change, the change subclass is service capacity change, and the stability deterioration reasons are increased CPU usage and increased memory usage within the service package.
Based on steps S141 to S142, by obtaining the target classification rule and using it to identify the target change information, the target identification result is obtained, and the specific major class, subclass, and deterioration reason corresponding to the target change information can be determined, so that the stability of the single micro-service can be evaluated according to the target identification result.
Table 1. Correspondence between micro-service stability change classifications and reasons
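One way to hold the classification rule described for Table 1 in memory is a nested mapping from major class to subclass to deterioration reasons. The sketch below transcribes the classes and reasons listed in the description, with the labels paraphrased into short English phrases; the structure, the `TABLE_1` name, and the `classify_reason` helper are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Major class -> change subclass -> stability deterioration reasons.
TABLE_1: Dict[str, Dict[str, List[str]]] = {
    "artificial": {
        "service online and code change": [
            "code logic error",
            "unreasonable exception handling",
            "non-compliance with development specifications",
        ],
        "configuration change": ["erroneous or unreasonable configuration"],
        "instance capacity expansion/contraction": [
            "unexpected instances receiving traffic",
        ],
    },
    "passive": {
        "traffic change": [
            "traffic decrease: external-network access failure",
            "traffic decrease: internal-network access failure",
            "traffic increase: attack traffic",
            "traffic increase: campaign or peak period",
            "traffic increase: abnormal upstream calls",
        ],
        "service capacity change": [
            "average response time increase",
            "CPU usage increase",
            "memory usage increase",
        ],
        "other service faults": [
            "database problem",
            "cache middleware problem",
            "message queue middleware problem",
            "downstream or third-party service failure",
        ],
        "machine change": [
            "basic environment upgrade/downgrade",
            "machine hardware failure",
        ],
    },
}


def classify_reason(reason: str) -> Optional[Tuple[str, str]]:
    """Return (major class, subclass) for a known reason, else None."""
    for major_class, subclasses in TABLE_1.items():
        for subclass, reasons in subclasses.items():
            if reason in reasons:
                return major_class, subclass
    return None


print(classify_reason("machine hardware failure"))  # ('passive', 'machine change')
```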
Optionally, in step S142, identifying the target change information by using the target classification rule to obtain the target identification result includes:
Step S1421, identifying the target change information within a first classification range and determining a first identification result, wherein the first identification result represents the change major class corresponding to the target change information;
Step S1422, determining a second classification range according to the first identification result, wherein the second classification range includes a plurality of change subclasses corresponding to the first identification result;
Step S1423, identifying the target change information within the second classification range and determining a second identification result, wherein the second identification result represents the change subclass and the change reason corresponding to the target change information.
Specifically, the first classification range represents the change major classes that affect micro-service stability, namely artificial change and passive change. The target change information is identified among the change major classes, and the change major class corresponding to the target change information is determined.
Once the change major class corresponding to the target change information is known from the first identification result, the range of change subclasses corresponding to the target change information can be determined. For example, if the first identification result shows that the change major class is artificial change, the change subclass range is service online and code change, configuration change, and instance expansion/contraction; that is, the change subclass corresponding to the target change information is at least one of these three.
After the change subclass range corresponding to the target change information is determined, the target change information can be identified, so that the change subclass and the change reason corresponding to the target change information are determined. For example, when the range of the change subclass corresponding to the target change information is determined to be the service online and code change, the configuration change and the instance capacity expansion/contraction, the target change information can be identified, so that the change subclass corresponding to the target change information is determined to be the service online and code change, and the change reason is a code logic error.
Based on the above steps S1421 to S1423, by identifying the target change information in the first classification range, determining a first identification result, determining a second classification range according to the first identification result, further identifying the target change information in the second classification range, and determining a second identification result, the change major class, the change sub-class, and the change reason corresponding to the target change information can be obtained, so that the stability of the single micro-service can be evaluated in detail with respect to the change major class, the change sub-class, and the change reason.
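The two-stage identification in steps S1421 to S1423 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `ChangeInfo` fields, the event tags and the `REASON_RULES` table are hypothetical stand-ins for the classification rule (Table 1), and only the class and subclass names come from the description.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# First classification range -> second classification range.
# Class and subclass names follow the description; the passive list is
# assembled from the subclasses discussed in this text.
SUBCLASSES = {
    "artificial change": (
        "service online and code change",
        "configuration change",
        "instance expansion/contraction",
    ),
    "passive change": (
        "traffic change",
        "service capacity change",
        "other service fault",
        "machine change",
    ),
}

# Hypothetical rules: subclass -> {event tag: change reason}.
REASON_RULES = {
    "service online and code change": {"code_release_failed": "code logic error"},
    "service capacity change": {"cpu_usage_high": "high CPU usage"},
}


@dataclass
class ChangeInfo:
    user_initiated: bool  # hypothetical field: True for a preset by the user
    event: str            # hypothetical event tag


def identify(info: ChangeInfo) -> Tuple[str, Optional[str], Optional[str]]:
    # S1421: identify the change major class in the first classification range.
    major = "artificial change" if info.user_initiated else "passive change"
    # S1422: the second range holds only the subclasses of that major class.
    second_range = SUBCLASSES[major]
    # S1423: identify subclass and reason inside the second range only.
    for subclass in second_range:
        reason = REASON_RULES.get(subclass, {}).get(info.event)
        if reason is not None:
            return major, subclass, reason
    return major, None, None
```

Restricting the search in S1423 to the second classification range means an event tag that belongs to a passive subclass is never matched for an artificial change, which mirrors the narrowing described above.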
Optionally, in step S16, the first classification range includes a first change, and determining at least one target observation index corresponding to the target recognition result includes:
Step S161, in response to the second recognition result being a change subclass included in the first change, determining that the at least one target observation index includes at least one of: the latest online ticket status, engineering availability and business availability, wherein the first change is used to represent a stability change caused by a preset of the user.
Specifically, the first change represents an artificial change. When the stability of the micro-service changes because of a preset by the user, at least one of the latest online ticket status, engineering availability and business availability can be selected as the target observation index for evaluating micro-service stability. For example, if a rollout is manually terminated so that the new and old versions of the micro-service coexist and stability changes as a result, the latest online ticket status can be selected from these three indexes as the target observation index.
Based on the above step S161, by determining at least one target observation index in response to the second recognition result being a change subclass included in the first change, the target observation index for evaluating micro-service stability affected by the user's preset can be determined, so that a single micro-service is evaluated in detail using the target observation index corresponding to the change subclass in the artificial change.
Optionally, in step S16, the first classification range includes a second change, and determining at least one target observation index corresponding to the target recognition result includes:
Step S162, in response to the second recognition result being a change subclass included in the second change, determining at least one parameter index associated with the change subclass as the target observation index, wherein the second change is used to represent a stability change caused by a system abnormality.
Specifically, the second change represents a passive change. The parameter indexes associated with the change subclasses in passive change may include packet loss rate, round-trip time (RTT), bandwidth usage, number of unhealthy external network access points, number of hijacked domain names, and so on.
When the second recognition result is a change subclass included in the passive change major class, at least one parameter index associated with that change subclass may be determined as the target observation index for evaluating micro-service stability. For example, when the second recognition result is the traffic change subclass of the passive change major class, the packet loss rate and RTT associated with the traffic change subclass may be determined as target observation indexes for evaluating micro-service stability.
Based on the above step S162, by determining at least one parameter index associated with the change subclass as the target observation index in response to the second recognition result as the change subclass included in the second change, it is possible to realize a determination of the target observation index for evaluating the stability of the micro-service affected by the system abnormality, so that the single micro-service is evaluated in detail using the target observation index corresponding to the change subclass in the passive change.
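Steps S161 and S162 amount to a lookup from the recognition result to candidate observation indexes. The sketch below assumes simple string labels for classes and indexes; the function name and the dictionary layout are illustrative only, and the passive table lists just the two subclasses whose indexes are named explicitly in this description.

```python
# Indexes named in step S161 for any subclass of the artificial change class.
ARTIFICIAL_INDEXES = (
    "latest online ticket status",
    "engineering availability",
    "business availability",
)

# Parameter indexes associated with passive change subclasses (step S162).
PASSIVE_INDEXES = {
    "traffic change": (
        "packet loss rate",
        "RTT",
        "bandwidth usage",
        "unhealthy external network access points",
        "hijacked domain names",
    ),
    "service capacity change": ("micro-service capacity water level",),
}


def target_observation_indexes(major: str, subclass: str) -> tuple:
    """Return the candidate target observation indexes for a recognition result."""
    if major == "artificial change":
        return ARTIFICIAL_INDEXES
    # Passive change: indexes depend on the specific subclass.
    return PASSIVE_INDEXES.get(subclass, ())
```

The caller may then pick at least one index from the returned tuple, as the steps require.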
Optionally, in step S18, performing quantitative evaluation on the target micro-service based on at least one target observation index, to obtain a stability evaluation result includes:
step S181, acquiring a target evaluation rule, wherein the target evaluation rule is used for recording a mapping relation between at least one target observation index and an evaluation standard;
step S182, calculating an evaluation value corresponding to the target micro-service, wherein the evaluation value is used for representing a quantized value corresponding to at least one target observation index in the running process of the target micro-service;
step S183, determining the stability evaluation result based on the evaluation value.
Specifically, the target evaluation rule is used for representing a mapping relationship between at least one target observation index and an evaluation standard, and different change subclasses have different corresponding relationships between the observation index and the evaluation standard. Based on the mapping relation between the target observation index and the evaluation standard, the evaluation value corresponding to the target micro-service can be calculated to obtain the quantized value corresponding to the target observation index, and then the stability evaluation result is obtained according to the evaluation value, so that the quantized evaluation of the stability of the target micro-service is realized.
Based on the steps S181 to S183, the quantitative evaluation of the stability of the target micro-service can be achieved by obtaining the target evaluation rule and calculating the evaluation value corresponding to the target micro-service, and further obtaining the stability evaluation result according to the evaluation value.
The mapping relation between the target observation index and the evaluation standard can be represented by using a corresponding table of the target observation index and the evaluation standard, and different change subclasses correspond to different corresponding tables of the target observation index and the evaluation standard.
The result of each service online and code change is recorded in a corresponding online ticket. An online ticket has two states, final and non-final: the final state indicates that the rollout is completed, and the non-final state indicates that it is not. The final state is further divided into success and termination, where success indicates that all instances were deployed successfully and termination indicates that the rollout was stopped by manual intervention.
In general, a micro-service rollout cannot be completed in a short time; processes such as gray release and staged release are usually needed to observe step by step whether the rollout meets expectations. Even so, a single rollout usually lasts no more than 3 days. Therefore, when a service has an online ticket that stays in a non-final state for a long time, new and old versions coexist, which introduces uncertainty and affects micro-service stability. In addition, when the final state of the latest online ticket is termination, the last rollout was stopped by manual intervention and the service was neither rolled back to the previous version nor fully updated to the new version, so there is again a risk that new and old versions coexist, which affects micro-service stability.
Therefore, the result of service online and code change can be evaluated using the latest online ticket status of the service.
In addition, the impact of a service rollout can be evaluated with service availability indexes. Service availability is divided into two types: service engineering availability and service business availability. The running state of the server is judged from the Hypertext Transfer Protocol (HTTP) status code returned by the service: a request is defined as failed when its HTTP status code is greater than or equal to 499. Service engineering availability (Engineering Availability) is calculated by formula (1), where PVLost is the number of failed requests and PV is the total number of requests:
$$\text{Engineering Availability} = \left(1 - \frac{PVLost}{PV}\right) \times 100\% \qquad (1)$$
However, even when HTTP returns normally, a wrong business result may still be returned because of erroneous code logic; this case can be evaluated with service business availability. Service business availability (Business Availability) is calculated by formula (2), where BPVLost is the number of business error requests and PV is the total number of requests:
$$\text{Business Availability} = \left(1 - \frac{BPVLost}{PV}\right) \times 100\% \qquad (2)$$
Therefore, changes in the service online and code change subclass can be evaluated using the latest online ticket status, service engineering availability and service business availability; the specific observation indexes and evaluation criteria are shown in Table 2. Table 2 is the correspondence table of observation indexes and scoring criteria for the service online and code change subclass in the artificial change major class.
Table 2 Correspondence table of observation indexes and scoring criteria for the service online and code change subclass
As shown in Table 2, the observation indexes include the latest online ticket status, service engineering availability and service business availability. The scoring criteria for the latest online ticket status may include the following levels: when the latest online ticket is in a final state and was terminated, record 5 points; when the latest online ticket is in a non-final state and has remained unfinished for more than 3 days, record 5 points. The scoring criteria for service engineering availability may include the following levels: when the day-level engineering availability is between 99.91% and 99.95%, record 5 points; when it is between 99.51% and 99.9%, record 10 points; when it is less than 99.5%, record 15 points. The scoring criteria for service business availability may include the following levels: when the day-level business availability is between 99.91% and 99.95%, record 5 points; when it is between 99.51% and 99.9%, record 10 points; when it is less than 99.5%, record 15 points.
Table 2 allows the service stability affected by changes in the service online and code change subclass to be evaluated in detail, at day-level granularity. For example, when the latest online ticket is in a final state and was terminated, 5 points are recorded. For another example, when the number of failed requests PVLost is 6 and the total number of requests PV is 1000 in one day, the day-level engineering availability is 99.4% according to formula (1), and the evaluation value is 15 points according to Table 2.
In addition, since the influence of changes in the configuration change subclass on service stability is similar to that of a service rollout, micro-service stability affected by the configuration change subclass can also be evaluated using Table 2. Since instance expansion/contraction affects service stability through instances carrying unexpected traffic, micro-service stability affected by the instance expansion/contraction subclass can also be evaluated using Table 2.
By acquiring the correspondence table of observation indexes and scoring criteria for the service online and code change subclass, the evaluation value corresponding to the target micro-service affected by service online and code change can be calculated, and the stability evaluation result is then obtained from the evaluation value, so that the stability of the target micro-service is quantitatively evaluated at the day level.
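The Table 2 scoring described above can be sketched in a few lines. The availability function assumes availability is the complement of the failure ratio, which matches the worked example (PVLost = 6, PV = 1000 gives 99.4%). The function names are illustrative, and the handling of values that fall between the stated bands (for example exactly 99.5%) is an assumption, since the description does not specify it.

```python
def engineering_availability(pv_lost: int, pv: int) -> float:
    """Availability in percent, assuming (1 - PVLost / PV) * 100."""
    return (1 - pv_lost / pv) * 100


def availability_points(availability_pct: float) -> int:
    """Day-level availability score; band edges are closed by assumption."""
    if availability_pct < 99.5:
        return 15
    if availability_pct <= 99.9:
        return 10
    if availability_pct <= 99.95:
        return 5
    return 0


def ticket_points(is_final: bool, terminated: bool, days_open: float) -> int:
    """Score for the latest online ticket status."""
    if is_final and terminated:
        return 5  # rollout stopped by manual intervention
    if not is_final and days_open > 3:
        return 5  # rollout unfinished for more than 3 days
    return 0


# Worked example from the description: 6 failed requests out of 1000.
availability = engineering_availability(6, 1000)
score = availability_points(availability) + ticket_points(True, True, 0)
```

Here `score` combines the 15 points for 99.4% availability with 5 points for a terminated ticket; the same `availability_points` function could be reused for business availability, since the description gives identical bands for both.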
Besides using Table 2 to evaluate micro-service stability affected by artificial change, micro-service stability affected by passive change can be evaluated using other correspondence tables of observation indexes and scoring criteria. For the traffic change subclass, the correspondence between observation indexes and scoring criteria is shown in Table 3.
As shown in Table 3, the observation indexes of the traffic change subclass include the packet loss rate of the micro-service instance Internet Protocol (IP) address, the round-trip time (RTT) of the micro-service instance IP, the bandwidth usage of the external network access points of an external network service, the number of unhealthy external network access points of an external network service, and the number of hijacked domain names of a domain name access service.
The scoring criteria for the packet loss rate of the micro-service instance IP may include the following levels. For each of the three scopes (same machine room, same region across machine rooms, and across regions and machine rooms), the day-level packet loss rate is scored in the same way: between 0.069% and 0.347%, record 5 points; between 0.348% and 0.69%, record 10 points; more than 0.691%, record 15 points.
The scoring criteria for the RTT of the micro-service instance IP may include the following levels. Same machine room: when the day-level average RTT is between 0.3 milliseconds (ms) and 0.6 ms, record 10 points; when it is greater than 0.61 ms, record 20 points. Same region across machine rooms: when the day-level average RTT is between 4 ms and 5 ms, record 5 points; when it is greater than 5.1 ms, record 10 points. Across regions and machine rooms: when the day-level average RTT is between 40 ms and 50 ms, record 2 points; when it is greater than 51 ms, record 5 points.
The scoring criteria for the bandwidth usage of the external network access points of an external network service may include recording 10 points when the day-level peak bandwidth usage is greater than 50%.
The scoring criteria for the number of unhealthy external network access points may include recording 10 points when the number of unhealthy external network access points is greater than or equal to 1.
The scoring criteria for the number of hijacked domain names of a domain name access service may include recording 10 points when the number of hijacked domain names is greater than or equal to 1.
Table 3 Correspondence table of observation indexes and scoring criteria for the traffic change subclass
Table 3 allows the service stability affected by changes in the traffic change subclass to be evaluated in detail, at day-level granularity. For example, the day-level packet loss rate in the same region across machine rooms is 0.1%, so 5 points are recorded; the day-level packet loss rate across regions and machine rooms is 0.35%, so 10 points are recorded; accumulating the scores gives a service evaluation value of 15 points. For another example, the day-level average RTT in the same machine room is 0.45 ms, so 10 points are recorded; the day-level peak bandwidth usage is 60%, so 10 points are recorded; the number of unhealthy external network access points is 1, so 10 points are recorded; accumulating the scores gives a service evaluation value of 30 points.
By acquiring the corresponding relation table of the observation indexes of the flow change subclasses and the scoring standard, the evaluation value corresponding to the target micro-service influenced by the flow change can be calculated, and then the stability evaluation result is obtained according to the evaluation value, so that the day-level quantitative evaluation on the stability of the target micro-service can be realized.
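The traffic change scoring can be sketched as below, reproducing the two worked examples (15 points and 30 points). Function names and scope keys are illustrative. The description leaves small gaps between bands (for example between 0.347% and 0.348%); the sketch closes them by assumption.

```python
def loss_points(loss_pct: float) -> int:
    """Day-level packet loss score; the same bands apply to all three scopes."""
    if loss_pct > 0.69:
        return 15
    if loss_pct > 0.347:
        return 10
    if loss_pct >= 0.069:
        return 5
    return 0


# scope -> (band start in ms, band end in ms, points inside band, points above band)
RTT_BANDS = {
    "same_room": (0.3, 0.6, 10, 20),
    "same_region_cross_room": (4.0, 5.0, 5, 10),
    "cross_region_cross_room": (40.0, 50.0, 2, 5),
}


def rtt_points(scope: str, avg_rtt_ms: float) -> int:
    """Day-level average RTT score for one scope."""
    low, high, band_points, above_points = RTT_BANDS[scope]
    if avg_rtt_ms > high:
        return above_points
    if avg_rtt_ms >= low:
        return band_points
    return 0


def access_points(peak_bandwidth_pct: float, unhealthy: int, hijacked: int) -> int:
    """Scores for bandwidth peak, unhealthy access points and hijacked domains."""
    total = 0
    if peak_bandwidth_pct > 50:
        total += 10
    if unhealthy >= 1:
        total += 10
    if hijacked >= 1:
        total += 10
    return total


# First worked example: 0.1% and 0.35% packet loss in two scopes.
example_1 = loss_points(0.1) + loss_points(0.35)
# Second worked example: 0.45 ms RTT, 60% bandwidth peak, 1 unhealthy access point.
example_2 = rtt_points("same_room", 0.45) + access_points(60, 1, 0)
```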
For the service capacity change subclass, what is usually evaluated is the ability of the system to carry traffic. By analogy with the capacity of a container to hold liquid, service capacity is defined as a quantification of the system's ability to carry business. For a general service, the capacity model is built on the relationships among traffic (Drivers), availability, quality of service (QoS) and resources (Resources). Growth in Drivers consumes Resources and degrades QoS. For a CPU-intensive service, the service capacity water level is defined as shown in formula (3), where CPU usage represents resource usage; QPS is the number of requests per second and represents traffic; and the average response is the average time the service takes to process all requests and represents availability and quality of service.
If the traffic carried by the micro-service exceeds what the service can bear, the capacity water level rises. In a typical same-region dual-machine-room architecture, each machine room of the micro-service must be able to carry the traffic shifted to it when the other machine room stops serving, which means the water level of a single machine room must stay below 50% at all times.
The correspondence between the observation index and the scoring criteria for the service capacity change subclass is shown in Table 4. The observation index of the service capacity change subclass is the micro-service capacity water level, and the scoring criterion is to record 20 points when the water level of a single machine room exceeds 50% at any time.
Table 4 correspondence table of observation indexes and scoring criteria for service capacity change subclasses
Observation index | Scoring criteria
Micro-service capacity water level | Water level of a single machine room exceeds 50% at any time: record 20 points
By acquiring the corresponding relation table of the observation indexes of the service capacity change subclasses and the scoring standards, the evaluation value corresponding to the target micro-service influenced by the service capacity change can be calculated, and then the stability evaluation result is obtained according to the evaluation value, so that the quantitative evaluation on the stability of the target micro-service can be realized.
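The Table 4 check can be sketched as follows. The sketch takes already computed water-level samples as input, because formula (3) is not reproduced here. The failover helper is an illustration of why the 50% limit arises; it assumes two machine rooms of equal capacity and a water level that scales linearly with traffic, neither of which is stated in the description.

```python
from typing import Iterable


def capacity_points(water_levels_pct: Iterable[float],
                    limit_pct: float = 50.0,
                    points: int = 20) -> int:
    """Record 20 points if any sample of a single machine room exceeds 50%."""
    return points if any(level > limit_pct for level in water_levels_pct) else 0


def water_level_after_failover(room_a_pct: float, room_b_pct: float) -> float:
    """Water level of the surviving room once it takes over the other room's traffic.

    Assumption: equal capacity in both rooms and linear scaling with traffic.
    """
    return room_a_pct + room_b_pct
```

Under these assumptions, two rooms at 45% each would leave the surviving room at 90% after a failover, still within capacity, whereas two rooms at 55% each would require 110%.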
Faults in the other service fault subclass of the passive change major class can be divided into two types: downstream problems caused by unreasonable use, such as slow database queries and exhausted connections, and faults of the downstream service itself. The correspondence between observation indexes and scoring criteria for the other service fault subclass is shown in Table 5.
As shown in Table 5, the observation indexes of the other service fault subclass may include: for the database, the number of slow Structured Query Language (SQL) statements, the number of large transactions, the current connection count as a percentage of the total connection count, the maximum CPU usage and the maximum memory usage; for the cache middleware, the number of large keys, the number of slow queries, the cache hit rate, the maximum CPU usage and the maximum memory usage; for the message queue middleware, the single-topic lag, the maximum CPU usage, the maximum memory usage, the input/output (I/O) usage of each disk partition and the usage of each disk partition.
The scoring criteria of the other service fault subclass may include the following.
Database: record 2 points for each slow SQL; record 5 points for each large transaction; record 10 points when the current connection count is between 40% and 79% of the total connection count, and 20 points when it is more than 80%; record 10 points when the maximum CPU usage is between 40% and 79%, and 20 points when it is more than 80%; record 10 points when the maximum memory usage is between 40% and 79%, and 20 points when it is more than 80%.
Cache middleware: record 2 points for each large key; record 2 points for each slow query; record 10 points when the cache hit rate is between 40% and 60%, and 20 points when it is less than 39%; record 10 points when the maximum CPU usage is between 40% and 79%, and 20 points when it is more than 80%; record 10 points when the maximum memory usage is between 40% and 79%, and 20 points when it is more than 80%.
Message queue middleware: record 5 points when the single-topic lag keeps growing for 30 minutes; record 10 points when the maximum CPU usage is between 40% and 79%, and 20 points when it is more than 80%; record 10 points when the maximum memory usage is between 40% and 79%, and 20 points when it is more than 80%; record 10 points when the I/O usage of a disk partition is between 40% and 79%, and 20 points when it is more than 80%; record 10 points when the usage of a disk partition is between 40% and 79%, and 20 points when it is more than 80%.
Table 5 correspondence table of other service fault subclass observation index and scoring criteria
Table 5 allows the service stability affected by changes in the other service fault subclass to be evaluated in detail. For example, five slow SQL statements appear in the database, so 10 points are recorded; the maximum CPU usage of the database is 82%, so 20 points are recorded; the maximum CPU usage of the cache middleware is 83%, so 20 points are recorded; accumulating the scores gives a service evaluation value of 50 points.
By acquiring the corresponding relation table of the observation indexes and the scoring standards of the sub-class of other service faults, the evaluation value corresponding to the target micro-service influenced by the other service faults can be calculated, and then the stability evaluation result is obtained according to the evaluation value, so that the quantitative evaluation on the stability of the target micro-service can be realized.
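The accumulation used with Table 5 can be sketched as a sum of per-event points and tiered usage points. The dictionary keys are hypothetical labels, and treating exactly 80% (and the gap between 79% and 80%) as the upper tier is an assumption. The cache hit rate is left out because its tiers run in the opposite direction.

```python
from typing import Mapping

# Points recorded per occurrence of an event (from the Table 5 criteria).
EVENT_POINTS = {
    "slow_sql": 2,
    "large_transaction": 5,
    "large_key": 2,
    "slow_cache_query": 2,
}


def usage_points(usage_pct: float) -> int:
    """Tiered score shared by the CPU, memory, connection and disk usage indexes."""
    if usage_pct >= 80:
        return 20
    if usage_pct >= 40:
        return 10
    return 0


def other_fault_score(event_counts: Mapping[str, int],
                      usages_pct: Mapping[str, float]) -> int:
    """Accumulate event points and usage points into one evaluation value."""
    events = sum(EVENT_POINTS[name] * count for name, count in event_counts.items())
    usages = sum(usage_points(value) for value in usages_pct.values())
    return events + usages


# Worked example: five slow SQLs, database CPU at 82%, cache middleware CPU at 83%.
example = other_fault_score(
    {"slow_sql": 5},
    {"database_max_cpu": 82, "cache_max_cpu": 83},
)
```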
The correspondence between observation indexes and scoring criteria for the machine change subclass is shown in Table 6. The machine is the bottom layer of the system, and changes to kernel parameters or hardware failures can seriously affect service stability. Hardware faults can be evaluated from the system log (syslog) and the out-of-band log of the machine where the service is located. The whole-machine CPU usage, whole-machine memory usage, disk usage, disk I/O usage and network card bandwidth usage are also important indexes for evaluating machine performance.
If the machine goes down or its data cannot be recovered, it must be determined whether disaster recovery and backups exist. If a backup exists for the important data generated by the application system (or for the original important data), the service can continue to work normally after the backup is restored. For a general micro-service, a day-level data backup must be guaranteed.
As shown in Table 6, the observation indexes of the machine change subclass may include: the number of hardware faults of the machine where the service is located; the maximum CPU usage, the maximum memory usage, the I/O usage of each disk partition, the usage of each disk partition and the maximum network card bandwidth usage of the machine where the service is located; and the number of same-day backups of the service data.
The scoring criteria of the machine change subclass may include: record 10 points for each machine hardware fault alarm; record 10 points when the maximum CPU usage of the machine is between 40% and 79%, and 20 points when it is more than 80%; record 10 points when the maximum memory usage of the machine is between 40% and 79%, and 20 points when it is more than 80%; record 10 points when the I/O usage of a disk partition of the machine is between 40% and 79%, and 20 points when it is more than 80%; record 10 points when the usage of a disk partition of the machine is between 40% and 79%, and 20 points when it is more than 80%; record 10 points when the maximum network card bandwidth usage of the machine is between 40% and 79%, and 20 points when it is more than 80%; record 10 points when the number of same-day backups of the service data is less than 1.
Table 6 Correspondence table of observation indexes and scoring criteria for the machine change subclass
The machine change subclass can be evaluated in detail using the observation indexes and scoring criteria shown in Table 6. For example, 3 machine hardware fault alarms occur, so 30 points are recorded; the maximum CPU usage of the machine is 50%, so 10 points are recorded; the maximum memory usage of the machine is 49%, so 10 points are recorded; accumulating the scores gives a micro-service evaluation value of 50 points.
By acquiring the corresponding relation table of the observation indexes of the machine change subclasses and the scoring standard, the evaluation value corresponding to the target micro-service influenced by the machine change can be calculated, and then the stability evaluation result is obtained according to the evaluation value, so that the day-level quantitative evaluation of the stability of the target micro-service is realized.
According to the acquired target evaluation rule, the evaluation value corresponding to the target micro-service is calculated, and the evaluation result of micro-service stability is obtained from the evaluation value. The higher the evaluation value, the worse the stability of the micro-service. Moreover, because a single micro-service can be evaluated quantitatively at the day level, the evaluation period is short, so the fluctuation of the evaluation value over a period of time can also be used to characterize the stability of the micro-service.
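One way to express the fluctuation of day-level evaluation values is sketched below. The description does not say how fluctuation is measured; using the mean and the population standard deviation over a window of daily scores is an assumption made here for illustration, and the function name is hypothetical.

```python
from statistics import mean, pstdev
from typing import Sequence, Tuple


def score_fluctuation(daily_scores: Sequence[float]) -> Tuple[float, float]:
    """Return (average score, population standard deviation) over a window of days.

    A higher average indicates worse stability; a higher deviation indicates
    that stability varies more from day to day.
    """
    return mean(daily_scores), pstdev(daily_scores)


# Two hypothetical services with the same number of observed days.
steady_average, steady_spread = score_fluctuation([5, 5, 5, 5])
volatile_average, volatile_spread = score_fluctuation([0, 50, 0, 50])
```

In this example the first service has no day-to-day spread, while the second alternates between clean days and 50-point days, which the deviation makes visible even when only the average is tracked otherwise.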
From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, but of course also by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the related art in the form of a software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk), including several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present invention.
In this embodiment, an evaluation device for service stability is further provided. The device is used to implement the foregoing embodiments and preferred implementations, and what has already been described will not be repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the device described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 2 is a block diagram of a service stability evaluation apparatus according to an embodiment of the present invention, and as shown in fig. 2, the service stability evaluation apparatus includes:
the detection module 201 is configured to detect target change information of the target micro service in the operation process, where the target change information is used to represent a stability change condition of the target micro service;
the identification module 202 is configured to identify the target change information to obtain a target identification result, where the target identification result is used to represent classification information corresponding to the target change information;
the determining module 203 is configured to determine at least one target observation index corresponding to the target identification result;
and the evaluation module 204 is configured to quantitatively evaluate the target micro-service based on at least one target observation index, so as to obtain a stability evaluation result corresponding to the target micro-service.
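The cooperation of the four modules can be sketched as a simple pipeline. This is an illustrative sketch under stated assumptions, not the structure of the claimed device: the class `StabilityEvaluator`, its field names, the stub functions and the service name `order-service` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class StabilityEvaluator:
    """Chains four callables that stand in for modules 201 to 204."""
    detect: Callable[[str], Dict[str, str]]       # detection module 201
    identify: Callable[[Dict[str, str]], str]     # identification module 202
    determine: Callable[[str], List[str]]         # determining module 203
    evaluate: Callable[[str, List[str]], int]     # evaluation module 204

    def run(self, service: str) -> int:
        change_info = self.detect(service)        # target change information
        result = self.identify(change_info)       # target identification result
        indexes = self.determine(result)          # target observation indexes
        return self.evaluate(service, indexes)    # stability evaluation value


# Stub implementations, for demonstration only.
evaluator = StabilityEvaluator(
    detect=lambda service: {"service": service, "event": "hardware_fault_alarm"},
    identify=lambda info: "machine_change",
    determine=lambda result: ["hardware_fault_alarms", "max_cpu_pct"],
    evaluate=lambda service, indexes: 10 * len(indexes),
)
print(evaluator.run("order-service"))  # 20 with these stubs
```

Passing the modules in as callables keeps each one replaceable, which matches the note that the module division is only one possible logical division.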
Optionally, the service stability evaluation device further includes an obtaining module 205, configured to obtain a target classification rule; the identification module 202 is further configured to identify the target change information by using the target classification rule to obtain the target identification result.
Optionally, the identification module 202 is further configured to: identify the target change information in a first classification range and determine a first identification result, where the first identification result is used to represent a change major class corresponding to the target change information; determine a second classification range according to the first identification result, where the second classification range includes a plurality of change subclasses corresponding to the first identification result; and identify the target change information in the second classification range and determine a second identification result, where the second identification result is used to represent a change subclass and a change reason corresponding to the target change information.
Optionally, the determining module 203 is further configured to: in response to the second identification result being a change subclass included in the first change, determine that the at least one target observation index includes at least one of the following: a latest online list status, engineering availability and service availability, where the first change is used to represent a stability change caused by a user's preset.
Optionally, the determining module 203 is further configured to: in response to the second identification result being a change subclass included in the second change, determine at least one parameter index associated with that change subclass as the target observation index, where the second change is used to represent a stability change caused by a system abnormality.
Optionally, the obtaining module 205 is further configured to obtain a target evaluation rule, where the target evaluation rule is used to record a mapping relation between the at least one target observation index and an evaluation standard.
Optionally, the service stability evaluation device further includes a calculation module 206, configured to calculate an evaluation value corresponding to the target micro-service, where the evaluation value is used to represent a quantized value corresponding to the at least one target observation index during operation of the target micro-service.
Optionally, the determining module 203 is further configured to determine the stability evaluation result based on the evaluation value.
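The two-stage identification and the selection of observation indexes described for modules 202 and 203 can be sketched as table lookups. Everything in the rule tables below is hypothetical: the event types, subclass names and reasons are invented for illustration, and placing the machine change subclass under the second change (system abnormality) is an assumption of this sketch rather than something stated here.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Recognition(NamedTuple):
    major_class: str  # first identification result
    subclass: str     # second identification result: change subclass
    reason: str       # second identification result: change reason


# Stage 1 (first classification range): event type -> change major class.
MAJOR_CLASS_RULES: Dict[str, str] = {
    "release": "first_change",               # caused by a user's preset
    "config_edit": "first_change",
    "hardware_fault_alarm": "second_change",  # caused by a system abnormality
}

# Stage 2 (second classification range): subclasses within each major class.
SUBCLASS_RULES: Dict[str, Dict[str, Tuple[str, str]]] = {
    "first_change": {
        "release": ("release_change", "new version released"),
        "config_edit": ("config_change", "configuration modified"),
    },
    "second_change": {
        "hardware_fault_alarm": ("machine_change", "machine hardware fault"),
    },
}

FIRST_CHANGE_INDEXES: List[str] = [
    "latest_online_list_status", "engineering_availability", "service_availability",
]
SECOND_CHANGE_INDEXES: Dict[str, List[str]] = {
    "machine_change": [
        "hardware_fault_alarms", "max_cpu_pct", "max_memory_pct",
        "disk_io_pct", "disk_usage_pct", "max_nic_bandwidth_pct",
        "daily_service_data_count",
    ],
}


def identify(event_type: str) -> Optional[Recognition]:
    """Identify the major class first, then narrow to a subclass and reason."""
    major = MAJOR_CLASS_RULES.get(event_type)
    if major is None:
        return None
    match = SUBCLASS_RULES.get(major, {}).get(event_type)
    if match is None:
        return None
    subclass, reason = match
    return Recognition(major, subclass, reason)


def determine_indexes(result: Recognition) -> List[str]:
    """Pick the target observation indexes for an identification result."""
    if result.major_class == "first_change":
        return list(FIRST_CHANGE_INDEXES)
    return list(SECOND_CHANGE_INDEXES.get(result.subclass, []))


result = identify("hardware_fault_alarm")
print(result, determine_indexes(result))
```

Narrowing to the second classification range only after the major class is known keeps each lookup table small and lets new subclasses be added without touching the first stage.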
It should be noted that each of the above modules may be implemented by software or hardware. In the latter case, this may be achieved in, but is not limited to, the following ways: the modules are all located in the same processor; or the modules are located in different processors in any combination.
Embodiments of the present invention also provide a non-volatile storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.
Alternatively, in the present embodiment, the above-described storage medium may be configured to store a computer program for performing the steps of:
step S1, detecting target change information of a target micro-service in the operation process, wherein the target change information is used for representing the stability change condition of the target micro-service;
step S2, identifying the target change information to obtain a target identification result, wherein the target identification result is used for representing classification information corresponding to the target change information;
step S3, determining at least one target observation index corresponding to the target identification result;
and step S4, quantitatively evaluating the target micro-service based on at least one target observation index to obtain a stability evaluation result corresponding to the target micro-service.
Alternatively, in the present embodiment, the storage medium may include, but is not limited to: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other various media capable of storing a computer program.
The embodiment of the invention also provides a processor for running a program, wherein the program is configured to execute the steps in any of the method embodiments described above when run.
Alternatively, in the present embodiment, the above-described processor may be configured to execute the following steps by a computer program:
step S1, detecting target change information of a target micro-service in the operation process, wherein the target change information is used for representing the stability change condition of the target micro-service;
step S2, identifying the target change information to obtain a target identification result, wherein the target identification result is used for representing classification information corresponding to the target change information;
step S3, determining at least one target observation index corresponding to the target identification result;
and step S4, quantitatively evaluating the target micro-service based on at least one target observation index to obtain a stability evaluation result corresponding to the target micro-service.
An embodiment of the application also provides an electronic device comprising a memory in which a computer program is stored and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
Alternatively, for specific examples in this embodiment, reference may be made to the examples described in the foregoing embodiments and optional implementations, which are not repeated here.
The foregoing embodiment numbers of the present application are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
In the foregoing embodiments of the present application, each embodiment is described with its own emphasis; for any part not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technology may be implemented in other manners. The above-described embodiments of the apparatus are merely exemplary, and the division of units may be a logic function division, and there may be another division manner in actual implementation, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some interfaces, units or modules, or may be in electrical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the related art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other various media capable of storing program code.
The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present invention, and such modifications and adaptations are intended to fall within the scope of protection of the present invention.

Claims (10)

1. A method for evaluating service stability, comprising:
detecting target change information of a target micro-service in the operation process, wherein the target change information is used for representing the stability change condition of the target micro-service;
identifying the target change information to obtain a target identification result, wherein the target identification result is used for representing classification information corresponding to the target change information, and the classification information comprises a change major class, a change sub-class and a stability deterioration reason;
determining at least one target observation index corresponding to the target identification result;
and quantitatively evaluating the target micro-service based on the at least one target observation index to obtain a stability evaluation result corresponding to the target micro-service.
2. The method of claim 1, wherein identifying the target change information to obtain the target identification result comprises:
obtaining a target classification rule;
and identifying the target change information by utilizing the target classification rule to obtain the target identification result.
3. The method of claim 2, wherein identifying the target change information using the target classification rule to obtain the target identification result comprises:
identifying the target change information in a first classification range, and determining a first identification result, wherein the first identification result is used for representing a change major class corresponding to the target change information;
determining a second classification range according to the first identification result, wherein the second classification range comprises a plurality of change subclasses corresponding to the first identification result;
and identifying the target change information in the second classification range, and determining a second identification result, wherein the second identification result is used for representing a change subclass and a change reason corresponding to the target change information.
4. The method according to claim 3, wherein the first classification range includes a first change, and determining the at least one target observation index corresponding to the target identification result comprises:
in response to the second identification result being a change subclass included in the first change, determining that the at least one target observation index includes at least one of: a latest online list status, engineering availability and service availability, wherein the first change is used for representing a stability change caused by a user's preset.
5. The method according to claim 3, wherein the first classification range includes a second change, and determining the at least one target observation index corresponding to the target identification result comprises:
in response to the second identification result being a change subclass included in the second change, determining at least one parameter index associated with the change subclass as the target observation index, wherein the second change is used for representing a stability change caused by a system abnormality.
6. The method of claim 1, wherein quantitatively evaluating the target micro-service based on the at least one target observation index to obtain the stability evaluation result comprises:
obtaining a target evaluation rule, wherein the target evaluation rule is used for recording a mapping relation between the at least one target observation index and an evaluation standard;
calculating an evaluation value corresponding to the target micro-service, wherein the evaluation value is used for representing a quantized value corresponding to the at least one target observation index in the running process of the target micro-service;
and determining the stability evaluation result based on the evaluation value.
7. An evaluation device for service stability, comprising:
the detection module is used for detecting target change information of the target micro-service in the operation process, wherein the target change information is used for representing the stability change condition of the target micro-service;
the identification module is used for identifying the target change information to obtain a target identification result, wherein the target identification result is used for representing classification information corresponding to the target change information, and the classification information comprises a change major class, a change sub-class and a stability deterioration reason;
the determining module is used for determining at least one target observation index corresponding to the target identification result;
and the evaluation module is used for quantitatively evaluating the target micro-service based on the at least one target observation index to obtain a stability evaluation result corresponding to the target micro-service.
8. A non-volatile storage medium, characterized in that the storage medium has stored therein a computer program, wherein the computer program is arranged to perform the service stability assessment method according to any of the claims 1 to 6 at run-time.
9. A processor, characterized in that the processor is arranged to run a program, wherein the program is arranged to execute the method of evaluating service stability as claimed in any of the claims 1 to 6 at run time.
10. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to run the computer program to perform the method of evaluating service stability as claimed in any of the claims 1 to 6.
CN202210980612.6A 2022-08-16 2022-08-16 Service stability evaluation method and device, storage medium and electronic device Active CN115396341B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210980612.6A CN115396341B (en) 2022-08-16 2022-08-16 Service stability evaluation method and device, storage medium and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210980612.6A CN115396341B (en) 2022-08-16 2022-08-16 Service stability evaluation method and device, storage medium and electronic device

Publications (2)

Publication Number Publication Date
CN115396341A CN115396341A (en) 2022-11-25
CN115396341B true CN115396341B (en) 2023-12-05

Family

ID=84119855

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210980612.6A Active CN115396341B (en) 2022-08-16 2022-08-16 Service stability evaluation method and device, storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN115396341B (en)


Also Published As

Publication number Publication date
CN115396341A (en) 2022-11-25


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant