CN113204464A - Real-time service monitoring method, system, terminal and medium based on service scene - Google Patents

Publication number
CN113204464A
Authority
CN
China
Prior art keywords
service
time
real
data
event
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110439635.1A
Other languages
Chinese (zh)
Other versions
CN113204464B (en
Inventor
邵志鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shangmeng Business Service Co ltd
Original Assignee
Shangmeng Business Service Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shangmeng Business Service Co ltd filed Critical Shangmeng Business Service Co ltd
Priority to CN202110439635.1A priority Critical patent/CN113204464B/en
Publication of CN113204464A publication Critical patent/CN113204464A/en
Application granted granted Critical
Publication of CN113204464B publication Critical patent/CN113204464B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3476 Data logging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Computing Systems (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Economics (AREA)
  • Computer Hardware Design (AREA)
  • Game Theory and Decision Science (AREA)
  • Mathematical Physics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention provides a real-time service monitoring method and system based on a service scenario, comprising: setting buried points in service application logs; collecting the buried-point data in real time; processing complex events in real time; and outputting timeout alarm messages. Data is monitored at the buried points of the service application log; a timeout is configured for the state-change process of a service order from its initial state through its processing state to its final state; and service orders whose total elapsed time exceeds the timeout are monitored in real time. Timeout pattern matching is performed on each pre-sorted service order through a Flink CEP model to obtain service abnormal events. The invention fits the service scenario closely, monitors every service order, detects timed-out abnormal service orders promptly, and can be extended to any service scenario with a different sequence of state changes.

Description

Real-time service monitoring method, system, terminal and medium based on service scene
Technical Field
The present invention relates to the technical field of service monitoring, and in particular, to a real-time service monitoring method, system, terminal and medium based on a service scenario.
Background
In payment services, the volume of concurrent payment requests is large and keeps growing. The complex processing logic requires real-time, end-to-end monitoring of every payment order so that orders whose processing has timed out are discovered promptly. In addition, the application system connects to many third-party network interfaces, such as channels and merchants, so processing timeouts caused by network problems, channel or merchant system maintenance, or even unexpected incidents occur easily; these must be monitored promptly and alarms raised. The traditional solution continuously polls the relational database of the service system: it queries payment orders in the processing state through SQL, calculates how long each has been pending, and sends the orders whose elapsed time exceeds a certain threshold to an alarm platform. This SQL polling approach places heavy demands on the database, copes poorly with sharded databases and tables and with differing service categories, and performs poorly under large-scale concurrency.
The conventional method therefore has the following significant problems:
1. polling SQL queries against the online transaction tables of a large-scale database put huge pressure on the relational database and affect normal online business;
2. different service types require different thresholds, which makes flexible extension difficult;
3. efficiency and performance are low, and large-scale concurrency cannot be handled.
A prior-art search found the following:
Chinese patent application No. 201610921493.1 discloses a method and system for monitoring a service, comprising: extracting a service log from a service system and printing the service log, where printing includes storing the extracted service log according to service indexes; collecting the printed service log and pushing it to a distributed publish-subscribe message system; extracting service logs through the distributed publish-subscribe message system according to the service indexes of the service system, and performing aggregation calculation; and scanning the aggregated service logs to obtain abnormal service indexes. Although that patent discloses anomaly monitoring based on a "service log" and a "distributed publish-subscribe message system (Kafka)", its logs are printed logs, the message system extracts the service logs, and abnormal service indexes are obtained by aggregation calculation and scanning. When this method is applied to a payment service, the following problems remain:
1. the patent must perform aggregation calculation on the extracted service logs to obtain service indexes;
2. the patent scans the service indexes to obtain abnormal service indexes, so the monitoring granularity is relatively coarse.
Chinese patent application No. 201410528628.9, "Method, device and monitoring system for processing abnormal data", relates to a method for processing abnormal data that judges data abnormality by "timeout". However, when this timeout-based judgment is applied to a payment service, the following problem remains:
the patent uses the difference between a first timestamp and a second timestamp recorded in transmitted test data as the timeout, and does not take into account the business meaning of the data, namely the event state of the payment service.
Chinese patent application No. 201811373337.1, "A Flink-based data stream multidirectional processing system", describes how data from multiple data sources can be accurately collected and summarized by exploiting the high scalability and high reliability of Kafka, efficiently combined with Flink. However, when this method is applied to a payment service, the following problem remains:
the core of the patent lies in integration with Flink: data conversion and filtering are performed using Flink's real-time computing capability and transformation operators to ensure fast computation and storage, but timeout events of payment orders cannot be identified.
In summary, the prior art still affects normal online services, extends poorly, and has low efficiency and performance. No description or report of technology similar to the present invention has been found so far, and no similar material from home or abroad has been collected.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a real-time service monitoring method, a system, a terminal and a medium based on a service scene.
According to an aspect of the present invention, a real-time service monitoring method based on a service scenario is provided, which includes:
setting buried points in a service application log, and collecting the buried-point data in real time;
and, according to the collected buried-point data, monitoring in real time each service order whose total elapsed time exceeds a set threshold as a service abnormal event, wherein the total elapsed time is the time the service order takes to go from its initial state to its final completed state.
Preferably, setting buried points in the service application log and collecting the buried-point data in real time includes:
setting buried points for application log data in the service system, and asynchronously sending the log data to topic partitions of a Kafka cluster;
and consuming, in real time and concurrently, the service order data in the different topic partitions with a Flink process according to the specified topics to which the Kafka consumer subscribes, and using a filter operator to keep only the data of service orders in the initial state and the final state, thereby obtaining the real-time buried-point data.
Preferably, monitoring in real time each service order whose total elapsed time exceeds the set threshold as a service abnormal event, according to the collected buried-point data, includes:
pre-sorting the state change logs of each service order with a Redis Lua script;
and performing timeout pattern matching on each pre-sorted service order through a Flink CEP model to obtain the service abnormal events.
Preferably, pre-sorting the state change logs of each service order with the Redis Lua script includes:
inputting the buried-point data of the same service order in its different states into Redis in turn, pre-sorting that data by state order through the Lua script, and deciding in the sorting logic whether a given piece of buried-point data temporarily resides in a hash data structure to wait or is sent downstream immediately.
Preferably, performing timeout pattern matching on each pre-sorted service order through the Flink CEP model includes:
performing real-time pattern matching on each service order sent from upstream, using the CEP complex event processing model in the Flink process;
and, according to the start condition and end condition preset in the CEP complex event processing model and the timeout threshold within which both conditions must be met, taking each service order that exceeds the timeout threshold as a service abnormal event.
Preferably, the service abnormal event is obtained by:
constructing a CEP complex event processing model from the start condition, end condition and timeout threshold of the service, and converting the upstream service event stream into a pattern stream (PatternStream) through a static method of the CEP complex event processing model; then calling the flatSelect method on the pattern stream to split matched results from timed-out ones; and finally calling the getSideOutput method to obtain the timeout event stream, which yields the service abnormal events.
Preferably, in the CEP complex event processing model:
the preset start condition is: the service event is in the initial state 02;
the preset end condition is: the service event is in the end state 00 or 01;
the preset timeout threshold is: 3 to 5 minutes.
Preferably, the method further comprises: encapsulating the service abnormal event and sending it to a downstream alarm platform.
According to another aspect of the present invention, there is provided a real-time service monitoring system based on service scenarios, including:
an application log buried-point module, which sets buried points in the service application log;
a real-time data acquisition module, which collects in real time the buried-point data produced by the application log buried-point module;
and a complex event processing module, which, for the buried-point data, follows the state-change process of each service order from the initial state through the processing state to the final state, and monitors in real time each service order whose total elapsed time exceeds a timeout threshold as a service abnormal event.
Preferably, the complex event processing module pre-sorts the state change logs of each service order with a Redis Lua script, and performs timeout pattern matching on each pre-sorted service order through a Flink CEP model to obtain the service abnormal events.
Preferably, the application log buried-point module sets buried points for application log data in the service system and asynchronously sends the log data to topic partitions of a Kafka cluster; the real-time data acquisition module consumes, in real time and concurrently, the service order data in the different topic partitions with a Flink process according to the specified topics to which the Kafka consumer subscribes, and uses a filter operator to keep only the data of service orders in the initial state and the final state, thereby obtaining the real-time buried-point data.
Preferably, the system further comprises:
an alarm platform module, which raises real-time alarms according to the service abnormal events output by the complex event processing module.
According to a third aspect of the present invention, there is provided a terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor when executing the program being operable to perform any of the methods described above.
According to a fourth aspect of the invention, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, is operable to perform the method of any of the above.
Due to the adoption of the technical scheme, compared with the prior art, the embodiment of the invention has the following beneficial effects:
The real-time service monitoring method, system, terminal and medium based on a service scenario provided by the invention collect payment service application logs through Kafka, identify by matching the payment orders whose total elapsed time has timed out, and monitor them in real time on that basis.
The method, system, terminal and medium provided by the invention use a Redis Lua script to sort the state change logs of each payment order received from Kafka, and use a Flink CEP model to perform timeout pattern matching on each payment order after its states have been pre-sorted.
The method, system, terminal and medium provided by the invention can quickly detect timed-out abnormal payment orders in real time during high-concurrency payment processing and output them to the alarm platform promptly, so that possible economic loss can be averted.
The method, system, terminal and medium provided by the invention can find abnormal timed-out payment orders among large numbers of concurrent payment orders in real time during online transaction processing, and give early warning.
The method, system, terminal and medium provided by the invention allow the cluster to be scaled out horizontally with ease to support rapid growth in payment volume.
The method, system, terminal and medium provided by the invention combine a real-time computing engine with a high-performance CEP complex event processing model (CEP complex event processing library), so monitoring is efficient and performs well.
The method, system, terminal and medium provided by the invention do not need aggregation calculation to obtain service indexes; they need only the service state recorded in the log.
The method, system, terminal and medium provided by the invention process every payment order individually, so the monitoring granularity is finer.
The method, system, terminal and medium provided by the invention do not need to record a first timestamp and a second timestamp additionally; they rely only on the event state attribute of the payment order.
The method, system, terminal and medium provided by the invention make full use of the real-time complex event processing capability of Flink CEP to match timeout events of payment orders and output the timed-out payment orders.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
Fig. 1 is a flowchart of a real-time service monitoring method based on a service scenario in an embodiment of the present invention.
Fig. 2 is a flow chart of a real-time service monitoring method based on service scenarios in a preferred embodiment of the present invention.
Fig. 3 is a schematic diagram of a component module of a real-time service monitoring system based on a service scenario in an embodiment of the present invention.
Fig. 4 is an architecture diagram of a real-time service monitoring system based on service scenarios in a preferred embodiment of the present invention.
Detailed Description
The following examples illustrate the invention in detail: the embodiment is implemented on the premise of the technical scheme of the invention, and a detailed implementation mode and a specific operation process are given. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention.
Fig. 1 is a flowchart of a real-time service monitoring method based on a service scenario according to an embodiment of the present invention.
As shown in fig. 1, the method for monitoring a real-time service based on a service scenario provided in this embodiment may include the following steps:
S100, setting buried points in a service application log, and collecting the buried-point data in real time;
in this step, the log buried points are used to obtain service usage parameters, such as the dwell time (Time On Site) of a service order in each state.
S200, for the buried-point data, following the state-change process of each service order from the initial state to the final completed state, and monitoring in real time each service order whose total elapsed time exceeds a set threshold as a service abnormal event;
in this step, a threshold, namely the timeout threshold, is first set according to the service conditions. If the buried-point data show that the total elapsed time of a service order exceeds the set threshold, the order is judged abnormal and is subsequently monitored and tracked; if the total elapsed time does not exceed the threshold, the order is normal. The total elapsed time is the time a service order takes over the whole process from the initial state, through the processing state, to the final completed state. Each service order has an initial state and a final state; the intermediate state (i.e., the state during processing) is omitted during monitoring and need not be monitored.
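The threshold check described in this step can be sketched in a few lines of Python. This is a minimal illustration, not the embodiment's implementation: the names OrderTimes and is_abnormal are hypothetical, and the timestamps are assumed to be the arrival times of the initial-state and final-state events rather than extra fields recorded by the service system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OrderTimes:
    """Arrival times (in seconds) of the boundary-state events of one order."""
    order_id: str
    started_at: float             # when the initial-state event arrived
    finished_at: Optional[float]  # when the final-state event arrived, or None


def is_abnormal(order: OrderTimes, now: float, threshold_s: float) -> bool:
    """Return True when the order's total elapsed time exceeds the threshold.

    An order that has not reached a final state yet is measured up to `now`.
    """
    end = order.finished_at if order.finished_at is not None else now
    return (end - order.started_at) > threshold_s


# Hypothetical orders checked against a 3-minute (180 s) threshold.
finished_in_time = OrderTimes("order-a", started_at=0.0, finished_at=100.0)
still_pending = OrderTimes("order-b", started_at=0.0, finished_at=None)
```

With a 180-second threshold, `is_abnormal(still_pending, now=200.0, threshold_s=180.0)` flags the pending order, while the order that finished after 100 seconds is treated as normal regardless of the current time.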
To implement the monitoring better, in a preferred embodiment a Redis Lua script is used to pre-sort the state change logs of each service order, and a Flink CEP model is used to perform timeout pattern matching on each pre-sorted service order to obtain the service abnormal events.
As a preferred embodiment, the method provided by this embodiment may further include the following step: encapsulating the service abnormal event and sending it to a downstream alarm platform.
In S100 of this embodiment, as a preferred embodiment, setting buried points in a service application log and collecting the buried-point data in real time may include the following steps:
S101, setting buried points for application log data in the service system, and asynchronously sending the log data to topic partitions of a Kafka (distributed publish-subscribe message system) cluster;
S102, consuming, in real time and concurrently, the service order data in the different topic partitions with a Flink process according to the specified topics to which the Kafka consumer subscribes, and using a filter operator to keep only the data of service orders in the initial state and the final state, thereby obtaining the real-time buried-point data.
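The screening performed by the filter operator in S102 can be illustrated with a small Python generator. This sketch does not use the Flink API; the event layout (a dict with order_id and state keys) and the function name keep_boundary_states are assumptions, while the state codes 02, 00 and 01 are the ones the embodiment uses for the initial and final states.

```python
INITIAL_STATE = "02"         # initial state code used in the embodiment
FINAL_STATES = {"00", "01"}  # final state codes used in the embodiment


def keep_boundary_states(events):
    """Yield only events in the initial state or a final state.

    Intermediate (processing) states are dropped, mirroring the role of the
    filter operator described in S102.
    """
    for event in events:
        if event["state"] == INITIAL_STATE or event["state"] in FINAL_STATES:
            yield event


# Hypothetical buried-point events for one service order.
events = [
    {"order_id": "order-a", "state": "02"},
    {"order_id": "order-a", "state": "03"},
    {"order_id": "order-a", "state": "00"},
]
kept = list(keep_boundary_states(events))
```

Only the first and last events survive the filter, so the downstream stages never see the processing state 03.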
In S200 of this embodiment, as a preferred embodiment, pre-sorting the state change logs of each service order with the Redis Lua script may include the following step:
inputting the buried-point data of the same service order in its different states into Redis in turn, pre-sorting that data by state order through the Lua script, and deciding in the sorting logic whether a given piece of buried-point data temporarily resides in a hash data structure to wait or is sent downstream immediately.
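The "wait or forward" decision can be sketched as follows. This is an in-memory Python stand-in, not a Redis Lua script, and the exact ordering rules are an assumption: a final-state event that arrives before its initial-state event is parked in a dictionary (playing the role of the hash data structure) and released right after the initial-state event. The class name PreSorter and its fields are hypothetical.

```python
INITIAL_STATE = "02"
FINAL_STATES = {"00", "01"}


class PreSorter:
    """In-memory stand-in for the Redis hash plus the Lua ordering logic."""

    def __init__(self):
        self.seen_initial = set()  # orders whose initial event was forwarded
        self.parked = {}           # order_id -> final event waiting for its initial event

    def accept(self, event):
        """Return the events to forward downstream, in corrected order."""
        order_id, state = event["order_id"], event["state"]
        if state == INITIAL_STATE:
            if order_id in self.parked:
                # The final event arrived early: release it after the initial one.
                return [event, self.parked.pop(order_id)]
            self.seen_initial.add(order_id)
            return [event]
        if state in FINAL_STATES:
            if order_id in self.seen_initial:
                self.seen_initial.discard(order_id)
                return [event]
            self.parked[order_id] = event  # wait for the initial event
            return []
        return []  # intermediate states are not forwarded in this sketch


sorter = PreSorter()
early_final = sorter.accept({"order_id": "order-a", "state": "00"})
late_initial = sorter.accept({"order_id": "order-a", "state": "02"})
```

Here `early_final` is empty because the final-state event is parked, and `late_initial` contains the initial-state event followed by the parked final-state event. In Redis, a Lua script executes atomically on the server, which is one plausible reason for placing this decision in a script rather than in client code.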
In S200 of this embodiment, as a preferred embodiment, performing timeout pattern matching on each pre-sorted service order through the Flink CEP model may include the following steps:
S201, performing real-time pattern matching on each service order sent from upstream, using the CEP complex event processing model in the Flink process;
S202, according to the start condition and end condition preset in the CEP complex event processing model and the timeout threshold within which both conditions must be met, taking each service order that exceeds the timeout threshold as a service abnormal event.
In S200 of this embodiment, as a preferred embodiment, obtaining the service abnormal event may specifically include the following steps:
constructing a CEP complex event processing model from the start condition, end condition and timeout threshold of the service, and converting the upstream service event stream into a pattern stream (PatternStream) through a static method of the CEP complex event processing model; then calling the flatSelect method on the pattern stream to split matched results from timed-out ones; and finally calling the getSideOutput method to obtain the timeout event stream, which yields the service abnormal events.
In S202 of this embodiment, as a preferred embodiment, in the CEP complex event processing model, the preset start condition, end condition, and timeout threshold for completely achieving the start condition and the end condition may include:
the preset starting conditions are as follows: a service event initialization state 02;
the preset termination condition is as follows: service event end state 00 or 01;
the preset timeout threshold is: 3-5 minutes.
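The start condition, end condition and timeout threshold above can be illustrated with a minimal pure-Python sketch. It is not Flink CEP code: the class TimeoutMatcher, its method names and the 180-second window (the lower end of the 3-5 minute range) are assumptions made for illustration. Partial matches that expire are collected in a separate list, which plays the role of the timeout event stream.

```python
INITIAL_STATE = "02"
FINAL_STATES = {"00", "01"}


class TimeoutMatcher:
    """Per-order 'start followed by end within a time window' matcher."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.pending = {}    # order_id -> timestamp of the start event
        self.matched = []    # (order_id, elapsed) pairs completed in time
        self.timed_out = []  # order IDs whose partial match expired

    def on_event(self, order_id, state, ts):
        """Feed one event; `ts` is the event's timestamp in seconds."""
        self.advance_time(ts)
        if state == INITIAL_STATE:
            self.pending[order_id] = ts
        elif state in FINAL_STATES and order_id in self.pending:
            self.matched.append((order_id, ts - self.pending.pop(order_id)))

    def advance_time(self, now):
        """Expire every partial match older than the timeout window."""
        expired = [oid for oid, start in self.pending.items()
                   if now - start > self.timeout_s]
        for oid in expired:
            del self.pending[oid]
            self.timed_out.append(oid)


matcher = TimeoutMatcher(timeout_s=180)
matcher.on_event("order-a", "02", 0)
matcher.on_event("order-b", "02", 10)
matcher.on_event("order-a", "00", 60)  # order-a completes within the window
matcher.advance_time(200)              # order-b has now waited 190 s
```

After these calls, order-a appears in `matcher.matched` with an elapsed time of 60 seconds and order-b appears in `matcher.timed_out`; the latter list is what would be handed to the alarm step.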
In the real-time service monitoring method based on a service scenario provided by this embodiment, the service system asynchronously logs, via buried points, an event for each single state change of a service order; the method is decoupled from database table queries and does not affect the normal service processing flow. Under high concurrency, the out-of-order arrival of real-time Kafka data is resolved by sorting the data quickly and efficiently on the Redis server through a Lua script. As services expand and the service scale grows rapidly, the monitoring cluster can be scaled out smoothly. Timeout event ordering and timeout event monitoring are supported for various types of services (e.g., payment services).
Fig. 2 is a flowchart of a real-time service monitoring method based on a service scenario according to a preferred embodiment of the present invention.
As shown in fig. 2, the method for monitoring a real-time service based on a service scenario provided in the preferred embodiment may include the following steps:
Step 1, setting buried points for application log data in the service system and asynchronously sending the log data to topic partitions of a Kafka cluster;
Step 2, consuming, in real time and concurrently, the service order data in the different topic partitions with a Flink process according to the specified topics to which the Kafka consumer subscribes, using a filter operator to keep only the data of service orders in the initial state and the final state, and distributing that data downstream as the real-time buried-point data;
Step 3, inputting the buried-point data of the same service order in its different states into Redis in turn, pre-sorting that data by state order through a Lua script, and deciding in the sorting logic whether a given piece of buried-point data temporarily resides in a hash data structure to wait or is sent downstream immediately;
Step 4, grouping by the primary key ID of the service order, using the CEP complex event processing model (CEP complex event processing library) in the Flink process, that is, performing real-time pattern matching on each service order sent from upstream;
Step 5, according to the start condition and end condition set in the CEP complex event processing model and the timeout threshold (such as 3 minutes or 5 minutes) within which both conditions must be met, encapsulating the data of each service order that exceeds the timeout threshold in the agreed format and sending it to a downstream alarm platform.
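The encapsulation in Step 5 could look like the following Python sketch. The source does not specify the agreed message format, so the JSON layout, the field names and the function build_alarm_message are purely hypothetical.

```python
import json


def build_alarm_message(order_id, last_state, elapsed_s, threshold_s):
    """Serialize a timed-out service order as a JSON alarm payload.

    The schema is a hypothetical example; the real format is whatever the
    monitoring side and the alarm platform have agreed on.
    """
    payload = {
        "type": "ORDER_TIMEOUT",
        "order_id": order_id,
        "last_state": last_state,
        "elapsed_seconds": elapsed_s,
        "threshold_seconds": threshold_s,
    }
    return json.dumps(payload, sort_keys=True)


# Hypothetical order that stayed in the initial state for 200 s (limit 180 s).
message = build_alarm_message("order-b", "02", 200, 180)
```

The resulting string is what would be published to the topic consumed by the alarm platform.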
As a preferred embodiment, the states used for the start condition and the end condition in the CEP complex event processing model may be defined as:
Start condition: the service event is in the initial state 02.
End condition: the service event is in the end state 00 or 01.
As shown in table 1.
TABLE 1
State                 Code
Initial state         02
Intermediate state    03, 04, 05
Final state           00 or 01
The technical solution of the above embodiment of the present invention is further described in detail below with reference to a specific application example.
The transaction payment core application system on the node 1 machine receives a payment request, generates a payment order with order number p20210409xxxx in the initial state 02, and sends the initial-state service event to the specified topic of the Kafka cluster through a log buried point; the payment channel then processes the payment order, the order is set to the processing state 03, and the processing-state service event is sent to the specified topic of the Kafka cluster through a log buried point.
On the node 2 machine, after the payment channel has processed the payment order, the transaction payment core application system queries the payment result through the third-party channel interface, finds that the payment succeeded or failed, sets the state of the payment order to 00 or 01, and sends the final-state service event to the specified topic of the Kafka cluster through a log buried point.
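How an application might emit such state-change events without blocking the payment flow can be sketched with Python's standard logging queue. This is an illustration only: CollectingHandler stands in for a Kafka producer and merely stores the messages, and the logger name, the JSON event layout and the order ID are assumptions.

```python
import json
import logging
import logging.handlers
import queue


class CollectingHandler(logging.Handler):
    """Stand-in for a Kafka producer: stores messages instead of sending them."""

    def __init__(self):
        super().__init__()
        self.sent = []

    def emit(self, record):
        self.sent.append(record.getMessage())


log_queue = queue.Queue()
sink = CollectingHandler()
# The listener drains the queue on a background thread, so the business
# thread only pays the cost of enqueueing a record.
listener = logging.handlers.QueueListener(log_queue, sink)
listener.start()

logger = logging.getLogger("order.state")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(logging.handlers.QueueHandler(log_queue))


def log_state_change(order_id, state):
    """Buried point: record one state change of a service order."""
    logger.info(json.dumps({"order_id": order_id, "state": state}))


log_state_change("order-a", "02")  # initial state
log_state_change("order-a", "03")  # processing state
log_state_change("order-a", "00")  # final state
listener.stop()                    # flush the queue before inspecting the sink
```

A single process with a single listener preserves the order of events. The out-of-order arrival discussed in this example arises when several nodes publish events for the same order, which this sketch does not model.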
Normally, the states keep the precedence order of 02- >03- >00/01, but probably due to network delay and other reasons (the concurrency is very large), 00/01 state service events reach Kafka before 02 state service events, so that the precedence order of the states needs to be corrected, transient false alarms with ultrahigh concurrency are avoided, and all service events are processed through a Redis Lua script before entering Flink CEP processing.
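The reordering idea can be illustrated with an in-memory stand-in. The sketch below is written in Python rather than as a Redis Lua script, and a plain dictionary plays the role of the Redis hash. The holding rule (keep later states until state 02 of the same order has been seen) is an assumption made for illustration, as the source does not spell out the sorting logic; the class name `ReorderBuffer` is hypothetical.

```python
from collections import defaultdict

# Rank of each state code: initial < intermediate < final.
STATE_RANK = {"02": 0, "03": 1, "04": 1, "05": 1, "00": 2, "01": 2}


class ReorderBuffer:
    """Holds events that arrive before state 02 of the same order."""

    def __init__(self):
        self._started = set()           # orders whose 02 event was forwarded
        self._held = defaultdict(list)  # order id -> states waiting for 02

    def push(self, order_id, state):
        """Return the (order_id, state) events to send downstream now."""
        if state == "02":
            # Release anything that arrived early, in rank order.
            held = self._held.pop(order_id, [])
            held.sort(key=lambda s: STATE_RANK.get(s, 1))
            out = ["02"] + held
        elif order_id in self._started:
            out = [state]               # already in order, forward at once
        else:
            self._held[order_id].append(state)
            return []                   # wait for the 02 event
        if any(STATE_RANK.get(s) == 2 for s in out):
            self._started.discard(order_id)  # order finished, free memory
        else:
            self._started.add(order_id)
        return [(order_id, s) for s in out]
```

A production version would also need to expire held events whose 02 event never arrives; that is left out of this sketch.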
And (3) adopting Flink CEP to perform model matching on the service events processed by the Redis Lua script, and encapsulating the service events exceeding a set threshold value and sending the encapsulated service events to a Kafka cluster theme appointed by a downstream alarm platform.
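The timeout matching can be sketched without Flink as follows. This is a simplified Python simulation, not the Flink CEP API: the class `TimeoutMatcher` and its methods are hypothetical, and the arrival time `now` (in seconds) is passed in explicitly as a stand-in for the engine's own clock.

```python
FINAL_STATES = {"00", "01"}


class TimeoutMatcher:
    """Flags orders whose final state does not follow 02 within the timeout."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self._pending = {}  # order id -> arrival time of its 02 event

    def expire(self, now):
        """Return, and forget, the orders that have exceeded the timeout."""
        timed_out = sorted(
            oid for oid, start in self._pending.items()
            if now - start > self.timeout
        )
        for oid in timed_out:
            del self._pending[oid]
        return timed_out

    def on_event(self, order_id, state, now):
        """Process one ordered event; return orders that timed out by now."""
        timed_out = self.expire(now)
        if state == "02":
            self._pending.setdefault(order_id, now)  # start condition met
        elif state in FINAL_STATES:
            self._pending.pop(order_id, None)        # end condition met
        return timed_out
```

Because expiry is checked before each event is applied, a final state that arrives after the threshold still produces a timeout for that order.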
As can be seen from the above specific application examples, the technical solutions provided by the above embodiments of the present invention have the following technical effects:
the state order of payment orders is kept consistent under high-volume concurrency in a distributed application;
for pattern matching of timeout events, only the state attribute of the business event is needed, and no additional timestamp has to be recorded to calculate whether a timeout has occurred;
based on real-time complex event processing with the Flink cluster and Flink CEP, each payment order can be monitored for timeout without computing any other business metrics, so the monitoring is timely and accurate.
An embodiment of the present invention provides a real-time service monitoring system based on a service scenario. As shown in fig. 3, the real-time service monitoring system may include an application log buried point module, a real-time data acquisition module and a complex event processing module, wherein:
the application log buried point module buries points in the business application logs;
the real-time data acquisition module acquires the buried point data in real time through the application log buried point module;
the complex event processing module, according to the state change process of a business order from the initial state through the processing state to the final state, monitors in real time, on the basis of the buried point data, any business order whose total time consumption exceeds a timeout threshold as a business abnormal event; wherein the state change logs of each business order are pre-sorted by a Redis Lua script, and timeout pattern matching is performed on each pre-sorted business order through a Flink CEP model to obtain the business abnormal events.
As a preferred embodiment, the complex event processing module uses a Redis Lua script to pre-sort the state change logs of each business order, and performs timeout pattern matching on each pre-sorted business order through a Flink CEP model to obtain the business abnormal events.
As a preferred embodiment, the application log buried point module buries points in the application log data of the business system and asynchronously sends the log data to the topic partitions of the Kafka cluster; the real-time data acquisition module uses Flink processes to consume, in real time and concurrently, the business order data in the different partitions of the designated topic registered by the Kafka consumer, and uses a filter operator to screen out the business order data in the initial state and the final state, thereby obtaining the real-time buried point data.
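The screening step performed by the filter operator can be illustrated in plain Python. The sketch assumes that each log record is a JSON object with a `state` field; that record layout and the function name `filter_boundary_events` are assumptions for illustration, since the source does not give the log format.

```python
import json

# Only the initial state (02) and the final states (00, 01) are kept.
KEPT_STATES = {"02", "00", "01"}


def filter_boundary_events(lines):
    """Yield parsed records whose state is the initial or a final state."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(record, dict):
            continue
        state = record.get("state")
        if isinstance(state, str) and state in KEPT_STATES:
            yield record
```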
As a preferred embodiment, the system may further include an alarm platform module; wherein:
and the alarm platform module is used for carrying out real-time alarm according to the service abnormal event output by the complex event processing module.
The system architecture diagram of the real-time service monitoring system based on the service scenario provided by this embodiment is shown in fig. 4.
An embodiment of the present invention provides a terminal, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor can be configured to execute the method in any one of the above embodiments when executing the computer program.
Optionally, the memory is used for storing a program. The memory may include a volatile memory, for example a random-access memory (RAM) such as a static random-access memory (SRAM) or a double data rate synchronous dynamic random-access memory (DDR SDRAM); the memory may also include a non-volatile memory, such as a flash memory. The memory is used to store computer programs (e.g., applications or functional modules that implement the above-described methods), computer instructions and the like, which may be stored in partitions in one or more memories. The computer programs, computer instructions, data and the like described above may be invoked by the processor.
A processor for executing the computer program stored in the memory to implement the steps of the method according to the above embodiments. Reference may be made in particular to the description relating to the preceding method embodiment.
The processor and the memory may be separate structures or may be integrated into a single structure. When the processor and the memory are separate structures, the memory and the processor may be coupled by a bus.
An embodiment of the invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, is operable to perform the method of any one of the above.
The real-time service monitoring method, system, terminal and medium based on a service scenario provided by the embodiments of the present invention perform timeout monitoring on the business flow of each business order: a timeout threshold of 3 to 5 minutes is configured for the state change process of the business order from the initial state through the processing state to the final state of payment completion (success or failure), and business orders whose total time consumption exceeds the timeout threshold are monitored in real time.
In the method, system, terminal and medium provided by the embodiments of the present invention, the distributed messaging system Kafka is used as the log collection and transport component, so that very large volumes of real-time log data can be collected and transmitted with high throughput and low latency; the custom state-sorting algorithm implemented with the Redis cache and a Lua script can very quickly sort the out-of-order states of the same payment order and provide an ordered event stream downstream; and the Flink CEP complex event processing engine accurately matches timeout events from log data in different states in real time, encapsulates each timeout event into a timeout alarm message, and sends it to the alarm platform.
The method, system, terminal and medium provided by the embodiments of the present invention bury points in each business application system deployed in a distributed manner and adopt back-end asynchronous buried point technology, so that the state change logs of payment orders are collected into the distributed messaging system Kafka in real time without affecting the business processing flow. Kafka's high throughput and low latency make it possible to process large-scale business concurrency in real time, and the monitoring is fully decoupled from the business system database.
In the method, system, terminal and medium provided by the embodiments of the present invention, a Redis Lua script is used to sort the state change logs of each business order in Kafka. Because payment order states change quickly under high concurrency, the order in which events arrive at Kafka may differ from the actual order of the state changes, which causes disorder; pre-sorting is therefore needed to prepare for the complex event pattern matching in the next step.
The method, system, terminal and medium provided by the embodiments of the present invention use the Flink CEP library to perform timeout pattern matching on each business order whose states have been pre-sorted. Each business order is matched in real time against the written pattern-matching rules; business orders exceeding the time threshold are judged to be timeout abnormal orders, and the related information is packaged and sent to the alarm platform.
The method, system, terminal and medium provided by the embodiments of the present invention fit the business scenario well, monitor each business order, detect timed-out abnormal business orders promptly, and can be extended to any business scenario that involves a sequence of state changes.
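Packaging a timed-out order for the alarm platform could look like the sketch below. The message layout, the field names and the `ORDER_TIMEOUT` type are purely illustrative assumptions, because the agreed message format is not given in the source; the function `build_timeout_alarm` is likewise hypothetical.

```python
import json


def build_timeout_alarm(order_id, last_state, timeout_seconds):
    """Encode one timeout alarm as UTF-8 JSON bytes (hypothetical layout)."""
    payload = {
        "type": "ORDER_TIMEOUT",
        "order_id": order_id,
        "last_state": last_state,
        "timeout_seconds": timeout_seconds,
    }
    # sort_keys keeps the encoded message stable across runs.
    return json.dumps(payload, sort_keys=True).encode("utf-8")
```

The returned bytes could then be handed to whatever producer writes to the alarm platform's Kafka topic.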
It should be noted that, the steps in the method provided by the present invention may be implemented by using corresponding modules, devices, units, and the like in the system, and those skilled in the art may implement the composition of the system by referring to the technical solution of the method, that is, the embodiment in the method may be understood as a preferred example for constructing the system, and will not be described herein again.
Those skilled in the art will appreciate that, in addition to implementing the system and its devices provided by the present invention purely as computer-readable program code, the method steps can be logically programmed so that the system and its devices implement the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system and its devices provided by the present invention can be regarded as a hardware component, and the devices included therein for realizing various functions can also be regarded as structures within the hardware component; the devices for realizing the functions can even be regarded both as software modules implementing the method and as structures within the hardware component.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes and modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention.

Claims (10)

1. A real-time service monitoring method based on service scenes is characterized by comprising the following steps:
embedding points in a service application log, and collecting the data of the embedded points in real time;
and, according to the collected buried point data, monitoring in real time, as a business abnormal event, any business order whose total time consumption exceeds a set threshold, wherein the total time consumption is the time consumed from the initial state of the business order to its final completed state.
2. The real-time service monitoring method based on a service scenario as claimed in claim 1, wherein the burying of points in the service application log and the collecting of the buried point data in real time comprise:
burying points in the application log data of the business system, and asynchronously sending the log data to the topic partitions of a Kafka cluster;
and consuming, in real time and concurrently by means of Flink processes, the business order data in the different partitions of the designated topic registered by the Kafka consumer, and screening out the business order data in the initial state and the final state with a filter operator to obtain the real-time buried point data.
3. The real-time service monitoring method based on a service scenario as claimed in claim 1, wherein the monitoring in real time, according to the collected buried point data, of a business order whose total time consumption exceeds a set threshold as a business abnormal event comprises:
pre-sorting the state change logs of each business order with a Redis Lua script;
and performing timeout pattern matching on each pre-sorted business order through a Flink CEP model to obtain the business abnormal event.
4. The real-time service monitoring method based on a service scenario as claimed in claim 3, wherein the pre-sorting of the state change logs of each business order with a Redis Lua script comprises:
inputting the buried point data of the same business order in its different states into Redis in sequence, pre-sorting the buried point data of the different states through the Lua script, and determining in the sorting logic whether a given piece of buried point data temporarily resides in a hash data structure to wait or is immediately sent downstream.
5. The real-time service monitoring method based on a service scenario as claimed in claim 4, wherein the performing of timeout pattern matching on each pre-sorted business order through a Flink CEP library comprises:
performing real-time pattern matching on each business order sent from upstream by using a CEP complex event processing model in a Flink process;
and, according to the start condition and end condition preset in the CEP complex event processing model and the timeout threshold within which both conditions must be met, taking the business order exceeding the timeout threshold as a business abnormal event.
6. The real-time service monitoring method based on a service scenario as claimed in claim 5, wherein the business abnormal event is obtained by:
constructing a CEP complex event processing model according to the start condition, the end condition and the timeout threshold of the business, and converting the upstream business event stream into a pattern stream (PatternStream) through a static method of the CEP complex event processing model; then calling the flatSelect method on the PatternStream to match and split the stream, and finally calling the getSideOutput method to obtain the timeout event stream, thereby obtaining the business abnormal event;
and/or
In the CEP complex event processing model:
the preset starting conditions are as follows: a service event initialization state 02;
the preset termination condition is as follows: service event end state 00 or 01;
the preset timeout threshold is: 3-5 minutes.
7. The real-time service monitoring method based on the service scenario as claimed in any one of claims 1 to 6, further comprising:
and packaging the service abnormal event and sending the service abnormal event to a downstream alarm platform.
8. A real-time service monitoring system based on service scenes is characterized by comprising the following components:
an application log buried point module, which buries points in the business application logs;
a real-time data acquisition module, which acquires the buried point data in real time through the application log buried point module;
and a complex event processing module, which, according to the collected buried point data, monitors in real time, as a business abnormal event, any business order whose total time consumption exceeds a set threshold, wherein the total time consumption is the time consumed from the initial state of the business order to its final completed state.
9. A terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, is operative to perform the method of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, is adapted to carry out the method of any one of claims 1 to 7.
CN202110439635.1A 2021-04-23 2021-04-23 Real-time service monitoring method, system, terminal and medium based on service scene Active CN113204464B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110439635.1A CN113204464B (en) 2021-04-23 2021-04-23 Real-time service monitoring method, system, terminal and medium based on service scene

Publications (2)

Publication Number Publication Date
CN113204464A true CN113204464A (en) 2021-08-03
CN113204464B CN113204464B (en) 2023-04-25

Family

ID=77028051

Country Status (1)

Country Link
CN (1) CN113204464B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107729214A (en) * 2017-10-13 2018-02-23 福建富士通信息软件有限公司 A kind of visual distributed system monitors O&M method and device in real time
CN109241128A (en) * 2018-07-16 2019-01-18 北京百度网讯科技有限公司 A kind of expired events automatic trigger method and system
CN109660402A (en) * 2018-12-25 2019-04-19 钛马信息网络技术有限公司 Operation system realtime running monitor supervision platform and method
CN110928717A (en) * 2019-11-14 2020-03-27 北京神州绿盟信息安全科技股份有限公司 Complex time sequence event detection method and device
CN112422445A (en) * 2020-10-10 2021-02-26 四川新网银行股份有限公司 Kafka-based real-time acquisition, calculation and storage method for buried point data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
秋华: "《flink(七)电商用户行为分析(七)订单支付实时监控之订单超时、订单交易匹配》", 《HTTPS://WWW.CNBLOGS.COM/QIU-HUA/P/13492162.HTML》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115514678A (en) * 2022-09-23 2022-12-23 四川新网银行股份有限公司 Continuity monitoring method and device for internet financial business
CN115514678B (en) * 2022-09-23 2023-09-26 四川新网银行股份有限公司 Continuity monitoring method for internet financial business

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant