CN110365537A - Middleware business fault treatment method and system - Google Patents

Middleware business fault treatment method and system

Info

Publication number
CN110365537A
Authority
CN
China
Prior art keywords
middleware
data
business datum
business
service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910648096.5A
Other languages
Chinese (zh)
Inventor
靳扬
李雅洁
张辉疆
郭江涛
买合布拜·肖开提
王燕军
朱毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Network Xinjiang Electric Power Co Ltd Information And Communication Co
Information and Telecommunication Branch of State Grid Xinjiang Electric Power Co Ltd
Original Assignee
National Network Xinjiang Electric Power Co Ltd Information And Communication Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Network Xinjiang Electric Power Co Ltd Information And Communication Co
Priority to CN201910648096.5A
Publication of CN110365537A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3055 Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3089 Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0817 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning

Abstract

The present invention provides a middleware business fault treatment method and system. The method includes: S1: sending business data from different application systems to a monitoring system through middleware, so that the business data can be monitored; S2: setting up management nodes responsible for fault handling, with one management node selected as the master node and the rest as standby nodes; S3: determining whether a fault has occurred and, if so, having the monitoring system notify the message middleware service module to send a fault message to the management center service module of the master node; and so on. The invention uses message middleware and single-node monitoring programs to form a monitoring network covering every node of a computer cluster, monitoring the service state and network state of each node in real time. It calculates confidence intervals for response time and business volume together with a business efficiency baseline value, determines whether the business volume and average response time of a middleware link with poor service efficiency deviate from the business efficiency baseline value, and resets the middleware link according to the result.

Description

Middleware business fault treatment method and system
Technical field
The present invention relates to the field of new energy technologies, and in particular to a middleware business fault treatment method and system.
Background art
In today's IT field, middleware is an important component of basic software, on a par with operating systems and databases. It is developing rapidly worldwide and has formed a huge industry; domestically, middleware should be one of the fastest-growing markets in the entire software industry. Traditional middleware monitoring techniques currently cover index parameters such as thread count, processing time, request count, byte count, cluster, stack, thread pool, connection pool, and Web (World Wide Web) applications.
In existing business systems, the dozens of links in a middleware cluster are independent of one another. Each link can receive requests issued by clients and interact with the back-end database and other components to process transactions. When a single middleware link fails, as long as the load balancer at the front end of the middleware cluster still works normally, it distributes new service requests to the other links, so the system as a whole does not enter a "fully blocked" fault state.
For more serious faults, the usual approach is for maintenance personnel to enter the machine room, search for the failed machine among the many nodes of the computer cluster, determine the cause of the failure, and then carry out repairs. As the number of nodes grows, more maintenance personnel and more work are needed, which is both costly and inefficient.
Summary of the invention
In view of this, the present invention provides a middleware business fault treatment method and system to address the above shortcomings and deficiencies of the prior art.
To achieve the above object, embodiments of the present invention provide the following technical solutions:
A middleware business fault treatment method, comprising:
S1: sending business data from different application systems to a monitoring system through middleware, so as to monitor the business data;
S2: setting up management nodes responsible for fault handling, selecting one of the management nodes as the master node and the rest as standby nodes;
S3: determining whether a fault has occurred; if so, the monitoring system notifies the message middleware service module to send a fault message to the management center service module of the master node;
S4: the management center service module of the master node handles the fault according to the fault message;
S5: screening out, from the fault message, the middleware links with poor service efficiency;
S6: determining whether the business volume and average response time of a middleware link with poor service efficiency fall within the confidence intervals of the response time and business volume; if so, executing step S7; otherwise, resetting the middleware link;
S7: determining whether the business volume and average response time of the middleware link with poor service efficiency deviate from the business efficiency baseline value; if so, resetting the middleware link; otherwise, taking no action.
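The branching in steps S6 and S7 can be illustrated with a minimal Python sketch. The text does not define how service efficiency is computed or how large a deviation from the baseline counts, so the `Interval` type, the efficiency formula (business volume divided by average response time), and the `tolerance` parameter are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [low, high]; a hypothetical helper type."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


def decide_link_action(volume: float, avg_response: float,
                       volume_ci: Interval, response_ci: Interval,
                       baseline_efficiency: float,
                       tolerance: float = 0.2) -> str:
    """Return "reset" or "none" for one poorly performing middleware link.

    avg_response and baseline_efficiency must be positive.
    """
    # S6: a link outside either confidence interval is reset immediately.
    if not (volume_ci.contains(volume) and response_ci.contains(avg_response)):
        return "reset"
    # S7: inside the intervals, compare against the efficiency baseline.
    # Assumption: efficiency = volume / average response time, and a relative
    # deviation above `tolerance` counts as deviating from the baseline.
    efficiency = volume / avg_response
    deviation = abs(efficiency - baseline_efficiency) / baseline_efficiency
    return "reset" if deviation > tolerance else "none"
```

In this sketch a link is left alone only when it passes both checks, which matches the order of S6 and S7: the interval test runs first and the baseline comparison is reached only if it passes.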
Further, the monitoring process of the monitoring system comprises:
S101: performing data storage and data modeling on the business data according to a preset storage rule;
S102: performing data classification on the business data that has undergone the data storage and data modeling;
S103: establishing a business correlation model according to the data classification, so as to trace each item of business data;
S104: setting an early-warning threshold according to the business correlation model, so as to monitor the business data.
Further, the method also comprises: the management center service module of the master node marks the state of each node with a defined state value and notifies maintenance personnel of the specific fault information.
A second aspect of the present invention discloses a middleware business fault processing system, characterized by comprising a monitoring system, a screening unit, a confidence interval confirmation unit, a baseline value confirmation unit, and a reset unit, wherein:
The monitoring system comprises a judging unit, a data classification unit, a business model unit, a data monitoring unit, and a data storage unit;
The judging unit is configured to judge, before the business data is sent to the monitoring system through the middleware, whether the business data should be sent to the monitoring system;
The data storage unit is configured to perform data storage and data modeling on the business data in a NoSQL database according to a preset storage rule;
The data classification unit is configured to perform data classification on the business data that has undergone the data storage and data modeling;
The business model unit is configured to establish a business correlation model according to the data classification, so as to trace each item of business data;
The data monitoring unit is configured to set an early-warning threshold according to the business correlation model, so as to monitor the business data;
The screening unit is configured to calculate service efficiency from the business data and average response time monitored by the monitoring system, and to screen out the middleware links with poor service efficiency;
The confidence interval confirmation unit is configured to determine whether the business data and average response time of a middleware link with poor service efficiency fall within the confidence intervals of the response time and business volume;
The baseline value confirmation unit is configured to determine whether the business data and average response time of the middleware link with poor service efficiency deviate from the business efficiency baseline value;
The reset unit is configured to reset the middleware link.
Further, the monitoring system also comprises a data conversion unit, which is configured to send business data from different application systems to the monitoring system through middleware so that the business data can be monitored; when the judgment module's result is yes, the business data is converted into XML data by an adapter and then sent to the monitoring system;
Further, the data storage unit is specifically configured to store the XML data in the NoSQL database Cassandra according to the preset storage rule and to carry out the data modeling; and the data classification module is specifically configured to calculate the weight value of the fields of each item of XML data and to classify the XML data according to the weight value.
The present invention uses message middleware and single-node monitoring programs to form a monitoring network covering every node of a computer cluster, monitoring the service state and network state of each node in real time. It calculates confidence intervals for response time and business volume together with a business efficiency baseline value, and screens out the middleware links with poor service efficiency. For each such link it determines whether the business volume and average response time fall within the confidence intervals of response time and business volume, and whether they deviate from the business efficiency baseline value, and it resets the middleware link according to the result. Compared with the prior art, the invention addresses the reduced efficiency of high-real-time business caused by single-link problems in a middleware cluster, as well as the time-consuming nature of manual troubleshooting and the tendency for problems to escalate, thereby improving working efficiency and reducing risk in practical scenarios.
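The text does not state how the confidence intervals or the business efficiency baseline value are calculated. One possible reading, offered only as a sketch, derives both from historical per-link samples: a normal-approximation band of mean plus or minus z standard deviations for each metric, and the median of historical volume-to-response-time ratios as the baseline. The function names, the choice of z = 1.96, and the use of the median are assumptions, not details taken from the patent.

```python
import statistics


def confidence_band(samples, z=1.96):
    """Return (low, high) as mean +/- z * sample standard deviation.

    Assumes roughly normal data; needs at least two samples.
    """
    mean = statistics.fmean(samples)
    std = statistics.stdev(samples)
    return (mean - z * std, mean + z * std)


def efficiency_baseline(volumes, response_times):
    """Median of historical volume / response-time ratios (assumed definition)."""
    return statistics.median(v / t for v, t in zip(volumes, response_times))
```

The band would be computed once per metric (business volume and average response time) and refreshed as new monitoring data arrives; a median baseline is used here because it is less sensitive to the occasional failed link than a mean.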
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a middleware business fault treatment method provided by embodiment one of the present invention;
Fig. 2 is a flowchart of another middleware business fault treatment method provided by embodiment one of the present invention;
Fig. 3 is a structural schematic diagram of a middleware business fault processing system provided by embodiment two of the present invention;
Fig. 4 is a structural schematic diagram of another middleware business fault processing system provided by embodiment three of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
Embodiment one
Embodiment one of the present invention provides a thermoelectric relationship prediction method for heating units based on a relevance vector machine, applicable to heating units. Referring to Fig. 1, the method comprises:
S1: sending business data from different application systems to a monitoring system through middleware, so as to monitor the business data;
S2: setting up management nodes responsible for fault handling, selecting one of the management nodes as the master node and the rest as standby nodes;
S3: determining whether a fault has occurred; if so, the monitoring system notifies the message middleware service module to send a fault message to the management center service module of the master node;
S4: the management center service module of the master node handles the fault according to the fault message;
S5: screening out, from the fault message, the middleware links with poor service efficiency;
S6: determining whether the business volume and average response time of a middleware link with poor service efficiency fall within the confidence intervals of the response time and business volume; if so, executing step S7; otherwise, resetting the middleware link;
S7: determining whether the business volume and average response time of the middleware link with poor service efficiency deviate from the business efficiency baseline value; if so, resetting the middleware link; otherwise, taking no action.
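Steps S2 to S4 describe a small message flow: one management node acts as master, and fault messages travel through the message middleware to the master's management center service module. The sketch below shows that flow under stated assumptions. Python's in-process `queue.Queue` stands in for the message middleware, the master is chosen as the node with the lowest identifier, and the class and function names are hypothetical; the text does not prescribe any of these details.

```python
import queue
from dataclasses import dataclass, field


@dataclass
class FaultMessage:
    node_id: str   # node on which the fault was detected
    detail: str    # human-readable fault description


@dataclass
class ManagementCenter:
    """Stand-in for the management center service module of one node."""
    node_id: str
    handled: list = field(default_factory=list)

    def handle(self, message: FaultMessage) -> None:
        # S4: real fault handling would go here; this sketch only records it.
        self.handled.append(message)


def choose_master(nodes):
    """S2: pick one master; the rest become standby nodes.

    Assumption: the lowest node_id wins.
    """
    ordered = sorted(nodes, key=lambda n: n.node_id)
    return ordered[0], ordered[1:]


def report_fault(bus: queue.Queue, message: FaultMessage) -> None:
    """S3: the monitoring system publishes a fault message to the bus."""
    bus.put(message)


def deliver_pending(bus: queue.Queue, master: ManagementCenter) -> int:
    """Deliver every queued fault message to the master; return the count."""
    delivered = 0
    while True:
        try:
            message = bus.get_nowait()
        except queue.Empty:
            return delivered
        master.handle(message)
        delivered += 1
```

A production deployment would replace the in-process queue with an actual message broker and would need a rule for promoting a standby node when the master itself fails, which the text does not describe.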
Optionally, the specific execution of S1 shown in Fig. 1 above is as shown in Fig. 2 and comprises the following steps:
Further, the monitoring process of the monitoring system comprises:
S101: performing data storage and data modeling on the business data according to a preset storage rule;
S102: performing data classification on the business data that has undergone the data storage and data modeling;
S103: establishing a business correlation model according to the data classification, so as to trace each item of business data;
S104: setting an early-warning threshold according to the business correlation model, so as to monitor the business data.
Multiple different application systems are connected through middleware, and their business data is sent to the monitoring system. Through data storage and data modeling, and by exploiting the ability of a NoSQL database to work without a data structure specified in advance, the structure of the data model is designed to change dynamically, so that the model can grow and learn by itself during actual operation and maintenance. Then, through data classification, establishment of the business correlation model, and setting of monitoring and early-warning thresholds, intelligent monitoring of different business data can be achieved. This effectively improves generality and adaptability to changes in business data and business rules, and can quickly meet the monitoring needs of large conglomerates. Self-learning is achieved by using big data, and the goal of monitoring business data is reached through intelligent modeling, intelligent classification, and intelligent prediction and early warning.
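The idea of a data model whose structure is not fixed in advance, combined with threshold-based early warning, can be sketched as follows. This is a plain in-memory illustration, not the NoSQL storage the text refers to: `DynamicModel`, `breaches`, and the example field names are hypothetical, and the rule "a numeric field above its limit triggers a warning" is an assumed reading of the early-warning threshold.

```python
class DynamicModel:
    """Tracks which fields have been seen; grows as new fields appear."""

    def __init__(self):
        self.field_counts = {}  # field name -> number of records containing it

    def observe(self, record: dict) -> list:
        """Fold one record into the model and return any newly seen fields."""
        new_fields = [name for name in record if name not in self.field_counts]
        for name in record:
            self.field_counts[name] = self.field_counts.get(name, 0) + 1
        return new_fields


def breaches(record: dict, thresholds: dict) -> list:
    """Return the fields whose numeric value exceeds its warning threshold."""
    return [
        name for name, limit in thresholds.items()
        if isinstance(record.get(name), (int, float)) and record[name] > limit
    ]
```

Because the model is just a mapping from field name to count, a record from a new application system with previously unseen fields extends the model instead of being rejected, which is the behaviour the paragraph above attributes to schema-free storage.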
As needed, the method further comprises: the management center service module of the master node marks the state of each node with a defined state value and notifies maintenance personnel of the specific fault information.
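As a small illustration of marking node state with defined state values, the sketch below uses an integer enumeration and builds a notification string only for faulty nodes. The specific states, their numeric values, and the message format are hypothetical; the text says only that defined state values are used and that maintenance personnel are notified.

```python
from enum import IntEnum
from typing import Optional


class NodeState(IntEnum):
    """Hypothetical state values; the patent does not enumerate them."""
    NORMAL = 0
    SERVICE_FAULT = 1
    NETWORK_FAULT = 2


def fault_notice(node_id: str, state: NodeState) -> Optional[str]:
    """Return a message for maintenance personnel, or None if the node is healthy."""
    if state is NodeState.NORMAL:
        return None
    return f"node {node_id}: {state.name} (state value {int(state)})"
```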
Embodiment two
Based on the middleware business fault treatment method disclosed in embodiment one above, an embodiment of the present invention also discloses a middleware business fault processing system which, as shown in Fig. 3, comprises a monitoring system, a screening unit, a confidence interval confirmation unit, a baseline value confirmation unit, and a reset unit, wherein:
The monitoring system comprises a judging unit, a data classification unit, a business model unit, a data monitoring unit, and a data storage unit;
The judging unit is configured to judge, before the business data is sent to the monitoring system through the middleware, whether the business data should be sent to the monitoring system;
The data storage unit is configured to perform data storage and data modeling on the business data in a NoSQL database according to a preset storage rule;
The data classification unit is configured to perform data classification on the business data that has undergone the data storage and data modeling;
The business model unit is configured to establish a business correlation model according to the data classification, so as to trace each item of business data;
The data monitoring unit is configured to set an early-warning threshold according to the business correlation model, so as to monitor the business data;
The screening unit is configured to calculate service efficiency from the business data and average response time monitored by the monitoring system, and to screen out the middleware links with poor service efficiency;
The confidence interval confirmation unit is configured to determine whether the business data and average response time of a middleware link with poor service efficiency fall within the confidence intervals of the response time and business volume;
The baseline value confirmation unit is configured to determine whether the business data and average response time of the middleware link with poor service efficiency deviate from the business efficiency baseline value;
The reset unit is configured to reset the middleware link.
As shown in Figs. 3 and 4, the monitoring system further comprises a data conversion unit, which is configured to send business data from different application systems to the monitoring system through middleware so that the business data can be monitored; when the judgment module's result is yes, the business data is converted into XML data by an adapter and then sent to the monitoring system;
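The adapter's conversion of business data to XML might look like the following sketch, which uses Python's standard `xml.etree.ElementTree` module. The flat key-value record shape, the `businessData` root tag, and the function name are assumptions; the text does not describe the XML layout. Keys are assumed to be valid XML element names.

```python
import xml.etree.ElementTree as ET


def to_xml(record: dict, root_tag: str = "businessData") -> str:
    """Serialize a flat record as XML, one child element per field."""
    root = ET.Element(root_tag)
    for key, value in record.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")
```

For example, `to_xml({"system": "billing", "amount": 42})` yields a `businessData` element with `system` and `amount` children, giving the monitoring system one uniform format regardless of which application system produced the record.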
As shown in Figs. 3 and 4, the data storage unit is specifically configured to store the XML data in the NoSQL database Cassandra according to the preset storage rule and to carry out the data modeling; and the data classification module is specifically configured to calculate the weight value of the fields of each item of XML data and to classify the XML data according to the weight value.
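How field weights are calculated and turned into classes is not specified. The sketch below shows one simple possibility, purely for illustration: each known field carries a configured weight, a record's weight is the sum over its non-empty fields, and a cutoff splits records into two classes. The weight table, the field names, the cutoff, and the class labels are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-field weights; unknown fields count as 0.
FIELD_WEIGHTS = {"amount": 0.5, "customerId": 0.3, "remark": 0.1}


def record_weight(xml_text: str, weights: dict = FIELD_WEIGHTS) -> float:
    """Sum the weights of the non-empty top-level fields in one XML record."""
    root = ET.fromstring(xml_text)
    return sum(
        weights.get(child.tag, 0.0)
        for child in root
        if (child.text or "").strip()
    )


def classify(xml_text: str, cutoff: float = 0.5) -> str:
    """Label a record "key" when its weight reaches the cutoff, else "routine"."""
    return "key" if record_weight(xml_text) >= cutoff else "routine"
```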
The present invention uses message middleware and single-node monitoring programs to form a monitoring network covering every node of a computer cluster, monitoring the service state and network state of each node in real time. It calculates confidence intervals for response time and business volume together with a business efficiency baseline value, and screens out the middleware links with poor service efficiency. For each such link it determines whether the business volume and average response time fall within the confidence intervals of response time and business volume, and whether they deviate from the business efficiency baseline value, and it resets the middleware link according to the result. Compared with the prior art, the invention addresses the reduced efficiency of high-real-time business caused by single-link problems in a middleware cluster, as well as the time-consuming nature of manual troubleshooting and the tendency for problems to escalate, thereby improving working efficiency and reducing risk in practical scenarios.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (6)

1. A middleware business fault treatment method, characterized by comprising the following steps:
S1: sending business data from different application systems to a monitoring system through middleware, so as to monitor the business data;
S2: setting up management nodes responsible for fault handling, selecting one of the management nodes as the master node and the rest as standby nodes;
S3: determining whether a fault has occurred; if so, the monitoring system notifies the message middleware service module to send a fault message to the management center service module of the master node;
S4: the management center service module of the master node handles the fault according to the fault message;
S5: screening out, from the fault message, the middleware links with poor service efficiency;
S6: determining whether the business volume and average response time of a middleware link with poor service efficiency fall within the confidence intervals of the response time and business volume; if so, executing step S7; otherwise, resetting the middleware link;
S7: determining whether the business volume and average response time of the middleware link with poor service efficiency deviate from the business efficiency baseline value; if so, resetting the middleware link; otherwise, taking no action.
2. The method according to claim 1, characterized in that the monitoring process of the monitoring system comprises:
S101: performing data storage and data modeling on the business data according to a preset storage rule;
S102: performing data classification on the business data that has undergone the data storage and data modeling;
S103: establishing a business correlation model according to the data classification, so as to trace each item of business data;
S104: setting an early-warning threshold according to the business correlation model, so as to monitor the business data.
3. The method according to claim 1 or 2, characterized by further comprising: the management center service module of the master node marks the state of each node with a defined state value and notifies maintenance personnel of the specific fault information.
4. A middleware business fault processing system, characterized by comprising a monitoring system, a screening unit, a confidence interval confirmation unit, a baseline value confirmation unit, and a reset unit, wherein:
The monitoring system comprises a judging unit, a data classification unit, a business model unit, a data monitoring unit, and a data storage unit;
The judging unit is configured to judge, before the business data is sent to the monitoring system through the middleware, whether the business data should be sent to the monitoring system;
The data storage unit is configured to perform data storage and data modeling on the business data in a NoSQL database according to a preset storage rule;
The data classification unit is configured to perform data classification on the business data that has undergone the data storage and data modeling;
The business model unit is configured to establish a business correlation model according to the data classification, so as to trace each item of business data;
The data monitoring unit is configured to set an early-warning threshold according to the business correlation model, so as to monitor the business data;
The screening unit is configured to calculate service efficiency from the business data and average response time monitored by the monitoring system, and to screen out the middleware links with poor service efficiency;
The confidence interval confirmation unit is configured to determine whether the business data and average response time of a middleware link with poor service efficiency fall within the confidence intervals of the response time and business volume;
The baseline value confirmation unit is configured to determine whether the business data and average response time of the middleware link with poor service efficiency deviate from the business efficiency baseline value;
The reset unit is configured to reset the middleware link.
5. The system according to claim 4, characterized in that the prediction model training unit comprises:
The monitoring system further comprises a data conversion unit, which is configured to send business data from different application systems to the monitoring system through middleware so that the business data can be monitored; when the judgment module's result is yes, the business data is converted into XML data by an adapter and then sent to the monitoring system.
6. The system according to claim 4 or 5, characterized in that the data storage unit is specifically configured to store the XML data in the NoSQL database Cassandra according to the preset storage rule and to carry out the data modeling; and the data classification module is specifically configured to calculate the weight value of the fields of each item of XML data and to classify the XML data according to the weight value.
CN201910648096.5A 2019-07-16 2019-07-16 Middleware business fault treatment method and system Pending CN110365537A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910648096.5A CN110365537A (en) 2019-07-16 2019-07-16 Middleware business fault treatment method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910648096.5A CN110365537A (en) 2019-07-16 2019-07-16 Middleware business fault treatment method and system

Publications (1)

Publication Number Publication Date
CN110365537A true CN110365537A (en) 2019-10-22

Family

ID=68220303

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910648096.5A Pending CN110365537A (en) 2019-07-16 2019-07-16 Middleware business fault treatment method and system

Country Status (1)

Country Link
CN (1) CN110365537A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110955551A (en) * 2019-11-26 2020-04-03 上海新炬网络技术有限公司 Fault intelligent diagnosis device based on tomcat middleware
CN111143163A (en) * 2019-12-13 2020-05-12 上海硬通网络科技有限公司 Data monitoring method and device, computer equipment and storage medium
CN113630284A (en) * 2020-05-08 2021-11-09 网联清算有限公司 Message middleware monitoring method, device and equipment

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110955551A (en) * 2019-11-26 2020-04-03 上海新炬网络技术有限公司 Fault intelligent diagnosis device based on tomcat middleware
CN111143163A (en) * 2019-12-13 2020-05-12 上海硬通网络科技有限公司 Data monitoring method and device, computer equipment and storage medium
CN111143163B (en) * 2019-12-13 2024-04-16 上海硬通网络科技有限公司 Data monitoring method, device, computer equipment and storage medium
CN113630284A (en) * 2020-05-08 2021-11-09 网联清算有限公司 Message middleware monitoring method, device and equipment

Similar Documents

Publication Publication Date Title
CN110365537A (en) Middleware business fault treatment method and system
CN105187249B (en) A kind of fault recovery method and device
CN104796273B (en) A kind of method and apparatus of network fault root diagnosis
CN110430071A (en) Service node fault self-recovery method, apparatus, computer equipment and storage medium
CN103281366B (en) A kind of support real-time running state to obtain embedded agent supervising device and method
CN107544839B (en) Virtual machine migration system, method and device
CN104683446A (en) Method and system for monitoring service states of cloud storage cluster nodes in real time
CN101183993B (en) Network management system and performance data processing method
CN103853627A (en) Method and system for analyzing root causes of relating performance issues among virtual machines to physical machines
CN105245381B (en) Cloud Server delay machine monitors migratory system and method
CN106878466B (en) A kind of Hydropower Unit data management and equipment control unified platform
CN102254016B (en) Cloud-computing-environment-oriented fault-tolerant parallel Skyline inquiry method
US20220052923A1 (en) Data processing method and device, storage medium and electronic device
CN112162907A (en) Health degree evaluation method based on monitoring index data
CN111817911A (en) Method and device for detecting network quality, computing equipment and storage medium
CN105306272A (en) Method and system for collecting fault scene information of information system
CN107947998A (en) A kind of real-time monitoring system based on application system
CN105516293A (en) Cloud resource monitoring system of intelligent substation
CN108234150A (en) For the data acquisition and processing (DAP) method and system of data center's monitoring system
CN106657212A (en) Self-service terminal state monitoring method and system
CN106130778A (en) A kind of method processing clustering fault and a kind of management node
CN102314521A (en) Distributed parallel Skyline inquiring method based on cloud computing environment
CN103095598A (en) Monitoring data aggregate method under large-scale cluster environment
CN108199901A (en) Hardware reports method, system, equipment, hardware management server and storage medium for repairment
CN103944784A (en) Large-scale-cloud-data-center-oriented server cooperative monitoring method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20191022

WD01 Invention patent application deemed withdrawn after publication