CN101562827A - Fault information acquisition method and system - Google Patents

Publication number
CN101562827A
Authority
CN
China
Prior art keywords
failure
event
detection data
fault
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA2009100854951A
Other languages
Chinese (zh)
Other versions
CN101562827B (en)
Inventor
林青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to CN2009100854951A
Publication of CN101562827A
Application granted
Publication of CN101562827B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a fault information acquisition method, which comprises the following steps: acquiring detection data of fault events and analyzing the detection data of the fault events; and uniformly storing and modifying the fault information of each fault event according to an analysis result. The invention also correspondingly discloses a fault information acquisition system, which comprises a detection data acquisition unit, a detection data analysis unit and a fault information storage unit. The method and the system adopted uniformly acquire the detection data of various fault events with different interfaces, analyze the acquired detection data according to related state of history data and/or the fault events, and uniformly store and maintain the fault information obtained by analysis; and when an alarm unit and a diagnosis unit need acquiring the fault information, the alarm unit and the diagnosis unit can directly read the stored fault information and do not need analyzing the detection data so as to acquire the fault information; therefore, the method and the system can save system resources, improve system stability and simplify operation.

Description

A fault information acquisition method and system
Technical field
The present invention relates to information acquisition technology for base transceiver stations (Base Transceiver Station, BTS), and in particular to a fault information acquisition method and system.
Background technology
In a BTS, alarms and diagnosis are generally handled separately by an alarm unit and a diagnosis unit. The alarm unit generally stores two tables: the first records the fault query code and alarm code of each fault event, and the second records the fault query code, alarm cause code, alarm level, and additional information of each fault event. After obtaining detection data about a fault event, the alarm unit analyzes that data. The alarm unit periodically scans the first table and decides from the scan result whether to report an alarm message or a recovery message. When an alarm message must be reported, it looks up the second table by the fault query code of the fault event, generates the alarm message from the alarm cause code, alarm level, and additional information found there, and reports it to the operation and maintenance center (Operation and Maintenance Center, OMC) network manager. On receiving an alarm message or a recovery message, the OMC network manager checks whether that message has already been reported and, if so, does not process the new report.
After the diagnosis unit receives a diagnosis request from the OMC network manager for a given fault event, it actively obtains the detection data of that fault event from the detection unit, analyzes the data to determine the error code value of the fault event, and reports it to the OMC network manager, which obtains the diagnosis result by parsing the error code.
As can be seen, the alarm unit and the diagnosis unit in an existing BTS each detect and analyze fault event data separately in order to implement the alarm and diagnosis functions. For fault events whose detection data changes frequently, the detection data obtained by the two units can easily be inconsistent, so that the alarm message and the diagnosis result diverge, which hampers analysis of the fault event. Moreover, every time the alarm unit scans, it sends an alarm message or a recovery message for the fault event; when the state of the fault event stays unchanged for a long time, this repeated reporting wastes system resources. In addition, to shield detection of certain fault events, the setting must be made in the alarm unit and in the diagnosis unit separately, which is cumbersome and makes it hard to keep the settings consistent.
Some existing patents improve the alarm method of the alarm unit, for example Chinese patent application No. 200410021983.3, "A data acquisition and storage method", and Korean patent No. KR20040073220, "A method for collecting alarm in a base station system".
Chinese patent application No. 200410021983.3 judges, while collecting detection data, whether the data requires an alarm, and discards part of the collected detection data according to information such as historical data and alarm thresholds. Although this reduces the workload of the detection data analysis stage, an alarm message still has to be reported every time the collected detection data indicates an abnormal state. The alarm management unit in the following stage must compare each reported alarm message with the existing alarm messages to see whether it has already been reported, and discards it if so. Since the collected data does not change much most of the time, this patent application also suffers from repeated reporting and wasted system resources.
Korean patent No. KR20040073220 obtains the status data of a fault event by sending a query message; if the state of the fault event is correct, the data is reported, and if the state is erroneous, an alarm message is reported. Because this patent does not save the fault state data after obtaining it and reports an alarm or data every time, it has the same repeated reporting problem. In addition, when the state of a fault event changes from erroneous to correct, this patent has no corresponding flow for sending a recovery message, which hinders analysis of the fault event.
Summary of the invention
In view of this, the main purpose of the present invention is to provide a fault information acquisition method and system that can save system resources, improve system stability, and simplify operation.
To achieve the above purpose, the technical solution of the present invention is realized as follows:
A fault information acquisition method comprises:
Obtaining the detection data of a fault event and analyzing the detection data of the fault event;
Uniformly storing and modifying the fault information of each fault event according to the analysis result.
The detection data of the fault event is obtained by function call and/or by message passing.
The fault information comprises: fault event type code, fault event number, fault event result data, error code, fault event result data type, and shielding mark.
The fault event type code, fault event number, cache count threshold of the fault event, and error code value range of the fault event are set;
After the detection data of the fault event is obtained, the method further comprises: according to the fault event type code and fault event number of the fault event, or according to the fault event number alone, judging whether the number of cached groups of detection data for the fault event has reached the cache count threshold of the fault event; if so, replacing the earliest cached group of detection data for the fault event with the obtained detection data; otherwise, caching the received detection data of the fault event directly;
Analyzing the detection data of the fault event is performing historical data analysis on the detection data, specifically: judging whether the cached detection data of the fault event is consistent; if so, modifying the fault information of the fault event according to the cached detection data and the set error code value range of the fault event; otherwise, not modifying the fault information of the fault event.
Replacing the earliest cached group of detection data with the obtained detection data is: overwriting the earliest cached group of detection data for the fault event with the obtained detection data, or is realized by a shift operation.
The fault event type code, fault event number, shielding mark, related fault events, correlated shielding relationship between the fault event and its related fault events, and error code value range of the fault event are set;
Before the detection data of the fault event is obtained, the method further comprises:
According to the fault event type code and fault event number of the fault event, or according to the fault event number alone, querying the related fault events of the fault event and the correlated shielding relationship between the fault event and its related fault events;
Judging from the correlated shielding relationship whether the fault event is a master fault event; if the fault event is a master fault event, then after the detection data of the master fault event is obtained and analyzed, modifying the shielding mark of its corresponding controlled fault events according to the state of the master fault event;
If the fault event is a controlled fault event, judging whether the shielding mark of the controlled fault event is 0; if so, obtaining and analyzing the detection data of the controlled fault event; otherwise, according to the set error code value range of the fault event, setting the error code of the controlled fault event to a value in the unknown interval and shielding detection.
The fault event type code, fault event number, related fault events, correlated state relationship between the fault event and its related fault events, and error code value range of the fault event are set;
Analyzing the detection data of the fault event is performing correlation analysis on the detection data, specifically: after the detection data of the fault event is obtained, querying, according to the fault event type code and fault event number of the fault event or according to the fault event number alone, the related fault events of the fault event and the correlated state relationship between the fault event and its related fault events; then modifying the fault information of the fault event or of its related fault events according to the state of the related fault events, the correlated state relationship between the fault event and its related fault events, and the set error code value range of the fault event.
The method further comprises:
The alarm unit periodically queries the stored fault information of a fault event and judges whether the shielding mark in the fault information is 0; if so, no operation is performed and the fault event is in the shielded state; if not, the error code is converted into an alarm state or a recovery state;
The state obtained from the error code is compared with the state last reported for the fault event; if they are consistent, no operation is performed; otherwise, an alarm message or a recovery message is sent to the operation and maintenance center (OMC) network manager according to the fault information of the fault event, and the locally stored last reported state of the fault event is updated.
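The change-only reporting performed by the alarm unit can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the patent's implementation: the function names, the dictionary layout of the fault information, the error code boundaries in `to_state`, and the caller-supplied `is_shielded` predicate (which decides how the shielding mark is interpreted) are all hypothetical.

```python
ALARM, RECOVERY = "alarm", "recovery"


def to_state(error_code):
    # Hypothetical mapping: codes 1-10 mean normal, every other code is a fault.
    return RECOVERY if 1 <= error_code <= 10 else ALARM


def scan_once(store, last_reported, send, is_shielded):
    """Run one periodic scan and report only when the state has changed."""
    for key, info in store.items():
        if is_shielded(info):
            continue  # shielded fault event: no operation
        state = to_state(info["error_code"])
        if last_reported.get(key) == state:
            continue  # same as the last report: no operation
        send(key, state, info)      # alarm message or recovery message
        last_reported[key] = state  # remember what was last reported
```

Calling `scan_once` repeatedly while the error code stays unchanged sends nothing after the first report, which is the resource saving the text describes.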
The method further comprises:
According to a diagnosis request from the OMC network manager, the diagnosis unit reads the fault information of the fault event and sends the fault event type code, fault event number, error code, and fault event result data directly to the OMC network manager in the form of a message;
The OMC network manager provides the corresponding fault description according to the fault event type code and fault event number, and parses the error code and the fault event result data to obtain the diagnosis result.
A fault information acquisition system comprises: a detection data acquisition unit, a detection data analysis unit, and a fault information storage unit; wherein
The detection data acquisition unit is used to obtain the detection data of fault events;
The detection data analysis unit is used to analyze the detection data obtained by the detection data acquisition unit and to modify, according to the analysis result, the fault information stored in the fault information storage unit;
The fault information storage unit is used to store the fault information of each fault event.
The fault information acquisition system further comprises a parameter setting unit and a historical data cache unit;
The parameter setting unit is used to set the fault event type code, fault event number, cache count threshold of the fault event, and error code value range of the fault event;
The detection data analysis unit is further used to cache the obtained detection data in the historical data cache unit according to the parameters set by the parameter setting unit; and, after caching the obtained detection data in the historical data cache unit, to judge whether the detection data of the fault event cached in the historical data cache unit is consistent; if consistent, to modify the fault information of the fault event stored in the fault information storage unit according to the error code value range of the fault event set by the parameter setting unit and the obtained detection data; otherwise, not to modify the fault information of the fault event stored in the fault information storage unit;
The historical data cache unit is used, after receiving detection data from the detection data analysis unit, to judge according to the parameters set by the parameter setting unit whether the number of groups of detection data it has cached for the fault event has reached the cache count threshold of the fault event; if so, to replace the earliest cached group of detection data for the fault event with the obtained detection data; otherwise, to cache the received detection data of the fault event directly.
The fault information acquisition system further comprises a parameter setting unit, which is used to set the fault event type code, fault event number, shielding mark, related fault events of the fault event, correlated shielding relationship with the related fault events, and error code value range of the fault event;
Before obtaining the detection data of a fault event, the detection data acquisition unit is further used to judge, according to the correlated shielding relationship set by the parameter setting unit, whether the fault event is a master fault event; if the fault event is a master fault event, then after the detection data of the master fault event is obtained and analyzed, to modify the shielding mark of its corresponding controlled fault events stored in the fault information storage unit according to the state of the master fault event stored in the fault information storage unit and the shielding relationship set by the parameter setting unit;
If the fault event is a controlled fault event, the detection data acquisition unit judges whether the shielding mark of the fault event stored in the fault information storage unit is 0; if so, the detection data acquisition unit obtains the detection data of the fault event; otherwise, according to the parameters set by the parameter setting unit, the detection data acquisition unit sets the error code of the controlled fault event in the fault information storage unit to a value in the unknown interval and shields detection.
The fault information acquisition system further comprises a parameter setting unit, which is used to set the fault event type code, fault event number, related fault events of the fault event, correlated state relationship with the related fault events, and error code value range of the fault event;
The detection data analysis unit is further used to modify, according to the parameters set by the parameter setting unit, the fault information of the fault event or of its related fault events stored in the fault information storage unit.
The system further comprises an alarm unit, which is used to periodically query the fault information of the fault events stored in the fault information storage unit and judge whether the shielding mark in the fault information is 0; if so, no operation is performed and the fault event is in the shielded state; if not, the error code is converted into an alarm state or a recovery state, and the converted state is compared with the last reported state of the fault event stored in the alarm unit; if they are consistent, no operation is performed; otherwise, an alarm message or a recovery message is sent to the OMC network manager according to the fault information of the fault event, and the locally stored last reported state of the fault event is updated.
The system further comprises a diagnosis unit, which is used to respond to a diagnosis request from the OMC network manager by reading the fault information of the fault event stored in the fault information storage unit and sending the fault event type code, fault event number, error code, and fault event result data directly to the OMC network manager in the form of a message.
The fault information acquisition method and system of the present invention uniformly obtain the detection data of various fault events with different interfaces, analyze the obtained detection data according to historical data and/or the correlated state of fault events, and then uniformly store and maintain the fault information obtained from the analysis. When the alarm unit and the diagnosis unit need fault information, they simply read the stored fault information and do not need to analyze detection data to obtain it. When scanning the fault information, the alarm unit can compare the fault state previously reported for a fault event with its current fault state; if the fault state has not changed, no alarm message or recovery message is reported. The present invention can therefore avoid repeated reporting and save system resources;
Moreover, in the present invention the alarm unit and the diagnosis unit do not need to collect and analyze detection data separately but directly read fault information that has already been analyzed, which largely avoids discrepancies between alarm or recovery messages and diagnosis results and helps improve system consistency;
In addition, when a fault event needs to be shielded or otherwise operated on, the present invention does not require parameters to be modified separately in the alarm unit and the diagnosis unit; the parameter only needs to be modified once, so the present invention simplifies operation;
The present invention also analyzes fault information according to historical data and/or the correlated state of fault events, which safeguards the stability and accuracy of the fault information and thus further improves system stability.
Description of drawings
Fig. 1 is a flow chart of the fault information acquisition method of the present invention;
Fig. 2 is a flow chart of the method by which the present invention modifies the fault information of a fault event through historical data analysis;
Fig. 3 is a structural diagram of the fault information acquisition system of the present invention;
Fig. 4 is a flow chart of the fault information acquisition method for the CPU power supply status in Embodiment 1 of the present invention;
Fig. 5 is a flow chart of modifying fault information according to the correlation of fault events after optical port related data is received in Embodiment 2 of the present invention;
Fig. 6 is a flow chart of the correlation shielding in Embodiment 3 of the present invention.
Embodiment
The basic idea of the present invention is to uniformly obtain the detection data of various fault events with different interfaces, analyze the obtained detection data according to historical data and/or the correlated state of fault events, and then uniformly store and maintain the fault information obtained from the analysis. When the alarm unit and the diagnosis unit need fault information, they simply read the stored fault information and do not need to analyze detection data to obtain it.
The present invention stores the fault information of each fault event in a unified, standardized data structure, is responsible for collecting the detection data of fault events, and modifies the stored fault information of each fault event according to historical data analysis and/or correlation analysis of the detection data, for the alarm unit or the diagnosis unit to query. Here, the fault information stored in the unified data structure generally comprises the following parameters: fault event type code, fault event number, fault event result data, error code, fault event result data type, and shielding mark.
The fault event type code identifies the type of the fault event, such as software fault, hardware fault, configuration error, preheating error, or run-time error.
The fault event number is used to identify the fault event.
Fault events can be numbered within each type, that is, fault events that share the same fault event type code are numbered together, or all types of fault events can be numbered in one unified sequence. When fault events are numbered within each type, the fault event type code and the fault event number together uniquely identify a fault event; when all types are numbered in one sequence, the fault event number alone uniquely identifies a fault event. The description here is based throughout on the case in which the fault event type code and the fault event number together uniquely identify a fault event.
The fault event result data gives the detailed information of the fault event and is generally the concrete detection data.
The error code identifies the current state of the fault event and indicates how the fault event result data is to be parsed.
The error code represents the state of the fault event. It can take many values, which are generally divided into four intervals: normal, abnormal, error, and unknown. The abnormal interval indicates that the fault state of the fault event was caused by manual operation, such as a configuration error; the error interval indicates that the fault state was caused by the hardware itself; the unknown interval indicates that detection of the fault event is shielded, or that the status data obtained cannot be judged to be a fault or not. An error code in the abnormal interval or the error interval indicates that an alarm is needed.
In practical applications, different fault events do not necessarily take the same concrete values within the normal, abnormal, error, and unknown intervals. For example, if the values in the normal interval are 1-10, the normal-interval value for fault event A may be 1 while that for fault event B may be 8. This is because the collected detection data may be the result of applying some computation to the real service data, such as scaling it up or down by some factor, and the error code distinguishes which computation was applied. For example, a normal-interval error code value of 1 indicates that the fault event result data must be multiplied by ten when parsed, and a normal-interval error code value of 2 indicates that it must be divided by 10. However, the value of each fault event within each interval is unique and is set in advance.
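The interval classification and the code-dependent parsing described above might look like the sketch below. Only the normal range 1-10 and the two parsing rules (multiply by ten for code 1, divide by ten for code 2) come from the text; the other interval boundaries and all names are assumptions made for illustration.

```python
# Interval boundaries other than the normal range 1-10 are hypothetical.
INTERVALS = {
    "normal": range(1, 11),
    "abnormal": range(11, 21),
    "error": range(21, 31),
    "unknown": range(31, 41),
}

# Parsing rule selected by the error code, following the example in the text.
PARSERS = {
    1: lambda raw: raw * 10,
    2: lambda raw: raw / 10,
}


def classify(error_code):
    """Return the name of the interval that contains the error code."""
    for name, codes in INTERVALS.items():
        if error_code in codes:
            return name
    raise ValueError(f"error code {error_code} is outside every interval")


def needs_alarm(error_code):
    # Codes in the abnormal or error interval indicate that an alarm is needed.
    return classify(error_code) in ("abnormal", "error")


def parse_result(error_code, raw):
    # Assumption: codes without a registered rule leave the data unchanged.
    return PARSERS.get(error_code, lambda value: value)(raw)
```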
The fault event result data type indicates whether the fault event result data is numeric or character data.
The shielding mark indicates whether the detection data of the fault event is to be obtained; its initial value is generally 0.
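As a rough illustration of the unified data structure, the six parameters listed above could be held in one record per fault event, keyed by type code and number. The class names, field names, and Python types below are assumptions; the patent does not prescribe a concrete layout.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union


@dataclass
class FaultInfo:
    type_code: int                     # fault event type code
    number: int                        # fault event number
    error_code: int                    # current state and parsing rule
    result_data: Union[int, float, str, None] = None  # concrete detection data
    result_data_type: str = "numeric"  # "numeric" or "character"
    shield_mark: int = 0               # initial value is generally 0


class FaultInfoStore:
    """One store that both the alarm unit and the diagnosis unit read."""

    def __init__(self) -> None:
        self._records: Dict[Tuple[int, int], FaultInfo] = {}

    def put(self, info: FaultInfo) -> None:
        # Type code and number together uniquely identify a fault event.
        self._records[(info.type_code, info.number)] = info

    def get(self, type_code: int, number: int) -> Optional[FaultInfo]:
        return self._records.get((type_code, number))
```

Keeping a single store is what lets a shielding change be made once rather than separately in the alarm unit and the diagnosis unit.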
The implementation of the technical solution is described in further detail below with reference to the accompanying drawings.
Fig. 1 is a flow chart of the fault information acquisition method of the present invention. As shown in Fig. 1, the method generally comprises the following steps:
Step 11: Set the fault information acquisition parameters.
For fault events that require historical data analysis, the fault information acquisition parameters generally comprise: fault event type code, fault event number, shielding mark, cache count threshold of the fault event, error code value range of the fault event, and so on.
For fault events that require correlation analysis, the fault information acquisition parameters generally comprise: fault event type code, fault event number, shielding mark, related fault events, correlation between the fault event and its related fault events, error code value range of the fault event, and so on.
Here, the initial value of the shielding mark is generally zero. The correlation between a fault event and its related fault events refers to how the fault event adjusts its own state according to the state of the related fault events, or how it adjusts the state of the related fault events according to its own state. It can be a correlated shielding relationship and/or a correlated state relationship: the correlated shielding relationship mainly concerns modification of the shielding mark and the error code, and the correlated state relationship mainly concerns modification of the error code.
Step 12: Obtain the detection data of the fault event.
Here, detection data can be obtained from external fault detection equipment by function call and/or by message passing. In the function call mode, the call is made periodically, and different periods can be set for different fault events; in the message passing mode, the system only needs to wait for messages from the external fault detection equipment in order to obtain the detection data.
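The two acquisition modes can be combined behind one collection point, as in the following sketch. The class `DetectionCollector`, its method names, and the use of an in-process queue for messages are assumptions for illustration; the real interfaces to the external fault detection equipment are not specified in the text.

```python
import queue


class DetectionCollector:
    def __init__(self):
        self._pollers = []          # entries: [key, period, next_due, read_fn]
        self.inbox = queue.Queue()  # (key, data) messages pushed by detectors

    def register_poller(self, key, period, read_fn):
        # Function call mode: each fault event can have its own period.
        self._pollers.append([key, period, 0, read_fn])

    def collect(self, now):
        """Return every (key, data) pair that is available at time `now`."""
        collected = []
        for poller in self._pollers:
            key, period, next_due, read_fn = poller
            if now >= next_due:
                collected.append((key, read_fn()))
                poller[2] = now + period
        # Message passing mode: drain whatever the detectors have sent.
        while True:
            try:
                collected.append(self.inbox.get_nowait())
            except queue.Empty:
                break
        return collected
```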
Generally, the function call mode is applied to fault events such as hardware faults, poor environmental conditions, preheating errors, or run-time errors, and the message passing mode is applied to fault events such as software faults and configuration errors.
When the correlation set in step 11 between a fault event and its related fault events is a correlated shielding relationship, fault events can be divided into master fault events and controlled fault events, according to whether the state of a fault event controls the shielding marks of other fault events or its own shielding mark is controlled by the states of other fault events. When several master fault events shield one controlled fault event, detection of the controlled fault event is shielded, that is, its detection data is not obtained, as long as any one master fault event is in the fault state (its error code takes a value in the error interval or the abnormal interval); this is realized by setting the shielding mark of the controlled fault event to a non-zero number. Only when all master fault events are in the normal state (their error codes take values in the normal interval) is the controlled fault event detected; this is realized by setting the shielding mark of the controlled fault event to 0.
For a master fault event, after it has been detected and analyzed, the shielding marks of its corresponding controlled fault events are modified according to the state of the master fault event.
For a controlled fault event, if its shielding mark is 0, its detection data is obtained and analyzed; if its shielding mark is not 0, the error code of the controlled fault event is set to a value in the unknown interval and detection is shielded.
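The master and controlled handling described above can be sketched as two small functions. The state labels, the dictionary layout, the helper names, and the concrete value `UNKNOWN_CODE` are assumptions; leaving the shielding mark unchanged when the master states are neither all normal nor any fault is also an assumption, since the text does not cover that case.

```python
NORMAL, FAULT, UNKNOWN = "normal", "fault", "unknown"
UNKNOWN_CODE = 31  # hypothetical value inside the unknown interval


def update_shield(controlled, master_states):
    """Set the controlled event's shielding mark from its masters' states."""
    states = list(master_states)
    if any(state == FAULT for state in states):
        controlled["shield_mark"] = 1  # any master in the fault state: shield
    elif all(state == NORMAL for state in states):
        controlled["shield_mark"] = 0  # all masters normal: detect again


def handle_controlled(controlled, acquire, analyse):
    """Acquire and analyse only when the controlled event is not shielded."""
    if controlled["shield_mark"] == 0:
        analyse(controlled, acquire())
    else:
        controlled["error_code"] = UNKNOWN_CODE  # detection is shielded
```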
Step 13: Judge whether the fault event requires historical data analysis; if so, execute step 14; otherwise, execute step 15.
For fault events with very high stability requirements, the error code in the fault information is generally updated to a value in the abnormal, error, or normal interval only when the results of several consecutive acquisitions are all the fault state or all the normal state. Which fault events keep historical data, and how many acquisitions are cached, are agreed in advance (corresponding to the cache count threshold of the fault event set in step 11) and, once determined, cannot be changed while the system is running.
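The rule that the error code changes only after several consecutive acquisitions agree can be sketched as below. The function name, the state labels, and the `code_for_state` mapping are hypothetical, and requiring the cache to be full before any update is an assumption rather than something the text states.

```python
def updated_error_code(samples, threshold, current_code, code_for_state):
    """Return the error code after historical data analysis.

    samples: cached states, oldest first (for example "normal" or "fault").
    threshold: the cache count threshold of the fault event.
    code_for_state: configured error code for each state (hypothetical).
    """
    if len(samples) < threshold:
        return current_code  # assumption: wait until the cache is full
    recent = samples[-threshold:]
    if any(state != recent[0] for state in recent):
        return current_code  # inconsistent history: leave fault info as is
    return code_for_state[recent[0]]
```

With a threshold of 3, a single spurious reading never changes the stored error code, which is the stability benefit the text attributes to historical data analysis.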
Here, judge that whether described event of failure needs to carry out historical data analysis is exactly according to event of failure type and event of failure numbering, searches it and whether is provided with the buffer memory frequency threshold value.
Step 14: Analyze the newly received detection data together with the historical data of the fault event, modify the fault information of the fault event, and then go to step 16.
Fig. 2 is a flow chart of modifying the fault information of a fault event through historical data analysis according to the present invention. As shown in Fig. 2, this generally comprises the following steps:
Step 21: Determine whether the number of cached acquisitions for the fault event has reached the cache count threshold of the fault event. If so, go to step 22; otherwise, go to step 26.
Step 22: Use the received detection data to replace the earliest cached group of detection data for the fault event.
Here, detection data is generally cached in the historical data buffer in first-in-first-out order; the historical data buffer can be cleared at system start-up.
First-in-first-out caching of detection data can be implemented in several ways, for example by a shift operation, or by overwriting the earliest cached group of detection data for the fault event with the received detection data.
For example, suppose the cache count threshold of fault event A is set to 3 in step 11. When detection data of fault event A is received for the first to third times, each group is saved, and the three save positions are recorded as the first save position A0, the second save position A1, and the third save position A2 of fault event A. When detection data of fault event A is received for the fourth time, it overwrites the earliest cached detection data, held in the first save position A0; when received for the sixth time, it overwrites the earliest cached detection data, held in the third save position A2; when received for the seventh time, it overwrites the earliest cached detection data, held in the first save position A0; and so on in a cycle. This storage scheme can be implemented with the formula I = (I + 1) % N, where I is the subscript of the most recent save position of the fault event and N is the cache count threshold of the fault event; here, the (I+1)-th save position has subscript I.
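The overwrite scheme in this example can be sketched as a small circular buffer. This is only an illustration under stated assumptions, not the implementation described by the patent: the class name `HistoryCache` and its attributes are hypothetical, and the index update is taken to be I = (I + 1) mod N.

```python
class HistoryCache:
    """Fixed-size cache that overwrites the oldest detection data first.

    Hypothetical helper for illustration; names are not from the patent.
    """

    def __init__(self, threshold):
        self.threshold = threshold        # N: cache count threshold
        self.slots = [None] * threshold   # save positions A0 .. A(N-1)
        self.count = 0                    # groups cached so far (at most N)
        self.latest = -1                  # I: subscript of the newest slot

    def put(self, data):
        """Store one group of detection data and return the slot used."""
        self.latest = (self.latest + 1) % self.threshold
        self.slots[self.latest] = data
        if self.count < self.threshold:
            self.count += 1
        return self.latest


cache = HistoryCache(threshold=3)
# Slots used for the 1st to 7th groups of detection data of fault event A.
used = [cache.put(f"sample{k}") for k in range(1, 8)]
print(used)
```

With a threshold of 3, the fourth and seventh groups land in slot 0 (A0) and the sixth in slot 2 (A2), matching the sequence described above.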
Step 23: Determine whether the cached detection data of the fault event is consistent. If so, go to step 24; otherwise, the step 14 flow ends.
Here, checking whether the cached detection data is consistent means checking whether the detection data is stable.
Step 24: Determine from the cached detection data whether the fault information of the fault event needs to be modified. If so, go to step 25; otherwise, the step 14 flow ends.
Here, if the detection data corresponds to the value of the error code in the fault information, no modification is made; if it does not, the error code is modified according to the detection data, and the fault event result data in the fault information is modified according to the detection data cached in the historical data buffer.
Step 25: Modify the fault information of the fault event according to the cached detection data; the step 14 flow ends.
Step 26: Cache the received detection data of the fault event directly; the step 14 flow ends.
Here, when the received detection data of the fault event is saved, its save position must also be recorded: if n groups of detection data are already cached for the fault event, the received detection data is saved in the (n+1)-th save position of the fault event.
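Steps 21 to 26 can be sketched roughly as follows. This is a minimal illustration under assumptions: the fault information is a plain dictionary with hypothetical keys `error_code` and `result_data`, `classify` is a hypothetical function that maps detection data to an error code, and Python's `collections.deque` with `maxlen` stands in for the first-in-first-out buffer instead of explicit save positions.

```python
from collections import deque


def analyze_with_history(fault, history, threshold, sample, classify):
    """Cache the sample, then update the fault only if the cache is stable.

    fault     -- dict with 'error_code' and 'result_data' (assumed layout)
    history   -- deque(maxlen=threshold) of cached detection data
    classify  -- hypothetical function: detection data -> error code
    Returns True if the fault information was modified.
    """
    if len(history) < threshold:
        # Step 26: threshold not reached yet, just cache the sample.
        history.append(sample)
        return False
    # Step 22: a full deque drops its oldest entry when a new one is appended.
    history.append(sample)
    # Step 23: all cached groups must agree, otherwise the data is unstable.
    if any(item != history[0] for item in history):
        return False
    # Step 24: nothing to do if the error code already matches.
    new_code = classify(sample)
    if fault["error_code"] == new_code:
        return False
    # Step 25: update the error code and the result data (assumed to be
    # a copy of the cached groups).
    fault["error_code"] = new_code
    fault["result_data"] = list(history)
    return True


def classify_power(powered):
    # Hypothetical error codes: 0 for normal, 100 for abnormal.
    return 0 if powered else 100


fault = {"error_code": 0, "result_data": None}
history = deque(maxlen=3)
results = [analyze_with_history(fault, history, 3, s, classify_power)
           for s in [1, 1, 1, 0, 0, 0]]
print(results, fault)
```

In this run the fault information changes only on the last sample, once all three cached groups agree that the power is off.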
Step 15: Modify the fault information of the fault event according to the detection data of the fault event.
Here, modifying the fault information generally includes modifying the error code and the fault event result data; when the fault information of a fault event is stored for the first time, the fault event result data type must also be set.
Step 16: Determine whether correlation analysis is required for the fault event. If so, go to step 17; otherwise, the flow ends.
The error code of a fault event does not depend only on the detection data of that fault event; other factors must be considered as well to arrive at its value. For two fault events that have a cause-and-effect relationship, or where one necessarily leads to the other, the latter is bound to be abnormal when the former has a problem, so the fault information of the latter can be modified according to the fault information of the former. In addition, some fault events need not be detected if the related resource is not configured; for example, if a board is not configured, there is no need to detect the power-on state of the board's CPU. Which fault events require correlation processing, and how they are correlated, is agreed in advance and, once determined, cannot be changed while the system is running. That is, the related state described here can be a configuration status or the state of a related fault event.
A fault event may have one related fault event or several. Depending on the correlation, the fault information of a fault event can be passively controlled by its related fault events, or the fault event can actively modify the fault information (generally the error code) of its related fault events according to its own fault information.
Determining whether correlation analysis is required means performing a parameter query by the fault event type code and fault event number of the fault event, to see whether related fault events and a dependency with them have been set, and whether that dependency is a related-state relationship.
Step 17: Modify the fault information of the fault event according to its related state; the flow ends.
Here, the fault information of the fault event or of its related fault events is modified according to the related fault events that have been set and the related-state relationship with them. The fault information modified is generally the error code.
The alarm unit periodically scans the stored fault information of fault events in order to report alarm messages or recovery messages; the diagnosis unit scans the fault information of fault events in response to a diagnosis request from the OMC network manager in order to report diagnosis messages.
After reading the fault information of a fault event from the common data layer, the alarm unit performs the following steps:
Check the mask flag in the fault information. If it is not 0, the fault event is masked and no operation is performed; if it is 0, the error code is converted into an alarm state or a recovery state.
Compare the state obtained from the error code with the stored state last reported for the fault event. If they are the same, no operation is performed. Otherwise, the fault event type code and fault event number in the fault information are converted into an alarm code and an alarm cause code, the error code is converted into an alarm state or a recovery state, and the fault event result data is converted into alarm additional information; an alarm message or recovery message carrying the alarm code, alarm cause code, alarm or recovery state, and alarm additional information is then sent to the OMC network manager, and the stored last-reported state of the fault event is updated.
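One scan pass of the alarm unit might look like the sketch below. It rests on several assumptions: a non-zero mask flag means the fault event is masked (as in the masking description earlier), the fault information is a dictionary with hypothetical keys `mask` and `error_code`, and `to_state` and `send` are hypothetical stand-ins for the error-code conversion and the message sent to the OMC network manager.

```python
def scan_for_alarms(faults, last_reported, to_state, send):
    """One periodic pass of the alarm unit over the stored fault information.

    faults        -- dict: fault key -> dict with 'mask' and 'error_code'
    last_reported -- dict: fault key -> state last sent to the OMC
    to_state      -- hypothetical function: error code -> 'alarm' or 'recover'
    send          -- hypothetical callback that delivers a message to the OMC
    """
    for key, info in faults.items():
        if info["mask"] != 0:
            continue                  # masked: report nothing
        state = to_state(info["error_code"])
        if last_reported.get(key) == state:
            continue                  # unchanged since the last report
        send(key, state, info)
        last_reported[key] = state    # remember what was reported


sent = []
faults = {
    ("cpu_power", 1): {"mask": 0, "error_code": 100},
    ("optical_port", 2): {"mask": 3, "error_code": 100},
}
last_reported = {}


def to_state(code):
    # Hypothetical mapping: error code 0 is normal, anything else alarms.
    return "alarm" if code != 0 else "recover"


def record(key, state, info):
    sent.append((key, state))


scan_for_alarms(faults, last_reported, to_state, record)
scan_for_alarms(faults, last_reported, to_state, record)  # no state change
print(sent)
```

The second pass sends nothing because the state matches the one last reported, and the masked optical port event is never reported.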
The diagnosis unit reads the fault information from the common data layer. When producing diagnostic test results, it sends the fault event type code, fault event number, error code, and fault event result data directly to the OMC network manager as a message; the OMC network manager provides the corresponding fault description according to the fault event type code and fault event number, and parses the error code and fault event result data to obtain the diagnosis result. Here, the diagnosis unit can also use the multiple groups of detection data in the historical data buffer in place of the fault event result data.
It should be noted that, when the fault information is analyzed, historical data analysis and correlation analysis can both be performed, or only one of them can be performed.
Fig. 3 is a structural diagram of the fault information acquisition system of the present invention. As shown in Fig. 3, the system comprises a detection data acquisition unit 31, a detection data analysis unit 32, and a fault information storage unit 33, wherein:
The detection data acquisition unit 31 is used to acquire the detection data of fault events.
Here, detection data can be acquired from external fault detection equipment by function call and/or by message passing. Function calls are made periodically, and different periods can be set for different fault events; with message passing, the unit only needs to wait for messages from the external fault detection equipment to obtain the detection data.
The detection data analysis unit 32 is used to analyze the detection data acquired by the detection data acquisition unit 31 and to modify the fault information stored in the fault information storage unit 33 according to the analysis result.
The fault information storage unit 33 is used to store the fault information of each fault event.
In the fault information storage unit 33, each fault event is stored in a unified, standard data structure. Fault information stored in this structure generally includes the following parameters: fault event type code, fault event number, fault event result data, error code, fault event result data type, and mask flag.
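The unified record listed above could be represented as in the following sketch. The field names, the types, the default values, and the dictionary keyed by type code and number are all assumptions made for illustration; the patent only names the parameters.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FaultInfo:
    """One record in the fault information store (hypothetical layout)."""
    type_code: int               # fault event type code
    number: int                  # fault event number
    result_data: Any = None      # fault event result data
    error_code: int = 0          # value from one of the error code intervals
    result_data_type: str = ""   # type of result_data
    mask: int = 0                # mask flag; 0 means detection is not masked


# Assumed storage: one record per (type code, number) pair.
store = {}


def save(info):
    store[(info.type_code, info.number)] = info


save(FaultInfo(type_code=1, number=7, result_data=1, result_data_type="int"))
print(store[(1, 7)])
```

Because every fault event uses the same record, the alarm unit and the diagnosis unit can read the store without knowing how each kind of detection data was produced.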
The fault information acquisition system further comprises a parameter setting unit 34 and a historical data cache unit 35.
The parameter setting unit 34 is used to set the fault event type code, the fault event number, the cache count threshold of the fault event, and the error code value range of the fault event. In this case:
The detection data analysis unit 32 is further used to cache the acquired detection data in the historical data cache unit 35 according to the parameters set by the parameter setting unit 34 and, after caching the acquired detection data in the historical data cache unit 35, to determine whether the detection data of the fault event cached in the historical data cache unit 35 is consistent. If it is consistent, the unit modifies the fault information of the fault event stored in the fault information storage unit 33 according to the error code value range set by the parameter setting unit 34 and the acquired detection data; otherwise, it does not modify the fault information of the fault event stored in the fault information storage unit 33.
The historical data cache unit 35 is used, after receiving detection data from the detection data analysis unit 32, to determine, according to the parameters set by the parameter setting unit 34, whether the number of groups of detection data it has cached for the fault event has reached the cache count threshold of the fault event. If so, it replaces the earliest cached group of detection data for the fault event with the acquired detection data; otherwise, it caches the received detection data of the fault event directly. For the specific implementation, see the description of step 22.
The parameter setting unit 34 can also be used to set the fault event type code, the fault event number, the mask flag, the related fault events of a fault event, the masking relationship with the related fault events, and the error code value range of the fault event. In this case:
Before acquiring the detection data of a fault event, the detection data acquisition unit 31 is further used to determine, according to the masking relationship set by the parameter setting unit 34, whether the fault event is a master fault event. If the fault event is a master fault event, then after its detection data has been acquired and analyzed, the mask flag of its corresponding controlled fault event stored in the fault information storage unit 33 is modified according to the state of the master fault event stored in the fault information storage unit 33 and the parameters set by the parameter setting unit 34.
If the fault event is a controlled fault event, the detection data acquisition unit 31 determines whether the mask flag of the fault event stored in the fault information storage unit 33 is 0. If so, the detection data acquisition unit 31 acquires the detection data of the fault event; otherwise, it sets the error code of the controlled fault event in the fault information storage unit 33 to a value in the unknown interval according to the parameters set by the parameter setting unit 34, and detection is masked.
The parameter setting unit 34 can also be used to set the fault event type code, the fault event number, the related fault events of a fault event, the related-state relationship with the related fault events, and the error code value range of the fault event. In this case:
The detection data analysis unit 32 is further used to modify the fault information of the fault event, or of its related fault events, stored in the fault information storage unit 33 according to the parameters set by the parameter setting unit 34.
In addition, the fault information acquisition system of the present invention further comprises an alarm unit 36, which periodically queries the fault information of the fault events stored in the fault information storage unit 33 and checks the mask flag in the fault information. If the mask flag is not 0, the fault event is masked and no operation is performed; if it is 0, the error code is converted into an alarm state or a recovery state, and the resulting state is compared with the state last reported for the fault event, which the alarm unit stores. If they are the same, no operation is performed; otherwise, an alarm message or recovery message is sent to the OMC network manager according to the fault information of the fault event, and the last-reported state stored by the alarm unit is updated.
The fault information acquisition system of the present invention further comprises a diagnosis unit 37, which responds to a diagnosis request from the OMC network manager by reading the fault information of the fault event stored in the fault information storage unit 33 and sending the fault event type code, fault event number, error code, and fault event result data directly to the OMC network manager as a message.
Here, the data stored in the historical data cache unit 35 can also be used by the diagnosis unit.
Embodiment 1
In this embodiment, the fault event is the CPU power state. The fault information of this fault event is obtained through historical data analysis, the preset cache count threshold is 3, and the fault event has no correlation.
Fig. 4 is a flow chart of the fault information acquisition method for the CPU power state in Embodiment 1 of the present invention. As shown in Fig. 4, acquiring the fault information of the CPU power state in Embodiment 1 comprises the following steps:
Step 41: Receive the detection data of the CPU power state.
Here, the detection data of the CPU power state is a numerical value indicating whether the CPU is powered on.
Step 42: Determine whether three groups of detection data for the CPU power state have been stored. If so, go to step 43; otherwise, go to step 46.
Step 43: Overwrite the earliest of the saved detection data with the received detection data of the CPU power state.
Step 44: Determine whether the three saved groups of detection data for the CPU power state are consistent. If so, go to step 45; otherwise, the flow ends.
Step 45: Modify the fault information of the CPU power state according to the result; the flow ends.
Here, the error code of the current fault event is classified as normal, abnormal, error, or unknown according to the detection data, and a value is taken from the corresponding interval as the error code of the current fault event.
Step 46: Cache the received detection data of the CPU power state directly; the flow ends.
Here, after the received detection data of the CPU power state is cached, the save position it occupies among those of the CPU power state must also be recorded.
Embodiment 2
In this embodiment, the received detection data is optical port data. The related fault events are optical port presence, optical port light detection, and optical port reverse frame loss-of-lock detection. Their dependency is as follows: if the optical port is not present, there is no need to detect whether the optical port has light or whether reverse frame loss of lock has occurred; if the optical port has no light, there is no need to detect whether reverse frame loss of lock has occurred.
Fig. 5 is a flow chart of modifying the fault information according to the correlation of fault events after optical port data is received in Embodiment 2 of the present invention. As shown in Fig. 5, this comprises the following steps:
Step 501: Receive the detection data for the optical port.
Here, the optical port data indicates whether the optical port is present, whether it has light, whether reverse frame loss of lock has occurred, and so on.
Step 502: Determine from the received detection data whether the optical port is present. If so, go to step 503; otherwise, go to step 508.
Step 503: Set the error code of optical port presence to a value in the normal interval.
Step 504: Determine from the received detection data whether the optical port has light. If so, go to step 505; otherwise, go to step 509.
Step 505: Set the error code of optical port light detection to a value in the normal interval.
Step 506: Determine from the received detection data whether the optical port reverse frame is free of loss of lock. If so, go to step 507; otherwise, go to step 510.
Step 507: Set the error code of optical port reverse frame loss-of-lock detection to a value in the normal interval; the flow ends.
Step 508: Set the error code of optical port presence to a value in the abnormal interval, and set the error codes of the related fault events, optical port light detection and optical port reverse frame loss-of-lock detection, to values in the unknown interval; the flow ends.
An absent optical port requires an alarm, so the error code of the optical port presence fault event is set to a number in the abnormal interval.
When the optical port is absent, there is no need to detect whether it has light, so the error code of the optical port light detection fault event is set to a number in the unknown interval.
When the optical port is absent, there is no need to detect reverse frame loss of lock, so the error code of the optical port reverse frame loss-of-lock detection fault event is set to a number in the unknown interval.
Step 509: Set the error code of optical port light detection to a value in the abnormal interval, and set the error code of the related fault event, optical port reverse frame loss-of-lock detection, to a value in the unknown interval; the flow ends.
An optical port without light requires an alarm, so the error code of the optical port light detection fault event is set to a number in the abnormal interval.
When the optical port has no light, there is no need to detect reverse frame loss of lock, so the error code of the optical port reverse frame loss-of-lock detection fault event is set to a number in the unknown interval.
Step 510: Set the error code of optical port reverse frame loss-of-lock detection to a value in the abnormal interval; the flow ends.
Optical port reverse frame loss of lock requires an alarm, so the error code of the optical port reverse frame loss-of-lock detection fault event is set to a number in the abnormal interval.
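The cascade in steps 502 to 510 can be condensed into the sketch below. The constants `NORMAL`, `ABNORMAL`, and `UNKNOWN` are hypothetical representative values standing in for the normal, abnormal, and unknown intervals, and the function and key names are likewise illustrative assumptions.

```python
# Hypothetical representative values, one per error code interval.
NORMAL, ABNORMAL, UNKNOWN = 0, 100, 200


def analyze_optical_port(present, has_light, frame_locked):
    """Return error codes for the three related fault events."""
    codes = {"presence": UNKNOWN, "light": UNKNOWN, "frame_lock": UNKNOWN}
    if not present:
        # Step 508: port absent, dependent events stay unknown.
        codes["presence"] = ABNORMAL
        return codes
    codes["presence"] = NORMAL            # step 503
    if not has_light:
        # Step 509: no light, frame lock stays unknown.
        codes["light"] = ABNORMAL
        return codes
    codes["light"] = NORMAL               # step 505
    # Steps 507 and 510: frame lock is judged only when light is present.
    codes["frame_lock"] = NORMAL if frame_locked else ABNORMAL
    return codes


print(analyze_optical_port(present=True, has_light=False, frame_locked=True))
```

Only the first failing condition raises an abnormal code; everything that depends on it is marked unknown, so a single root cause does not produce a burst of secondary alarms.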
Embodiment 3
This embodiment concerns many-to-one correlation masking, in which several master fault events mask one controlled fault event: as long as any one master fault event is in the fault state, detection of the controlled fault event is masked. Correlation masking is achieved by manipulating the mask flag.
Fig. 6 is a flow chart of correlation masking in Embodiment 3 of the present invention. As shown in Fig. 6, the correlation masking steps of Embodiment 3 are as follows:
Step 601: Receive the detection data of a fault event.
Step 602: Determine whether the fault event is a master fault event. If so, go to step 603; otherwise, go to step 608.
Step 603: Detect the master fault event and modify the value of its error code according to the detection data.
Step 604: Determine whether the master fault event is in the fault state. If so, go to step 605; otherwise, go to step 606.
Step 605: Set the mask flag value of the controlled fault event corresponding to the master fault event; the flow ends.
Here, the mask flag value of the controlled fault event is set to N, where N is the number of master fault events related to the controlled fault event.
Step 606: Determine whether the mask flag value of the controlled fault event is 0. If so, the flow ends; otherwise, go to step 607.
Step 607: Decrease the mask flag value of the controlled fault event by 1; the flow ends.
Here, the mask flag value of the controlled fault event is decreased by 1 so that, when the controlled fault event is detected later, this value can be used to decide whether detection is masked. Because this is N-to-one masking, any one master fault event can set the mask flag to N; only when none of the master fault events is in the fault state can the mask flag be reduced to 0, at which point detection of the controlled fault event is no longer masked.
Step 608: Determine whether the mask flag value of the controlled fault event is 0. If so, go to step 609; otherwise, go to step 610.
Step 609: Detect the controlled fault event and modify the value of its error code according to the detection data; the flow ends.
Step 610: Mask detection and set the error code of the controlled fault event to a value in the unknown interval; the flow ends.
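The mask flag handling in steps 604 to 610 can be sketched as two small functions. The sketch follows the steps as written; the dictionary layout, the function names, and the constant `UNKNOWN` (a hypothetical value from the unknown interval) are assumptions for illustration.

```python
UNKNOWN = 200  # hypothetical value from the unknown interval


def on_master_detected(master_faulty, controlled, master_count):
    """Steps 604 to 607: update the controlled event's mask flag.

    master_count is N, the number of master fault events related to
    the controlled fault event.
    """
    if master_faulty:
        controlled["mask"] = master_count   # step 605: set flag to N
    elif controlled["mask"] != 0:
        controlled["mask"] -= 1             # step 607: count down toward 0


def on_controlled_detected(controlled, new_code):
    """Steps 608 to 610: detect the controlled event unless it is masked."""
    if controlled["mask"] == 0:
        controlled["error_code"] = new_code   # step 609
    else:
        controlled["error_code"] = UNKNOWN    # step 610: masked


controlled = {"mask": 0, "error_code": 0}
on_master_detected(True, controlled, master_count=2)    # a master fails
on_controlled_detected(controlled, new_code=100)
masked_code = controlled["error_code"]
on_master_detected(False, controlled, master_count=2)   # master A normal
on_master_detected(False, controlled, master_count=2)   # master B normal
on_controlled_detected(controlled, new_code=100)
print(masked_code, controlled)
```

While the flag is non-zero the controlled event reports the unknown value; after two normal master detections the flag returns to 0 and the controlled event is detected again.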
The above is only a preferred embodiment of the present invention and is not intended to limit the scope of protection of the present invention.

Claims (15)

1. A fault information acquisition method, characterized in that the method comprises:
acquiring detection data of a fault event and analyzing the detection data of the fault event;
storing and modifying the fault information of each fault event in a unified manner according to the analysis result.
2. The fault information acquisition method according to claim 1, characterized in that the detection data of the fault event is acquired by function call and/or by message passing.
3. The fault information acquisition method according to claim 1, characterized in that the fault information comprises: fault event type code, fault event number, fault event result data, error code, fault event result data type, and mask flag.
4. The fault information acquisition method according to any one of claims 1 to 3, characterized in that the fault event type code, the fault event number, the cache count threshold of the fault event, and the error code value range of the fault event are set;
after the detection data of the fault event is acquired, the method further comprises: determining, according to the fault event type code and fault event number of the fault event, or according to the fault event number alone, whether the number of cached groups of detection data for the fault event has reached the cache count threshold of the fault event; if so, replacing the earliest cached group of detection data for the fault event with the acquired detection data; otherwise, caching the received detection data of the fault event directly;
analyzing the detection data of the fault event is performing historical data analysis on the detection data of the fault event, specifically: determining whether the cached detection data of the fault event is consistent; if so, modifying the fault information of the fault event according to the cached detection data and the set error code value range of the fault event; otherwise, not modifying the fault information of the fault event.
5. The fault information acquisition method according to claim 4, characterized in that replacing the earliest cached group of detection data with the acquired detection data is: overwriting the earliest cached group of detection data for the fault event with the acquired detection data, or is implemented by a shift operation.
6. The fault information acquisition method according to any one of claims 1 to 3, characterized in that the fault event type code, the fault event number, the mask flag, the related fault events, the masking relationship between the fault event and the related fault events, and the error code value range of the fault event are set;
before the detection data of the fault event is acquired, the method further comprises:
querying, according to the fault event type code and fault event number of the fault event, or according to the fault event number alone, the related fault events of the fault event and the masking relationship between the fault event and the related fault events;
determining, according to the masking relationship, whether the fault event is a master fault event; if the fault event is a master fault event, then after the detection data of the master fault event has been acquired and analyzed, modifying the mask flag of its corresponding controlled fault event according to the state of the master fault event;
if the fault event is a controlled fault event, determining whether the mask flag of the controlled fault event is 0; if so, acquiring and analyzing the detection data of the controlled fault event; otherwise, setting the error code of the controlled fault event to a value in the unknown interval according to the set error code value range of the fault event, and masking detection.
7. The fault information acquisition method according to any one of claims 1 to 3, characterized in that the fault event type code, the fault event number, the related fault events, the related-state relationship between the fault event and the related fault events, and the error code value range of the fault event are set;
analyzing the detection data of the fault event is performing correlation analysis on the detection data of the fault event, specifically: after the detection data of the fault event is acquired, querying, according to the fault event type code and fault event number of the fault event, or according to the fault event number alone, the related fault events of the fault event and the related-state relationship between the fault event and the related fault events; then modifying the fault information of the fault event or of its related fault events according to the state of the related fault events, the related-state relationship between the fault event and the related fault events, and the set error code value range of the fault event.
8, fault information acquisition method according to claim 3 is characterized in that, this method also comprises:
The fault message of the event of failure of Alarm Unit periodic queries storage, the shielding in the failure judgement information marks whether to be 0, if, do not carry out any operation, described event of failure is in masked state; If not, with error code be converted into the alarm or return to form;
If the state consistency whether state after the comparison error sign indicating number transforms reports with described event of failure the last time consistent, is not carried out any operation; Otherwise, according to the fault message of described event of failure, send alarm information or recover message, and upgrade the state that described event of failure the last time of self storage reports to the OMC of operation maintenance center webmaster.
9. The fault information acquisition method according to claim 3, characterized in that the method further comprises:
A diagnosis unit, in response to a diagnosis request from the OMC network management system, reads the fault information of the fault event and sends the fault event type code, fault event number, error code, and fault event result data directly to the OMC network management system in the form of a message;
The OMC network management system provides the corresponding fault description according to the fault event type code and fault event number, and parses the error code and the fault event result data to obtain the diagnosis result.
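The exchange in claim 9 can be sketched in Python as follows. The claim only says the fields are sent "in the form of a message", so the JSON encoding, the description table, the error code range, and the function names are assumptions for illustration.

```python
import json

# Hypothetical description table held by the OMC: (type code, event number) -> text.
DESCRIPTIONS = {(1, 101): "Link to peer board lost"}

def build_diagnosis_message(type_code, event_no, info):
    """Diagnosis unit: forward the stored fault information without re-analyzing it."""
    return json.dumps({
        "type_code": type_code,
        "event_no": event_no,
        "error_code": info["error_code"],
        "result_data": info["result_data"],
    })

def omc_interpret(message, fault_range=(1, 99)):
    """OMC side: look up the description and parse the error code."""
    msg = json.loads(message)
    description = DESCRIPTIONS.get((msg["type_code"], msg["event_no"]),
                                   "unknown fault event")
    low, high = fault_range  # hypothetical inclusive range of fault codes
    status = "fault" if low <= msg["error_code"] <= high else "normal"
    return f"{description}: {status} (error code {msg['error_code']})"
```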
10. A fault information acquisition system, characterized in that the system comprises a detection data acquisition unit, a detection data analysis unit, and a fault information storage unit; wherein,
The detection data acquisition unit is configured to acquire the detection data of fault events;
The detection data analysis unit is configured to analyze the detection data acquired by the detection data acquisition unit and, according to the analysis result, to modify the fault information stored in the fault information storage unit;
The fault information storage unit is configured to store the fault information of each fault event.
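The three units of claim 10 could be wired together roughly as in this Python sketch. The class names, the detector callables, the record fields, and the use of a single error code range are all hypothetical; the sketch only shows the direction of data flow from acquisition through analysis into the unified store.

```python
class FaultInfoStore:
    """Fault information storage unit: one record per fault event."""
    def __init__(self):
        self._records = {}

    def update(self, key, **fields):
        self._records.setdefault(key, {}).update(fields)

    def read(self, key):
        return dict(self._records.get(key, {}))


class DetectionDataAnalyzer:
    """Detection data analysis unit: turns detection data into fault information."""
    def __init__(self, store, fault_range):
        self.store = store
        self.fault_range = fault_range  # hypothetical inclusive range of fault codes

    def analyze(self, key, data):
        low, high = self.fault_range
        code = data["error_code"]
        self.store.update(key, error_code=code, faulty=low <= code <= high,
                          result_data=data.get("result_data"))


class DetectionDataAcquirer:
    """Detection data acquisition unit: polls detectors behind differing interfaces."""
    def __init__(self, analyzer, detectors):
        self.analyzer = analyzer
        self.detectors = detectors  # key -> callable returning detection data

    def acquire_all(self):
        for key, detect in self.detectors.items():
            self.analyzer.analyze(key, detect())
```

Consumers such as an alarm or diagnosis component would then call only `FaultInfoStore.read`, never the detectors themselves.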
11. The fault information acquisition system according to claim 10, characterized in that the fault information acquisition system further comprises a parameter setting unit and a historical data cache unit;
The parameter setting unit is configured to set the fault event type code, the fault event number, the cache count threshold of the fault event, and the error code value range of the fault event;
The detection data analysis unit is further configured to cache the acquired detection data into the historical data cache unit according to the parameters set by the parameter setting unit; and, after the acquired detection data have been cached into the historical data cache unit, to judge whether the detection data of the fault event cached in the historical data cache unit are consistent; if they are consistent, to modify the fault information of the fault event stored in the fault information storage unit according to the error code value range set by the parameter setting unit and the acquired detection data; otherwise, not to modify the fault information of the fault event stored in the fault information storage unit;
The historical data cache unit is configured, after receiving detection data from the detection data analysis unit, to judge according to the parameters set by the parameter setting unit whether the number of groups of detection data it has cached for the fault event has reached the cache count threshold of the fault event; if so, to replace the earliest cached group of detection data for the fault event with the acquired detection data; otherwise, to cache the received detection data of the fault event directly.
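The history cache in claim 11 behaves like a fixed-size ring buffer per fault event, which the following Python sketch models with `collections.deque`. Two points are assumptions rather than claim text: consistency is only evaluated once the cache holds the full threshold number of groups, and "consistent" is taken to mean that all cached groups compare equal.

```python
from collections import deque

class HistoryCache:
    """Historical data cache unit: keeps the latest `threshold` groups per event."""
    def __init__(self, threshold):
        self._threshold = threshold  # cache count threshold
        self._groups = {}

    def add(self, key, data):
        # deque(maxlen=...) discards the earliest group once the threshold is reached.
        self._groups.setdefault(key, deque(maxlen=self._threshold)).append(data)

    def is_consistent(self, key):
        groups = self._groups.get(key, ())
        # Assumption: only a full cache of identical groups counts as consistent.
        return (len(groups) == self._threshold
                and all(g == groups[0] for g in groups))


def analyze(cache, store, key, data):
    """Cache the new detection data, then update the store only if history agrees."""
    cache.add(key, data)
    if cache.is_consistent(key):
        store[key] = data
```

With a threshold of 3, a single stray reading leaves the stored fault information untouched, so transient glitches do not toggle alarms.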
12. The fault information acquisition system according to claim 10, characterized in that the fault information acquisition system further comprises a parameter setting unit, the parameter setting unit being configured to set the fault event type code, the fault event number, the mask flag, the related fault events of the fault event, the associated masking relation between the fault event and its related fault events, and the error code value range of the fault event;
The detection data acquisition unit is further configured, before acquiring the detection data of a fault event, to judge according to the associated masking relation set by the parameter setting unit whether the fault event is a master fault event; if the fault event is a master fault event, then after the detection data of the master fault event have been acquired and analyzed, the mask flag of the corresponding controlled fault event stored in the fault information storage unit is modified according to the state of the master fault event stored in the fault information storage unit and the masking relation set by the parameter setting unit;
If the fault event is a controlled fault event, the detection data acquisition unit judges whether the mask flag of the fault event stored in the fault information storage unit is 0; if so, the detection data acquisition unit acquires the detection data of the fault event; otherwise, the detection data acquisition unit, according to the parameters set by the parameter setting unit, modifies the error code of the controlled fault event in the fault information storage unit to a value in the unknown interval and masks the detection.
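The master/controlled masking in claim 12 can be sketched in Python as below. The sketch follows the wording of this claim, in which a mask flag of 0 means the detection data are acquired. The claim does not spell out how the master's state maps to the mask flag, so the rule "mask controlled events while the master is in fault" is an assumption, as are the `UNKNOWN_CODE` value, the error code range, and the simplification of writing the detected code straight into the store.

```python
UNKNOWN_CODE = 255  # hypothetical value inside the "unknown" error code interval

def acquire(key, store, controlled_by, fault_range, detect):
    """Acquire detection data for one fault event, honouring mask relations.

    controlled_by maps a master event key to the keys of the events it controls.
    """
    if key in controlled_by:  # master fault event
        code = detect(key)
        store[key]["error_code"] = code
        low, high = fault_range
        master_faulty = low <= code <= high
        for controlled in controlled_by[key]:
            # Assumed relation: controlled events are masked while the master is faulty.
            store[controlled]["mask_flag"] = 1 if master_faulty else 0
        return code
    if store[key]["mask_flag"] == 0:  # as worded in this claim: 0 means acquire
        code = detect(key)
        store[key]["error_code"] = code
        return code
    store[key]["error_code"] = UNKNOWN_CODE  # detection masked, state unknown
    return None
```

For example, if a board-offline event controls a link-down event, the link is not probed while the board is offline and its error code is reported as unknown.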
13. The fault information acquisition system according to claim 10, characterized in that the fault information acquisition system further comprises a parameter setting unit, the parameter setting unit being configured to set the fault event type code, the fault event number, the related fault events of the fault event, the correlation-state relation between the fault event and its related fault events, and the error code value range of the fault event;
The detection data analysis unit is further configured to modify, according to the parameters set by the parameter setting unit, the fault information of the fault event or of its related fault events stored in the fault information storage unit.
14. The fault information acquisition system according to claim 10, characterized in that the system further comprises an alarm unit configured to periodically query the fault information of the fault event stored in the fault information storage unit and to judge whether the mask flag in the fault information is 0; if so, no operation is performed and the fault event is in the masked state; if not, the error code is converted into an alarm state or a recovery state, and the state obtained by converting the error code is compared with the last-reported state of the fault event stored in the alarm unit; if the two are consistent, no operation is performed; otherwise, an alarm message or a recovery message is sent to the OMC network management system according to the fault information of the fault event, and the last-reported state of the fault event stored by the alarm unit itself is updated.
15. The fault information acquisition system according to claim 10, characterized in that the system further comprises a diagnosis unit configured to respond to a diagnosis request from the OMC network management system by reading the fault information of the fault event stored in the fault information storage unit and sending the fault event type code, fault event number, error code, and fault event result data directly to the OMC network management system in the form of a message.
CN2009100854951A 2009-05-22 2009-05-22 Fault information acquisition method and system Expired - Fee Related CN101562827B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009100854951A CN101562827B (en) 2009-05-22 2009-05-22 Fault information acquisition method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2009100854951A CN101562827B (en) 2009-05-22 2009-05-22 Fault information acquisition method and system

Publications (2)

Publication Number Publication Date
CN101562827A true CN101562827A (en) 2009-10-21
CN101562827B CN101562827B (en) 2011-05-25

Family

ID=41221402

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009100854951A Expired - Fee Related CN101562827B (en) 2009-05-22 2009-05-22 Fault information acquisition method and system

Country Status (1)

Country Link
CN (1) CN101562827B (en)

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102215145A (en) * 2011-06-07 2011-10-12 中兴通讯股份有限公司 Method and device for reporting detection result of link connected state
CN102693177A (en) * 2011-03-23 2012-09-26 中国移动通信集团公司 Fault diagnosing and processing methods of virtual machine as well as device and system thereof
CN102984739A (en) * 2011-09-07 2013-03-20 中兴通讯股份有限公司 Breakdown information processing method and processing device
WO2013040922A1 (en) * 2011-09-22 2013-03-28 中兴通讯股份有限公司 Method and apparatus for acquiring data after failures occurred in base station
WO2013046210A1 (en) * 2011-09-26 2013-04-04 Wipro Limited System and method for active knowledge management
CN105044550A (en) * 2015-04-28 2015-11-11 国家电网公司 Distribution network line fault positioning method based on fault current discharge path
WO2016062154A1 (en) * 2014-10-21 2016-04-28 中兴通讯股份有限公司 Information acquisition method and device, and communication system
WO2016090982A1 (en) * 2014-12-12 2016-06-16 中兴通讯股份有限公司 Base station malfunction collection method and system
CN106126397A (en) * 2016-06-19 2016-11-16 乐视控股(北京)有限公司 The processing method of program crashing message and system
CN106155827A (en) * 2016-06-28 2016-11-23 浪潮(北京)电子信息产业有限公司 A kind of cpu fault its diagnosis processing method based on Linux system and system
CN106155040A (en) * 2015-05-13 2016-11-23 奥特润株式会社 The failure code control system of engine controller and control method
CN106339297A (en) * 2016-09-14 2017-01-18 郑州云海信息技术有限公司 Method and system for warning failures of storage system in real time
CN106708234A (en) * 2016-12-28 2017-05-24 郑州云海信息技术有限公司 Method and device for monitoring states of power supplies of system on basis of CPLD
CN106873576A (en) * 2017-03-21 2017-06-20 奇瑞汽车股份有限公司 The detection method and device of vehicle trouble
CN107885838A (en) * 2017-11-09 2018-04-06 陕西外号信息技术有限公司 A kind of optical label fault detecting and positioning method and system based on user data
CN108249243A (en) * 2018-02-02 2018-07-06 河南中盛物联网有限公司 A kind of elevator Internet of Things fault recognition method
CN108667918A (en) * 2018-04-25 2018-10-16 青岛海信移动通信技术股份有限公司 A kind of device status monitoring method and device
CN109459635A (en) * 2018-11-09 2019-03-12 杭州妙娱科技有限公司 Reality-virtualizing game equipment fault monitoring method and device
CN109947798A (en) * 2017-09-18 2019-06-28 中国移动通信有限公司研究院 A kind of processing method and processing device of stream event
CN110532122A (en) * 2019-08-26 2019-12-03 东软医疗系统股份有限公司 Failure analysis methods and system, electronic equipment, storage medium
CN110728261A (en) * 2019-10-23 2020-01-24 武汉奇致激光技术股份有限公司 Method for feeding back fault information of laser medical cosmetic equipment
CN112867040A (en) * 2020-11-11 2021-05-28 南京熊猫电子股份有限公司 Automatic alarm analysis system for small base station
CN113281587A (en) * 2021-04-26 2021-08-20 Tcl王牌电器(惠州)有限公司 Detection method and system based on manufacturability design simulator
CN116032799A (en) * 2021-10-25 2023-04-28 中移物联网有限公司 Fault detection method, device and storage medium

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102693177A (en) * 2011-03-23 2012-09-26 中国移动通信集团公司 Fault diagnosing and processing methods of virtual machine as well as device and system thereof
CN102693177B (en) * 2011-03-23 2015-02-04 中国移动通信集团公司 Fault diagnosing and processing methods of virtual machine as well as device and system thereof
CN102215145A (en) * 2011-06-07 2011-10-12 中兴通讯股份有限公司 Method and device for reporting detection result of link connected state
CN102984739A (en) * 2011-09-07 2013-03-20 中兴通讯股份有限公司 Breakdown information processing method and processing device
WO2013040922A1 (en) * 2011-09-22 2013-03-28 中兴通讯股份有限公司 Method and apparatus for acquiring data after failures occurred in base station
WO2013046210A1 (en) * 2011-09-26 2013-04-04 Wipro Limited System and method for active knowledge management
CN105591687A (en) * 2014-10-21 2016-05-18 中兴通讯股份有限公司 Information acquisition method, Information acquisition device and communication system
WO2016062154A1 (en) * 2014-10-21 2016-04-28 中兴通讯股份有限公司 Information acquisition method and device, and communication system
WO2016090982A1 (en) * 2014-12-12 2016-06-16 中兴通讯股份有限公司 Base station malfunction collection method and system
CN105744556A (en) * 2014-12-12 2016-07-06 中兴通讯股份有限公司 Method and system for base station fault acquisition
CN105044550A (en) * 2015-04-28 2015-11-11 国家电网公司 Distribution network line fault positioning method based on fault current discharge path
CN106155040B (en) * 2015-05-13 2019-11-05 奥特润株式会社 The fault code control system and control method of engine controller
CN106155040A (en) * 2015-05-13 2016-11-23 奥特润株式会社 The failure code control system of engine controller and control method
CN106126397A (en) * 2016-06-19 2016-11-16 乐视控股(北京)有限公司 The processing method of program crashing message and system
CN106155827A (en) * 2016-06-28 2016-11-23 浪潮(北京)电子信息产业有限公司 A kind of cpu fault its diagnosis processing method based on Linux system and system
CN106339297A (en) * 2016-09-14 2017-01-18 郑州云海信息技术有限公司 Method and system for warning failures of storage system in real time
CN106708234A (en) * 2016-12-28 2017-05-24 郑州云海信息技术有限公司 Method and device for monitoring states of power supplies of system on basis of CPLD
CN106873576A (en) * 2017-03-21 2017-06-20 奇瑞汽车股份有限公司 The detection method and device of vehicle trouble
CN109947798A (en) * 2017-09-18 2019-06-28 中国移动通信有限公司研究院 A kind of processing method and processing device of stream event
CN107885838A (en) * 2017-11-09 2018-04-06 陕西外号信息技术有限公司 A kind of optical label fault detecting and positioning method and system based on user data
CN107885838B (en) * 2017-11-09 2021-12-21 陕西外号信息技术有限公司 Optical label fault detection and positioning method and system based on user data
CN108249243A (en) * 2018-02-02 2018-07-06 河南中盛物联网有限公司 A kind of elevator Internet of Things fault recognition method
CN108667918A (en) * 2018-04-25 2018-10-16 青岛海信移动通信技术股份有限公司 A kind of device status monitoring method and device
CN109459635A (en) * 2018-11-09 2019-03-12 杭州妙娱科技有限公司 Reality-virtualizing game equipment fault monitoring method and device
CN110532122A (en) * 2019-08-26 2019-12-03 东软医疗系统股份有限公司 Failure analysis methods and system, electronic equipment, storage medium
CN110532122B (en) * 2019-08-26 2023-05-30 东软医疗系统股份有限公司 Fault analysis method and system, electronic equipment and storage medium
CN110728261A (en) * 2019-10-23 2020-01-24 武汉奇致激光技术股份有限公司 Method for feeding back fault information of laser medical cosmetic equipment
CN112867040A (en) * 2020-11-11 2021-05-28 南京熊猫电子股份有限公司 Automatic alarm analysis system for small base station
CN113281587A (en) * 2021-04-26 2021-08-20 Tcl王牌电器(惠州)有限公司 Detection method and system based on manufacturability design simulator
CN113281587B (en) * 2021-04-26 2023-03-10 Tcl王牌电器(惠州)有限公司 Detection method and system based on manufacturability design simulator
CN116032799A (en) * 2021-10-25 2023-04-28 中移物联网有限公司 Fault detection method, device and storage medium

Also Published As

Publication number Publication date
CN101562827B (en) 2011-05-25

Similar Documents

Publication Publication Date Title
CN101562827B (en) Fault information acquisition method and system
US5119377A (en) System and method for software error early detection and data capture
US6651183B1 (en) Technique for referencing failure information representative of multiple related failures in a distributed computing environment
US7941707B2 (en) Gathering information for use in diagnostic data dumping upon failure occurrence
US7007269B2 (en) Method of providing open access to application profiling data
US7313661B1 (en) Tool for identifying causes of memory leaks
EP2609501B1 (en) Dynamic calculation of sample profile reports
CN102340808B (en) Alert processing method and device
CN102567185B (en) Monitoring method of application server
CN101188523A (en) Generation method and generation system of alarm association rules
WO2002054255A1 (en) A method for managing faults in a computer system environment
CN111046011A (en) Log collection method, system, node, electronic device and readable storage medium
US20200073738A1 (en) Error incident fingerprinting with unique static identifiers
US20130318505A1 (en) Efficient Unified Tracing of Kernel and User Events with Multi-Mode Stacking
CN101645736A (en) Detection method and device of validity of historical performance data
CN114385551B (en) Log time-sharing management method, device, equipment and storage medium
CN108089978A (en) A kind of diagnostic method for analyzing ASP.NET application software performance and failure
CN110659147A (en) Self-repairing method and system based on module self-checking behavior
CN1671110A (en) An automatic fault location method and system
Ng Developing RFID database models for analysing moving tags in supply chain management
JP2004530209A (en) Method for detecting and recording system information and behavior in a component-based distributed software system operating in parallel
US7447732B2 (en) Recoverable return code tracking and notification for autonomic systems
CN112100019B (en) Multi-source fault collaborative analysis positioning method for large-scale system
JP2015049705A (en) Log generation device and log generation method
CN106096804B (en) Monitoring method for whole maintenance process of intelligent power grid dispatching control system model

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20110525

Termination date: 20170522
