CN103793309A - Method and device for early warning of batch services - Google Patents

Publication number
CN103793309A
Authority
CN
China
Prior art keywords
average
batch service
batch
time
baseline
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201210423143.4A
Other languages
Chinese (zh)
Other versions
CN103793309B (en
Inventor
吴永卫
陈航
肖爱元
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Group Zhejiang Co Ltd
Original Assignee
China Mobile Group Zhejiang Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Group Zhejiang Co Ltd filed Critical China Mobile Group Zhejiang Co Ltd
Priority to CN201210423143.4A priority Critical patent/CN103793309B/en
Publication of CN103793309A publication Critical patent/CN103793309A/en
Application granted granted Critical
Publication of CN103793309B publication Critical patent/CN103793309B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The invention is suitable for the field of databases and provides a method and device for early warning of batch services. The method comprises the steps of detecting system resource consumption, and utilizing detection results to calculate system resource consumption indexes; calculating time baselines, comparing the system resource consumption indexes with the time baselines, and judging whether a system runs normally; when the system does not run normally, comparing currently running services with batch service characteristics of known batch services in a batch service list, and recognizing abnormal batch services; and sending warning information to a warning platform according to recognition results. The calculated system resource consumption indexes are compared with the time baselines to judge whether the system runs normally, the abnormal batch services are recognized by comparison with the batch service characteristics of the known batch services, and the corresponding warning information is sent to the warning platform. Through the method and device for early warning of the batch services, maintenance workers can rapidly foresee the abnormal batch services and make rapid and correct treatment decisions.

Description

A batch service early-warning method and device
Technical field
The present invention applies to the field of databases, and in particular relates to a batch service early-warning method and device.
Background art
As business demands on the Business & Operation Support System (BOSS) keep growing, the types and number of batch services in BOSS keep increasing, and batch service anomalies occur more often. A batch service anomaly means that a batch service is not started within its normal time period, or is started within its normal time period but occupies too many system resources. An abnormal batch service can heavily consume central processing unit (CPU) and input/output (I/O) bus resources, which slows the processing of every service in the BOSS system and increases user waiting time.
In the prior art, centralized monitoring is used to find Structured Query Language (SQL) statements that occupy too many system resources and to alert the database administrator (DBA). However, because there are many kinds of services, the DBA cannot understand the model of each service in depth and therefore cannot immediately map a problem SQL statement to a specific service. As a result, batch service anomalies that have already occurred cannot be resolved quickly, and the problem of slow service processing remains unsolved.
Summary of the invention
An embodiment of the present invention provides a batch service early-warning method. It is intended to solve the problem that the prior art, which uses centralized monitoring to find SQL statements occupying too many system resources, cannot quickly resolve batch service anomalies that have already occurred, so that service processing remains slow.
The embodiment of the present invention is implemented as a batch service early-warning method comprising the following steps:
Detecting system resource consumption, and using the detection results to calculate a system resource consumption index;
Calculating a time baseline, comparing the system resource consumption index with the time baseline, and judging whether the system is running normally;
When the system is not running normally, comparing the currently running services with the batch service features of the known batch services in a batch service list, and identifying the abnormal batch service;
Sending alarm information to an alarm platform according to the identification result.
An embodiment of the present invention also provides a batch service early-warning device, comprising:
A resource probe unit, configured to detect system resource consumption and use the detection results to calculate a system resource consumption index;
A time baseline unit, configured to calculate a time baseline, compare the system resource consumption index with the time baseline, and judge whether the system is running normally;
A batch service identification unit, configured to, when the system is not running normally, compare the currently running services with the batch service features of the known batch services in a batch service list and identify the abnormal batch service; and
An information alarm unit, configured to send alarm information to an alarm platform according to the identification result.
In the embodiments of the present invention, the calculated system resource consumption index is compared with the time baseline to judge whether the system is running normally, abnormal batch services are identified by comparison with the batch service features of known batch services, and the corresponding alarm information is sent to the alarm platform. This helps maintenance personnel anticipate abnormal batch services quickly and make correct handling decisions fast.
Brief description of the drawings
Fig. 1 is a flowchart of the batch service early-warning method provided by an embodiment of the present invention;
Fig. 2 is a structural diagram of the batch service early-warning device provided by an embodiment of the present invention;
Fig. 3 is a structural diagram of the resource probe unit provided by an embodiment of the present invention;
Fig. 4 is a structural diagram of the time baseline unit provided by an embodiment of the present invention;
Fig. 5 is a structural diagram of the batch service identification unit provided by an embodiment of the present invention;
Fig. 6 is a structural diagram of the information alarm unit provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and are not intended to limit it.
In the embodiments of the invention, operating system resource consumption and/or database system resource consumption is detected to calculate a system resource consumption index and a time baseline, and the two are compared to judge whether the system is running normally. For a system that is not running normally, the currently running services are compared with the batch service features of known batch services to identify the abnormal batch service, and the corresponding alarm information is sent to the alarm platform.
Fig. 1 shows the flow of the batch service early-warning method provided by an embodiment of the present invention, described in detail as follows:
In step S101, system resource consumption is detected, and the detection results are used to calculate a system resource consumption index;
In the embodiments of the present invention, system resource consumption is at least one of operating system resource consumption and database system resource consumption.
In the embodiments of the present invention, operating system resource consumption and/or database system resource consumption is detected once every 30 seconds.
As one embodiment of the present invention, operating system resource consumption is at least one of CPU utilization pcpu, memory utilization pmen, and I/O utilization pio.
As one embodiment of the present invention, database system resource consumption comprises statistical information consumption and active service consumption. Wherein:
Statistical information consumption is at least one of: average logons per second ps_logon, average logical reads per second ps_logic, average physical reads per second ps_physi, average transactions per second ps_trans, average redo log volume per second ps_redos, average active sessions per second ps_activ, average cumulative undo log usage per second ps_undo, average hard parses per second ps_hard, and average opened cursors per second ps_cursor.
Active service consumption is at least one of: rollback segment contention wait pw_us, index contention wait pw_ic, sequence contention wait pw_sq, hot block contention wait pw_hotblk, and log file sync wait pw_logfile.
In the embodiments of the present invention, when the operating system resource consumption index psys is calculated, the importance of CPU and memory utilization is taken into account: the weights of CPU, memory, and I/O are set to 5:4:1, that is, 50% of the index comes from CPU resources, 40% from memory resources, and 10% from I/O resources.
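The 5:4:1 weighting can be illustrated with a minimal sketch. It assumes that psys is a plain linear weighted sum and that the three inputs are utilization percentages in the range 0 to 100; the text states only the proportions, so the function name `os_consumption_index` and the range check are illustrative assumptions.

```python
def os_consumption_index(pcpu: float, pmen: float, pio: float) -> float:
    """Weighted OS resource consumption index, CPU : memory : I/O = 5 : 4 : 1.

    Assumption: inputs are utilization percentages in [0, 100] and the index
    is a linear weighted sum. The source text gives only the 5:4:1 proportion.
    """
    for name, value in (("pcpu", pcpu), ("pmen", pmen), ("pio", pio)):
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"{name} must be a percentage in [0, 100], got {value}")
    # 50% from CPU, 40% from memory, 10% from I/O.
    return 0.5 * pcpu + 0.4 * pmen + 0.1 * pio


if __name__ == "__main__":
    # Hypothetical sample: 80% CPU, 60% memory, 20% I/O.
    print(os_consumption_index(80.0, 60.0, 20.0))
```

With these weights a CPU spike moves the index five times as much as an equal I/O spike, which matches the stated emphasis on CPU and memory.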
In step S102, a time baseline is calculated, the system resource consumption index is compared with the time baseline, and whether the system is running normally is judged;
As one embodiment of the present invention, the time baseline comprises at least one of an operating system resource baseline and a database system resource baseline. Wherein:
The operating system resource baseline is the CPU baseline bs_cpu_N, the memory baseline bs_mem_N, and the I/O baseline bs_io_N. The database system resource baseline comprises at least one of: logon count baseline bs_logon_N, logical read baseline bs_logic_N, physical read baseline bs_physi_N, transaction count baseline bs_trans_N, redo log volume baseline bs_redos_N, active session baseline bs_redos_N, cumulative undo log usage baseline bs_undo_N, hard parse baseline bs_hard_N, opened cursor count baseline bs_cursor_N, rollback segment contention wait baseline bs_us_N, index contention wait baseline bs_ic_N, sequence contention wait baseline bs_sq_N, hot block contention wait baseline bs_hotblk_N, and log file sync wait baseline bs_logfile_N.
In the embodiments of the present invention, the operating system resource baseline is calculated by the following steps:
1. Divide the operating time into segments according to the collected batch service execution times;
2. For operating system resource consumption, calculate the mean M1 of the daily averages in each time segment over the set time threshold, calculate the mean M2 of the daily maxima in each time segment over the set time threshold, and then calculate the mean M3 of M1 and M2.
In the embodiments of the present invention, the collected batch service execution times can be divided into 4 time segments, namely 22:00-6:00, 6:00-8:00, 8:00-17:00, and 17:00-22:00, and the time threshold for calculating the operating system resource baseline is set to 2 weeks.
Taking the calculation of the CPU baseline bs_cpu_N as an example: at 12:00 a.m. every day, for each of the 4 daily time segments, the mean M1-N of the daily average CPU utilization and the mean M2-N of the daily maximum CPU utilization over the most recent 2 weeks are calculated, and M1-N and M2-N are then averaged to obtain bs_cpu_N. Here N takes the values 1, 2, 3, and 4: bs_cpu_1 is the CPU baseline for 22:00-6:00, bs_cpu_2 is the CPU baseline for 6:00-8:00, bs_cpu_3 is the CPU baseline for 8:00-17:00, and bs_cpu_4 is the CPU baseline for 17:00-22:00.
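The baseline calculation above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function names `segment_of` and `compute_baselines` are hypothetical, the caller is assumed to pass only samples from the most recent 2 weeks, and samples are grouped by calendar date, so the 22:00-6:00 segment of a given date combines that date's 00:00-06:00 and 22:00-24:00 samples.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def segment_of(ts: datetime) -> int:
    """Map a timestamp to segment N (1..4) using the boundaries from the text."""
    hour = ts.hour
    if hour >= 22 or hour < 6:
        return 1  # 22:00-6:00
    if hour < 8:
        return 2  # 6:00-8:00
    if hour < 17:
        return 3  # 8:00-17:00
    return 4      # 17:00-22:00


def compute_baselines(samples):
    """Return {N: baseline} from (datetime, value) samples.

    For each segment N: M1 is the mean of the daily averages, M2 is the mean
    of the daily maxima, and the baseline is the mean of M1 and M2.
    Assumption: samples already cover only the chosen time threshold.
    """
    per_day = defaultdict(list)  # (date, segment) -> values seen that day
    for ts, value in samples:
        per_day[(ts.date(), segment_of(ts))].append(value)

    daily_means = defaultdict(list)   # segment -> one average per day
    daily_maxima = defaultdict(list)  # segment -> one maximum per day
    for (_day, segment), values in per_day.items():
        daily_means[segment].append(mean(values))
        daily_maxima[segment].append(max(values))

    return {
        segment: (mean(daily_means[segment]) + mean(daily_maxima[segment])) / 2
        for segment in daily_means
    }
```

Averaging the daily mean with the daily maximum places the baseline above typical load but below the observed peaks, so short routine peaks are less likely to trigger the abnormal-operation check than they would be against a pure average.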
In the embodiments of the present invention, the database system resource baseline is calculated by the following steps:
1. Divide the operating time into segments according to the collected batch service execution times;
2. For database system resource consumption, calculate the mean M4 of the daily averages in each time segment over the set time threshold, calculate the mean M5 of the daily maxima in each time segment over the set time threshold, and then calculate the mean M6 of M4 and M5.
In the embodiments of the present invention, the database system resource baseline is calculated in the same way as the operating system resource baseline. The calculation of each baseline is identical to the calculation of the CPU baseline bs_cpu_N described above.
In step S103, when the system is not running normally, the currently running services are compared with the batch service features of the known batch services in the batch service list to identify the abnormal batch service;
In the embodiments of the present invention, when the system is not running normally, the currently running services are compared with the batch service features of the known batch services in the batch service list to identify the abnormal batch service; when the system is running normally, the method waits for the next detection. The batch service features comprise the execution time period, the characteristic query statement SQL, the characteristic query statement identifier SQL_ID, and the batch service name. The SQL_ID is determined by the characteristic query statement SQL.
In the embodiments of the present invention, the abnormal batch service is identified by the following steps:
1. Periodically scan the database view V$session, and rank out the wait event that consumes the most system resources and the SQL_ID that consumes the most resources.
2. If the system resource consumption index is higher than the batch service baseline for the current time, look up this SQL_ID in the batch service list.
3. Compare the SQL_ID of the SQL statement executed by the currently running session with the SQL_IDs in the batch service list. If they match, a known batch service is abnormal; if they do not match, either an unknown batch service is abnormal or the abnormal service is not a batch service.
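Steps 2 and 3 above can be sketched as a lookup against an in-memory batch service list. The sketch assumes that the top SQL_ID has already been obtained from the session scan in step 1 and is passed in as a plain string; the names `BatchFeature`, `identify`, and the result labels are hypothetical, and the SQL_ID values in the usage note are made up.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class BatchFeature:
    """One entry of the batch service list (fields follow the feature list in the text)."""
    name: str          # batch service name
    sql: str           # characteristic query statement
    sql_id: str        # identifier determined by the characteristic statement
    start_hour: int    # start of the normal execution period (hypothetical encoding)
    end_hour: int      # end of the normal execution period (hypothetical encoding)


def identify(
    top_sql_id: str,
    consumption_index: float,
    baseline: float,
    batch_list: List[BatchFeature],
) -> Tuple[str, Optional[BatchFeature]]:
    """Classify the top resource-consuming SQL_ID against the batch service list."""
    # Step 2: only look the SQL_ID up when the index exceeds the current baseline.
    if consumption_index <= baseline:
        return ("normal", None)
    # Step 3: a match means a known batch service is abnormal.
    for feature in batch_list:
        if feature.sql_id == top_sql_id:
            return ("known_batch", feature)
    # No match: an unknown batch service, or not a batch service at all.
    return ("unknown_or_non_batch", None)
```

For example, with a list containing one entry whose `sql_id` is `"abc123"`, calling `identify("abc123", 90.0, 70.0, batch_list)` returns the label `"known_batch"` together with that entry, while any other SQL_ID under the same load returns `"unknown_or_non_batch"`.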
In step S104, alarm information is sent to the alarm platform according to the identification result.
As a preferred embodiment of the present invention, the step of sending alarm information to the alarm platform according to the identification result is at least one of the following:
1. When the abnormal batch service is a known batch service, judge whether the abnormal batch service is running within its normal time period; if so, send batch service attribute anomaly alarm information to the alarm platform; otherwise, send alarm information to the alarm platform indicating that the batch service is running at an abnormal time;
2. When the abnormal batch service is an unknown batch service, send unknown service anomaly alarm information to the alarm platform;
3. When a new batch service, or a new feature of an existing batch service, is found, add the new batch service or the new feature information of the existing batch service to the batch service list;
4. When the abnormal service is unrelated to batch services, send non-batch anomaly alarm information to the alarm platform, and record the anomaly in the document library.
In the embodiments of the present invention, suppose a known batch service A and its owner B. If batch service A is abnormal, its normal running time period is checked further. If the time period does not match, the alarm reads "Batch service A is running outside its normal time period; please contact batch service owner B and stop it immediately." If the time period matches, the alarm reads "Batch service A attribute anomaly; please contact application batch service owner B and investigate immediately!"
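The known-batch-service branch of the alarm logic can be sketched as below. It assumes the normal time period is expressed as whole hours and may wrap past midnight (for example 22 to 6); the function names `in_window` and `known_batch_alarm` are hypothetical, and the message wording paraphrases the two alarm texts in the example above.

```python
def in_window(hour: int, start_hour: int, end_hour: int) -> bool:
    """True if hour lies in [start_hour, end_hour), allowing wrap past midnight."""
    if start_hour <= end_hour:
        return start_hour <= hour < end_hour
    # Window such as 22 -> 6 spans midnight.
    return hour >= start_hour or hour < end_hour


def known_batch_alarm(service: str, owner: str, hour: int,
                      start_hour: int, end_hour: int) -> str:
    """Build the alarm text for a known batch service that was found abnormal."""
    if in_window(hour, start_hour, end_hour):
        # Running in its normal period but abnormal: attribute anomaly.
        return (f"Batch service {service} attribute anomaly; "
                f"please contact batch service owner {owner} and investigate immediately!")
    # Running outside its normal period: ask the owner to stop it.
    return (f"Batch service {service} is running outside its normal time period; "
            f"please contact batch service owner {owner} and stop it immediately.")


if __name__ == "__main__":
    # Hypothetical case: service A normally runs 22:00-6:00 but is seen at 10:00.
    print(known_batch_alarm("A", "B", hour=10, start_hour=22, end_hour=6))
```

Keeping the window check separate from the message construction makes it straightforward to reuse the same check for the other alarm branches if they also need the execution period.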
Batch services that are not in the batch service list are classified, according to the type of system resource consumption index, into five types: strong insert type (Insertion Insensible, II), strong modify type (Modification Insensible, MI), strong delete type (Deletion Insensible, DI), combined job of the first three types (Combination Job, CJ), and non-batch service anomaly (Not Batch, NB). When alarming, the alarm information can be sent with the keyword of the corresponding type, so that maintenance personnel can quickly identify the type, communicate with the relevant application personnel, and handle the anomaly accordingly.
Fig. 2 shows the structure of the batch service early-warning device provided by an embodiment of the present invention; for ease of description, only the parts relevant to the embodiment are shown.
The resource probe unit (Resource Probe Model, RPM) 21 detects system resource consumption and uses the detection results to calculate a system resource consumption index.
The time baseline unit (Time-based Baseline Model, TBM) 22 calculates a time baseline, compares the system resource consumption index with the time baseline, and judges whether the system is running normally.
The batch service identification unit (Batch Job Identification Model, BJIM) 23, when the system is not running normally, compares the currently running services with the batch service features of the known batch services in the batch service list and identifies the abnormal batch service.
The information alarm unit (Alarm Policy Model, APM) 24 sends alarm information to the alarm platform according to the identification result of the batch service identification unit 23.
In overall design, the embodiment of the present invention adopts a three-layer application architecture: a metadata acquisition layer, a data analysis layer, and an application layer. The three layers are separate from one another yet interdependent, which reflects the flexibility of the architecture and provides an early-warning method for batch job anomalies.
The resource probe unit RPM belongs to the metadata acquisition layer. It detects host resources such as CPU, memory, and I/O on the monitored hosts and stores the results in a database, providing base data for the data analysis layer.
The time baseline unit TBM and the batch service identification unit BJIM belong to the data analysis layer. They analyze the metadata collected by the acquisition layer, periodically compute over the data in the database to derive the time baseline, and generate result data that gives the application layer a basis for alarms.
The information alarm unit APM belongs to the application layer. It is responsible for alarming on batch job anomalies, SMS escalation, and so on, and provides decision information so that maintenance personnel can track and handle anomalies quickly and accurately.
Fig. 3 shows the structure of the resource probe unit provided by an embodiment of the present invention, described in detail as follows:
The resource probe unit RPM 21 comprises an operating system resource probe module 31 and/or a database system resource probe module 32. Wherein:
The operating system resource probe module 31 detects operating system resource consumption.
The database system resource probe module 32 detects database system resource consumption.
As one embodiment of the present invention, the database system resource probe module 32 comprises a statistical information probe submodule 321 and/or an active service probe submodule 322. Wherein:
The statistical information probe submodule 321 detects statistical information consumption.
The active service probe submodule 322 detects active service consumption.
As one embodiment of the present invention, operating system resource consumption is at least one of CPU utilization pcpu, memory utilization pmen, and I/O utilization pio.
As one embodiment of the present invention, statistical information consumption is average logons per second ps_logon, average logical reads per second ps_logic, average physical reads per second ps_physi, average transactions per second ps_trans, average redo log volume per second ps_redos, average active sessions per second ps_activ, average cumulative undo log usage per second ps_undo, average hard parses per second ps_hard, and average opened cursors per second ps_cursor. Active service consumption comprises at least one of rollback segment contention wait pw_us, index contention wait pw_ic, sequence contention wait pw_sq, hot block contention wait pw_hotblk, and log file sync wait pw_logfile.
In the embodiments of the present invention, when the operating system resource consumption index psys is calculated, the importance of CPU and memory utilization is taken into account: the weights of CPU, memory, and I/O are set to 5:4:1, that is, 50% of the index comes from CPU resources, 40% from memory resources, and 10% from I/O resources.
Fig. 4 shows the structure of the time baseline unit provided by an embodiment of the present invention, described in detail as follows:
The time baseline unit TBM 22 comprises an operating system resource baseline module 41 and/or a database system resource baseline module 42. Wherein:
The operating system resource baseline module 41 calculates the operating system resource baseline.
The database system resource baseline module 42 calculates the database system resource baseline.
As one embodiment of the present invention, the operating system resource baseline module 41 comprises an operating time segmentation submodule 411 and a first mean calculation submodule 412. Wherein:
The operating time segmentation submodule 411 divides the operating time into segments according to the collected batch service execution times.
The first mean calculation submodule 412 calculates, for operating system resource consumption, the mean M1 of the daily averages in each time segment over the set time threshold, the mean M2 of the daily maxima in each time segment over the set time threshold, and the mean M3 of M1 and M2.
As one embodiment of the present invention, the database system resource baseline module 42 comprises a second mean calculation submodule 421. The second mean calculation submodule 421 calculates, for database system resource consumption, the mean M4 of the daily averages in each time segment over the set time threshold, the mean M5 of the daily maxima in each time segment over the set time threshold, and the mean M6 of M4 and M5.
As one embodiment of the present invention, the operating system resource baseline is at least one of the CPU baseline bs_cpu_N, the memory baseline bs_mem_N, and the I/O baseline bs_io_N. The database system resource baseline is at least one of: logon count baseline bs_logon_N, logical read baseline bs_logic_N, physical read baseline bs_physi_N, transaction count baseline bs_trans_N, redo log volume baseline bs_redos_N, active session baseline bs_redos_N, cumulative undo log usage baseline bs_undo_N, hard parse baseline bs_hard_N, opened cursor count baseline bs_cursor_N, rollback segment contention wait baseline bs_us_N, index contention wait baseline bs_ic_N, sequence contention wait baseline bs_sq_N, hot block contention wait baseline bs_hotblk_N, and log file sync wait baseline bs_logfile_N.
Fig. 5 shows the structure of the batch service identification unit provided by an embodiment of the present invention, described in detail as follows:
The batch service identification unit BJIM 23 comprises a batch service list module 51 and a batch service feature module 52. The batch service list module 51 stores the known batch services, and the batch service feature module 52 identifies abnormal batch services.
In the embodiments of the present invention, abnormal batch services are identified by comparing the currently running services with the batch service features of the known batch services in the batch service list. The batch service features comprise the execution time period, the characteristic query statement SQL, the characteristic query statement identifier SQL_ID, and the batch service name. The SQL_ID is determined by the characteristic query statement SQL.
Fig. 6 shows the structure of the information alarm unit provided by an embodiment of the present invention, described in detail as follows:
The information alarm unit APM 24 comprises a known batch service anomaly alarm module 61, an unknown batch service anomaly alarm module 62, a new batch service alarm module 63, and a non-batch service anomaly alarm module 64.
The known batch service anomaly alarm module 61, when the abnormal batch service is a known batch service, judges whether the abnormal batch service is running within its normal time period; if so, it sends batch service attribute anomaly alarm information to the alarm platform; otherwise, it sends alarm information to the alarm platform indicating that the batch service is running at an abnormal time.
The unknown batch service anomaly alarm module 62, when the abnormal batch service is an unknown batch service, sends unknown service anomaly alarm information to the alarm platform.
The new batch service alarm module 63, when a new batch service or a new feature of an existing batch service is found, adds the new batch service or the new feature information of the existing batch service to the batch service list.
The non-batch service anomaly alarm module 64, when the abnormal service is unrelated to batch services, sends non-batch anomaly alarm information to the alarm platform and records the anomaly in the document library.
In the embodiments of the invention, operating system resource consumption and/or database system resource consumption is detected to calculate a system resource consumption index and a time baseline, and the two are compared to judge whether the system is running normally. For a system that is not running normally, the currently running services are compared with the batch service features of known batch services to identify the abnormal batch service, and the corresponding alarm information is sent to the alarm platform. Maintenance personnel can thus anticipate possible abnormal services quickly and make correct handling decisions fast, gradually eliminating core system failures caused by abnormal batch services.
The above is only the preferred embodiment of the present invention. It should be pointed out that those skilled in the art can make improvements and modifications without departing from the principles of the invention, and such improvements and modifications should also be regarded as falling within the protection scope of the present invention.

Claims (13)

1. a batch service method for early warning, is characterized in that, described method comprises the steps:
System resources consumption is surveyed, and utilized result of detection computing system resource to consume index;
Computing time, baseline, consumed index and the contrast of described time basis by described system resource, judged whether system operation is normal;
Move when undesired when system, by the batch service Characteristic Contrast of known batch service in the business of current operation and batch service list, identify abnormal batch service;
Send a warning message to alarm platform according to recognition result.
2. the method for claim 1, is characterized in that, described system resources consumption comprises operating-system resources consumption and/or Database Systems resource consumption, and described time basis comprises operating-system resources baseline and/or Database Systems resource baseline.
3. method as claimed in claim 2, is characterized in that, utilization rate, memory usage that described operating-system resources consumption is central processor CPU, and at least one in input/output bus I/O utilization rate.
4. method as claimed in claim 2, is characterized in that, described Database Systems resource consumption comprises statistical information consumption and/or live traffic consumption;
Described statistical information consumption is that average login times per second, average logic per second are read, accumulative total use amount, the average hard parsing per second of physical read average per second, number of transactions average per second, day quality average per second, active session average per second, rewind journal average per second, and average vernier per second is opened at least one in counting;
Described live traffic consumption is that roll-back segment contention is waited for, index contention is waited for, serial contention is waited for, focus piece contention is waited for, and journal file synchronous waiting at least one.
5. method as claimed in claim 2, is characterized in that, described operating-system resources baseline calculates by following step:
According to statistics the batch service execution time activity duration is carried out to segmentation;
Calculate respectively described operating-system resources consumption in setting-up time threshold value within each time period of every day the average M1 of average, and in setting-up time threshold value within each time period of every day the average M2 of mxm., and calculate the average M3 of described average M1 and average M2.
6. method as claimed in claim 2, is characterized in that, the following step of passing through of described Database Systems resource baseline is calculated:
According to statistics the batch service execution time activity duration is carried out to segmentation;
Calculate respectively described Database Systems resource consumption in setting-up time threshold value within each time period of every day the average M4 of average, and in setting-up time threshold value within each time period of every day the average M5 of mxm., and calculate the average M6 of described average M4 and average M5.
7. The method as claimed in claim 1, characterized in that the step of sending alarm information to the alarm platform according to the recognition result is at least one of the following steps:
When the abnormal batch service is a known batch service, judging whether the abnormal batch service is executing within its normal period; if so, sending batch service attribute abnormality alarm information to the alarm platform; otherwise, sending batch service abnormal execution time alarm information to the alarm platform;
When the abnormal batch service is an unknown batch service, sending unknown service abnormality alarm information to the alarm platform;
When a new batch service is found, or a new feature of an existing batch service is found, adding the new batch service or the new feature information of the existing batch service to the batch service list;
When the abnormal service is unrelated to batch services, sending non-batch abnormality alarm information to the alarm platform, and adding the abnormal situation to a document library for the record.
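The alarm branches of claim 7 can be sketched as a small decision function. The claim says "at least one of" the steps, so treating them as mutually exclusive branches is an assumption made here for clarity; the `Alert` names and the boolean inputs are hypothetical, and the list-update step (adding new batch services or features) is left out.

```python
from enum import Enum


class Alert(Enum):
    BATCH_ATTRIBUTE_ABNORMAL = "batch service attribute abnormality"
    BATCH_ABNORMAL_TIME = "batch service abnormal execution time"
    UNKNOWN_BATCH = "unknown service abnormality"
    NON_BATCH = "non-batch abnormality"


def choose_alert(is_batch: bool, is_known: bool, in_normal_period: bool) -> Alert:
    """Pick the alarm type for one abnormal service (branches assumed exclusive)."""
    if not is_batch:
        return Alert.NON_BATCH
    if not is_known:
        return Alert.UNKNOWN_BATCH
    if in_normal_period:
        # Known batch running when expected: its behaviour itself is off.
        return Alert.BATCH_ATTRIBUTE_ABNORMAL
    # Known batch running outside its usual period.
    return Alert.BATCH_ABNORMAL_TIME
```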
8. A batch service early-warning device, characterized in that the device comprises:
A resource probe unit, configured to detect system resource consumption and to calculate a system resource consumption index from the detection results;
A time baseline unit, configured to calculate a time baseline, compare the system resource consumption index with the time baseline, and judge whether the system is running normally;
A batch service recognition unit, configured to, when the system is not running normally, compare the currently running services with the batch service features of the known batch services in a batch service list, and identify abnormal batch services; and
An information alarm unit, configured to send alarm information to the alarm platform according to the recognition result.
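The four units of claim 8 can be read as one monitoring cycle, sketched below under several assumptions that the claim does not state: the system counts as abnormal when the index exceeds the baseline, a service matches a known batch service when it carries all of that batch service's feature tags, and alarms are plain strings. The names `RunningService` and `early_warning_cycle` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass(frozen=True)
class RunningService:
    name: str
    features: FrozenSet[str]   # hypothetical feature tags, e.g. program name


def early_warning_cycle(
    index: float,
    baseline: float,
    running: List[RunningService],
    known_batches: Dict[str, FrozenSet[str]],
    send_alert: Callable[[str], None],
) -> List[str]:
    """Run one cycle and return the alarm messages that were sent."""
    if index <= baseline:          # assumed rule: at or below baseline is normal
        return []
    sent = []
    for service in running:
        # A known batch matches when all of its features appear on the service.
        matched = [name for name, features in known_batches.items()
                   if features <= service.features]
        if matched:
            message = f"abnormal batch service: {matched[0]}"
        else:
            message = f"unrecognized service: {service.name}"
        send_alert(message)
        sent.append(message)
    return sent
```

Passing `send_alert` as a callable keeps the alarm platform outside the sketch, so any transport can be plugged in.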
9. The device as claimed in claim 8, characterized in that the resource probe unit comprises an operating system resource probe module and/or a database system resource probe module;
The operating system resource probe module is configured to detect operating system resource consumption;
The database system resource probe module is configured to detect database system resource consumption;
The time baseline unit comprises an operating system resource baseline module and/or a database system resource baseline module;
The operating system resource baseline module is configured to calculate the operating system resource baseline;
The database system resource baseline module is configured to calculate the database system resource baseline.
10. The device as claimed in claim 8, characterized in that the database system resource probe module comprises a statistical information probe submodule and/or a live traffic probe submodule;
The statistical information probe submodule is configured to detect statistical information consumption;
The live traffic probe submodule is configured to detect live traffic consumption;
The statistical information consumption is at least one of: average logons per second, average logical reads per second, average physical reads per second, average transactions per second, average log volume per second, average active sessions per second, average cumulative rollback log usage per second, average hard parses per second, and average cursors opened per second;
The live traffic consumption is at least one of: rollback segment contention waits, index contention waits, sequence contention waits, hot block contention waits, and log file sync waits.
11. The device as claimed in claim 9, characterized in that the operating system resource baseline module comprises:
An operating time segmentation submodule, configured to segment the operating time according to the statistically collected batch service execution times;
A first mean calculation submodule, configured to calculate, for the operating system resource consumption within a set time threshold, the mean M1 of the average values in each time period of each day and the mean M2 of the maximum values in each time period of each day, and to calculate the mean M3 of the mean M1 and the mean M2.
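One way the operating time segmentation submodule might work is to cut the day at the start and end of every observed batch window, so that each resulting time period lies either wholly inside or wholly outside a batch run. The claim does not prescribe this rule; the `split_day` helper and the hour-based windows are assumptions for illustration.

```python
def split_day(batch_windows, day_end=24):
    """Split [0, day_end) into consecutive segments at batch window boundaries.

    batch_windows: list of (start_hour, end_hour) with 0 <= start < end <= day_end.
    Returns a list of (segment_start, segment_end) tuples in chronological order.
    """
    cuts = {0, day_end}
    for start, end in batch_windows:
        cuts.update((start, end))
    ordered = sorted(cuts)
    return list(zip(ordered, ordered[1:]))
```

For example, batch runs observed at 01:00-04:00 and 22:00-24:00 would yield four segments: 0-1, 1-4, 4-22 and 22-24.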
12. The device as claimed in claim 9, characterized in that the database system resource baseline module comprises:
A second mean calculation submodule, configured to calculate, for the database system resource consumption within a set time threshold, the mean M4 of the average values in each time period of each day and the mean M5 of the maximum values in each time period of each day, and to calculate the mean M6 of the mean M4 and the mean M5.
13. The device as claimed in claim 8, characterized in that the information alarm unit comprises:
A known batch service abnormality alarm module, configured to, when the abnormal batch service is a known batch service, judge whether the abnormal batch service is executing within its normal period; if so, send batch service attribute abnormality alarm information to the alarm platform; otherwise, send batch service abnormal execution time alarm information to the alarm platform;
An unknown batch service abnormality alarm module, configured to, when the abnormal batch service is an unknown batch service, send unknown service abnormality alarm information to the alarm platform;
A newly added batch service alarm module, configured to, when a new batch service is found or a new feature of an existing batch service is found, add the new batch service or the new feature information of the existing batch service to the batch service list;
A non-batch service abnormality alarm module, configured to, when the abnormal service is unrelated to batch services, send non-batch abnormality alarm information to the alarm platform, and add the abnormal situation to a document library for the record.
CN201210423143.4A 2012-10-29 2012-10-29 Method and device for early warning of batch services Active CN103793309B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210423143.4A CN103793309B (en) 2012-10-29 2012-10-29 Method and device for early warning of batch services

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210423143.4A CN103793309B (en) 2012-10-29 2012-10-29 Method and device for early warning of batch services

Publications (2)

Publication Number Publication Date
CN103793309A true CN103793309A (en) 2014-05-14
CN103793309B CN103793309B (en) 2017-11-21

Family

ID=50669012

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210423143.4A Active CN103793309B (en) 2012-10-29 2012-10-29 Method and device for early warning of batch services

Country Status (1)

Country Link
CN (1) CN103793309B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1677934A (en) * 2004-03-31 2005-10-05 华为技术有限公司 Method and system for monitoring network service performance
CN1794645A (en) * 2005-08-24 2006-06-28 上海浦东软件园信息技术有限公司 Invading detection method and system based on procedure action
CN101075919A (en) * 2006-06-22 2007-11-21 腾讯科技(深圳)有限公司 Method and system for monitoring Internet service
CN101123786A (en) * 2007-07-26 2008-02-13 中国移动通信集团山东有限公司 Intelligent control method for GRPS service
US20090024502A1 (en) * 2007-07-20 2009-01-22 Nanjing Lianchuang Science & Technology Inc., Ltd. Application Method of Online Charging System in Arrears Risk Control System
CN101692736A (en) * 2009-09-16 2010-04-07 南京联创科技集团股份有限公司 Method for monitoring telecom mobile service exchange based on flex technology

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104363113A (en) * 2014-10-29 2015-02-18 中国建设银行股份有限公司 Business continuity detection method
CN105589785A (en) * 2015-12-08 2016-05-18 中国银联股份有限公司 Device and method for monitoring IO (Input/Output) performance of storage equipment
CN105610647A (en) * 2015-12-30 2016-05-25 华为技术有限公司 Service abnormity detection method and server
CN111427748A (en) * 2020-03-31 2020-07-17 携程计算机技术(上海)有限公司 Task warning method, system, equipment and storage medium
CN112882807A (en) * 2021-02-04 2021-06-01 永辉云金科技有限公司 Breakpoint re-running batch processing system and method
CN112860523A (en) * 2021-03-16 2021-05-28 中国工商银行股份有限公司 Fault prediction method and device for batch job processing and server
CN112860523B (en) * 2021-03-16 2024-06-25 中国工商银行股份有限公司 Batch job processing fault prediction method, device and server
CN115114374A (en) * 2022-06-27 2022-09-27 腾讯科技(深圳)有限公司 Transaction execution method and device, computing equipment and storage medium

Also Published As

Publication number Publication date
CN103793309B (en) 2017-11-21

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant