CN115712664A - Method and system for screening cases according to time frame based on log data - Google Patents
Method and system for screening cases according to time frame based on log data
- Publication number
- CN115712664A (application CN202310030955.0A)
- Authority
- CN
- China
- Prior art keywords
- log data
- case
- timestamp
- event
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Debugging And Monitoring (AREA)
Abstract
The invention discloses a method and a system for screening cases according to time frames based on log data. Log data with service relevance can be extracted rapidly and effectively from event logs by time frame, while erroneous timestamps in the logs and logs beyond the analysis range are cut and filtered, which facilitates service process discovery and process mining analysis. The method comprises: obtaining log data of a business event; screening, ordering and classifying the log data with a process discovery algorithm; generating and storing an event case table; setting timestamps and extracting the relevant temporary log data within the timestamp interval from the event case table; setting a filtering mode and filtering the temporary log data with the selected filtering mode to obtain effective case log data; and outputting the effective case log data.
Description
Technical Field
The invention relates to the technical field of information systems, in particular to a method and a system for screening cases according to time frames based on log data.
Background
Computer systems are widely used in fields such as network services and databases because of their good scalability and high-speed computation. During operation, a computer system generates event records, which are combined to form a log file. An event record includes information such as a timestamp, a message, operation records of servers, workstations and applications, and records of activity associated with objects such as a database system. The log information recorded in the log file supports failure analysis, discovery of characteristics and rules among events, and the search for failure phenomena or for associations between logs and events.
In an information system, a behavior process of information processing and circulation is generally regarded as an event, and the information system records the event as service log data so that system administrators can monitor and audit the operating state of the service system. Service log data is usually recorded in persistent memory as a time series. In actual production services, the service log data involves multiple transactions and multiple resources, so events are scattered when recorded in the service log data table, individual records carry no absolute business order, and it is difficult to obtain the service flow data associated with a service within a given period. Correlation analysis of log events is therefore needed to accurately screen out cases for failure analysis or for characterizing features and rules.
Most log analysis software currently on the market only classifies and counts the events occurring in the information system and focuses on monitoring and locating along event dimensions, for example: the frequency of events, the time of events, and the resources triggered by events. Because log data is recorded mainly per event rather than per service, it is difficult to obtain the relationship between preceding and following events from the log records, which makes troubleshooting the cause of an event difficult. Currently, target case screening is mainly realized based on time frames, and the existing procedure for obtaining service logs based on a timestamp range is as follows:
a. selecting the start time for searching the event log;
b. editing SQL query statements according to the time range set by the analysis requirement, setting a start timestamp and an end timestamp, and extracting the log data within the time range from the log data table through the corresponding fields. For log analysis, professional analysts must screen, combine and investigate the log data in the database by writing specific SQL query statements, which requires the analyst to have a deep understanding of the structure of the business and the data.
However, the above method has the following problems:
(1) The extracted log data is ordered on the time axis but is not coherent at the service level, so it cannot provide much information of value for correlation analysis at the service level.
(2) When log data is extracted, a start timestamp and an end timestamp are set, and screening and cutting the data strictly at these timestamps cuts apart the whole business (namely the whole case) near each timestamp. For example, some services start before the start timestamp and some services have not ended by the end timestamp; such absolute clipping loses log data and has an irreversible effect on service flow discovery and mining.
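The information loss described in problem (2) can be illustrated with a minimal sketch. The `event_log` table, its columns `case_id`, `event_name` and `ts`, and the integer timestamps are all assumptions made for illustration; none of them come from the source.

```python
import sqlite3

# Hypothetical log rows: (case_id, event_name, ts). Integer timestamps
# stand in for real datetimes to keep the example short.
rows = [
    ("c1", "submit", 1), ("c1", "review", 3), ("c1", "approve", 6),
    ("c2", "submit", 7), ("c2", "review", 8), ("c2", "approve", 9),
    ("c3", "submit", 10), ("c3", "approve", 12),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_log (case_id TEXT, event_name TEXT, ts INTEGER)")
conn.executemany("INSERT INTO event_log VALUES (?, ?, ?)", rows)

# Prior-art style extraction: an absolute cut at the two timestamps.
start_ts, end_ts = 5, 10
in_range = conn.execute(
    "SELECT case_id, event_name, ts FROM event_log "
    "WHERE ts BETWEEN ? AND ? ORDER BY ts",
    (start_ts, end_ts),
).fetchall()

# Compare how many events of each case survive the cut with how many exist.
kept, total = {}, {}
for case_id, _, _ in in_range:
    kept[case_id] = kept.get(case_id, 0) + 1
for case_id, _, _ in rows:
    total[case_id] = total.get(case_id, 0) + 1
truncated_cases = sorted(c for c in kept if kept[c] < total[c])
print(truncated_cases)
```

In this toy data, case `c1` starts before the start timestamp and case `c3` ends after the end timestamp, so both come back with only part of their events.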
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a method for screening cases according to time frames based on log data, which can quickly and effectively extract log data with service relevance from event logs in a time frame mode, and meanwhile, cut and filter error time stamps or logs beyond an analysis range in the logs, thereby bringing convenience to service process discovery and process mining analysis.
In order to achieve the purpose, the invention adopts the following technical scheme:
a method for screening cases according to time frame based on log data, the method being implemented based on a computer system, the method comprising: S1, acquiring log data of a service event: extracting and storing the log data of the service event according to the time log analysis requirement;
S2, screening, ordering and classifying the log data by adopting a process discovery algorithm;
S3, generating an event case table based on the classified log data and storing the event case table;
S4, setting timestamps and a timestamp interval according to the extraction requirement, wherein the timestamps comprise a start timestamp and an end timestamp, and the timestamp interval refers to the time range between the start timestamp and the end timestamp, inclusive of both; extracting the relevant temporary log data within the timestamp interval from the event case table, wherein the temporary log data refers to the data in the log data that falls within the time range from the start timestamp to the end timestamp;
S5, setting a filtering mode, and filtering the temporary log data by adopting the selected filtering mode to obtain effective case log data; the filtering modes include at least three: a first filtering mode, a second filtering mode and a third filtering mode;
the first filtering mode refers to: acquiring, from the log data, first case log data intersecting the timestamp interval; the second filtering mode refers to: cutting out incomplete case log data from the log data, and taking the remaining log data in the timestamp interval as second case log data; the third filtering mode refers to: taking all temporary log data in the timestamp interval as third case log data;
the specific steps of filtering the log data by adopting the corresponding filtering mode comprise:
S51, acquiring the start case and the end case which are cut off by the start timestamp and the end timestamp respectively in the temporary log data;
S52, selecting a filtering mode, and filtering the log data by adopting the corresponding filtering mode to obtain effective case log data, wherein the effective case log data is one of the first case log data, the second case log data and the third case log data;
filtering the log data by adopting the corresponding filtering mode comprises: selecting the first filtering mode and adding the cut log data cut off by the timestamps to the temporary log data; or selecting the second filtering mode and subtracting the incomplete case log data from the temporary log data; or selecting the third filtering mode and taking all temporary log data in the timestamp interval as the third case log data;
and S6, outputting the effective case log data and the start case and the end case which are cut off by the start timestamp and the end timestamp respectively.
It is further characterized in that:
in step S1, the step of extracting the log data of the service event includes: S11, defining a business activity object according to the time log analysis requirement;
S12, accessing the service information system database and locating the log record table;
S13, according to the business activity object defined in step S11 and the field related to event names, finding the event names corresponding to the business activity object in the log record table;
S14, forming a set from the event names corresponding to the business activity object;
S15, querying and extracting, through the query interface provided by the database, all log data whose event-name field matches a name in the set;
S16, selecting the field corresponding to the business case from the log data as the case field;
S17, loading the log data into a computer memory for storage;
further, in step S2, the step of screening, ordering and classifying the log data by using the process discovery algorithm includes:
S21, setting the parameters of the process discovery algorithm: a case field, an event field, a timestamp field;
S22, computing over the log data with the process discovery algorithm, and screening, ordering and classifying the activity event log data according to the case field to obtain event classification data, wherein the activity event log data refers to data related to a business activity object in the log data;
S23, loading the event classification data into a buffer for buffering;
further, in step S3, the event classification data is put into a table, and an event case table is generated and cached;
further, in step S51, the selecting step includes:
S511, selecting, as a first event activity, the earliest record in the temporary log data whose timestamp is greater than or equal to the start timestamp, and selecting, as a second event activity, the latest record in the temporary log data whose timestamp is less than or equal to the end timestamp;
S512, taking the complete case corresponding to the first event activity as the start case, taking the complete case corresponding to the second event activity as the end case, and taking the start case and the end case as the cases cut off by the timestamps;
further, in step S52, filtering the log data by using the corresponding filtering mode includes:
S521, selecting the first filtering mode, searching the case table for the cases cut off by the timestamps, and obtaining the cut log data cut off by the timestamps according to the integrity of event activities and cases;
adding the cut log data to the temporary log data to obtain first case log data intersecting the timestamp interval;
S522, selecting the second filtering mode, and searching the case table for incomplete case log data;
subtracting the incomplete case log data from the temporary log data to obtain second case log data in the timestamp interval;
and S523, selecting the third filtering mode, wherein all temporary log data in the timestamp interval is used as third case log data, and no case integrity check is performed in this mode.
A system for realizing the method for screening cases according to the time frame based on log data comprises a computer system, and is characterized in that the computer system comprises a data processor, a memory and a display device, wherein the data processor is provided with an analysis and extraction module and a classification module which are connected in sequence; the display device is provided with a timestamp setting module and a filtering module; and the analysis and extraction module is used for analyzing the business events and extracting the log data of the business events;
the classification module is used for screening, ordering and classifying the log data;
the timestamp setting module is used for setting a timestamp interval;
the filtering module is used for setting a filtering mode, filtering the log data by adopting the selected filtering mode and acquiring effective case log data in the timestamp interval; the filtering modes include at least three: a first filtering mode, a second filtering mode and a third filtering mode; the first filtering mode refers to: acquiring, from the log data, first case log data intersecting the timestamp interval; the second filtering mode refers to: cutting out incomplete case log data from the log data, and taking the remaining log data in the timestamp interval as second case log data; the third filtering mode refers to: taking all temporary log data in the timestamp interval as third case log data; filtering the log data by adopting the corresponding filtering mode comprises: selecting the first filtering mode and adding the cut log data cut off by the timestamps to the temporary log data; or selecting the second filtering mode and subtracting the incomplete case log data from the temporary log data; or selecting the third filtering mode and taking all temporary log data in the timestamp interval as the third case log data; the effective case log data is one of the first case log data, the second case log data and the third case log data;
the memory is used for storing the log data, the intermediate data and the screening case results;
and the display device calls the data stored in the memory and outputs and displays the data through an interactive interface.
It is further characterized in that:
the memory comprises a persistent memory and a buffer, wherein the persistent memory is used for storing log data; the buffer is used for buffering intermediate data and screening case results;
the intermediate data comprises event classification data and an event case table;
the screening case results at least comprise: effective case log data, and the start case and the end case that are cut off by the start timestamp and the end timestamp respectively.
The method of the invention can achieve the following beneficial effects: the order of events is sorted out at the service level by a process discovery algorithm, the event log data is deduced and classified, and the classified case data is then screened and filtered by different filtering modes to obtain effective case log data. Because the events are ordered at the business level and the event log data is deduced and classified by the process discovery algorithm before the log data is screened and filtered, the log data cut by a timestamp is known to belong to the same whole case, and the other, uncut cases in the timestamp interval are also known to be whole cases, which facilitates accurate identification of the cases in and near the timestamp interval during subsequent cutting and filtering.
In the screening method, before temporary log data in case classification data are filtered by adopting different filtering modes, the case integrity is analyzed by using a time frame (or a timestamp), and the case within a timestamp range and adjacent to the timestamp is ensured to be in a coherent state at a service level, so that useful value information is provided for service level event log correlation analysis.
Service personnel can flexibly select one of the filtering modes to carry out correlation or non-correlation analysis on the event log according to the analysis requirement of actual service activity, so that the application range of the system is expanded by setting three filtering modes in the system for screening the case according to the time frame based on the log data.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings required to be used in the description of the embodiments are briefly introduced below, the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings may be obtained according to the drawings without creative efforts.
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a block diagram of the system architecture of the present invention;
FIG. 3 is a graph of log data for events occurring before adjustment of a timestamp interval according to the present invention;
FIG. 4 is a graph of log data for events occurring after adjustment of timestamp intervals in accordance with the present invention;
reference numerals: the system comprises a computer system 1, an analysis and extraction module 101, a classification module 102, a timestamp setting module 103, a filtering module 104, a display device 21 and an interactive interface 201.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "comprises" and "comprising," and any variations thereof, in the description and claims of this invention and the above-described drawings are intended to cover a non-exclusive inclusion, such that a process, method, apparatus, article, or device that comprises a list of steps or elements is not necessarily limited to those steps or elements explicitly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or device.
The prior art relies only on event classification and statistics and on monitoring and locating along the event dimension. As a result, the correlation between events before and after an occurrence is difficult to obtain from log data, the root cause of an event is difficult to find, little information of value for correlation analysis is available at the business level, and absolute cutting loses log data and reduces the accuracy of business process discovery and mining. To address these problems, a specific embodiment of a method for screening cases according to time frames based on log data is provided below, as shown in FIG. 1. The method is implemented on a computer system and comprises the following steps: S1, acquiring log data of a service event: extracting and storing the log data of the service event according to the time log analysis requirement. The specific steps comprise:
S11, defining a business activity object according to the time log analysis requirement; examples of business activity objects are audit billing and audit processes;
S12, accessing the service information system database, and locating the log record table according to the design specification of the original service information system;
S13, according to the business activity object defined in step S11 and the field related to event names, finding the event names corresponding to the business activity object in the log record table;
S14, forming a set from the event names corresponding to the business activity object;
S15, querying and extracting, through the query interface provided by the database, all log data whose event-name field matches a name in the set;
S16, selecting the field corresponding to the business case from the log data as the case field;
and S17, loading the log data into a persistent memory for storage.
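Steps S11 to S16 can be sketched as follows. This is only an illustration under stated assumptions: the `log_record` table, its columns, the event names and the choice of `order_no` as the case field are hypothetical, and an in-memory SQLite database stands in for the service information system database.

```python
import sqlite3

# Hypothetical log record table of a service information system (S12).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE log_record (order_no TEXT, event_name TEXT, ts INTEGER, operator TEXT)"
)
conn.executemany("INSERT INTO log_record VALUES (?, ?, ?, ?)", [
    ("A-1", "audit_submit", 1, "u1"),
    ("A-1", "audit_approve", 4, "u2"),
    ("A-2", "audit_submit", 2, "u1"),
    ("A-2", "user_login", 3, "u3"),  # unrelated to the audit activity object
])

# S11, S13, S14: the set of event names that belong to the business
# activity object (here an assumed "audit" process).
event_names = {"audit_submit", "audit_approve"}

# S15: extract every record whose event name is in the set, using a
# parameterized IN clause with one placeholder per name.
placeholders = ",".join("?" for _ in event_names)
sql = (
    "SELECT order_no, event_name, ts FROM log_record "
    f"WHERE event_name IN ({placeholders}) ORDER BY ts"
)
log_data = conn.execute(sql, sorted(event_names)).fetchall()

# S16: use order_no as the case field and list the distinct cases.
case_ids = sorted({row[0] for row in log_data})
print(log_data, case_ids)
```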
S2, screening, ordering and classifying the log data by adopting a process discovery algorithm, which specifically comprises the following steps:
S21, configuring the parameters of the process discovery algorithm: a case field, an event field, a timestamp field; these three parameters are the essential fields for mapping log data.
S22, computing over the log data with the process discovery algorithm, and screening, ordering and classifying the activity event log data according to the case field to obtain event classification data, wherein the activity event log data refers to data related to a business activity object in the log data. The process discovery algorithm (e.g., the alpha algorithm) obtains a process model by defining four ordering relations between the activities in the log data; the model takes the business process object as its core and is oriented toward business integrity.
When the log data is computed with the process discovery algorithm, four ordering relations over the log data are first defined for activities x and y: direct succession (also called direct following), causality, parallelism and unrelatedness. Direct succession, x > y, holds if and only if there is a trace in which activity x is immediately followed by y. Causality, x -> y, holds if and only if x > y and not y > x. Parallelism, x || y, holds if and only if x > y and y > x. Unrelatedness, x # y, holds if and only if neither x > y nor y > x. Secondly, a footprint matrix is generated from these ordering relations. Finally, based on the footprint matrix, the log data is screened, ordered and classified according to the case field.
And S23, loading the event classification data into a buffer for buffering.
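The ordering relations and footprint matrix used in step S22 can be sketched as below. The three traces are a made-up example, and the `<-` symbol for reverse causality is a common display convention rather than something the source specifies.

```python
from itertools import product

# Hypothetical traces: each list is the ordered activity sequence of one case.
traces = [
    ["a", "b", "c", "d"],
    ["a", "c", "b", "d"],
    ["a", "e", "d"],
]

activities = sorted({act for trace in traces for act in trace})

# Direct succession x > y: x is immediately followed by y in some trace.
direct = {(t[i], t[i + 1]) for t in traces for i in range(len(t) - 1)}


def relation(x, y):
    """Classify the pair (x, y) from the direct-succession set."""
    forward, backward = (x, y) in direct, (y, x) in direct
    if forward and backward:
        return "||"  # parallel: x > y and y > x
    if forward:
        return "->"  # causal: x > y and not y > x
    if backward:
        return "<-"  # reverse causal: y -> x
    return "#"       # unrelated: neither x > y nor y > x


# Footprint matrix stored as a dict keyed by (row activity, column activity).
footprint = {(x, y): relation(x, y) for x, y in product(activities, repeat=2)}

for x in activities:
    print(x, " ".join(f"{footprint[(x, y)]:>2}" for y in activities))
```

Here `b` and `c` appear in both orders, so they are classified as parallel, while `a` and `d` never follow each other directly and are unrelated.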
And S3, based on the classified log data, putting the event classified data into a table, generating an event case table and caching.
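One possible shape for the event case table of step S3 is sketched below. The source does not specify the table's columns, so the `first_ts`, `last_ts` and `trace` fields are assumptions, as are the sample events and integer timestamps.

```python
from collections import defaultdict

# Hypothetical classified log data in arbitrary order: (case_id, event_name, ts).
log_data = [
    ("c2", "review", 8), ("c1", "submit", 1), ("c2", "submit", 7),
    ("c1", "approve", 6), ("c1", "review", 3), ("c2", "approve", 9),
]

# Group events by the case field.
by_case = defaultdict(list)
for case_id, event_name, ts in log_data:
    by_case[case_id].append((ts, event_name))

# One row per case: time span plus the time-ordered activity trace.
case_table = {}
for case_id, case_events in by_case.items():
    case_events.sort()  # sorts by timestamp first
    case_table[case_id] = {
        "first_ts": case_events[0][0],
        "last_ts": case_events[-1][0],
        "trace": [name for _, name in case_events],
    }

print(case_table)
```

Keeping the first and last timestamp per case makes it cheap to decide later whether a case is cut off by a timestamp.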
And S4, setting timestamps according to the extraction requirement, wherein the timestamps comprise a start timestamp and an end timestamp, and extracting the relevant temporary log data within the timestamp range from the event case table. Specifically: S41, setting a start timestamp and an end timestamp according to the requirement for extracting the service event;
and S42, extracting the temporary log data related to the business activity object, wherein the temporary log data refers to the data in the log data that falls within the time range from the start timestamp to the end timestamp. The temporary log data lies within the timestamp interval, but the relevance of case events has not been considered, so the business information (namely, the log data) it contains is truncated and cannot be used directly. The temporary log data is therefore cut and filtered in the following step S5.
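Step S4 amounts to an inclusive range selection over the event records. A minimal sketch, assuming events are `(case_id, event_name, ts)` tuples with integer timestamps (a hypothetical layout chosen for illustration):

```python
# Hypothetical event records: (case_id, event_name, ts).
EVENTS = [
    ("c1", "submit", 1), ("c1", "review", 3), ("c1", "approve", 6),
    ("c2", "submit", 7), ("c2", "review", 8), ("c2", "approve", 9),
    ("c3", "submit", 10), ("c3", "approve", 12),
]


def extract_temporary(events, start_ts, end_ts):
    """Return the events whose timestamp lies in [start_ts, end_ts], oldest first."""
    if start_ts > end_ts:
        raise ValueError("start timestamp must not be later than end timestamp")
    selected = [e for e in events if start_ts <= e[2] <= end_ts]
    return sorted(selected, key=lambda e: e[2])


temporary = extract_temporary(EVENTS, 5, 10)
print(temporary)
```

The result contains only one event each of `c1` and `c3`, which is why the temporary log data cannot be used directly before the filtering of step S5.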
S5, setting a filtering mode, wherein the filtering mode comprises at least three types: the filter comprises a first filtering mode, a second filtering mode and a third filtering mode, wherein the first filtering mode refers to that: acquiring first case log data with intersection with the timestamp interval from the log data;
the second filtering mode refers to: cutting incomplete case log data in the log data to obtain second case log data in a timestamp interval;
the third filtering mode refers to: all temporary log data within the time stamp interval are taken as third case log data.
Cutting and filtering the temporary log data by adopting different filtering modes to obtain effective case log data, and the method specifically comprises the following steps:
S51, selecting the start case and the end case which are cut off by the start timestamp and the end timestamp respectively in the temporary log data, wherein the selecting step comprises: S511, selecting, as a first event activity, the earliest record in the temporary log data whose timestamp is greater than or equal to the start timestamp, and selecting, as a second event activity, the latest record in the temporary log data whose timestamp is less than or equal to the end timestamp;
and S512, taking the complete case corresponding to the first event activity as the start case, taking the complete case corresponding to the second event activity as the end case, and taking the start case and the end case as the cases cut off by the timestamps.
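Steps S511 and S512 can be sketched as follows, again assuming hypothetical `(case_id, event_name, ts)` tuples. The function name `boundary_cases` is illustrative and not taken from the source.

```python
# Hypothetical event records: (case_id, event_name, ts).
EVENTS = [
    ("c1", "submit", 1), ("c1", "review", 3), ("c1", "approve", 6),
    ("c2", "submit", 7), ("c2", "review", 8), ("c2", "approve", 9),
    ("c3", "submit", 10), ("c3", "approve", 12),
]


def boundary_cases(events, start_ts, end_ts):
    """Return (start_case, end_case) as lists of all events of those cases."""
    temporary = [e for e in events if start_ts <= e[2] <= end_ts]
    if not temporary:
        return [], []
    # S511: earliest and latest records inside the timestamp interval.
    first_activity = min(temporary, key=lambda e: e[2])
    second_activity = max(temporary, key=lambda e: e[2])
    # S512: the complete cases those two records belong to.
    start_case = [e for e in events if e[0] == first_activity[0]]
    end_case = [e for e in events if e[0] == second_activity[0]]
    return start_case, end_case


start_case, end_case = boundary_cases(EVENTS, 5, 10)
print(start_case)
print(end_case)
```

Note that the complete case is read from the full log, not from the temporary log data, so events outside the interval are recovered for both boundary cases.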
S52, selecting a filtering mode, and cutting and filtering the log data by adopting the corresponding filtering mode to obtain effective case log data, wherein the effective case log data is one of the first case log data, the second case log data or the third case log data. This step comprises: S521, selecting the first filtering mode, searching the case table for the cases cut off by the timestamps, and obtaining the cut log data cut off by the timestamps according to the integrity of event activities and cases;
adding the cut log data to the temporary log data to obtain the first case log data intersecting the timestamp interval.
The first filtering mode retains all log data of the cases that lie within the timestamp interval or are cut off by a timestamp, so the log data of the screened cases is more comprehensive, information loss that would reduce the accuracy of event log correlation analysis is avoided, and the analysis accuracy of the event log is improved. For example, in audit flow analysis with the timestamp interval set to one month, under the first filtering mode the start case is a case whose audit flow had not ended in the previous month, and the end case is a case whose audit flow has not ended in the current month (i.e. within the timestamp interval). The cases not ended in the previous month, the cases not ended in the current month, and the cases whose audit flow both started and ended in the current month (i.e. all complete cases within the timestamp interval) are combined, which benefits complete and accurate analysis of the audit flow.
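The first filtering mode can be sketched as the temporary log data plus the out-of-interval events of every case it touches. The event layout and the function name `first_mode` are assumptions for illustration.

```python
# Hypothetical event records: (case_id, event_name, ts).
EVENTS = [
    ("c1", "submit", 1), ("c1", "review", 3), ("c1", "approve", 6),
    ("c2", "submit", 7), ("c2", "review", 8), ("c2", "approve", 9),
    ("c3", "submit", 10), ("c3", "approve", 12),
    ("c4", "submit", 11), ("c4", "approve", 13),  # entirely outside the interval
]


def first_mode(events, start_ts, end_ts):
    """Keep every event of every case that intersects [start_ts, end_ts]."""
    def inside(e):
        return start_ts <= e[2] <= end_ts

    temporary = [e for e in events if inside(e)]
    touched = {e[0] for e in temporary}
    # Cut log data: events of intersecting cases that the timestamps cut away.
    cut = [e for e in events if e[0] in touched and not inside(e)]
    # "Adding" the cut log data to the temporary log data.
    return sorted(temporary + cut, key=lambda e: (e[2], e[0]))


first_case_log = first_mode(EVENTS, 5, 10)
print(first_case_log)
```

Case `c4` has no event inside the interval and is left out, while `c1` and `c3` are restored to their full length.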
S522, selecting the second filtering mode, and searching the case table for incomplete case log data;
subtracting the incomplete case log data from the temporary log data to obtain the second case log data in the timestamp interval;
In the second filtering mode, the incomplete cases in the timestamp interval are cut out and only the log data of complete (whole) cases within the interval is retained; log data of cases that do not lie entirely within the time period is removed, which improves the accuracy of event log correlation analysis within the timestamp interval. For example, in audit flow analysis with the timestamp interval set to one month, the second filtering mode removes the audit flow cases that had not ended in the previous month and the audit flow cases that have not ended in the current month, which facilitates accurate analysis of only the audit flow cases completed within the current month.
S523, selecting the third filtering mode, and using all temporary log data in the timestamp interval as the third case log data; no case integrity check is performed in this mode.
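The second filtering mode can be sketched as a set difference: temporary log data minus the events of cases that have any event outside the interval. Treating "incomplete" as "has an event outside the interval" is an interpretation of the text, and the event layout and function name are hypothetical.

```python
# Hypothetical event records: (case_id, event_name, ts).
EVENTS = [
    ("c1", "submit", 1), ("c1", "review", 3), ("c1", "approve", 6),
    ("c2", "submit", 7), ("c2", "review", 8), ("c2", "approve", 9),
    ("c3", "submit", 10), ("c3", "approve", 12),
]


def second_mode(events, start_ts, end_ts):
    """Keep only events of cases that lie entirely inside [start_ts, end_ts]."""
    def inside(e):
        return start_ts <= e[2] <= end_ts

    temporary = [e for e in events if inside(e)]
    # A case is treated as incomplete if any of its events falls outside.
    incomplete = {e[0] for e in events if not inside(e)}
    # Subtract the incomplete case log data from the temporary log data.
    return [e for e in temporary if e[0] not in incomplete]


second_case_log = second_mode(EVENTS, 5, 10)
print(second_case_log)
```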
The third filtering mode retains all log data within the timestamp interval, that is, the log data of the complete cases within the interval together with the portions of the incomplete cases near the start and end timestamps that fall inside the interval, so that the activity recorded within the interval remains continuous at the service level, which further improves the accuracy of event log analysis. For example, in audit flow case analysis with the timestamp interval set to one month, the third filtering mode counts toward the current month's audit flow analysis both the remaining steps of audit flow cases that had not ended in the previous month (starting from the first event activity, i.e. the earliest record in the temporary log data) and the steps already completed this month in audit flow cases that have not yet ended (up to the second event activity, i.e. the latest record in the temporary log data).
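The third filtering mode returns the temporary log data unchanged. The sketch below adds a per-case coverage summary, which is not part of the source, to make visible which cases appear only partially; the event layout and names are hypothetical.

```python
from collections import Counter

# Hypothetical event records: (case_id, event_name, ts).
EVENTS = [
    ("c1", "submit", 1), ("c1", "review", 3), ("c1", "approve", 6),
    ("c2", "submit", 7), ("c2", "review", 8), ("c2", "approve", 9),
    ("c3", "submit", 10), ("c3", "approve", 12),
]


def third_mode(events, start_ts, end_ts):
    """Keep every event inside [start_ts, end_ts]; no case integrity check."""
    return [e for e in events if start_ts <= e[2] <= end_ts]


third_case_log = third_mode(EVENTS, 5, 10)

# Illustrative extra: (events kept, events in the full case) for each case.
kept = Counter(e[0] for e in third_case_log)
total = Counter(e[0] for e in EVENTS)
coverage = {case_id: (kept[case_id], total[case_id]) for case_id in sorted(kept)}
print(coverage)
```

In this toy data only `c2` is fully covered; `c1` contributes its last step and `c3` its first step.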
And S6, outputting the effective case log data in the time range of the timestamp and the start case and the end case which are respectively cut off by the start timestamp and the end timestamp. The output of effective case log data and the output of the start case and the end case which are respectively cut off by the start timestamp and the end timestamp ensure that an analyst can obtain the log data of all complete cases within the range of the timestamp and cut off by the timestamp, and the influence on correlation analysis caused by information loss of the log data due to absolute cutting is avoided.
A system for screening cases according to time frames based on log data comprises a computer system 1, see FIG. 2, wherein the computer system 1 is used for screening business cases according to time frames, the computer system 1 comprises a data processor, a memory and a display device 21, and the display device comprises an interactive interface 201 for displaying. The data processor is provided with an analysis and extraction module 101 and a classification module 102 which are connected in sequence, the display device 21 is provided with a timestamp setting module and a filtering module, and the analysis and extraction module 101 is used for analyzing the business events and extracting log data of the business events; the classification module 102 is used for screening, order processing and classifying the log data; the timestamp setting module 103 is configured to set a timestamp interval; the filtering module 104 is configured to set or select a filtering mode, and filter the log data by using the selected filtering mode to obtain valid case log data within a timestamp interval; the storage is used for storing the log data, the intermediate data and the screening case result, in the embodiment, the storage comprises a persistent storage and a buffer, and the persistent storage is used for storing the log data; the buffer is used for caching intermediate data and effective case log data, the intermediate data comprises event classification data and an event case table, and the case screening result comprises the following steps: valid case log data and a start case and an end case which are respectively cut off by a start timestamp and an end timestamp; and the service equipment calls the data stored in the memory and outputs and displays the data through the display equipment.
FIG. 3 and FIG. 4 show log data curves of the events for a time frame, as displayed by the interactive interface 201 of the system. The horizontal axis is the time line, running from the start time to the end time of the log data, and the vertical axis is the cumulative number of events. The curves give a global view of how often events occur, and in what state, in different timestamp intervals, so that an analyst can quickly and intuitively read off the cumulative event count within a time frame interval.
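The curve described above can be sketched as follows. This is only an illustration under stated assumptions: the function names `cumulative_event_curve` and `count_in_interval` and the sample timestamps are hypothetical and do not come from the application, which does not specify how the curve is computed.

```python
from datetime import datetime


def cumulative_event_curve(timestamps):
    """Return (timestamp, cumulative_count) points ordered along the time line."""
    ordered = sorted(timestamps)
    # The i-th event in time order raises the running total to i.
    return [(ts, count) for count, ts in enumerate(ordered, start=1)]


def count_in_interval(timestamps, start, end):
    """Number of events whose timestamp lies in the closed interval [start, end]."""
    return sum(1 for ts in timestamps if start <= ts <= end)


# Hypothetical sample events (not taken from the application).
events = [
    datetime(2023, 1, 3, 9, 0),
    datetime(2023, 1, 1, 8, 0),
    datetime(2023, 1, 2, 12, 30),
]
curve = cumulative_event_curve(events)
```

Plotting the points of `curve` as a step line would give a horizontal time axis and a vertical cumulative count axis of the kind the figures describe.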
With the method and the system, when event log data is extracted by time frame interval, the integrity and flexibility of the business data are preserved and no information is lost unintentionally. The business value of the log data is thereby retained, and business process discovery and mining can be carried out conveniently and accurately.
The above is only a preferred embodiment of the present application, and the present invention is not limited to it. Other modifications and variations that those skilled in the art can directly derive or readily conceive without departing from the spirit and scope of the invention are to be considered as falling within the scope of the invention.
Claims (7)
1. A method for screening cases according to time frame based on log data, the method being implemented on a computer system and comprising:
S1, acquiring log data of a business event: extracting and storing the log data of the business event according to the time log analysis requirement;
S2, screening, sorting and classifying the log data by using a process discovery algorithm;
S3, generating an event case table from the classified log data and storing the event case table;
S4, setting timestamps and a timestamp interval according to the extraction requirement, wherein the timestamps comprise a start timestamp and an end timestamp, and the timestamp interval is the time range from the start timestamp to the end timestamp, inclusive of both; extracting the relevant temporary log data within the timestamp interval from the event case table, wherein the temporary log data is the portion of the log data that lies within the time range from the start timestamp to the end timestamp;
S5, setting a filtering mode and filtering the temporary log data with the selected filtering mode to obtain valid case log data; the filtering modes include at least three: a first filtering mode, a second filtering mode and a third filtering mode;
the first filtering mode means: acquiring, from the log data, first case log data that intersects the timestamp interval; the second filtering mode means: removing the incomplete case log data from the log data and taking the log data remaining within the timestamp interval as second case log data; the third filtering mode means: taking all temporary log data within the timestamp interval as third case log data;
the specific steps of filtering the log data with the corresponding filtering mode comprise:
S51, acquiring, in the temporary log data, the start case and the end case truncated by the start timestamp and the end timestamp respectively;
S52, selecting a filtering mode and filtering the log data with it to obtain the valid case log data, the valid case log data being one of the first case log data, the second case log data and the third case log data;
wherein filtering the log data with the corresponding filtering mode means: when the first filtering mode is selected, adding the cutting log data cut off by the timestamps to the temporary log data; or, when the second filtering mode is selected, subtracting the incomplete case log data from the temporary log data; or, when the third filtering mode is selected, taking all temporary log data within the timestamp interval as the third case log data;
and S6, outputting the valid case log data together with the start case and the end case truncated by the start timestamp and the end timestamp respectively.
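The interval extraction of step S4 can be illustrated with a minimal sketch. The claim does not fix a data layout, so the following is assumed purely for illustration: the event case table is a dictionary mapping a case identifier to a list of (activity, timestamp) pairs, and the name `extract_temporary_log` and the sample cases are hypothetical.

```python
from datetime import datetime


def extract_temporary_log(case_table, start, end):
    """Collect every event whose timestamp lies in the closed interval [start, end]."""
    if start > end:
        raise ValueError("start timestamp must not be later than end timestamp")
    temporary = []
    for case_id, events in case_table.items():
        for activity, ts in events:
            # Both boundary timestamps belong to the interval.
            if start <= ts <= end:
                temporary.append((case_id, activity, ts))
    # Return the events in time order.
    return sorted(temporary, key=lambda row: row[2])


# Hypothetical event case table (assumed layout, not from the application).
case_table = {
    "A": [("create", datetime(2023, 1, 1)), ("approve", datetime(2023, 1, 5))],
    "B": [("create", datetime(2023, 1, 4)), ("approve", datetime(2023, 1, 6))],
    "C": [("create", datetime(2023, 1, 9)), ("approve", datetime(2023, 1, 12))],
}
temporary = extract_temporary_log(
    case_table, datetime(2023, 1, 4), datetime(2023, 1, 9)
)
```

In this sample, cases A and C each contribute only one of their two events to the temporary log data, which is the situation the filtering modes of step S5 are meant to handle.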
2. The method for screening cases according to time frame based on log data of claim 1, wherein the step of extracting relevant temporary log data in step S1 comprises:
S11, defining a business activity object according to the time log analysis requirement;
S12, accessing the database of the business information system and locating the log record table;
S13, finding, in the log record table, the event names corresponding to the business activity object, according to the business activity object defined in step S11 and the fields related to event names;
S14, forming a set from the event names corresponding to the business activity object;
S15, querying and extracting, through the query interface provided by the database, all log data corresponding to the event names in the set;
S16, selecting, from the log data, the fields related to the business cases as case fields;
and S17, loading the log data into the computer memory for storage.
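Steps S14 and S15 can be sketched with Python's standard `sqlite3` module standing in for the business information system database. The table name `log_record`, its columns, the event names and the function `extract_log_data` are all assumptions made for this illustration; the claim names none of them.

```python
import sqlite3


def extract_log_data(conn, event_names):
    """Query all log rows whose event name belongs to the given set (S14, S15)."""
    names = sorted(event_names)
    if not names:
        return []
    # One "?" placeholder per event name keeps the query parameterized.
    placeholders = ",".join("?" for _ in names)
    sql = (
        "SELECT case_id, event_name, ts FROM log_record "
        f"WHERE event_name IN ({placeholders}) ORDER BY ts"
    )
    return conn.execute(sql, names).fetchall()


# Hypothetical log record table held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log_record (case_id TEXT, event_name TEXT, ts TEXT)")
conn.executemany(
    "INSERT INTO log_record VALUES (?, ?, ?)",
    [
        ("A", "create_order", "2023-01-01T08:00:00"),
        ("A", "login", "2023-01-01T08:05:00"),
        ("A", "approve_order", "2023-01-02T10:00:00"),
        ("B", "create_order", "2023-01-03T09:00:00"),
    ],
)
# S14: the set of event names tied to the business activity object.
rows = extract_log_data(conn, {"create_order", "approve_order"})
```

Here `case_id` plays the role of the case field chosen in step S16, and the returned list corresponds to the log data loaded into memory in step S17.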
3. The method for screening cases according to time frame based on log data of claim 2, wherein in step S2, the step of screening, sorting and classifying the log data by using the process discovery algorithm comprises:
S21, setting the parameters of the process discovery algorithm: a case field, an event field and a timestamp field;
S22, processing the log data with the process discovery algorithm, and screening, sorting and classifying the activity event log data according to the case field to obtain event classification data, wherein the activity event log data is the data in the log data that relates to the business activity object;
and S23, loading the event classification data into a buffer for caching.
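The effect of steps S21 and S22 on the data can be illustrated as follows. This sketch is not a process discovery algorithm; it is a plain group-and-sort stand-in that only shows how the three configured fields partition and order the records. The function name `classify_events`, the field names and the sample records are hypothetical.

```python
from collections import defaultdict


def classify_events(log_data, case_field, event_field, timestamp_field):
    """Group records by case field and sort each group by timestamp field."""
    required = (case_field, event_field, timestamp_field)
    grouped = defaultdict(list)
    for record in log_data:
        # Screening: skip records that lack any of the three configured fields.
        if all(record.get(field) is not None for field in required):
            grouped[record[case_field]].append(record)
    # Sorting: order the events of every case along the time line.
    return {
        case_id: sorted(records, key=lambda r: r[timestamp_field])
        for case_id, records in grouped.items()
    }


# Hypothetical log records; ISO 8601 strings sort correctly as text.
log = [
    {"order_id": "B", "activity": "approve", "time": "2023-01-06"},
    {"order_id": "A", "activity": "create", "time": "2023-01-01"},
    {"order_id": "B", "activity": "create", "time": "2023-01-04"},
    {"order_id": None, "activity": "heartbeat", "time": "2023-01-05"},
]
classified = classify_events(log, "order_id", "activity", "time")
```

The resulting dictionary has the shape one might expect of the event classification data that step S23 places in the buffer, although the application does not prescribe a format.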
4. The method for screening cases according to time frame based on log data of claim 3, wherein in step S51, the selecting step comprises:
S511, selecting, as a first event activity, the earliest record in the temporary log data whose timestamp is greater than or equal to the start timestamp, and selecting, as a second event activity, the latest record in the temporary log data whose timestamp is less than or equal to the end timestamp;
and S512, taking the complete case corresponding to the first event activity as the start case, taking the complete case corresponding to the second event activity as the end case, and taking the start case and the end case as the cases truncated by the timestamps.
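A minimal sketch of steps S511 and S512, assuming the event case table is a dictionary that maps a case identifier to a list of (activity, timestamp) pairs. The function name `find_boundary_cases` and the sample data are hypothetical.

```python
from datetime import datetime


def find_boundary_cases(case_table, start, end):
    """Return ((start_case_id, events), (end_case_id, events)) for [start, end]."""
    in_interval = [
        (ts, case_id)
        for case_id, events in case_table.items()
        for _, ts in events
        if start <= ts <= end
    ]
    if not in_interval:
        return None, None
    # S511: earliest and latest records inside the interval.
    _, first_case = min(in_interval)
    _, last_case = max(in_interval)
    # S512: the complete cases, including events outside the interval.
    return (
        (first_case, case_table[first_case]),
        (last_case, case_table[last_case]),
    )


# Hypothetical event case table (assumed layout, not from the application).
case_table = {
    "A": [("create", datetime(2023, 1, 1)), ("approve", datetime(2023, 1, 5))],
    "B": [("create", datetime(2023, 1, 6)), ("approve", datetime(2023, 1, 8))],
    "C": [("create", datetime(2023, 1, 9)), ("approve", datetime(2023, 1, 12))],
}
start_case, end_case = find_boundary_cases(
    case_table, datetime(2023, 1, 5), datetime(2023, 1, 9)
)
```

With this sample the start case is A, whose first event predates the interval, and the end case is C, whose last event falls after it, so both are returned whole rather than cut at the boundaries.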
5. The method for screening cases according to time frame based on log data of claim 4, wherein in step S52, filtering the log data with the corresponding filtering mode comprises:
S521, when the first filtering mode is selected, searching the case table for the cases truncated by a timestamp, and obtaining the cutting log data cut off by the timestamps according to the event activities and the integrity of the cases;
adding the cutting log data to the temporary log data to obtain the first case log data that intersects the timestamp interval;
S522, when the second filtering mode is selected, searching the case table for the incomplete case log data;
subtracting the incomplete case log data from the temporary log data to obtain the second case log data within the timestamp interval;
and S523, when the third filtering mode is selected, taking all temporary log data within the timestamp interval as the third case log data, no case integrity check being performed in this mode.
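The three filtering modes can be sketched with set operations. Several assumptions are made here and are not taken from the application: a case counts as truncated when it has events both inside and outside the timestamp interval; the cutting log data is read as the out-of-interval events of truncated cases; the incomplete case log data is read as their in-interval events. The function name `filter_cases` and the sample data are hypothetical.

```python
from datetime import datetime


def filter_cases(case_table, start, end, mode):
    """Return valid case log data for filtering mode 1, 2 or 3."""
    def inside(ts):
        return start <= ts <= end

    all_events = {(c, a, ts) for c, evs in case_table.items() for a, ts in evs}
    temporary = {e for e in all_events if inside(e[2])}
    # Assumed reading: truncated cases have events on both sides of a boundary.
    truncated = {
        c for c, evs in case_table.items()
        if any(inside(ts) for _, ts in evs) and not all(inside(ts) for _, ts in evs)
    }
    if mode == 1:
        # Temporary log data plus the parts cut off by the timestamps.
        cutting = {e for e in all_events if e[0] in truncated and not inside(e[2])}
        result = temporary | cutting
    elif mode == 2:
        # Temporary log data minus the fragments of incomplete cases.
        incomplete = {e for e in temporary if e[0] in truncated}
        result = temporary - incomplete
    elif mode == 3:
        # Everything in the interval, with no case integrity check.
        result = temporary
    else:
        raise ValueError("mode must be 1, 2 or 3")
    return sorted(result, key=lambda e: (e[2], e[0]))


# Hypothetical event case table (assumed layout, not from the application).
case_table = {
    "A": [("create", datetime(2023, 1, 1)), ("approve", datetime(2023, 1, 5))],
    "B": [("create", datetime(2023, 1, 6)), ("approve", datetime(2023, 1, 8))],
    "C": [("create", datetime(2023, 1, 9)), ("approve", datetime(2023, 1, 12))],
}
start, end = datetime(2023, 1, 5), datetime(2023, 1, 9)
mode1 = filter_cases(case_table, start, end, 1)
mode2 = filter_cases(case_table, start, end, 2)
mode3 = filter_cases(case_table, start, end, 3)
```

In this sample the first mode returns all six events because cases A and C are completed, the second mode keeps only the two events of case B, and the third mode returns the four events that lie in the interval.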
6. A system for screening cases according to time frame based on log data, for implementing the method for screening cases according to time frame based on log data as claimed in claim 1, the system comprising a computer system, characterized in that the computer system comprises a data processor, a memory and a display device; the data processor is provided with an analysis and extraction module and a classification module connected in sequence; the display device is provided with a timestamp setting module and a filtering module; the analysis and extraction module is used for analyzing the business events and extracting the log data of the business events;
the classification module is used for screening, sorting and classifying the log data;
the timestamp setting module is used for setting a timestamp interval;
the filtering module is used for setting a filtering mode, filtering the log data with the selected filtering mode and acquiring the valid case log data within the timestamp interval; the filtering modes include at least three: a first filtering mode, a second filtering mode and a third filtering mode; the first filtering mode means: acquiring, from the log data, first case log data that intersects the timestamp interval; the second filtering mode means: removing the incomplete case log data from the log data and taking the log data remaining within the timestamp interval as second case log data; the third filtering mode means: taking all temporary log data within the timestamp interval as third case log data; filtering the log data with the corresponding filtering mode means: when the first filtering mode is selected, adding the cutting log data cut off by the timestamps to the temporary log data; or, when the second filtering mode is selected, subtracting the incomplete case log data from the temporary log data; or, when the third filtering mode is selected, taking all temporary log data within the timestamp interval as the third case log data; the valid case log data is one of the first case log data, the second case log data and the third case log data;
the memory is used for storing the log data, the intermediate data and the case screening result;
and the display device retrieves the data stored in the memory and outputs and displays it through an interactive interface.
7. The system for screening cases according to time frame based on log data of claim 6, wherein the memory comprises a persistent memory and a buffer; the persistent memory is used for storing the log data; the buffer is used for caching the intermediate data and the case screening result; the intermediate data comprises the event classification data and the event case table, and the case screening result comprises at least: the valid case log data, and the start case and the end case truncated by the start timestamp and the end timestamp respectively.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310030955.0A CN115712664A (en) | 2023-01-10 | 2023-01-10 | Method and system for screening cases according to time frame based on log data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115712664A (en) | 2023-02-24 |
Family
ID=85236270
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310030955.0A Pending CN115712664A (en) | 2023-01-10 | 2023-01-10 | Method and system for screening cases according to time frame based on log data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115712664A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106095955A (en) * | 2016-06-16 | 2016-11-09 | 杭州电子科技大学 | The behavior patterns mining method matched based on traffic log and entity track |
CN111897788A (en) * | 2020-07-14 | 2020-11-06 | 中电福富信息科技有限公司 | Log retrieval analysis and visual mining method based on algorithm selection |
CN112559513A (en) * | 2019-09-10 | 2021-03-26 | 网易(杭州)网络有限公司 | Link data access method, device, storage medium, processor and electronic device |
CN112685417A (en) * | 2020-12-30 | 2021-04-20 | 京东数字科技控股股份有限公司 | Database operation method, system, device, server and storage medium |
CN112799863A (en) * | 2019-11-13 | 2021-05-14 | 北京百度网讯科技有限公司 | Method and apparatus for outputting information |
CN114141381A (en) * | 2021-12-01 | 2022-03-04 | 上海柯林布瑞信息技术有限公司 | Clinical data analysis method and device based on diagnosis and treatment events |
CN114897290A (en) * | 2022-03-22 | 2022-08-12 | 招商局国际科技有限公司 | Evolution identification method and device of business process, terminal equipment and storage medium |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
KR100841876B1 (en) | Automatic monitoring and statistical analysis of dynamic process metrics to expose meaningful changes | |
US7908239B2 (en) | System for storing event data using a sum calculator that sums the cubes and squares of events | |
CN104966172A (en) | Large data visualization analysis and processing system for enterprise operation data analysis | |
US20100122270A1 (en) | System And Method For Consolidating Events In A Real Time Monitoring System | |
US10915510B2 (en) | Method and apparatus of collecting and reporting database application incompatibilities | |
CN115269515B (en) | Processing method for searching specified target document data | |
CN110310127B (en) | Recording acquisition method, recording acquisition device, computer equipment and storage medium | |
CN114971710A (en) | Event log-based multi-dimensional process variant difference analysis method and system | |
CN103426050B (en) | System is supported in business problem analysis | |
CN114116872A (en) | Data processing method and device, electronic equipment and computer readable storage medium | |
CN114124743A (en) | Method and system for executing data application full link check rule | |
CN115712664A (en) | Method and system for screening cases according to time frame based on log data | |
CN104317820B (en) | Statistical method and device for report forms | |
CN115718658A (en) | Aging optimization method and device | |
CN115718690A (en) | Data accuracy monitoring system and method | |
CN114996104A (en) | Data processing method and device | |
CN109617734B (en) | Network operation capability analysis method and device | |
CN113742213A (en) | Method, system, and medium for data analysis | |
CN112685376A (en) | Massive log data analysis method and system | |
CN114647555B (en) | Data early warning method, device, equipment and medium based on multi-service system | |
CN109684159A (en) | Method for monitoring state, device, equipment and the storage medium of distributed information system | |
CN117539834A (en) | Data processing method, system, device and storage medium | |
CN116910114A (en) | Data monitoring method and device | |
CN115221223A (en) | CMDB-based configuration management platform implementation method | |
CN114140032A (en) | Facility running state monitoring method, device, equipment and storage medium |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20230224 |