CN113157690A - Statistics-oriented flow log data organization method - Google Patents
Statistics-oriented flow log data organization method
- Publication number
- CN113157690A (application number CN202011575571.XA)
- Authority
- CN
- China
- Prior art keywords
- statistical
- journal
- field
- log
- oriented
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/2457—Query processing with adaptation to user needs
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the technical field of data organization, and in particular to a statistics-oriented flow log data organization method. The method first expands a field in an ordinary operation flow log table into three fields: a field for the original value, a field for the current value, and the time taken for the field's value to change. A result is then obtained from the expanded flow log with a single query statement; when the log is organized in a relational database, the statistics are computed with an SQL statement. In a relatively idle service system, the service module directly generates a log table that meets the statistical requirements when it produces a flow record during operation. In a busy service system, writing the flow log should be as simple and lightweight as possible, so an ordinary log table is still generated and a conversion program then converts it into a statistical log table. By adding several fields to the flow log table, the invention allows a variety of analytical statistics to be computed quickly, efficiently, and flexibly.
Description
Technical Field
The invention relates to the technical field of data organization, and in particular to a statistics-oriented flow log data organization method.
Background
One function commonly required in systems such as task management, process handling, and work order handling is to quickly compute the processing duration of processes that meet given conditions. Taking a common work order process as an example, Table 1 lists the processing flow log records of a work order table with 5 fields. An actual system generates a huge number of such records, and a very common statistical requirement is to measure how quickly a given handler processes work orders, for example: the average processing duration of all high-priority work orders handled by handler B in the last month, where the processing duration is the time elapsed from the Open status to any other status. When the number of flow records is large, computing this result from the ordinary operation flow log is quite slow: because the calculation compares values across different rows, a program has to scan the log table to perform it.
Currently, there are three methods for storing operation flow log data:
Relational database: if the work order fields are relatively fixed, as in Table 1, the common practice is to save the work order flow log in a table of a relational database;
NoSQL database: if the work order fields are flexible, or there are no explicit fields at all, a schemaless document database such as MongoDB can store the data as JSON objects. Elasticsearch, a search engine, can be used for schemaless document search in a similar way to MongoDB;
Time-series database: for storing monitoring metric data in IoT devices and network monitoring, a time-series database is suitable; common examples are Prometheus, InfluxDB, and OpenTSDB.
Of these three data organization methods, the time-series database has some time-related statistical functions built in, but it is only suited to recording how a single metric changes over time, so it is not suitable for storing complex data such as work order processing records. Whether a relational database or a document database is used, the conventional way of organizing the operation flow log, as shown in FIG. 1, requires calculation and analysis across rows and cannot produce a statistical result with a single query statement, so it is very inefficient. In the foregoing example, to compute the processing duration of all high-priority work orders handled by handler B in the previous month, the processing records must first be filtered by time and by handler B; for each work order, the time at which it entered the Open status and the time at which it switched to a non-Open status are then recorded, and the difference between the two gives the processing duration of that work order. This approach is very inefficient when a large number of records must be aggregated.
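The cross-row calculation described above can be sketched in Python as follows. This is only an illustration of the conventional approach, not code from the patent: the tuple layout, the function name `open_durations`, and the English status and priority values (`New`, `Open`, `Normal`, `High`, and so on) are assumptions chosen to mirror Table 1.

```python
from datetime import datetime

# Ordinary flow log rows mirroring Table 1 (hypothetical in-memory layout):
# (work order id, time, status, priority, handler)
LOG = [
    (1, "2020-09-17 09:00:00", "New", "Normal", "A"),
    (1, "2020-09-17 10:00:00", "Open", "High", "B"),
    (2, "2020-09-17 11:00:00", "New", "Normal", "B"),
    (2, "2020-09-17 12:00:00", "Open", "Normal", "C"),
    (1, "2020-09-17 13:00:00", "Pending confirmation", "High", "A"),
    (2, "2020-09-17 14:00:00", "Rejected", "Normal", "C"),
    (1, "2020-09-17 15:00:00", "Closed", "High", "A"),
    (2, "2020-09-17 16:00:00", "Closed", "Normal", "C"),
]


def open_durations(log, handler, priority):
    """Scan the whole log and pair each Open row with the next non-Open row."""
    opened = {}      # work order id -> time it entered Open for this handler/priority
    durations = []   # seconds spent in Open, one entry per matching work order
    # ISO-style timestamps sort correctly as plain strings.
    for order_id, ts, status, prio, who in sorted(log, key=lambda r: r[1]):
        t = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        if status == "Open":
            if who == handler and prio == priority:
                opened.setdefault(order_id, t)
        elif order_id in opened:
            durations.append((t - opened.pop(order_id)).total_seconds())
    return durations


durations = open_durations(LOG, "B", "High")
print(durations)  # [10800.0]: work order 1 stayed Open from 10:00 to 13:00
```

Every statistic of this kind needs its own scan and its own per-work-order state, which is the cost the method below tries to avoid.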
A fast way to compute statistics over such work order flow records is therefore needed. This patent provides a method for organizing the flow log table that, by adding several fields to the table, allows a variety of analytical statistics to be computed quickly, efficiently, and flexibly.
Disclosure of Invention
The invention provides a statistics-oriented flow log data organization method, described as follows.
Table 1 Processing flow log records of a work order table with 5 fields
ID | Time | Work order title | Status | Priority | Handler |
---|---|---|---|---|---|
1 | 2020-09-17 09:00:00 | Reimbursement system anomalies | New | Normal | A |
1 | 2020-09-17 10:00:00 | Reimbursement system anomalies | Open | High | B |
2 | 2020-09-17 11:00:00 | The printer cannot be started | New | Normal | B |
2 | 2020-09-17 12:00:00 | The printer cannot be started | Open | Normal | C |
1 | 2020-09-17 13:00:00 | Reimbursement system anomalies | Pending confirmation | High | A |
2 | 2020-09-17 14:00:00 | The printer cannot be started | Rejected | Normal | C |
1 | 2020-09-17 15:00:00 | Reimbursement system anomalies | Closed | High | A |
2 | 2020-09-17 16:00:00 | The printer cannot be started | Closed | Normal | C |
The method uses data pre-computation and data redundancy to speed up batch statistical queries. As shown in FIG. 2, the basic idea is to expand a field in the ordinary operation flow log table into three fields: one storing the original value, one storing the current value, and one storing the time taken for the field's value to change.
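As a minimal sketch of this expansion, the helper below maps each ordinary log field to its three statistical columns. The function name `expand_fields` and the suffixes `_original`, `_current`, and `_change_seconds` are hypothetical naming choices; the patent does not prescribe column names.

```python
def expand_fields(fields):
    """Return the statistical-table column names for the given ordinary log fields.

    Each field becomes three columns: the value before the change, the value
    after the change, and the time taken for the change (here in seconds).
    """
    columns = []
    for name in fields:
        columns += [f"{name}_original", f"{name}_current", f"{name}_change_seconds"]
    return columns


columns = expand_fields(["status", "priority"])
print(columns)
```

For the two fields `status` and `priority` this yields six columns, matching the layout of a table with one original/current/time triple per field.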
Further, Table 2 shows how the log table is organized for a work order table that has only two fields, status and priority:
Table 2 Example organization of a statistics-oriented work order operation log table
Time | Status_original value | Status_current value | Time taken for status change | Priority_original value | Priority_current value | Time taken for priority change |
---|---|---|---|---|---|---|
2020-09-17 09:00:00 | New | Open | 200 seconds | Normal | Normal | 0 seconds |
2020-09-17 09:00:00 | Open | Open | 0 seconds | Normal | High | 120 seconds |
Once the log table is organized in this way, the statistical requirement for work order processing duration can be met with a single query statement. When the log is organized in a relational database, a client can compute the statistics directly with one SQL statement.
For example, using Table 2, to compute the total time that normal-priority work orders took to go from New to Open on September 17, 2020, directly execute:
select sum(status_change_time) from statistics_table where date = '2020-09-17' and priority_original_value = 'Normal' and status_original_value = 'New' and status_current_value = 'Open';
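The query can be tried end to end with Python's built-in `sqlite3` module. This is a sketch under assumptions: the table name `stat_log` and the column names are hypothetical, the two rows are those of Table 2 with values rendered in English, and the date condition is expressed with SQLite's `date()` function applied to the stored timestamp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Statistical log table: one original/current/time triple per field (names assumed).
conn.execute(
    """CREATE TABLE stat_log (
           time TEXT,
           status_original TEXT, status_current TEXT, status_change_seconds INTEGER,
           priority_original TEXT, priority_current TEXT, priority_change_seconds INTEGER
       )"""
)
conn.executemany(
    "INSERT INTO stat_log VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("2020-09-17 09:00:00", "New", "Open", 200, "Normal", "Normal", 0),
        ("2020-09-17 09:00:00", "Open", "Open", 0, "Normal", "High", 120),
    ],
)

# One statement, no cross-row comparison: total New -> Open time for
# work orders whose priority was Normal before the change.
total = conn.execute(
    """SELECT SUM(status_change_seconds)
       FROM stat_log
       WHERE date(time) = '2020-09-17'
         AND priority_original = 'Normal'
         AND status_original = 'New'
         AND status_current = 'Open'"""
).fetchone()[0]
print(total)  # 200
```

Only the first row satisfies all three conditions, so the sum is the 200 seconds recorded for its status change.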
Further, as shown in FIG. 3, when the service system is relatively idle, the service module directly generates a log table that meets the statistical requirements whenever it produces a flow record during operation. This approach uses the statistics-oriented log table as the ordinary log table, so no additional table is needed. However, the service module must perform the data conversion in real time, which consumes additional computing resources.
Offline conversion from an ordinary log table: as shown in FIG. 4, in a busy service system the operation of writing the flow log should be as simple and lightweight as possible, so the system still generates an ordinary log table, and a conversion program then converts it into a statistical log table.
The advantage and positive effect of the invention is that, by adding several fields to the flow log table, a variety of analytical statistics can be computed quickly, efficiently, and flexibly.
Drawings
FIG. 1 is a diagram of the cross-row calculation required for statistics based on an ordinary log table;
FIG. 2 is a diagram of the data organization of the present invention;
FIG. 3 is a diagram of a service module directly generating the statistical table according to the present invention;
FIG. 4 is a diagram of generating the statistical table by converting an ordinary log table according to the present invention;
Detailed Description
The invention is described in detail below with reference to the figures and a specific embodiment. In this embodiment, the statistics-oriented flow log data organization method is as follows:
The method uses data pre-computation and data redundancy to speed up batch statistical queries. As shown in FIG. 2, the basic idea is to expand a field in the ordinary operation flow log table into three fields: one storing the original value, one storing the current value, and one storing the time taken for the field's value to change.
In this embodiment, Table 2 shows how the log table is organized for a work order table that has only two fields, status and priority:
Table 2 Example organization of a statistics-oriented work order operation log table
Time | Status_original value | Status_current value | Time taken for status change | Priority_original value | Priority_current value | Time taken for priority change |
---|---|---|---|---|---|---|
2020-09-17 09:00:00 | New | Open | 200 seconds | Normal | Normal | 0 seconds |
2020-09-17 09:00:00 | Open | Open | 0 seconds | Normal | High | 120 seconds |
Once the log table is organized in this way, the statistical requirement for work order processing duration can be met with a single query statement. When the log is organized in a relational database, a client can compute the statistics directly with one SQL statement.
For example, using Table 2, to compute the total time that normal-priority work orders took to go from New to Open on September 17, 2020, directly execute:
select sum(status_change_time) from statistics_table where date = '2020-09-17' and priority_original_value = 'Normal' and status_original_value = 'New' and status_current_value = 'Open';
In this embodiment, as shown in FIG. 3, when the service system is relatively idle, the service module directly generates a log table that meets the statistical requirements whenever it produces a flow record during operation. This approach uses the statistics-oriented log table as the ordinary log table, so no additional table is needed. However, the service module must perform the data conversion in real time, which consumes additional computing resources.
Offline conversion from an ordinary log table: as shown in FIG. 4, in a busy service system the operation of writing the flow log should be as simple and lightweight as possible, so the system still generates an ordinary log table, and a conversion program then converts it into a statistical log table.
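A conversion program of this kind might look like the sketch below. The patent does not specify the conversion rules, so the following are assumptions made for illustration: rows arrive sorted by time, the "time taken for a change" is the number of seconds the field held its previous value, an unchanged field gets 0, and the first record of a work order has no previous value and is therefore treated as unchanged. The names `FIELDS` and `to_statistical_rows` are hypothetical.

```python
from datetime import datetime

FIELDS = ("status", "priority")  # ordinary log fields to expand (assumed)


def to_statistical_rows(log_rows):
    """Convert ordinary flow log rows (dicts sorted by time) into statistical rows."""
    last = {}  # (work order id, field) -> (value, time at which that value was set)
    out = []
    for row in log_rows:
        stat = {"id": row["id"], "time": row["time"]}
        for field in FIELDS:
            key = (row["id"], field)
            first_seen = key not in last
            prev_value, since = last.get(key, (row[field], row["time"]))
            changed = prev_value != row[field]
            stat[f"{field}_original"] = prev_value
            stat[f"{field}_current"] = row[field]
            # Seconds the field held its previous value; 0 when nothing changed.
            stat[f"{field}_change_seconds"] = (
                int((row["time"] - since).total_seconds()) if changed else 0
            )
            if changed or first_seen:
                last[key] = (row[field], row["time"])
        out.append(stat)
    return out


def at(hour):
    return datetime(2020, 9, 17, hour)


# Work order 1 from Table 1, with values rendered in English.
ordinary = [
    {"id": 1, "time": at(9), "status": "New", "priority": "Normal"},
    {"id": 1, "time": at(10), "status": "Open", "priority": "High"},
    {"id": 1, "time": at(13), "status": "Pending confirmation", "priority": "High"},
    {"id": 1, "time": at(15), "status": "Closed", "priority": "High"},
]
stat_rows = to_statistical_rows(ordinary)
for r in stat_rows:
    print(r["time"], r["status_original"], "->", r["status_current"],
          r["status_change_seconds"], "s")
```

Under these assumptions the Open to Pending confirmation row carries 10800 seconds, so the processing duration that previously required a cross-row scan becomes a value stored on a single row. The same per-field bookkeeping could run inside the service module for the real-time variant.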
Although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the spirit and scope of the invention as defined in the appended claims. The techniques, shapes, and configurations not described in detail in the present invention are all known techniques.
Claims (3)
1. A statistics-oriented flow log data organization method, characterized by comprising the following steps:
S1: expanding a field in an ordinary operation flow log table into three fields, namely a field for the original value, a field for the current value, and the time taken for the field's value to change;
S2: obtaining a result from the expanded flow log with a query statement, and, for a relational database organization, performing the statistics with an SQL statement.
2. The statistics-oriented flow log data organization method according to claim 1, wherein, for a relatively idle service system, a log table meeting the statistical requirements is directly generated when a service module produces a flow record during operation.
3. The statistics-oriented flow log data organization method according to claim 1, wherein, for a busy service system, the operation of writing the flow log is kept as simple and lightweight as possible, so that an ordinary log table is still generated, and the flow log is then converted into a statistical log table by a conversion program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011575571.XA CN113157690A (en) | 2020-12-28 | 2020-12-28 | Statistics-oriented flow log data organization method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011575571.XA CN113157690A (en) | 2020-12-28 | 2020-12-28 | Statistics-oriented flow log data organization method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113157690A true CN113157690A (en) | 2021-07-23 |
Family
ID=76878086
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011575571.XA Pending CN113157690A (en) | 2020-12-28 | 2020-12-28 | Statistics-oriented flow log data organization method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113157690A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103176888A (en) * | 2011-12-22 | 2013-06-26 | 阿里巴巴集团控股有限公司 | Log recording method and log recording system |
US20180253434A1 (en) * | 2017-03-02 | 2018-09-06 | Discovered Intelligence Inc. | System for Aggregation and Prioritization of IT Asset Field Values from Real-Time Event Logs and Method thereof |
CN110688596A (en) * | 2019-09-09 | 2020-01-14 | 平安普惠企业管理有限公司 | Static webpage updating method and device, computer equipment and storage medium |
CN111159129A (en) * | 2019-12-31 | 2020-05-15 | 北京神州绿盟信息安全科技股份有限公司 | Statistical method and device for log report |
CN111324604A (en) * | 2020-01-19 | 2020-06-23 | 拉扎斯网络科技(上海)有限公司 | Database table processing method and device, electronic equipment and storage medium |
CN111796997A (en) * | 2020-07-02 | 2020-10-20 | 北京字节跳动网络技术有限公司 | Log information processing method and device and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||