CN113867844A - Offline data storage and calculation method based on time slice intelligent inspection control - Google Patents
- Publication number
- CN113867844A (application number CN202111175485.4A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/445—Program loading or initiating
- G06F9/44505—Configuring for program initiating, e.g. using registry, configuration files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/546—Message passing systems or structures, e.g. queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/54—Indexing scheme relating to G06F9/54
- G06F2209/548—Queue
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention relates to an offline data storage and calculation method and system based on time slice intelligent inspection control. The system comprises: a configuration center, which manages the offline activity abstract model and stores the offline activity parameters; and a process control center, which provides intelligent scheduling of activity instances of offline data activities, including initialization, starting, operation monitoring and release. The process control center also comprises one or more operation nodes; all offline data activities instantiated by the process control center run on the operation nodes, which serve as the operating environment for activity instances. In offline data processing scenarios, the invention takes data time as a precondition from the start of the receiving activity, timestamps the data according to a set time rule, and combines this with inspection control to ensure integrity, timeliness and accuracy throughout the data processing process.
Description
Technical Field
The invention relates to the field of information technology, and in particular to an offline data storage and calculation method based on time slice intelligent inspection control.
Background
Offline and real-time data processing are both important parts of data storage and calculation in information technology. Compared with real-time processing, offline data processing suits after-the-fact analysis and scenarios with low timeliness requirements. The industry has a complete set of standardized processes covering data receiving, processing, storage and analysis/calculation, but the following problems still have to be solved in practical applications:
How can the integrity of the offline data processing process be guaranteed? From receiving to final storage (analysis and calculation), data passes through a series of processing steps, all of which must be executed completely and correctly.
How can the timeliness of offline data calculation results be improved? Although offline data tolerates more delay than real-time calculation, the "relative" timeliness of the data results must still be ensured.
How can the accuracy of offline data calculation results be ensured? Data items are correlated and depend on one another; for example, when compiling provincial data, the result is valid only after the prefecture-level data has been loaded.
Disclosure of Invention
In view of the above, the present invention provides an offline data storage and calculation method based on time slice intelligent inspection control, which, in an offline data processing scenario, marks a timestamp on data according to a certain time rule by taking data time as a precondition from the beginning of a receiving activity, and ensures integrity, timeliness and accuracy in a data processing process by combining with inspection control.
In order to achieve the purpose, the invention adopts the following technical scheme:
An offline data storage and calculation method based on time slice intelligent inspection control comprises the following steps:
step S1: presetting and encapsulating generic offline data processing activities, and establishing standardized data processing activities;
step S2: creating each data processing activity instance and configuring its starting and running parameter attributes, including the time slice period rule;
step S3: reading the activity check configuration at initialization or when a starting parameter is updated, adding it to the control check table of the process control center, and having each activity monitor changes in its check parameters (in particular the time slice related parameters);
step S4: when a subscribed parameter in the check table is found to have changed, scanning whether the conditions of each activity are met and, if so, starting the related activity instance to execute offline data processing;
step S5: scheduling so that only one instance of an activity with the same starting parameters runs at any one time;
step S6: for a long-term activity instance, sending heartbeats to the process control center at regular intervals, and restarting the activity instance if the heartbeat is interrupted for longer than a set time;
step S7: after an activity instance executes successfully, updating the starting parameter values according to the execution result;
step S8: after an activity instance fails, placing the task in a failure queue; an activity in the failure queue is not instantiated even if its conditions are met;
step S9: according to the configured failure waiting time, retrieving activities from the failure queue and returning them to the starting parameter management scope of the process control center.
Further, the standardized data processing activities include data access activities, data storage activities, data deletion activities, data download activities, and the like, wherein the activities are classified into long-term activities and temporary activities.
Furthermore, in the standardized data processing activities, the offline data processing process is labeled and distinguished according to a set time slice period rule, so that the data can be identified by time slice.
Further, the parameter types are divided into starting parameters, operation parameters and activity decision parameters, and all activity parameters carry time slice attributes.
An offline data storage and computing system based on time slice intelligent inspection control, comprising:
a configuration center, used for managing the offline activity abstract model and storing the offline activity parameters;
a process control center, used for the intelligent scheduling of activity instances of offline data activities, including initialization, starting, operation monitoring and release;
wherein the process control center also comprises one or more operation nodes, and all offline data activities instantiated by the process control center run on the operation nodes, which serve as the operating environment for activity instances.
Compared with the prior art, the invention has the following beneficial effects:
1. the invention standardizes offline activities and the definition of their parameters, breaks offline data processing down into a number of independent activities, and allows the activities required by a given definition to be selected from an abstract activity model to meet offline data processing requirements;
2. the invention ensures that offline activities can be instantiated and executed by predefining different types of activity parameters: operation parameters hold the parameter attribute information used while an activity runs; starting parameters are used for condition judgment when an activity starts; and inspection parameters are used for periodic inspection of the attribute information of an activity's starting parameters;
3. the invention provides an intelligent process control method for offline data processing activities that takes the time slice as its core precondition: through condition confirmation of the starting parameters, an intelligent process control center manages the instantiation and operation of offline data processing activities and guards their running instances, thereby realizing intelligent process control of offline data processing activities.
Drawings
FIG. 1 is a system framework diagram of the present invention;
FIG. 2 is a flow diagram of offline data processing activities in an embodiment of the invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
Referring to fig. 1, the present invention provides an offline data storage and calculation system based on time slice intelligent inspection control, which comprises three components:
Configuration center: responsible for managing the offline activity abstract model and storing the offline activity parameters.
Process control center: responsible for the intelligent scheduling of activity instances of offline data activities, including initialization, starting, operation monitoring and release.
Operation nodes: the system may contain one or more operation nodes; all offline data activities instantiated by the process control center run on the operation nodes, which serve as the operating environment for activity instances.
Referring to fig. 2, the present invention further provides an offline data storage and calculation method based on time slice intelligent inspection control, which in this embodiment specifically includes the following steps:
1. Defining and encapsulating generic offline data processing activities to establish standardized data processing activities;
a) Configuration center: activity abstraction management (abstracted offline data processing activities)
Preferably, the data activities can be extended with further abstractions according to actual requirements, for example data compression/decompression activities and data file processing and formatting activities, which are not listed exhaustively here.
b) Activity classification (two types)
The activity abstract model and activity instances are in a one-to-many relationship: the abstract model of a single activity can yield different processing activities by defining different parameters.
2. The configuration center extracts activities according to real-time service requirements and defines the parameters of each data activity, where the parameter types are decision parameters, starting parameters and operation parameters;
Decision parameters: defined by the activity abstract model and used to determine whether an activity instance satisfies its running conditions.
Starting parameters: defined when the activity is defined and used for the environment conditions at instance start-up, such as the activity type and the number of activity instances.
Operation parameters: defined when the activity is defined; the parameter information an activity instance needs while running.
The activity parameters must include the attributes associated with time slices, namely the latest time slice partition, the time slice period and the time slice period unit, which establish the time slice rule and give the process control center a basis for its check decisions.
a) Configuration center: activity parameter management examples
i. Activity one
Activity one uses the time slicing parameters to specify that received data is output as a file in 5-minute time slices.
Here {time slicing} defaults to the unified format of a 14-digit time followed by the time slice period and the time slice period unit, as follows:
yyyyMMddHHmmss_5_MINUTES
That is, the output file is PeopleView_20200720180000_5_MINUTES
ii. Activity two
Activity two determines the list of files to download from a local path, a time slice period and a time slice unit.
iii. Activity three
Activity three determines the list of files to upload from a local path, a time slice period and a time slice unit.
iv. Activity four
Data table partitions are created uniformly by time slice; activity four uses the time slice period to calculate the table partition name and loads the corresponding files into the table partition for each time slice.
v. Activity five
Activity five uses the time slice period to identify the range of data periods that need to be calculated.
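The time slice naming rule used by the activities above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names, the set of supported units (only MINUTES and DAYS appear in the text) and the choice to align slices to 1970-01-01 00:00:00 are hypothetical and are not taken from the patent.

```python
from datetime import datetime, timedelta

# Supported time slice period units. Only MINUTES and DAYS appear in the
# text; SECONDS and HOURS are assumptions added for completeness.
UNIT_SECONDS = {"SECONDS": 1, "MINUTES": 60, "HOURS": 3600, "DAYS": 86400}

# Assumed alignment origin for slice boundaries (naive local time).
EPOCH = datetime(1970, 1, 1)


def slice_start(ts: datetime, period: int, unit: str) -> datetime:
    """Round a naive timestamp down to the start of its time slice."""
    width = timedelta(seconds=period * UNIT_SECONDS[unit])
    whole_slices = (ts - EPOCH) // width  # timedelta // timedelta gives an int
    return EPOCH + whole_slices * width


def slice_label(ts: datetime, period: int, unit: str) -> str:
    """Build the label '<14-digit time>_<period>_<unit>' for a timestamp."""
    return f"{slice_start(ts, period, unit):%Y%m%d%H%M%S}_{period}_{unit}"


def slice_file_name(prefix: str, ts: datetime, period: int, unit: str) -> str:
    """Prefix the slice label with a data set name, as in the PeopleView example."""
    return f"{prefix}_{slice_label(ts, period, unit)}"


received_at = datetime(2020, 7, 20, 18, 3, 27)
file_name = slice_file_name("PeopleView", received_at, 5, "MINUTES")
partition = slice_file_name("PeopleView", received_at, 1, "DAYS")
```

With these assumptions, a record received at 18:03:27 falls into the 5-minute slice that starts at 18:00:00, which matches the file name given for activity one, and the same helper yields the daily partition label used later for the HIVE table.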
3. At initialization (or when a starting parameter is updated), the process control center reads the activity check configuration and adds it to its control check table, and each activity monitors changes in its check parameters;
Control check management records the state of the activity dependency information held by the process control center. When it starts, the process control center first registers the activity running parameters that need monitoring and subscribes to their state, and it confirms state changes in the check table by calculating the time slice parameters.
a) Registration and subscription of activity parameters, taking the activity of step S3 as an example
Note:
1. Changes to check table parameters and values are detected by registering the corresponding file directory listening or table partition listening functions.
2. The tables used by activity five (the data calculation and analysis activity) are derived by parsing its SQL.
3. A change to a parameter that an activity subscribes to in the check table triggers execution of that activity.
b) Check information type descriptions
4. When the process control center finds that a subscribed parameter in the check table has changed, it scans whether the conditions of each activity are met and, if so, starts the related activity instance to execute offline data processing;
Control center activity decision management is a service module maintained by the process control center that provides the activity operation decision service. Every activity defined in the configuration center can register its activity operation decision information with this module, so that when subscribed information in the control check table changes, the activity operation decision is triggered and, where appropriate, an activity instance is generated.
For example:
Activity start/stop decision management: when the "database table information type - HIVE table time slice table partition" entry of the control check table changes from "PeopleView_20200720000000_1_DAYS" to "PeopleView_20200721000000_1_DAYS", the process control center receives an event subscription message and injects the changed condition into the "activity operation decision" parameter of activity five. If the decision returns that the activity should run, activity five is instantiated; otherwise, the process control center continues to subscribe to and monitor changes in the check table.
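The subscribe-and-decide behaviour of step 4 can be sketched in Python as a small in-memory check table. This is a minimal sketch, not the patented implementation: the class `CheckTable`, the key string and the decision rule `newer_slice` are hypothetical, and the rule that a first observation does not trigger an activity is an assumption made for the example.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A decision callback receives the old and new value of a check table entry
# and returns True when the subscribing activity should be instantiated.
Decision = Callable[[Optional[str], str], bool]


class CheckTable:
    """Hypothetical in-memory control check table with subscriptions."""

    def __init__(self) -> None:
        self._values: Dict[str, str] = {}
        self._subscribers: Dict[str, List[Tuple[str, Decision]]] = {}

    def subscribe(self, key: str, activity: str, decide: Decision) -> None:
        """Register an activity and its run decision for one check table key."""
        self._subscribers.setdefault(key, []).append((activity, decide))

    def update(self, key: str, new_value: str) -> List[str]:
        """Record a value and return the activities whose decision says 'run'."""
        old_value = self._values.get(key)
        if old_value == new_value:
            return []  # nothing changed, so no decision is triggered
        self._values[key] = new_value
        return [
            activity
            for activity, decide in self._subscribers.get(key, [])
            if decide(old_value, new_value)
        ]


def newer_slice(old: Optional[str], new: str) -> bool:
    """Assumed rule: run only when an existing slice label moves forward."""
    # Labels share one prefix and a fixed 14-digit time, so string order
    # matches time order.
    return old is not None and new > old


table = CheckTable()
key = "hive.PeopleView.partition"  # hypothetical check table key
table.subscribe(key, "activity_five", newer_slice)
first = table.update(key, "PeopleView_20200720000000_1_DAYS")
second = table.update(key, "PeopleView_20200721000000_1_DAYS")
```

In this sketch `update` plays the role of the directory or partition listener reporting a change; a real system would receive that event from the storage layer rather than from a direct call.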
5. The process control center ensures that only one instance of the activity with the same defined parameters runs at the same time;
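One possible way to enforce the single-instance rule of step 5 is to key running instances by activity name and parameter values. The Python sketch below assumes a single scheduler process, hashable parameter values and string parameter names; `SingleInstanceRegistry` and its methods are hypothetical names, and a distributed deployment would need a shared lock service instead.

```python
import threading
from typing import Dict, Hashable, Set, Tuple


class SingleInstanceRegistry:
    """Hypothetical guard: one running instance per (activity, parameters)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._running: Set[Tuple[str, Tuple]] = set()

    @staticmethod
    def _key(activity: str, params: Dict[str, Hashable]) -> Tuple[str, Tuple]:
        # Sort the items so that parameter order does not affect the key.
        return activity, tuple(sorted(params.items()))

    def try_start(self, activity: str, params: Dict[str, Hashable]) -> bool:
        """Return True and mark the instance as running if none exists yet."""
        key = self._key(activity, params)
        with self._lock:
            if key in self._running:
                return False
            self._running.add(key)
            return True

    def finish(self, activity: str, params: Dict[str, Hashable]) -> None:
        """Release the slot when the instance ends, successfully or not."""
        with self._lock:
            self._running.discard(self._key(activity, params))


registry = SingleInstanceRegistry()
params = {"period": 1, "unit": "DAYS"}
started = registry.try_start("activity_five", params)
duplicate = registry.try_start("activity_five", {"unit": "DAYS", "period": 1})
registry.finish("activity_five", params)
restarted = registry.try_start("activity_five", params)
```

The second call is rejected even though the parameters are supplied in a different order, because the key is built from the sorted items.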
6. Long-term activity instances send heartbeats to the process control center at regular intervals, and the process control center restarts an activity instance if its heartbeat is interrupted for longer than a set time;
Activity instance running state management records the most recent active state of each activity instance on the operation nodes.
The process control center monitors activity instances by maintaining an activity instance state table. For long-term activities, it keeps the configured number of instances running stably, shuts down and removes abnormal or unresponsive instances, and starts new instances in their place; for short-term activity instances, it ensures single-instance operation and closes the task instance when it times out.
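The heartbeat supervision of step 6 can be sketched as a state table that stores the last heartbeat time per instance. The sketch below is illustrative only: `InstanceStateTable`, the 30-second timeout and the `restart` callback are assumptions, and the current time is passed in explicitly so that the behaviour is easy to follow and test.

```python
from typing import Callable, Dict, List


class InstanceStateTable:
    """Hypothetical activity instance state table keyed by instance id."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        self._last_beat: Dict[str, float] = {}

    def heartbeat(self, instance_id: str, now: float) -> None:
        """Record that a long-term instance reported in at time `now`."""
        self._last_beat[instance_id] = now

    def expired(self, now: float) -> List[str]:
        """List instances whose last heartbeat is older than the timeout."""
        return sorted(
            instance_id
            for instance_id, beat in self._last_beat.items()
            if now - beat > self.timeout_seconds
        )

    def restart_expired(self, now: float, restart: Callable[[str], None]) -> List[str]:
        """Restart every expired instance and reset its heartbeat clock."""
        restarted = self.expired(now)
        for instance_id in restarted:
            restart(instance_id)
            self._last_beat[instance_id] = now
        return restarted


state = InstanceStateTable(timeout_seconds=30)
state.heartbeat("receive-1", now=0)
state.heartbeat("receive-2", now=20)
restart_log: List[str] = []
restarted = state.restart_expired(now=40, restart=restart_log.append)
```

At time 40 only `receive-1` has been silent for more than 30 seconds, so it alone is restarted; in a real deployment `restart` would close the unresponsive instance on its operation node and launch a replacement.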
7. After an activity instance executes successfully, the process control center updates the operation parameter values according to the execution result;
Taking activity five as an example, the variable values in the operation parameters are updated after execution is completed:
8. After an activity instance fails, the process control center places the task in a failure queue; an activity in the failure queue is not instantiated even if its conditions are met;
The process activity failure table is maintained by the process control center and temporarily stores failed activities:
9. According to the configured failure waiting time, the process control center takes activities out of the failure queue and returns them to the parameter management scope of its process control check table;
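The failure handling of steps 8 and 9 amounts to a quarantine with a waiting period. The following Python sketch shows one way to model it; `FailureQueue`, its method names and the 600-second wait are hypothetical, and the time is again passed in explicitly rather than read from the system clock.

```python
from typing import Dict, List


class FailureQueue:
    """Hypothetical failure table: failed activities wait before rescheduling."""

    def __init__(self, wait_seconds: float) -> None:
        self.wait_seconds = wait_seconds
        self._failed_at: Dict[str, float] = {}

    def add(self, activity: str, now: float) -> None:
        """Park an activity after one of its instances failed at time `now`."""
        self._failed_at[activity] = now

    def blocks(self, activity: str) -> bool:
        """A parked activity is not instantiated even if its conditions hold."""
        return activity in self._failed_at

    def release_due(self, now: float) -> List[str]:
        """Remove and return the activities whose waiting time has elapsed."""
        due = sorted(
            activity
            for activity, failed in self._failed_at.items()
            if now - failed >= self.wait_seconds
        )
        for activity in due:
            del self._failed_at[activity]
        return due


queue = FailureQueue(wait_seconds=600)
queue.add("activity_four", now=0)
queue.add("activity_five", now=300)
blocked = queue.blocks("activity_five")
released = queue.release_due(now=600)
```

Activities returned by `release_due` would be handed back to the check table so that the next parameter change can trigger them again.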
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing describes preferred embodiments of the present invention; other and further embodiments may be devised without departing from its basic scope, which is determined by the claims that follow. Any simple modification, equivalent change or variation of the above embodiments made according to the technical essence of the present invention falls within the protection scope of the technical solution of the present invention.
Claims (5)
1. An offline data storage and calculation method based on time slice intelligent inspection control, characterized by comprising the following steps:
step S1: presetting and encapsulating generic offline data processing activities, and establishing standardized data processing activities;
step S2: creating each data processing activity instance and configuring its starting and running parameter attributes;
step S3: reading the activity check configuration at initialization or when a starting parameter is updated, adding it to the control check table of the process control center, and having each activity monitor changes in its check parameters;
step S4: when a subscribed parameter in the check table is found to have changed, scanning whether the conditions of each activity are met and, if so, starting the related activity instance to execute offline data processing;
step S5: scheduling so that only one instance of an activity with the same starting parameters runs at any one time;
step S6: for a long-term activity instance, sending heartbeats to the process control center at regular intervals, and restarting the activity instance if the heartbeat is interrupted for longer than a set time;
step S7: after an activity instance executes successfully, updating the starting parameter values according to the execution result;
step S8: after an activity instance fails, placing the task in a failure queue; an activity in the failure queue is not instantiated even if its conditions are met;
step S9: according to the configured failure waiting time, retrieving activities from the failure queue and returning them to the starting parameter management scope of the process control center.
2. The offline data storage and calculation method based on time-slice intelligent inspection control as claimed in claim 1, wherein said standardized data processing activities include data access activities, data storage activities, data deletion activities, data download activities, etc., wherein the activities are divided into long-term activities and temporary activities.
3. The offline data storage and computation method based on time-slice intelligent inspection control as recited in claim 1, wherein said parameter types are start-up parameters, operational parameters and activity decision parameters, all activity parameters being time-slice attributes.
4. The off-line data storage and calculation method based on time-slicing intelligent inspection control as claimed in claim 1, wherein the off-line data processing procedures in the standardized data processing activities are labeled and distinguished according to preset time-slicing period rules.
5. An offline data storage and calculation system based on time slice intelligent inspection control, characterized by comprising:
a configuration center, used for managing the offline activity abstract model and storing the offline activity parameters;
a process control center, used for the intelligent scheduling of activity instances of offline data activities, including initialization, starting, operation monitoring and release;
wherein the process control center also comprises one or more operation nodes, and all offline data activities instantiated by the process control center run on the operation nodes, which serve as the operating environment for activity instances.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111175485.4A CN113867844B (en) | 2021-10-09 | 2021-10-09 | Offline data storage and calculation method based on time slice intelligent inspection control |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111175485.4A CN113867844B (en) | 2021-10-09 | 2021-10-09 | Offline data storage and calculation method based on time slice intelligent inspection control |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113867844A (en) | 2021-12-31 |
CN113867844B (en) | 2024-10-18 |
Family
ID=79002105
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202111175485.4A | Active | CN113867844B (en) | 2021-10-09 | 2021-10-09 | Offline data storage and calculation method based on time slice intelligent inspection control |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113867844B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106502772A (en) * | 2016-10-09 | 2017-03-15 | 国网浙江省电力公司信息通信分公司 | Electric quantity data batch high speed processing method and system based on distributed off-line technology |
CN106570081A (en) * | 2016-10-18 | 2017-04-19 | 同济大学 | Semantic net based large scale offline data analysis framework |
CN106992872A (en) * | 2016-01-21 | 2017-07-28 | 中国移动通信集团公司 | A kind of method and system of information processing |
CN112445600A (en) * | 2020-12-15 | 2021-03-05 | 北京首汽智行科技有限公司 | Method and system for issuing offline data processing task |
CN112507003A (en) * | 2021-02-03 | 2021-03-16 | 江苏海平面数据科技有限公司 | Internet of vehicles data analysis platform based on big data architecture |
US11113244B1 (en) * | 2017-01-30 | 2021-09-07 | A9.Com, Inc. | Integrated data pipeline |
Non-Patent Citations (2)
Title |
---|
MARCO BIAGI: "A Continuous-Time Model-Based Approach for Activity Recognition in Pervasive Environments", 《IEEE》, 13 April 2019 (2019-04-13), pages 293 * |
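兰慧峰;左旭涛;王美霞;岳阳;周凡;: "基于大数据的城市轨道交通数据处理流程研究" (Research on a big-data-based data processing flow for urban rail transit), 中国新技术新产品 (China New Technologies and New Products), no. 10, 25 May 2020 (2020-05-25) * |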
Also Published As
Publication number | Publication date |
---|---|
CN113867844B (en) | 2024-10-18 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |