CN113867844A - Offline data storage and calculation method based on time slice intelligent inspection control - Google Patents


Publication number
CN113867844A
CN113867844A CN202111175485.4A CN202111175485A CN113867844A CN 113867844 A CN113867844 A CN 113867844A CN 202111175485 A CN202111175485 A CN 202111175485A CN 113867844 A CN113867844 A CN 113867844A
Authority
CN
China
Prior art keywords
activity
activities
time
offline
parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111175485.4A
Other languages
Chinese (zh)
Other versions
CN113867844B (en)
Inventor
郑松森
康志权
张诗君
林福麟
郑幸源
邱阳
陈涵玮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Youke Communication Technology Co ltd
Original Assignee
China Youke Communication Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Youke Communication Technology Co ltd
Priority to CN202111175485.4A
Publication of CN113867844A
Application granted
Publication of CN113867844B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44505 Configuring for program initiating, e.g. using registry, configuration files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/548 Queue

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention relates to an offline data storage and calculation method and system based on time slice intelligent inspection control. The system comprises: a configuration center, which manages the offline activity abstract model and stores offline activity parameter information; and a process control center, which provides intelligent scheduling of offline data activity instances, including initialization, startup, operation monitoring and release. The process control center also comprises one or more operation nodes; all offline data activities instantiated by the process control center run on these nodes, which serve as the activity instance operating environment. In offline data processing scenarios, the invention takes data time as a precondition from the receiving activity onward, timestamps the data according to a given time rule, and combines this with inspection control to ensure integrity, timeliness and accuracy throughout data processing.

Description

Offline data storage and calculation method based on time slice intelligent inspection control
Technical Field
The invention relates to the technical field of information, in particular to an offline data storage and calculation method based on time slice intelligent inspection control.
Background
Offline and real-time data processing have always been important parts of data storage and calculation in information technology. Compared with real-time processing, offline data processing suits post-hoc analysis and scenarios with low timeliness requirements. The industry has a complete set of standardized processes covering data receiving, processing, storage, and analysis/calculation, but the following problems remain in practical application:
How can the integrity of the offline data processing process be guaranteed? From receipt to final storage (analysis and calculation), data passes through a series of processing steps, all of which must be executed completely and correctly.
How can the timeliness of offline data calculation results be improved? Although offline data has a certain time tolerance compared with real-time computation, the "relative" timeliness of the data results must still be ensured.
How can the accuracy of the offline data calculation results be ensured? Data has correlations and dependencies: for example, when compiling provincial data, the result is valid only after the prefecture-level data has been loaded.
Disclosure of Invention
In view of the above, the present invention provides an offline data storage and calculation method based on time slice intelligent inspection control. In offline data processing scenarios, it takes data time as a precondition from the receiving activity onward, timestamps the data according to a given time rule, and combines this with inspection control to ensure integrity, timeliness and accuracy throughout data processing.
In order to achieve the purpose, the invention adopts the following technical scheme:
An offline data storage and calculation method based on time slice intelligent inspection control comprises the following steps:
step S1: predefining and encapsulating general offline data processing activities to establish standardized data processing activities;
step S2: creating each data processing activity instance and configuring its starting and operating parameter attributes, including the time slice period rule;
step S3: reading the activity check configuration at initialization or when a starting parameter is updated, adding it to the control check table of the process control center, and having the activity monitor changes in the check parameters (in particular the time-slice-related parameters);
step S4: when a subscribed parameter in the check table changes, scanning whether the conditions of each activity are met and, if so, starting the related activity instance to execute offline data processing;
step S5: ensuring that, for activities with the same starting parameters, only one instance runs at a time;
step S6: for long-term activity instances, sending heartbeats to the process control center at regular intervals, and restarting an activity instance if its heartbeat is interrupted for longer than a set time;
step S7: after an activity instance executes successfully, updating the starting parameter values according to the execution result;
step S8: after an activity instance fails, placing the task in a failure queue; an activity in the failure queue is not instantiated even if its conditions are met;
step S9: according to the configured failure waiting time, retrieving activities from the failure queue and returning them to the starting parameter management scope of the process control center.
Further, the standardized data processing activities include data access activities, data storage activities, data deletion activities, data download activities, and the like, and are classified into long-term activities and temporary activities.
Furthermore, in the standardized data processing activities, the offline data processing process is labeled and distinguished according to a given time slice period rule, so that data can be identified by time slice.
Further, the parameter types are divided into starting parameters, operating parameters and activity decision parameters, and all activity parameters carry time slice attributes.
An offline data storage and calculation system based on time slice intelligent inspection control comprises:
a configuration center, used for managing the offline activity abstract model and storing offline activity parameter information;
a process control center, used for intelligent scheduling of offline data activity instances, including initialization, startup, operation monitoring and release;
the process control center also comprises one or more operation nodes, and all offline data activities instantiated by the process control center run on the operation nodes, which serve as the activity instance operating environment.
Compared with the prior art, the invention has the following beneficial effects:
1. the invention standardizes offline activities and the definition of their parameters, and decomposes offline data processing into several independent activities, so that the activities required can be selected from an abstract activity model to meet offline data processing needs;
2. the invention ensures that offline activities can be instantiated and executed by predefining different types of activity parameters: operating parameters hold the parameter attribute information used while an activity runs; starting parameters are used to evaluate conditions when an activity starts; and inspection parameters are used to periodically inspect the attribute information of the activity starting parameters;
3. the invention provides a method for intelligent process control of offline data processing activities that takes the time slice as the core precondition: an intelligent process control center manages the instantiation of offline data processing activities by confirming the conditions of the starting parameters and supervises the running activity instances, thereby realizing intelligent process control of offline data processing activities.
Drawings
FIG. 1 is a system framework diagram of the present invention;
FIG. 2 is a flow diagram of offline data processing activities in an embodiment of the invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
Referring to FIG. 1, the present invention provides an offline data storage and calculation system based on time slice intelligent inspection control, which comprises three components:
Configuration center: responsible for managing the offline activity abstract model and storing offline activity parameter information.
Process control center: responsible for intelligent scheduling of offline data activity instances, including initialization, startup, operation monitoring and release.
Operation nodes: one or more operation nodes may be present at the same time; all offline data activities instantiated by the process control center run on the operation nodes, which are the activity instance operating environment.
Referring to fig. 2, the present invention further provides an offline data storage and calculation method based on time slice intelligent inspection control, which in this embodiment specifically includes the following steps:
1. Defining and encapsulating general offline data processing activities to establish standardized data processing activities;
a) Configuration center: activity abstraction management (abstracted offline data processing activities)
[Table: abstracted offline data processing activities; image not reproduced]
Preferably, the data activities may be extended by abstraction according to actual requirements, for example with data compression/decompression activities or data file processing and formatting activities, which are not listed exhaustively here.
b) Activity classification (two types)
[Table: activity classification; image not reproduced]
The activity abstract model and activity instances have a one-to-many relationship: the abstract model of a single activity can yield different processing activities by defining different parameters.
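The one-to-many relationship between an abstract model and its instances can be sketched as follows. This is only an illustration: the class names, the `instantiate` helper and the parameter keys are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class ActivityModel:
    """Abstract activity held by the configuration center (hypothetical shape)."""
    name: str


@dataclass
class ActivityInstance:
    """One configured use of a model; its parameters make it distinct."""
    model: ActivityModel
    params: Dict[str, str] = field(default_factory=dict)


def instantiate(model: ActivityModel, param_sets: List[Dict[str, str]]) -> List[ActivityInstance]:
    # One model, many instances: each parameter set yields its own instance.
    return [ActivityInstance(model, dict(p)) for p in param_sets]


download = ActivityModel("data_download")
instances = instantiate(download, [
    {"local_path": "/data/a", "time_slice_period": "5", "time_slice_unit": "MINUTES"},
    {"local_path": "/data/b", "time_slice_period": "1", "time_slice_unit": "DAYS"},
])
print(len(instances), "instances share the model", download.name)
```

Both instances reuse the same `data_download` model and differ only in their parameter values.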
2. The activity configuration center extracts activities according to real-time service requirements and defines the parameters of each data activity, wherein the parameter types are divided into decision parameters, starting parameters and operating parameters;
Decision parameters: defined by the activity abstract model; used to determine whether an activity instance satisfies its run conditions.
Starting parameters: defined when the activity is defined; they specify the environment conditions for starting an activity instance, such as the activity type and the number of activity instances.
Operating parameters: defined when the activity is defined; they hold the parameter information an activity instance needs while running.
The activity parameters must include time-slice-related attributes, namely the latest time slice partition, the time slice period and the time slice period unit, so that the time slice rule can be determined and time slicing can provide the control center with a basis for check decisions.
a) Configuration center: activity parameter management examples
i. Activity one
[Table: parameters of activity one; image not reproduced]
Through its time slicing parameters, activity one specifies that received data is output as files in 5-minute time slices.
Here {time slicing} defaults to a uniform format of a 14-digit timestamp followed by the time slice period and the period unit, as follows:
yyyyMMddHHmmss_5_MINUTES
That is, the output file is PeopleView_20200720180000_5_MINUTES
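A minimal sketch of how such a time slice label might be computed is shown below. It assumes that slices are aligned to fixed boundaries counted from 1970-01-01 and that timestamps are naive local times; the function names and the `HOURS` unit are assumptions, since the text only shows `MINUTES` and `DAYS`.

```python
from datetime import datetime, timedelta

# Seconds per period unit; HOURS is an assumed extra unit.
UNIT_SECONDS = {"MINUTES": 60, "HOURS": 3600, "DAYS": 86400}
EPOCH = datetime(1970, 1, 1)


def slice_start(ts: datetime, period: int, unit: str) -> datetime:
    """Floor a naive timestamp to the start of the time slice that contains it."""
    step = period * UNIT_SECONDS[unit]
    delta = ts - EPOCH
    elapsed = delta.days * 86400 + delta.seconds
    return EPOCH + timedelta(seconds=elapsed - elapsed % step)


def slice_label(prefix: str, ts: datetime, period: int, unit: str) -> str:
    """Build '<prefix>_<yyyyMMddHHmmss>_<period>_<unit>' for the slice containing ts."""
    start = slice_start(ts, period, unit)
    return f"{prefix}_{start:%Y%m%d%H%M%S}_{period}_{unit}"


label = slice_label("PeopleView", datetime(2020, 7, 20, 18, 3, 27), 5, "MINUTES")
print(label)  # PeopleView_20200720180000_5_MINUTES
```

Data received at 18:03:27 falls into the 18:00:00 slice, which reproduces the file name given in the text.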
ii. Activity two
[Table: parameters of activity two; image not reproduced]
Activity two determines the list of files to download from the local path, the time slice period and the time slice unit.
iii. Activity three
[Table: parameters of activity three; image not reproduced]
Activity three determines the list of files to upload from the local path, the time slice period and the time slice unit.
iv. Activity four
[Table: parameters of activity four; image not reproduced]
Data table partitions are uniformly created by time slice; activity four uses the time slice period to calculate the table partition names and loads the corresponding files into the table partitions by time slice.
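One way to derive a partition name from a file's time slice is sketched below. The assumptions are that files carry the `<prefix>_<yyyyMMddHHmmss>_<period>_<unit>` suffix described earlier, that partitions use the same label format with their own period, and that slices align to boundaries counted from 1970-01-01. The helper names are hypothetical.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Iterable, List

UNIT_SECONDS = {"MINUTES": 60, "HOURS": 3600, "DAYS": 86400}
EPOCH = datetime(1970, 1, 1)
SLICE_RE = re.compile(r"^(?P<prefix>.+)_(?P<stamp>\d{14})_(?P<period>\d+)_(?P<unit>[A-Z]+)$")


def partition_name(file_name: str, part_period: int, part_unit: str) -> str:
    """Map a time-sliced file name to the label of the partition that contains it."""
    m = SLICE_RE.match(file_name)
    if m is None:
        raise ValueError(f"no time slice suffix in {file_name!r}")
    ts = datetime.strptime(m["stamp"], "%Y%m%d%H%M%S")
    step = part_period * UNIT_SECONDS[part_unit]
    delta = ts - EPOCH
    elapsed = delta.days * 86400 + delta.seconds
    start = EPOCH + timedelta(seconds=elapsed - elapsed % step)
    return f"{m['prefix']}_{start:%Y%m%d%H%M%S}_{part_period}_{part_unit}"


def group_by_partition(files: Iterable[str], part_period: int, part_unit: str) -> Dict[str, List[str]]:
    """Group files by the partition they should be loaded into."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in files:
        groups[partition_name(name, part_period, part_unit)].append(name)
    return dict(groups)


files = [
    "PeopleView_20200720180000_5_MINUTES",
    "PeopleView_20200720180500_5_MINUTES",
    "PeopleView_20200721000000_5_MINUTES",
]
plan = group_by_partition(files, 1, "DAYS")
for partition, members in plan.items():
    print(partition, "<-", members)
```

Here the 5-minute files are grouped into daily partitions; a real loader would then issue the load for each group.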
v. Activity five
[Table: parameters of activity five; image not reproduced]
Activity five uses the time slice period to identify the range of data periods that need to be calculated.
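As an illustration, the data period covered by a time slice label can be recovered as a half-open interval. The sketch assumes the label format shown for activity one; `slice_range` and the `HOURS` unit are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Tuple

UNIT_TO_DELTA = {
    "MINUTES": timedelta(minutes=1),
    "HOURS": timedelta(hours=1),   # assumed unit, not shown in the text
    "DAYS": timedelta(days=1),
}


def slice_range(label: str) -> Tuple[datetime, datetime]:
    """Return [start, end) of the data period described by a time slice label."""
    _prefix, stamp, period, unit = label.rsplit("_", 3)
    start = datetime.strptime(stamp, "%Y%m%d%H%M%S")
    return start, start + int(period) * UNIT_TO_DELTA[unit]


start, end = slice_range("PeopleView_20200720000000_1_DAYS")
print(start, "to", end)
```

A calculation activity could use the two bounds to restrict its query to the rows of that period.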
3. The process control center reads the activity check configuration at initialization (or when a starting parameter is updated) and adds it to its control check table; the activity monitors changes in the check parameters;
Control check management records the state of the activity dependency information held by the process control center. At startup, the process control center immediately registers the activity operating parameters that need monitoring, subscribes to their state, and confirms state changes in the check table by calculating the time slice parameters.
a) Registration and subscription of activity parameters, taking the activity of step S3 as an example
[Table: registration and subscription of activity parameters; image not reproduced]
Note:
1. Changes to check table parameters and values are detected by registering the corresponding file directory listening / table partition listening functions.
2. The corresponding table in activity five (the data calculation and analysis activity) is derived by parsing its SQL.
3. When a parameter subscribed to by an activity in the check table changes, execution of that activity is triggered.
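The registration and subscription behaviour described in these notes resembles an observer pattern. The sketch below is a simplified in-memory stand-in: the `CheckTable` class, its method names and the key `hive:PeopleView` are hypothetical, and real directory or table partition listeners are replaced by direct `update` calls.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List, Optional

Callback = Callable[[str, Optional[str], str], None]


class CheckTable:
    """Holds watched parameter values and notifies subscribers when one changes."""

    def __init__(self) -> None:
        self._values: Dict[str, str] = {}
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, key: str, callback: Callback) -> None:
        self._subscribers[key].append(callback)

    def update(self, key: str, value: str) -> bool:
        """Record a value; fire callbacks only if it differs from the stored one."""
        old = self._values.get(key)
        if old == value:
            return False
        self._values[key] = value
        for callback in self._subscribers.get(key, []):
            callback(key, old, value)
        return True


triggered: List[str] = []
table = CheckTable()
table.subscribe("hive:PeopleView", lambda key, old, new: triggered.append(new))
table.update("hive:PeopleView", "PeopleView_20200720000000_1_DAYS")
table.update("hive:PeopleView", "PeopleView_20200720000000_1_DAYS")  # unchanged, no trigger
table.update("hive:PeopleView", "PeopleView_20200721000000_1_DAYS")
print(triggered)
```

Only genuine changes reach the subscribers, so repeated scans that observe the same partition do not trigger the activity again.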
b) Check information type descriptions
[Table: check information types; image not reproduced]
4. When the process control center finds that a subscribed parameter in the check table has changed, it scans whether the conditions of each activity are met and, if so, starts the related activity instance to execute offline data processing;
Control center activity decision management is a service module maintained by the process control center that provides the activity operation decision service. All activities defined in the configuration center can register their activity operation decision information with it, so that when the subscription information of a related activity in the control check table changes, the activity operation decision is triggered and an activity instance is generated.
[Table: activity operation decision information; image not reproduced]
For example:
[Table: activity operation decision example; image not reproduced]
In activity start/stop decision management, when the "database table information type - HIVE table time slice table partition" entry of the control check table changes from "PeopleView_20200720000000_1_DAYS" to "PeopleView_20200721000000_1_DAYS", the process control center receives an event subscription message and injects the change into the "activity operation decision" parameter of activity five. If the decision returns "run", activity five is instantiated; otherwise, the process control center continues subscribing to and monitoring changes in the check table.
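The decision step in this example can be sketched as below. The rule used here, "run when the new time slice is later than the previous one", is an assumption made for illustration; the patent leaves the actual decision logic to each activity's decision parameters. All function names are hypothetical.

```python
from datetime import datetime
from typing import Callable, List, Optional, Tuple


def slice_time(label: str) -> datetime:
    """Extract the 14-digit slice timestamp from '<prefix>_<stamp>_<period>_<unit>'."""
    _prefix, stamp, _period, _unit = label.rsplit("_", 3)
    return datetime.strptime(stamp, "%Y%m%d%H%M%S")


def should_run(old: Optional[str], new: str) -> bool:
    # Assumed rule: run only when the watched partition moved to a later slice.
    return old is None or slice_time(new) > slice_time(old)


def on_check_table_change(activity: str, old: Optional[str], new: str,
                          start_instance: Callable[[str, str], None]) -> str:
    """Inject the change into the decision; start an instance or keep waiting."""
    if should_run(old, new):
        start_instance(activity, new)
        return "run"
    return "wait"


started: List[Tuple[str, str]] = []
result = on_check_table_change(
    "activity_five",
    "PeopleView_20200720000000_1_DAYS",
    "PeopleView_20200721000000_1_DAYS",
    lambda activity, label: started.append((activity, label)),
)
print(result, started)
```

When the decision returns "wait", nothing is started and the caller simply remains subscribed to further check table changes.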
5. The process control center ensures that only one instance of the activity with the same defined parameters runs at the same time;
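A minimal sketch of such a single-instance guarantee within one process is given below. It keys running instances by activity name plus parameter values; a deployment with several operation nodes would need a shared store or distributed lock instead, which is outside this sketch. The class and method names are hypothetical.

```python
import threading
from typing import Dict, Set, Tuple

Key = Tuple[str, Tuple[Tuple[str, str], ...]]


class SingleInstanceGuard:
    """Allows at most one running instance per (activity, parameters) pair."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._running: Set[Key] = set()

    @staticmethod
    def _key(activity: str, params: Dict[str, str]) -> Key:
        # Sort the items so that parameter order does not change the key.
        return activity, tuple(sorted(params.items()))

    def try_start(self, activity: str, params: Dict[str, str]) -> bool:
        key = self._key(activity, params)
        with self._lock:
            if key in self._running:
                return False
            self._running.add(key)
            return True

    def finish(self, activity: str, params: Dict[str, str]) -> None:
        with self._lock:
            self._running.discard(self._key(activity, params))


guard = SingleInstanceGuard()
params = {"time_slice_period": "5", "time_slice_unit": "MINUTES"}
first = guard.try_start("data_load", params)
second = guard.try_start("data_load", dict(params))  # same parameters, rejected
other = guard.try_start("data_load", {"time_slice_period": "1", "time_slice_unit": "DAYS"})
print(first, second, other)
```

The same activity with different parameters is treated as a different schedulable unit, which matches the wording "the same defined parameters".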
6. For long-term activity instances, heartbeats are sent to the process control center at regular intervals, and the process control center restarts an activity instance if its heartbeat is interrupted for longer than a set time;
Activity instance running state management records the most recent active state of each activity instance on the operation nodes.
[Table: activity instance running state; image not reproduced]
The process control center monitors activity instance states by maintaining an activity instance state table. For long-term activities, it keeps the number of instances stable and running normally, closes and removes abnormal or unresponsive instances, and starts new instances in their place. For short-term activity instances, it ensures singleton operation and closes the task instance on timeout.
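The heartbeat supervision of long-term instances can be sketched as a small state table that is swept periodically. Time is passed in explicitly so the example stays deterministic; the class name, the 30-second timeout and the instance IDs are assumptions.

```python
from typing import Callable, Dict, List


class HeartbeatMonitor:
    """Tracks the last heartbeat per instance and restarts the silent ones."""

    def __init__(self, timeout_s: float, restart: Callable[[str], None]) -> None:
        self._timeout_s = timeout_s
        self._restart = restart
        self._last_beat: Dict[str, float] = {}

    def beat(self, instance_id: str, now: float) -> None:
        self._last_beat[instance_id] = now

    def sweep(self, now: float) -> List[str]:
        """Restart every instance whose last heartbeat is older than the timeout."""
        restarted: List[str] = []
        for instance_id, last in list(self._last_beat.items()):
            if now - last > self._timeout_s:
                self._restart(instance_id)
                self._last_beat[instance_id] = now  # restarted instance gets a fresh deadline
                restarted.append(instance_id)
        return restarted


restarts: List[str] = []
monitor = HeartbeatMonitor(timeout_s=30.0, restart=restarts.append)
monitor.beat("receiver-1", now=0.0)
monitor.beat("receiver-2", now=0.0)
monitor.beat("receiver-1", now=25.0)   # receiver-2 stays silent
swept = monitor.sweep(now=40.0)
print(swept)
```

In practice `now` would come from a monotonic clock and `restart` would close the stale instance before launching a new one on an operation node.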
7. After an activity instance executes successfully, the process control center updates the operating parameter values according to the execution result;
Taking activity five as an example, the variable values in the operating parameters are updated after execution completes:
[Table: updated operating parameters of activity five; image not reproduced]
8. After an activity instance fails, the process control center places the task in a failure queue; an activity in the failure queue is not instantiated even if its conditions are met;
The process activity failure table is maintained by the process control center and temporarily stores failed activities:
[Table: process activity failure table; image not reproduced]
9. According to the configured failure waiting time, the process control center takes activities from the failure queue and returns them to the parameter management scope of its process control check table;
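The failure queue behaviour of the last two steps can be sketched as follows: a failed activity is blocked from instantiation while it sits in the queue and is released once the configured waiting time has elapsed. The class name, method names and the 600-second wait are assumptions for illustration.

```python
from typing import Dict, List


class FailureQueue:
    """Temporarily holds failed activities and releases them after a waiting time."""

    def __init__(self, wait_s: float) -> None:
        self._wait_s = wait_s
        self._failed_at: Dict[str, float] = {}

    def add(self, activity: str, now: float) -> None:
        self._failed_at[activity] = now

    def is_blocked(self, activity: str) -> bool:
        # While queued, an activity must not be instantiated even if its conditions hold.
        return activity in self._failed_at

    def release_due(self, now: float) -> List[str]:
        """Remove and return the activities whose waiting time has elapsed."""
        due = [a for a, t in self._failed_at.items() if now - t >= self._wait_s]
        for activity in due:
            del self._failed_at[activity]
        return due


queue = FailureQueue(wait_s=600.0)
queue.add("activity_four", now=0.0)
blocked_before = queue.is_blocked("activity_four")
early = queue.release_due(now=300.0)
released = queue.release_due(now=600.0)
print(blocked_before, early, released, queue.is_blocked("activity_four"))
```

Released activities would then be re-registered in the check table so that the next matching parameter change can trigger them again.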
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is directed to preferred embodiments of the present invention; other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow. Any simple modification, equivalent change or variation of the above embodiments made according to the technical essence of the present invention falls within the protection scope of the technical solution of the present invention.

Claims (5)

1. An off-line data storage and calculation method based on time slice intelligent inspection control is characterized by comprising the following steps:
step S1: presetting and packaging an offline data processing general activity, and establishing a standardized data processing activity;
step S2: creating and configuring the starting and running parameter attributes of each data processing activity instance;
step S3: reading the activity check configuration when initializing or updating the starting parameter, adding the activity check configuration into a control check table of a process control center, and monitoring the change of the check parameter by the activity;
step S4: when the subscription parameters in the check list are found to change, scanning whether each activity condition is met, and if so, starting a related activity instance to execute offline data processing;
step S5: ensuring that, for activities with the same starting parameters, only one instance runs at a time;
step S6: for the long-term activity instance, sending heartbeat to the process control center at regular time, and restarting the activity instance if the heartbeat interruption exceeds a certain time;
step S7: after the execution of the activity instance is successful, updating the information of the starting parameter value according to the execution result;
step S8: after the execution of the activity instance fails, the task is discarded into a failure queue, and the instance is not executed even if the activity existing in the failure queue meets the condition;
step S9: and according to the failure waiting time length configuration parameters, retrieving the activities from the failure queue and then putting the activities into the starting parameter management range of the center.
2. The offline data storage and calculation method based on time-slice intelligent inspection control as claimed in claim 1, wherein said standardized data processing activities include data access activities, data storage activities, data deletion activities, data download activities, etc., wherein the activities are divided into long-term activities and temporary activities.
3. The offline data storage and computation method based on time-slice intelligent inspection control as recited in claim 1, wherein said parameter types are start-up parameters, operational parameters and activity decision parameters, all activity parameters being time-slice attributes.
4. The off-line data storage and calculation method based on time-slicing intelligent inspection control as claimed in claim 1, wherein the off-line data processing procedures in the standardized data processing activities are labeled and distinguished according to preset time-slicing period rules.
5. An off-line data storage and calculation system based on time slice intelligent inspection control is characterized by comprising:
the configuration center is used for managing the offline activity abstract model and storing the information of the offline activity parameters;
the process control center is used for initializing, starting, monitoring operation and releasing the intelligent scheduling function of the activity instance of the offline data activity;
the process control center simultaneously comprises one or more operation nodes, and all the offline data activities instantiated by the process control center are on the operation nodes and serve as activity instance operation environments.
CN202111175485.4A 2021-10-09 2021-10-09 Offline data storage and calculation method based on time slice intelligent inspection control Active CN113867844B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111175485.4A CN113867844B (en) 2021-10-09 2021-10-09 Offline data storage and calculation method based on time slice intelligent inspection control

Publications (2)

Publication Number Publication Date
CN113867844A true CN113867844A (en) 2021-12-31
CN113867844B CN113867844B (en) 2024-10-18

Family

ID=79002105

Country Status (1)

Country Link
CN (1) CN113867844B (en)

Also Published As

Publication number Publication date
CN113867844B (en) 2024-10-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant