RU2643620C2 - Method of planning assignments of preparing data of internet of things for analyzing systems - Google Patents

Method of planning assignments of preparing data of internet of things for analyzing systems Download PDF

Info

Publication number
RU2643620C2
Authority
RU
Russia
Prior art keywords
node
devices
messages
cluster
tasks
Prior art date
Application number
RU2016118326A
Other languages
Russian (ru)
Other versions
RU2016118326A (en)
Inventor
Петр Дмитриевич Зегжда
Дарья Сергеевна Лаврова
Александр Игоревич Печенкин
Мария Анатольевна Полтавцева
Original Assignee
Federal State Autonomous Educational Institution of Higher Education "Peter the Great St. Petersburg Polytechnic University" (SPbPU)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Federal State Autonomous Educational Institution of Higher Education "Peter the Great St. Petersburg Polytechnic University" (SPbPU)
Priority to RU2016118326A
Publication of RU2016118326A
Application granted
Publication of RU2643620C2

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]

Abstract

FIELD: information technology.
SUBSTANCE: the method distinguishes sets of related data preprocessing tasks, namely the operations of processing messages and extracting their parameters and of aggregating and normalizing the parameters of messages from devices; intermediate data stores are then allocated in the database of each node; a handler node is allocated on which intelligent rules for scheduling tasks between cluster nodes are formalized, according to which computational tasks are evenly distributed to the least loaded cluster nodes; all messages from each device are accumulated on one handler node, where the values of the message parameters are aggregated; on each cluster node, priority scheduling with dynamic priorities within the node is used, assigning a priority to each computational task by means of priority weighting formulas; devices of the same type are bound into a hierarchy; messages belonging to a device hierarchy are accumulated on one handler node, forming one aggregated message from the hierarchy devices instead of a set of messages from each device in the hierarchy.
EFFECT: automating task scheduling between cluster nodes.
1 dwg

Description

The invention relates to the field of computer systems, namely to the Internet of Things and the organization of data processing on the Internet of Things.

A known electronic device for high-performance interconnect transmission (RF patent No. 2579140, G06F 13/42, publ. 03/27/2016) solves the problem of hardware distribution of tasks in a multiprocessor cluster. The device includes a synchronization counter designed to locally align signal transmission by a specific device with signal transmission in a system containing one or more other devices coupled so as to exchange information through an interconnect, and a multi-level stack containing physical layer logic, link layer logic and protocol layer logic; the physical layer logic is at least partially implemented in hardware and is designed to synchronize the reset of the synchronization counter with an external deterministic signal globally supported in the system and, based on the synchronization counter, to synchronize the data channel's entry into the transmitting state with this deterministic signal.

The disadvantages of this device for the task of preprocessing Internet of Things data for analysis systems are:

1) Lack of focus on the specific problem being solved and, as a result, the lower performance of universal algorithms when solving the task on a computing cluster.

2) Lack of intelligent scheduling tools focused on data preprocessing in the normalization and aggregation of Internet of Things messages, which would make it possible to avoid system overload.

Universal methods and hardware for processing Internet of Things data, including in cloud systems, are also known (US 2014297210, G01R 21/133; US 2014303935, G01D 21/00). In both cases, equipment and methods for analyzing Internet of Things data are proposed. The first considers methods for aggregating, filtering and disseminating data for analysis, obtaining primary information from the Internet of Things; according to that method, analog Internet of Things data is digitized and transmitted over third-party networks using the appropriate protocols. The second method, focused on the use of clouds for data analysis, receives data from sensors (including indirect information, similar to the system under consideration) and analyzes it using cloud technologies; based on this analysis, a control signal is generated.

The disadvantages of these methods are:

1) Low efficiency, expressed as low speed: the general (universal) approaches of these methods do not take into account the features of data preprocessing, such as normalization and aggregation of information, or the associated processing and scheduling stages needed for Internet of Things analysis, in particular when solving information security problems.

2) Reduced reliability caused by possible overloads of the computing cluster due to the lack of intelligent algorithms for scheduling tasks within the cloud itself, focused on the specifics of processing aggregated and normalized data.

The invention is based on the task of intelligent scheduling, on a computing cluster, of Internet of Things data preprocessing tasks, including normalization and data aggregation, for analysis systems. This provides increased processing speed of cluster tasks by the computing nodes and optimized distribution of these tasks between cluster nodes by: representing the computational tasks as a task graph, convenient for storing in a database and for fast data access; formalizing intelligent rules for scheduling tasks between cluster nodes under an arbitrary flow of incoming messages, which ensure uniform loading of all cluster nodes with computational tasks by automatically sending newly arriving computational tasks to the least loaded cluster node and redirecting computational tasks accumulated on a cluster node to free nodes; and setting formulas for the priority weighting coefficients of computational tasks within a single cluster node, in accordance with which priority-based dynamic scheduling of computational tasks is performed, where at every moment in time each computational task arriving at a cluster node is assigned a dynamic priority value, so that the highest-priority computational tasks are processed earlier, regardless of when they arrived at the computing node of the cluster.

The problem is solved by the proposed method for scheduling Internet of Things data preprocessing tasks for analysis systems, which includes the uniform distribution and dynamic scheduling between cluster nodes of computational data preprocessing tasks (processing messages and extracting their parameters, aggregating and normalizing message parameters from devices, and aggregating message parameters from different devices) and the setting of the manner of their execution within each computing node of the cluster. In contrast to the prototype, sets of related data preprocessing tasks are distinguished that are typical for processing each message and represent the operations of processing messages and extracting their parameters, aggregating and normalizing message parameters from devices, and aggregating messages from different devices; then intermediate data storages, which are database memory areas, are allocated in the database of each node; the data flows within the system during the computations are determined, these being sequences of related data preprocessing tasks over messages from devices; a handler node is allocated among all cluster nodes, on which the intelligent rules for scheduling tasks between cluster nodes are separately formalized, according to which computational tasks arrive uniformly at the least loaded cluster nodes for processing in the shortest time; all messages from each device are accumulated on a single handler node, where the values of the message parameters are aggregated; on each cluster node, priority scheduling with dynamic priorities within the node is used, in which a priority is assigned to each computational data preprocessing task by setting priority weighting formulas; devices of the same type located close to each other are bound into a hierarchy; and messages belonging to a device hierarchy are accumulated on one handler node, forming one aggregated message from the devices in the hierarchy instead of many messages from each device in the hierarchy.

The invention is illustrated in FIG. 1, which shows the breakdown of the data preprocessing process into a set of interrelated typical tasks and the interaction between these tasks. FIG. 1 also shows the internal data streams, the input and output data streams, and accesses to the metadata sets.

According to the developed method, the DB0 storage should be an in-memory object (held permanently in memory), replicated, and accessible on each node of the cluster. The DB1-DB2 storages are local to each cluster node. The DB3 storage is deployed on a separate node and serves the remaining messages of all nodes in the cluster; it can be replicated. The following rules have been developed to increase the speed of the DB1-DB3 storages and to minimize data exchange between nodes:

1. All messages from one device go to one handler node.

2. All messages belonging to a device hierarchy are sent to one handler node.

The implementation of these rules provides an increase in the speed of storage and minimization of data exchange for the following reasons:

1. Since messages from one device are accumulated on one handler node, the system does not forward messages between nodes for aggregation. Because messages are not forwarded between cluster nodes:

a) the speed of the data preprocessing process in the system increases, because there is no time spent on sending messages (minimizing data exchange);

b) the number of forwarded and stored messages is reduced, since there is no forwarding of messages from one source between cluster nodes (minimizing data exchange and increasing the speed of storage operations).

2. Since the messages of devices belonging to a hierarchy are sent to one handler node, aggregation for these devices is no longer performed over the individual messages of each device but over the aggregated messages of each device. One aggregated message of a device thus contains the values of several separate messages from that device, while each aggregated message carries one value of the same type as each individual message. When messages from a device hierarchy are aggregated, a single aggregated message is generated for several devices instead of several messages from each device separately. Forwarding such a message is therefore faster, and the size of the transmitted message does not increase, which is essential for increasing the speed of message exchange and thereby minimizing data exchange between nodes. In addition, storage operation speed is increased, since one access to the storage is performed for an aggregated message instead of several accesses to retrieve the messages of each device.
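To illustrate rule 2, the following is a minimal Python sketch, not part of the patent, of how messages from a device hierarchy accumulated on one handler node could be merged into a single aggregated message before any forwarding or storage access. The Message structure, the mean as the aggregation function, and the synthetic device identifier are all assumptions.

    from collections import defaultdict
    from dataclasses import dataclass
    from statistics import mean
    from typing import Dict, List

    @dataclass
    class Message:                 # hypothetical message format
        device_id: str
        hierarchy_id: str          # hierarchy of same-type devices the sender belongs to
        value: float               # one parameter value of a single type

    def aggregate_hierarchies(messages: List[Message]) -> Dict[str, Message]:
        """Form one aggregated message per device hierarchy instead of one per device.

        The aggregation function (here: mean) is an assumption; the method only
        requires that the aggregated message carry one value of the same type.
        """
        by_hierarchy: Dict[str, List[Message]] = defaultdict(list)
        for msg in messages:
            by_hierarchy[msg.hierarchy_id].append(msg)

        return {
            hier_id: Message(
                device_id="aggregate:" + hier_id,    # synthetic id of the aggregating device
                hierarchy_id=hier_id,
                value=mean(m.value for m in group),  # one value of the same type as the inputs
            )
            for hier_id, group in by_hierarchy.items()
        }

One storage access per aggregated message then replaces one access per device message, which is the effect described above.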

When these rules are fulfilled, tasks are distributed between cluster nodes depending on the workload of the nodes. The load of a cluster node is determined by the list of devices bound to it (first physical, then aggregating):

[Formula image 00000001 in the original: the node load Z expressed through Nf and Na]

where Z is the node load, Nf is the number of physical devices bound to the node, and Na is the number of aggregating devices bound to the node.
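The exact form of the load formula is given only as an image in the source. The sketch below therefore assumes a simple weighted count of physical and aggregating devices as Z and uses it to bind a new device to the least loaded node, as the scheduling rules require; the function names, weights, and dictionary layout are assumptions.

    from typing import Dict

    def node_load(n_physical: int, n_aggregating: int,
                  w_physical: float = 1.0, w_aggregating: float = 1.0) -> float:
        """Assumed form of Z: a weighted count of the devices bound to the node.

        The patent defines Z through Nf (physical devices) and Na (aggregating
        devices) but publishes the formula only as an image, so the weights
        here are placeholders.
        """
        return w_physical * n_physical + w_aggregating * n_aggregating

    def pick_least_loaded(nodes: Dict[str, Dict[str, int]]) -> str:
        """Return the id of the node with the smallest load Z.

        `nodes` maps a node id to its device counts, e.g.
        {"node-1": {"physical": 12, "aggregating": 3}, ...}.
        """
        return min(
            nodes,
            key=lambda n: node_load(nodes[n]["physical"], nodes[n]["aggregating"]),
        )

    # Example: a new device (and all of its future messages) is bound to the
    # least loaded node, so its messages never need to be forwarded between nodes.
    cluster = {
        "node-1": {"physical": 10, "aggregating": 2},
        "node-2": {"physical": 7, "aggregating": 1},
    }
    target = pick_least_loaded(cluster)   # -> "node-2" under equal weights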

Profiles have been developed for the tasks included in the preprocessing process.

The description of the task profiles includes the following (a schematic sketch in Python is given after the list):

1. The message parsing module (T1) is initiated by the arrival of a new message. Its purpose is message processing and extraction of parameters. A message is received at the input; the extracted parameters are provided at the output.

2. The time aggregation manager (T2) runs continuously; its purpose is to inspect the parameter queue and to aggregate parameters when the time stamp is reached. The input is a sample from Q1; the output is an aggregated parameter. T2 is activated by:

- the time t being reached;

- the next parameter being placed in the queue (by T1).

3. The normalization module (T3) is initiated by the T1 module (when transmitting a parameter) or by the T2 module (when transmitting an aggregated parameter). Its purpose is to normalize parameters. A parameter is input, a normalized parameter is output.

4. The dispatcher (T4) is designed to form aggregate and composite events. A parameter (or parameters) is received at the input; an event is provided at the output. T4 is initiated by modules T3, T5 and T6 upon receipt of a parameter.

5. The composite message generation module (T5) is intended for forming composite events from several messages. The input receives a sample from Q2; a parameter is provided at the output. T5 is initiated:

- by module T4 upon receipt of a parameter that is the last in the chain needed to form a composite event;

- by timeout to check for “hung” events for which no final or continuing messages have arrived.

6. The aggregate (composite) device formation module (T6) is intended for forming the parameters of composite devices. The input is a sample from Q3; the output is a parameter of the composite device. T6 is initiated:

- by module T4 upon receipt of a parameter of the composite device;

- when the timeout for the composite device expires.
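As a reading aid, the task types and queues listed above can be summarized in a small Python sketch; the enum values, field names and trigger labels are assumptions used only to make the relationships between T1-T6 and Q1-Q3 explicit.

    from dataclasses import dataclass
    from enum import Enum
    from typing import List

    class TaskType(Enum):
        T1 = "message parsing"              # triggered by a new message
        T2 = "time aggregation manager"     # runs continuously, reads queue Q1
        T3 = "normalization"                # triggered by T1 or T2
        T4 = "dispatcher"                   # triggered by T3, T5, T6
        T5 = "composite message formation"  # reads queue Q2, triggered by T4 or timeout
        T6 = "composite device formation"   # reads queue Q3, triggered by T4 or timeout

    @dataclass
    class TaskProfile:
        task_type: TaskType
        triggered_by: List[str]             # modules, queues or timers that initiate the task
        input_queue: str = ""               # Q1-Q3 where applicable
        output: str = ""

    # Assumed encoding of the profiles described in the list above.
    PROFILES = [
        TaskProfile(TaskType.T1, ["new message"], output="extracted parameters"),
        TaskProfile(TaskType.T2, ["time t", "T1"], input_queue="Q1", output="aggregated parameter"),
        TaskProfile(TaskType.T3, ["T1", "T2"], output="normalized parameter"),
        TaskProfile(TaskType.T4, ["T3", "T5", "T6"], output="event"),
        TaskProfile(TaskType.T5, ["T4", "timeout"], input_queue="Q2", output="parameter"),
        TaskProfile(TaskType.T6, ["T4", "timeout"], input_queue="Q3", output="composite device parameter"),
    ]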

In accordance with the developed method, intelligent scheduling of big data processing on a cluster node is implemented as a priority queue with dynamic priority assignment depending on the state of the system and the task being performed (a sketch of such a queue is given after the stability provisions below). Priorities are changed, to balance task execution, at moments when the system becomes unstable. The stability of the system in question, based on the task profiles, is determined by the following provisions:

[Formula images 00000002-00000004 in the original: stability conditions on the node, expressed through the number of tasks N]

where N is the number of tasks.
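The stability conditions themselves are published only as images. Independently of their exact form, the priority queue with dynamic priorities described above can be sketched as follows; the priority formula used here (a weight that grows with waiting time) stands in for the patent's undisclosed weighting formulas and is purely an assumption.

    import heapq
    import itertools
    import time
    from typing import Callable, List, Tuple

    class DynamicPriorityQueue:
        """Priority queue whose priorities are recomputed when the node is unstable.

        `weight` maps (task, waiting_time) to a priority weight; a larger weight
        means the task should run earlier. The concrete formula is an assumption.
        """
        def __init__(self, weight: Callable[[dict, float], float]):
            self._weight = weight
            self._counter = itertools.count()          # tie-breaker for equal weights
            self._heap: List[Tuple[float, int, dict]] = []

        def push(self, task: dict) -> None:
            task.setdefault("arrived", time.monotonic())
            w = self._weight(task, 0.0)
            heapq.heappush(self._heap, (-w, next(self._counter), task))  # max-heap via negation

        def pop(self) -> dict:
            return heapq.heappop(self._heap)[2]

        def rebalance(self) -> None:
            """Recompute all priorities; intended to be called at moments of instability."""
            now = time.monotonic()
            tasks = [entry[2] for entry in self._heap]
            self._heap = []
            for task in tasks:
                w = self._weight(task, now - task["arrived"])
                heapq.heappush(self._heap, (-w, next(self._counter), task))

    # Assumed weight: a base priority for the task type plus a term for waiting time.
    def example_weight(task: dict, waiting: float) -> float:
        return task.get("base_priority", 1.0) + 0.1 * waiting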

Other important statistical indicators of the node are:

1. Ni - the number of tasks of each type in the system;

2. Ti - the average time to complete each type of task in the system;

3. Nin0 - the number of tasks entering the system (T1) during Δt in the previous period;

4. Nin1 - the number of tasks entering the system (T1) during Δt in the current period;

5. Q01-Q03 - the lengths of queues Q1-Q3 at time Δt (for the previous period);

6. Q1-Q3 - the current lengths of queues Q1-Q3;

7. Qp1-Qp3 - the threshold values of the queue lengths.

Thus, the node profile is: {N = {N1, …, N6}, T = {T1, …, T6}, Nin0, Nin1, Q01, Q02, Q03, Q1, Q2, Q3, Qp1, Qp2, Qp3}.
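For reference, the node profile can be written down as a small data structure; this is a direct transcription of the indicators listed above, with field names and the instability check being assumptions.

    from dataclasses import dataclass
    from typing import Dict

    @dataclass
    class NodeProfile:
        """Statistical profile of one cluster node, as listed above."""
        n_tasks: Dict[str, int]        # N1..N6: number of tasks of each type in the system
        avg_time: Dict[str, float]     # T1..T6: average completion time per task type
        n_in_prev: int                 # Nin0: tasks entering the system during the previous Δt
        n_in_cur: int                  # Nin1: tasks entering the system during the current Δt
        q_prev: Dict[str, int]         # Q01..Q03: queue lengths in the previous period
        q_cur: Dict[str, int]          # Q1..Q3: current queue lengths, keyed "Q1".."Q3"
        q_threshold: Dict[str, int]    # Qp1..Qp3: thresholds, keyed like q_cur ("Q1".."Q3")

        def queue_over_threshold(self) -> bool:
            """Assumed instability indicator: some queue exceeds its threshold Qp."""
            return any(self.q_cur[q] > self.q_threshold[q] for q in self.q_cur)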

Based on the above provisions, the following system operation rules have been identified:

1. The number of tasks of each type in the system during operation (over a certain period, determined on a sliding-window basis) is subject to rules 2-4.

2. The queue growth coefficient d_in should correspond to the dynamics of incoming messages.

The dynamics coefficient is determined by the following rules (a hypothetical reading is sketched after them):

[Formula images 00000005-00000008 in the original: rules defining the dynamics coefficient]
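The rules themselves appear only as formula images in the source. A natural reading, given the indicators Nin0, Nin1 and the queue lengths above, is that the dynamics coefficient compares the current inflow of tasks with the previous period and that queue growth is checked against it; the following sketch encodes that interpretation purely as an assumption, including the tolerance factor.

    def dynamics_coefficient(n_in_prev: int, n_in_cur: int) -> float:
        """Assumed d_in: ratio of current to previous task inflow over Δt."""
        return n_in_cur / n_in_prev if n_in_prev else float("inf")

    def queue_growth_consistent(q_prev: int, q_cur: int, d_in: float,
                                tolerance: float = 1.5) -> bool:
        """Assumed check of rule 2: queue growth should not exceed the
        dynamics of incoming messages by more than a tolerance factor."""
        growth = q_cur / q_prev if q_prev else float("inf")
        return growth <= tolerance * d_in

    # Example: inflow grew 1.2x and queue Q1 grew 1.3x -> still consistent.
    d = dynamics_coefficient(n_in_prev=100, n_in_cur=120)
    ok = queue_growth_consistent(q_prev=40, q_cur=52, d_in=d)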

The DB0 storage is replicated over the cluster nodes (and possibly fragmented in accordance with the Device - Cluster Node mapping directory).

The DB1-DB2 storages are local to each cluster node.

The DB3 storage is deployed on a separate node and serves the remaining messages of all nodes in the cluster; it can be replicated.

The novelty of the proposed solution lies in accumulating the messages from each device on one handler node, in accordance with the intelligent scheduling rules that this node applies, which ensures fast, uniform distribution of the computational data preprocessing tasks between cluster nodes and reduces the number of messages sent between nodes; and in accumulating the messages from device hierarchies on a single handler node, which ensures the receipt and further forwarding of one aggregated message storing the values from the several devices of the hierarchy. In this way, message transfer between cluster nodes is minimized and storage speed is increased, by reducing the number of storage accesses and minimizing the stored data. At the same time, the method allows the intermediate DB1-DB2 storages to be maintained locally.

The novelty in the context of task scheduling within one node of the computing cluster is expressed in the developed formulas for calculating dynamic priorities in priority scheduling. The statistical indicators used in the developed scheduling method are also applicable for establishing the stability (stable functioning) of the system.

Claims (1)

  1. A method for scheduling Internet of Things data preprocessing tasks for analysis systems, including the uniform distribution and dynamic scheduling between cluster nodes of computational data preprocessing tasks, including message processing and extraction of their parameters, aggregation and normalization of message parameters from devices, and aggregation of message parameters from different devices, and the setting of the manner of their execution within each computing node of the cluster, characterized in that sets of related data preprocessing tasks are distinguished that are typical for processing each message and represent the operations of processing messages and extracting their parameters, aggregating and normalizing message parameters from devices, and aggregating messages from different devices; then intermediate data storages, which are database memory areas, are allocated on the nodes in the database of each node; the data flows within the system during the computations are determined, these being sequences of related data preprocessing tasks over messages from devices; a handler node is allocated among all cluster nodes, on which the intelligent rules for scheduling tasks between cluster nodes are separately formalized, according to which computational tasks arrive uniformly at the least loaded cluster nodes for processing in the shortest time; all messages from each device are accumulated on a single handler node, where the values of message parameters are aggregated; on each cluster node, priority scheduling with dynamic priorities within the node is used, in which a priority is assigned to each computational data preprocessing task by setting priority weighting formulas; devices of the same type located close to each other are bound into a hierarchy; and messages belonging to the device hierarchy are accumulated on one handler node, forming one aggregated message from the devices in the hierarchy instead of many messages from each of the devices in the hierarchy.
RU2016118326A 2016-05-11 2016-05-11 Method of planning assignments of preparing data of internet of things for analyzing systems RU2643620C2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
RU2016118326A RU2643620C2 (en) 2016-05-11 2016-05-11 Method of planning assignments of preparing data of internet of things for analyzing systems

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
RU2016118326A RU2643620C2 (en) 2016-05-11 2016-05-11 Method of planning assignments of preparing data of internet of things for analyzing systems

Publications (2)

Publication Number Publication Date
RU2016118326A RU2016118326A (en) 2017-11-16
RU2643620C2 true RU2643620C2 (en) 2018-02-02

Family

ID=60328217

Family Applications (1)

Application Number Title Priority Date Filing Date
RU2016118326A RU2643620C2 (en) 2016-05-11 2016-05-11 Method of planning assignments of preparing data of internet of things for analyzing systems

Country Status (1)

Country Link
RU (1) RU2643620C2 (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013054150A1 (en) * 2011-10-11 2013-04-18 Telefonaktiebolaget Lm Ericsson (Publ) Management of data flows between user equipment nodes and clusters of networked resource nodes
US20140297210A1 (en) * 2011-04-22 2014-10-02 Expanergy, Llc Universal internet of things apparatus and methods
US20140303935A1 (en) * 2011-06-15 2014-10-09 Expanergy, Llc Universal internet of things cloud apparatus and methods
WO2015006080A1 (en) * 2013-07-11 2015-01-15 Neura, Inc. Physical environment profiling through internet of things integration platform
WO2015061976A1 (en) * 2013-10-30 2015-05-07 Nokia Technologies Oy Methods and apparatus for task management in a mobile cloud computing environment
EP2977901A2 (en) * 2014-07-02 2016-01-27 Samsung Electronics Co., Ltd Method for assigning priority to multiprocessor tasks and electronic device supporting the same
RU2579140C1 (en) * 2012-10-22 2016-03-27 Интел Корпорейшн Physical layer of high-efficiency interconnection


Also Published As

Publication number Publication date
RU2016118326A (en) 2017-11-16

Similar Documents

Publication Publication Date Title
CN100343810C (en) Task Scheduling method, system and apparatus
US20170201448A1 (en) Techniques Associated with Server Transaction Latency Information
Kaul et al. Real-time status: How often should one update?
US20070016560A1 (en) Method and apparatus for providing load diffusion in data stream correlations
Lu et al. Join-Idle-Queue: A novel load balancing algorithm for dynamically scalable web services
Khazaei et al. Performance analysis of cloud computing centers using m/g/m/m+ r queuing systems
Zheng et al. An approach for cloud resource scheduling based on Parallel Genetic Algorithm
Awad et al. Enhanced particle swarm optimization for task scheduling in cloud computing environments
US8773992B2 (en) Methods and apparatus for hierarchical routing in communication networks
US20100125847A1 (en) Job managing device, job managing method and job managing program
Kliazovich et al. CA-DAG: Modeling communication-aware applications for scheduling in cloud computing
CN103345514A (en) Streamed data processing method in big data environment
JP5664098B2 (en) Composite event distribution apparatus, composite event distribution method, and composite event distribution program
US20120063313A1 (en) Hybrid weighted round robin (wrr) traffic scheduling
Cardellini et al. On QoS-aware scheduling of data stream applications over fog computing infrastructures
US8387059B2 (en) Black-box performance control for high-volume throughput-centric systems
GB2461244A (en) Network congestion control with feedback to adjust flow rates of source nodes.
Khazaei et al. Modelling of cloud computing centers using M/G/m queues
CN103927229B (en) MapReducing job scheduling method and system for dynamically available in the server cluster
US9386086B2 (en) Dynamic scaling for multi-tiered distributed systems using payoff optimization of application classes
Noormohammadpour et al. Datacenter traffic control: Understanding techniques and tradeoffs
US9967188B2 (en) Network traffic flow management using machine learning
Nan et al. Optimal resource allocation for multimedia cloud in priority service scheme
US20130155858A1 (en) Hierarchical occupancy-based congestion management
CN104317650A (en) Map/Reduce type mass data processing platform-orientated job scheduling method