CN111970374A - Data node grouping method, system and medium based on machine learning - Google Patents

Info

Publication number
CN111970374A
Authority
CN
China
Prior art keywords
acquisition
grouping
data
data node
group
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010878186.6A
Other languages
Chinese (zh)
Other versions
CN111970374B (en
Inventor
古欣
邵慧
房玉飞
刁志峰
黄大伟
迟昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Youren Information Technology Co ltd
Original Assignee
Shandong Youren Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Youren Information Technology Co ltd
Priority to CN202010878186.6A (patent CN111970374B)
Publication of CN111970374A
Application granted
Publication of CN111970374B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/11 Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Algebra (AREA)
  • Operations Research (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data node grouping method, system and medium based on machine learning, comprising the following steps: sorting the data node set by starting address; splitting the data node set into a plurality of subsets according to a set rule, based on the address difference between adjacent data nodes; screening valid groups for each subset, and determining all possible group acquisition strategies based on the screened valid groups; determining the acquisition time of each group acquisition strategy based on a machine learning method, and determining an optimal group acquisition strategy; and combining the optimal group acquisition strategies of all the subsets to obtain the optimal group acquisition strategy of the whole data node set. With this grouping optimization strategy, edge acquisition takes less time and is more efficient, the delay of edge acquisition is greatly reduced, and its real-time performance is improved.

Description

Data node grouping method, system and medium based on machine learning
Technical Field
The invention relates to the technical field of edge acquisition, in particular to a data node grouping method, a data node grouping system and a data node grouping medium based on machine learning.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
In an application scenario of edge acquisition and edge computing, a conventional edge acquisition method generally obtains the information of all data nodes that need edge acquisition, and then generates an acquisition instruction for each data node in turn according to the node information; it performs edge acquisition on each data node using that node's acquisition instruction, and obtains the node state and the acquired data according to the rules of the protocol. Because this method collects data node by node in series, the ratio of payload data to protocol data is too small and executing many instructions takes a long time, so edge acquisition is very inefficient.
The prior art optimizes the conventional edge acquisition method and logically divides the address space into a plurality of groups according to a fixed address length. And distributing the nodes to be acquired into corresponding groups according to the addresses of the nodes. The data to be acquired is acquired in groups, so that the serial acquisition among the nodes is optimized to be the serial acquisition among the groups, and the acquisition efficiency can be improved by several times to tens of times due to the logic structure of the parallel acquisition of the data nodes in the groups.
However, in practical applications, the inventor found that this grouping method uses fixed address ranges: it defines logical address ranges and puts all nodes within a range into one group. It often fails to find a reasonable grouping, and in some cases an unreasonable or non-optimal grouping actually reduces the benefit of acquisition optimization. For example:
Consider six data nodes (also called nodes or data points) that use the Modbus-RTU protocol with the holding-register type, at addresses 0, 31, 32, 63, 64 and 65. Grouping with 32 addresses per logical group yields [0, 31], [32, 63] and [64, 65]. Group [0, 31] collects the data at every address from 0 to 31, so the data payload holds 32 data points, of which 2 are valid and 30 are invalid. For this group, the percentage of valid data points is 6.25%. Because the communication protocol also carries leading and trailing protocol data, the overall percentage of valid data is 5.7%, which is too low. The reply contains a large amount of redundant invalid payload, and over a serial port at a low baud rate, or in similar conditions, transmitting this redundant data is very time-consuming and reduces the acquisition efficiency.
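The fixed-width grouping in this example can be reproduced with a short sketch. The function names `fixed_groups` and `valid_ratio` are hypothetical and chosen only for illustration; the ratio counts data points only and ignores protocol header and trailer bytes.

```python
from typing import Dict, List


def fixed_groups(addresses: List[int], width: int) -> List[List[int]]:
    """Assign each node to the fixed-width address window that contains it."""
    buckets: Dict[int, List[int]] = {}
    for address in sorted(addresses):
        buckets.setdefault(address // width, []).append(address)
    return [nodes for _, nodes in sorted(buckets.items())]


def valid_ratio(nodes: List[int], width: int) -> float:
    """Share of wanted data points when a full window of `width` addresses is read."""
    return len(nodes) / width


groups = fixed_groups([0, 31, 32, 63, 64, 65], width=32)
first_ratio = valid_ratio(groups[0], width=32)
```

For the first group, `first_ratio` is 2/32, which matches the 6.25% quoted above.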
In addition, the prior art determines the acquisition time of each group by direct detection, which requires actually measuring the total time from sending the data acquisition command to receiving and parsing the data returned by the lower-level device, and is therefore inefficient.
In some embodiments, the prior art also determines the acquisition time of each group by predictive analysis, which predicts the data transfer time from the data length without an actual detection process and so improves the efficiency of calculating the acquisition time; however, because the time is estimated, it is less accurate than actual detection.
Disclosure of Invention
In view of this, the invention provides a data node grouping method, system and medium based on machine learning, which adopt an optimized grouping acquisition strategy to improve the efficiency of edge acquisition; and meanwhile, the acquisition time of each group acquisition strategy is determined by adopting a machine learning method so as to quickly determine the optimal group acquisition strategy.
In order to achieve the above purpose, in some embodiments, the following technical solutions are adopted:
a data node grouping method based on machine learning comprises the following steps:
sorting the data node set according to the size of the initial address;
splitting the data node set into a plurality of subsets according to a set rule, based on the address difference between adjacent data nodes;
screening effective groups for each subset, and determining all possible group acquisition strategies based on the screened effective groups; determining the acquisition time of each group acquisition strategy based on a machine learning method, and determining an optimal group acquisition strategy;
and combining the optimal grouping acquisition strategies of all the subsets to obtain the optimal grouping acquisition strategy of the whole data node set.
In other embodiments, the following technical solutions are adopted:
an optimized grouping system for improving edge acquisition efficiency, comprising:
means for sorting the data node set by starting address;
means for splitting the data node set into a plurality of subsets according to a set rule, based on the address difference between adjacent data nodes;
means for screening valid groups for each subset and determining all possible group acquisition strategies based on the screened valid groups; means for determining the acquisition time of each group acquisition strategy based on a machine learning method and determining the optimal group acquisition strategy;
and means for combining the optimal group acquisition strategies of all the subsets to obtain the optimal group acquisition strategy of the whole data node set.
In other embodiments, the following technical solutions are adopted:
a terminal device comprising a processor and a computer-readable storage medium, the processor being configured to implement instructions; the computer readable storage medium is used for storing a plurality of instructions adapted to be loaded by a processor and to perform the above-mentioned machine learning-based data node grouping method.
In other embodiments, the following technical solutions are adopted:
a computer readable storage medium having stored therein a plurality of instructions adapted to be loaded by a processor of a terminal device and to execute the above-mentioned machine learning-based data node grouping method.
Compared with the prior art, the invention has the beneficial effects that:
compared with a grouping strategy with fixed address length, the grouping optimization strategy of the method has shorter time for edge acquisition and higher efficiency, can greatly reduce the time delay of the edge acquisition and improve the real-time property of the edge acquisition.
The invention divides the whole data node set into a plurality of subsets to be processed respectively, thereby greatly reducing the data volume needing to be processed at a time and reducing the requirement on the performance of the data processing equipment.
According to the invention, through setting multiple effective grouping screening strategies, the grouping which does not meet the requirements can be directly filtered, and the grouping acquisition strategy is determined based on the screened effective grouping, so that the complexity of data processing is reduced, and the data processing efficiency is improved.
The method is based on a machine learning method to determine the acquisition time of each group, and then the acquisition time required by each group acquisition strategy is obtained; the method can improve the calculation efficiency of the acquisition time and can ensure the accuracy of the calculation result.
Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
Fig. 1 is a flowchart of a data node grouping method based on machine learning according to an embodiment of the present invention.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise, and it should be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
The embodiments and features of the embodiments of the present invention may be combined with each other without conflict.
Example one
In one or more embodiments, a machine learning-based data node grouping method is disclosed, referring to fig. 1, comprising the steps of:
step S101: sorting the data node set according to the size of the initial address;
step S102: splitting the data node set into a plurality of subsets according to a set rule, based on the address difference between adjacent data nodes;
specifically, first determine the address difference between every two adjacent data nodes and sort the differences in descending order; then split according to the following process:
(1) splitting two non-split adjacent data nodes with the maximum address difference value;
(2) judging whether a subset meets a set condition after the data nodes are split;
(3) if not, entering the step (4); if yes, intercepting the subset, and entering the step (5);
(4) returning to the step (1) to continue splitting;
(5) judging whether the data node set is completely split or not, and if so, finishing; otherwise, returning to the step (1) to continue splitting the residual data nodes.
A subset satisfies the set condition when:
Condition 1: the number of nodes in the subset does not exceed the limit X (X is the parameter limiting the number of nodes in a subset).
Condition 2: the address range of the subset does not exceed the limit Y (Y is the parameter limiting the address range of a subset).
The subset satisfies the set condition only when both conditions hold. The values of X and Y can be configured flexibly as required: the smaller X and Y are, the more subsets there are, the lower the degree of optimization and the lower the time complexity; the larger X and Y are, the fewer subsets there are, the higher the degree of optimization and the higher the time complexity.
In this embodiment, the maximum number of the subsets is set as needed, and if the number of the split subsets reaches the maximum number, it is determined that the data node set is split completely.
It should be noted that consecutive data nodes, that is, adjacent data nodes whose address difference is 1, do not need to be split.
For example, the data node set obtained by sorting the starting address size is as follows:
Figure BDA0002653281870000061
x represents that node data exists under the current address, O represents that the node data does not exist under the current address, and 1-10 are addresses of data nodes.
When splitting is performed, the address difference between the two data nodes with addresses 3 and 6 is the largest, so the data set is split into two subsets:
Figure BDA0002653281870000062
and
Figure BDA0002653281870000063
Whether each of the two subsets meets the condition is then judged, and any subset that does not meet the condition is split further.
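The splitting rule of step S102 can be sketched as follows. This is a simplified recursive variant written for illustration, not the exact step (1) to (5) procedure: it cuts a subset at its widest gap until conditions 1 and 2 hold. The name `split_into_subsets` and the sample addresses are assumptions; `max_nodes` and `max_range` stand for the limits X and Y.

```python
from typing import List


def split_into_subsets(addresses: List[int], max_nodes: int, max_range: int) -> List[List[int]]:
    """Recursively cut at the widest address gap until every subset fits X and Y."""
    nodes = sorted(addresses)

    def fits(subset: List[int]) -> bool:
        # Condition 1 (node count <= X) and condition 2 (address range <= Y).
        return len(subset) <= max_nodes and subset[-1] - subset[0] <= max_range

    def split(subset: List[int]) -> List[List[int]]:
        if len(subset) < 2 or fits(subset):
            return [subset]
        gaps = [b - a for a, b in zip(subset, subset[1:])]
        widest = max(gaps)
        if widest <= 1:
            # Consecutive nodes (difference of 1) are never split.
            return [subset]
        cut = gaps.index(widest) + 1
        return split(subset[:cut]) + split(subset[cut:])

    return split(nodes)


# Hypothetical node addresses in the range 1..10; the widest gap lies between 3 and 6.
subsets = split_into_subsets([1, 2, 3, 6, 7, 8, 10], max_nodes=4, max_range=4)
```

With these assumed inputs the first cut falls between addresses 3 and 6, mirroring the example above.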
Step S103: screening effective groups for each subset, and determining all possible group acquisition strategies based on the screened effective groups; calculating the acquisition time of each group acquisition strategy, and determining the optimal group acquisition strategy;
specifically, the specific process of screening valid packets includes:
(1) first, traverse all possible groups of data nodes in the subset; if there are n data nodes, the number of all groups is:
Figure BDA0002653281870000071
For example, the subset has 5 points [0, 31, 32, 63, 64], denoted [A, B, C, D, E] for convenience of description. All possible groups are the following 15:
Figure BDA0002653281870000072
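The formula and the list of 15 groups are images that are not reproduced here. The count of 15 for n = 5, together with the row listing later in this description (A, AB, ABC, ..., DE, E), suggests that a group is a run of address-adjacent nodes, which gives n(n+1)/2 groups. The sketch below enumerates groups under that assumption; `contiguous_groups` is a hypothetical name.

```python
from typing import List


def contiguous_groups(labels: List[str]) -> List[str]:
    """List every run of adjacent nodes, e.g. A, AB, ABC, ..., DE, E."""
    n = len(labels)
    return ["".join(labels[i:j + 1]) for i in range(n) for j in range(i, n)]


groups = contiguous_groups(list("ABCDE"))
count = len(groups)  # n * (n + 1) // 2 under the contiguity assumption
```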
(2) except for groups consisting of a single data node (such as A, B, C, D and E above), the remaining groups are screened by the following methods:
① Adhesion group screening:
When the address difference between any two adjacent data nodes is smaller than a set value Z, reject every group that contains only one of the two data nodes; two such data nodes are called a bonded node pair. The larger the value of Z, the more node pairs are bonded, the more groups are filtered out and the lower the degree of optimization.
② Exclusion group screening:
When the address difference between any two adjacent data nodes is larger than a set value L, reject every group that contains both nodes; two such nodes are called an exclusive node pair. The smaller the value of L, the more node pairs are exclusive, the more groups are filtered out and the lower the degree of optimization.
③ Address range group screening:
If the address difference between the first data node and the last data node in a group is larger than a set address range M, reject the group. The smaller the value of M, the more groups are filtered out and the lower the degree of optimization.
④ Group efficiency screening:
Generate the acquisition instruction of each group according to the rules of the protocol, determine the acquisition time of each group, and reject groups with low acquisition efficiency based on that time. For example, if the acquisition time is 5 seconds for group A, 3 seconds for group B and 10 seconds for group AB, then group AB takes longer than acquiring A and B separately (8 seconds in total), so it is considered inefficient and is rejected.
The groups remaining after these layers of screening are the valid groups.
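The first three screens can be sketched as a single pass over the contiguous groups. This is an illustrative reading of the rules above, not the patented implementation: groups are index pairs `(i, j)` into the sorted address list, the function name `screen_groups` and the threshold values are assumptions, and efficiency screening is left out because it needs measured or predicted acquisition times.

```python
from typing import List, Tuple


def screen_groups(addresses: List[int], z: int, l: int, m: int) -> List[Tuple[int, int]]:
    """Return the (first, last) index pairs of groups that pass the three screens."""
    n = len(addresses)
    valid = []
    for i in range(n):
        for j in range(i, n):
            if i == j:
                valid.append((i, j))  # single-node groups are exempt from screening
                continue
            # Adhesion: the group must not separate a bonded pair (difference < Z)
            # at either of its two boundaries.
            splits_bonded = (
                (i > 0 and addresses[i] - addresses[i - 1] < z)
                or (j < n - 1 and addresses[j + 1] - addresses[j] < z)
            )
            # Exclusion: the group must not contain an adjacent pair with difference > L.
            has_exclusive = any(addresses[k + 1] - addresses[k] > l for k in range(i, j))
            # Address range: first-to-last difference must not exceed M.
            too_wide = addresses[j] - addresses[i] > m
            if not (splits_bonded or has_exclusive or too_wide):
                valid.append((i, j))
    return valid


# Assumed thresholds for the subset [A, B, C, D, E] = [0, 31, 32, 63, 64].
valid = screen_groups([0, 31, 32, 63, 64], z=2, l=40, m=40)
```

With these assumed thresholds the surviving multi-node groups are ABC, BC, BCDE and DE, since B-C and D-E are bonded pairs and ABCDE exceeds the address range.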
In this embodiment, the machine learning-based method obtains the acquisition time of each group, and the specific process includes:
During data acquisition, if the environmental parameters are unchanged and only the starting address changes while the length of the queried register unit stays the same, the flow of every edge acquisition is identical and the time taken is the same.
In this embodiment, the environmental parameters include: protocol type (for example: PLC protocol such as modbus-RTU, modbus-TCP, modbus-ASCII, PPI, etc.), protocol communication medium, register type (for example, register type such as Siemens PPI protocol having I type, Q type, M type, D type, etc.), data type and communication parameters, wherein the communication parameters further comprise: serial port, baud rate, data bit, stop bit, check bit and start bit. The environmental parameters are combined together to form a scene; any change in the environmental parameters corresponds to a new scene, one sub-classifier for each scene.
Of course, this does not constitute a limitation to the technical solution of the present invention, and those skilled in the art may determine other combinations of environmental parameters as scenes according to actual needs.
For example:
① Acquire a holding register with start address 0x00 0x64 and register unit length 0x00 0x01
Acquisition command: 01 03 00 64 00 01 C5 D5
Reply data: 01 03 02 00 00 B8 44 (reply data length: 2 bytes)
② Acquire a holding register with start address 0x00 0xC8 and register unit length 0x00 0x01
Acquisition command: 01 03 00 C8 00 01 05 F4
Reply data: 01 03 02 00 00 B8 44 (reply data length: 2 bytes)
Similarly, if only the register unit length is changed, the flow of the acquisition command and the reply data and the format of the protocol data are controllable.
For example:
① Acquire a holding register with start address 0x00 0x64 and register unit length 0x00 0x02
Acquisition command: 01 03 00 64 00 02 85 D4
Reply data: 01 03 04 00 00 00 00 FA 33 (reply data length: 4 bytes)
② Acquire a holding register with start address 0x00 0xC8 and register unit length 0x00 0x02
Acquisition command: 01 03 00 C8 00 02 45 F5
Reply data: 01 03 04 00 00 00 00 FA 33 (reply data length: 4 bytes)
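The acquisition commands above are standard Modbus-RTU "read holding registers" (function code 03) frames, so they can be rebuilt with the usual CRC-16/Modbus checksum. The sketch below is illustrative; the function names are hypothetical, and the slave address 01 is taken from the example frames.

```python
def crc16_modbus(data: bytes) -> int:
    """CRC-16/Modbus: initial value 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def read_holding_registers_request(unit_id: int, start: int, count: int) -> bytes:
    """Build the request frame; the CRC is appended low byte first."""
    body = bytes([unit_id, 0x03]) + start.to_bytes(2, "big") + count.to_bytes(2, "big")
    return body + crc16_modbus(body).to_bytes(2, "little")


def expected_reply_length(count: int) -> int:
    """Address, function code and byte count (3 bytes), 2 bytes per register, 2 CRC bytes."""
    return 5 + 2 * count


one_register = read_holding_registers_request(1, 0x0064, 1).hex()
two_registers = read_holding_registers_request(1, 0x0064, 2).hex()
```

Changing only the start address leaves the frame lengths unchanged, while each additional register adds exactly 2 bytes to the reply, which is the basis for the linear relationship discussed next.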
Therefore, the register unit length and the amount of exchanged data satisfy a linear relationship, that is, the register unit length and the overall execution time T satisfy a linear regression relationship: T = Lx + b, where T is the execution time, L is the register unit length, x is the time per register unit (the slope), and b is the fixed time.
Once the linear regression equation of a scene has been determined, then as long as the scene is unchanged and only the register unit length L is adjusted, the execution time T can be calculated directly from the equation.
And each scene is trained and learned by using supervised learning according to the characteristics of the scene, so that a model of each scene is obtained.
The specific process of machine learning is as follows:
Data acquisition --> data preprocessing --> model training --> model verification --> model confirmation
① Data acquisition
Build the model scene and create a crawler script that captures process data from communication with the PLC; the process data include the query time and the length of the queried register unit.
The process is as follows:
and creating the environment of the scene, for example, using a modbus-RTU protocol for communication, and building an edge acquisition environment by using parameters such as a 485 bus, a 9600 baud rate, 8-bit data bits, 1-bit stop bits and no check bits.
Creating a crawler script, realizing the generation of a query instruction, sending the query instruction to the PLC, receiving and analyzing reply data of the PLC, and storing the crawled process data: register unit length, time to issue and receive an instruction.
② Data preprocessing
Because the crawler captures a large amount of data, some of it may be abnormal, invalid or out of range. These data are preprocessed by removing failed acquisitions, timed-out acquisitions, abnormal PLC replies, and records whose register unit length is out of range.
③ Model training
The preprocessed data contain many records with the same register unit length. All data are classified by register unit length.
For each register unit length, the arithmetic mean of the edge acquisition times of the records in that class is calculated and taken as the mean edge acquisition time for that register unit length.
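The preprocessing and averaging steps can be sketched as below. The record layout `(length, elapsed, ok)`, the function name and the sample values are assumptions made for illustration; they are not taken from the source.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# Hypothetical record: (register unit length, elapsed time, acquisition succeeded)
Sample = Tuple[int, float, bool]


def mean_time_by_length(samples: List[Sample], min_len: int, max_len: int) -> Dict[int, float]:
    """Drop failed or out-of-range records, then average the time per register unit length."""
    buckets = defaultdict(list)
    for length, elapsed, ok in samples:
        if ok and min_len <= length <= max_len:
            buckets[length].append(elapsed)
    return {length: mean(times) for length, times in sorted(buckets.items())}


samples = [(2, 16.0, True), (2, 18.0, True), (3, 22.0, True), (3, 99.0, False), (200, 50.0, True)]
averages = mean_time_by_length(samples, min_len=1, max_len=125)
```

Here the failed record and the record with an out-of-range length are discarded before averaging.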
The specific process of fitting a linear regression equation for the register cell length is described below by way of an example.
No.   Register unit length   Mean edge acquisition time   Equation
1     2                      16.9                         16.9 = 2x + b
2     3                      22.1                         22.1 = 3x + b
3     4                      27.2                         27.2 = 4x + b
4     5                      31.9                         31.9 = 5x + b
1) Solving for x
Subtract the equation numbered n-1 from the equation numbered n:
Pair   Equation                              Solution for x
2-1    22.1 - 16.9 = (3x + b) - (2x + b)     x = 5.2
3-2    27.2 - 22.1 = (4x + b) - (3x + b)     x = 5.1
4-3    31.9 - 27.2 = (5x + b) - (4x + b)     x = 4.7
The arithmetic mean of all x values is (5.2 + 5.1 + 4.7) / 3 = 15 / 3 = 5, which gives x = 5.
2) Solving for b
Substitute x = 5 into each equation and solve for b:
No.   Register unit length   Mean edge acquisition time   Equation            b
1     2                      16.9                         16.9 = 2*5 + b      b = 6.9
2     3                      22.1                         22.1 = 3*5 + b      b = 7.1
3     4                      27.2                         27.2 = 4*5 + b      b = 7.2
4     5                      31.9                         31.9 = 5*5 + b      b = 6.9
The arithmetic mean of all b values is (6.9 + 7.1 + 7.2 + 6.9) / 4 = 28.1 / 4 = 7.025, which gives b = 7.025.
3) Substituting x and b into T = Lx + b gives the linear regression equation:
T = 5L + 7.025
④ Model verification
No.   Register unit length   Mean edge acquisition time   Predicted time
1     2                      16.9                         2*5 + 7.025 = 17.025
2     3                      22.1                         3*5 + 7.025 = 22.025
3     4                      27.2                         4*5 + 7.025 = 27.025
4     5                      31.9                         5*5 + 7.025 = 32.025
Substitute each register unit length into the fitted linear regression equation and compare the predicted edge acquisition time with the measured mean edge acquisition time. If the difference is within a reasonable error, the model is considered to perform acceptably.
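The fitting and verification steps above can be reproduced with a short sketch. It follows the averaging procedure described in this section (mean of consecutive-difference slopes, then mean of the resulting intercepts) rather than ordinary least squares; the function name `fit_by_differences` is hypothetical, and T = x * L + b is used for the fitted line.

```python
from typing import List, Tuple


def fit_by_differences(lengths: List[int], mean_times: List[float]) -> Tuple[float, float]:
    """Estimate slope x and intercept b of T = x * L + b by averaging, as in the worked example."""
    points = list(zip(lengths, mean_times))
    # Slope from each pair of consecutive rows (equation n minus equation n-1).
    slopes = [(t1 - t0) / (l1 - l0) for (l0, t0), (l1, t1) in zip(points, points[1:])]
    x = sum(slopes) / len(slopes)
    # Intercept from each row with the averaged slope substituted in.
    intercepts = [t - x * l for l, t in points]
    b = sum(intercepts) / len(intercepts)
    return x, b


lengths = [2, 3, 4, 5]
mean_times = [16.9, 22.1, 27.2, 31.9]
x, b = fit_by_differences(lengths, mean_times)
# Verification: largest gap between predicted and measured mean times.
max_error = max(abs(x * l + b - t) for l, t in zip(lengths, mean_times))
```

On the four rows of the example this yields x close to 5 and b close to 7.025, and the largest deviation between prediction and measurement is about 0.175.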
⑤ Model use
Use the trained model to predict the acquisition time for new data.
This completes the training of the model.
The sub-classifier corresponding to each register type under each communication protocol is trained with the above method to obtain a linear regression model for each sub-classifier, and together these form the overall classifier model;
the environmental parameters of each group in a group acquisition strategy are acquired and input into the overall classifier model, the corresponding equation T = f(L) is found for each group according to the environmental parameters of that group, and the acquisition time T of each group is obtained from the register unit length L of the group, so as to obtain the acquisition time of the group acquisition strategy.
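One way to picture the overall classifier model is a lookup table from scene to linear model. The sketch below is an assumption about structure, not the source's implementation: the `Scene` fields are a reduced subset of the environmental parameters listed earlier, and all names and parameter values are hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class Scene(NamedTuple):
    """A reduced, illustrative set of environmental parameters."""
    protocol: str
    medium: str
    register_type: str
    baud_rate: int


class LinearModel(NamedTuple):
    x: float  # time per register unit
    b: float  # fixed time

    def predict(self, length: int) -> float:
        return self.x * length + self.b


def strategy_time(models: Dict[Scene, LinearModel], groups: List[Tuple[Scene, int]]) -> float:
    """Sum the predicted time of every group, using the model of that group's scene."""
    return sum(models[scene].predict(length) for scene, length in groups)


rtu = Scene("modbus-rtu", "rs485", "holding", 9600)
models = {rtu: LinearModel(x=5.0, b=7.025)}
total = strategy_time(models, [(rtu, 2), (rtu, 3)])
```

A change in any environmental parameter produces a different `Scene` key and therefore selects a different trained model.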
In this embodiment, a combination of valid groups that together cover all nodes, with each node included exactly once, is determined as one group acquisition strategy.
Determining all possible grouping collection strategies based on the screened effective groups, wherein the specific process comprises the following steps:
Place the valid groups into rows so that all groups in a row share the same starting node, and order the rows by the address of the starting node. For example:
Fifth row: E
Fourth row: D DE
Third row: C CD CDE
Second row: B BC BCD BCDE
First row: A AB ABC ABCD ABCDE
First, select the first group of a group acquisition strategy: poll the first row in order and select one group as the first group; once every group in the first row has been traversed, all acquisition strategies have been traversed.
Determine the last node of the group just selected, and take the next node as the starting node of the next group.
Find the row of that starting node and select each group in the row in turn, continuing until the last node E is covered.
For example, ABC in the first row is selected as the head group of an acquisition strategy and added to the strategy; node D then becomes the starting node of the next group, and a group is selected from the fourth row, where D or DE can be chosen. If group D is selected, group E is then selected from the fifth row, giving the group acquisition strategy ABC/D/E; if group DE is selected, the group acquisition strategy ABC/DE is obtained.
And calculating the sum of the acquisition time of all the groups in each group acquisition strategy, and selecting the group acquisition strategy with the shortest acquisition time as the optimal group acquisition strategy of the subset.
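The row-by-row traversal and the final selection can be sketched as a recursive enumeration. The sketch assumes groups are index pairs `(i, j)` into the sorted address list and that a group's acquisition time follows T = x * L + b with L taken as the address span of the group; the function names and the values x = 5, b = 7.025 are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Group = Tuple[int, int]  # (index of first node, index of last node)


def enumerate_strategies(n: int, valid_groups: List[Group]) -> List[List[Group]]:
    """Chain valid groups so that every node is covered exactly once."""
    by_start: Dict[int, List[int]] = {}
    for first, last in valid_groups:
        by_start.setdefault(first, []).append(last)

    def extend(start: int) -> List[List[Group]]:
        if start == n:
            return [[]]  # all nodes covered
        strategies = []
        for last in sorted(by_start.get(start, [])):
            for rest in extend(last + 1):
                strategies.append([(start, last)] + rest)
        return strategies

    return extend(0)


def best_strategy(addresses: List[int], strategies: List[List[Group]], x: float, b: float) -> List[Group]:
    """Pick the strategy with the smallest summed predicted acquisition time."""
    def total(strategy: List[Group]) -> float:
        return sum(x * (addresses[j] - addresses[i] + 1) + b for i, j in strategy)
    return min(strategies, key=total)


addresses = [0, 31, 32, 63, 64]  # nodes A..E
all_groups = [(i, j) for i in range(5) for j in range(i, 5)]
strategies = enumerate_strategies(5, all_groups)
best = best_strategy(addresses, strategies, x=5.0, b=7.025)
```

With all 15 groups treated as valid there are 16 strategies, including ABC/D/E and ABC/DE; under the assumed time model the cheapest one is A/BC/DE, because groups spanning a gap of 31 addresses read many unused registers.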
Step S104: and combining the optimal grouping acquisition strategies of all the subsets to obtain the optimal grouping acquisition strategy of the whole data node set.
Compared with the traditional edge acquisition method or the fixed grouping edge acquisition method, the method can find the optimized grouping acquisition strategy by using the shortest time, has shorter time for edge acquisition and higher efficiency, can greatly reduce the time delay of edge acquisition and improve the real-time property of edge acquisition.
Example two
In one or more embodiments, a machine learning based data node grouping system is disclosed, comprising:
means for sorting the data node set by starting address;
means for splitting the data node set into a plurality of subsets according to a set rule, based on the address difference between adjacent data nodes;
means for screening valid groups for each subset and determining all possible group acquisition strategies based on the screened valid groups; means for determining the acquisition time of each group acquisition strategy based on a machine learning method and determining the optimal group acquisition strategy;
and means for combining the optimal group acquisition strategies of all the subsets to obtain the optimal group acquisition strategy of the whole data node set.
It should be noted that the specific working manner of the apparatus is implemented by using the method disclosed in step S101 to step S104 in the first embodiment, and details are not described again.
Example three
In one or more implementations, a terminal device is disclosed that includes a server including a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the machine learning-based data node grouping method of the first embodiment when executing the program. For brevity, no further description is provided herein.
It should be understood that in this embodiment, the processor may be a central processing unit (CPU), and the processor may also be another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory may include both read-only memory and random access memory, and may provide instructions and data to the processor, and a portion of the memory may also include non-volatile random access memory. For example, the memory may also store device type information.
In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or instructions in the form of software.
The data node grouping method based on machine learning in the first embodiment can be implemented directly by a hardware processor, or by a combination of hardware and software modules in the processor. The software modules may be located in RAM, flash memory, ROM, PROM or EPROM, registers, or other storage media well known in the art. The storage medium is located in a memory, and a processor reads information in the memory and completes the steps of the method in combination with hardware of the processor. To avoid repetition, it is not described in detail here.
Those of ordinary skill in the art will appreciate that the illustrative units and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or as a combination of computer software and electronic hardware. Whether such functionality is implemented in hardware or software depends on the particular application and the design constraints of the technical solution. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementation decisions should not be interpreted as departing from the scope of the present application.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, this description is not intended to limit the scope of protection of the present invention. Those skilled in the art should understand that modifications and variations made without inventive effort on the basis of the technical solution of the present invention still fall within its scope of protection.

Claims (10)

1. A data node grouping method based on machine learning is characterized by comprising the following steps:
sorting the data node set by starting address;
splitting the data node set into a plurality of subsets according to a set rule, based on the address difference between adjacent data nodes;
for each subset, screening valid groups and determining all possible grouping acquisition strategies based on the screened valid groups; determining the acquisition time of each grouping acquisition strategy based on a machine learning method, and determining the optimal grouping acquisition strategy;
and combining the optimal grouping acquisition strategies of all the subsets to obtain the optimal grouping acquisition strategy of the whole data node set.
2. The machine learning-based data node grouping method according to claim 1, wherein splitting the data node set into a plurality of subsets according to a set rule, based on the address difference between adjacent data nodes, specifically includes:
(1) splitting between the two adjacent, not-yet-split data nodes with the largest address difference;
(2) judging whether any subset produced by the split meets a set condition;
(3) if not, proceeding to step (4); if so, extracting that subset and proceeding to step (5);
(4) returning to step (1) to continue splitting;
(5) judging whether the data node set has been completely split; if so, ending; otherwise, returning to step (1) to continue splitting the remaining data nodes.
3. The machine learning-based data node grouping method according to claim 2, wherein the set condition in step (2) specifically includes:
the number of nodes in the subset does not exceed a set value X;
the address range of the subset does not exceed the set value Y.
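The splitting procedure of claims 2 and 3 can be sketched as follows. This is only one possible reading of the steps, not the patented implementation: the function name `split_into_subsets` and the parameter names `max_nodes` (set value X) and `max_span` (set value Y) are hypothetical, and node addresses are assumed to be plain integers.

```python
def split_into_subsets(addresses, max_nodes, max_span):
    """Split node addresses at the largest gaps until every subset has at
    most max_nodes nodes (X) and an address range of at most max_span (Y)."""
    if max_nodes < 1 or max_span < 0:
        raise ValueError("max_nodes must be >= 1 and max_span must be >= 0")
    if not addresses:
        return []

    def meets_condition(seg):
        return len(seg) <= max_nodes and seg[-1] - seg[0] <= max_span

    pending = [sorted(addresses)]  # segments that may still need splitting
    finished = []                  # segments already extracted as subsets
    while pending:
        # Steps (2)/(3): extract every segment that meets the set condition.
        finished += [seg for seg in pending if meets_condition(seg)]
        pending = [seg for seg in pending if not meets_condition(seg)]
        if not pending:
            break  # step (5): everything has been split
        # Step (1): locate the largest gap among the remaining segments.
        # A pending segment always has at least two nodes, because a
        # single node trivially meets the condition.
        _, seg_idx, pos = max(
            ((seg[k + 1] - seg[k], s, k)
             for s, seg in enumerate(pending)
             for k in range(len(seg) - 1)),
            key=lambda t: t[0],
        )
        seg = pending.pop(seg_idx)
        pending += [seg[:pos + 1], seg[pos + 1:]]
    return sorted(finished)


subsets = split_into_subsets([0, 1, 2, 10, 11, 50, 51, 52, 53],
                             max_nodes=3, max_span=5)
print(subsets)
```

When several gaps tie for the largest, this sketch simply takes the first one it encounters; the claims do not say how ties are resolved.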
4. The machine learning-based data node grouping method according to claim 1, wherein for each subset, the specific process of screening valid groups includes:
acquiring all possible groupings of the data nodes in the subset;
and screening the remaining groupings, other than groups consisting of a single data node, using at least one of the following screening rules:
(1) when the address difference between any two adjacent data nodes is smaller than a set value Z, rejecting all groups that contain only one of the two data nodes;
(2) when the address difference between any two adjacent data nodes is larger than a set value L, rejecting all groups that contain both data nodes;
(3) if the difference between the addresses of the first and last data nodes in a group is larger than a set address range M, rejecting the group;
(4) determining the acquisition time of each group, and rejecting groups with low acquisition efficiency based on the acquisition time.
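The address-based screening rules of claim 4 can be illustrated with a minimal sketch. It rests on assumptions that the claim does not state: a group is taken to be a contiguous run of the sorted subset, identified by an index pair `(i, j)`; the names `screen_groups`, `z`, `l`, and `m` are hypothetical stand-ins for the set values Z, L, and M; and the efficiency rule based on acquisition time is left out because it needs a timing model.

```python
def screen_groups(addrs, z, l, m):
    """Return the index pairs (i, j) of contiguous groups addrs[i..j] that
    survive the three address-based screening rules.

    addrs must be sorted in ascending order. Single-node groups are always
    kept, since the claim exempts them from screening.
    """
    n = len(addrs)

    def is_valid(i, j):
        if i == j:
            return True
        # Address span between first and last node must not exceed M.
        if addrs[j] - addrs[i] > m:
            return False
        # No adjacent pair inside the group may be farther apart than L.
        if any(addrs[k + 1] - addrs[k] > l for k in range(i, j)):
            return False
        # A pair closer than Z must not be separated by a group boundary.
        if i > 0 and addrs[i] - addrs[i - 1] < z:
            return False
        if j < n - 1 and addrs[j + 1] - addrs[j] < z:
            return False
        return True

    return [(i, j) for i in range(n) for j in range(i, n) if is_valid(i, j)]


valid = screen_groups([0, 1, 2, 10, 11], z=2, l=5, m=20)
print(valid)
```

In this example, nodes 0, 1, and 2 are closer together than Z, so the only multi-node group that may contain any of them is the one containing all three.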
5. The machine learning-based data node grouping method according to claim 1, wherein a combination of valid groups that together contain all data nodes, with each data node contained exactly once, is determined as a grouping acquisition strategy.
6. The machine learning-based data node grouping method of claim 1, wherein the collection time of each grouping collection strategy is calculated, and the one with the shortest collection time is selected as the optimal grouping collection strategy.
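Claims 5 and 6 together describe enumerating every way of covering the subset exactly once with valid groups and keeping the fastest one. The sketch below assumes, as an illustration only, that groups are contiguous index ranges `(i, j)` over the sorted subset and that a caller supplies a `group_time` function; the names `enumerate_strategies`, `best_strategy`, and `toy_time`, and the toy cost model, are hypothetical and not taken from the patent.

```python
def enumerate_strategies(n, valid_groups):
    """List every strategy: a sequence of valid groups (i, j) that covers
    node indices 0..n-1, with each node in exactly one group."""
    by_start = {}
    for i, j in valid_groups:
        by_start.setdefault(i, []).append((i, j))

    def extend(start):
        if start == n:
            yield []  # all nodes covered
            return
        for group in by_start.get(start, []):
            for rest in extend(group[1] + 1):
                yield [group] + rest

    return list(extend(0))


def best_strategy(strategies, group_time):
    """Pick the strategy whose summed group acquisition time is smallest."""
    return min(strategies, key=lambda s: sum(group_time(g) for g in s))


# Toy cost model (an assumption): fixed request overhead plus one time
# unit per register between the first and last address of the group.
addrs = [0, 1, 2, 10, 11]
valid_groups = [(0, 0), (0, 2), (1, 1), (2, 2), (3, 3), (3, 4), (4, 4)]

def toy_time(group):
    i, j = group
    return 10 + (addrs[j] - addrs[i] + 1)

strategies = enumerate_strategies(len(addrs), valid_groups)
print(best_strategy(strategies, toy_time))
```

Exhaustive enumeration grows quickly with subset size, which is presumably why the method first splits the node set into small subsets and screens out invalid groups.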
7. The machine learning-based data node grouping method according to claim 1, wherein determining the acquisition time of each grouping acquisition strategy based on the machine learning method specifically includes:
determining the environmental parameters that combine to form scenes, wherein each change of an environmental parameter corresponds to a new scene, and each scene corresponds to one sub-classifier;
in a given scene, collecting group acquisition time data for different register unit lengths; training the sub-classifier of that scene on the collected data to obtain its linear regression model;
training the sub-classifier of every scene in the same way to obtain a linear regression model for each sub-classifier, and thereby obtaining the overall classifier model;
acquiring the environmental parameter information of each group in the grouping acquisition strategy, inputting it into the overall classifier model, and outputting the acquisition time of each group, thereby obtaining the acquisition time of the grouping acquisition strategy.
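The per-scene model of claim 7 can be sketched as a dictionary of one-variable linear regressions keyed by scene. Everything concrete here is an assumption made for illustration: the scene key (a tuple of environment parameters such as a baud rate and a protocol name), the function names, and the sample timings, which are made-up numbers and not measurements from the patent.

```python
def fit_line(lengths, times):
    """Ordinary least squares fit of time = slope * length + intercept."""
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, times))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def train_overall_model(samples):
    """Train one linear model per scene.

    samples maps a scene key to a list of (register_length, time) pairs.
    """
    return {scene: fit_line(*zip(*points)) for scene, points in samples.items()}


def strategy_time(models, scene, group_lengths):
    """Sum the predicted acquisition times of all groups in a strategy."""
    slope, intercept = models[scene]
    return sum(slope * length + intercept for length in group_lengths)


# Made-up timings (milliseconds) for two hypothetical scenes.
samples = {
    (9600, "rtu"): [(1, 12.0), (2, 14.0), (3, 16.0)],
    (115200, "rtu"): [(1, 3.0), (2, 3.5), (3, 4.0)],
}
models = train_overall_model(samples)
print(strategy_time(models, (9600, "rtu"), [3, 2]))
```

Keeping one small model per scene means that a change in any environment parameter selects a different model instead of requiring a single model to capture every parameter interaction.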
8. An optimized grouping system for improving edge acquisition efficiency, comprising:
a device for sorting the data node set by starting address;
a device for splitting the data node set into a plurality of subsets according to a set rule, based on the address difference between adjacent data nodes;
a device for screening valid groups for each subset and determining all possible grouping acquisition strategies based on the screened valid groups; a device for determining, based on a machine learning method, the acquisition time of each grouping acquisition strategy and determining the optimal grouping acquisition strategy;
and a device for combining the optimal grouping acquisition strategies of all the subsets to obtain the optimal grouping acquisition strategy of the whole data node set.
9. A terminal device comprising a processor and a computer-readable storage medium, the processor being configured to execute instructions, and the computer-readable storage medium being configured to store a plurality of instructions adapted to be loaded by the processor to perform the machine learning-based data node grouping method of any one of claims 1-7.
10. A computer-readable storage medium having stored therein a plurality of instructions adapted to be loaded by a processor of a terminal device and to perform the machine learning based data node grouping method of any one of claims 1-7.
CN202010878186.6A 2020-08-27 2020-08-27 Data node grouping method, system and medium based on machine learning Active CN111970374B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010878186.6A CN111970374B (en) 2020-08-27 2020-08-27 Data node grouping method, system and medium based on machine learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010878186.6A CN111970374B (en) 2020-08-27 2020-08-27 Data node grouping method, system and medium based on machine learning

Publications (2)

Publication Number Publication Date
CN111970374A true CN111970374A (en) 2020-11-20
CN111970374B CN111970374B (en) 2023-02-03

Family

ID=73401221

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010878186.6A Active CN111970374B (en) 2020-08-27 2020-08-27 Data node grouping method, system and medium based on machine learning

Country Status (1)

Country Link
CN (1) CN111970374B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170220475A1 (en) * 2016-01-29 2017-08-03 International Business Machines Corporation Dynamic management of virtual memory blocks exempted from cache memory access
CN108009089A (en) * 2017-12-01 2018-05-08 中南大学 A kind of increment machine learning method and system based on lucidification disposal
CN108985461A (en) * 2018-06-29 2018-12-11 深圳昂云鼎科技有限公司 A kind of method, apparatus and terminal device of autonomous machine study
CN109026649A (en) * 2018-08-28 2018-12-18 上海弦慧新能源科技有限公司 Data acquisition device and operation management method
CN110909888A (en) * 2019-11-25 2020-03-24 深圳前海微众银行股份有限公司 Method, device and equipment for constructing generic decision tree and readable storage medium
US20200137024A1 (en) * 2018-10-31 2020-04-30 Hewlett Packard Enterprise Development Lp Identify assets of interest in enterprise using popularity as measure of importance
CN111131379A (en) * 2019-11-08 2020-05-08 西安电子科技大学 Distributed flow acquisition system and edge calculation method
CN111159002A (en) * 2019-12-31 2020-05-15 山东有人信息技术有限公司 Data edge acquisition method based on grouping, edge acquisition equipment and system
CN111200547A (en) * 2019-12-31 2020-05-26 苏州数言信息技术有限公司 Multi-node selection communication method based on equipment type in Modbus RTU network
CN111314707A (en) * 2020-01-17 2020-06-19 深圳力维智联技术有限公司 Data mapping identification method, device and equipment and readable storage medium
CN111325485A (en) * 2020-03-22 2020-06-23 东北电力大学 Light-weight gradient elevator power quality disturbance identification method considering internet-of-things bandwidth constraint
CN111400040A (en) * 2020-03-12 2020-07-10 重庆大学 Industrial Internet system based on deep learning and edge calculation and working method

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
DELOITTE: "关于边缘计算和边云协同,看这一篇就够了", 《腾讯云-开发者社区HTTPS://CLOUD.TENCENT.COM/DEVELOPER/NEWS/443851》 *
HAO CAO,XIAOLONG XU,QINGXIANG LIU,YUAN XUE,LIANYONG QI: "Uncertainty-Aware Resource Provisioning for Workflow Scheduling in Edge Computing Environment", 《2019 18TH IEEE INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS》 *
刘云毅,张建敏,杨峰义: "基于MEC的移动网络CDN增强及部署场景建议", 《电信科学》 *
夏元清等: "绿色能源互补智能电厂云控制系统研究", 《自动化学报》 *
李蕾,闻征涛,任容玮: "基于5G和MEC边缘云的智慧商超", 《2019全国边缘计算学术研讨会论文集》 *
陈世超等: "制造业生产过程中多源异构数据处理方法综述", 《大数据》 *

Also Published As

Publication number Publication date
CN111970374B (en) 2023-02-03

Similar Documents

Publication Publication Date Title
CN110969198A (en) Distributed training method, device, equipment and storage medium for deep learning model
CN112732222A (en) Sparse matrix accelerated calculation method, device, equipment and medium
CN104809161B (en) A kind of method and system that sparse matrix is compressed and is inquired
CN102314336A (en) Data processing method and system
CN111930512B (en) Optimized grouping method and system for improving edge acquisition efficiency
US20220005004A1 (en) Method and device for blockchain transaction tracing
CN107547441A (en) CAN message filtering analytic method, system and electronic control unit
CN111970374B (en) Data node grouping method, system and medium based on machine learning
CN109861791B (en) Periodic data message transmission method, system, device and storage medium
CN109857740B (en) Character string storage method, matching method, electronic device and readable storage medium
CN115952385B (en) Parallel supernode ordering method and system for solving large-scale sparse equation set
CN112199407A (en) Data packet sequencing method, device, equipment and storage medium
CN111079830A (en) Target task model training method and device and server
CN108449231B (en) Transaction data filtering method and device and implementation device
CN113762424B (en) Network packet classification method and related device
WO2022161081A1 (en) Training method, apparatus and system for integrated learning model, and related device
CN113159791A (en) Block chain-based layered transaction parallel execution method and system
CN109388428B (en) Layer traversal method, control device and data processing system
CN107360262B (en) Software updating method and device
CN107544928B (en) Direct memory access control device and method for operating the same
CN111461310A (en) Neural network device, neural network system and method for processing neural network model
CN112308217A (en) Convolutional neural network acceleration method and system
CN103020203B (en) Method and device for processing data
CN117667602B (en) Cloud computing-based online service computing power optimization method and device
CN112149696A (en) Method and device for training graph embedding model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 250101 rooms 1103 and 1105, 11 / F, building 1, Aosheng building, 1166 Xinluo street, high tech Zone, Jinan City, Shandong Province

Applicant after: Shandong Youren networking Co.,Ltd.

Address before: 250101 rooms 1103 and 1105, 11 / F, building 1, Aosheng building, 1166 Xinluo street, high tech Zone, Jinan City, Shandong Province

Applicant before: SHANDONG YOUREN INFORMATION TECHNOLOGY Co.,Ltd.

GR01 Patent grant