CN113238527A

CN113238527A - Industrial data aggregation method and system

Info

Publication number: CN113238527A
Application number: CN202011568569.XA
Authority: CN
Inventors: 代超仁; 陈吉红; 杨建中; 冯冰艳; 晏嫚; 王萧; 陈震
Original assignee: Huazhong University of Science and Technology
Current assignee: Huazhong University of Science and Technology
Priority date: 2020-12-19
Filing date: 2020-12-25
Publication date: 2021-08-10
Anticipated expiration: 2040-12-25
Also published as: CN113238527B

Abstract

The invention provides an industrial data aggregation method and system, and belongs to the technical field of industrial big data. The industrial data aggregation method comprises: when the difference between the arrival interval of two adjacent sampled data and the data upload period exceeds a congestion threshold, determining the data point locations to be collected in a sampling probability model by using a random number, based on the current working condition; and uploading the collected data point locations. The sampling probability model is established based on the data heat of the data point locations; the data heat is the ratio of the change frequency of a single data point location to the sum of the change frequencies of all data point locations; the change frequency of a data point location is the ratio of its number of changes to its number of synchronizations. The invention accommodates both high-frequency synchronization of hot data and low-frequency synchronization of cold data, keeps the network bandwidth consumed by the digital twin under control for a given upper limit on network traffic, and has considerable practical engineering value.

Description

Industrial data aggregation method and system

Technical Field

The invention belongs to the technical field of industrial big data, and particularly relates to an industrial data aggregation method and system.

Background

With the development of intelligent manufacturing, the numerical control machine tool is taken as a main production device, and the digitization, the networking and the intellectualization of the numerical control machine tool become important supports for industrial transformation and upgrading.

The digital twin of a numerical control machine tool is a technology for the digital representation and modeling of the machine tool, and has long been a research focus in the field of numerical control machining. The data acquisition process of numerical control machine tools lacks an efficient and fast data-lightweighting algorithm and cannot meet the requirement of aggregating the massive data of a digital twin. In practice, one of two extremes usually results: either features are extracted at the edge and a large amount of useful information is discarded, or large amounts of raw data are uploaded, consuming large amounts of bandwidth and computing resources. Because information integrity and cost cannot both be satisfied, big data acquisition for numerical control machine tools has become a bottleneck for high-fidelity digital twins.

The numerical control system is internally provided with an acquisition module and a communication module, wherein the acquisition module is responsible for acquiring data generated by the numerical control machine tool, and the communication module is responsible for uploading the acquired data. The communication module generally uses Ethernet, which is flexible and simple. Because Ethernet uses CSMA/CD (carrier-sense multiple access with collision detection), data transmission performance drops rapidly when the network load is large, which makes network transmission non-deterministic and prevents the server side from completing data synchronization of the digital twin. Therefore, a method for optimizing data synchronization under limited network bandwidth needs to be designed so as to meet the requirement of synchronous acquisition of digital twin data.

Disclosure of Invention

In view of the defects of the prior art, the invention aims to provide an industrial data aggregation method and system, so as to solve the problem that the server cannot complete data synchronization of the digital twin when the network load of a conventional numerical control system is large.

In order to achieve the above object, the present invention provides an industrial data aggregation method, comprising the following steps:

when the difference between the arrival interval of two adjacent sampled data and the data upload period exceeds a congestion threshold, determining the data point locations to be collected in a sampling probability model by using a random number, based on the current working condition; and uploading the collected data point locations;

the sampling probability model is established based on the data heat of the data point locations; the data heat is the ratio of the change frequency of a single data point location to the sum of the change frequencies of all data point locations; the change frequency of a data point location is the ratio of its number of changes to its number of synchronizations; and the random number is a number from 0 to 1.

Preferably, the method for acquiring the data heat comprises the following steps:

D1, initializing the number of changes and the number of synchronizations of each data point location, synchronizing all data point locations under the current working condition, calculating the sum of the change frequencies of all data point locations, and deleting invalid data point locations;

D2, repeatedly synchronizing the valid data point locations, calculating after each data synchronization the sum of the change frequencies of all data point locations and the change frequency of each single valid data point location, and stopping data synchronization once the end condition is met;

wherein the data heat of each data point location is the corresponding change frequency obtained by the last data synchronization; the end condition is that the relative error between the averages of two successive survey windows is within the end threshold:

$$\frac{\left|\bar{F}_{i+1} - \bar{F}_i\right|}{\bar{F}_i} \le \varepsilon_2$$

where $\bar{F}_i$ and $\bar{F}_{i+1}$ respectively denote the average value of the sum of the change frequencies of all data point locations in the i-th and (i+1)-th survey windows, and $\varepsilon_2$ is the end threshold.

Preferably, D2 specifically includes the following steps:

D2.1, repeatedly synchronizing the valid data point locations, and calculating after each data synchronization the sum of the change frequencies of all data point locations and the data heat of each single valid data point location;

D2.2, dividing the valid data point locations into hot data point locations and cold data point locations according to the data heat of each single valid data point location and the dividing condition;

D2.3, setting each single cold data point location to be synchronized once in every k repeated synchronizations, and returning to step D2.1 until the end condition is met, whereupon data synchronization stops;

wherein the dividing condition is defined in terms of $s_i$, $c_i$ and $\varepsilon_1$, where $s_i$ denotes the number of synchronizations of a single data point location, $c_i$ denotes its number of changes, and $\varepsilon_1$ is the partition threshold.

Preferably, the method for establishing the sampling probability model based on the data heat of the data point location comprises:

setting coordinate intervals of all data point positions according to sampling probability, and sequentially distributing the coordinate intervals on a closed interval [0,1] of a one-dimensional numerical axis to form a sampling probability model; wherein, the sampling probability is the data heat of the data point location.

Preferably, the method for determining the collected data point location by using the random number comprises the following steps:

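when i = 0 and $R \in (0, P_0]$, the 0th data point location is collected for synchronization; when i > 0 and

$$\sum_{j=0}^{i-1} P_j < R \le \sum_{j=0}^{i} P_j$$

the i-th data point location is selected for synchronization; where R is the random number and $P_i$ is the sampling probability of the i-th data point location.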

In another aspect, the invention provides an industrial data aggregation system, which comprises a flow sensing module, a data point location selection module and a data control module that transmit data in sequence;

the flow sensing module is used for identifying whether the difference value between the arrival time interval of the two adjacent sampling data and the data uploading period exceeds a congestion threshold value or not, and when the difference value exceeds the congestion threshold value, the data point location selection module is started;

the data point location selection module is used for determining the collected data point locations in the sampling probability model by utilizing random numbers based on the current working condition;

the data control module is used for uploading the collected data point positions;

the sampling probability model is established based on the data heat of the data point location; the data heat is the ratio of the change frequency of a single data point location to the sum of the change frequencies of all the data point locations; the change frequency of the data point location is the ratio of the change times of the data point location and the synchronization times of the data point location; wherein the random number is a number from 0 to 1.

Preferably, the data point location selection module comprises a heat learning unit and a data synchronization unit;

the heat learning unit is used for calculating the data heat of each data point location based on the current working condition;

the data synchronization unit is used for establishing the sampling probability model by using the data heat of each data point location, and for identifying the current working condition and, if the working condition changes, scheduling the heat learning unit to update the data heat of each data point location.

Preferably, the sampling probability model sets coordinate intervals for all data point positions according to the sampling probability, and the coordinate intervals are sequentially distributed on a closed interval [0,1] of the one-dimensional numerical axis; wherein, the sampling probability is the data heat of the data point location.

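Preferably, the data point location selection module determines the collected data point location by using the random number R as follows: when i = 0 and $R \in (0, P_0]$, the 0th data point location is collected for synchronization; when i > 0 and

$$\sum_{j=0}^{i-1} P_j < R \le \sum_{j=0}^{i} P_j$$

the i-th data point location is selected for synchronization.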

A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the industrial data aggregation method provided by the present invention.

Through the technical scheme, compared with the prior art, the invention has the following beneficial effects:

The industrial data aggregation method provided by the invention combines data heat with the sampling probability model. When data upload is congested, the data point locations to be sampled are obtained according to data heat: the higher the data heat, the higher the sampling probability. The data point locations obtained over multiple samplings are then uploaded to the server.

The invention adopts a heat learning method, can acquire the data heat of each data point location when the working condition of the numerical control machine tool changes, and provides powerful technical support for the subsequent acquisition of the data point locations needing data synchronization.

By setting each single cold data point location to be synchronized once in every k repeated synchronizations, the invention effectively shortens heat learning, so that the data heat of each data point location can be acquired more quickly.

The invention transmits the data point locations obtained over multiple samplings in sequence in the form of data packets, which both avoids traffic peaks and makes full use of bandwidth resources.

Drawings

FIG. 1 is a graph of the variation of F(t) with time according to the present invention;

FIG. 2 is a schematic diagram of the survey window provided by the present invention;

FIG. 3 is a schematic diagram of the sampling probability model provided by the present invention;

FIG. 4 is a schematic diagram of the industrial data aggregation system provided by the present invention;

FIG. 5(a) is a schematic diagram of the network traffic when all data point locations are synchronized at once, as in the prior art;

FIG. 5(b) is a schematic diagram of the optimized network traffic provided by the present invention;

FIG. 6 is a schematic diagram of network occupation provided by the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

The invention provides an industrial data aggregation method, which comprises the following steps:

when the difference between the arrival interval of two adjacent sampled data and the data upload period exceeds a congestion threshold, determining the data point locations to be collected in a sampling probability model by using a random number, based on the current working condition; and uploading the collected data point locations;

the sampling probability model is established based on the data heat of the data point locations; the data heat is the ratio of the change frequency of a single data point location to the sum of the change frequencies of all data point locations; the change frequency of a data point location is the ratio of its number of changes to its number of synchronizations; and the random number is a number from 0 to 1.

Preferably, the data heat of each data point location is acquired through steps D1 and D2 set out above, the data heat of each data point location being the corresponding change frequency obtained by the last data synchronization.

Preferably, D2 comprises steps D2.1 to D2.3 set out above; in D2.3, each single cold data point location is set to be synchronized once in every k repeated synchronizations, and the method returns to step D2.1 until the end condition is met, whereupon data synchronization stops.

Preferably, the dividing condition is defined in terms of $s_i$, $c_i$ and $\varepsilon_1$, and the end condition is that the relative error between the averages of two successive survey windows is within the end threshold:

$$\frac{\left|\bar{F}_{i+1} - \bar{F}_i\right|}{\bar{F}_i} \le \varepsilon_2$$

where $\bar{F}_i$ and $\bar{F}_{i+1}$ respectively denote the average value of the sum of the change frequencies of all data point locations in the i-th and (i+1)-th survey windows; the time length of a single survey window is L; a survey window is set every L/2; $s_i$ denotes the number of synchronizations of a single data point location; $c_i$ denotes the number of changes of a single data point location; $\varepsilon_1$ is the division threshold; and $\varepsilon_2$ is the end threshold.

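Preferably, the sampling probability model is formed by setting coordinate intervals for all data point locations according to their sampling probabilities and distributing the coordinate intervals in sequence over the closed interval [0,1] of a one-dimensional numerical axis.

Preferably, when i = 0 and $R \in (0, P_0]$, the 0th data point location is collected for synchronization; when i > 0 and

$$\sum_{j=0}^{i-1} P_j < R \le \sum_{j=0}^{i} P_j$$

the i-th data point location is selected for synchronization.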

Preferably, the total number of data point locations acquired over multiple samplings is:

$$D^{sample} = \frac{T^u}{T^s} \times count$$

where $T^u$ is the data upload period, $T^s$ is the sampling period, and count is the number of data point locations acquired each time.

In another aspect, the present invention provides an industrial data aggregation system, comprising a flow sensing module, a data point location selection module and a data control module, which transmit data in sequence.

Preferably, the sampling probability model sets coordinate intervals for all data point positions according to the sampling probability, and the coordinate intervals are sequentially distributed on the closed interval [0,1] of the one-dimensional numerical axis; wherein, the sampling probability is the data heat of the data point location.


Examples

The specific principle of the invention is as follows:

During machining and operation of a numerical control system, the change frequency of data has the following characteristics: (1) attribute data is essentially fixed; (2) parameter data and task data change infrequently; (3) control data is event-specific and has no fixed change period; (4) state data changes most frequently. The invention introduces the concept of "data heat" as the reference for the synchronization priority of data point locations. Data heat is the degree of change activity of a given data point location among all synchronized data point locations. In the ideal case, the data heat is given by:

$$h_i = \frac{f_i}{\sum_j f_j}, \qquad f_i = \frac{1}{\Delta t_i}$$

where $h_i$ is the data heat; $f_i$ is the change frequency of the data point location; $\Delta t_i$ is the time interval between changes of the data point location; and i is the data index.

In actual production and machining, a numerical control machine tool has a very large number of data point locations, all of which can hardly be synchronized at the same time, so the true value of $\Delta t_i$ cannot be obtained. The data heat is therefore redefined as:

$$h'_i = \frac{c_i / s_i}{\sum_j c_j / s_j} \qquad (1)$$

where $h'_i$ is the data heat of the i-th data point location as defined by the invention; $s_i$ is the number of synchronizations of a single data point location; and $c_i$ is the number of changes of a single data point location. The initial values of $s_i$ and $c_i$ are both 0; $s_i$ is incremented by 1 each time the data point location is synchronized, and $c_i$ is incremented by 1 each time its value changes. It follows that when the sample size is very large (that is, when the number of synchronizations of a single data point location is very large), the sample change frequency $c_i/s_i$ approximately equals the actual change frequency $f_i$, and $h'_i$ can then be considered equal to $h_i$.
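As a minimal sketch of this definition, the following Python computes the data heat of each data point location from its two counters, as the ratio of its change frequency $c_i/s_i$ to the sum of the change frequencies of all data point locations. The function name `data_heat`, the dictionary layout and the point names are hypothetical and chosen only for illustration.

```python
from typing import Dict


def data_heat(changes: Dict[str, int], syncs: Dict[str, int]) -> Dict[str, float]:
    """Return h'_i = (c_i / s_i) / sum_j (c_j / s_j) for every data point location."""
    # Sample change frequency of each point: number of changes / number of syncs.
    freq = {name: changes[name] / syncs[name] for name in changes}
    total = sum(freq.values())
    # Normalizing by the total makes the heats sum to 1.
    return {name: f / total for name, f in freq.items()}


# Hypothetical counters after 100 synchronizations of three points.
changes = {"spindle_speed": 90, "axis_position": 60, "program_number": 3}
syncs = {"spindle_speed": 100, "axis_position": 100, "program_number": 100}
heat = data_heat(changes, syncs)
```

Because the heats are normalized, they can be used directly as sampling probabilities, which is how the data synchronization unit uses them.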

As can be seen from formula (1), $h'_i$ can stand in for $h_i$ only once the number of synchronizations of the data point locations reaches a certain level; that is, obtaining the data heat of all data point locations takes time. Therefore, the data point location selection module is divided into a heat learning unit and a data synchronization unit.

the principle of the heat learning unit is as follows:

As can be seen from formula (1), the data heat depends on the number of synchronizations $s_i$ and the number of changes $c_i$ of the data point locations. A large number of synchronization tests should therefore be performed on all data point locations of the numerical control system to obtain the actual change frequency of each data point location, from which the corresponding data heat is finally calculated.

Since a process is required for synchronizing all data point locations for many times, the sample change frequency (i.e., the recorded change frequency) of each data point location changes continuously and finally tends to be stable.

To determine whether the sample change frequencies of all data point locations have stabilized, the sum of the change frequencies of all data point locations is used as the criterion. Formula (2) gives this sum as a function of time during the synchronization process:

$$F(t) = \sum_i f_i(t) \qquad (2)$$

where t is time; i is the index of the data point location; F(t) is the sum of the change frequencies of all data point locations at time t; and $f_i(t)$ is the change frequency of the i-th data point location at time t. The initial values of the number of synchronizations and the number of changes of all data point locations are set to 1, so the initial change frequency of every data point location is 1. After each synchronization, the number of synchronizations and the number of changes of each data point location are recorded and the corresponding change frequency is calculated. The trend of F(t) over time is shown in FIG. 1. As the number of synchronizations increases, the change frequency of a single data point location gradually decreases or remains unchanged, so the sum of the change frequencies of all data point locations keeps decreasing; when the change frequencies of the single data point locations stabilize, the sum also stabilizes.

The learning method of the heat learning unit is as follows:

Because a numerical control system is designed to accommodate many kinds of numerical control machine tools, a large number of data point locations are provided for the convenience of user development or extension. Only some of these data point locations are used in actual production and machining; the values of the remaining extension point locations are always null, and synchronizing them consumes network and memory resources to no practical purpose. The invention therefore refers to the empty extension point locations as "invalid data point locations" and to the point locations actually used as "valid data point locations".

The heat learning method comprises the following steps:

(1) setting the number of changes and the number of synchronizations of each data point location to 1, synchronizing all data point locations, and calculating the sum of the change frequencies of all data point locations;

(2) deleting the invalid data point locations;

(3) cyclically synchronizing the valid data point locations, recording the number of synchronizations and the number of changes of each data point location, and recording the sum of the change frequencies of all data point locations and the data heat of each single data point location;

(4) setting a learning end condition, and ending the cyclic synchronization when the sum of the change frequencies of all data point locations has stabilized enough to satisfy the learning end condition.
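Steps (1) to (3) can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the `read` callback, the use of `None` to represent a null extension point, and the fixed number of rounds standing in for the end condition of step (4) are all hypothetical.

```python
import itertools
from typing import Callable, Dict, Hashable, Optional, Sequence, Tuple

Reader = Callable[[str], Optional[Hashable]]


def learn_counters(read: Reader, names: Sequence[str],
                   rounds: int) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Return (changes, syncs) for the valid data point locations."""
    # Step (1): synchronize every point once; counters start at 1.
    last = {name: read(name) for name in names}
    # Step (2): points whose value is null (None here) are invalid and dropped.
    valid = [name for name in names if last[name] is not None]
    changes = {name: 1 for name in valid}
    syncs = {name: 1 for name in valid}
    # Step (3): keep synchronizing the valid points and update the counters.
    # A fixed number of rounds replaces the end condition of step (4) here.
    for _ in range(rounds):
        for name in valid:
            value = read(name)
            syncs[name] += 1
            if value != last[name]:
                changes[name] += 1
                last[name] = value
    return changes, syncs


ticks = itertools.count()


def fake_read(name: str) -> Optional[int]:
    """Hypothetical reader: one null point, one changing point, one constant point."""
    if name == "spare":
        return None
    if name == "position":
        return next(ticks)  # a new value on every read
    return 7


changes, syncs = learn_counters(fake_read, ["position", "model", "spare"], rounds=9)
```

In this toy run the always-changing point keeps a change frequency of 1, while the constant point drifts towards 0, which is the cold-data behaviour discussed next.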

Handling the change frequency of cold data:

A numerical control system contains a large amount of data whose change frequency is low or which is essentially unchanged; such data is called cold data. During data heat learning, the number of changes $c_i$ of cold data remains at 1 or increases very little, while its number of synchronizations $s_i$ keeps increasing, so the change frequency $c_i/s_i$ of such data gradually tends to 0. Because of the large amount of cold data, two problems arise:

(1) the sum of the change frequencies of all cold data point locations levels off very slowly;

(2) the change frequency of cold data approaches 0 without limit, so its data heat also approaches 0, and when data is synchronized by sampling according to data heat, the probability of synchronizing cold data is very small.

Therefore, in order to speed up data heat learning and ensure that cold data retains a certain data heat, a lower limit needs to be set on the change frequency of cold data. Each single cold data point location is set to be synchronized once in every k synchronizations. Thus, the synchronization frequency of a single cold data point location is obtained as:

Rearranging gives:

where i is the data index, m is the number of hot data point locations, n is the number of cold data point locations, $f_i^{cold}$ is the set change frequency of cold data, and $f_i^{hot}$ is the change frequency of hot data.

During data heat learning, once the change frequencies of the hot data (that is, data with $c_i > 1$) tend to be stable, the change frequency of the cold data (that is, data with $c_i = 1$) is set to the value calculated above. With this setting, the sum of the change frequencies of all data point locations quickly becomes stable.
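The "once in every k synchronizations" rule for cold data can be illustrated with a small scheduling sketch. The function name `points_for_round` and the round-robin spreading of cold points across the k rounds are assumptions made for this example; the text only requires that each cold data point location be synchronized once per k rounds.

```python
from typing import List


def points_for_round(round_idx: int, hot: List[str], cold: List[str], k: int) -> List[str]:
    """Hot points are synchronized every round; each cold point once per k rounds."""
    # Cold point j is due in the rounds whose index is congruent to j modulo k,
    # which spreads the cold points evenly instead of bunching them in one round.
    due_cold = [p for j, p in enumerate(cold) if j % k == round_idx % k]
    return hot + due_cold


# Hypothetical point names; c_i > 1 marks hot data and c_i = 1 cold data in the text.
hot = ["spindle_speed", "axis_position"]
cold = ["program_number", "tool_number", "machine_model"]
schedule = [points_for_round(r, hot, cold, k=4) for r in range(4)]
```

Over any k consecutive rounds each cold point appears exactly once, so its synchronization count grows k times more slowly than that of a hot point.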

End condition:

Data heat learning is terminated when the overall data heat of the numerical control machine tool tends to be stable. To obtain the termination condition, the invention introduces a survey window (shown as the solid-line and dashed-line rectangles in FIG. 2). The window length is set to L, and the average value of the sum of all data change frequencies within the time L is calculated. A survey window is set every L/2, and the average values calculated for two successive survey windows are compared; if their relative error is within the end threshold $\varepsilon_2$, the heat of all data point locations is considered stable and data heat learning is terminated.

The end condition is as follows:

$$\frac{\left|\bar{F}_{i+1} - \bar{F}_i\right|}{\bar{F}_i} \le \varepsilon_2$$

where $\bar{F}_i$ and $\bar{F}_{i+1}$ respectively denote the average value of the sum of the change frequencies of all data point locations in the i-th and (i+1)-th survey windows; the time length of a single survey window is L; a survey window is set every L/2; and $\varepsilon_2$ is the end threshold.
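The survey-window test can be sketched as below. Several details are assumptions made for the example: F(t) is taken as one sample per synchronization round, the window length is counted in samples, and the relative error is normalized by the mean of the earlier window. The name `learning_finished` is hypothetical.

```python
from statistics import mean
from typing import Sequence


def learning_finished(f_sum: Sequence[float], window: int, eps2: float) -> bool:
    """Compare the two most recent survey windows of F(t), offset by window // 2."""
    step = window // 2
    if window < 2 or len(f_sum) < window + step:
        return False  # not enough samples for two overlapping windows yet
    prev_mean = mean(f_sum[-(window + step):-step])  # earlier window
    last_mean = mean(f_sum[-window:])                # most recent window
    # F(t) is a sum of positive change frequencies, so prev_mean is positive.
    return abs(last_mean - prev_mean) / prev_mean <= eps2


# Hypothetical decaying curve that levels off near 2, similar in shape to FIG. 1.
curve = [10 / (t + 1) + 2 for t in range(200)]
early = learning_finished(curve[:12], window=8, eps2=0.01)
late = learning_finished(curve, window=8, eps2=0.01)
```

Early in learning the two window means still differ strongly, so the check fails; once the curve has flattened, the relative error drops below the threshold and learning can stop.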

The data synchronization principle is as follows:

Because the data point locations of a numerical control machine tool change at different frequencies, the selection priority of data point locations for synchronization should be considered: the higher the change frequency of a data point location, the more frequently it should be synchronized.

Data heat expresses how active a single data point location is among all data point locations: the higher the data heat of a single data point location, the higher its change frequency, and the lower its data heat, the lower its change frequency. The data heat can therefore be taken as the probability that a single data point location is synchronized; after a large number of probabilistic synchronizations, the synchronization frequency of each data point location approaches its change frequency. This guarantees timely synchronization of data point locations that change at high frequency, while data point locations that change at low frequency are synchronized from time to time.

Since, by formula (1),

$$\sum_i h'_i = 1$$

the sampling probability can be set to $P_i = h'_i$ and the data point locations to be synchronized selected according to $P_i$. A closed interval [0,1] is set on a one-dimensional axis, and all data point locations are distributed over this interval according to their probabilities $P_i$; that is, the sampling probability $P_i$ of the i-th data point location is the length of the sub-interval corresponding to that point (as shown in FIG. 3, where $P_1$ and $P_2$ denote the data heat of data point locations). A program generates a random number R in the interval [0,1]. When i = 0 and $R \in (0, P_0]$, the 0th data point location is selected for synchronization; when i > 0 and

$$\sum_{j=0}^{i-1} P_j < R \le \sum_{j=0}^{i} P_j$$

the i-th data point location is selected for synchronization.

According to the sampling probability model, the higher the data heat of a data point location, the higher its probability of being synchronized, and the lower the data heat, the lower that probability. Because the digital twin synchronizes data at a very high frequency, the synchronization frequency of data point locations that change at high frequency approaches their change frequency. Data point locations with a low change frequency still receive a certain number of synchronizations when the sample is large (that is, when the number of selections is very large); in practice, the change frequency of low-frequency data is very low, and the number of synchronizations obtained under the probability model is sufficient to keep the data on the numerical control system side consistent with the digital twin.
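The interval model can be sketched with cumulative sums and a binary search. The helper names `cumulative_bounds` and `select_point`, the example probabilities and the seeded generator are hypothetical; the selection rule itself is the one stated above, picking the index i whose sub-interval contains R.

```python
import bisect
import itertools
import random
from typing import List, Sequence


def cumulative_bounds(probabilities: Sequence[float]) -> List[float]:
    """Right endpoints of the consecutive sub-intervals laid out on [0, 1]."""
    return list(itertools.accumulate(probabilities))


def select_point(bounds: Sequence[float], r: float) -> int:
    """Return the index i whose sub-interval (lower bound, bounds[i]] contains r."""
    # bisect_left finds the first bound that is >= r, i.e. the interval holding r.
    i = bisect.bisect_left(bounds, r)
    return min(i, len(bounds) - 1)  # guard against rounding in the last bound


# Hypothetical data heats of three points, used directly as sampling probabilities.
bounds = cumulative_bounds([0.5, 0.25, 0.25])

rng = random.Random(42)  # seeded only so that the sketch is repeatable
counts = [0, 0, 0]
for _ in range(10_000):
    counts[select_point(bounds, rng.random())] += 1
```

Over many draws the hottest point is selected roughly twice as often as each of the other two, while the colder points are still selected regularly.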

Overview of the industrial data aggregation system:

As shown in FIG. 4, the industrial data aggregation system is divided into three parts. The flow sensing module senses network traffic from the arrival interval of two adjacent sampled data; the flow control module splits the data point locations into equal data packets and transmits them so as to balance the traffic; the data point location selection module derives the heat of each data point location from its counted number of changes and preferentially selects the data point locations with high heat.

The flow sensing module:

Because instruction-domain big data has the highest sampling frequency, it also has the largest data volume of all sampling channels. The sampling channel carrying instruction-domain big data is therefore used as the basis for flow sensing, and the arrival interval of two adjacent sampled data is calculated.

Here, $\Delta T_N$ is the current time interval, $\Delta T_{N-1}$ is the previous time interval, $T_N$ is the arrival time of the current sampled data, and $T_{N-1}$ is the arrival time of the previous sampled data; $\Delta T_N$ and $T_{N-1}$ are in milliseconds.

When the network is in good condition, with the internal sampling period of the numerical control system specified as $T^s$ and the data upload period as $T^u$, $\Delta T_N \approx T^u$ should hold. When $\Delta T_N \gg T^u$, that is, when the average arrival interval of the sampled data is far longer than the specified data upload period, the data transmission network is congested, and digital twin data synchronization should be stopped to reduce the network load.
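A minimal sketch of the congestion check follows. It uses the criterion stated in the summary of the method (the arrival interval exceeds the upload period by more than a congestion threshold) and assumes, for simplicity, that the interval is the plain difference of two arrival times without any smoothing. The function name and the 50 ms threshold are hypothetical.

```python
def is_congested(arrival_ms: float, previous_arrival_ms: float,
                 upload_period_ms: float, congestion_threshold_ms: float) -> bool:
    """True when the arrival interval exceeds the upload period by more than the threshold."""
    # Assumption: the interval is the raw difference of two consecutive arrival times.
    interval_ms = arrival_ms - previous_arrival_ms
    return interval_ms - upload_period_ms > congestion_threshold_ms


# With a 200 ms upload period and a hypothetical 50 ms congestion threshold:
on_time = is_congested(1200.0, 1000.0, upload_period_ms=200.0, congestion_threshold_ms=50.0)
delayed = is_congested(1400.0, 1000.0, upload_period_ms=200.0, congestion_threshold_ms=50.0)
```

A packet arriving exactly one upload period after the previous one is not flagged, whereas a packet arriving 200 ms late is.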

The flow control module:

The digital twin of a numerical control machine tool has numerous data point locations, whereas the sampling channel is a very precious resource and the data point locations it carries account for only a very small share of all data point locations. As shown in FIG. 5(a), synchronizing all data point locations of the digital twin at once causes a network traffic peak, so that data synchronization of the digital twin cannot be achieved. FIG. 5(b) illustrates the network traffic optimization: a large data packet is split into smaller data packets, which are then transmitted in sequence, so that traffic peaks are avoided and bandwidth resources are fully utilized.

To prevent network collisions, the sampling channel is designed with a "collect multiple times, upload once" mechanism. This mechanism avoids network collisions but leaves network bandwidth idle. Assuming that the upload period of the sampling channel is 200 ms, the transmission network may be idle while the numerical control system is collecting data internally (as shown in FIG. 6).

Uploading the data in the sampling channel occupies only a small part of an upload period, so non-sampled data can be synchronized during the idle intervals of the network, which improves the utilization of the network channel.

The invention designs a sampling channel for carrying instruction-domain big data and specifies, within the numerical control system, a sampling period $T^s$ = 1 ms and a data upload period $T^u$ = 200 ms. With count = 16 data point locations in the sampling channel, the total number of data in one sampling channel is:

$$D^{sample} = \frac{T^u}{T^s} \times count = \frac{200}{1} \times 16 = 3200$$

Because Ethernet uses a collision detection mechanism, splitting data packets too finely increases the number of retransmissions and defeats the purpose of the optimization. The number of data points in the sampling channel, D^sample, is converted to network traffic V^sample (all values in the instruction domain sampling channel are floating-point values, each occupying 8 bytes, i.e. 64 bits). Given a bandwidth limit threshold V^threshold, then:

where Package is the number of data packets synchronized per second. The requests for the split data packets are each issued as asynchronous calls, so sending and receiving can share the same time slice, which achieves the network optimization.
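Splitting a packet and issuing the pieces as asynchronous requests might look like the sketch below. Everything here is illustrative: `split_packet`, `send_chunk`, `send_all`, and the 6400-byte chunk size are hypothetical names and values, and `send_chunk` only simulates a network call. The chunk size is deliberately not tiny, since the text warns that overly small splits increase retransmissions.

```python
import asyncio


def split_packet(payload: bytes, chunk_size: int) -> list[bytes]:
    """Split one large payload into consecutive chunks of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)]


async def send_chunk(index: int, chunk: bytes) -> tuple[int, int]:
    """Hypothetical stand-in for one asynchronous upload request."""
    await asyncio.sleep(0)  # yield to the event loop, as real network I/O would
    return index, len(chunk)


async def send_all(payload: bytes, chunk_size: int) -> list[tuple[int, int]]:
    """Issue every chunk as its own asynchronous call and wait for all of them."""
    chunks = split_packet(payload, chunk_size)
    return await asyncio.gather(*(send_chunk(i, c) for i, c in enumerate(chunks)))


# 25600 bytes corresponds to 3200 values of 8 bytes each.
payload = bytes(25600)
results = asyncio.run(send_all(payload, 6400))
print(results)
```

`asyncio.gather` returns results in the order the requests were submitted, so the receiver can reassemble the payload by index.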

Taking a three-axis vertical machining center as an example, the designed instruction domain big data acquisition channel contains 16 kinds of data, including spindle speed, spindle load power, moving axis command position, actual position, load current, G code line number, program number, subprogram number, and machine tool state. The internal sampling period is 1 ms, the uploading period is 200 ms, and the size of a sampling channel data packet is:

V^sample = 16 × 200 × 64 = 204800 bit = 200 Kbit (25 KB)

If a machine is allocated 2 Mbit/s of bandwidth, then

From the above equation, under a 2 Mbit/s bandwidth limit the machine can synchronize about 10 data packets per second, or about 32000 data points.
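The arithmetic of this example can be checked with a short script. It is a sketch under stated assumptions: the 2M bandwidth is read as 2 × 1024 × 1024 bit/s, and the packet rate is taken as the whole number of packets that fit within that limit, which reproduces the "about 10 packets" and "32000 data points" figures given in the text. The variable names are illustrative.

```python
# Values stated in the text.
COUNT = 16           # data points in the sampling channel
T_S_MS = 1           # internal sampling period T^s, in ms
T_U_MS = 200         # uploading period T^u, in ms
BITS_PER_VALUE = 64  # each floating-point value occupies 8 bytes

# Assumption: "2M" means 2 Mbit/s, with 1 Mbit = 1024 * 1024 bit.
BANDWIDTH_BITS_PER_S = 2 * 1024 * 1024

# Values collected in one uploading period.
d_sample = COUNT * T_U_MS // T_S_MS

# Size of one sampling-channel packet, in bits.
v_sample_bits = d_sample * BITS_PER_VALUE

# Assumption: whole packets per second that fit under the bandwidth limit.
packages_per_second = BANDWIDTH_BITS_PER_S // v_sample_bits

# Data points synchronized per second at that packet rate.
points_per_second = packages_per_second * d_sample

print(d_sample, v_sample_bits, packages_per_second, points_per_second)
```

Note that 204800 bit is 25600 bytes, so the packet is 200 Kbit, or 25 KB, in size.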

Compared with the prior art, the invention has the following advantages:

According to the industrial data aggregation method provided by the invention, data heat is combined with a sampling probability model. When data uploading is congested, the data point locations to be sampled are determined according to data heat: the higher the data heat, the higher the sampling probability. After multiple rounds of sampling, the sampled data point locations are uploaded to the server.

The invention schedules a single cold data point location synchronization within every k repeated synchronizations, which effectively shortens the data heat learning time and allows the data heat of each data point location to be obtained more quickly.
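The combination of data heat and the sampling probability model can be sketched as follows, using the definitions given in the abstract and claims: change frequency is change count divided by synchronization count, data heat is one location's change frequency divided by the sum over all locations, and the heats are laid out as consecutive intervals on [0, 1]. The function names, the example counts, and the uniform fallback when nothing has changed are assumptions made for illustration. For i > 0, the sketch assumes the interval of location i runs from the cumulative heat of the preceding locations up to and including its own.

```python
import random
from itertools import accumulate


def data_heat(change_counts, sync_counts):
    """Data heat per location: its change frequency over the sum of all frequencies."""
    freqs = [c / s if s else 0.0 for c, s in zip(change_counts, sync_counts)]
    total = sum(freqs)
    if total == 0:
        # Assumption: with no observed changes, treat all locations equally.
        return [1.0 / len(freqs)] * len(freqs)
    return [f / total for f in freqs]


def select_point(heats, r):
    """Return the index whose interval on [0, 1] contains the random number r."""
    for i, upper in enumerate(accumulate(heats)):
        if r <= upper:
            return i
    return len(heats) - 1  # guard against floating-point rounding near 1.0


# Hypothetical counts: location 0 is hot, locations 1 and 2 are cold.
heats = data_heat(change_counts=[8, 1, 1], sync_counts=[10, 10, 10])

# random.random() lies in [0, 1); 1 - random.random() lies in (0, 1].
r = 1.0 - random.random()
print(heats, select_point(heats, r))
```

Because a hot location owns a wider interval, it is drawn more often, while cold locations are still sampled occasionally.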

It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims

1. An industrial data aggregation method is characterized by comprising the following steps:

the sampling probability model is established based on the data heat of the data point location; the data heat is the ratio of the change frequency of a single data point location to the sum of the change frequencies of all the data point locations; the change frequency of the data point location is the ratio of the change times of the data point location to the synchronization times of the data point location; wherein the random number is a number from 0 to 1.

2. The industrial data aggregation method according to claim 1, wherein the method for acquiring the data heat comprises the following steps:

the data heat of each data point location is the corresponding change frequency obtained by the last data synchronization; the constraint conditions are as follows:

and

3. The industrial data aggregation method according to claim 2, wherein the D2 specifically includes the following steps:

wherein, the dividing conditions are as follows:

s_i is the number of synchronizations of a single data point location, c_i is the number of changes of a single data point location, and ε₁ is the partition threshold.

4. The industrial data aggregation method according to claim 1, wherein the method for establishing the sampling probability model based on the data heat of the data point locations comprises: setting coordinate intervals of all data point positions according to the sampling probability, and sequentially distributing the coordinate intervals on a closed interval [0,1] of a one-dimensional numerical axis to form a sampling probability model; wherein, the sampling probability is the data heat of the data point location.

5. The industrial data aggregation method according to claim 4, wherein the method for determining the collected data point locations by using the random number comprises the following steps:

when i = 0 and R ∈ (0, P₀], the 0th data point location is collected for synchronization; when

the i-th data point location is selected for synchronization.

6. An industrial data aggregation system is characterized by comprising a flow sensing module, a data point location selection module and a data control module, wherein the flow sensing module, the data point location selection module and the data control module are used for sequentially transmitting data;

the flow sensing module is used for identifying whether the difference value between the arrival time interval of two adjacent sampling data and the data uploading period exceeds a congestion threshold value or not, and when the difference value exceeds the congestion threshold value, the data point location selection module is started;

the data point location selection module is used for determining the collected data point location in the sampling probability model by utilizing random numbers based on the current working condition;

7. The industrial data aggregation system according to claim 6, wherein the data point location selection module comprises a heat learning unit and a data synchronization unit;

8. The industrial data aggregation system according to claim 7, wherein the sampling probability model is formed by setting coordinate intervals for all data point positions according to the sampling probability, and sequentially distributing the coordinate intervals on a closed interval [0,1] of a one-dimensional numerical axis; wherein, the sampling probability is the data heat of the data point location.

9. The industrial data aggregation system according to claim 8, wherein the method for determining the collected data point locations by using the random number comprises:

the i-th data point location is selected for synchronization.

10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 5.