CN113238527A - Industrial data aggregation method and system - Google Patents


Publication number
CN113238527A
Authority
CN
China
Prior art keywords
data
data point
synchronization
point location
heat
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011568569.XA
Other languages
Chinese (zh)
Other versions
CN113238527B (en)
Inventor
代超仁
陈吉红
杨建中
冯冰艳
晏嫚
王萧
陈震
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology
Publication of CN113238527A
Application granted
Publication of CN113238527B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B19/00 Programme-control systems
    • G05B19/02 Programme-control systems electric
    • G05B19/18 Numerical control [NC], i.e. automatically operating machines, in particular machine tools, e.g. in a manufacturing environment, so as to execute positioning, movement or co-ordinated operations by means of programme data in numerical form
    • G05B19/19 Numerical control [NC], i.e. automatically operating machines, in particular machine tools, e.g. in a manufacturing environment, so as to execute positioning, movement or co-ordinated operations by means of programme data in numerical form characterised by positioning or contouring control systems, e.g. to control position from one programmed point to another or to control movement along a programmed continuous path
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B2219/00 Program-control systems
    • G05B2219/30 Nc systems
    • G05B2219/35 Nc in input of data, input till input file format
    • G05B2219/35349 Display part, programmed locus and tool path, traject, dynamic locus

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Manufacturing & Machinery (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention provides an industrial data aggregation method and system, belonging to the technical field of industrial big data. The industrial data aggregation method comprises: when the difference between the arrival interval of two adjacent sampled data and the data upload period exceeds a congestion threshold, determining, based on the current working condition, the data point locations to be collected in a sampling probability model by using a random number; and uploading the collected data point locations. The sampling probability model is established based on the data heat of the data point locations; the data heat is the ratio of the change frequency of a single data point location to the sum of the change frequencies of all data point locations; the change frequency of a data point location is the ratio of its number of changes to its number of synchronizations. The invention accommodates both high-frequency synchronization of hot data and low-frequency synchronization of cold data, keeps the digital twin's consumption of network bandwidth under control when an upper limit on network traffic is given, and has high practical engineering value.

Description

Industrial data aggregation method and system
Technical Field
The invention belongs to the technical field of industrial big data, and particularly relates to an industrial data aggregation method and system.
Background
With the development of intelligent manufacturing, the numerical control machine tool has become a principal piece of production equipment, and its digitization, networking and intelligentization are an important support for industrial transformation and upgrading.
The digital twin of a numerical control machine tool is a technology for digitally representing and modeling the machine tool, and has long been a research focus in the field of numerical control machining. At present, data acquisition for numerical control machine tools lacks an efficient and fast data lightweighting algorithm and cannot meet the digital twin's need to aggregate massive amounts of data. In practice, one of two extremes usually results: either features are extracted at the edge and a large amount of useful information is discarded, or large amounts of raw data are uploaded, consuming large amounts of bandwidth and computing resources. Because information integrity and cost economy cannot both be achieved, big data acquisition for numerical control machine tools has become a bottleneck for high-fidelity digital twins.
A numerical control system contains an acquisition module and a communication module; the acquisition module collects the data generated by the numerical control machine tool, and the communication module uploads the collected data. The communication module generally uses Ethernet, which is flexible and simple. Because Ethernet uses CSMA/CD (carrier-sense multiple access with collision detection), data transmission performance degrades rapidly when the network load is heavy, which makes network transmission non-deterministic and leaves the server unable to complete data synchronization for the digital twin. A method for optimizing data synchronization under limited network bandwidth is therefore needed to meet the requirement of synchronous acquisition of digital twin data.
Disclosure of Invention
In view of the defects of the prior art, the invention aims to provide an industrial data aggregation method and system that solve the problem that the server cannot complete data synchronization for the digital twin when the network load of an existing numerical control system is heavy.
In order to achieve the above object, the present invention provides an industrial data aggregation method, comprising the following steps:
when the difference between the arrival interval of two adjacent sampled data and the data upload period exceeds a congestion threshold, determining, based on the current working condition, the data point locations to be collected in a sampling probability model by using a random number; and uploading the collected data point locations;
the sampling probability model is established based on the data heat of the data point locations; the data heat is the ratio of the change frequency of a single data point location to the sum of the change frequencies of all data point locations; the change frequency of a data point location is the ratio of its number of changes to its number of synchronizations; and the random number is a number from 0 to 1.
Preferably, the method for acquiring the data heat comprises the following steps:
D1: initializing the number of changes and the number of synchronizations of each data point location, synchronizing all data point locations under the current working condition, calculating the sum of the change frequencies of all data point locations, and deleting invalid data point locations;
D2: repeatedly synchronizing the valid data point locations, calculating after each synchronization the sum of the change frequencies of all data point locations and the change frequency of each valid data point location, and stopping the synchronization once the end condition is met;
wherein the data heat of each data point location is the corresponding change frequency obtained from the last synchronization; the end condition is:
$$\frac{\left|\bar{F}_{i+1} - \bar{F}_{i}\right|}{\bar{F}_{i}} \le \varepsilon_2$$
where $\bar{F}_{i}$ and $\bar{F}_{i+1}$ respectively denote the average of the sum of the change frequencies of all data point locations in the $i$-th and $(i+1)$-th survey windows, and $\varepsilon_2$ is the end threshold.
Preferably, D2 specifically includes the following steps:
D2.1: repeatedly synchronizing the valid data point locations, and calculating after each synchronization the sum of the change frequencies of all data point locations and the data heat of each valid data point location;
D2.2: dividing the valid data point locations into hot data point locations and cold data point locations according to the data heat of each valid data point location and the division condition;
D2.3: setting each cold data point location to be synchronized once in every k repeated synchronizations, and returning to step D2.1 until the end condition is met, at which point the synchronization stops;
wherein the division condition is:
[Equation image not recovered: division condition expressed in terms of $s_i$, $c_i$ and $\varepsilon_1$]
where $s_i$ denotes the number of synchronizations of a single data point location, $c_i$ denotes its number of changes, and $\varepsilon_1$ is the division threshold.
Preferably, the method for establishing the sampling probability model based on the data heat of the data point location comprises:
setting a coordinate interval for each data point location according to its sampling probability, and arranging the intervals in sequence on the closed interval [0,1] of a one-dimensional number axis to form the sampling probability model; wherein the sampling probability is the data heat of the data point location.
Preferably, the method for determining the collected data point location by using the random number comprises the following steps:
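when $i = 0$ and $R \in (0, P_0]$, the 0th data point location is collected for synchronization; when $i > 0$ and
$$\sum_{j=0}^{i-1} P_j < R \le \sum_{j=0}^{i} P_j$$
the $i$-th data point location is selected for synchronization, where $R$ is the random number and $P_j$ is the sampling probability of the $j$-th data point location.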
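The interval-based selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `build_boundaries` and `select_point` and the example probabilities are hypothetical and do not come from the patent, and the binary search is one convenient way to locate the interval containing the random number.

```python
import bisect
import itertools
import random


def build_boundaries(probabilities: list[float]) -> list[float]:
    """Right endpoints of the consecutive intervals laid out on [0, 1]."""
    boundaries = list(itertools.accumulate(probabilities))
    if abs(boundaries[-1] - 1.0) > 1e-9:
        raise ValueError("sampling probabilities must sum to 1")
    boundaries[-1] = 1.0  # absorb floating-point rounding in the last endpoint
    return boundaries


def select_point(boundaries: list[float], r: float) -> int:
    """Return the index i whose interval contains r.

    bisect_left finds the first endpoint >= r, so the result satisfies
    sum(P[:i]) < r <= sum(P[:i + 1]), matching the half-open intervals
    (0, P0], (P0, P0 + P1], ...
    """
    return bisect.bisect_left(boundaries, r)


# Hypothetical sampling probabilities (data heat) for three point locations.
boundaries = build_boundaries([0.5, 0.3, 0.2])
r = 1.0 - random.random()  # random.random() is in [0, 1), so r is in (0, 1]
chosen = select_point(boundaries, r)
```

Because a hotter data point location owns a wider interval, it is chosen more often, while every location with non-zero heat still has some chance of being synchronized.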
In another aspect, the invention provides an industrial data aggregation system comprising a flow sensing module, a data point location selection module and a data control module, through which data passes in sequence;
the flow sensing module is configured to detect whether the difference between the arrival interval of two adjacent sampled data and the data upload period exceeds a congestion threshold, and to start the data point location selection module when it does;
the data point location selection module is configured to determine, based on the current working condition, the data point locations to be collected in the sampling probability model by using a random number;
the data control module is configured to upload the collected data point locations;
the sampling probability model is established based on the data heat of the data point locations; the data heat is the ratio of the change frequency of a single data point location to the sum of the change frequencies of all data point locations; the change frequency of a data point location is the ratio of its number of changes to its number of synchronizations; and the random number is a number from 0 to 1.
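The congestion test performed by the flow sensing module can be illustrated as follows. This is a minimal sketch: the function name, the use of seconds as the time unit, and the example values are assumptions made for illustration only.

```python
def is_congested(prev_arrival: float, curr_arrival: float,
                 upload_period: float, congestion_threshold: float) -> bool:
    """True when (arrival interval - upload period) exceeds the threshold.

    All arguments are in seconds (an assumption; the source gives no unit).
    """
    interval = curr_arrival - prev_arrival
    return (interval - upload_period) > congestion_threshold


# Hypothetical values: 100 ms upload period, 20 ms congestion threshold.
late = is_congested(0.0, 0.15, upload_period=0.1, congestion_threshold=0.02)
on_time = is_congested(0.0, 0.11, upload_period=0.1, congestion_threshold=0.02)
```

Only when the check returns true would the data point location selection module be started; otherwise all data point locations can continue to be uploaded as usual.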
Preferably, the data point location selection module comprises a heat learning unit and a data synchronization unit;
the heat learning unit is used for calculating the data heat of each data point location based on the current working condition;
the data synchronization unit is configured to establish the sampling probability model from the data heat of each data point location, and also to identify the current working condition and, if the working condition changes, to schedule the heat learning unit to update the data heat of each data point location.
Preferably, the sampling probability model sets a coordinate interval for each data point location according to its sampling probability, and the intervals are arranged in sequence on the closed interval [0,1] of a one-dimensional number axis; wherein the sampling probability is the data heat of the data point location.
Preferably, the method for determining the collected data point location by using the random number comprises the following steps:
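when $i = 0$ and $R \in (0, P_0]$, the 0th data point location is collected for synchronization; when $i > 0$ and
$$\sum_{j=0}^{i-1} P_j < R \le \sum_{j=0}^{i} P_j$$
the $i$-th data point location is selected for synchronization, where $R$ is the random number and $P_j$ is the sampling probability of the $j$-th data point location.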
A computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the industrial data aggregation method provided by the invention.
Through the technical scheme, compared with the prior art, the invention has the following beneficial effects:
The industrial data aggregation method provided by the invention combines data heat with a sampling probability model. When data uploading is congested, the data point locations to be sampled are chosen according to data heat: the higher the data heat, the higher the sampling probability. After multiple rounds of sampling, the sampled data point locations are uploaded to the server.
The invention adopts a heat learning method that can obtain the data heat of each data point location when the working condition of the numerical control machine tool changes, providing the basis for subsequently selecting the data point locations that need synchronization.
By setting each cold data point location to be synchronized once in every k repeated synchronizations, the invention effectively shortens heat learning, so the data heat of each data point location is obtained more quickly.
The invention transmits the data point locations obtained over multiple rounds of sampling in sequence as data packets, which avoids traffic peaks and makes full use of bandwidth resources.
Drawings
FIG. 1 is a graph of the variation of F (t) with time according to the present invention;
FIG. 2 is a schematic view of a survey window provided by the present invention;
FIG. 3 is a schematic diagram of a sampling probability model provided by the present invention;
FIG. 4 is a schematic diagram of an industrial data aggregation system provided by the present invention;
FIG. 5(a) is a schematic diagram of the traffic when all data point locations are synchronized at once, as in the prior art;
FIG. 5(b) is a schematic diagram of the optimized traffic provided by the present invention;
FIG. 6 is a schematic diagram of network occupation provided by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The invention provides an industrial data aggregation method, which comprises the following steps:
when the difference between the arrival interval of two adjacent sampled data and the data upload period exceeds a congestion threshold, determining, based on the current working condition, the data point locations to be collected in a sampling probability model by using a random number; and uploading the collected data point locations;
the sampling probability model is established based on the data heat of the data point locations; the data heat is the ratio of the change frequency of a single data point location to the sum of the change frequencies of all data point locations; the change frequency of a data point location is the ratio of its number of changes to its number of synchronizations; and the random number is a number from 0 to 1.
Preferably, the method for acquiring the data heat of each data point location comprises:
D1: initializing the number of changes and the number of synchronizations of each data point location, synchronizing all data point locations under the current working condition, calculating the sum of the change frequencies of all data point locations, and deleting invalid data point locations;
D2: repeatedly synchronizing the valid data point locations, calculating after each synchronization the sum of the change frequencies of all data point locations and the change frequency of each valid data point location, and stopping the synchronization once the end condition is met;
wherein the data heat of each data point location is the corresponding change frequency obtained from the last synchronization.
Preferably, D2 specifically includes the following steps:
D2.1: repeatedly synchronizing the valid data point locations, and calculating after each synchronization the sum of the change frequencies of all data point locations and the data heat of each valid data point location;
D2.2: dividing the valid data point locations into hot data point locations and cold data point locations according to the data heat of each valid data point location and the division condition;
D2.3: setting each cold data point location to be synchronized once in every k repeated synchronizations, and returning to step D2.1 until the end condition is met, at which point the synchronization stops.
Preferably, the dividing conditions are:
[Equation image not recovered: division condition expressed in terms of $s_i$, $c_i$ and $\varepsilon_1$]
the end condition is:
$$\frac{\left|\bar{F}_{i+1} - \bar{F}_{i}\right|}{\bar{F}_{i}} \le \varepsilon_2$$
where $\bar{F}_{i}$ and $\bar{F}_{i+1}$ respectively denote the average of the sum of the change frequencies of all data point locations in the $i$-th and $(i+1)$-th survey windows; the time length of a single survey window is $L$; a survey window is set every $L/2$; $s_i$ denotes the number of synchronizations of a single data point location and $c_i$ its number of changes; $\varepsilon_1$ is the division threshold; and $\varepsilon_2$ is the end threshold.
Preferably, the method for establishing the sampling probability model based on the data heat of the data point location comprises:
setting a coordinate interval for each data point location according to its sampling probability, and arranging the intervals in sequence on the closed interval [0,1] of a one-dimensional number axis to form the sampling probability model.
Preferably, the method for determining the collected data point location by using the random number comprises the following steps:
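when $i = 0$ and $R \in (0, P_0]$, the 0th data point location is collected for synchronization; when $i > 0$ and
$$\sum_{j=0}^{i-1} P_j < R \le \sum_{j=0}^{i} P_j$$
the $i$-th data point location is selected for synchronization, where $R$ is the random number and $P_j$ is the sampling probability of the $j$-th data point location.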
Preferably, the total number of data point locations acquired for multiple times is:
$$N = \frac{T_u}{T_s} \times count$$
where $N$ is the total number of data point locations acquired; $T_u$ is the data upload period; $T_s$ is the sampling period; and $count$ is the number of data point locations acquired each time.
A computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the industrial data aggregation method provided by the invention.
In another aspect, the present invention provides an industrial data aggregation system comprising a flow sensing module, a data point location selection module and a data control module, through which data passes in sequence;
the flow sensing module is configured to detect whether the difference between the arrival interval of two adjacent sampled data and the data upload period exceeds a congestion threshold, and to start the data point location selection module when it does;
the data point location selection module is configured to determine, based on the current working condition, the data point locations to be collected in the sampling probability model by using a random number;
the data control module is configured to upload the collected data point locations;
the sampling probability model is established based on the data heat of the data point locations; the data heat is the ratio of the change frequency of a single data point location to the sum of the change frequencies of all data point locations; the change frequency of a data point location is the ratio of its number of changes to its number of synchronizations; and the random number is a number from 0 to 1.
Preferably, the data point location selection module comprises a heat learning unit and a data synchronization unit;
the heat learning unit is used for calculating the data heat of each data point location based on the current working condition;
the data synchronization unit is configured to establish the sampling probability model from the data heat of each data point location, and also to identify the current working condition and, if the working condition changes, to schedule the heat learning unit to update the data heat of each data point location.
Preferably, the sampling probability model sets a coordinate interval for each data point location according to its sampling probability, and the intervals are arranged in sequence on the closed interval [0,1] of a one-dimensional number axis; wherein the sampling probability is the data heat of the data point location.
Preferably, the method for determining the collected data point location by using the random number comprises the following steps:
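when $i = 0$ and $R \in (0, P_0]$, the 0th data point location is collected for synchronization; when $i > 0$ and
$$\sum_{j=0}^{i-1} P_j < R \le \sum_{j=0}^{i} P_j$$
the $i$-th data point location is selected for synchronization, where $R$ is the random number and $P_j$ is the sampling probability of the $j$-th data point location.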
Preferably, the method for calculating the data heat of each data point location comprises:
D1: initializing the number of changes and the number of synchronizations of each data point location, synchronizing all data point locations under the current working condition, calculating the sum of the change frequencies of all data point locations, and deleting invalid data point locations;
D2: repeatedly synchronizing the valid data point locations, calculating after each synchronization the sum of the change frequencies of all data point locations and the change frequency of each valid data point location, and stopping the synchronization once the end condition is met;
wherein the data heat of each data point location is the corresponding change frequency obtained from the last synchronization.
Preferably, D2 specifically includes the following steps:
D2.1: repeatedly synchronizing the valid data point locations, and calculating after each synchronization the sum of the change frequencies of all data point locations and the data heat of each valid data point location;
D2.2: dividing the valid data point locations into hot data point locations and cold data point locations according to the data heat of each valid data point location and the division condition;
D2.3: setting each cold data point location to be synchronized once in every k repeated synchronizations, and returning to step D2.1 until the end condition is met, at which point the synchronization stops.
Preferably, the dividing conditions are:
[Equation image not recovered: division condition expressed in terms of $s_i$, $c_i$ and $\varepsilon_1$]
the end condition is:
$$\frac{\left|\bar{F}_{i+1} - \bar{F}_{i}\right|}{\bar{F}_{i}} \le \varepsilon_2$$
where $\bar{F}_{i}$ and $\bar{F}_{i+1}$ respectively denote the average of the sum of the change frequencies of all data point locations in the $i$-th and $(i+1)$-th survey windows; the time length of a survey window is $L$; a survey window is set every $L/2$; $s_i$ denotes the number of synchronizations of a single data point location and $c_i$ its number of changes; $\varepsilon_1$ is the division threshold; and $\varepsilon_2$ is the end threshold.
Examples
The specific principle of the invention is as follows:
During machining operation of the numerical control system, the data change frequency has the following characteristics: (1) attribute data is essentially fixed; (2) parameter data and task data change infrequently; (3) control data is tied to specific events, and its change period is not fixed; (4) state data changes most frequently. The invention proposes the concept of "data heat" as the reference for the synchronization priority of data point locations. Data heat refers to how actively a given data point location changes relative to all synchronized data point locations. In the ideal case, the data heat is:
$$h_i = \frac{f_i}{\sum_{j} f_j}, \qquad f_i = \frac{1}{\Delta t_i}$$
where $h_i$ is the data heat; $f_i$ is the change frequency of the data point location; $\Delta t_i$ is the time interval of the data point location; and $i$ is the data index.
In actual production and machining, a numerical control machine tool has a very large number of data point locations, all of them can hardly be synchronized at the same time, and the true value of $\Delta t_i$ cannot be obtained. The data heat is therefore redefined as:
$$h'_i = \frac{c_i / s_i}{\sum_{j} c_j / s_j} \tag{1}$$
where $h'_i$ is the data heat of the $i$-th data point location as defined by the invention; $s_i$ is the number of synchronizations of a single data point location; $c_i$ is the number of changes of a single data point location; $s_i$ and $c_i$ are both initialized to 0; $s_i$ is incremented by 1 each time the data point location is synchronized, and $c_i$ is incremented by 1 each time its value changes. It follows that when the sample size is very large (that is, when the number of synchronizations of a single data point location is very large), the sample change frequency $c_i / s_i$ approximately equals the actual change frequency $f_i$, and $h'_i$ can then be considered equal to $h_i$.
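To make formula (1) concrete, the following Python sketch computes the data heat from per-location change and synchronization counters. The function names, the point names and the counter values are hypothetical and chosen only for illustration.

```python
def change_frequency(changes: int, syncs: int) -> float:
    """Change frequency of one data point location: changes / synchronizations."""
    if syncs <= 0:
        raise ValueError("syncs must be positive")
    return changes / syncs


def data_heat(counters: dict[str, tuple[int, int]]) -> dict[str, float]:
    """Map each point name to its heat: own frequency / sum of all frequencies."""
    freqs = {name: change_frequency(c, s) for name, (c, s) in counters.items()}
    total = sum(freqs.values())
    if total == 0:
        raise ValueError("no data point location has changed")
    return {name: f / total for name, f in freqs.items()}


# Hypothetical counters: name -> (change count c_i, synchronization count s_i).
counters = {"spindle_load": (80, 100), "feed_rate": (15, 100), "tool_id": (5, 100)}
heat = data_heat(counters)
```

With these example counters the heat values come out at roughly 0.8, 0.15 and 0.05, and they sum to 1, so they can be used directly as sampling probabilities.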
As formula (1) shows, $h'_i$ can stand in for $h_i$ only once the number of synchronizations of the data point locations reaches a certain level; that is, obtaining the data heat of all data point locations takes time. The data point location selection module is therefore divided into a heat learning unit and a data synchronization unit.
The principle of the heat learning unit is as follows:
As formula (1) shows, the data heat depends on the number of synchronizations $s_i$ and the number of changes $c_i$ of the data point location, so a large number of synchronization tests should be performed on all data point locations of the numerical control system to obtain the actual change frequency of each data point location, from which the corresponding data heat is finally calculated.
Because synchronizing all data point locations many times takes time, the sample change frequency (i.e., the recorded change frequency) of each data point location keeps changing and eventually stabilizes.
To determine whether the sample change frequencies of all data point locations have stabilized, the sum of the change frequencies of all data point locations is used as the criterion. Formula (2) gives the sum of the change frequencies of all data point locations as a function of time during synchronization.
$$F(t) = \sum_{i} f_i(t) \tag{2}$$
where $t$ is time; $i$ is the index of the data point location; $F(t)$ is the sum of the change frequencies of all data point locations at time $t$; and $f_i(t)$ is the change frequency of the $i$-th data point location at time $t$. The initial values of the number of synchronizations and the number of changes of all data point locations are set to 1, so the initial change frequency of every data point location is 1; after each synchronization, the number of synchronizations and the number of changes of each data point location are recorded and the corresponding change frequency is calculated. The trend of $F(t)$ over time is shown in FIG. 1. As the number of synchronizations increases, the change frequency of a single data point location gradually decreases or stays unchanged, so the sum of the change frequencies of all data point locations keeps decreasing; when the change frequency of each single data point location stabilizes, the sum of the change frequencies of all data point locations stabilizes as well.
The learning method of the heat learning unit is as follows:
Because a numerical control system is designed to serve many kinds of numerical control machine tools, a large number of data point locations are provided for the convenience of user development or extension. Only some of them are used in actual production and machining; the values of the remaining extension point locations are always empty, and synchronizing them consumes network and memory resources to no practical purpose. The invention therefore calls the empty extension point locations "invalid data point locations" and the point locations actually used "valid data point locations".
The heat learning method comprises the following steps:
(1) setting the number of changes and the number of synchronizations of each data point location to 1, synchronizing all data point locations, and calculating the sum of the change frequencies of all data point locations;
(2) deleting the invalid data point locations;
(3) cyclically synchronizing the valid data point locations, recording the number of synchronizations and the number of changes of each data point location, and recording the sum of the change frequencies of all data point locations and the data heat of each data point location;
(4) setting a learning end condition, and ending the cyclic synchronization when the sum of the change frequencies of all data point locations has stabilized to the point that the learning end condition is met.
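The four learning steps above can be sketched as a small loop. This is an illustrative sketch, not the patented implementation: `learn_heat`, `read_point` and `fake_read` are hypothetical names, an empty value is modeled as `None`, and a fixed number of rounds stands in for the end condition of step (4).

```python
import itertools
from typing import Callable, Optional


def learn_heat(read_point: Callable[[str], Optional[float]],
               point_names: list[str], rounds: int) -> dict[str, float]:
    """Run `rounds` synchronization cycles and return the learned data heat."""
    # Step (1): synchronize every point once; counters start at 1.
    last = {name: read_point(name) for name in point_names}
    # Step (2): drop invalid points, modeled here as points whose value is None.
    valid = [name for name in point_names if last[name] is not None]
    changes = {name: 1 for name in valid}
    syncs = {name: 1 for name in valid}
    # Step (3): cyclic synchronization of the valid points.
    for _ in range(rounds):
        for name in valid:
            value = read_point(name)
            syncs[name] += 1
            if value != last[name]:
                changes[name] += 1
                last[name] = value
    # Heat is each point's change frequency over the sum of all frequencies.
    freqs = {name: changes[name] / syncs[name] for name in valid}
    total = sum(freqs.values())
    return {name: f / total for name, f in freqs.items()}


# Hypothetical data source: one changing point, one constant, one unused.
_ticks = itertools.count()


def fake_read(name: str) -> Optional[float]:
    if name == "unused":
        return None
    if name == "position":
        return float(next(_ticks))  # a new value on every read
    return 42.0  # "mode" never changes


heat = learn_heat(fake_read, ["position", "mode", "unused"], rounds=9)
```

In this toy run the unused point is removed, the constantly changing point ends with a change frequency of 1.0, and the constant point ends at 0.1, so almost all of the heat goes to the changing point.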
Handling the change frequency of cold data:
A numerical control system contains a large amount of data that changes rarely or essentially not at all; such data is called cold data. During data heat learning, the change count $c_i$ of a cold datum stays at 1 or grows very little, while its synchronization count $s_i$ keeps increasing, so its change frequency $c_i/s_i$ gradually tends to 0. Because there is so much cold data, two problems arise:
(1) the sum of the change frequencies of all cold data point locations levels off only very slowly;
(2) the change frequency of cold data comes arbitrarily close to 0, so its data heat is also close to 0, and when data point locations are selected for synchronization by sampling on data heat, the probability of synchronizing cold data is very small.
Therefore, to speed up data heat learning and to guarantee that cold data keeps a certain data heat, a lower limit must be set on the change frequency of cold data. Each single cold datum is set to be synchronized once in every k synchronizations. The synchronization frequency of a single cold datum is then obtained as:
Figure BDA0002861769860000101
Rearranging gives:
Figure BDA0002861769860000102
where $i$ is the data index, $m$ is the number of hot data, $n$ is the number of cold data, $f_i^{cold}$ is the change frequency set for cold data, and $f_i^{hot}$ is the change frequency of hot data.
During data heat learning, once the hot data (change count $c_i > 1$) tends to be stable, the change frequency of the cold data (change count $c_i = 1$) is set to the value calculated by
Figure BDA0002861769860000112
With this setting, the sum of the change frequencies of all data point locations quickly becomes stable.
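The effect of fixing the cold-data change frequency can be illustrated with a short sketch. It assumes the lower bound is supplied by the caller as `f_cold`; in the patent that value comes from the formula for the cold-data change frequency, which this sketch does not reproduce. The function name `floor_cold_frequencies` is hypothetical.

```python
from typing import Dict


def floor_cold_frequencies(changes: Dict[str, int],
                           syncs: Dict[str, int],
                           f_cold: float) -> Dict[str, float]:
    """Return change frequencies c_i / s_i, with cold data set to f_cold.

    A point location is treated as cold when its change count is still 1,
    following the text's convention (hot: c_i > 1, cold: c_i = 1).
    f_cold is an assumed, caller-supplied value.
    """
    freqs = {}
    for p, c in changes.items():
        f = c / syncs[p]
        freqs[p] = f_cold if c == 1 else f
    return freqs
```

Without the fixed value, a point location that never changes would have frequency 1/1000 after 1000 synchronizations and 1/100000 after 100000, so its contribution to the sum would keep drifting toward 0.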
End condition:
Data heat learning is terminated when the overall data heat of the numerical control machine tool tends to be stable. To obtain the termination condition, the invention introduces an observation window (shown as the solid-line and dashed-line rectangles in fig. 2) of length L, and calculates the average of the sum of all data change frequencies over the time L. A new observation window is set every L/2, and the averages calculated in two successive observation windows are compared. If their relative error is within the end threshold $\varepsilon_2$, the heat of all data point locations is considered stable and data heat learning is terminated.
The end condition is as follows:
Figure BDA0002861769860000113
wherein,
Figure BDA0002861769860000114
and
Figure BDA0002861769860000115
respectively denote the average of the sum of the change frequencies of all data point locations in the i-th and (i+1)-th observation windows; each observation window has time length L; a new observation window is set every L/2; and $\varepsilon_2$ is the end threshold.
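The overlapping-window check can be sketched as below. The sketch makes several assumptions: the sum of change frequencies is sampled once per synchronization round, the window length is given as an even number of samples, and the relative error is taken with respect to the earlier of the two windows. The function name `learning_end_window` is hypothetical.

```python
from typing import Optional, Sequence


def learning_end_window(totals: Sequence[float], window: int,
                        eps2: float) -> Optional[int]:
    """Return the index i of the first window whose successor agrees with it.

    totals : sum of all change frequencies, one sample per synchronization round
    window : window length L in samples (assumed even); windows start every L/2
    eps2   : end threshold on the relative error between consecutive windows
    """
    half = window // 2
    means = []
    start = 0
    while start + window <= len(totals):
        chunk = totals[start:start + window]
        means.append(sum(chunk) / window)
        start += half
    for i in range(len(means) - 1):
        # Assumption: the relative error is measured against window i.
        if abs(means[i + 1] - means[i]) <= eps2 * abs(means[i]):
            return i
    return None
```

A return value of `None` means that no pair of consecutive windows has met the threshold yet, so learning would continue.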
The data synchronization principle is as follows:
Because the data point locations of a numerical control machine tool change at different frequencies, they should be prioritized when selected for synchronization: the more often a data point location changes, the more often it should be synchronized.
The data heat expresses how active a single data point location is relative to all data point locations. The higher the data heat of a single data point location, the higher its change frequency; the lower its data heat, the lower its change frequency. The data heat can therefore be treated as the probability that a single data point location is synchronized. After a large number of probabilistic selections, the synchronization frequency of each data point location approaches its change frequency. This guarantees timely synchronization of data point locations that change at high frequency, while those that change at low frequency are still synchronized from time to time.
By
Figure BDA0002861769860000121
it can be seen that the sampling probability can be set to $P_i = h'_i$, and data point locations are selected for synchronization according to $P_i$. Take the closed interval [0,1] on a one-dimensional axis and distribute all data point locations over it according to their probabilities $P_i$, so that the sampling probability $P_i$ of the i-th data point location is the length of the sub-interval assigned to that point location (as shown in fig. 3, where $P_1$ and $P_2$ denote the data heat of data point locations). A program generates a random number $R$ in the interval [0,1]. When $i = 0$ and $R \in (0, P_0]$, the 0-th data point location is selected for synchronization; when $i > 0$ and $\sum_{j=0}^{i-1} P_j < R \le \sum_{j=0}^{i} P_j$, the i-th data point location is selected for synchronization.
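This interval lookup is the familiar roulette-wheel selection. A minimal sketch follows, assuming the probabilities are given as a list that sums to 1; the function name `select_point` is hypothetical. `bisect.bisect_left` on the cumulative sums returns the first index whose cumulative sum is at least R, which matches a half-open sub-interval that excludes its lower end and includes its upper end.

```python
import bisect
import itertools
import random
from typing import Optional, Sequence


def select_point(probs: Sequence[float], r: Optional[float] = None) -> int:
    """Pick index i such that sum(P_0..P_{i-1}) < R <= sum(P_0..P_i)."""
    cumulative = list(itertools.accumulate(probs))
    if r is None:
        r = random.random()               # uniform in [0, 1)
    i = bisect.bisect_left(cumulative, r)
    # Guard against floating-point rounding in the last cumulative sum.
    return min(i, len(probs) - 1)
```

Over many calls, each index is returned with a relative frequency close to its probability, which is the property the synchronization principle relies on.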
According to the sampling probability model, the higher the data heat of a data point location, the greater its probability of being synchronized; the lower the data heat, the smaller that probability. Because the digital twin synchronizes data at a very high rate, the synchronization frequency of data point locations that change at high frequency is close to their change frequency. Data point locations that change at low frequency still receive a certain number of synchronizations over a large sample, that is, over a very large number of selections. In practice, low-frequency data changes so rarely that the synchronizations obtained under the probability model are fully sufficient to keep the data on the numerical control system side consistent with the digital twin.
Overview of the industrial data aggregation system:
As shown in fig. 4, the industrial data aggregation system is divided into three parts. The flow sensing module senses network traffic from the arrival time interval of two adjacent sampled data; the flow control module splits the data point locations into equal data packets and transmits them so as to balance the traffic; the data point location selection module takes the change count of each data point location as its heat and preferentially selects data point locations with high heat.
The flow sensing module:
Because instruction-domain big data is sampled at the highest frequency, it has the largest data volume of all sampling channels. The sampling channel carrying instruction-domain big data is therefore used as the basis for flow sensing. The time interval between the arrivals of two adjacent sampled data is calculated as:
Figure BDA0002861769860000123
where $\Delta T_N$ is the current time interval, $\Delta T_{N-1}$ is the previous time interval, $T_N$ is the arrival time of the current sampled data, and $T_{N-1}$ is the arrival time of the previous sampled data; $\Delta T_N$ and $T_{N-1}$ are in milliseconds.
Under good network conditions, with an agreed internal sampling period $T_s$ of the numerical control system and a data upload period $T_u$, one should have $\Delta T_N \approx T_u$. When $\Delta T_N \gg T_u$, that is, when the average arrival interval of sampled data is far longer than the specified data upload period, the data transmission network is congested, and digital twin data synchronization should be stopped to reduce the network load.
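A congestion check along these lines might look like the sketch below. It is an illustration under assumptions: the interval is taken as the plain difference between consecutive arrival times (the patent's own formula may differ, for example by averaging), and congestion is declared when the interval exceeds the upload period by more than a congestion threshold, following the wording of the claims. The function name `congestion_flags` and its parameters are hypothetical.

```python
from typing import List, Sequence


def congestion_flags(arrivals_ms: Sequence[float], upload_period_ms: float,
                     congestion_threshold_ms: float) -> List[bool]:
    """Flag each arrival whose gap to the previous one exceeds T_u by more than the threshold."""
    flags = []
    for prev, cur in zip(arrivals_ms, arrivals_ms[1:]):
        interval = cur - prev  # assumed: interval = T_N - T_{N-1}
        flags.append(interval - upload_period_ms > congestion_threshold_ms)
    return flags
```

With a 200 ms upload period and a 100 ms threshold, arrivals at 0, 200, 405, 1000 and 1200 ms give intervals of 200, 205, 595 and 200 ms, so only the third interval is flagged.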
The flow control module:
A numerical control machine tool has a great many digital twin data point locations, whereas the sampling channel is a scarce resource that carries only a very small share of all data point locations. As shown in fig. 5(a), synchronizing all data point locations of the digital twin in a single pass causes a network traffic peak, and data synchronization of the digital twin cannot be achieved. Fig. 5(b) illustrates the network traffic optimization: a large data packet is split into smaller data packets that are transmitted in sequence, which avoids the traffic peak and makes full use of bandwidth resources.
To prevent network collisions, the sampling channel is designed around a "collect many times, upload once" mechanism. This mechanism avoids network collisions but leaves network bandwidth idle. Assuming that the upload period of the sampling channel is 200 ms, the transmission network may be idle while the numerical control system is collecting data internally (as shown in fig. 6).
Uploading the data in the sampling channel takes up only a small part of an upload period, so non-sampled data can be synchronized in the idle intervals of the network, which improves the utilization of the network channel.
The invention designs a sampling channel to carry instruction-domain big data, with an agreed sampling period $T_s = 1$ ms inside the numerical control system and a data upload period $T_u = 200$ ms. A sampling channel contains count = 16 data point locations, and the total number of data in one sampling channel is:
$D_{sample} = \frac{T_u}{T_s} \times count$
where $T_u$ is the data upload period, $T_s$ is the sampling period, and count is the number of data point locations acquired in each sampling.
Because Ethernet has a collision detection mechanism, splitting data packets too small increases the number of retransmissions and defeats the purpose of the optimization. The number of points $D_{sample}$ in the sampling channel is converted into network traffic $V_{sample}$ (the instruction-domain sampling channel carries only floating-point values, each occupying 8 bytes, that is, 64 bits). Given a bandwidth limit threshold $V_{threshold}$, we have:
Figure BDA0002861769860000141
where Package is the number of data packets synchronized per second. The requests for the split data packets are each issued as asynchronous calls, so sending and receiving can share the same time slice, which achieves the network optimization.
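Splitting a synchronization into smaller packets and issuing the requests asynchronously can be sketched with Python's `asyncio`. Everything here is illustrative: `split_into_packets`, `sync_packets` and `fake_send` are hypothetical names, and `fake_send` only stands in for a real network request.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


def split_into_packets(points: Sequence[str], packet_size: int) -> List[List[str]]:
    """Cut the list of data point locations into packets of at most packet_size."""
    return [list(points[i:i + packet_size]) for i in range(0, len(points), packet_size)]


async def sync_packets(packets: Sequence[List[str]],
                       send: Callable[[List[str]], Awaitable[int]]) -> List[int]:
    """Issue one asynchronous request per packet and wait for all of them."""
    return await asyncio.gather(*(send(p) for p in packets))


async def fake_send(packet: List[str]) -> int:
    """Placeholder for a network call; returns the number of points it was given."""
    await asyncio.sleep(0)
    return len(packet)
```

`asyncio.gather` returns the results in the order the requests were passed in, so the caller can match each result to its packet.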
Taking a three-axis vertical machining center as an example, the designed instruction-domain big data acquisition channel contains 16 kinds of data, such as spindle speed, spindle load power, the commanded position, actual position and load current of the moving axes, G-code line number, program number, subprogram number and machine tool state. With an internal sampling period of 1 ms and an upload period of 200 ms, the size of one sampling channel data packet is:
$V_{sample} = 16 \times 200 \times 64 = 204800$ bit $= 200$ Kbit (25 KB)
If a machine tool is allocated a bandwidth of 2 Mbit/s, then
Figure BDA0002861769860000142
From the above equation, under a 2M bandwidth limit the machine tool can synchronize about 10 data packets per second, that is, about 32000 data points.
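The arithmetic of this example can be checked with a few lines of code. The sketch assumes that "2M" means 2 × 1024 × 1024 bit/s and that the packet rate is simply the bandwidth divided by the packet size; the function name `packets_per_second` is hypothetical.

```python
from typing import Tuple


def packets_per_second(t_upload_ms: int, t_sample_ms: int, count: int,
                       bits_per_value: int,
                       bandwidth_bits_per_s: int) -> Tuple[int, int, float]:
    """Return (points per packet, bits per packet, packets per second)."""
    points_per_packet = (t_upload_ms // t_sample_ms) * count   # D_sample
    packet_bits = points_per_packet * bits_per_value           # V_sample
    return points_per_packet, packet_bits, bandwidth_bits_per_s / packet_bits


points, bits, rate = packets_per_second(200, 1, 16, 64, 2 * 1024 * 1024)
print(points, bits, round(rate, 2))
```

Under these assumptions each packet holds 3200 data points in 204800 bits, and the rate comes to a little over 10 packets per second, which agrees with the figures of about 10 packets and about 32000 data points per second.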
Compared with the prior art, the invention has the following advantages:
The industrial data aggregation method provided by the invention combines the data heat with a sampling probability model. When data upload is congested, the data point locations to be sampled are chosen according to the data heat: the higher the data heat, the greater the sampling probability. After multiple samplings, the sampled data point locations are uploaded to the server.
The invention adopts a heat learning method, so that the data heat of each data point location can be acquired when the working condition of the numerical control machine tool changes, which provides solid technical support for subsequently determining the data point locations that need data synchronization.
The invention sets each cold data point location to be synchronized once in every k repeated synchronizations, which effectively shortens data heat learning and makes it possible to obtain the data heat of each data point location more quickly.
The invention transmits the data point locations obtained by multiple samplings in sequence in the form of data packets, which both avoids traffic peaks and makes full use of bandwidth resources.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. An industrial data aggregation method is characterized by comprising the following steps:
when the difference value between the time interval of the arrival of the two adjacent sampling data and the data uploading period exceeds a congestion threshold, determining the collected data point position in a sampling probability model by using a random number based on the current working condition; uploading the collected data point positions;
the sampling probability model is established based on the data heat of the data point location; the data heat is the ratio of the change frequency of a single data point location to the sum of the change frequencies of all the data point locations; the change frequency of the data point location is the ratio of the change times of the data point location to the synchronization times of the data point location; wherein the random number is a number from 0 to 1.
2. The industrial data aggregation method according to claim 1, wherein the method for acquiring the data heat comprises the following steps:
D1, initializing the change count and the synchronization count of each data point location, synchronizing all data point locations under the current working condition, calculating the sum of the change frequencies of all data point locations, and deleting invalid data point locations;
D2, repeatedly synchronizing the valid data point locations, calculating after each data synchronization the sum of the change frequencies of all data point locations and the change frequency of each single valid data point location, and stopping data synchronization once the end condition is met;
the data heat of each data point location is the corresponding change frequency obtained by the last data synchronization; the constraint condition is as follows:
Figure FDA0002861769850000011
Figure FDA0002861769850000012
and
Figure FDA0002861769850000013
respectively denote the average of the sum of the change frequencies of all data point locations in the i-th and (i+1)-th observation windows; $\varepsilon_2$ is the end threshold.
3. The industrial data aggregation method according to claim 2, wherein the D2 specifically includes the following steps:
D2.1, repeatedly synchronizing the valid data point locations, and calculating after each data synchronization the sum of the change frequencies of all data point locations and the data heat of each single valid data point location;
D2.2, dividing the valid data point locations into hot data point locations and cold data point locations according to the data heat of each single valid data point location and the dividing condition;
D2.3, setting each single cold data point location to be synchronized once in every k repeated synchronizations, and returning to step D2.1 until the end condition is met, whereupon data synchronization is stopped;
wherein, the dividing conditions are as follows:
Figure FDA0002861769850000021
where $s_i$ is the synchronization count of a single data point location, $c_i$ is the change count of a single data point location, and $\varepsilon_1$ is the partition threshold.
4. The industrial data aggregation method according to claim 1, wherein the method for establishing the sampling probability model based on the data heat of the data point locations comprises: setting coordinate intervals of all data point positions according to the sampling probability, and sequentially distributing the coordinate intervals on a closed interval [0,1] of a one-dimensional numerical axis to form a sampling probability model; wherein, the sampling probability is the data heat of the data point location.
5. The industrial data aggregation method according to claim 4, wherein the method for determining the collected data point location by using the random number comprises:
when $i = 0$ and $R \in (0, P_0]$, selecting the 0-th data point location for synchronization; when $i > 0$ and $\sum_{j=0}^{i-1} P_j < R \le \sum_{j=0}^{i} P_j$, selecting the i-th data point location for synchronization.
6. An industrial data aggregation system is characterized by comprising a flow sensing module, a data point location selection module and a data control module, wherein the flow sensing module, the data point location selection module and the data control module are used for sequentially transmitting data;
the flow sensing module is used for identifying whether the difference value between the arrival time interval of two adjacent sampling data and the data uploading period exceeds a congestion threshold value or not, and when the difference value exceeds the congestion threshold value, the data point location selection module is started;
the data point location selection module is used for determining the collected data point location in the sampling probability model by utilizing random numbers based on the current working condition;
the data control module is used for uploading the collected data point positions;
the sampling probability model is established based on the data heat of the data point location; the data heat is the ratio of the change frequency of a single data point location to the sum of the change frequencies of all the data point locations; the change frequency of the data point location is the ratio of the change times of the data point location to the synchronization times of the data point location; wherein the random number is a number from 0 to 1.
7. The industrial data aggregation system according to claim 6, wherein the data point location selection module comprises a heat learning unit and a data synchronization unit;
the heat learning unit is used for calculating the data heat of each data point location based on the current working condition;
the data synchronization unit is used for establishing a sampling probability model by using the data heat of each data point location; it also identifies the current working condition and, if the working condition changes, schedules the heat learning unit to update the data heat of each data point location.
8. The industrial data aggregation system according to claim 7, wherein the sampling probability model is formed by setting coordinate intervals for all data point positions according to the sampling probability, and sequentially distributing the coordinate intervals on a closed interval [0,1] of a one-dimensional numerical axis; wherein, the sampling probability is the data heat of the data point location.
9. The industrial data aggregation system according to claim 8, wherein the method for determining the collected data point location by using the random number comprises:
when $i = 0$ and $R \in (0, P_0]$, selecting the 0-th data point location for synchronization; when $i > 0$ and $\sum_{j=0}^{i-1} P_j < R \le \sum_{j=0}^{i} P_j$, selecting the i-th data point location for synchronization.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 5.
CN202011568569.XA 2020-12-19 2020-12-25 Industrial data aggregation method and system Active CN113238527B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011509812 2020-12-19
CN2020115098120 2020-12-19

Publications (2)

Publication Number Publication Date
CN113238527A true CN113238527A (en) 2021-08-10
CN113238527B CN113238527B (en) 2022-04-08

Family

ID=77129948

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011568569.XA Active CN113238527B (en) 2020-12-19 2020-12-25 Industrial data aggregation method and system

Country Status (1)

Country Link
CN (1) CN113238527B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102799534A (en) * 2012-07-18 2012-11-28 上海宝存信息科技有限公司 Storage system and method based on solid state medium and cold-hot data identification method
CN103106151A (en) * 2011-11-15 2013-05-15 Lsi公司 Apparatus to manage efficient data migration between tiers
US20160011971A1 (en) * 2014-07-14 2016-01-14 Jae-Il Lee Storage medium, memory system, and method of managing storage area in memory system
CN105959356A (en) * 2016-04-26 2016-09-21 华中科技大学 Method of realizing multi-cloud storage fault-tolerance conversion mechanism
CN106201906A (en) * 2016-07-11 2016-12-07 浪潮(北京)电子信息产业有限公司 A kind of cold and hot data separation method for flash memory and system
CN107844269A (en) * 2017-10-17 2018-03-27 华中科技大学 A kind of layering mixing storage system and method based on uniformity Hash
CN109358821A (en) * 2018-12-12 2019-02-19 山东大学 A kind of cold and hot data store optimization method of cloud computing of cost driving

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115002048A (en) * 2022-05-31 2022-09-02 珠海格力电器股份有限公司 Data transmission method and device, electronic equipment and storage medium
CN115002048B (en) * 2022-05-31 2023-09-12 珠海格力电器股份有限公司 Data transmission method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN113238527B (en) 2022-04-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant