CN115659284B - Big data fusion platform - Google Patents

Info

Publication number
CN115659284B
Authority
CN
China
Prior art keywords
distribution
value
periodic
difference
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211679727.8A
Other languages
Chinese (zh)
Other versions
CN115659284A (en
Inventor
杜秀明
郭红亮
杜玛睿
杜鹏飞
尚华
胡丽莎
刘军池
万海龙
王磊
姚慧娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hebei Xinlong Technology Group Co ltd
Shijiazhuang Xinlong Software Technology Co ltd
Original Assignee
Shijiazhuang Xinlong Software Technology Co ltd
Hebei Xinlong Technology Group Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shijiazhuang Xinlong Software Technology Co ltd, Hebei Xinlong Technology Group Co ltd filed Critical Shijiazhuang Xinlong Software Technology Co ltd
Priority to CN202211679727.8A priority Critical patent/CN115659284B/en
Publication of CN115659284A publication Critical patent/CN115659284A/en
Application granted granted Critical
Publication of CN115659284B publication Critical patent/CN115659284B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Abstract

The invention relates to the technical field of data fusion management, and in particular to a big data fusion platform. A data acquisition module acquires a sensor data sequence and each sensor data in the sensor data sequence. A period acquisition module obtains the distribution trend value, the periodic distribution value, the residual error and the period corresponding to the sensor data sequence. A distribution difference characteristic acquisition module obtains the periodic distribution difference degree and the distribution trend difference degree from the distribution trend value, the periodic distribution value and the period. A fusion weight value acquisition module obtains, from the periodic distribution difference degree, the distribution trend difference degree and the residual error, a fusion weight value for each sensor data in the sensor data sequence at each time point. Data fusion is completed according to the fusion weight values, and the result is stored in the big data fusion platform for storage management.

Description

Big data fusion platform
Technical Field
The invention relates to the technical field of data fusion management, in particular to a big data fusion platform.
Background
In the information age, industrial production equipment generates a large amount of data during operation, but only a small proportion of that data carries useful information. Many enterprises therefore build a big data fusion platform, collect the equipment data, transmit it to the platform, and screen out the effective data there. In the prior art, big data are fused by a weighted summation of the same kind of data from multiple channels at the same moment.
The inventors have found in practice that the above prior art has the following drawbacks:
In the prior art, big data are fused by a weighted summation of the same kind of data from multiple channels at the same moment. Because big data may contain a great deal of inefficient or even invalid information, a simple weighted summation that ignores the actual data characteristics yields relatively inefficient data information and hence a low data utilization rate. In addition, because the traditional weighted-summation method considers only the data at a single time point, it adapts poorly when data with different characteristics are fused. In short, fusing data by weighted summation alone has a low data utilization rate and poor adaptability.
Disclosure of Invention
In order to solve the technical problems of low data utilization rate and poor adaptability of a method for carrying out data fusion by carrying out weighted summation on data, the invention aims to provide a big data fusion platform, and the adopted technical scheme is as follows:
the invention provides a big data fusion platform, which comprises:
the data acquisition module is used for acquiring a sensor data sequence of each sensor on the production equipment in a preset sampling period according to a preset sampling frequency; the sensors are the same in type and different in position;
the period acquisition module is used for obtaining a distribution trend value, a period distribution value and a residual error of the sensor data sequence according to the sensor data sequence through a time sequence decomposition algorithm, and obtaining the period of the sensor data sequence according to the period distribution value;
the distribution difference characteristic acquisition module is used for obtaining the periodic distribution difference degree according to the periodic distribution values of the corresponding positions among different periods, and obtaining the distribution trend difference degree according to the fluctuation of the distribution trend value in each period;
the fusion weight value acquisition module is used for obtaining the measurement distribution characteristic at each sampling time point in each sensor data sequence through weighted summation of the periodic distribution difference degree, the distribution trend difference degree and the residual error; and obtaining a fusion weight value for each sensor data sequence at each sampling time point according to the difference between the measurement distribution characteristics, and carrying out data fusion on the different sensor data sequences according to the fusion weight values.
Further, the method for acquiring the period of the sensor data sequence includes:
iteration of a preset step length is carried out according to a preset time interval value, and a correlation degree value of a periodic distribution curve corresponding to the time interval value after each iteration is calculated according to the periodic distribution value through a periodic distribution curve correlation degree model; and taking the time interval between adjacent peaks of the correlation degree value in the iterative process as the period length, and obtaining each period of the sensor data sequence.
Further, the method for obtaining the correlation degree value of the periodic distribution curve comprises the following steps:
calculating a correlation degree value of the periodic distribution curve according to the time interval value and the periodic distribution value through a periodic distribution curve correlation degree model, wherein the periodic distribution curve correlation degree model comprises the following components:
wherein $R(\tau)$ is the correlation degree value of the periodic distribution curve, $\tau$ is the time interval value, $S_t$ is the periodic distribution value at the $t$-th time point within the preset sampling period, $S_{t+\tau}$ is the periodic distribution value at the $(t+\tau)$-th time point within the preset sampling period, $\bar{S}$ is the average of the periodic distribution values within the preset sampling period, and $N$ is the number of sampling time points in the preset sampling period.
Further, the method for obtaining the periodic distribution difference degree comprises the following steps:
calculating the difference accumulation sum of the periodic distribution values at the corresponding time points of each period before the target time point and the target time point, and recording the difference accumulation sum as the previous periodic distribution difference degree; calculating the difference accumulation sum of the periodic distribution values at the corresponding time points of each period after the target time point and the target time point, and recording the difference accumulation sum as the post-periodic distribution difference degree; taking the sum of the front periodic distribution difference degree and the rear periodic distribution difference degree as the periodic distribution difference degree corresponding to the target time point;
and changing the target time point to obtain the periodic distribution difference degree corresponding to all the time points.
Further, the method for obtaining the distribution trend difference degree comprises the following steps:
and fitting a distribution trend curve according to the distribution trend values; for each time point other than the first, calculating the slope of the distribution trend curve between that time point and the previous time point and recording it as a distribution trend slope value; and taking the variance of the distribution trend slope values in each period as the distribution trend difference degree.
Further, the method for acquiring the measurement distribution characteristics comprises the following steps:
constructing a three-dimensional data coordinate system according to the periodic distribution difference degree, the distribution trend difference degree and the residual error corresponding to each time point as three dimensions of the three-dimensional coordinate system, and obtaining measurement distribution characteristics of each three-dimensional data coordinate point through weighted summation according to all three-dimensional data coordinate points corresponding to each time point in the three-dimensional data coordinate system, wherein the weighted summation comprises:
taking the target periodic distribution difference degree, the target distribution trend difference degree and the target residual error that correspond to a target three-dimensional data coordinate point; calculating the product of the target periodic distribution difference degree and a preset periodic distribution difference degree weight and recording it as the target periodic distribution difference; calculating the product of the target distribution trend difference degree and a preset distribution trend difference degree weight and recording it as the target distribution trend difference; calculating the product of the target residual error and a preset residual error weight and recording it as the target residual error value; and taking the sum of the target periodic distribution difference, the target distribution trend difference and the target residual error value as the measurement distribution characteristic corresponding to the target three-dimensional data coordinate point;
and changing the target three-dimensional data coordinate points to obtain all measurement distribution characteristics corresponding to all the three-dimensional data coordinate points.
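As an illustration only, the weighted summation described above can be sketched in Python as follows. The function name `measurement_feature` and the weight values 0.4, 0.4 and 0.2 are assumptions made for this sketch; the text does not state the preset weights.

```python
def measurement_feature(periodic_diff, trend_diff, residual,
                        weights=(0.4, 0.4, 0.2)):
    """Weighted sum of the three coordinates of one 3-D data point.

    The default weights are hypothetical placeholders for the preset
    periodic-difference, trend-difference and residual weights.
    """
    w_periodic, w_trend, w_residual = weights
    return (w_periodic * periodic_diff
            + w_trend * trend_diff
            + w_residual * residual)


# One (periodic difference, trend difference, residual) triple per time point.
points = [(0.0, 0.0, 0.1), (6.0, 1.0, 2.0)]
features = [measurement_feature(*p) for p in points]
```

Under these assumed weights, a time point whose three coordinates are all large receives a correspondingly large measurement distribution characteristic.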
Further, obtaining the fusion weight value at each sampling time point in each sensor data sequence according to the difference between the measurement distribution characteristics includes:
and obtaining local anomaly factors of each three-dimensional data coordinate point by taking the measurement distribution characteristics of each three-dimensional data coordinate point as measurement of distances according to a preset distance neighborhood through an LOF algorithm, and normalizing the local anomaly factors to obtain fusion weight values of the same time point in each channel in the sensor data sequence.
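A minimal sketch of the local outlier factor (LOF) computation is given below for illustration. It assumes that the distance between two points is the absolute difference of their measurement distribution characteristics and that the preset neighbourhood is expressed as a neighbour count `k`; both are assumptions of this sketch, as are the function name and the sample values.

```python
def local_outlier_factors(values, k=2):
    """Local outlier factor of each scalar in `values` (needs len > k).

    Distance between two points is the absolute difference of their
    measurement distribution characteristics (an assumption of this sketch).
    """
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]

    k_dist = []      # distance from each point to its k-th nearest neighbour
    neighbours = []  # indices within that distance (ties included)
    for i in range(n):
        others = sorted(dist[i][j] for j in range(n) if j != i)
        k_dist.append(others[k - 1])
        neighbours.append([j for j in range(n)
                           if j != i and dist[i][j] <= k_dist[i]])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = sum(max(k_dist[j], dist[i][j]) for j in neighbours[i])
        lrd.append(len(neighbours[i]) / max(reach, 1e-12))

    # LOF: mean density of the neighbours relative to the point's own density.
    return [sum(lrd[j] for j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
            for i in range(n)]


features = [1.0, 1.1, 0.9, 1.2, 5.0]  # hypothetical measurement characteristics
lof = local_outlier_factors(features, k=2)
```

Points that sit in a dense cluster obtain a factor close to 1, while the isolated value obtains a much larger factor.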
Further, the performing data fusion according to the fusion weight value includes:
and carrying out weighted summation on products of sensor data values corresponding to the data of different channels in the same time point and corresponding fusion weights to obtain data corresponding to each time point, and completing data fusion.
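The normalization and fusion steps can be sketched as follows. The text states only that the local anomaly factors are normalized; dividing by their sum is an assumption of this sketch, and the function names and sample numbers are hypothetical.

```python
def fusion_weights(scores):
    """Normalise per-channel anomaly scores so that the weights sum to 1."""
    total = sum(scores)
    return [s / total for s in scores]


def fuse(values, weights):
    """Weighted sum of the channel readings taken at one time point."""
    return sum(v * w for v, w in zip(values, weights))


# Three channels at the same time point (hypothetical numbers).
scores = [1.0, 1.0, 2.0]       # local anomaly factors
readings = [10.0, 10.0, 20.0]  # sensor data values
weights = fusion_weights(scores)
fused = fuse(readings, weights)
```

Repeating the two calls for every time point yields the fused sequence.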
Further, the method for acquiring the distribution trend value, the periodic distribution value and the residual error of the sensor data sequence comprises the following steps:
and taking the sensor data sequence as input data of the STL (Seasonal-Trend decomposition using LOESS) time series decomposition algorithm, the output data of which are the distribution trend value, the periodic distribution value and the residual error.
The invention has the following beneficial effects:
Considering that the traditional method of fusing data by simple weighted summation retains a great deal of inefficient or even invalid information, the invention derives the measurement distribution characteristics of the sensor data at different positions from the characteristics of the data themselves, and from them obtains the fusion weights of the sensors at different positions at the same time point. Effective data therefore carry a larger weight in the fused data, which raises the data utilization rate. Because the distribution characteristics of the sensor data sequences differ from one time point to another, the degree of difference among the corresponding sensor data also differs, so weights are assigned to the different sensor data at each time point before the weighted fusion. In conclusion, the platform achieves a higher data utilization rate and stronger adaptability in the data fusion process.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions and advantages of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a block diagram of a big data fusion platform according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of decomposed data according to one embodiment of the present invention.
Detailed Description
In order to further describe the technical means and effects adopted by the present invention to achieve the preset purpose, the following detailed description refers to a specific implementation, structure, features and effects of a big data fusion platform according to the present invention with reference to the accompanying drawings and preferred embodiments. In the following description, different "one embodiment" or "another embodiment" means that the embodiments are not necessarily the same. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The following specifically describes a specific scheme of the big data fusion platform provided by the invention with reference to the accompanying drawings.
Referring to fig. 1, a block diagram of a big data fusion platform according to an embodiment of the present invention is shown, where the platform includes: a data acquisition module 101, a period acquisition module 102, a distribution difference feature acquisition module 103 and a fusion weight value acquisition module 104.
A data acquisition module 101, configured to obtain a sensor data sequence of each sensor on the production equipment in a preset sampling period according to a preset sampling frequency.
The big data fusion platform provided by the invention fuses data, and aims to obtain effective data rich in information value from mass data. However, the premise of data fusion is to fuse the same kind of data at the same time point, so that it is first necessary to acquire the same kind of data at a plurality of same time points.
To acquire accurate operating data from the production equipment and to reduce sensor acquisition errors as far as possible, more than two sensors are used to acquire the sensor data of the production equipment. Because data fusion requires the fused data to have consistent characteristics, the sensors are of the same type, use the same sampling frequency, and are arranged at different positions on the same production equipment for the preset sampling period; the data acquired by each sensor form a sensor data sequence. For example, to collect vibration data of a motor of a production equipment, vibration sensors are installed at different positions on the surface of that equipment.
In the embodiment of the invention, the preset sampling frequency is 100 samples per second, and the preset sampling period is 10 minutes. It should be noted that the preset sampling frequency can be set according to the implementer's requirements: the higher the frequency, the more accurate the subsequent fusion result, but the larger the amount of calculation. The preset sampling period can likewise be set according to the implementer's requirements: the longer the period, the smaller the data fusion error, but the larger the amount of calculation. The invention does not limit which operating parameters of the production equipment the sensors detect; these are set according to the specific implementation.
The period obtaining module 102 is configured to obtain a distribution trend value, a period distribution value and a residual error of the sensor data sequence according to the sensor data sequence through a time sequence decomposition algorithm, and obtain a period of the sensor data sequence according to the period distribution value.
The data acquisition module 101 acquires a sensor data sequence corresponding to each production device, and the sampling frequencies and types of the sensor data in the sensor data sequences are the same. The conventional weighted summation method does not consider actual data characteristics, performs data fusion only according to the obtained data in the sensor sequence, namely performs weighted summation according to the data at the same time in the sensor sequence to obtain fused data, and if weights of different sensor data are only manually set or the average value of a plurality of sensor data is obtained, the fused data contains a plurality of low-efficiency or even invalid data, and the generated error is larger.
In order to make the fused data more accurate, data that carry more effective information should receive a larger fusion weight value in the data fusion process; the fusion weight value of the sensor data in a sensor data sequence is therefore positively correlated with the degree of abnormality of the production equipment. The sensor data in the sensor data sequences are operating data of the same production equipment, so when the equipment operates normally the different sensor data differ only because of noise. For data carrying little information, namely data from normally operating production equipment, the distribution trend and the periodic distribution of the sensor data in the corresponding sensor data sequences are therefore basically the same, and the sensor data differ only by small-range fluctuations of signal noise. In contrast, for data carrying effective information, such as data recorded when the production equipment is abnormal, the distribution trend and the periodic distribution of the sensor data in the corresponding sensor data sequences differ obviously.
Therefore, in order to acquire the degree of abnormality of the production facility, it is necessary to obtain the distribution trend difference and the periodic distribution difference of the sensor data in each sensor sequence. However, since the distribution trend difference and the periodic distribution difference cannot be directly obtained according to the sensor data sequence, the distribution trend difference and the periodic distribution difference are further obtained according to the periodic characteristic and the distribution trend value by acquiring the distribution trend value and the periodic characteristic of the sensor data in the sensor data sequence of each time point. The process for acquiring the distribution trend value and the periodic characteristic of the sensor data comprises the following steps:
A distribution trend value, a periodic distribution value and a residual error corresponding to each sensor data in the sensor data sequence are obtained through a time series decomposition algorithm. Preferably, each sensor data sequence is used as input data of the STL (Seasonal-Trend decomposition using LOESS) algorithm, which outputs the distribution trend value, the periodic distribution value and the residual error corresponding to each data point. The distribution trend value corresponds to the variation of the sensor data across periods within the preset sampling period, the periodic distribution value corresponds to the periodic variation within the preset sampling period, and the residual error corresponds to the variation of the sensor data that cannot be explained by the periodic distribution value and the distribution trend value. It should be noted that the STL algorithm is well known in the art and is not further defined or described herein.
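To make the three outputs concrete, the sketch below uses a classical additive decomposition (moving-average trend plus per-position seasonal means). This is a deliberately simplified stand-in and not STL, which relies on LOESS smoothing. The function name `decompose`, the assumption that `period` is known in advance and odd, and the sample data are all hypothetical.

```python
def decompose(series, period):
    """Split `series` into trend, seasonal and residual parts (additive).

    Classical moving-average decomposition, NOT STL: a simplified stand-in.
    `period` is assumed known and odd so that the window is centred.
    """
    n = len(series)
    half = period // 2

    # Trend: centred moving average (the window shrinks at the two ends).
    trend = []
    for i in range(n):
        window = series[max(0, i - half):min(n, i + half + 1)]
        trend.append(sum(window) / len(window))

    # Seasonal: mean detrended value for each position within the period.
    detrended = [x - t for x, t in zip(series, trend)]
    phase_mean = []
    for p in range(period):
        vals = detrended[p::period]
        phase_mean.append(sum(vals) / len(vals))
    offset = sum(phase_mean) / period
    seasonal = [phase_mean[i % period] - offset for i in range(n)]

    # Residual: whatever the trend and seasonal parts do not explain.
    residual = [x - t - s for x, t, s in zip(series, trend, seasonal)]
    return trend, seasonal, residual


data = [10 + [1, -1, 0][i % 3] for i in range(12)]
trend, seasonal, residual = decompose(data, period=3)
```

By construction the three parts add back up to the input at every time point, which is the property the subsequent modules rely on.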
Referring to fig. 2, a schematic diagram of decomposition data provided by an embodiment of the present invention is shown. It shows the distribution of the sensor data within a preset sampling period, together with the distribution trend values, periodic distribution values and residual errors obtained for the sensor data in the sensor data sequence by the time series decomposition algorithm. The vertical axes show, respectively, the sensor data, the distribution trend value, the periodic distribution value and the residual error; the horizontal axis is time. In fig. 2, in the portion corresponding to times 11 to 12, where the sensor data are abnormal, the slope of the distribution trend value changes many times; the periodic distribution value represents the periodic variation of the data and therefore changes little; and the residual error represents the variation in the abnormal portion of the sensor data that cannot be represented by the periodic distribution value and the distribution trend value.
However, the sensor data characteristic characterized by the periodic distribution values and the distribution trend values need to be represented by the periodic size of the sensor data, and thus further calculation of the period in the sensor data sequence is required. According to the method, the corresponding periodic distribution curve is obtained according to the periodic distribution values, the time interval values are set and iterated, the correlation degree value of the time interval value after each iteration in the periodic distribution curve is calculated, and then the period of the sensor data sequence is further obtained according to the correlation degree value. Preferably, iteration of a preset step length is carried out according to a preset time interval value, and a correlation degree value of a periodic distribution curve corresponding to the time interval value after each iteration is calculated according to the periodic distribution value through a periodic distribution curve correlation degree model; and taking the time interval between peaks of the correlation degree value in the iterative process as the period length to obtain each period of the sensor data sequence. In the embodiment of the invention, the initial value of the time interval is set to 2, and the preset step length is set to 3. Calculating a correlation degree value of the periodic distribution curve corresponding to the time interval value after each iteration according to the periodic distribution value through the periodic distribution curve correlation degree model, wherein the correlation degree value is specifically as follows:
wherein $R(\tau)$ is the correlation degree value of the periodic distribution curve, $\tau$ is the time interval value, $S_t$ is the periodic distribution value at the $t$-th time point within the preset sampling period, $S_{t+\tau}$ is the periodic distribution value at the $(t+\tau)$-th time point within the preset sampling period, $\bar{S}$ is the average of the periodic distribution values within the preset sampling period, and $N$ is the number of sampling time points in the preset sampling period.
The periodic distribution values in FIG. 2 form the periodic distribution curve. The correlation degree value $R(\tau)$ of the periodic distribution curve changes with the time interval value $\tau$. Take the periodic distribution value at an arbitrary time point $t$ on the periodic distribution curve: when $\tau$ is not an integer multiple of the period, some terms of the numerator are necessarily negative; when $\tau$ is an integer multiple of the period, the periodic distribution value at time point $t+\tau$ is basically consistent with that at time point $t$, and the numerator has no negative terms. It can therefore be concluded that the correlation degree value reaches its maxima at integer multiples of the period. The correlation degree model thus exploits the fact that the periodic distribution value at each time point on the periodic distribution curve is basically consistent with the periodic distribution value at any time point separated from it by an integer multiple of the period. As the time interval value increases, the correlation degree value gradually rises to a maximum, gradually falls after reaching it, and then rises again to the next maximum. Because the maxima occur at integer multiples of the period, the time interval between adjacent peaks of the correlation degree value can be used as the period length, thereby obtaining each period within the sampling period.
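The period search can be sketched as follows. The sketch assumes that the correlation degree model has the standard sample autocorrelation form, $R(\tau) = \sum_{t=1}^{N-\tau}(S_t-\bar{S})(S_{t+\tau}-\bar{S}) \,/\, \sum_{t=1}^{N}(S_t-\bar{S})^2$, where $S_t$ is the periodic distribution value at time point $t$, $\bar{S}$ its mean and $N$ the number of time points; this form, the function names, the default `start` and `step` values and the cap on the largest interval are assumptions, not values taken from the embodiment.

```python
def correlation_degree(values, lag):
    """Sample autocorrelation of `values` at the given time interval."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        return 0.0
    numer = sum((values[t] - mean) * (values[t + lag] - mean)
                for t in range(n - lag))
    return numer / denom


def estimate_period(values, start=1, step=1):
    """Spacing between the first two peaks of the correlation degree value.

    Returns None when fewer than two peaks are found.
    """
    max_lag = len(values) // 2  # assumed cap on the time interval value
    lags = list(range(start, max_lag + 1, step))
    scores = [correlation_degree(values, lag) for lag in lags]
    peaks = [lags[i] for i in range(1, len(lags) - 1)
             if scores[i - 1] < scores[i] > scores[i + 1]]
    if len(peaks) < 2:
        return None
    return peaks[1] - peaks[0]


# Triangle wave with a period of 8 samples, repeated 6 times.
seasonal = [0, 1, 2, 1, 0, -1, -2, -1] * 6
period = estimate_period(seasonal)
```

A larger `step` reduces the amount of calculation but can only resolve period lengths that are multiples of the step.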
The distribution difference feature obtaining module 103 is configured to obtain a periodic distribution difference according to a periodic distribution value of corresponding positions between different periods, and obtain a distribution trend difference according to volatility of the distribution trend value in each period.
The distribution trend value, the period distribution value, the residual and the period of the sensor data sequence are obtained by the period acquisition module 102. The module mainly aims at acquiring the periodic distribution difference degree and the distribution trend difference degree through the distribution trend value, the periodic distribution value and the period. The periodic distribution difference degree acquisition process specifically comprises the following steps:
the periodic distribution curve of each sensor data is divided according to the period obtained by the period obtaining module, so that the periodic distribution curve of each sensor data is divided into $M$ periods, where $M$ is a positive integer. The periodic distribution difference degree of each time point is then obtained according to the distribution of the periodic distribution value of each time point. Preferably, the difference accumulation sum of the periodic distribution values at the corresponding time points of each period before the target time point and at the target time point is calculated and recorded as the previous periodic distribution difference degree; the difference accumulation sum of the periodic distribution values at the corresponding time points of each period after the target time point and at the target time point is calculated and recorded as the post-periodic distribution difference degree; the sum of the previous periodic distribution difference degree and the post-periodic distribution difference degree is taken as the periodic distribution difference degree corresponding to the target time point; and the target time point is changed to obtain the periodic distribution difference degree corresponding to all the time points.
The periodic distribution difference degree is obtained through a periodic distribution difference degree model, which is as follows:
wherein $D_i$ is the periodic distribution difference degree corresponding to the $i$-th time point, $n_i$ is the ordinal number, among the divided periods, of the period containing the $i$-th time point, $M$ is the number of periods in the periodic distribution curve of each sensor data within the preset sampling period, $T_i$ is the period size of the periodic distribution curve corresponding to the $i$-th time point, $S_i$ is the periodic distribution value at the $i$-th time point, $S_{i-jT_i}$ is the periodic distribution value at the $(i-jT_i)$-th time point, and $S_{i+jT_i}$ is the periodic distribution value at the $(i+jT_i)$-th time point.
The periodic distribution difference degree model obtains the periodic distribution difference degree of each time point from the differences between the periodic distribution value at that time point and the periodic distribution values at the time points separated from it by an integer multiple of the period. Because similar periods may exist both before and after each time point, the previous periodic distribution difference degree (before the time point) and the post-periodic distribution difference degree (after the time point) are both considered, and their sum gives the periodic distribution difference degree. The more the periodic distribution value at a time point differs from the periodic distribution values at the time points separated from it by an integer multiple of the period, the larger the corresponding periodic distribution difference degree.
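A minimal sketch of this accumulation is given below. It assumes that the accumulated differences are absolute differences and that the period length is a single constant for the whole sequence; the function name and the sample data are hypothetical.

```python
def periodic_difference(seasonal, period):
    """Periodic distribution difference degree of every time point.

    For time point i, accumulates |S_i - S_j| over all j that lie a whole
    number of periods before i, plus the same over all j after i.
    """
    n = len(seasonal)
    out = []
    for i in range(n):
        before = sum(abs(seasonal[i] - seasonal[j])
                     for j in range(i - period, -1, -period))
        after = sum(abs(seasonal[j] - seasonal[i])
                    for j in range(i + period, n, period))
        out.append(before + after)
    return out


regular = [0, 1, 0, -1] * 3       # perfectly periodic, period 4
disturbed = list(regular)
disturbed[5] = 4                  # one abnormal value in the second period
d_regular = periodic_difference(regular, period=4)
d_disturbed = periodic_difference(disturbed, period=4)
```

In the disturbed sequence the abnormal time point receives the largest value, and the time points at the same position in the other periods receive smaller non-zero values.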
The acquisition process of the distribution trend difference degree specifically comprises the following steps:
The distribution trend values of each sensor's data are divided according to the period obtained by the period acquisition module, so that they are divided into $M$ distribution trend segments, one per period. The distribution trend difference degree is then obtained from the fluctuation of the distribution trend values of the time points within each period. Preferably, a distribution trend curve is fitted to the distribution trend values; for every time point other than the first, the slope of the distribution trend curve between that time point and its preceding time point is calculated and recorded as a distribution trend slope value; and the variance of the distribution trend slope values within each period is taken as the distribution trend difference degree. Expressed as a formula, the distribution trend difference degree is obtained through a distribution trend difference degree model, which comprises:

$$T_i = \frac{1}{L} \sum_{j=1}^{L} \left( k_{i,j} - \bar{k}_i \right)^2$$

where $T_i$ is the distribution trend difference degree corresponding to the $i$-th time point; $L$ is the period size of the period containing the $i$-th time point; $k_{i,j}$ is the distribution trend slope value of the line connecting the $j$-th time point in the period containing the $i$-th time point with its preceding time point; and $\bar{k}_i$ is the mean of the distribution trend slope values of the lines connecting adjacent points in the period containing the $i$-th time point.
The distribution trend difference degree model quantifies how irregularly the distribution trend changes around each time point. The slope at each time point on the fitted distribution trend curve is calculated, and the variance of these slopes within the period containing the time point is taken as the distribution trend difference degree of that time point. The larger the fluctuation of the distribution trend values within that period, the more the distribution trend slope values vary, the more irregular the distribution trend of the current time point appears within its period, and the larger the corresponding distribution trend difference degree.
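A minimal Python sketch of the slope-variance idea follows. It assumes unit spacing between time points, uses the population variance, and assigns the same value to every time point in a period; the function name `trend_difference_degree` is hypothetical and none of these choices is confirmed by the text.

```python
from statistics import pvariance


def trend_difference_degree(trend, period):
    """Variance of the point-to-point trend slopes inside each period.
    Every time point in a period receives that period's variance."""
    n = len(trend)
    # Slope between each point and its predecessor (unit time spacing assumed).
    # slopes[i - 1] belongs to time point i; the first point has no slope.
    slopes = [trend[i] - trend[i - 1] for i in range(1, n)]
    degrees = [0.0] * n
    for start in range(0, n, period):
        end = min(start + period, n)
        segment = [slopes[i - 1] for i in range(max(start, 1), end)]
        variance = pvariance(segment) if len(segment) > 1 else 0.0
        for i in range(start, end):
            degrees[i] = variance
    return degrees


# First period rises steadily; second period fluctuates.
trend = [0.0, 1.0, 2.0, 3.0, 3.0, 5.0, 4.0, 8.0]
trend_degrees = trend_difference_degree(trend, period=4)
```

The steadily rising first period yields a zero difference degree, while the irregular second period yields a clearly positive one.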
The fusion weight value acquisition module 104 is configured to obtain, through weighted summation of the periodic distribution difference degree, the distribution trend difference degree and the residual, a measurement distribution feature at each sampling frequency in each sensor data sequence; to obtain a fusion weight value of each sensor data sequence at each sampling frequency according to the differences between the measurement distribution features; and to perform data fusion on the different sensor data sequences according to the fusion weight values.
The distribution trend difference degree and the periodic distribution difference degree corresponding to each time point in the sensor data sequence are obtained by the distribution difference degree feature acquisition module 103. The invention then derives an abnormality degree value for each time point from its periodic distribution difference degree and distribution trend difference degree, and obtains the fusion weight value of each sensor's data at each time point according to the magnitude of that abnormality degree value.
The invention characterizes the abnormality degree value by the measurement distribution feature obtained through weighted summation of the distribution trend difference degree, the periodic distribution difference degree and the residual. The periodic distribution difference degree, the distribution trend difference degree and the residual of each time point influence the measurement distribution feature, that is, the abnormality degree value, to different extents. The residual is the remainder term of the STL time-series decomposition and most likely corresponds to random error in each sensor's data, so its weight in the weighted summation is smaller. To facilitate the subsequent acquisition of the fusion weight values from the measurement distribution features, the invention constructs a three-dimensional data coordinate system in which the periodic distribution difference degree, the distribution trend difference degree and the residual corresponding to each time point of each sensor's data are the three coordinates of a three-dimensional coordinate point.
Preferably, a three-dimensional data coordinate system is constructed with the periodic distribution difference degree, the distribution trend difference degree and the residual corresponding to each time point as its three dimensions, and the measurement distribution feature of each three-dimensional data coordinate point is obtained through weighted summation over the three-dimensional data coordinate points corresponding to the time points. The weighted summation comprises: obtaining the target periodic distribution difference degree, the target distribution trend difference degree and the target residual corresponding to a target three-dimensional data coordinate point; calculating the product of the target periodic distribution difference degree and a preset periodic distribution difference degree weight and recording it as the target periodic distribution difference; calculating the product of the target distribution trend difference degree and a preset distribution trend difference degree weight and recording it as the target distribution trend difference; calculating the product of the target residual and a preset residual weight and recording it as the target residual value; and taking the square root of the sum of the target periodic distribution difference, the target distribution trend difference and the target residual value to obtain the measurement distribution feature corresponding to the target three-dimensional data coordinate point. The target three-dimensional data coordinate point is then changed to obtain the measurement distribution features corresponding to all three-dimensional data coordinate points.
Expressed as a formula, the measurement distribution feature corresponding to a three-dimensional data coordinate point is obtained through a measurement distribution feature model, which comprises:

$$D_i = \sqrt{w_P P_i + w_T T_i + w_R R_i}$$

where $P_i$ is the periodic distribution difference degree corresponding to the $i$-th three-dimensional data coordinate point; $T_i$ is the distribution trend difference degree corresponding to the $i$-th three-dimensional data coordinate point; $R_i$ is the residual corresponding to the $i$-th three-dimensional data coordinate point; $w_P$ is the weight of the periodic distribution difference degree; $w_T$ is the weight of the distribution trend difference degree; and $w_R$ is the weight of the residual. In the embodiment of the present invention, considering that the residual carries a smaller proportion in the weighted summation, $w_P$ is set to 0.4, $w_T$ is set to 0.4, and $w_R$ is set to 0.2.
The measurement distribution feature model takes into account that the three dimensions of a three-dimensional data coordinate point influence the abnormality degree value differently, and therefore assigns them different weights in the weighted summation, which makes the resulting measurement distribution feature more accurate. By introducing a weight for each dimension, the model calculates a weighted Euclidean distance for each three-dimensional data coordinate point and uses it to characterize the data features of that point in the three-dimensional data coordinate system: the larger the distribution trend difference degree, the periodic distribution difference degree and the residual of a sensor's data at a time point, the larger the corresponding measurement distribution feature value.
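The weighted summation under a square root can be sketched as below. The weights 0.4, 0.4 and 0.2 come from the embodiment; the function name `measurement_feature` is hypothetical, and taking the absolute value of the residual is an assumption made here so that a negative STL residual does not produce a negative value under the root.

```python
import math


def measurement_feature(period_diff, trend_diff, residual,
                        w_period=0.4, w_trend=0.4, w_resid=0.2):
    """Square root of the weighted sum of the three coordinates.
    abs(residual) is an assumption, not stated in the text."""
    weighted_sum = (w_period * period_diff
                    + w_trend * trend_diff
                    + w_resid * abs(residual))
    return math.sqrt(weighted_sum)


# One three-dimensional data coordinate point: (P, T, R).
feature = measurement_feature(4.0, 3.6875, -0.5)
baseline = measurement_feature(0.0, 0.0, 0.0)
```

Because the residual weight is the smallest, a change in the residual moves the feature less than an equal change in either difference degree.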
By introducing the three-dimensional data coordinate points, the invention obtains measurement distribution feature values that characterize the abnormality degree value, so that the fusion weight values subsequently obtained by applying the LOF algorithm to the measurement distribution feature values are more accurate. Preferably, the LOF algorithm takes the measurement distribution feature of each three-dimensional data coordinate point as the distance measure and, according to a preset distance neighborhood, obtains the local anomaly factor of each three-dimensional data coordinate point; the local anomaly factors are normalized to obtain the fusion weight values of the same time point in each channel of the sensor data sequence.
At each time point, the fusion weight values of the different sensor data in the sensor data sequence are obtained as follows. The invention applies an adaptive LOF algorithm to the three-dimensional data coordinate points, taking the measurement distribution feature of each point as the distance measure and using a preset distance neighborhood, and calculates the local anomaly factor of each three-dimensional data coordinate point, that is, the local anomaly factor corresponding to each sensor's data at each time point. The local anomaly factors corresponding to the sensors at the same time point are normalized to obtain the fusion weight value of each sensor's data at each time point; in other words, the fusion weight value is the normalized local anomaly factor. In the embodiment of the invention, the preset distance neighborhood is set to 6, meaning that the LOF algorithm calculates each local anomaly factor from the 6 sample points nearest to the target sample point.
It should be noted that the preset distance neighborhood can be set by the implementer according to the specific implementation. The LOF algorithm obtains the local anomaly factor by comparing the local density of a three-dimensional data coordinate point with the densities of its surrounding points: if the local density of the point is significantly lower than that of its surrounding points, the point lies in a relatively sparse region, which indicates that it is an abnormal point. The specific calculation method is a technical means well known to those skilled in the art and is not repeated here.
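For readers unfamiliar with LOF, the following pure-Python sketch shows the standard (non-adaptive) algorithm applied to scalar measurement distribution features, followed by the normalization step. It rests on several assumptions: the distance between two points is taken as the absolute difference of their feature values, the neighborhood size is 3 rather than the embodiment's 6 because the toy data set is small, and the function names `local_outlier_factors` and `normalise` are hypothetical.

```python
def local_outlier_factors(values, k):
    """Standard LOF on scalar values, using |a - b| as the distance."""
    n = len(values)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < len(values)")

    def dist(a, b):
        return abs(values[a] - values[b])

    neighbours, k_dist = [], []
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: dist(i, j))
        kd = dist(i, others[k - 1])  # distance to the k-th nearest point
        # The k-distance neighbourhood keeps points tied at distance kd.
        neighbours.append([j for j in others if dist(i, j) <= kd])
        k_dist.append(kd)

    eps = 1e-12  # avoids division by zero when points coincide
    lrd = []     # local reachability density of every point
    for i in range(n):
        reach = [max(k_dist[j], dist(i, j)) for j in neighbours[i]]
        lrd.append(len(reach) / (sum(reach) + eps))

    # LOF: mean density of the neighbours divided by the point's own density.
    return [
        sum(lrd[j] for j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
        for i in range(n)
    ]


def normalise(factors):
    """Scale the factors of one time point so that they sum to 1."""
    total = sum(factors)
    return [f / total for f in factors]


features = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 1.02, 5.0]
lof = local_outlier_factors(features, k=3)
weights = normalise([1.0, 1.0, 2.0])  # LOF values of three sensors at one time point
```

Points in dense regions obtain a factor close to 1, while the isolated value 5.0 obtains a much larger factor; normalizing the factors of the sensors at one time point gives weights that sum to 1.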
After fusion weight values corresponding to different sensor data in the sensor data sequence at each time point are obtained, the data fusion at each time point is further completed according to the fusion weight values, specifically:
The sensor data values of the different channels at the same time point are multiplied by their corresponding fusion weight values and the products are summed to obtain the data corresponding to each time point, which completes the data fusion. Expressed as a formula:

$$F_t = \sum_{c=1}^{N} \lambda_{c,t}\, x_{c,t}$$

where $F_t$ is the fused data at the $t$-th time point; $\lambda_{c,t}$ is the fusion weight value of the $c$-th channel at the $t$-th time point; $x_{c,t}$ is the sensor data value of the $c$-th channel at the $t$-th time point; and $N$ is the number of sensor data in the sensor data sequence, that is, the number of corresponding sensors.
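The fusion step itself is a per-time-point weighted sum across channels, which can be sketched as follows. The function name `fuse` and the nested-list layout (`values[c][t]` for channel `c` at time point `t`) are illustrative assumptions.

```python
def fuse(sensor_values, fusion_weights):
    """Weighted sum across channels for every time point.
    Both arguments are indexed as [channel][time point]."""
    n_time = len(sensor_values[0])
    return [
        sum(w[t] * x[t] for x, w in zip(sensor_values, fusion_weights))
        for t in range(n_time)
    ]


# Two channels, two time points; the weights at each time point sum to 1.
values = [[20.0, 21.0],
          [22.0, 19.0]]
weights_by_channel = [[0.5, 0.25],
                      [0.5, 0.75]]
fused = fuse(values, weights_by_channel)
```

With equal weights the fused value is the plain average of the channels; with unequal weights it moves toward the channel that carries the larger weight.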
After the data fusion is completed, the fused data at each time point is transmitted to the big data platform for storage. This completes the present invention.
In summary, the data acquisition module acquires the sensor data sequence and each sensor's data within it; the period acquisition module obtains the distribution trend value, the periodic distribution value, the residual and the period corresponding to the sensor data sequence; the distribution difference degree feature acquisition module obtains the periodic distribution difference degree and the distribution trend difference degree from the distribution trend value, the periodic distribution value and the period; and the fusion weight value acquisition module obtains the fusion weight value of each sensor's data at each time point from the periodic distribution difference degree, the distribution trend difference degree and the residual. Data fusion is then completed according to the fusion weight values, and the fused data is stored in the big data fusion platform for storage and management.
It should be noted that the order of the embodiments of the present invention is for description only and does not indicate their relative merits. The processes depicted in the accompanying drawings do not necessarily require the particular order shown, or a sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments.

Claims (9)

1. A big data fusion platform, the platform comprising:
the data acquisition module is used for acquiring a sensor data sequence of each sensor on the production equipment in a preset sampling period according to a preset sampling frequency; the sensors are the same in type and different in position;
the period acquisition module is used for obtaining a distribution trend value, a period distribution value and a residual error of the sensor data sequence according to the sensor data sequence through a time sequence decomposition algorithm, and obtaining the period of the sensor data sequence according to the period distribution value;
the distribution difference degree feature acquisition module is used for obtaining the periodic distribution difference degree according to the periodic distribution values at corresponding positions among different periods, and for obtaining the distribution trend difference degree according to the fluctuation of the distribution trend value in each period;
the fusion weight value acquisition module is used for obtaining measurement distribution characteristics under each sampling frequency in each sensor data sequence through weighted summation of the periodic distribution difference degree, the distribution trend difference degree and the residual error; and obtaining a fusion weight value of each sensor data sequence at each sampling frequency according to the difference between the measurement distribution characteristics, and carrying out data fusion on different sensor data sequences according to the fusion weight values.
2. The big data fusion platform of claim 1, wherein the method of acquiring the period of the sensor data sequence comprises:
iterating a preset time interval value by a preset step length, and after each iteration calculating, from the periodic distribution values and through a periodic distribution curve correlation degree model, the correlation degree value of the periodic distribution curve corresponding to the current time interval value; and taking the time interval between adjacent peaks of the correlation degree value over the iterations as the period length, thereby obtaining each period of the sensor data sequence.
3. The big data fusion platform according to claim 2, wherein the method for obtaining the correlation degree value of the periodic distribution curve comprises:
calculating a correlation degree value of the periodic distribution curve according to the time interval value and the periodic distribution value through a periodic distribution curve correlation degree model, wherein the periodic distribution curve correlation degree model comprises the following components:
$$\rho(\tau) = \frac{\sum_{t=1}^{n-\tau} \left( S_t - \bar{S} \right)\left( S_{t+\tau} - \bar{S} \right)}{\sum_{t=1}^{n} \left( S_t - \bar{S} \right)^2}$$

where $\rho(\tau)$ is the correlation degree value of the periodic distribution curve; $\tau$ is the time interval value; $S_t$ is the periodic distribution value at the $t$-th time point in the preset sampling period; $S_{t+\tau}$ is the periodic distribution value at the $(t+\tau)$-th time point in the preset sampling period; $\bar{S}$ is the average of the periodic distribution values within the preset sampling period; and $n$ is the number of sampling time points in the preset sampling period.
4. The big data fusion platform according to claim 2, wherein the method for obtaining the periodic distribution difference degree comprises:
calculating the accumulated sum of the differences between the periodic distribution value at the target time point and the periodic distribution values at the corresponding time points of each period before the target time point, and recording it as the preceding periodic distribution difference degree; calculating the accumulated sum of the differences between the periodic distribution value at the target time point and the periodic distribution values at the corresponding time points of each period after the target time point, and recording it as the subsequent periodic distribution difference degree; taking the sum of the preceding periodic distribution difference degree and the subsequent periodic distribution difference degree as the periodic distribution difference degree corresponding to the target time point;
and changing the target time point to obtain the periodic distribution difference degree corresponding to all the time points.
5. The big data fusion platform according to claim 1, wherein the method for obtaining the distribution trend diversity factor comprises:
fitting a distribution trend curve according to the distribution trend values; for each time point other than the first, calculating the slope of the distribution trend curve between that time point and its preceding time point and recording it as a distribution trend slope value; and taking the variance of the distribution trend slope values in each period as the distribution trend difference degree.
6. The big data fusion platform of claim 1, wherein the method for obtaining the metric distribution feature comprises:
constructing a three-dimensional data coordinate system according to the periodic distribution difference degree, the distribution trend difference degree and the residual error corresponding to each time point as three dimensions of the three-dimensional coordinate system, and obtaining measurement distribution characteristics of each three-dimensional data coordinate point through weighted summation according to all three-dimensional data coordinate points corresponding to each time point in the three-dimensional data coordinate system, wherein the weighted summation comprises:
obtaining a target periodic distribution difference degree, a target distribution trend difference degree and a target residual corresponding to a target three-dimensional data coordinate point; calculating the product of the target periodic distribution difference degree and a preset periodic distribution difference degree weight and recording it as a target periodic distribution difference; calculating the product of the target distribution trend difference degree and a preset distribution trend difference degree weight and recording it as a target distribution trend difference; calculating the product of the target residual and a preset residual weight and recording it as a target residual value; and taking the square root of the sum of the target periodic distribution difference, the target distribution trend difference and the target residual value to obtain the measurement distribution feature corresponding to the target three-dimensional data coordinate point;
and changing the target three-dimensional data coordinate points to obtain all measurement distribution characteristics corresponding to all the three-dimensional data coordinate points.
7. The big data fusion platform of claim 6, wherein the obtaining the fusion weight value for each sampling frequency in each sensor data sequence based on the differences between the metric distribution characteristics comprises:
and obtaining local anomaly factors of each three-dimensional data coordinate point by taking the measurement distribution characteristics of each three-dimensional data coordinate point as measurement of distances according to a preset distance neighborhood through an LOF algorithm, and normalizing the local anomaly factors to obtain fusion weight values of the same time point in each channel in the sensor data sequence.
8. The big data fusion platform of claim 1, wherein the data fusion according to the fusion weight value comprises:
and carrying out weighted summation on products of sensor data values corresponding to the data of different channels in the same time point and corresponding fusion weights to obtain data corresponding to each time point, and completing data fusion.
9. The big data fusion platform according to claim 1, wherein the method for obtaining the distribution trend value, the periodic distribution value and the residual error of the sensor data sequence comprises:
taking the sensor data sequence as the input data of an STL time-series decomposition algorithm, the output data of which are the distribution trend value, the periodic distribution value and the residual.
CN202211679727.8A 2022-12-27 2022-12-27 Big data fusion platform Active CN115659284B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211679727.8A CN115659284B (en) 2022-12-27 2022-12-27 Big data fusion platform

Publications (2)

Publication Number Publication Date
CN115659284A CN115659284A (en) 2023-01-31
CN115659284B true CN115659284B (en) 2023-07-18

Family

ID=85022352

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211679727.8A Active CN115659284B (en) 2022-12-27 2022-12-27 Big data fusion platform

Country Status (1)

Country Link
CN (1) CN115659284B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115935296B (en) * 2023-03-09 2023-06-23 国网山东省电力公司营销服务中心(计量中心) Electric energy data metering method and system
CN116108008A (en) * 2023-04-13 2023-05-12 山东明远生物科技有限公司 Decorative material formaldehyde detection data processing method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112101636A (en) * 2020-08-26 2020-12-18 北京航空航天大学 GRU and GARCH-based satellite long-period variance-variance degradation prediction evaluation method
CN112560974A (en) * 2020-12-22 2021-03-26 清华大学 Information fusion and vehicle information acquisition method and device
CN112989271A (en) * 2019-12-02 2021-06-18 阿里巴巴集团控股有限公司 Time series decomposition
CN113761705A (en) * 2021-07-19 2021-12-07 合肥工业大学 Multi-sensor fusion method and system based on multi-dimensional attribute correlation analysis
CN113779111A (en) * 2021-09-22 2021-12-10 超级视线科技有限公司 Service scene time sequence data determination method and system based on multi-time scale fusion
CN114492670A (en) * 2022-02-17 2022-05-13 平安科技(深圳)有限公司 Data analysis method, device, equipment and storage medium based on multi-mode hybrid

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8296108B2 (en) * 2010-04-02 2012-10-23 Yugen Kaisha Suwa Torasuto Time series data analyzer, and a computer-readable recording medium recording a time series data analysis program

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
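Evaluating Multi-sensor Combination of Normalized Difference Vegetation Index (NDVI) Time Series Data over Southeast Asia; Sanjiwana Arjasakusuma et al.; 2020 6th International Conference on Science and Technology (ICST); full text *
基于重要点的时间序列趋势特征提取方法 (Time series trend feature extraction method based on important points); 周黔 (Zhou Qian) et al.; 浙江大学学报 (Journal of Zhejiang University); full text *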

Also Published As

Publication number Publication date
CN115659284A (en) 2023-01-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230627

Address after: Area A, Room 204, Floor 2, Building 1, Jinshi Industrial Park, No. 368, Xinshi North Road, Shijiazhuang, Hebei, 050000

Applicant after: Hebei Xinlong Technology Group Co.,Ltd.

Applicant after: Shijiazhuang Xinlong Software Technology Co.,Ltd.

Address before: Area A, Room 204, Floor 2, Building 1, Jinshi Industrial Park, No. 368, Xinshi North Road, Shijiazhuang, Hebei, 050000

Applicant before: Hebei Xinlong Technology Group Co.,Ltd.

GR01 Patent grant