Disclosure of Invention
The invention provides an intelligent Internet of things big data transmission method and system, and aims to solve the existing problems.
The intelligent Internet of Things big data transmission method and system of the invention adopt the following technical scheme. The method comprises the following steps:
segmenting the Internet of Things data to obtain a data vector for each moment, classifying the data vectors of all moments to obtain a plurality of sets of same-type data vectors, and forming the data vectors of each type into time series data;
obtaining the confidence that each time series data has a unit root, and calculating the stationarity degree of the time series data according to the confidence;
acquiring the white noise coincidence rate of each time series data, and acquiring the trend conformity degree of each time series data;
calculating the complexity of each time series data according to the stationarity degree, the white noise coincidence rate and the trend conformity degree, and calculating the loss influence weight of each time series data according to the complexity;
calculating the correlation between every two types of time series data, and calculating the loss influence degree of each time series data according to the correlation and the loss influence weight;
and acquiring an initialization weight matrix according to the loss influence degree, carrying out network training based on the initialization weight matrix, compressing the Internet of Things data with the trained network, and outputting the compressed data.
Preferably, the step of segmenting the Internet of Things data to obtain a data vector for each moment and classifying the data vectors of all moments to obtain a plurality of sets of same-type data vectors comprises:
obtaining the dimensionality of the single-moment data of the Internet of Things data;
dividing the Internet of Things data, according to the dimensionality of the single-moment data, into the data vector of the 1st moment, the data vector of the 2nd moment, …, and the data vector of the n-th moment;
extracting the data at the same position in the data vector of each moment to obtain same-type data vectors;
and obtaining a plurality of sets of same-type data vectors from the same-type data vectors.
Preferably, the step of obtaining the confidence that each time series data has a unit root and calculating the stationarity degree of the time series data according to the confidence comprises:
obtaining the trend information of each time series data by the least squares method, and eliminating the trend information from the time series data;
acquiring, by a unit root test, the confidence that the detrended time series data has a unit root;
and calculating the stationarity degree of the time series data from the confidence.
Preferably, the step of acquiring the trend conformity degree of each time series data comprises:
performing differencing on the time series data to obtain processed first time series data;
analyzing the first time series data with a polynomial regression model to obtain a trend equation, and obtaining, from the trend equation, the predicted data value corresponding to each moment of the first time series data;
calculating the difference between the data at each moment in the first time series data and the corresponding predicted value;
and acquiring the mean of the squares of all the differences, the mean of squares being the trend conformity degree.
Preferably, the step of calculating the complexity of each time series data according to the stationarity degree, the white noise coincidence rate and the trend coincidence degree comprises:
the complexity of each time series data is calculated according to the following formula (1):
wherein F_i represents the complexity of the time series data of the i-th category, B_i represents the white noise coincidence rate of the time series data of the i-th category, H_i represents the trend conformity degree of the time series data of the i-th category, P_i represents the stationarity degree of the time series data of the i-th category, and the hyperparameter of formula (1) is taken as 0.2.
Preferably, the step of calculating the loss influence weight of the time series data according to the complexity comprises:
the loss influence weight of each time series data is calculated according to the following formula (2):
wherein Q_i represents the loss influence weight of the time series data of the i-th category, F_i represents the complexity of the time series data of the i-th category, and N represents the dimension of the time series data of the i-th category, namely the number of moments.
Preferably, the step of calculating the correlation between each two types of time series data comprises:
the correlation is calculated according to the following formula (3):
wherein I represents the i-th type of time series data, J represents the j-th type of time series data, cov(I, J) represents the covariance of the two data sequences, var(I) represents the variance of the i-th type of time series data, var(J) represents the variance of the j-th type of time series data, and X_{i,j} represents the correlation coefficient of the i-th and j-th types of time series data.
Preferably, the step of calculating the loss influence degree of each time series data according to the correlation and the loss influence weight includes:
the degree of influence of the loss was calculated according to the following formula (4):
wherein Q_i represents the loss influence weight of the i-th type of time series data, Q_j represents the loss influence weight of the j-th type of time series data, X_{i,j} represents the correlation between the i-th type and the j-th type of time series data, a further term of formula (4) represents the influence of the loss of the i-th type of time series data on the other data, and Y_i represents the loss influence degree of the i-th type of time series data.
Preferably, the step of obtaining the initialization weight matrix according to the degree of influence of loss includes:
assuming that the weight vector multiplied with the i-th type of time series data of the Internet of Things data is an M-dimensional vector, the initialization rule for positions 1 to M of the input-layer weight matrix is as follows:
constructing M normal distributions with the loss influence degree as the mean and σ_1, σ_2, …, σ_M as the variances;
generating a corresponding random number from each normal distribution;
initializing the weight matrix of the input layer by using each random number as the initial weight value of the corresponding position;
and initializing the weight matrix of the hidden layer with the initial weight value α, where α is an empirical value of 10, thereby obtaining the initialization weight matrix.
The invention also comprises an intelligent Internet of things big data transmission system, which comprises:
the data partitioning module is used for partitioning the data of the Internet of things to obtain data vectors at all times, classifying the data vectors at all times to obtain a set of a plurality of data vectors of the same type, and forming each type of data vector into time sequence data;
the first data processing module is used for acquiring the confidence that each time series data has a unit root and calculating the stationarity degree of the time series data according to the confidence;
the second data processing module is used for acquiring the white noise coincidence rate of each time sequence data and acquiring the trend coincidence degree of each time sequence data;
the third data processing module is used for calculating the complexity of each time sequence data according to the stationarity degree, the white noise coincidence rate and the trend coincidence degree, and calculating the loss influence weight of each time sequence data according to the complexity;
the fourth data processing module is used for calculating the correlation of each two types of time sequence data and calculating the loss influence degree of each time sequence data according to the correlation and the loss influence weight;
and the data transmission module is used for acquiring the initialized weight matrix according to the loss influence degree, carrying out network training based on the initialized weight matrix, carrying out data compression on the Internet of things by using the trained network, and outputting compressed data.
The invention has the following beneficial effects: in the intelligent Internet of Things big data transmission method and system, the loss influence weight of each time series data is obtained from the complexity of its data variation, the loss influence degree of the data is obtained from the loss influence weights and the correlations among the types of time series data, the initialization weight matrix is obtained from the loss influence degrees, and network training is carried out starting from this initialization weight matrix, so that the network training converges quickly, the training time is shortened, and the transmission speed of the output compressed data is improved.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
The embodiment of the invention discloses an intelligent Internet of things big data transmission method and system, wherein the method comprises the following steps:
S1, Internet of Things data is generally a series of time series data, and under normal conditions it has the following characteristics: the data dimensionality at every moment is the same, and data of the same type generally occupies the same position in the data vector of each moment. The Internet of Things data is therefore segmented to obtain a data vector for each moment, the data vectors of all moments are classified to obtain a plurality of sets of same-type data vectors, and the data vectors of each type form one time series data.
Specifically: S11, obtaining the dimensionality of the single-moment data of the Internet of Things data; S12, dividing the Internet of Things data, according to the dimensionality of the single-moment data, into the data vector of the 1st moment, the data vector of the 2nd moment, …, and the data vector of the n-th moment; S13, extracting the data at the same position in the data vector of each moment to obtain same-type data vectors, since the data at the same position in time series data is generally of the same type; and S14, obtaining a plurality of sets of same-type data vectors from the same-type data vectors.
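Steps S11 to S14 can be sketched in a few lines of Python. This is a minimal illustration, assuming the Internet of Things data arrives as one flat sequence of numbers and that the single-moment dimensionality is already known; the function names and the sample values are hypothetical and do not come from the description.

```python
from typing import List, Sequence


def split_into_moments(stream: Sequence[float], dim: int) -> List[List[float]]:
    """S11-S12: cut a flat stream into per-moment vectors of length `dim`."""
    if dim <= 0 or len(stream) % dim != 0:
        raise ValueError("stream length must be a positive multiple of dim")
    return [list(stream[t:t + dim]) for t in range(0, len(stream), dim)]


def group_by_position(moments: List[List[float]]) -> List[List[float]]:
    """S13-S14: collect the values found at the same position in every moment.

    Each returned list is one same-type time series.
    """
    return [list(series) for series in zip(*moments)]


# Hypothetical stream: three readings per moment, three moments.
stream = [20.1, 55.0, 1.0, 20.3, 54.0, 0.0, 20.6, 53.5, 1.0]
moments = split_into_moments(stream, dim=3)
series = group_by_position(moments)
```

Here `series[0]` holds the first reading of every moment in time order, which is one time series in the sense of step S1.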
S2, acquiring the confidence that each time series data has a unit root, and calculating the stationarity degree of the time series data according to the confidence.
Specifically: S21, obtaining the trend information of each time series data by the least squares method, and eliminating the trend information from the time series data; S22, obtaining, by a unit root test, the confidence that the detrended time series data has a unit root; S23, calculating the stationarity degree of the time series data from the confidence according to the following formula (5):
wherein μ_i represents the confidence that the time series data of the i-th category has a unit root, and P_i represents the stationarity degree of the time series data of the i-th category. A larger value of P_i indicates a higher probability that the data converges to a certain trend, and hence stronger regularity of the data.
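Step S21 can be illustrated with a minimal Python sketch. It assumes the trend is a straight line fitted by ordinary least squares, since the description does not state the trend model, and the function name `detrend_linear` is hypothetical. The unit root test of step S22 would then be applied to the returned residuals; it is not shown because the description does not name a specific test.

```python
from typing import List, Sequence, Tuple


def detrend_linear(series: Sequence[float]) -> Tuple[float, float, List[float]]:
    """Fit y = intercept + slope * t by least squares and remove the fit.

    Returns (slope, intercept, residuals), where residuals is the series
    with the fitted linear trend subtracted.
    """
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    residuals = [y - (intercept + slope * t) for t, y in enumerate(series)]
    return slope, intercept, residuals


# A purely linear series leaves (numerically) zero residuals.
slope, intercept, residuals = detrend_linear([2.0 * t + 1.0 for t in range(10)])
```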
And S3, acquiring the white noise coincidence rate of each time sequence data, and acquiring the trend coincidence degree of each time sequence data.
Specifically, S31, obtaining the white noise confidence B_i through a white noise test. The white noise confidence B_i reflects the white noise coincidence rate of the time series data: the larger the white noise coincidence rate, the greater the randomness of the time series data and the poorer its regularity.
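The description does not say which white noise test produces B_i. As one possible illustration, the sketch below computes the Ljung-Box Q statistic, a commonly used white noise test statistic. The choice of test, the function names and the lag count are assumptions, and the mapping from Q to the confidence B_i is deliberately left open.

```python
from typing import Sequence


def autocorrelation(series: Sequence[float], lag: int) -> float:
    """Sample autocorrelation of `series` at the given positive lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box_q(series: Sequence[float], max_lag: int) -> float:
    """Ljung-Box statistic Q = n (n + 2) * sum_k rho_k^2 / (n - k), k = 1..max_lag."""
    n = len(series)
    return n * (n + 2) * sum(
        autocorrelation(series, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


# A strictly alternating series is strongly autocorrelated, so Q is large.
alternating = [1.0 if t % 2 == 0 else -1.0 for t in range(20)]
q_alternating = ljung_box_q(alternating, max_lag=1)
```

Under the hypothesis that the series is white noise, Q approximately follows a chi-square distribution with `max_lag` degrees of freedom, so a large Q is evidence against white noise.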
S32, the step of obtaining the trend conformity degree of each time series data comprises: S321, performing differencing on the time series data to obtain processed first time series data; S322, analyzing the first time series data with a polynomial regression model to obtain a trend equation, and obtaining, from the trend equation, the predicted data value corresponding to each moment of the first time series data; S323, calculating the difference between the data at each moment in the first time series data and the corresponding predicted value; S324, obtaining the mean H_i of the squares of all the differences. The mean of squares H_i is the trend conformity degree; the larger the value of H_i, the less well the time series data conforms to the trend.
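Steps S321 to S324 can be sketched as follows. This is a minimal illustration that assumes first-order differencing and a polynomial of degree 2; neither the differencing order nor the polynomial degree is given in the description, and the function name `trend_conformity` is hypothetical.

```python
import numpy as np


def trend_conformity(series, degree: int = 2) -> float:
    """Return H: mean squared residual of a polynomial fit to the differenced series."""
    diffed = np.diff(np.asarray(series, dtype=float))  # S321: first-order differencing
    t = np.arange(diffed.size)
    coeffs = np.polyfit(t, diffed, degree)             # S322: fit the trend equation
    predicted = np.polyval(coeffs, t)                  # S322: predicted value per moment
    residuals = diffed - predicted                     # S323: data minus prediction
    return float(np.mean(residuals ** 2))              # S324: mean of squared differences


# The differences of a cubic are exactly quadratic, so H is essentially zero.
h_smooth = trend_conformity([float(t) ** 3 for t in range(10)])
# A zig-zag series has alternating differences that a quadratic cannot follow.
h_zigzag = trend_conformity([float(t % 2) for t in range(10)])
```

Consistent with the text above, the series that follows its trend closely yields the smaller H.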
And S4, calculating the complexity of each time sequence data according to the stationarity degree, the white noise coincidence rate and the trend coincidence degree, and calculating the loss influence weight of each time sequence data according to the complexity.
Specifically, the complexity of each time series data is calculated according to the following formula (1):
wherein F_i represents the complexity of the time series data of the i-th category, B_i represents the white noise coincidence rate of the time series data of the i-th category, H_i represents the trend conformity degree of the time series data of the i-th category, P_i represents the stationarity degree of the time series data of the i-th category, and the hyperparameter of formula (1) is taken as 0.2.
The loss influence weight of each time series data is calculated according to the following formula (2):
wherein Q_i represents the loss influence weight of the time series data of the i-th category, F_i represents the complexity of the time series data of the i-th category, and N represents the dimension of the time series data of the i-th category, namely the number of moments.
And S5, calculating the correlation of each two types of time sequence data, and calculating the loss influence degree of each time sequence data according to the correlation and the loss influence weight.
Specifically, the correlation is calculated according to the following formula (3):
wherein I represents the i-th type of time series data, J represents the j-th type of time series data, cov(I, J) represents the covariance of the two data sequences, var(I) represents the variance of the i-th type of time series data, var(J) represents the variance of the j-th type of time series data, and X_{i,j} represents the correlation coefficient of the i-th and j-th types of time series data.
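The symbols listed for formula (3) describe a correlation coefficient built from cov(I, J), var(I) and var(J). The sketch below assumes this is the standard Pearson coefficient cov(I, J) / sqrt(var(I) · var(J)); that reading is an assumption, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation: cov(a, b) / sqrt(var(a) * var(b))."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    return cov / math.sqrt(var_a * var_b)


def correlation_matrix(series_list: Sequence[Sequence[float]]) -> List[List[float]]:
    """X[i][j] for every pair of same-type time series."""
    return [[pearson(a, b) for b in series_list] for a in series_list]


# Hypothetical series: the second rises with the first, the third falls.
x = correlation_matrix([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]])
```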
The degree of influence of loss was calculated according to the following formula (4):
wherein Q_i represents the loss influence weight of the i-th type of time series data, Q_j represents the loss influence weight of the j-th type of time series data, X_{i,j} represents the correlation between the i-th type and the j-th type of time series data, a further term of formula (4) represents the influence of the loss of the i-th type of time series data on the other data, and Y_i represents the loss influence degree of the i-th type of time series data.
S6, acquiring an initialization weight matrix according to the loss influence degree, carrying out network training based on the initialization weight matrix, carrying out data compression of the Internet of things by using the trained network, and outputting compressed data, wherein the loss function adopted by the network is a cross entropy loss function.
Specifically, S61, assume that the weight vector multiplied with the i-th type of time series data of the Internet of Things data is an M-dimensional vector. The initialization rule for positions 1 to M of the input-layer weight matrix is then: construct M normal distributions with the loss influence degree as the mean and σ_1, σ_2, …, σ_M as the variances. According to experience, σ_1, σ_2, …, σ_M are set in terms of δ, S and M, where δ represents the variance of the loss influence degrees of all the time series data, S represents the dimensionality of the data at a single moment, and M represents the number of normal distributions. S62, a corresponding random number is generated from each normal distribution; the random numbers are distributed approximately around the mean, that is, around the loss influence degree. S63, the weight matrix of the input layer is initialized by using each random number as the initial weight value of the corresponding position. S64, the weight matrix of the hidden layer is initialized with the initial weight value α, where α is an empirical value of 10, thereby obtaining the initialization weight matrix. Finally, Internet of Things data compression is performed with the trained network, and the data of the minimum dimension is output as the compressed data.
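Steps S61 to S64 can be sketched as below. This is a minimal illustration under stated assumptions: the loss influence degrees Y_i and the variances σ_1 to σ_M are passed in as plain numbers, because formula (4) and the expression for the σ values are not reproduced here, and the function names, sample values and hidden-layer shape are hypothetical.

```python
import math
import random
from typing import List, Sequence


def init_input_weights(loss_influence: Sequence[float],
                       variances: Sequence[float],
                       seed: int = 0) -> List[List[float]]:
    """S61-S63: draw one input-layer weight per (series type, position) pair.

    Row i holds the M weights applied to the i-th type of time series data.
    Each weight is sampled from a normal distribution whose mean is Y_i and
    whose variance is the corresponding entry of `variances`.
    """
    rng = random.Random(seed)
    return [[rng.gauss(y_i, math.sqrt(var_m)) for var_m in variances]
            for y_i in loss_influence]


def init_hidden_weights(rows: int, cols: int, alpha: float = 10.0) -> List[List[float]]:
    """S64: every hidden-layer weight starts at the constant alpha."""
    return [[alpha] * cols for _ in range(rows)]


loss_influence = [0.2, 0.5, 0.9]       # hypothetical Y_i values
variances = [0.01, 0.02, 0.03, 0.04]   # hypothetical sigma_1 .. sigma_M
w_in = init_input_weights(loss_influence, variances)
w_hidden = init_hidden_weights(rows=4, cols=2)
```

Because `random.gauss` takes a standard deviation, the sketch passes the square root of each variance. Fixing the seed makes the initialization reproducible between runs.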
The invention also discloses an intelligent Internet of Things big data transmission system, which comprises a data segmentation module, a first data processing module, a second data processing module, a third data processing module, a fourth data processing module and a data transmission module. The data segmentation module is used for segmenting the Internet of Things data to obtain a data vector for each moment, classifying the data vectors of all moments to obtain a plurality of sets of same-type data vectors, and forming the data vectors of each type into time series data. The first data processing module is used for acquiring the confidence that each time series data has a unit root and calculating the stationarity degree of the time series data according to the confidence. The second data processing module is used for acquiring the white noise coincidence rate of each time series data and acquiring the trend conformity degree of each time series data. The third data processing module is used for calculating the complexity of each time series data according to the stationarity degree, the white noise coincidence rate and the trend conformity degree, and calculating the loss influence weight of each time series data according to the complexity. The fourth data processing module is used for calculating the correlation between every two types of time series data and calculating the loss influence degree of each time series data according to the correlation and the loss influence weight. The data transmission module is used for acquiring the initialization weight matrix according to the loss influence degree, completing the training of an autoencoder network based on the initialization weight matrix, extracting the data of the minimum dimension after the network training is completed, and transmitting the data of the minimum dimension to the target port.
In summary, the invention provides an intelligent Internet of Things big data transmission method and system, in which the loss influence weight of each time series data is obtained from the complexity of its data variation, the loss influence degree of the data is obtained from the loss influence weights and the correlations among the types of time series data, the initialization weight matrix is obtained from the loss influence degrees, and network training is carried out starting from this initialization weight matrix, so that the network training converges quickly, the training time is shortened, and the transmission speed of the output compressed data is improved.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and should not be taken as limiting the scope of the present invention, which is intended to cover any modifications, equivalents, improvements, etc. within the spirit and scope of the present invention.