CN117786539A - Construction site environment quality early warning method and system based on environment parameters - Google Patents


Info

Publication number
CN117786539A
Authority
CN
China
Prior art keywords
data
early warning
environmental
parameters
environmental parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311780946.XA
Other languages
Chinese (zh)
Inventor
翟晓萌
胡余文
陈泉
王静怡
程曦
孙海森
仓敏
吴霜
诸德律
薛磊
刘剑
武永宝
贾玉斌
董玲风
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Economic and Technological Research Institute of State Grid Jiangsu Electric Power Co Ltd
Original Assignee
Southeast University
Economic and Technological Research Institute of State Grid Jiangsu Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University and Economic and Technological Research Institute of State Grid Jiangsu Electric Power Co Ltd
Priority to CN202311780946.XA
Publication of CN117786539A
Legal status: Pending

Landscapes

  • Testing Or Calibration Of Command Recording Devices (AREA)

Abstract

The invention discloses a construction site environment quality early warning method and system based on environmental parameters. The method performs feature extraction, standardization and dimension raising on the environmental parameters, including noise data and dust data, collected on a construction site. On the one hand, this design mines the intrinsic features of the environmental parameters and improves the detection precision for abnormal data; on the other hand, it places low requirements on the data volume and data sources of the noise data and dust data, so that a large number of sensors need not be installed on the construction site, which meets the needs of actual construction sites. The method constructs an adaptive algorithm model based on a random forest basic classifier as the early warning model, and uses the early warning model to output the security label corresponding to the current environmental parameters for early warning judgment.

Description

Construction site environment quality early warning method and system based on environment parameters
Technical Field
The invention belongs to the field of environmental parameter monitoring and early warning, and particularly relates to a construction site environmental quality early warning method and system based on environmental parameters.
Background
A power grid project requires environmental protection monitoring of the construction site during construction in order to control noise and dust emissions. At present, this monitoring relies mainly on environmental parameters acquired by sensors installed on the construction site (such as noise sensors and PM2.5 sensors), with early warning performed by a classification/detection model or algorithm.
In one approach, an optimization-based reconstruction error function is introduced through an autoencoder, and the resulting prediction model performs well in early warning for the power grid environment. However, autoencoder-based algorithms need a large amount of multi-source sample data to train the model. Constrained by the construction environment and the cost of power grid projects, many sensors cannot be deployed on a construction site, so multi-source data cannot be acquired in the variety and volume that the samples require, and this approach is therefore unsuitable for early warning on actual construction sites.
In addition, random forest classifiers have been used to classify on-site environmental parameters and issue early warnings. Although this achieves good early warning precision, it still places high requirements on data sources, and the random forest algorithm also suffers from overfitting. To improve the inversion accuracy of the random forest, random forest has been combined with Bayesian hyperparameter optimization; although this is an adaptive algorithm, tuning the hyperparameters still requires expert knowledge, and the model adapts poorly to different data sets; in the presence of noise in particular, its generalization ability is mediocre.
Other studies have used support vector machines to classify small-sample multi-source power grid data accurately and effectively. However, solving for the parameters of a support vector machine is a time-consuming quadratic programming problem, so several improved support vector machine methods have been proposed, including the least squares support vector machine, which predicts better than the other methods but cannot guarantee a globally optimal solution.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a construction site environment quality early warning method and system based on environment parameters.
The technical scheme of the invention is as follows:
a construction site environmental quality early warning method based on environmental parameters including noise data acquired by a single or a plurality of noise sensors and dust data acquired by a single or a plurality of dust sensors, comprising:
acquiring historical environment parameters, extracting characteristic data of the historical environment parameters, and carrying out standardized processing on the extracted characteristic data to obtain initial data;
performing dimension lifting on the initial data to obtain sample data, and acquiring a security tag for early warning judgment corresponding to the sample data;
constructing a self-adaptive algorithm model based on a random forest basic classifier as an early warning model, and training the early warning model by taking the sample data and the corresponding security label thereof as training data;
and outputting a security label corresponding to the current environmental parameter by using the trained early warning model to perform early warning judgment.
Further, the specific step of extracting the characteristic data of the historical environmental parameters includes:
1) Dividing the noise data and dust data in the historical environmental parameters according to their corresponding sensors, and time-matching the divided historical environmental parameters to form an original data set $\Omega = \{d_j \mid j = 1, 2, \dots, N\}$, where $d_j$ is the j-th array in the original data set and $N$ is the total number of arrays in the original data set; each array in the original data set contains the environmental parameters of all classes at the same time instant, $d_j = [d_{1j}, d_{2j}, \dots, d_{ij}, \dots, d_{Ij}]$, where $d_{1j}, d_{2j}, \dots, d_{ij}, \dots, d_{Ij}$ are the class 1, 2, ..., I environmental parameters in the j-th array and $I$ is the total number of classes of environmental parameters;
calculating the mean of each class of environmental parameters in the original data set and the overall mean of the arrays in the original data set, as follows:
$$m_i = \frac{1}{N}\sum_{j=1}^{N} d_{ij}$$
where $m_i$ is the mean of the class-i environmental parameters and $d_{ij}$ is the class-i environmental parameter in the j-th array;
$$m = \frac{1}{N I}\sum_{i=1}^{I}\sum_{j=1}^{N} d_{ij}$$
where $m$ is the overall mean of the arrays in the original data set;
2) Calculating the intra-class scatter matrix of each class of environmental parameters and the inter-class scatter matrix $S_b$ between the classes of environmental parameters;
3) Searching for the optimal projection vectors of the noise data and the dust data using the Fisher criterion function $J_i(\omega_i)$, where $J_i(\omega_i)$ is the output of the Fisher criterion function for the class-i environmental parameters and $\omega_i$ is the optimal projection vector of the class-i environmental parameters;
4) Projecting each class of environmental parameters in the original data set along the optimal projection vector of the corresponding class to obtain the feature data of each class, thereby forming a feature data set $X' = \{x'_j \mid j = 1, 2, \dots, N\}$, where $x'_j$ is the j-th array in the feature data set and corresponds to the j-th array $d_j$ in the original data set; each array in the feature data set contains the feature data of all classes of environmental parameters at the same time instant, $x'_j = [x'_{1j}, x'_{2j}, \dots, x'_{ij}, \dots, x'_{Ij}]$, where $x'_{1j}, x'_{2j}, \dots, x'_{ij}, \dots, x'_{Ij}$ are the feature data corresponding to the class 1, 2, ..., I environmental parameters in the j-th array.
Further, the standardization adopts the min-max (deviation) standardization method, specifically: each class of environmental parameters in the feature data set is standardized separately to form an initial data set $X = \{x_j \mid j = 1, 2, \dots, N\}$, where $x_j$ is the j-th array in the initial data set, $x_j = [x_{1j}, x_{2j}, \dots, x_{ij}, \dots, x_{Ij}]$, and $x_{1j}, x_{2j}, \dots, x_{ij}, \dots, x_{Ij}$ are the initial values corresponding to the class 1, 2, ..., I environmental parameters in the j-th array, as follows:
$$x_{ij} = \frac{x'_{ij} - x_{i,\min}}{x_{i,\max} - x_{i,\min}}$$
where $x_{ij}$ is the initial value corresponding to the class-i environmental parameter in the j-th array of the initial data set; $x'_{ij}$ is the feature data corresponding to the class-i environmental parameter in the j-th array of the feature data set; $x_{i,\min}$ is the minimum of the feature data corresponding to the class-i environmental parameter; and $x_{i,\max}$ is the maximum of the feature data corresponding to the class-i environmental parameter.
Further, the specific step of performing dimension lifting on the initial data to obtain sample data includes:
1) The initial data are raised in dimension by a dimension-raising function that maps the j-th array $x_j$ in the initial data set to its corresponding dimension-raised sample. The function involves an expansion coefficient applied after the dimension raising; $\lambda$ is the exponential coefficient of the Gaussian kernel function; $\|\cdot\|$ is the norm operator;
$x_{1j}$, $x_{2j}$ and $x_{1j}^{k}$ are, respectively, the initial value corresponding to the class-1 environmental parameter in the j-th array of the initial data set, the initial value corresponding to the class-2 environmental parameter in the j-th array of the initial data set, and the k-th power of the initial value corresponding to the class-1 environmental parameter in the j-th array of the initial data set; $k$ is the dimension; exp is the exponential function with the natural constant $e$ as its base;
2) Based on the dimension-raising result, a sample data set is formed, whose j-th sample is the dimension-raised sample corresponding to $x_j$.
Further, the specific steps of constructing the adaptive algorithm model based on the random forest basic classifier as the early warning model comprise:
1) Constructing a decision tree model and a classifier, initializing the decision tree model parameters, the classifier weight, the adaptive algorithm update gradient threshold $\epsilon$, the sample weights and the error rate weight, and setting the iteration count $t = 1$;
2) Iteratively updating the sample weights, the classifier weight and the error rate weight. The quantities appearing in the iterative update formulas are: the classifier outputs for the j-th sample in the t-th and (t-1)-th iterative updates; the classifier weights for the j-th sample in the t-th and (t-1)-th iterative updates; the gradient step size $\theta$ of the adaptive algorithm; the decision tree model outputs for the j-th sample; the weights $\omega_{t,j}$ and $\omega_{t-1,j}$ of the j-th sample in the t-th and (t-1)-th iterative updates; the security label $y_j$ of the j-th sample; and the gradient value of the loss function Loss;
3) Comparing the loss function Loss with the adaptive algorithm update gradient threshold $\epsilon$: if Loss is smaller than the threshold, setting the iteration count $t = t + 1$ and returning to step 2); otherwise, generating the adaptive algorithm model as the early warning model.
Further, the specific step of outputting the security tag corresponding to the current environmental parameter by using the trained early warning model includes:
extracting feature data of the current environment parameters, and carrying out standardization processing on the extracted feature data to obtain standardized data; and carrying out dimension lifting on the standardized data to obtain sample data corresponding to the current environmental parameters, inputting the sample data corresponding to the current environmental parameters into a trained early warning model, and finally utilizing the obtained early warning model to process the latest acquired current environmental parameters in real time and output security labels corresponding to the current environmental parameters.
Further, the method for acquiring the security tag for early warning judgment corresponding to the sample data comprises the following steps: and acquiring the security tag of the sample data based on the size of the historical environment parameter corresponding to the sample data.
A construction site environment quality early warning system based on environmental parameters, the environmental parameters comprising noise data and dust data, the system comprising a preprocessing module, a data dimension-raising module, a label acquisition module, a model construction and training module and an early warning judging module;
the preprocessing module is used for acquiring historical environment parameters, extracting characteristic data of the historical environment parameters, and carrying out standardized processing on the extracted characteristic data to obtain initial data;
the data dimension-raising module is used for raising the dimension of the initial data to obtain sample data;
the label acquisition module is used for acquiring a security label which corresponds to the sample data and is used for early warning judgment;
the model construction and training module is used for constructing an adaptive algorithm model based on a random forest basic classifier as an early warning model, and training the early warning model by taking the sample data and the corresponding security labels thereof as training data;
and the early warning judging module is used for outputting a security label corresponding to the current environmental parameter by using the trained early warning model to perform early warning judgment.
An electronic device comprising a memory storing a computer program and a processor for invoking and running the computer program stored in the memory to perform the method according to any of the preceding claims.
A computer readable storage medium storing a computer program which, when executed by a processor, performs the steps of the method as claimed in any one of the preceding claims.
Compared with the prior art, the invention has the following beneficial effects:
the invention provides a construction site environment quality early warning method based on environment parameters, which is characterized in that the environment parameters including noise data and dust data in a construction site are subjected to feature extraction, standardization and dimension lifting processing, on one hand, the design digs the intrinsic features of the environment parameters, the detection precision of abnormal data is improved, on the other hand, the design has low requirements on the data quantity and the data source of the noise data and the dust data, and a plurality of sensors are not required to be arranged in the construction site, so that the actual construction site requirements are met.
The invention constructs an adaptive algorithm model based on a random forest basic classifier as the early warning model and uses it to output the security label corresponding to the current environmental parameters for early warning judgment. The weights of the decision trees are adjusted automatically according to the data type in a deep learning manner, and the model can converge to the globally optimal solution, which improves the accuracy of abnormal data detection and the generalization ability of the early warning model, so that effective early warning can be given for the environmental quality of the construction site.
Drawings
FIG. 1 is an overall scheme of a construction site environmental quality early warning method based on environmental parameters;
FIG. 2 is a flow chart of a construction site environmental quality pre-warning method based on environmental parameters;
FIG. 3 is a comparison graph of early warning accuracy in the first round of implementation;
FIG. 4 is a comparison graph of early warning accuracy in the second round of implementation;
FIG. 5 is a comparison graph of early warning accuracy in the third round of implementation using the Ada-RF model.
Detailed Description
The invention will be further described with reference to specific embodiments and corresponding drawings.
Example 1
The invention relates to a construction site environmental quality early warning method based on environmental parameters, wherein the environmental parameters comprise noise data acquired by single or multiple noise sensors and dust data acquired by single or multiple dust sensors (the dust sensors can be PM2.5 sensors), and as shown in fig. 1 and 2, the early warning method specifically comprises the following steps:
acquiring historical environment parameters, extracting characteristic data (namely multi-source data characteristic extraction) of the historical environment parameters, and carrying out standardized processing on the extracted characteristic data to obtain initial data;
carrying out data dimension lifting on the initial data by utilizing a Gaussian kernel function to obtain sample data, and acquiring a security tag which corresponds to the sample data and is used for early warning judgment;
constructing a self-adaptive algorithm model based on a random forest basic classifier as an early warning model, and training the early warning model by taking sample data and corresponding security labels thereof as training data;
and outputting a security label corresponding to the current environmental parameter by using the trained early warning model to perform early warning judgment.
Further, the specific steps of extracting the characteristic data of the historical environment parameters include:
1) Dividing the noise data and dust data in the historical environmental parameters according to their corresponding sensors, and time-matching the divided historical environmental parameters to form an original data set $\Omega = \{d_j \mid j = 1, 2, \dots, N\}$, where $d_j$ is the j-th array in the original data set and $N$ is the total number of arrays in the original data set; each array in the original data set contains the environmental parameters of all classes at the same time instant, $d_j = [d_{1j}, d_{2j}, \dots, d_{ij}, \dots, d_{Ij}]$, where $d_{1j}, d_{2j}, \dots, d_{ij}, \dots, d_{Ij}$ are the class 1, 2, ..., I environmental parameters in the j-th array and $I$ is the total number of classes of environmental parameters, i.e. the total number of sensors; $I$ is not less than 2, i.e. there is at least one class of environmental parameters corresponding to noise data and another class corresponding to dust data;
The formation of the original data set is explained in detail below with an example in which the environmental parameters consist of noise data acquired by two different noise sensors (noise sensor 1 and noise sensor 2) and dust data acquired by one dust sensor (dust sensor 1), giving three classes of environmental parameters: the class-1 environmental parameter is the noise data from noise sensor 1, the class-2 environmental parameter is the noise data from noise sensor 2, and the class-3 environmental parameter is the dust data from dust sensor 1. The sensors have the same sampling period and the same initial sampling instant, and after time matching the original data set $\Omega = \{d_j \mid j = 1, 2, \dots, N\}$ is formed. Taking the 1st array $d_1$ as an example, $d_1 = [d_{11}, d_{21}, d_{31}]$, where $d_{11}$ is the class-1 environmental parameter in the 1st array (noise data from noise sensor 1), $d_{21}$ is the class-2 environmental parameter in the 1st array (noise data from noise sensor 2), and $d_{31}$ is the class-3 environmental parameter in the 1st array (dust data from dust sensor 1).
Calculating the mean of each class of environmental parameters in the original data set and the overall mean of the arrays in the original data set, as follows:
$$m_i = \frac{1}{N}\sum_{j=1}^{N} d_{ij}$$
where $m_i$ is the mean of the class-i environmental parameters and $d_{ij}$ is the class-i environmental parameter in the j-th array;
$$m = \frac{1}{N I}\sum_{i=1}^{I}\sum_{j=1}^{N} d_{ij}$$
where $m$ is the overall mean of the arrays in the original data set;
2) Calculating the intra-class scatter matrix of each class of environmental parameters and the inter-class scatter matrix $S_b$ between the classes of environmental parameters;
3) Searching for the optimal projection vectors of the noise data and the dust data using the Fisher criterion function $J_i(\omega_i)$, where $J_i(\omega_i)$ is the output of the Fisher criterion function for the class-i environmental parameters and $\omega_i$ is the optimal projection vector of the class-i environmental parameters;
4) Projecting each class of environmental parameters in the original data set along the optimal projection vector of the corresponding class to obtain the feature data of each class, thereby forming a feature data set $X' = \{x'_j \mid j = 1, 2, \dots, N\}$, where $x'_j$ is the j-th array in the feature data set and corresponds to the j-th array $d_j$ in the original data set; each array in the feature data set contains the feature data of all classes of environmental parameters at the same time instant, $x'_j = [x'_{1j}, x'_{2j}, \dots, x'_{ij}, \dots, x'_{Ij}]$, where $x'_{1j}, x'_{2j}, \dots, x'_{ij}, \dots, x'_{Ij}$ are the feature data corresponding to the class 1, 2, ..., I environmental parameters in the j-th array.
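The statistics used in steps 1) and 2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each reading is treated as a scalar, the scatter values use the textbook Fisher definitions (sums of squared deviations), and the function name `class_statistics` and the sample readings are hypothetical, not taken from the patent.

```python
from typing import List, Tuple


def class_statistics(
    dataset: List[List[float]],
) -> Tuple[List[float], float, List[float], float]:
    """dataset[j][i] is the class-i reading in the j-th time-matched array."""
    n = len(dataset)
    num_classes = len(dataset[0])
    # Per-class mean m_i over all N arrays.
    class_means = [sum(row[i] for row in dataset) / n for i in range(num_classes)]
    # Overall mean m over every reading (all classes have N readings).
    overall_mean = sum(class_means) / num_classes
    # Assumed scalar intra-class scatter: squared deviations from m_i.
    within = [
        sum((row[i] - class_means[i]) ** 2 for row in dataset)
        for i in range(num_classes)
    ]
    # Assumed scalar inter-class scatter: squared deviations of m_i from m.
    between = sum((m_i - overall_mean) ** 2 for m_i in class_means)
    return class_means, overall_mean, within, between


# Hypothetical readings: two noise sensors and one dust sensor, two time steps.
omega = [[60.0, 62.0, 30.0], [64.0, 66.0, 34.0]]
means, m, s_w, s_b = class_statistics(omega)
```

A larger ratio of inter-class to intra-class scatter indicates classes that are easier to separate, which is the quantity a Fisher criterion seeks to maximise.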
Further, the standardization adopts the min-max (deviation) standardization method, specifically: each class of environmental parameters in the feature data set is standardized separately to form an initial data set $X = \{x_j \mid j = 1, 2, \dots, N\}$, where $x_j$ is the j-th array in the initial data set, $x_j = [x_{1j}, x_{2j}, \dots, x_{ij}, \dots, x_{Ij}]$, and $x_{1j}, x_{2j}, \dots, x_{ij}, \dots, x_{Ij}$ are the initial values corresponding to the class 1, 2, ..., I environmental parameters in the j-th array, as follows:
$$x_{ij} = \frac{x'_{ij} - x_{i,\min}}{x_{i,\max} - x_{i,\min}}$$
where $x_{ij}$ is the initial value corresponding to the class-i environmental parameter in the j-th array of the initial data set; $x'_{ij}$ is the feature data corresponding to the class-i environmental parameter in the j-th array of the feature data set; $x_{i,\min}$ is the minimum of the feature data corresponding to the class-i environmental parameter; and $x_{i,\max}$ is the maximum of the feature data corresponding to the class-i environmental parameter.
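The per-class standardization can be illustrated with a short Python sketch. It assumes min-max scaling to the range [0, 1], which is what the per-class minimum and maximum in the definitions above suggest; the function name `min_max_normalize`, the handling of a constant column and the sample values are illustrative assumptions.

```python
from typing import List


def min_max_normalize(features: List[List[float]]) -> List[List[float]]:
    """features[j][i] is the feature value of class i in the j-th array."""
    num_classes = len(features[0])
    lows = [min(row[i] for row in features) for i in range(num_classes)]
    highs = [max(row[i] for row in features) for i in range(num_classes)]
    normalized = []
    for row in features:
        scaled = []
        for i, value in enumerate(row):
            span = highs[i] - lows[i]
            # Assumption: a constant column maps to 0.0 to avoid dividing by zero.
            scaled.append((value - lows[i]) / span if span else 0.0)
        normalized.append(scaled)
    return normalized


# Hypothetical feature data for two classes over three time steps.
x = min_max_normalize([[2.0, 10.0], [4.0, 30.0], [6.0, 20.0]])
```

Scaling each class separately keeps a noise channel and a dust channel on the same numeric range even though their physical units differ.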
Further, the dimension raising is performed using a Gaussian kernel function, whose formula is as follows:
$$K(x_j, x_p) = \exp\left(-\lambda \|x_j - x_p\|^2\right)$$
where $K$ is the Gaussian kernel function operator; $x_j$ and $x_p$ are the j-th and p-th arrays in the initial data set, respectively, with $j, p \in \{1, 2, \dots, N\}$ and $j \neq p$; $\lambda$ is the exponential coefficient of the Gaussian kernel function; exp is the exponential function with the natural constant $e$ as its base;
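As a small illustration of the kernel, the Python sketch below evaluates it for two arrays. It assumes the common form exp(-λ‖x_j − x_p‖²) with a squared Euclidean norm; the function name `gaussian_kernel` and the parameter name `lam` are hypothetical.

```python
import math
from typing import Sequence


def gaussian_kernel(x_j: Sequence[float], x_p: Sequence[float], lam: float) -> float:
    """Gaussian kernel value for two arrays of equal length (assumed form)."""
    # Squared Euclidean distance between the two arrays.
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_j, x_p))
    # lam plays the role of the exponential coefficient lambda.
    return math.exp(-lam * squared_distance)


# Identical arrays give 1.0; the value decays as the arrays move apart.
same = gaussian_kernel([0.2, 0.8], [0.2, 0.8], lam=0.5)
apart = gaussian_kernel([1.0, 0.0], [0.0, 0.0], lam=1.0)
```

A larger λ makes the kernel decay faster, so only very similar arrays are treated as close.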
the method for obtaining the sample data comprises the following specific steps of:
1) The initial data are raised in dimension by a dimension-raising function that maps the j-th array $x_j$ in the initial data set to its corresponding dimension-raised sample. The function involves an expansion coefficient applied after the dimension raising; $\lambda$ is the exponential coefficient of the Gaussian kernel function; $\|\cdot\|$ is the norm operator;
$x_{1j}$, $x_{2j}$ and $x_{1j}^{k}$ are, respectively, the initial value corresponding to the class-1 environmental parameter in the j-th array of the initial data set, the initial value corresponding to the class-2 environmental parameter in the j-th array of the initial data set, and the k-th power of the initial value corresponding to the class-1 environmental parameter in the j-th array of the initial data set; $k$ is the dimension, whose value is determined empirically according to the actual situation and generally ranges from 6 to 10; in this example, $k$ is 8. exp is the exponential function with the natural constant $e$ as its base;
2) Based on the dimension-raising result, a sample data set is formed, whose j-th sample is the dimension-raised sample corresponding to $x_j$.
Further, as shown in fig. 2, the specific steps of constructing the adaptive algorithm model based on the random forest basic classifier as the early warning model in this example include:
1) Constructing a decision tree model and a classifier, initializing the decision tree model parameters, the classifier weight, the adaptive algorithm update gradient threshold $\epsilon$, the sample weights (generally divided equally at initialization) and the error rate weight (generally set to 0 at initialization), and setting the iteration count $t = 1$;
2) Iteratively updating the sample weights, the classifier weight and the error rate weight. The quantities appearing in the iterative update formulas are: the classifier outputs for the j-th sample in the t-th and (t-1)-th iterative updates; the classifier weights for the j-th sample in the t-th and (t-1)-th iterative updates; the gradient step size $\theta$ of the adaptive algorithm; the decision tree model outputs for the j-th sample; the weights $\omega_{t,j}$ and $\omega_{t-1,j}$ of the j-th sample in the t-th and (t-1)-th iterative updates; the security label $y_j$ of the j-th sample; and the gradient value of the loss function Loss;
3) Comparing the loss function Loss with the adaptive algorithm update gradient threshold $\epsilon$: if Loss is smaller than the threshold, setting the iteration count $t = t + 1$ and returning to step 2); otherwise, generating the adaptive algorithm model as the early warning model.
Further, the specific step of outputting the security tag corresponding to the current environmental parameter by using the trained early warning model comprises the following steps:
extracting feature data of the current environment parameters, and carrying out standardization processing on the extracted feature data to obtain standardized data; and carrying out dimension lifting on the standardized data to obtain sample data corresponding to the current environment parameters, inputting the sample data corresponding to the current environment parameters into a trained early warning model, and finally utilizing the obtained early warning model to process the latest acquired current environment parameters in real time and output security labels corresponding to the current environment parameters.
Further, the method for acquiring the security tag for early warning judgment corresponding to the sample data comprises the following steps: and acquiring security labels of the sample data based on the size of the historical environmental parameters corresponding to the sample data (a manual/expert evaluation method can be adopted), wherein the security labels are normal and abnormal.
Example two
The invention also provides a construction site environmental quality early warning system based on environmental parameters, wherein the environmental parameters comprise noise data and dust data, and the system comprises a preprocessing module, a data dimension increasing module, a label acquisition module, a model construction and training module and an early warning judging module;
the preprocessing module is used for acquiring historical environment parameters, extracting characteristic data of the historical environment parameters, and carrying out standardized processing on the extracted characteristic data to obtain initial data;
the data dimension-raising module is used for raising the dimension of the initial data to obtain sample data;
the label acquisition module is used for acquiring a security label which corresponds to the sample data and is used for early warning judgment;
the model construction and training module is used for constructing an adaptive algorithm model based on a random forest basic classifier as an early warning model, and training the early warning model by taking sample data and a corresponding security label thereof as training data;
and the early warning judging module is used for outputting a security label corresponding to the current environmental parameter by using the trained early warning model to perform early warning judgment.
Example III
The invention also provides an electronic device comprising a memory storing a computer program and a processor for invoking and running the computer program stored in the memory to perform the method according to any of the embodiments described above.
Example IV
The invention also provides a computer readable storage medium storing a computer program which when executed by a processor performs the steps of a method according to any of the embodiments described above.
Example five
The early warning method is simulated and implemented on a construction site, and noise and dust generated by construction on the construction site are required to be monitored so as to realize the purpose of complete and accurate environmental early warning on the construction site. In the embodiment, the noise sensors and the PM2.5 sensors are respectively arranged at 8 different positions of the construction site, the sampling time interval of each sensor is kept consistent, sampled data are transmitted to a server local to the construction site in a wireless communication mode, then a field expert judges whether each group of data need to be pre-warned or not, a label corresponding to each group of data is given, and then the server uploads the sampled data to a database to generate a sample training set. The computer program is deployed at the server, and when the sample training set meets the training quantity requirement, the program calls the sample training set data to carry out iterative updating of the self-adaptive algorithm model (early warning model). And finally, applying the updated model to a construction site, and carrying out automatic early warning judgment on the environmental parameters collected each time.
In this case, 20000 pieces of the above environmental parameter data are selected, and the security labels are divided into a normal environment type and an abnormal environment type. The implementation was carried out in three rounds, with each set of environmental parameters containing 8 data items in total. The first round randomly selects 2 of the 8 data items (one noise data item and one dust data item), and the second round randomly selects 4 of the 8 data items (two noise data items and two dust data items); these first two rounds are performance comparison tests at the algorithm level. The third round is a comparative test of the number of sensors used, in which 2 (one noise and one dust), 4 (two noise and two dust), 5 (3 noise and 2 dust), 6 (3 noise and 3 dust) and 8 (4 noise and 4 dust) data items are selected from the 8 data items, respectively. All three rounds use a random forest algorithm to establish the prediction model. In this example, the 20000 pieces of data were divided into 50 parts, one part was randomly selected as the validation set, and the remaining 49 parts were used as the training set. The number of decision trees in the random forest is 50, and 50 rounds of cross-validation are performed to classify the environmental parameter data.
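The 50-part split described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names `split_into_parts` and `holdout_split` and the fixed random seeds are assumptions made for the example and do not come from the source.

```python
import random

def split_into_parts(n_samples, n_parts, seed=0):
    """Shuffle the sample indices and deal them into n_parts equal-sized parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[p::n_parts] for p in range(n_parts)]

def holdout_split(parts, held_out):
    """Use one part as the validation set and all remaining parts for training."""
    validation = parts[held_out]
    training = [i for p, part in enumerate(parts) if p != held_out for i in part]
    return training, validation

# 20000 samples divided into 50 parts of 400 samples each.
parts = split_into_parts(n_samples=20000, n_parts=50)

# One part is chosen at random as the validation set; the other 49 are for training.
chosen = random.Random(1).randrange(50)
train_idx, val_idx = holdout_split(parts, held_out=chosen)
print(len(train_idx), len(val_idx))  # 19600 400
```

Repeating `holdout_split` with each of the 50 parts in turn would give the 50 rounds of cross-validation mentioned in the text.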
For a sample set consisting of 20000 data items and labels, $\Omega = \{d_j \mid l_i = c_i\}$, $j \in \{1, 2, \dots, 20000\}$, represents the set of 20000 data items, and $y_j \in \{-1, 1\}$ represents the label of each data item.
In the power grid environment-friendly intelligent management and control method, the mathematical model of the strong classifier is:
$$S_t(x) = S_{t-1}(x) + \alpha_t(x) \cdot R_t(x)$$
where $\alpha_t(x)$ represents the weight of the basic classifier, $S_t(x)$ represents the strong classifier model after $t$ iterations, and $R_t(x)$ represents the output of the weak classifier at iteration $t$, with $R_t(x) \in \{-1, 1\}$. The final category of $x$ is given by $\operatorname{sign}(S_t(x))$.
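As a worked illustration of the additive update and the sign rule, consider three iterations for a single sample $x$, starting from $S_0(x) = 0$. The weights and weak-classifier outputs below are hypothetical numbers chosen for the example, not values from the implementation.

```latex
\begin{aligned}
S_1(x) &= S_0(x) + \alpha_1 R_1(x) = 0 + 0.8 \cdot (+1) = 0.8 \\
S_2(x) &= S_1(x) + \alpha_2 R_2(x) = 0.8 + 0.5 \cdot (-1) = 0.3 \\
S_3(x) &= S_2(x) + \alpha_3 R_3(x) = 0.3 + 0.4 \cdot (-1) = -0.1 \\
\operatorname{sign}(S_3(x)) &= -1
\end{aligned}
```

Here two lower-weight weak classifiers together overturn the vote of the first one, so the sample is assigned the category $-1$.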
The loss function in the iterative process of the adaptive algorithm is defined as:
$$L(x, y) = L\big(S_{t-1}(x) + \alpha_t \cdot R_t(x),\ y\big)$$
our goal is to minimize the loss function, defined as follows:
for the adaptive algorithm, our goal is achieved by minimizing the loss function, the mathematical expression of which is as follows:
the specific implementation steps are as follows:
Step 1, extract the feature data corresponding to multi-source data such as noise and dust. Define $\Omega = \{d_j \mid l_i = c_i\}$, where $c_i$ represents the class of the data in the original data set. The mean of the data samples of each category in the original data set is calculated separately and denoted $m_i$, where $N_i$ is the number of samples in each class, $i$ is the class index, and $j$ is the data index.
Then, for each category, its intra-class scatter matrix is calculated, which measures the degree of spread of different sample data within the same category. The inter-class scatter matrix of all classes is then calculated as $S_b = \sum_i N_i \, (m_i - m)^2$, where $m$ is the overall mean of all classes.
Finally, the optimal projection vector $\omega$ is sought through the Fisher criterion function $J_i(\omega)$ defined for each category: maximizing $J_i(\omega)$ yields the optimal projection vector $\omega$. Each original sample is then mapped to new feature data according to its corresponding optimal projection vector, and the set of all new feature data is defined as the data set $X = \{x_j \mid j = 1, 2, \dots, N\}$, where $N$ is the data set sample length.
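The idea behind a Fisher projection can be sketched with the textbook two-class Fisher discriminant, whose optimal direction is proportional to $S_w^{-1}(m_a - m_b)$. This is an assumption made for illustration: the exact per-category criterion used by the method is not reproduced in the text, and the sensor readings below (noise in dB, PM2.5 in µg/m³) are made-up values.

```python
def mean(vectors):
    """Component-wise mean of a list of 2-D points."""
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(2)]

def scatter(vectors, m):
    """2x2 intra-class scatter matrix: sum of (v - m)(v - m)^T."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for v in vectors:
        d = [v[0] - m[0], v[1] - m[1]]
        for a in range(2):
            for b in range(2):
                s[a][b] += d[a] * d[b]
    return s

def class_stats(class_a, class_b):
    """Return the mean difference and the pooled intra-class scatter matrix."""
    m_a, m_b = mean(class_a), mean(class_b)
    s_a, s_b = scatter(class_a, m_a), scatter(class_b, m_b)
    s_w = [[s_a[r][c] + s_b[r][c] for c in range(2)] for r in range(2)]
    return [m_a[0] - m_b[0], m_a[1] - m_b[1]], s_w

def fisher_direction(diff, s_w):
    """Solve S_w * w = diff for w using the closed-form 2x2 inverse."""
    det = s_w[0][0] * s_w[1][1] - s_w[0][1] * s_w[1][0]
    return [(s_w[1][1] * diff[0] - s_w[0][1] * diff[1]) / det,
            (-s_w[1][0] * diff[0] + s_w[0][0] * diff[1]) / det]

def fisher_criterion(w, diff, s_w):
    """J(w) = (w . diff)^2 / (w^T S_w w): between-class over within-class spread."""
    num = (w[0] * diff[0] + w[1] * diff[1]) ** 2
    den = sum(w[a] * s_w[a][b] * w[b] for a in range(2) for b in range(2))
    return num / den

def project(vectors, w):
    """Map each 2-D point to a single feature value along direction w."""
    return [v[0] * w[0] + v[1] * w[1] for v in vectors]

# Hypothetical (noise, PM2.5) readings for the two label classes.
normal = [(55.0, 30.0), (58.0, 35.0), (60.0, 28.0), (57.0, 33.0)]
abnormal = [(80.0, 90.0), (85.0, 110.0), (78.0, 95.0), (83.0, 100.0)]

diff, s_w = class_stats(normal, abnormal)
w = fisher_direction(diff, s_w)
print(project(normal, w))
print(project(abnormal, w))
```

Along the direction `w` the two groups of projected values do not overlap, and `fisher_criterion` is larger for `w` than for either raw sensor axis, which is the sense in which the projection is optimal.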
Step 2, standardize the obtained feature data set $X$, using the standard deviation standardization method:
$$x'_j = \frac{x_j - x_{\min}}{x_{\max} - x_{\min}}$$
where $x_{\min}$ represents the minimum value of the index, $x_{\max}$ represents the maximum value of the index, and $x'_j$ represents the normalized data.
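Since the standardization is defined in terms of the minimum and maximum of the index, it can be sketched as a min-max rescaling to the range [0, 1]. The function name `min_max_normalize`, the sample values and the handling of a constant column are assumptions made for this example.

```python
def min_max_normalize(values):
    """Rescale values to [0, 1] using the column minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical feature values for one environmental parameter.
print(min_max_normalize([52.0, 61.0, 70.0, 88.0]))  # [0.0, 0.25, 0.5, 1.0]
```

Applying the function separately to each type of environmental parameter keeps noise and dust features on a comparable scale before the dimension-raising step.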
Step 3, the obtained standardized data are raised in dimension through a Gaussian kernel function, so that more useful information can be extracted. The data are raised in dimension using the following formula:
In the formula, the terms are: the dimension-raising function; the sample corresponding to the $j$-th array $x_j$ of the initial data set after dimension raising; the expansion coefficient after dimension raising; $\lambda$, the exponential coefficient of the Gaussian kernel function; and $\|\cdot\|$, the norm operator;
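One common way to raise the dimension with a Gaussian kernel is a truncated explicit feature map built from the series expansion of the kernel. The sketch below applies it to a scalar input; the map, the parameter name `lam` (standing in for $\lambda$), and the truncation order are assumptions for illustration, since the exact formula of the method is not reproduced in the text.

```python
import math

def gaussian_feature_map(x, lam=1.0, order=8):
    """Truncated explicit feature map of the Gaussian kernel for a scalar x.

    Component k is exp(-lam * x^2) * sqrt((2 * lam)^k / k!) * x^k, so the dot
    product of two mapped points approximates exp(-lam * (x - y)^2).
    """
    scale = math.exp(-lam * x * x)
    return [scale * math.sqrt((2.0 * lam) ** k / math.factorial(k)) * x ** k
            for k in range(order + 1)]

def gaussian_kernel(x, y, lam=1.0):
    """Exact Gaussian kernel value for two scalars."""
    return math.exp(-lam * (x - y) ** 2)

# Two normalized values in [0, 1] (hypothetical).
x, y = 0.3, 0.7
phi_x, phi_y = gaussian_feature_map(x), gaussian_feature_map(y)

approx = sum(a * b for a, b in zip(phi_x, phi_y))
exact = gaussian_kernel(x, y)
print(len(phi_x), approx, exact)
```

Each scalar is expanded into `order + 1` components made of successive powers of the input, and the dot product of two expanded samples closely matches the exact kernel value, which is why the expanded data can expose structure that the low-dimensional data hides.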
Step 4, initialize the basic decision tree and several parameters of the adaptive algorithm, such as the total number of samples N = 20000, the maximum gradient threshold of the loss function ε = 0.001, the number of generated decision trees n = 50, the number of estimators e = 50, and the number of features selected during cross-validation k = 2.
Step 5, construct a weak learner and generate a strong learner by continuously and iteratively updating the weights. The mathematical model of the weak learner is $R_t(x)$, representing the weak classifier produced by the $t$-th iteration. The mathematical model that combines the generated weak classifier with the current existing learner to produce the strong classifier is:
$$S_t(x) = S_{t-1}(x) + \alpha_t(x) \cdot R_t(x)$$
Step 6, use the generated strong classifier for predictive analysis, so as to achieve accurate predictive analysis of the multi-source data.
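The iterative construction of a strong classifier from weighted weak classifiers can be sketched with classic AdaBoost. To keep the example short and dependency-free, one-level decision stumps are swapped in for the random forest base classifier used by the method, and the exponential-loss weight $\alpha_t = \tfrac{1}{2}\ln\frac{1 - e_t}{e_t}$ is the textbook choice rather than a formula taken from the source. The toy data set and all function names are hypothetical.

```python
import math

def train_stump(X, y, w):
    """Pick the (feature, threshold, polarity) stump with the lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for pol in (1, -1):
                pred = [pol if row[f] > thr else -pol for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best

def stump_predict(stump, row):
    """Weak classifier output R_t(x) in {-1, +1}."""
    _, f, thr, pol = stump
    return pol if row[f] > thr else -pol

def adaboost_fit(X, y, rounds):
    """Build the strong classifier S_t = S_{t-1} + alpha_t * R_t round by round."""
    n = len(X)
    w = [1.0 / n] * n                      # uniform initial sample weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)   # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)      # weak classifier weight
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, decrease the others.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, row):
    """Final category is sign(S_t(x))."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Toy one-feature data set; labels in {-1, +1} as in the text.
X = [(v,) for v in range(10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

ensemble = adaboost_fit(X, y, rounds=3)
predictions = [adaboost_predict(ensemble, row) for row in X]
print(predictions == y)
```

No single stump can separate this data set, but the weighted sum of three stumps classifies every training sample correctly, which illustrates how reweighting lets weak classifiers combine into a strong one.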
Table 1 is early warning accuracy data for the first round of implementation.
TABLE 1
For convenience, "RF-L-Dim", "RF-H-Dim" and "Ada-RF" in Table 1 denote, respectively, the model whose data were not processed by the Gaussian kernel function, the model whose data were processed by the Gaussian kernel function, and the model optimized by the adaptive reinforcement learning method (i.e., the early warning model of the present invention). Table 1 shows that appropriately raising the dimensionality of the data makes the random forest classification more accurate, improving the early warning accuracy by about 3%, and that the adaptive algorithm improves the early warning accuracy by about a further 4% on that basis.
FIG. 3 is an accuracy comparison graph for the first round of implementation, in which the line "train score" corresponds to the training score obtained from the training data set, "test score RF-L-Dim" corresponds to the score obtained from the RF-L-Dim model, "test score RF-H-Dim" corresponds to the score obtained from the RF-H-Dim model, and "test score Ada-RF" corresponds to the score obtained from the Ada-RF model. It is apparent that the adaptive algorithm significantly improves the performance of the model.
FIG. 4 is an accuracy comparison graph for the second round of implementation. It shows that the accuracy of the early warning model is significantly improved after the data acquired using 2 noise sensors and 2 PM2.5 sensors are optimized by the adaptive algorithm.
FIG. 5 is an accuracy comparison graph for the third round of implementation. After processing with the Ada-RF model, the early warning effect achieved with 2 noise sensors and 2 PM2.5 sensors differs by less than 1% from that achieved with more sensors. This means that, if a deviation of 1% is acceptable, 2 noise sensors and 2 PM2.5 sensors are sufficient to complete the early warning task for the construction site environment, without the additional cost of arranging more sensors.
According to the construction site environment quality early warning method based on the environment parameters, provided by the invention, the complexity and the incompleteness of the on-site multi-source data in the power grid construction period and the constraint condition of data missing are considered, and the problem of prediction analysis of the multi-source data is solved under the conditions of low data dimension and small data quantity. The convergence performance of the predictive analysis model is not influenced by the initial state of the data, and the cost of sensor arrangement for acquiring multi-source data on site in the power grid construction period is effectively saved.
The foregoing is merely exemplary of the present invention and is not intended to limit the present invention. All equivalents and alternatives falling within the spirit of the invention are intended to be included within the scope of the invention. What is not elaborated on the invention belongs to the prior art which is known to the person skilled in the art.

Claims (10)

1. A construction site environmental quality early warning method based on environmental parameters including noise data acquired by a single or a plurality of noise sensors and dust data acquired by a single or a plurality of dust sensors, characterized in that: comprising the following steps:
acquiring historical environment parameters, extracting characteristic data of the historical environment parameters, and carrying out standardized processing on the extracted characteristic data to obtain initial data;
performing dimension lifting on the initial data to obtain sample data, and acquiring a security tag for early warning judgment corresponding to the sample data;
constructing a self-adaptive algorithm model based on a random forest basic classifier as an early warning model, and training the early warning model by taking the sample data and the corresponding security label thereof as training data;
and outputting a security label corresponding to the current environmental parameter by using the trained early warning model to perform early warning judgment.
2. The construction site environmental quality early warning method based on environmental parameters according to claim 1, wherein: the specific steps of extracting the characteristic data of the historical environment parameters comprise:
1) Dividing the noise data and dust data in the historical environmental parameters according to their corresponding sensors, and time-matching the divided historical environmental parameters to form an original data set $\Omega = \{d_j \mid j = 1, 2, \dots, N\}$, where $d_j$ is the $j$-th array in the original data set and $N$ is the total number of arrays in the original data set; each array in the original data set comprises the environmental parameters of all categories at the same time, $d_j = [d_{1j}, d_{2j}, \dots, d_{ij}, \dots, d_{Ij}]$, where $d_{1j}, d_{2j}, \dots, d_{ij}, \dots, d_{Ij}$ are respectively the class 1, 2, ..., i, ..., I environmental parameters in the $j$-th array, and $I$ is the total number of categories of environmental parameters;
calculating the average value of various environmental parameters in the original data set and the overall average value of the array in the original data set, wherein the calculation formula is as follows:
wherein m is i Is the mean value of the i-th environmental parameters; d, d ij The environment parameters are the ith environmental parameters in the jth array;
wherein m is the overall average value of the array in the original data set;
2) Respectively calculating intra-class scattering matrixes of various environmental parameters, and calculating inter-class scattering matrixes among the various environmental parameters, wherein the calculation formulas are respectively as follows:
in the method, in the process of the invention,an intra-class scatter matrix that is a class i environmental parameter; s is S b An inter-class dispersion matrix between various environmental parameters;
3) Searching the optimal projection vector of noise data and dust data by using Fisher criterion function, wherein the Fisher criterion function is that
wherein $J_i(\omega_i)$ is the output of the Fisher criterion function corresponding to the class-$i$ environmental parameters; $\omega_i$ is the optimal projection vector of the class-$i$ environmental parameters;
4) Carrying out projection mapping on the various environmental parameters in the original data set according to the optimal projection vector of the corresponding environmental parameter category to obtain the feature data corresponding to the various environmental parameters, thereby forming a feature data set $X' = \{x'_j \mid j = 1, 2, \dots, N\}$; $x'_j$ is the $j$-th array in the feature data set and corresponds to the $j$-th array $d_j$ in the original data set; each array in the feature data set comprises the feature data corresponding to the environmental parameters of all categories at the same time, $x'_j = [x'_{1j}, x'_{2j}, \dots, x'_{ij}, \dots, x'_{Ij}]$, where $x'_{1j}, x'_{2j}, \dots, x'_{ij}, \dots, x'_{Ij}$ are respectively the feature data corresponding to the class 1, 2, ..., i, ..., I environmental parameters in the $j$-th array.
3. The construction site environmental quality early warning method based on environmental parameters according to claim 2, wherein: the standardized treatment adopts a standard deviation standardized method, and the specific method comprises the following steps: respectively carrying out standardization processing on each type of environmental parameters in the feature data set to form an initial data set $X = \{x_j \mid j = 1, 2, \dots, N\}$, where $x_j$ is the $j$-th array in the initial data set, $x_j = [x_{1j}, x_{2j}, \dots, x_{ij}, \dots, x_{Ij}]$, and $x_{1j}, x_{2j}, \dots, x_{ij}, \dots, x_{Ij}$ are respectively the initial values corresponding to the class 1, 2, ..., i, ..., I environmental parameters in the $j$-th array, as follows:
$$x_{ij} = \frac{x'_{ij} - x_{i,\min}}{x_{i,\max} - x_{i,\min}}$$
wherein $x_{ij}$ is the initial value corresponding to the class-$i$ environmental parameter in the $j$-th array of the initial data set; $x'_{ij}$ is the feature data corresponding to the class-$i$ environmental parameter in the $j$-th array of the feature data set; $x_{i,\min}$ is the minimum value in the feature data corresponding to the class-$i$ environmental parameter; and $x_{i,\max}$ is the maximum value in the feature data corresponding to the class-$i$ environmental parameter.
4. The construction site environmental quality early warning method based on environmental parameters according to claim 3, wherein:
the specific steps of carrying out dimension lifting on the initial data to obtain sample data comprise:
1) The initial data is up-scaled as follows:
In the formula, the terms are: the dimension-raising function; the sample corresponding to the $j$-th array $x_j$ of the initial data set after dimension raising; the expansion coefficient after dimension raising; $\lambda$, the exponential coefficient of the Gaussian kernel function; and $\|\cdot\|$, the norm operator;
$x_{1j}$, $x_{2j}$ and $x_{1j}^{k}$ are respectively the initial value corresponding to the class 1 environmental parameter in the $j$-th array of the initial data set, the initial value corresponding to the class 2 environmental parameter in the $j$-th array of the initial data set, and the $k$-th power of the initial value corresponding to the class 1 environmental parameter in the $j$-th array of the initial data set; $k$ is the dimension; $\exp$ is the exponential operator with the natural constant as its base;
2) Based on the dimension-raising result, a sample data set is formed, in which the $j$-th sample corresponds to the $j$-th array $x_j$ of the initial data set.
5. The environmental parameter-based construction site environmental quality early warning method according to claim 4, wherein: the specific steps of constructing the self-adaptive algorithm model based on the random forest basic classifier as an early warning model comprise the following steps:
1) Constructing a decision tree model and a classifier, initializing the decision tree model parameters, the classifier weights, the adaptive algorithm update gradient threshold $\varepsilon$, the sample weights and the error rate weights, and setting the iteration count $t = 1$;
2) Iteratively updating the sample weight, the classifier weight and the error rate weight, wherein the iteratively updated formula is as follows:
In the formulas, the terms are as follows:
the outputs of the classifier corresponding to the sample in the $t$-th and $(t-1)$-th iterative updates, respectively;
the weights of the classifier corresponding to the sample in the $t$-th and $(t-1)$-th iterative updates, respectively;
$\theta$ is the gradient step size of the adaptive algorithm;
the outputs of the decision tree model corresponding to the sample;
$\omega_{t,j}$ and $\omega_{t-1,j}$ are the weights corresponding to the sample in the $t$-th and $(t-1)$-th iterative updates, respectively;
$y_j$ is the security label corresponding to the sample;
the gradient value of the loss function, and the definition of the loss function.
3) Judging the magnitude relation between the Loss function Loss and the adaptive algorithm updating gradient threshold epsilon, if the Loss function Loss is smaller than the adaptive algorithm updating gradient threshold, iterating the times t=t+1, and returning to the step 2); otherwise, generating a self-adaptive algorithm model as an early warning model.
6. The construction site environmental quality early warning method based on environmental parameters according to claim 1, wherein: the specific step of outputting the security label corresponding to the current environmental parameter by using the trained early warning model comprises the following steps:
extracting feature data of the current environment parameters, and carrying out standardization processing on the extracted feature data to obtain standardized data; and carrying out dimension lifting on the standardized data to obtain sample data corresponding to the current environmental parameters, inputting the sample data corresponding to the current environmental parameters into a trained early warning model, and finally utilizing the obtained early warning model to process the latest acquired current environmental parameters in real time and output security labels corresponding to the current environmental parameters.
7. The construction site environmental quality early warning method based on environmental parameters according to claim 1, wherein: the method for acquiring the security tag for early warning judgment corresponding to the sample data comprises the following steps: and acquiring the security tag of the sample data based on the size of the historical environment parameter corresponding to the sample data.
8. Construction site environmental quality early warning system based on environmental parameter, environmental parameter includes noise data and raise dust data, its characterized in that: the system comprises a preprocessing module, a data dimension increasing module, a label acquisition module, a model construction and training module and an early warning judging module;
the preprocessing module is used for acquiring historical environment parameters, extracting characteristic data of the historical environment parameters, and carrying out standardized processing on the extracted characteristic data to obtain initial data;
the data dimension increasing module is used for increasing dimension of the initial data to obtain sample data;
the tag acquisition module is used for acquiring a security tag which corresponds to the sample data and is used for early warning judgment;
the model construction and training module is used for constructing an adaptive algorithm model based on a random forest basic classifier as an early warning model, and training the early warning model by taking the sample data and the corresponding security labels thereof as training data;
and the early warning judging module is used for outputting a security label corresponding to the current environmental parameter by using the trained early warning model to perform early warning judgment.
9. An electronic device comprising a memory storing a computer program and a processor for invoking and running the computer program stored in the memory to perform the method of any of claims 1 to 7.
10. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any of the preceding claims 1 to 7.
CN202311780946.XA 2023-12-22 2023-12-22 Construction site environment quality early warning method and system based on environment parameters Pending CN117786539A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311780946.XA CN117786539A (en) 2023-12-22 2023-12-22 Construction site environment quality early warning method and system based on environment parameters

Publications (1)

Publication Number Publication Date
CN117786539A true CN117786539A (en) 2024-03-29

Family

ID=90395693

Country Status (1)

Country Link
CN (1) CN117786539A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination