CN111797937B - Greenhouse environment assessment method based on PNN network - Google Patents
- Publication number: CN111797937B (application CN202010678726.6A)
- Authority: CN (China)
- Legal status: Active (the status listed is an assumption, not a legal conclusion)
Classifications
- G06F18/23213—Pattern recognition: non-hierarchical clustering using statistics or function optimisation, with a fixed number of clusters, e.g. K-means clustering
- G06F18/214—Pattern recognition: generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06N3/006—Computing arrangements based on biological models: artificial life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
- G06N3/045—Neural networks: combinations of networks
Abstract
The invention provides a greenhouse environment assessment method based on a PNN network, in the technical field of facility agriculture. First, a greenhouse environment parameter sample library is established, the samples are graded, and the library is divided into training samples and test samples. Next, the training samples are clustered with an improved K-means clustering algorithm, and, according to a representative-sample selection threshold, a batch of representative samples is selected as the new training samples of the PNN network. The new training samples are normalized and used to train the PNN network; the trained network then grades the normalized test samples, and the error rate of classifying the test samples is computed. Finally, mode-layer neurons of the same class in the PNN network share the same smoothing factor while mode-layer neurons of different classes use different smoothing factors, and the classification error rate serves as the objective function of a particle swarm optimization algorithm that adjusts the smoothing factors of the PNN network, yielding an optimal PNN classification model.
Description
Technical Field
The invention relates to the technical field of facility agriculture, in particular to a greenhouse environment assessment method based on a PNN network.
Background
With the development of agricultural technology and the changing climate, greenhouse planting plays an increasingly important role in agricultural production. Its main purpose is to provide crops with a suitable growth environment despite harsh outdoor conditions. To ensure that the greenhouse environment meets the requirements of crop growth, the environment is monitored online or offline and its quality is evaluated against expert experience, which is of great significance for guiding agricultural production.
At present, the common problem of comprehensive greenhouse environment detectors on the market is that they cannot judge from the measured data whether the greenhouse environment quality is good or bad, and therefore offer little guidance to users; greenhouse environment quality assessment still depends mainly on the experience of human experts.
The probabilistic neural network (Probabilistic Neural Network, PNN) is a structurally simple and widely applied neural network proposed by Dr. D. F. Specht in 1988. Its basic idea is to separate a decision space within the multidimensional input space using the Bayesian minimum-risk criterion. The PNN is a feedforward network, grounded in statistical principles, that uses Parzen window functions as activation functions; it absorbs the advantages of radial basis function networks and of classical probability density estimation, and holds a clear advantage over traditional feedforward networks in pattern classification. The conventional PNN network, however, has two drawbacks: (1) when there are too many training samples, the PNN network places high demands on hardware storage space and computing power, which increases the difficulty and cost of hardware design; (2) the smoothing factor has a crucial influence on the classification accuracy of the network, yet its value is usually set manually, without a principled basis for the choice.
For structural and parameter optimization of PNN networks, the common approach is as follows: first, cluster the training samples with a clustering algorithm such as K-means and use the representative cluster centers as the new training samples of the PNN network, thereby simplifying the network structure; then, assume the smoothing factors of all mode-layer neurons share the same value, take the PNN classification error rate as the objective function, and optimize the smoothing factor with an algorithm such as particle swarm optimization (Particle Swarm Optimization, PSO) to improve the classification accuracy of the network. This approach has two disadvantages: (1) when the number of cluster centers is small, using the cluster centers as the new training samples leaves too few training samples and degrades the classification accuracy of the PNN network; (2) assuming identical parameters for all mode-layer neurons weakens the network's ability to account for differences between samples during classification, and it may ignore details of the test samples.
Disclosure of Invention
Aiming at the defects of the prior art, the technical problem to be solved by the invention is to provide a greenhouse environment assessment method based on a PNN network, so as to assess the greenhouse environment.
In order to solve the technical problems, the invention adopts the following technical scheme: a greenhouse environment assessment method based on PNN network comprises the following steps:
step 1: establish a greenhouse environment parameter sample library of size n, and evaluate the quality of each group of samples in the library on M grades, thereby dividing the n samples into M classes; each group of samples in the library has dimension q, each dimension representing a greenhouse environment parameter with an important influence on plant growth;
step 2: selecting m samples from a sample library as training samples, and using other l=n-m samples as test samples;
step 3: initializing parameters in an improved K-means clustering algorithm and a particle swarm optimization algorithm;
the initialized parameters are as follows. For the improved K-means clustering algorithm: the number of clusters k, the initial cluster centers c_g^(0) (g = 1, 2, …, k), the iteration stop threshold ε, the representative-sample selection threshold α, the maximum iteration number J and the current iteration number j. For the particle swarm optimization (Particle Swarm Optimization, PSO) algorithm: the number of particles N, the solution-space dimension D, the maximum iteration number max_iter, and the particles' initial position vector px and initial velocity vector pv. The position vector of particle i is denoted px_i = [px_i1, px_i2, …, px_iD], i ∈ [1, N], and its velocity vector pv_i = [pv_i1, pv_i2, …, pv_iD]; the individual best position minimizing the objective function in the current iteration is pbest_i = [pbest_i1, pbest_i2, …, pbest_iD], and the population best position is gbest = [gbest_1, gbest_2, …, gbest_D]; the minimum objective-function values experienced by the individual and by the population during the iterative process are p_fitness_i and g_fitness, respectively. All smoothing factor values in the PNN network are initialized to 0.1;
step 4: cluster the m selected training samples with the improved K-means clustering algorithm to obtain k clusters and k cluster centers, the number of samples in each cluster being m_g, g = 1, 2, …, k; then, according to the representative-sample selection threshold α, select a batch of representative samples from each cluster as the new training samples of the PNN network;
step 4.1: randomly select k sample points from the m training samples as the initial cluster centers c_g^(0), g = 1, 2, …, k, of the K-means clustering algorithm;
Step 4.2: setting the current iteration number as j, and for each sample point p in the training sample t T=1, 2, …, m are calculated to each cluster center in turnThe Euclidean distance d (t, g) of (a) is shown in the following formula:
step 4.3: for each sample point p_t, find the cluster center c_g^(j) at the smallest distance d(t, g), and assign p_t to that cluster;
step 4.4: recalculate the center of each cluster as the mean of its members:

c_g^(j+1) = (1 / m_g) Σ_{w=1}^{m_g} p_gw   (2)

where p_gw denotes the w-th sample point in the g-th cluster;
step 4.5: calculate the sum of squared distances between the samples in each cluster and the new cluster centers:

E_{j+1} = Σ_{g=1}^{k} Σ_{w=1}^{m_g} ‖ p_gw − c_g^(j+1) ‖²   (3)

where E_{j+1} denotes the sum of squared distances from the samples in each cluster to the new cluster centers;
step 4.6: if the current iteration number j equals the maximum iteration number J, or |E_{j+1} − E_j| < ε, execute step 4.7; otherwise return to step 4.2;
step 4.7: count the number of samples m_g in each cluster and, according to the representative-sample selection threshold α, select the m_g·α samples nearest to each cluster center, outputting them as the most representative training samples and yielding p = m·α new training samples;
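Steps 4.1 to 4.7 can be sketched as follows (an illustrative Python sketch; the function and variable names are my own, since the patent gives no reference implementation):

```python
import numpy as np

def kmeans_representatives(X, k, alpha, max_iter=20, eps=1e-3, init=None, seed=0):
    """Improved K-means of step 4: cluster, then keep the m_g * alpha samples
    nearest to each cluster center as the new PNN training set (step 4.7)."""
    rng = np.random.default_rng(seed)
    if init is None:
        init = rng.choice(len(X), size=k, replace=False)   # step 4.1: random centers
    centers = X[np.asarray(init)].astype(float)
    prev_sse = np.inf
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iter):                              # steps 4.2-4.6
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)  # d(t, g)
        labels = d.argmin(axis=1)                          # step 4.3: nearest center
        for g in range(k):                                 # step 4.4: mean update
            if np.any(labels == g):
                centers[g] = X[labels == g].mean(axis=0)
        sse = sum(((X[labels == g] - centers[g]) ** 2).sum() for g in range(k))
        if abs(prev_sse - sse) < eps:                      # |E_{j+1} - E_j| < eps
            break
        prev_sse = sse
    reps = []                                              # step 4.7: representatives
    for g in range(k):
        members = np.where(labels == g)[0]
        n_keep = int(round(len(members) * alpha))
        order = np.linalg.norm(X[members] - centers[g], axis=1).argsort()
        reps.extend(members[order[:n_keep]].tolist())
    return sorted(reps)
```

With m = 900 training samples and α = 0.2, this keeps roughly 180 samples, matching p = m·α in the text; keeping the nearest-to-center samples, rather than the centers alone, is what distinguishes the improved algorithm from plain K-means here.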
step 5: carrying out normalization processing on the new training sample;
Let the new training sample matrix X be:

X = [ x_11 x_12 … x_1q
      x_21 x_22 … x_2q
      ⋮
      x_p1 x_p2 … x_pq ]   (4)

where p denotes the number of new training samples and q the dimension of the new training samples;
and carrying out normalization processing on the new training sample matrix X through a normalization factor matrix B to obtain a matrix C, wherein the expressions of the matrix B and the matrix C are shown as follows:
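The patent text does not spell out the entries of the normalization factor matrix B. One common choice in the PNN literature, which this sketch assumes, is B = diag(1/‖x_1‖, …, 1/‖x_p‖), so that C = BX rescales every sample vector to unit Euclidean length:

```python
import numpy as np

def normalize_samples(X):
    """Assumed step-5 normalization: scale each row of X to unit length,
    i.e. C = B @ X with B = diag(1 / ||x_i||). This is an assumption;
    the patent only names B a "normalization factor matrix"."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)  # ||x_i|| for each sample row
    norms[norms == 0.0] = 1.0                         # leave all-zero rows unchanged
    return X / norms
```

Whatever normalization is chosen, the test samples must be transformed with the same rule before they are fed to the trained network.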
step 6: training a PNN network according to the normalized training sample matrix C, performing grade evaluation on the normalized test samples by utilizing the trained PNN network, and simultaneously calculating the error rate of classifying the test samples;
the PNN network structure comprises an input layer, a mode layer, a summation layer and an output layer; the input layer does not process the data and sends the data into the mode layer; the number of the neurons of the mode layer is equal to the number of training samples, and the activation function is a Gaussian function; the connection mode of the summation layer and the mode layer is sparse connection, and the neuron number of the summation layer is equal to the class number of the training samples; the output layer selects the category corresponding to the maximum posterior probability for output according to the Bayesian decision rule;
step 6.1: constructing a mode layer of the PNN network by using the normalized training sample matrix C;
After the new training sample matrix X is normalized, the training sample matrix C is obtained:

C = [ c_11 c_12 … c_1q
      c_21 c_22 … c_2q
      ⋮
      c_p1 c_p2 … c_pq ]   (7)
the training matrix C contains p training samples divided into M classes; with the per-class training sample counts denoted h_1, h_2, …, h_M, we have:

p = h_1 + h_2 + … + h_M   (8)

Assume the M classes of samples are arranged consecutively in the sample matrix C, and number the mode-layer neurons 1 through p in order: neurons 1 through h_1 correspond to class-1 training samples, neurons h_1+1 through h_1+h_2 correspond to class-2 training samples, and so on; neurons p−h_M+1 through p belong to class M;
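The class-to-neuron numbering above amounts to a running offset over the per-class counts h_1, …, h_M. A small illustrative sketch (the counts used here are hypothetical, not from the patent):

```python
import numpy as np

h = [60, 45, 40, 35]             # hypothetical per-class counts h_1..h_4, sum = p
offsets = np.cumsum([0] + h)     # [0, 60, 105, 145, 180]
# 1-based index range of the mode-layer neurons belonging to each class
neuron_ranges = [(int(offsets[b]) + 1, int(offsets[b + 1])) for b in range(len(h))]
```

For these counts the ranges are (1, 60), (61, 105), (106, 145) and (146, 180), i.e. class b owns the neurons from h_1+…+h_{b−1}+1 up to h_1+…+h_b.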
step 6.2: calculating Euclidean distance between each test sample in the test sample matrix and each training sample in the training set;
the test sample matrix T, consisting of the l = n − m normalized test samples, is:

T = [ t_11 t_12 … t_1q
      t_21 t_22 … t_2q
      ⋮
      t_l1 t_l2 … t_lq ]   (9)

The Euclidean distance matrix E_d between each test sample and each training sample is the l×p matrix:

E_d = [ d_ai ],  d_ai = ‖ t_a − c_i ‖ = sqrt( Σ_{r=1}^{q} ( t_ar − c_ir )² )   (10)

where t_a denotes the a-th test sample and c_i the i-th training sample;
step 6.3: activating the mode layer neurons by using radial basis functions;
select the Gaussian function as the activation function of the mode-layer neurons, and compute the activated probability matrix U:

U = [ u_ai ],  u_ai = exp( − d_ai² / ( 2σ_i² ) )   (11)

where σ_1, σ_2, …, σ_p denote the smoothing factors of the p mode-layer neurons; at the start all smoothing factors are set to the same value, σ_1 = σ_2 = … = σ_p = 0.1;
Step 6.4: in the summation layer, compute the initial probability sum of each test sample belonging to each class, giving the matrix S:

S_ab = (1 / h_b) Σ_{i ∈ class b} u_ai   (12)
step 6.5: from the initial probability sums, compute the probability prob_ab that the a-th test sample belongs to the b-th class:

prob_ab = S_ab / Σ_{b′=1}^{M} S_ab′   (13)

where a ∈ [1, l], b ∈ [1, M];
step 6.6: according to the Bayesian decision theorem and the probabilities of each test sample belonging to the various classes, determine the class of the a-th test sample:

y_a = arg max_b ( prob_ab )   (14)

where y_a denotes the prediction of the PNN network for the a-th test sample, i.e. the class assigned to it;
step 6.7: calculate the error rate of the PNN network in classifying the test samples:

ER = n_e / l   (15)

where ER denotes the error rate of the PNN network on the test samples, n_e the number of samples the PNN network misclassifies, and l = n − m the number of test samples;
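Steps 6.2 to 6.7 amount to a single forward pass through the network. The sketch below (helper names are my own, not the patent's) uses the Gaussian activation of step 6.3, class-wise averaging in the summation layer, and the arg-max decision of step 6.6:

```python
import numpy as np

def pnn_classify(train, train_labels, test, sigma):
    """PNN forward pass. train: (p, q) normalized training samples;
    train_labels: (p,) class index per sample; sigma: one smoothing
    factor per class (same value within a class, as in step 7)."""
    sig = np.asarray(sigma, dtype=float)[train_labels]   # sigma_i per mode neuron
    # mode layer: squared Euclidean distance of every test sample to every neuron
    d2 = ((test[:, None, :] - train[None, :, :]) ** 2).sum(axis=2)
    U = np.exp(-d2 / (2.0 * sig[None, :] ** 2))          # Gaussian activation
    classes = np.unique(train_labels)
    # summation layer: average activations per class
    S = np.stack([U[:, train_labels == b].mean(axis=1) for b in classes], axis=1)
    prob = S / S.sum(axis=1, keepdims=True)              # per-class probabilities
    return classes[prob.argmax(axis=1)]                  # output layer: arg-max

def error_rate(pred, truth):
    """ER = n_e / l, the fraction of misclassified test samples."""
    return float(np.mean(np.asarray(pred) != np.asarray(truth)))
```

Note that for well-separated classes and small smoothing factors, the activations of far-away neurons underflow to zero, so in practice the smoothing factors must be kept away from zero, which is exactly what the PSO search of step 7 regulates.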
step 7: let mode-layer neurons of the same class in the PNN network share the same smoothing factor, and let mode-layer neurons of different classes use different smoothing factors; then take the error rate ER of the PNN network on the test samples as the objective function of the PSO algorithm and adjust the smoothing-factor parameters of the PNN network with PSO, thereby optimizing the network and finally obtaining the optimal PNN classification model;
step 7.1: in the β-th iteration of the PSO optimization of the PNN parameters, first update the velocity vector pv_i and position vector px_i of each particle:

pv_i^(β+1) = ω·pv_i^(β) + c_1·μ_1·( pbest_i − px_i^(β) ) + c_2·μ_2·( gbest − px_i^(β) )   (16)

px_i^(β+1) = px_i^(β) + pv_i^(β+1)   (17)

where ω is the inertia weight, characterizing the search capability of the particle swarm optimization algorithm; c_1 and c_2 are the learning factors of the individual and global extremum points, respectively; μ_1 and μ_2 are random numbers between 0 and 1. Since the objects of the PSO optimization are the smoothing factors adopted by each class of mode-layer neurons, the particle solution-space dimension is D = M;
step 7.2: each updated particle position vector px_i represents a feasible solution for the PNN network smoothing factors. Substitute px_i for the smoothing-factor values of the PNN network, then compute the error rate ER_i with which this network classifies the test samples, and update the particle's individual best position pbest_i and individual objective minimum p_fitness_i, as well as the population best position gbest and population objective minimum g_fitness, according to the following rule:

if ER_i < p_fitness_i, then pbest_i = px_i and p_fitness_i = ER_i; if p_fitness_i < g_fitness, then gbest = pbest_i and g_fitness = p_fitness_i; otherwise gbest and g_fitness remain unchanged;
step 7.3: when the iteration number reaches the maximum iteration number max_iter, the PSO algorithm is terminated, wherein gbest represents an optimal solution of the PNN network on the smoothing factor, and g_fitness represents an error rate of classifying the test sample by using the PNN network with gbest as the optimal smoothing factor; otherwise, re-executing the step 7.1;
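Step 7 can be sketched as a standard PSO loop. The values of ω, c_1 and c_2 below are assumptions, since the patent does not fix them; in the patent the objective is the PNN test error rate ER over the M per-class smoothing factors, while the usage example substitutes a simple quadratic so that the sketch is self-contained:

```python
import numpy as np

def pso_minimize(objective, dim, n_particles=30, max_iter=100,
                 omega=0.6, c1=1.5, c2=1.5, bounds=(0.01, 2.0), seed=0):
    """Standard PSO; positions play the role of candidate smoothing-factor
    vectors and `objective` the role of the classification error rate ER."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    px = rng.uniform(lo, hi, (n_particles, dim))          # initial positions
    pv = np.zeros((n_particles, dim))                     # initial velocities
    pbest = px.copy()                                     # individual bests
    p_fit = np.array([objective(x) for x in px])
    g = p_fit.argmin()
    gbest, g_fit = pbest[g].copy(), p_fit[g]              # population best
    for _ in range(max_iter):
        mu1, mu2 = rng.random((2, n_particles, dim))      # random factors in [0, 1)
        pv = omega * pv + c1 * mu1 * (pbest - px) + c2 * mu2 * (gbest - px)
        px = np.clip(px + pv, lo, hi)                     # keep sigmas positive
        fit = np.array([objective(x) for x in px])
        improved = fit < p_fit                            # individual-best update
        pbest[improved], p_fit[improved] = px[improved], fit[improved]
        if p_fit.min() < g_fit:                           # population-best update
            g = p_fit.argmin()
            gbest, g_fit = pbest[g].copy(), p_fit[g]
    return gbest, g_fit
```

With dim = M = 4 this searches one smoothing factor per class, which is the per-class scheme the patent uses in place of a single shared smoothing factor.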
step 8: collect the greenhouse environment data to be evaluated, and evaluate the greenhouse environment quality with the optimal PNN classification model obtained in step 7.
The beneficial effects of the above technical scheme are as follows. The greenhouse environment assessment method based on the PNN network applies a PNN classification model to greenhouse environment quality assessment, remedying the current lack of assessment capability in comprehensive greenhouse environment detectors. The improved K-means algorithm selects representative samples from the training set, making the chosen training samples more representative and better suited to the needs of the PNN network; it reduces the complexity of the PNN network structure, lowering the difficulty of hardware implementation and the storage cost, while avoiding the sharp drop in classification accuracy that too few training samples would cause. The improved PSO algorithm then performs parameter optimization on the smoothing factors, with mode-layer neurons of the same class in the PNN sharing one smoothing factor and neurons of different classes using different smoothing factors, further improving the classification accuracy of the PNN network.
Drawings
FIG. 1 is a flowchart of a greenhouse environment assessment method based on a PNN network, provided by an embodiment of the present invention;
FIG. 2 is a diagram of environmental requirements of greenhouse cucumber planting provided by the embodiment of the invention;
fig. 3 is a schematic structural diagram of a PNN network according to an embodiment of the present invention;
fig. 4 is a diagram of classification results of a PNN network on a test sample according to an embodiment of the present invention.
Detailed Description
The following describes in further detail the embodiments of the present invention with reference to the drawings and examples. The following examples are illustrative of the invention and are not intended to limit the scope of the invention.
In this embodiment, taking a greenhouse environment for planting cucumber as an example, the greenhouse environment is evaluated by adopting the greenhouse environment evaluation method based on the PNN network.
A greenhouse environment assessment method based on PNN network, as shown in figure 1, comprises the following steps:
step 1: establishing a greenhouse environment parameter sample library with the size of n, and carrying out quality evaluation on each group of samples in the sample library according to M grades, so as to divide the n samples into M classes; the dimension of each group of samples in the sample library is q, and the parameters respectively represent greenhouse environment parameters with important influence on plant growth;
in this example, each group of samples in the sample library has dimension q = 7, representing the 7 greenhouse environment parameters with an important influence on plant growth: air temperature, air humidity, carbon dioxide concentration, illumination intensity, soil temperature, soil humidity and soil salinity. Each group of sample data is evaluated and assigned one of four grades (excellent, good, medium, poor), represented by 1, 2, 3 and 4 respectively.
In this embodiment, the requirements of greenhouse cucumber planting on the 7 environmental parameters are shown in fig. 2, from which the conditions for optimum growth of greenhouse cucumbers are: carbon dioxide concentration 1000 ppm to 1500 ppm, illumination intensity 55 KLx to 60 KLx, air humidity 70.0% to 80.0%, soil humidity 80.0% to 90.0%, air temperature 25°C to 30°C, soil temperature 20°C to 24°C, and soil salinity 0.5 mS/cm to 0.8 mS/cm. In this embodiment, 1000 groups of greenhouse environment samples were measured with sensors and each group was comprehensively evaluated; part of the sample data is shown in table 1;
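As a small illustration (my own helper, not part of the patent), the optimal bands read off Fig. 2 can be encoded as a range check. Note that the grades in Table 1 come from expert evaluation of the whole sample, so this snippet only flags whether a reading falls inside every optimal band:

```python
# Optimal ranges for greenhouse cucumber from Fig. 2 (units as in the text).
OPTIMAL = {
    "co2_ppm": (1000.0, 1500.0),   # the text's "ppt" presumably means ppm
    "light_klx": (55.0, 60.0),
    "air_humidity_pct": (70.0, 80.0),
    "soil_humidity_pct": (80.0, 90.0),
    "air_temp_c": (25.0, 30.0),
    "soil_temp_c": (20.0, 24.0),
    "soil_salinity_ms_cm": (0.5, 0.8),
}

def in_optimal_bands(sample):
    """True if every one of the 7 parameters lies in its optimal band."""
    return all(lo <= sample[key] <= hi for key, (lo, hi) in OPTIMAL.items())
```

A reading that satisfies every band is a plausible grade-1 candidate, while the trained PNN model handles the graded judgments in between.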
TABLE 1 sample data for a portion of a sample library
Step 2: selecting m samples from a sample library as training samples, and using other l=n-m samples as test samples;
in the embodiment, 900 samples are selected from a sample library to serve as training samples according to the ratio of 9:1, and the rest 100 samples serve as test samples;
step 3: initializing parameters in an improved K-means clustering algorithm and a particle swarm optimization algorithm;
the initialized parameters are as follows. For the improved K-means clustering algorithm: the number of clusters k, the initial cluster centers c_g^(0) (g = 1, 2, …, k), the iteration stop threshold ε, the representative-sample selection threshold α, the maximum iteration number J and the current iteration number j. For the particle swarm optimization (Particle Swarm Optimization, PSO) algorithm: the number of particles N, the solution-space dimension D, the maximum iteration number max_iter, and the particles' initial position vector px and initial velocity vector pv. The position vector of particle i is denoted px_i = [px_i1, px_i2, …, px_iD], i ∈ [1, N], and its velocity vector pv_i = [pv_i1, pv_i2, …, pv_iD]; the individual best position minimizing the objective function in the current iteration is pbest_i = [pbest_i1, pbest_i2, …, pbest_iD], and the population best position is gbest = [gbest_1, gbest_2, …, gbest_D]; the minimum objective-function values experienced by the individual and by the population during the iterative process are p_fitness_i and g_fitness, respectively. All smoothing factor values in the PNN network are initialized to 0.1;
in this embodiment, as can be seen from steps 1 and 2, the sample library size is n = 1000, each sample has dimension q = 7, and the number of sample classes is M = 4, i.e. 4 different types of training samples; the original number of training samples is m = 900 and the number of test samples is l = 100. Meanwhile, for the improved K-means algorithm, initialize the number of cluster centers k = 8, the iteration stop threshold ε = 0.001, the representative-sample selection threshold α = 0.2, the current iteration number j = 1 and the maximum iteration number J = 20. For the improved PSO algorithm, initialize the number of particles N = 30, the solution-space dimension D = M = 4 and the maximum iteration number max_iter = 100, together with the i-th particle's initial position vector px_i = [px_i1, px_i2, …, px_iD], initial velocity vector pv_i = [pv_i1, pv_i2, …, pv_iD], individual best position pbest_i = [pbest_i1, pbest_i2, …, pbest_iD] and population best position gbest = [gbest_1, gbest_2, …, gbest_D]; the minimum objective-function values experienced by the individual and by the population during the iterative process are p_fitness_i and g_fitness, respectively. The smoothing factors of the initial PNN network all take the same value, σ = 0.1;
step 4: cluster the m selected training samples with the improved K-means clustering algorithm to obtain k clusters and k cluster centers, the number of samples in each cluster being m_g, g = 1, 2, …, k; then, according to the representative-sample selection threshold α, select a batch of representative samples from each cluster as the new training samples of the PNN network;
step 4.1: randomly select k sample points from the m training samples as the initial cluster centers c_g^(0), g = 1, 2, …, k, of the K-means clustering algorithm;
Step 4.2: setting the current iteration number as j, and for each sample point p in the training sample t T=1, 2, …, m are calculated to each cluster center in turnThe Euclidean distance d (t, g) of (a) is as followsThe formula is shown as follows:
step 4.3: for each sample point p_t, find the cluster center c_g^(j) at the smallest distance d(t, g), and assign p_t to that cluster;
step 4.4: recalculate the center of each cluster as the mean of its members:

c_g^(j+1) = (1 / m_g) Σ_{w=1}^{m_g} p_gw   (2)

where p_gw denotes the w-th sample point in the g-th cluster;
step 4.5: calculate the sum of squared distances between the samples in each cluster and the new cluster centers:

E_{j+1} = Σ_{g=1}^{k} Σ_{w=1}^{m_g} ‖ p_gw − c_g^(j+1) ‖²   (3)

where E_{j+1} denotes the sum of squared distances from the samples in each cluster to the new cluster centers; the goal of the K-means algorithm can be seen as minimizing this within-cluster sum of squares;
step 4.6: if the current iteration number j equals the maximum iteration number J = 20, or |E_{j+1} − E_j| < ε = 0.001, execute step 4.7; otherwise return to step 4.2;
step 4.7: count the number of samples m_g in each cluster and, according to the representative-sample selection threshold α, select the m_g·α samples nearest to each cluster center, outputting them as the most representative training samples and yielding p = m·α = 180 new training samples;
step 5: carrying out normalization processing on the new training sample;
Let the new training sample matrix X be:

X = [ x_11 x_12 … x_1q
      x_21 x_22 … x_2q
      ⋮
      x_p1 x_p2 … x_pq ]   (4)

where p = 180 denotes the number of new training samples and q = 7 the dimension of the new training samples;
and carrying out normalization processing on the new training sample matrix X through a normalization factor matrix B to obtain a matrix C, wherein the expressions of the matrix B and the matrix C are shown as follows:
step 6: training a PNN network according to the normalized training sample matrix C, performing grade evaluation on the normalized test samples by utilizing the trained PNN network, and simultaneously calculating the error rate of classifying the test samples;
the PNN network structure comprises an input layer, a mode layer, a summation layer and an output layer. The input layer performs no processing on the data and passes it to the mode layer. The number of mode-layer neurons equals the number of training samples, and the activation function is a Gaussian function; its main role is to compute the match between the input feature vector and each of the training samples. The summation layer is sparsely connected to the mode layer, with one neuron per training-sample class; following the Parzen window calculation, it sums the mode-layer outputs by class and averages them, giving the posterior probability of the input vector belonging to each class. The output layer selects, according to the Bayesian decision rule, the class with the maximum posterior probability as the output;
step 6.1: constructing a mode layer of the PNN network by using the normalized training sample matrix C;
after the new training sample matrix X is normalized, a training sample matrix C is obtained, and the following formula is shown:
there are p = 180 training samples in the training matrix C, and the training matrix C is divided into M = 4 classes; the numbers of training samples in the 4 classes are denoted h_1, h_2, h_3, h_4, so that:
p = h_1 + h_2 + h_3 + h_4 (8)
in a PNN network, the number of mode layer neurons is determined by the training samples, so when there are p training samples in the training set, there are also p neurons in the mode layer of the PNN; assuming that the 4 classes of samples are arranged sequentially in the sample matrix C, each neuron of the mode layer is numbered sequentially from 1 to p; neurons numbered 1 through h_1 correspond to class 1 training samples, neurons numbered h_1 + 1 through h_1 + h_2 correspond to class 2 training samples, and so on, and neurons numbered p − h_4 + 1 through p belong to the class 4 samples;
in this embodiment, the PNN network is structured as shown in fig. 3.
Step 6.2: calculating Euclidean distance between each test sample in the test sample matrix and each training sample in the training set;
the test sample matrix T consisting of l=n-m=100 test samples and normalized is shown in the following formula:
the Euclidean distance matrix E_d between each test sample and each training sample is shown in the following formula:
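The distance computation of step 6.2 can be written with array broadcasting; the matrices T and C below are random stand-ins for the normalised test and training sample matrices:

```python
import numpy as np

rng = np.random.default_rng(2)
l, p, q = 100, 180, 7                    # test samples, training samples, feature dimension
T = rng.random((l, q))                   # normalised test sample matrix (illustrative)
C = rng.random((p, q))                   # normalised training sample matrix (illustrative)

# E_d[a, t] = Euclidean distance between test sample a and training sample t
E_d = np.linalg.norm(T[:, None, :] - C[None, :, :], axis=2)
```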
step 6.3: activating the mode layer neurons by using radial basis functions;
selecting a Gaussian function as an activation function of the mode layer neuron, and calculating an activated probability matrix U, wherein the probability matrix U is shown in the following formula:
wherein σ_1, σ_2, …, σ_p respectively represent the smoothing factors of the p mode layer neurons, whose values have a critical influence on the classification accuracy of the PNN network; initially all smoothing factors are set to the same value, namely σ_1 = σ_2 = … = σ_p = 0.1;
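Step 6.3 in code: each entry of U applies the Gaussian radial basis function to the corresponding entry of E_d, with every smoothing factor initialised to 0.1 as in the text; E_d here is a random stand-in for the distance matrix of step 6.2:

```python
import numpy as np

rng = np.random.default_rng(3)
l, p = 100, 180
E_d = rng.random((l, p))                 # distance matrix from step 6.2 (illustrative)
sigma = np.full(p, 0.1)                  # all smoothing factors start at 0.1

# U[a, t] = exp(-d(a, t)^2 / (2 * sigma_t^2)) -- Gaussian activation of neuron t
U = np.exp(-E_d**2 / (2.0 * sigma**2))
```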
Step 6.4: solving the initial probability and matrix S of each class of sample to be tested by a summation layer, wherein the initial probability and matrix S are represented by the following formula:
step 6.5: calculating the probability prob_ab that the a-th sample belongs to the b-th class from the initial probability sums of the samples to be tested over the classes, as shown in the following formula:
wherein a ∈ [1, l], b ∈ [1, M];
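Steps 6.4 and 6.5 can be sketched as follows, assuming illustrative per-class counts h and a random activation matrix U; the summation layer averages the mode layer outputs class by class (the Parzen window estimate), and the averages are then normalised so that each row of prob sums to one:

```python
import numpy as np

rng = np.random.default_rng(4)
l, M = 100, 4
h = [40, 50, 50, 40]                     # per-class training counts (illustrative), sum = p
U = rng.random((l, sum(h)))              # activated probability matrix from step 6.3

# summation layer: average the mode layer outputs belonging to each class
edges = np.cumsum([0] + h)
S = np.stack([U[:, edges[b]:edges[b + 1]].mean(axis=1) for b in range(M)], axis=1)

# step 6.5: normalise so prob[a, b] is the probability that sample a belongs to class b
prob = S / S.sum(axis=1, keepdims=True)
```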
step 6.6: according to the Bayesian decision theorem and the probability that each sample to be tested belongs to various types, the class corresponding to the a-th sample to be tested is determined, and the following formula is shown:
y_a = arg max_b (prob_ab) (14)
wherein y_a represents the prediction result of the PNN network for the a-th test sample, i.e. the category corresponding to the a-th test sample;
step 6.7: calculating the error rate of the PNN network for classifying the test samples, wherein the error rate is shown in the following formula:
wherein ER represents the error rate of the PNN network in classifying the test samples, n_e represents the number of samples misclassified by the PNN network, and l = n − m represents the number of test samples;
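Steps 6.6 and 6.7 in code: the Bayes decision picks the class with maximum posterior probability, and the error rate compares the predictions against ground-truth labels (randomly generated here for illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
l, M = 100, 4
prob = rng.random((l, M))                # class probabilities from step 6.5 (illustrative)
true_labels = rng.integers(0, M, size=l) # ground-truth classes (illustrative)

# step 6.6: Bayes decision -- select the class of maximum posterior probability
y = prob.argmax(axis=1)

# step 6.7: error rate ER = n_e / l
n_e = int(np.sum(y != true_labels))
ER = n_e / l
```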
step 7: the method comprises the steps of enabling mode layer neurons of the same class in a PNN network to adopt the same smoothing factors and mode layer neurons of different classes to adopt different smoothing factors, then using error rate ER of the PNN network for classifying test samples as an objective function of a PSO algorithm, modifying smoothing factor parameters in the PNN network through the PSO algorithm, optimizing the PNN network, continuously reducing the value of ER under limited iteration times, achieving the purpose of optimizing PNN network parameters, and finally obtaining an optimal PNN classification model;
step 7.1: in the β-th iteration of the PSO algorithm optimizing the PNN network parameters, first update the velocity vector pv_i and the position vector px_i of each particle, as shown in the following formulas:
pv_i^{β+1} = ω·pv_i^β + c_1·μ_1·(pbest_i^β − px_i^β) + c_2·μ_2·(gbest^β − px_i^β) (16)
px_i^{β+1} = px_i^β + pv_i^{β+1} (17)
wherein ω is the inertia weight, representing the search capability of the particle swarm optimization algorithm; c_1, c_2 are the learning factors of the individual extremum point and the global extremum point respectively; μ_1, μ_2 respectively represent random numbers between 0 and 1; since the objects optimized by the PSO algorithm are the smoothing factors adopted by each class of mode layer neurons, the particle solution space dimension D = M = 4;
in this embodiment, the inertia weight ω = 0.6, and the learning factors c_1 = c_2 = 2;
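A single application of the velocity and position updates of step 7.1, formulas (16) and (17), using this embodiment's values ω = 0.6 and c_1 = c_2 = 2; the particle states are random stand-ins:

```python
import numpy as np

rng = np.random.default_rng(6)
N, D = 20, 4                             # particles, dimension (one sigma per class)
omega, c1, c2 = 0.6, 2.0, 2.0            # values used in this embodiment
px = rng.random((N, D))                  # positions (candidate smoothing factors)
pv = np.zeros((N, D))                    # velocities
pbest = px.copy()                        # individual best positions (illustrative)
gbest = px[0].copy()                     # population best position (illustrative)

# formulas (16)-(17): velocity update, then position update
mu1, mu2 = rng.random((N, D)), rng.random((N, D))
pv = omega * pv + c1 * mu1 * (pbest - px) + c2 * mu2 * (gbest - px)
px = px + pv
```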
Step 7.2: the updated particle position vector represents a feasible solution for the PNN network smoothing factors; substitute the particle position vector px_i^{β+1} for the values of the PNN network smoothing factors, then, following the procedure of step 6 for computing the PNN classification error rate on the test samples, calculate the error rate f(px_i^{β+1}) of the PNN network classifying the test samples with px_i^{β+1}; and according to the following update rule, update the individual optimal position information pbest_i of each particle and the minimum value p_fitness_i of the objective function, simultaneously updating the optimal position information gbest experienced by the population and the minimum value g_fitness of the objective function;
the update rule is as follows:
if f(px_i^{β+1}) < p_fitness_i, then pbest_i = px_i^{β+1} and p_fitness_i = f(px_i^{β+1}); if p_fitness_i < g_fitness, then gbest = pbest_i and g_fitness = p_fitness_i; otherwise gbest and g_fitness remain unchanged;
step 7.3: when the iteration number reaches max_iter = 100, the PSO algorithm terminates; gbest then represents the optimal solution for the PNN network smoothing factors, and g_fitness represents the error rate of the PNN network classifying the test samples with gbest as the optimal smoothing factors; otherwise, step 7.1 is re-executed;
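The complete PSO loop of step 7 can be sketched as follows. The true objective, the PNN classification error rate of step 6, is replaced here by a toy quadratic stand-in named error_rate so the sketch stays self-contained; everything else (N particles, D = M = 4 per-class smoothing factors, ω = 0.6, c_1 = c_2 = 2, max_iter = 100, the pbest/gbest update rule) follows the text:

```python
import numpy as np

rng = np.random.default_rng(7)
N, D, max_iter = 20, 4, 100
omega, c1, c2 = 0.6, 2.0, 2.0

def error_rate(sigmas):
    # stand-in objective: the real one runs the step-6 PNN classification with
    # these per-class smoothing factors and returns its test error rate
    return float(np.sum((sigmas - 0.1) ** 2))

px = rng.uniform(0.01, 1.0, (N, D))      # initial positions (candidate sigmas)
pv = np.zeros((N, D))
p_fitness = np.array([error_rate(x) for x in px])
pbest = px.copy()
g_i = p_fitness.argmin()
gbest, g_fitness = pbest[g_i].copy(), p_fitness[g_i]

for _ in range(max_iter):
    mu1, mu2 = rng.random((N, D)), rng.random((N, D))
    pv = omega * pv + c1 * mu1 * (pbest - px) + c2 * mu2 * (gbest - px)
    px = px + pv
    fit = np.array([error_rate(x) for x in px])
    better = fit < p_fitness             # update individual bests
    pbest[better], p_fitness[better] = px[better], fit[better]
    if p_fitness.min() < g_fitness:      # update population best
        g_i = p_fitness.argmin()
        gbest, g_fitness = pbest[g_i].copy(), p_fitness[g_i]
```

After the loop, gbest holds the best smoothing factors found and g_fitness the corresponding objective value.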
step 8: and (3) collecting greenhouse environment data to be evaluated, and evaluating the greenhouse environment quality by adopting the optimal PNN classification model obtained in the step (7).
In this embodiment, the test set samples are evaluated with the optimal PNN classification model, and the evaluation results are shown in fig. 4: the overall evaluation accuracy on the test samples reaches 85%. The per-class accuracies are listed in table 2; the accuracy of the proposed evaluation method is as high as 95.2% for class 2 test samples but as low as 62.5% for class 4, which is caused by the different distributions of the classes within the training samples.
Table 2 comparison table of classification results of the evaluation method of the present invention
In this embodiment, the optimal PNN classification model of the present invention is compared with a conventional PNN evaluation model in terms of evaluation accuracy, network structure, training time, test time and storage space, as shown in table 3. The optimal PNN classification model requires more time for training, but outperforms the conventional PNN network in classification accuracy, network structure, test time and storage space. The proposed assessment method therefore offers a higher assessment speed together with lower hardware design difficulty and storage requirements, and provides an effective method for greenhouse environment quality assessment.
Table 3 Comparison with the conventional PNN evaluation model
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced with equivalents; such modifications and substitutions do not depart from the spirit of the corresponding technical solutions, which are defined by the scope of the appended claims.
Claims (6)
1. A greenhouse environment assessment method based on a PNN network, characterized in that the method comprises the following steps:
step 1: establishing a greenhouse environment parameter sample library with the size of n, and carrying out quality evaluation on each group of samples in the sample library according to M grades, so as to divide the n samples into M classes; the dimension of each group of samples in the sample library is q, and the parameters respectively represent greenhouse environment parameters with important influence on plant growth;
step 2: selecting m samples from a sample library as training samples, and using other l=n-m samples as test samples;
step 3: initializing parameters in an improved K-means clustering algorithm and a particle swarm optimization algorithm;
step 4: using an improved K-means clustering algorithm to cluster the m selected training samples, obtaining k clusters and k cluster centers, wherein the number of samples in each cluster is m_g, g = 1, 2, …, k; according to the representative sample selection threshold α, selecting a batch of representative samples from each cluster as the new training samples of the PNN network;
step 4.1: randomly selecting K sample points from the m training samples as the initial cluster centers of the K-means clustering algorithm;
Step 4.2: setting the current iteration number as j, and for each sample point p in the training sample t T=1, 2, …, m are calculated to each cluster center in turnThe Euclidean distance d (t, g) of (a) is shown in the following formula:
step 4.3: finding each sample point about the respective cluster centerAnd will correspond to the sample point p t Dividing into and clustering center->The cluster with the smallest distance;
step 4.4: recalculating the cluster center of each cluster as the mean value of its samples, as shown in the following formula:
wherein p_gw represents the w-th sample point in the g-th cluster;
step 4.5: the sum of squares of the distances between the samples in each cluster and the new cluster center is calculated, and the formula is shown as follows:
wherein E_{j+1} represents the sum of squares of the distances between the samples in each cluster and the new cluster centers;
step 4.6: judging whether the current iteration number j is equal to the maximum iteration number J or whether |E_{j+1} − E_j| < ε; if yes, executing step 4.7, otherwise re-executing step 4.2;
step 4.7: counting the number m_g of samples in each cluster, and selecting the m_g·α samples of each cluster nearest to its cluster center according to the sample selection threshold α; these are output as the most representative training samples, resulting in p = m·α new training samples;
step 5: carrying out normalization processing on the new training sample;
step 6: training a PNN network according to the normalized training sample matrix, performing grade evaluation on the normalized test samples by utilizing the trained PNN network, and simultaneously calculating the error rate of classifying the test samples;
step 7: the method comprises the steps of enabling mode layer neurons of the same class in a PNN network to adopt the same smoothing factors and mode layer neurons of different classes to adopt different smoothing factors, then using the error rate of the PNN network for classifying test samples as an objective function of a particle swarm optimization algorithm, modifying smoothing factor parameters in the PNN network through the particle swarm optimization algorithm, realizing optimization of the PNN network, and finally obtaining an optimal PNN classification model;
step 8: and (3) collecting greenhouse environment data to be evaluated, and evaluating the greenhouse environment quality by adopting the optimal PNN classification model obtained in the step (7).
2. The PNN network-based greenhouse environment assessment method according to claim 1, characterized in that the parameters initialized in step 3 are specifically: initializing the number K of clusters in the improved K-means clustering algorithm, the initial cluster centers, an iteration stop threshold ε, a representative sample selection threshold α, a maximum iteration number J and a current iteration number j; initializing the number N of particles, the solution space dimension D, the maximum iteration number max_iter, the initial position vector px and the initial velocity vector pv of the particles in the particle swarm optimization algorithm; the position vector of a particle is denoted px_i = [px_i1, px_i2, …, px_iD], i ∈ [1, N], and its velocity vector is denoted pv_i = [pv_i1, pv_i2, …, pv_iD]; the individual optimal position in the current iteration at which the objective function is minimized is pbest_i = [pbest_i1, pbest_i2, …, pbest_iD], and the optimal position of the population is gbest = [gbest_1, gbest_2, …, gbest_D]; the minimum values of the objective function experienced by the individuals and the population in the iterative process are p_fitness_i and g_fitness respectively; all smoothing factor values in the PNN network are initialized to 0.1.
3. A PNN network-based greenhouse environment assessment method according to claim 2, wherein: the specific method for carrying out normalization processing on the new training samples in the step 5 is as follows:
setting a new training sample matrix X as shown in the following formula
Wherein p represents the number of new training samples, and q represents the dimension of the new training samples;
and carrying out normalization processing on the new training sample matrix X through a normalization factor matrix B to obtain a matrix C, wherein the expressions of the matrix B and the matrix C are shown as follows:
4. a PNN network-based greenhouse environment assessment method according to claim 3, wherein: the PNN network structure comprises an input layer, a mode layer, a summation layer and an output layer; the input layer does not process the data and sends the data into the mode layer; the number of the neurons of the mode layer is equal to the number of training samples, and the activation function is a Gaussian function; the connection mode of the summation layer and the mode layer is sparse connection, and the neuron number of the summation layer is equal to the class number of the training samples; and the output layer selects the category corresponding to the maximum posterior probability for output according to the Bayesian decision rule.
5. A PNN network-based greenhouse environment assessment method according to claim 4, wherein: the specific method of the step 6 is as follows:
step 6.1: constructing a mode layer of the PNN network by using the normalized training sample matrix C;
after the new training sample matrix X is normalized, a training sample matrix C is obtained, and the following formula is shown:
the training matrix C has p training samples and is divided into M classes; the numbers of training samples of the M classes are denoted h_1, h_2, …, h_M, so that:
p = h_1 + h_2 + … + h_M (8)
assuming that the M classes of samples are arranged sequentially in the sample matrix C, each neuron of the mode layer is numbered sequentially from 1 to p; neurons numbered 1 through h_1 correspond to class 1 training samples, neurons numbered h_1 + 1 through h_1 + h_2 correspond to class 2 training samples, and so on, and neurons numbered p − h_M + 1 through p belong to the class M samples;
step 6.2: calculating Euclidean distance between each test sample in the test sample matrix and each training sample in the training set;
the test sample matrix T consisting of l=n-m test samples and normalized is shown in the following formula:
the Euclidean distance matrix E_d between each test sample and each training sample is shown in the following formula:
step 6.3: activating the mode layer neurons by using radial basis functions;
selecting a Gaussian function as an activation function of the mode layer neuron, and calculating an activated probability matrix U, wherein the probability matrix U is shown in the following formula:
wherein σ_1, σ_2, …, σ_p respectively represent the smoothing factors of the p mode layer neurons, and at the beginning all smoothing factors are set to the same value, namely σ_1 = σ_2 = … = σ_p = 0.1;
Step 6.4: solving the initial probability and matrix S of each class of sample to be tested by a summation layer, wherein the initial probability and matrix S are represented by the following formula:
step 6.5: calculating the probability prob_ab that the a-th sample belongs to the b-th class from the initial probability sums of the samples to be tested over the classes, as shown in the following formula:
wherein a ∈ [1, l], b ∈ [1, M];
step 6.6: according to the Bayesian decision theorem and the probability that each sample to be tested belongs to various types, the class corresponding to the a-th sample to be tested is determined, and the following formula is shown:
y_a = arg max_b (prob_ab) (14)
wherein y_a represents the prediction result of the PNN network for the a-th test sample, i.e. the category corresponding to the a-th test sample;
step 6.7: calculating the error rate of the PNN network for classifying the test samples, wherein the error rate is shown in the following formula:
wherein ER represents the error rate of the PNN network in classifying the test samples, n_e represents the number of samples misclassified by the PNN network, and l = n − m is the number of test samples.
6. A PNN network-based greenhouse environment assessment method according to claim 5, wherein:
step 7.1: in the β-th iteration of the particle swarm optimization algorithm optimizing the PNN network parameters, first update the velocity vector pv_i and the position vector px_i of each particle in the particle swarm optimization algorithm, as shown in the following formulas:
pv_i^{β+1} = ω·pv_i^β + c_1·μ_1·(pbest_i^β − px_i^β) + c_2·μ_2·(gbest^β − px_i^β) (16)
px_i^{β+1} = px_i^β + pv_i^{β+1} (17)
wherein ω is the inertia weight, representing the search capability of the particle swarm optimization algorithm; c_1, c_2 are the learning factors of the individual extremum point and the global extremum point respectively; μ_1, μ_2 respectively represent random numbers between 0 and 1; since the objects optimized by the PSO algorithm are the smoothing factors adopted by each class of mode layer neurons, the particle solution space dimension D = M;
step 7.2: the updated particle position vector represents a feasible solution for the PNN network smoothing factors; substitute the particle position vector px_i^{β+1} for the values of the PNN network smoothing factors, then calculate the error rate f(px_i^{β+1}) of the PNN network classifying the corresponding test samples; and according to the following update rule, update the individual optimal position information pbest_i of each particle and the minimum value p_fitness_i of the objective function, simultaneously updating the optimal position information gbest experienced by the population and the minimum value g_fitness of the objective function;
the update rule is as follows:
if f(px_i^{β+1}) < p_fitness_i, then pbest_i = px_i^{β+1} and p_fitness_i = f(px_i^{β+1}); if p_fitness_i < g_fitness, then gbest = pbest_i and g_fitness = p_fitness_i; otherwise gbest and g_fitness remain unchanged;
step 7.3: when the iteration number reaches the maximum iteration number max_iter, the particle swarm optimization algorithm is terminated, wherein gbest represents an optimal solution of the PNN network on the smoothing factor, and g_fitness represents an error rate of classifying the test sample by using the PNN network with gbest as the optimal smoothing factor; otherwise, step 7.1 is re-executed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010678726.6A CN111797937B (en) | 2020-07-15 | 2020-07-15 | Greenhouse environment assessment method based on PNN network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111797937A CN111797937A (en) | 2020-10-20 |
CN111797937B true CN111797937B (en) | 2023-06-13 |
Family
ID=72807092
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109120961A (en) * | 2018-07-20 | 2019-01-01 | 南京邮电大学 | The prediction technique of the QoE of IPTV unbalanced dataset based on PNN-PSO algorithm |
CN110909802A (en) * | 2019-11-26 | 2020-03-24 | 西安邮电大学 | Improved PSO (particle swarm optimization) based fault classification method for optimizing PNN (portable network) smoothing factor |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7034701B1 (en) * | 2000-06-16 | 2006-04-25 | The United States Of America As Represented By The Secretary Of The Navy | Identification of fire signatures for shipboard multi-criteria fire detection systems |
Non-Patent Citations (2)
Title |
---|
Khakzad, Hamid. "Improving performance of classification on severity of ill effects (SEV) index on fish using K-Means clustering algorithm with various distance metrics." Water Practice & Technology, vol. 14, no. 1, 2019, pp. 101-117. * |
Guo Quanyou et al. "Influence of environmental factors on the growth/no-growth interface of Vibrio alginolyticus in lightly salted large yellow croaker." Transactions of the Chinese Society of Agricultural Engineering, vol. 34, no. 3, pp. 292-299. * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||