CN111797937B - Greenhouse environment assessment method based on PNN network - Google Patents

Greenhouse environment assessment method based on PNN network

Info

Publication number
CN111797937B
CN111797937B (application CN202010678726.6A)
Authority
CN
China
Prior art keywords
samples
sample
pnn
training
pnn network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010678726.6A
Other languages
Chinese (zh)
Other versions
CN111797937A (en)
Inventor
关守平 (Guan Shouping)
方秋杨 (Fang Qiuyang)
陈旭涛 (Chen Xutao)
Original Assignee
东北大学 (Northeastern University)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 东北大学 (Northeastern University)
Priority to CN202010678726.6A priority Critical patent/CN111797937B/en
Publication of CN111797937A publication Critical patent/CN111797937A/en
Application granted granted Critical
Publication of CN111797937B publication Critical patent/CN111797937B/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/23: Clustering techniques
    • G06F 18/232: Non-hierarchical techniques
    • G06F 18/2321: Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F 18/23213: Non-hierarchical techniques with a fixed number of clusters, e.g. K-means clustering
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/004: Artificial life, i.e. computing arrangements simulating life
    • G06N 3/006: Artificial life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks

Abstract

The invention provides a greenhouse environment assessment method based on a PNN (probabilistic neural network) and relates to the technical field of facility agriculture. First, a greenhouse environment parameter sample library is established, the samples are graded, and the library is divided into training samples and test samples. The training samples are then clustered with an improved K-means algorithm, and a batch of representative samples is selected according to a representative-sample selection threshold to serve as the new training samples of the PNN network. After normalization, the new training samples are used to train the PNN network, the trained network grades the normalized test samples, and the error rate of classifying the test samples is computed. Finally, pattern-layer neurons of the same class in the PNN network use the same smoothing factor while pattern-layer neurons of different classes use different smoothing factors, and the classification error rate serves as the objective function of a particle swarm optimization algorithm that adjusts the smoothing factors of the PNN network, yielding an optimal PNN classification model.

Description

Greenhouse environment assessment method based on PNN network
Technical Field
The invention relates to the technical field of facility agriculture, in particular to a greenhouse environment assessment method based on a PNN network.
Background
With the development of agricultural technology and changes in the climate, greenhouse planting plays an increasingly important role in agricultural production. Its main purpose is to provide crops with a suitable growth environment despite harsh outdoor conditions. To ensure that the greenhouse environment meets the needs of crop growth, the environment is monitored online or offline and its quality is evaluated against expert experience, which is of great significance for guiding agricultural production.
At present, the integrated greenhouse environment detectors on the market share a common problem: they cannot judge from the measured data whether the greenhouse environment is good or bad and therefore offer little guidance to users, so greenhouse environment quality assessment still relies mainly on the experience and knowledge of human experts.
The probabilistic neural network (Probabilistic Neural Network, PNN) is a neural network with a simple structure and wide applications, proposed by Dr. D. F. Specht in 1988. Its basic idea is to separate a decision space within the multidimensional input space using the Bayesian minimum-risk criterion. The PNN is a feedforward network grounded in statistical principles that uses Parzen window functions as activation functions; it absorbs the advantages of the radial basis function network and of classical probability density estimation, and holds a clear advantage over conventional feedforward networks in pattern classification. However, the conventional PNN network has two drawbacks: (1) when there are too many training samples, the network places high demands on hardware storage space and computing power, which increases the difficulty and cost of hardware design; (2) the smoothing factor has a crucial influence on the classification accuracy of the network, yet its value is usually set manually, without a concrete selection criterion.
For structural and parameter optimization of PNN networks, the common approach is as follows: first, cluster the training samples with an algorithm such as K-means and use the representative cluster centers as the new training samples of the PNN network, thereby simplifying the network structure; then, assume all pattern-layer smoothing factors share the same value, take the PNN classification error rate as the objective function, and optimize the smoothing factor with an algorithm such as particle swarm optimization (Particle Swarm Optimization, PSO) to improve classification accuracy. This approach has two disadvantages: (1) when the number of cluster centers is small, using only the centers as new training samples leaves too few training samples and degrades the classification accuracy of the PNN network; (2) assuming identical parameters across all pattern-layer neurons weakens the network's ability to account for differences between samples during classification, and details of the test samples may be ignored.
Disclosure of Invention
In view of the defects of the prior art, the technical problem to be solved by the invention is to provide a greenhouse environment assessment method based on a PNN network, so as to assess the greenhouse environment.
In order to solve the above technical problem, the invention adopts the following technical scheme: a greenhouse environment assessment method based on a PNN network, comprising the following steps:
step 1: establish a greenhouse environment parameter sample library of size n, and grade the quality of each group of samples in the library on M levels, thereby dividing the n samples into M classes; each group of samples in the library has dimension q, its components representing the greenhouse environmental parameters with an important influence on plant growth;
step 2: select m samples from the sample library as training samples, and use the remaining l = n − m samples as test samples;
step 3: initializing parameters in an improved K-means clustering algorithm and a particle swarm optimization algorithm;
the initialized parameters are specifically as follows: for the improved K-means clustering algorithm, initialize the number of clusters k, the initial cluster centers $u_g^{(1)}$ ($g=1,2,\dots,k$), the iteration stop threshold $\varepsilon$, the representative-sample selection threshold $\alpha$, the maximum iteration count J and the current iteration count j; for the particle swarm optimization algorithm (Particle Swarm Optimization, PSO), initialize the number of particles N, the solution-space dimension D, the maximum iteration count max_iter, and the particles' initial position vector px and initial velocity vector pv; denote the position vector of particle i by $px_i=[px_{i1},px_{i2},\dots,px_{iD}]$, $i\in[1,N]$, its velocity vector by $pv_i=[pv_{i1},pv_{i2},\dots,pv_{iD}]$, the individual best position minimizing the objective function in the current iteration by $pbest_i=[pbest_{i1},pbest_{i2},\dots,pbest_{iD}]$, and the population best position by $gbest=[gbest_1,gbest_2,\dots,gbest_D]$; the minima of the objective function experienced by the individual and by the population during iteration are $p\_fitness_i$ and $g\_fitness$, respectively; initialize all smoothing factors in the PNN network to 0.1;
step 4: cluster the m selected training samples with the improved K-means clustering algorithm to obtain k clusters and k cluster centers, with $m_g$ ($g=1,2,\dots,k$) samples in each cluster; according to the representative-sample selection threshold $\alpha$, select a batch of representative samples from each cluster as the new training samples of the PNN network;
step 4.1: randomly select k sample points from the m training samples as the initial cluster centers $u_g^{(1)}$ ($g=1,2,\dots,k$) of the K-means clustering algorithm;
step 4.2: with j the current iteration count, compute for each sample point $p_t$ ($t=1,2,\dots,m$) in the training set the Euclidean distance to each cluster center $u_g^{(j)}$ in turn:

$$d(t,g)=\sqrt{\sum_{v=1}^{q}\left(p_{tv}-u_{gv}^{(j)}\right)^{2}} \qquad (1)$$
step 4.3: find, for each sample point, the cluster center $u_g^{(j)}$ at the smallest Euclidean distance d(t,g), and assign the sample point $p_t$ to the cluster of that nearest center;
step 4.4: recompute the center of each cluster as the mean of its samples:

$$u_g^{(j+1)}=\frac{1}{m_g}\sum_{w=1}^{m_g}p_{gw} \qquad (2)$$

where $p_{gw}$ denotes the w-th sample point in the g-th cluster;
step 4.5: compute the sum of squared distances between the samples in each cluster and the new cluster centers:

$$E_{j+1}=\sum_{g=1}^{k}\sum_{w=1}^{m_g}\left\|p_{gw}-u_g^{(j+1)}\right\|^{2} \qquad (3)$$

where $E_{j+1}$ denotes the sum of squared distances from the samples in each cluster to the new cluster centers;
step 4.6: judge whether the iteration count j equals the maximum iteration count J or $|E_{j+1}-E_j|<\varepsilon$; if so, execute step 4.7, otherwise return to step 4.2;
step 4.7: count the number of samples $m_g$ in each cluster and, according to the sample selection threshold $\alpha$, output the $m_g\cdot\alpha$ samples nearest to each cluster center as the most representative training samples, yielding $p=m\cdot\alpha$ new training samples; a minimal sketch of this step in code follows;
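A minimal sketch of step 4 in Python with NumPy; the function name, parameter defaults and the no-empty-cluster assumption are illustrative additions, not taken from the patent:

```python
import numpy as np

def kmeans_representative_samples(samples, k, max_iter=20, eps=1e-3,
                                  alpha=0.2, seed=None):
    """Improved K-means of step 4: cluster, then keep the fraction
    alpha of samples nearest each cluster centre (steps 4.1-4.7)."""
    rng = np.random.default_rng(seed)
    m = samples.shape[0]
    centers = samples[rng.choice(m, size=k, replace=False)]   # step 4.1
    prev_sse = np.inf
    for _ in range(max_iter):                                 # step 4.6 bound
        # steps 4.2-4.3: assign each sample to its nearest centre
        dist = np.linalg.norm(samples[:, None, :] - centers[None, :, :],
                              axis=2)
        labels = dist.argmin(axis=1)
        # step 4.4: recompute centres (assumes no cluster goes empty)
        centers = np.stack([samples[labels == g].mean(axis=0)
                            for g in range(k)])
        # step 4.5: within-cluster sum of squared distances E_{j+1}
        sse = sum(((samples[labels == g] - centers[g]) ** 2).sum()
                  for g in range(k))
        if abs(prev_sse - sse) < eps:                         # step 4.6 test
            break
        prev_sse = sse
    # step 4.7: keep the alpha fraction of samples nearest each centre
    keep = []
    for g in range(k):
        idx = np.flatnonzero(labels == g)
        d_g = np.linalg.norm(samples[idx] - centers[g], axis=1)
        n_keep = max(1, int(round(len(idx) * alpha)))
        keep.extend(idx[np.argsort(d_g)[:n_keep]])
    return np.array(sorted(keep))
```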
step 5: normalize the new training samples;
let the new training sample matrix X be

$$X=\begin{bmatrix}x_{11}&x_{12}&\cdots&x_{1q}\\x_{21}&x_{22}&\cdots&x_{2q}\\\vdots&\vdots&&\vdots\\x_{p1}&x_{p2}&\cdots&x_{pq}\end{bmatrix} \qquad (4)$$

where p denotes the number of new training samples and q the dimension of the new training samples;
normalize the new training sample matrix X by the normalization factor matrix B to obtain the matrix C; with B taken as the diagonal matrix of reciprocal row norms, so that each training sample is scaled to unit Euclidean length, B and C are

$$B=\operatorname{diag}\left(b_1,b_2,\dots,b_p\right),\qquad b_k=\frac{1}{\sqrt{\sum_{v=1}^{q}x_{kv}^{2}}} \qquad (5)$$

$$C=BX=\begin{bmatrix}b_1x_{11}&b_1x_{12}&\cdots&b_1x_{1q}\\\vdots&\vdots&&\vdots\\b_px_{p1}&b_px_{p2}&\cdots&b_px_{pq}\end{bmatrix} \qquad (6)$$

a short sketch of this step follows;
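A sketch of step 5; the exact form of B is not legible in the source, so this assumes the unit-row-norm normalization stated above:

```python
import numpy as np

def normalize_rows(X):
    """Scale each sample (row) of X to unit Euclidean length:
    C = B X with b_k = 1 / sqrt(sum_v x_kv^2)."""
    return X / np.linalg.norm(X, axis=1, keepdims=True)
```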
step 6: train the PNN network with the normalized training sample matrix C, grade the normalized test samples with the trained PNN network, and at the same time compute the error rate of classifying the test samples;
the PNN network structure comprises an input layer, a pattern layer, a summation layer and an output layer; the input layer passes the data to the pattern layer without processing; the number of pattern-layer neurons equals the number of training samples, and the activation function is a Gaussian function; the summation layer is sparsely connected to the pattern layer, and its number of neurons equals the number of training-sample classes; the output layer selects, according to the Bayesian decision rule, the class with the maximum posterior probability as output;
step 6.1: construct the pattern layer of the PNN network from the normalized training sample matrix C;
after normalizing the new training sample matrix X, the training sample matrix C is obtained:

$$C=\begin{bmatrix}c_{11}&c_{12}&\cdots&c_{1q}\\c_{21}&c_{22}&\cdots&c_{2q}\\\vdots&\vdots&&\vdots\\c_{p1}&c_{p2}&\cdots&c_{pq}\end{bmatrix} \qquad (7)$$

the training matrix C contains p training samples divided into M classes; letting the class sizes be $h_1,h_2,\dots,h_M$, we have

$$p=h_1+h_2+\dots+h_M \qquad (8)$$

assume the M classes of samples are arranged sequentially in the sample matrix C and number the pattern-layer neurons 1 to p; neurons numbered 1 through $h_1$ correspond to class-1 training samples, neurons numbered $h_1+1$ through $h_1+h_2$ correspond to class-2 training samples, and so on, with neurons numbered $p-h_M+1$ through p belonging to class M;
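As a small illustration of this block numbering (the class sizes and the names `class_sizes`, `train_labels` are hypothetical, not from the patent):

```python
import numpy as np

# hypothetical class sizes h_1..h_M, e.g. M = 4 classes with p = 180
class_sizes = [50, 45, 40, 45]
# pattern-layer neuron r stores training sample r, so its class label
# follows the sequential block numbering described above:
train_labels = np.repeat(np.arange(len(class_sizes)), class_sizes)
# train_labels[:50] == 0, train_labels[50:95] == 1, and so on
```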
step 6.2: compute the Euclidean distance between each test sample in the test sample matrix and each training sample in the training set;
the normalized test sample matrix T, consisting of l = n − m test samples, is

$$T=\begin{bmatrix}t_{11}&t_{12}&\cdots&t_{1q}\\\vdots&\vdots&&\vdots\\t_{l1}&t_{l2}&\cdots&t_{lq}\end{bmatrix} \qquad (9)$$

and the Euclidean distance matrix $E_d$ between the test samples and the training samples is

$$E_d=\left[e_{ar}\right]_{l\times p},\qquad e_{ar}=\sqrt{\sum_{v=1}^{q}\left(t_{av}-c_{rv}\right)^{2}} \qquad (10)$$
step 6.3: activate the pattern-layer neurons with radial basis functions;
choose the Gaussian function as the activation function of the pattern-layer neurons and compute the activated probability matrix U:

$$U=\left[u_{ar}\right]_{l\times p},\qquad u_{ar}=\exp\!\left(-\frac{e_{ar}^{2}}{2\sigma_{r}^{2}}\right) \qquad (11)$$

where $\sigma_1,\sigma_2,\dots,\sigma_p$ denote the smoothing factors of the p pattern-layer neurons; at the start, all smoothing factors are set to the same value, i.e. $\sigma_1=\sigma_2=\dots=\sigma_p=0.1$;
step 6.4: in the summation layer, compute the initial probability-sum matrix S of each test sample for each class by summing the pattern-layer outputs class by class:

$$S=\left[s_{ab}\right]_{l\times M},\qquad s_{ab}=\sum_{r\in\text{class}\ b}u_{ar} \qquad (12)$$
step 6.5: from the initial probability sums of each test sample for the classes, compute the probability $prob_{ab}$ that the a-th test sample belongs to the b-th class:

$$prob_{ab}=\frac{s_{ab}}{\sum_{b=1}^{M}s_{ab}} \qquad (13)$$

where $a\in[1,l]$ and $b\in[1,M]$;
step 6.6: according to the Bayesian decision theorem and the probabilities of each test sample belonging to the classes, determine the class of the a-th test sample:

$$y_a=\arg\max_b\left(prob_{ab}\right) \qquad (14)$$

where $y_a$ denotes the prediction of the PNN network for the a-th test sample, i.e. the class assigned to it;
step 6.7: compute the error rate of the PNN network in classifying the test samples:

$$ER=\frac{n_e}{l} \qquad (15)$$

where ER denotes the error rate of the PNN network on the test samples, $n_e$ the number of samples misclassified by the PNN network, and l = n − m the number of test samples;
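Steps 6.2 to 6.7 condense into a short forward pass. The sketch below (NumPy) assumes the `train_labels` convention from the earlier snippet; the function names are illustrative, and the summation layer is written as a plain class-wise sum, which matches formulas (12)-(13) up to a per-class constant:

```python
import numpy as np

def pnn_classify(train, train_labels, test, sigma, n_classes):
    """PNN forward pass: distance matrix E_d (10), Gaussian activation
    U (11), class-wise sums S (12), posterior (13), decision (14)."""
    e_d = np.linalg.norm(test[:, None, :] - train[None, :, :], axis=2)
    u = np.exp(-e_d ** 2 / (2.0 * sigma ** 2))      # sigma: shape (p,)
    s = np.stack([u[:, train_labels == b].sum(axis=1)
                  for b in range(n_classes)], axis=1)
    prob = s / s.sum(axis=1, keepdims=True)         # posterior per class
    return prob.argmax(axis=1)                      # Bayesian decision

def error_rate(pred, true_labels):
    """Classification error rate ER = n_e / l of formula (15)."""
    return np.mean(pred != true_labels)
```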
step 7: let pattern-layer neurons of the same class in the PNN network use the same smoothing factor and pattern-layer neurons of different classes use different smoothing factors; then take the error rate ER of the PNN network in classifying the test samples as the objective function of the PSO algorithm and adjust the smoothing-factor parameters in the PNN network through the PSO algorithm, thereby optimizing the PNN network and finally obtaining the optimal PNN classification model;
step 7.1: in the β-th iteration of the PSO algorithm optimizing the PNN network parameters, first update the velocity vector $pv_i$ and position vector $px_i$ of each particle:

$$pv_i^{\beta+1}=\omega\cdot pv_i^{\beta}+c_1\mu_1\left(pbest_i^{\beta}-px_i^{\beta}\right)+c_2\mu_2\left(gbest^{\beta}-px_i^{\beta}\right) \qquad (16)$$

$$px_i^{\beta+1}=px_i^{\beta}+pv_i^{\beta+1} \qquad (17)$$

where ω is the inertia weight, characterizing the search capability of the particle swarm optimization algorithm; $c_1$, $c_2$ are the learning factors of the individual and global extremum points, respectively; $\mu_1$, $\mu_2$ are random numbers between 0 and 1; since the objects optimized by the PSO algorithm are the smoothing factors used by each class of pattern-layer neurons, the particle solution-space dimension is D = M;
step 7.2: the updated particle position vector represents a feasible solution for the PNN network smoothing factors; substitute the position vector $px_i^{\beta+1}$ for the smoothing-factor values, then compute the error rate $ER_i^{\beta+1}$ of the resulting PNN network in classifying the corresponding test samples, and update the particle's individual best position $pbest_i$ and objective-function minimum $p\_fitness_i$, together with the population best position gbest and objective-function minimum $g\_fitness$, according to the following update rule;
the update rule is as follows:
if $ER_i^{\beta+1}<p\_fitness_i$, then $pbest_i=px_i^{\beta+1}$ and $p\_fitness_i=ER_i^{\beta+1}$; otherwise $pbest_i$ and $p\_fitness_i$ remain unchanged;
if $p\_fitness_i<g\_fitness$, then $gbest=pbest_i$ and $g\_fitness=p\_fitness_i$; otherwise gbest and g_fitness remain unchanged;
step 7.3: when the iteration count reaches the maximum iteration count max_iter, the PSO algorithm terminates, with gbest representing the optimal solution for the PNN network smoothing factors and g_fitness the error rate of the PNN network that uses gbest as its optimal smoothing factors in classifying the test samples; otherwise, step 7.1 is re-executed; a sketch of this optimization loop follows;
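A sketch of the PSO loop of step 7, reusing the `pnn_classify` and `error_rate` helpers above; expanding the M per-class smoothing factors to per-neuron values implements the same-class-same-factor rule, while the positivity clamp on σ and the initialization ranges are assumptions not specified in the text:

```python
import numpy as np

def pso_optimize_sigma(train, train_labels, test, test_labels, n_classes,
                       n_particles=30, max_iter=100, w=0.6, c1=2.0, c2=2.0,
                       seed=None):
    rng = np.random.default_rng(seed)
    d = n_classes                                  # D = M: one sigma per class
    px = rng.uniform(0.01, 1.0, (n_particles, d))  # positions (assumed range)
    pv = rng.uniform(-0.1, 0.1, (n_particles, d))  # velocities (assumed range)

    def fitness(pos):
        sigma = pos[train_labels]       # same class -> same smoothing factor
        pred = pnn_classify(train, train_labels, test, sigma, n_classes)
        return error_rate(pred, test_labels)       # objective function ER

    p_fit = np.array([fitness(p) for p in px])     # individual minima
    pbest = px.copy()
    g_idx = p_fit.argmin()
    gbest, g_fit = pbest[g_idx].copy(), p_fit[g_idx]

    for _ in range(max_iter):
        mu1 = rng.random((n_particles, d))
        mu2 = rng.random((n_particles, d))
        pv = w * pv + c1 * mu1 * (pbest - px) + c2 * mu2 * (gbest - px)  # (16)
        px = np.abs(px + pv)            # (17), clamped positive (assumption)
        for i in range(n_particles):
            f = fitness(px[i])
            if f < p_fit[i]:                       # individual-best update
                p_fit[i], pbest[i] = f, px[i].copy()
            if f < g_fit:                          # population-best update
                g_fit, gbest = f, px[i].copy()
    return gbest, g_fit
```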
step 8: collect the greenhouse environment data to be evaluated and evaluate the greenhouse environment quality with the optimal PNN classification model obtained in step 7.
The beneficial effects of the above technical scheme are as follows: the greenhouse environment assessment method based on the PNN network applies the PNN classification model to greenhouse environment quality assessment, remedying the lack of assessment capability in integrated greenhouse environment detectors. The improved K-means algorithm selects representative training samples, so the selected training set is more representative and satisfies the PNN network's requirements on training samples; it also reduces the complexity of the PNN network structure, lowering the difficulty of hardware implementation and the storage cost, while avoiding the sharp drop in classification accuracy that too few training samples would cause. The improved PSO algorithm optimizes the smoothing factors of the PNN network, with pattern-layer neurons of the same class sharing one smoothing factor and neurons of different classes using different ones, further improving the classification accuracy of the PNN network.
Drawings
FIG. 1 is a flowchart of a greenhouse environment assessment method based on a PNN network, provided by an embodiment of the present invention;
FIG. 2 is a diagram of environmental requirements of greenhouse cucumber planting provided by the embodiment of the invention;
FIG. 3 is a schematic structural diagram of a PNN network according to an embodiment of the present invention;
FIG. 4 is a diagram of classification results of a PNN network on a test sample according to an embodiment of the present invention.
Detailed Description
The following describes in further detail the embodiments of the present invention with reference to the drawings and examples. The following examples are illustrative of the invention and are not intended to limit the scope of the invention.
In this embodiment, taking a greenhouse environment for planting cucumber as an example, the greenhouse environment is evaluated by adopting the greenhouse environment evaluation method based on the PNN network.
A greenhouse environment assessment method based on PNN network, as shown in figure 1, comprises the following steps:
step 1: establish a greenhouse environment parameter sample library of size n, and grade the quality of each group of samples in the library on M levels, thereby dividing the n samples into M classes; each group of samples in the library has dimension q, its components representing the greenhouse environmental parameters with an important influence on plant growth;
in this example, the dimension of each group of samples in the sample library is q = 7, representing the 7 greenhouse environmental parameters with an important influence on plant growth: air temperature, air humidity, carbon dioxide concentration, illumination intensity, soil temperature, soil humidity and soil salinity; each group of sample data is evaluated and classified into one of four grades (excellent, good, medium and poor), represented by 1, 2, 3 and 4 respectively.
In this embodiment, the requirements of greenhouse cucumber planting on the 7 environmental parameters are shown in fig. 2, from which it can be seen that the optimum growth conditions for greenhouse cucumbers are: carbon dioxide concentration 1000 ppm to 1500 ppm, illumination intensity 55 klx to 60 klx, air humidity 70.0% to 80.0%, soil humidity 80.0% to 90.0%, air temperature 25 °C to 30 °C, soil temperature 20 °C to 24 °C, and soil salinity 0.5 mS/cm to 0.8 mS/cm. In this embodiment, 1000 groups of greenhouse environment samples are measured by sensors and each group is comprehensively evaluated; part of the sample data is shown in Table 1;
TABLE 1 Sample data for a portion of the sample library (the table is available only as an image in the original document)
Step 2: select m samples from the sample library as training samples, and use the remaining l = n − m samples as test samples;
in this embodiment, 900 samples are selected from the sample library as training samples according to a 9:1 ratio, and the remaining 100 samples serve as test samples;
step 3: initializing parameters in an improved K-means clustering algorithm and a particle swarm optimization algorithm;
the initialized parameters are specifically as follows: for the improved K-means clustering algorithm, initialize the number of clusters k, the initial cluster centers $u_g^{(1)}$ ($g=1,2,\dots,k$), the iteration stop threshold $\varepsilon$, the representative-sample selection threshold $\alpha$, the maximum iteration count J and the current iteration count j; for the particle swarm optimization algorithm (Particle Swarm Optimization, PSO), initialize the number of particles N, the solution-space dimension D, the maximum iteration count max_iter, and the particles' initial position vector px and initial velocity vector pv; denote the position vector of particle i by $px_i=[px_{i1},px_{i2},\dots,px_{iD}]$, $i\in[1,N]$, its velocity vector by $pv_i=[pv_{i1},pv_{i2},\dots,pv_{iD}]$, the individual best position minimizing the objective function in the current iteration by $pbest_i=[pbest_{i1},pbest_{i2},\dots,pbest_{iD}]$, and the population best position by $gbest=[gbest_1,gbest_2,\dots,gbest_D]$; the minima of the objective function experienced by the individual and by the population during iteration are $p\_fitness_i$ and $g\_fitness$, respectively; initialize all smoothing factors in the PNN network to 0.1;
in this embodiment, as can be seen from steps 1 and 2, the sample library size is n = 1000, the dimension of each sample is q = 7, the number of sample classes is M = 4 (i.e. 4 different types of training samples), the original number of training samples is m = 900, and the number of test samples is l = 100; meanwhile, the improved K-means algorithm is initialized with cluster count k = 8, iteration stop threshold ε = 0.001, representative-sample selection threshold α = 0.2, current iteration count j = 1 and maximum iteration count J = 20; the improved PSO algorithm is initialized with particle count N = 30, solution-space dimension D = M = 4 and maximum iteration count max_iter = 100, with the i-th particle's initial position vector $px_i=[px_{i1},px_{i2},\dots,px_{iD}]$, initial velocity vector $pv_i=[pv_{i1},pv_{i2},\dots,pv_{iD}]$, individual best position $pbest_i=[pbest_{i1},pbest_{i2},\dots,pbest_{iD}]$ and population best position $gbest=[gbest_1,gbest_2,\dots,gbest_D]$; the minima of the objective function experienced by the individual and by the population during iteration are $p\_fitness_i$ and $g\_fitness$, respectively; the smoothing factors of the initial PNN network all take the same value, σ = 0.1;
step 4: cluster the m selected training samples with the improved K-means clustering algorithm to obtain k clusters and k cluster centers, with $m_g$ ($g=1,2,\dots,k$) samples in each cluster; according to the representative-sample selection threshold $\alpha$, select a batch of representative samples from each cluster as the new training samples of the PNN network;
step 4.1: randomly select k sample points from the m training samples as the initial cluster centers $u_g^{(1)}$ ($g=1,2,\dots,k$) of the K-means clustering algorithm;
step 4.2: with j the current iteration count, compute for each sample point $p_t$ ($t=1,2,\dots,m$) in the training set the Euclidean distance to each cluster center $u_g^{(j)}$ in turn:

$$d(t,g)=\sqrt{\sum_{v=1}^{q}\left(p_{tv}-u_{gv}^{(j)}\right)^{2}} \qquad (1)$$
step 4.3: find, for each sample point, the cluster center $u_g^{(j)}$ at the smallest Euclidean distance d(t,g), and assign the sample point $p_t$ to the cluster of that nearest center;
step 4.4: recompute the center of each cluster as the mean of its samples:

$$u_g^{(j+1)}=\frac{1}{m_g}\sum_{w=1}^{m_g}p_{gw} \qquad (2)$$

where $p_{gw}$ denotes the w-th sample point in the g-th cluster;
step 4.5: compute the sum of squared distances between the samples in each cluster and the new cluster centers:

$$E_{j+1}=\sum_{g=1}^{k}\sum_{w=1}^{m_g}\left\|p_{gw}-u_g^{(j+1)}\right\|^{2} \qquad (3)$$

where $E_{j+1}$ denotes the sum of squared distances from the samples in each cluster to the new cluster centers; the goal of the K-means algorithm can thus be seen as minimizing this within-cluster sum of squares;
step 4.6: judge whether the iteration count j equals the maximum iteration count J = 20 or $|E_{j+1}-E_j|<\varepsilon=0.001$; if so, execute step 4.7, otherwise return to step 4.2;
step 4.7: count the number of samples $m_g$ in each cluster and, according to the sample selection threshold $\alpha$, output the $m_g\cdot\alpha$ samples nearest to each cluster center as the most representative training samples, yielding $p=m\cdot\alpha=180$ new training samples;
step 5: normalize the new training samples;
let the new training sample matrix X be

$$X=\begin{bmatrix}x_{11}&x_{12}&\cdots&x_{1q}\\x_{21}&x_{22}&\cdots&x_{2q}\\\vdots&\vdots&&\vdots\\x_{p1}&x_{p2}&\cdots&x_{pq}\end{bmatrix} \qquad (4)$$

where p = 180 denotes the number of new training samples and q = 7 the dimension of the new training samples;
normalize the new training sample matrix X by the normalization factor matrix B to obtain the matrix C; with B taken as the diagonal matrix of reciprocal row norms, so that each training sample is scaled to unit Euclidean length, B and C are

$$B=\operatorname{diag}\left(b_1,b_2,\dots,b_p\right),\qquad b_k=\frac{1}{\sqrt{\sum_{v=1}^{q}x_{kv}^{2}}} \qquad (5)$$

$$C=BX=\begin{bmatrix}b_1x_{11}&b_1x_{12}&\cdots&b_1x_{1q}\\\vdots&\vdots&&\vdots\\b_px_{p1}&b_px_{p2}&\cdots&b_px_{pq}\end{bmatrix} \qquad (6)$$
step 6: train the PNN network with the normalized training sample matrix C, grade the normalized test samples with the trained PNN network, and at the same time compute the error rate of classifying the test samples;
the PNN network structure comprises an input layer, a pattern layer, a summation layer and an output layer; the input layer passes the data to the pattern layer without processing; the number of pattern-layer neurons equals the number of training samples and the activation function is a Gaussian function, its main role being to compute the match between the input feature vector and each of the training samples; the summation layer is sparsely connected to the pattern layer and its number of neurons equals the number of training-sample classes, its main role being, following the Parzen window computation, to sum the pattern-layer outputs class by class and average them, giving the posterior probability of the input vector for each class; the output layer selects, according to the Bayesian decision rule, the class with the maximum posterior probability as output;
step 6.1: construct the pattern layer of the PNN network from the normalized training sample matrix C;
after normalizing the new training sample matrix X, the training sample matrix C is obtained:

$$C=\begin{bmatrix}c_{11}&c_{12}&\cdots&c_{1q}\\c_{21}&c_{22}&\cdots&c_{2q}\\\vdots&\vdots&&\vdots\\c_{p1}&c_{p2}&\cdots&c_{pq}\end{bmatrix} \qquad (7)$$

the training matrix C contains p = 180 training samples divided into M = 4 classes; letting the four class sizes be $h_1,h_2,h_3,h_4$, we have

$$p=h_1+h_2+h_3+h_4 \qquad (8)$$

in a PNN network, the number of pattern-layer neurons is determined by the training samples, so a training set of p samples gives p pattern-layer neurons; assume the 4 classes of samples are arranged sequentially in the sample matrix C and number the pattern-layer neurons 1 to p; neurons numbered 1 through $h_1$ correspond to class-1 training samples, neurons numbered $h_1+1$ through $h_1+h_2$ correspond to class-2 training samples, and so on, with neurons numbered $p-h_4+1$ through p belonging to class 4;
in this embodiment, the structure of the PNN network is shown in fig. 3;
step 6.2: compute the Euclidean distance between each test sample in the test sample matrix and each training sample in the training set;
the normalized test sample matrix T, consisting of l = n − m = 100 test samples, is

$$T=\begin{bmatrix}t_{11}&t_{12}&\cdots&t_{1q}\\\vdots&\vdots&&\vdots\\t_{l1}&t_{l2}&\cdots&t_{lq}\end{bmatrix} \qquad (9)$$

and the Euclidean distance matrix $E_d$ between the test samples and the training samples is

$$E_d=\left[e_{ar}\right]_{l\times p},\qquad e_{ar}=\sqrt{\sum_{v=1}^{q}\left(t_{av}-c_{rv}\right)^{2}} \qquad (10)$$
step 6.3: activate the pattern-layer neurons with radial basis functions;
choose the Gaussian function as the activation function of the pattern-layer neurons and compute the activated probability matrix U:

$$U=\left[u_{ar}\right]_{l\times p},\qquad u_{ar}=\exp\!\left(-\frac{e_{ar}^{2}}{2\sigma_{r}^{2}}\right) \qquad (11)$$

where $\sigma_1,\sigma_2,\dots,\sigma_p$ denote the smoothing factors of the p pattern-layer neurons; the value of the smoothing factors has a critical influence on the classification accuracy of the PNN network; initially, all smoothing factors are set to the same value, i.e. $\sigma_1=\sigma_2=\dots=\sigma_p=0.1$;
step 6.4: in the summation layer, compute the initial probability-sum matrix S of each test sample for each class by summing the pattern-layer outputs class by class:

$$S=\left[s_{ab}\right]_{l\times M},\qquad s_{ab}=\sum_{r\in\text{class}\ b}u_{ar} \qquad (12)$$
step 6.5: from the initial probability sums of each test sample for the classes, compute the probability $prob_{ab}$ that the a-th test sample belongs to the b-th class:

$$prob_{ab}=\frac{s_{ab}}{\sum_{b=1}^{M}s_{ab}} \qquad (13)$$

where $a\in[1,l]$ and $b\in[1,M]$;
step 6.6: according to the Bayesian decision theorem and the probabilities of each test sample belonging to the classes, determine the class of the a-th test sample:

$$y_a=\arg\max_b\left(prob_{ab}\right) \qquad (14)$$

where $y_a$ denotes the prediction of the PNN network for the a-th test sample, i.e. the class assigned to it;
step 6.7: compute the error rate of the PNN network in classifying the test samples:

$$ER=\frac{n_e}{l} \qquad (15)$$

where ER denotes the error rate of the PNN network on the test samples, $n_e$ the number of samples misclassified by the PNN network, and l = n − m the number of test samples;
step 7: let pattern-layer neurons of the same class in the PNN network use the same smoothing factor and pattern-layer neurons of different classes use different smoothing factors; then take the error rate ER of the PNN network in classifying the test samples as the objective function of the PSO algorithm and adjust the smoothing-factor parameters in the PNN network through the PSO algorithm, optimizing the PNN network by continually reducing ER within a finite number of iterations, and finally obtaining the optimal PNN classification model;
step 7.1: in the β-th iteration of the PSO algorithm optimizing the PNN network parameters, first update the velocity vector $pv_i$ and position vector $px_i$ of each particle:

$$pv_i^{\beta+1}=\omega\cdot pv_i^{\beta}+c_1\mu_1\left(pbest_i^{\beta}-px_i^{\beta}\right)+c_2\mu_2\left(gbest^{\beta}-px_i^{\beta}\right) \qquad (16)$$

$$px_i^{\beta+1}=px_i^{\beta}+pv_i^{\beta+1} \qquad (17)$$

where ω is the inertia weight, characterizing the search capability of the particle swarm optimization algorithm; $c_1$, $c_2$ are the learning factors of the individual and global extremum points, respectively; $\mu_1$, $\mu_2$ are random numbers between 0 and 1; since the objects optimized by the PSO algorithm are the smoothing factors used by each class of pattern-layer neurons, the particle solution-space dimension is D = M = 4;
in this embodiment, the inertia weight is ω = 0.6 and the learning factors are $c_1=c_2=2$;
step 7.2: the updated particle position vector represents a feasible solution for the PNN network smoothing factors; substitute the position vector $px_i^{\beta+1}$ for the smoothing-factor values, then, following the procedure of step 6 for computing the PNN classification error rate on the test samples, compute the error rate $ER_i^{\beta+1}$ of the resulting PNN network in classifying the corresponding test samples, and update the particle's individual best position $pbest_i$ and objective-function minimum $p\_fitness_i$, together with the population best position gbest and objective-function minimum $g\_fitness$, according to the following update rule;
the update rule is as follows:
if $ER_i^{\beta+1}<p\_fitness_i$, then $pbest_i=px_i^{\beta+1}$ and $p\_fitness_i=ER_i^{\beta+1}$; otherwise $pbest_i$ and $p\_fitness_i$ remain unchanged;
if $p\_fitness_i<g\_fitness$, then $gbest=pbest_i$ and $g\_fitness=p\_fitness_i$; otherwise gbest and g_fitness remain unchanged;
step 7.3: when the iteration count reaches max_iter = 100, the PSO algorithm terminates, with gbest representing the optimal solution for the PNN network smoothing factors and g_fitness the error rate of the PNN network that uses gbest as its optimal smoothing factors in classifying the test samples; otherwise, step 7.1 is re-executed;
step 8: collect the greenhouse environment data to be evaluated and evaluate the greenhouse environment quality with the optimal PNN classification model obtained in step 7.
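Tying the sketches above together under this embodiment's settings (n = 1000, q = 7, M = 4, a 9:1 split, k = 8, α = 0.2, N = 30, max_iter = 100); the random arrays merely stand in for the sensor data of Table 1, so the printed error rate carries no agronomic meaning:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((1000, 7))            # placeholder for the 7 sensor readings
y = rng.integers(0, 4, 1000)         # placeholder grades 1..4, coded 0..3

train_x, train_y = X[:900], y[:900]  # 9:1 train/test split
test_x, test_y = X[900:], y[900:]

keep = kmeans_representative_samples(train_x, k=8, max_iter=20,
                                     eps=1e-3, alpha=0.2, seed=1)
new_x, new_y = normalize_rows(train_x[keep]), train_y[keep]   # steps 4-5
test_n = normalize_rows(test_x)

gbest, g_fit = pso_optimize_sigma(new_x, new_y, test_n, test_y,
                                  n_classes=4)                # steps 6-7
print("optimal per-class smoothing factors:", gbest)
print("test-set error rate:", g_fit)
```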
In this embodiment, the test-set samples are evaluated with the optimal PNN classification model; the results are shown in fig. 4, and the overall evaluation accuracy on the test samples reaches 85%. The per-class accuracies are listed in Table 2: the evaluation method of the invention attains its highest accuracy, 95.2%, on class-2 test samples and its lowest, 62.5%, on class-4 test samples, a gap caused by the differing distribution of the classes among the training samples.
Table 2 Per-class classification results of the evaluation method of the invention (the table is available only as an image in the original document)
In this embodiment, the optimal PNN classification model of the invention is compared with the conventional PNN evaluation model in terms of evaluation accuracy, network structure, training time, test time and storage space, as shown in Table 3. The optimal PNN classification model requires more time for training but outperforms the conventional PNN network in classification accuracy, network structure, test time and storage space. The proposed assessment method therefore evaluates faster while demanding less of hardware design and storage, providing an effective method for greenhouse environment quality assessment.
Table 3 Comparison with the conventional PNN evaluation model (the table is available only as an image in the original document)
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it; although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features replaced by equivalents, without departing from the spirit of the corresponding technical solutions as defined by the scope of the appended claims.

Claims (6)

1. A greenhouse environment assessment method based on a PNN network is characterized by comprising the following steps: the method comprises the following steps:
step 1: establish a greenhouse environment parameter sample library of size n, and grade the quality of each group of samples in the library on M levels, thereby dividing the n samples into M classes; each group of samples in the library has dimension q, its components representing the greenhouse environmental parameters with an important influence on plant growth;
step 2: select m samples from the sample library as training samples, and use the remaining l = n − m samples as test samples;
step 3: initializing parameters in an improved K-means clustering algorithm and a particle swarm optimization algorithm;
step 4: cluster the m selected training samples with the improved K-means clustering algorithm to obtain k clusters and k cluster centers, with $m_g$ ($g=1,2,\dots,k$) samples in each cluster; according to the representative-sample selection threshold $\alpha$, select a batch of representative samples from each cluster as the new training samples of the PNN network;
step 4.1: randomly select k sample points from the m training samples as the initial cluster centers $u_g^{(1)}$ ($g=1,2,\dots,k$) of the K-means clustering algorithm;
step 4.2: with j the current iteration count, compute for each sample point $p_t$ ($t=1,2,\dots,m$) in the training set the Euclidean distance to each cluster center $u_g^{(j)}$ in turn:

$$d(t,g)=\sqrt{\sum_{v=1}^{q}\left(p_{tv}-u_{gv}^{(j)}\right)^{2}} \qquad (1)$$
step 4.3: find, for each sample point, the cluster center $u_g^{(j)}$ at the smallest Euclidean distance d(t,g), and assign the sample point $p_t$ to the cluster of that nearest center;
step 4.4: recompute the center of each cluster as the mean of its samples:

$$u_g^{(j+1)}=\frac{1}{m_g}\sum_{w=1}^{m_g}p_{gw} \qquad (2)$$

where $p_{gw}$ denotes the w-th sample point in the g-th cluster;
step 4.5: compute the sum of squared distances between the samples in each cluster and the new cluster centers:

$$E_{j+1}=\sum_{g=1}^{k}\sum_{w=1}^{m_g}\left\|p_{gw}-u_g^{(j+1)}\right\|^{2} \qquad (3)$$

where $E_{j+1}$ denotes the sum of squared distances from the samples in each cluster to the new cluster centers;
step 4.6: judge whether the iteration count j equals the maximum iteration count J or $|E_{j+1}-E_j|<\varepsilon$; if so, execute step 4.7, otherwise return to step 4.2;
step 4.7: count the number of samples $m_g$ in each cluster and, according to the sample selection threshold $\alpha$, output the $m_g\cdot\alpha$ samples nearest to each cluster center as the most representative training samples, yielding $p=m\cdot\alpha$ new training samples;
step 5: carrying out normalization processing on the new training sample;
step 6: training a PNN network according to the normalized training sample matrix, performing grade evaluation on the normalized test samples by utilizing the trained PNN network, and simultaneously calculating the error rate of classifying the test samples;
step 7: let pattern-layer neurons of the same class in the PNN network use the same smoothing factor and pattern-layer neurons of different classes use different smoothing factors; then take the error rate of the PNN network in classifying the test samples as the objective function of the particle swarm optimization algorithm and adjust the smoothing-factor parameters in the PNN network through the particle swarm optimization algorithm, thereby optimizing the PNN network and finally obtaining the optimal PNN classification model;
step 8: collect the greenhouse environment data to be evaluated and evaluate the greenhouse environment quality with the optimal PNN classification model obtained in step 7.
2. A PNN network-based greenhouse environment assessment method according to claim 1, wherein: the parameters initialized in step 3 are specifically as follows: for the improved K-means clustering algorithm, initialize the number of clusters k, the initial cluster centers $u_g^{(1)}$ ($g=1,2,\dots,k$), the iteration stop threshold $\varepsilon$, the representative-sample selection threshold $\alpha$, the maximum iteration count J and the current iteration count j; for the particle swarm optimization algorithm, initialize the number of particles N, the solution-space dimension D, the maximum iteration count max_iter, and the particles' initial position vector px and initial velocity vector pv; denote the position vector of particle i by $px_i=[px_{i1},px_{i2},\dots,px_{iD}]$, $i\in[1,N]$, its velocity vector by $pv_i=[pv_{i1},pv_{i2},\dots,pv_{iD}]$, the individual best position minimizing the objective function in the current iteration by $pbest_i=[pbest_{i1},pbest_{i2},\dots,pbest_{iD}]$, and the population best position by $gbest=[gbest_1,gbest_2,\dots,gbest_D]$; the minima of the objective function experienced by the individual and by the population during iteration are $p\_fitness_i$ and $g\_fitness$, respectively; all smoothing factors in the PNN network are initialized to 0.1.
3. A PNN network-based greenhouse environment assessment method according to claim 2, wherein: the specific method for normalizing the new training samples in step 5 is as follows:
let the new training sample matrix X be

$$X=\begin{bmatrix}x_{11}&x_{12}&\cdots&x_{1q}\\x_{21}&x_{22}&\cdots&x_{2q}\\\vdots&\vdots&&\vdots\\x_{p1}&x_{p2}&\cdots&x_{pq}\end{bmatrix} \qquad (4)$$

where p denotes the number of new training samples and q the dimension of the new training samples;
normalize the new training sample matrix X by the normalization factor matrix B to obtain the matrix C; with B taken as the diagonal matrix of reciprocal row norms, so that each training sample is scaled to unit Euclidean length, B and C are

$$B=\operatorname{diag}\left(b_1,b_2,\dots,b_p\right),\qquad b_k=\frac{1}{\sqrt{\sum_{v=1}^{q}x_{kv}^{2}}} \qquad (5)$$

$$C=BX=\begin{bmatrix}b_1x_{11}&b_1x_{12}&\cdots&b_1x_{1q}\\\vdots&\vdots&&\vdots\\b_px_{p1}&b_px_{p2}&\cdots&b_px_{pq}\end{bmatrix} \qquad (6)$$
4. A PNN network-based greenhouse environment assessment method according to claim 3, wherein: the PNN network structure comprises an input layer, a pattern layer, a summation layer and an output layer; the input layer passes the data to the pattern layer without processing; the number of pattern-layer neurons equals the number of training samples, and the activation function is a Gaussian function; the summation layer is sparsely connected to the pattern layer, and its number of neurons equals the number of training-sample classes; the output layer selects, according to the Bayesian decision rule, the class with the maximum posterior probability as output.
5. A PNN network-based greenhouse environment assessment method according to claim 4, wherein: the specific method of the step 6 is as follows:
step 6.1: construct the pattern layer of the PNN network from the normalized training sample matrix C;
after normalizing the new training sample matrix X, the training sample matrix C is obtained:

$$C=\begin{bmatrix}c_{11}&c_{12}&\cdots&c_{1q}\\c_{21}&c_{22}&\cdots&c_{2q}\\\vdots&\vdots&&\vdots\\c_{p1}&c_{p2}&\cdots&c_{pq}\end{bmatrix} \qquad (7)$$

the training matrix C contains p training samples divided into M classes; letting the class sizes be $h_1,h_2,\dots,h_M$, we have

$$p=h_1+h_2+\dots+h_M \qquad (8)$$

assume the M classes of samples are arranged sequentially in the sample matrix C and number the pattern-layer neurons 1 to p; neurons numbered 1 through $h_1$ correspond to class-1 training samples, neurons numbered $h_1+1$ through $h_1+h_2$ correspond to class-2 training samples, and so on, with neurons numbered $p-h_M+1$ through p belonging to class M;
step 6.2: compute the Euclidean distance between each test sample in the test sample matrix and each training sample in the training set;
the normalized test sample matrix T, consisting of l = n − m test samples, is

$$T=\begin{bmatrix}t_{11}&t_{12}&\cdots&t_{1q}\\\vdots&\vdots&&\vdots\\t_{l1}&t_{l2}&\cdots&t_{lq}\end{bmatrix} \qquad (9)$$

and the Euclidean distance matrix $E_d$ between the test samples and the training samples is

$$E_d=\left[e_{ar}\right]_{l\times p},\qquad e_{ar}=\sqrt{\sum_{v=1}^{q}\left(t_{av}-c_{rv}\right)^{2}} \qquad (10)$$
step 6.3: activate the pattern-layer neurons with radial basis functions;
choose the Gaussian function as the activation function of the pattern-layer neurons and compute the activated probability matrix U:

$$U=\left[u_{ar}\right]_{l\times p},\qquad u_{ar}=\exp\!\left(-\frac{e_{ar}^{2}}{2\sigma_{r}^{2}}\right) \qquad (11)$$

where $\sigma_1,\sigma_2,\dots,\sigma_p$ denote the smoothing factors of the p pattern-layer neurons; at the start, all smoothing factors are set to the same value, i.e. $\sigma_1=\sigma_2=\dots=\sigma_p=0.1$;
step 6.4: in the summation layer, compute the initial probability-sum matrix S of each test sample for each class by summing the pattern-layer outputs class by class:

$$S=\left[s_{ab}\right]_{l\times M},\qquad s_{ab}=\sum_{r\in\text{class}\ b}u_{ar} \qquad (12)$$
step 6.5: from the initial probability sums of each test sample for the classes, compute the probability $prob_{ab}$ that the a-th test sample belongs to the b-th class:

$$prob_{ab}=\frac{s_{ab}}{\sum_{b=1}^{M}s_{ab}} \qquad (13)$$

where $a\in[1,l]$ and $b\in[1,M]$;
step 6.6: according to the Bayesian decision theorem and the probabilities of each test sample belonging to the classes, determine the class of the a-th test sample:

$$y_a=\arg\max_b\left(prob_{ab}\right) \qquad (14)$$

where $y_a$ denotes the prediction of the PNN network for the a-th test sample, i.e. the class assigned to it;
step 6.7: compute the error rate of the PNN network in classifying the test samples:

$$ER=\frac{n_e}{l} \qquad (15)$$

where ER denotes the error rate of the PNN network on the test samples, $n_e$ the number of samples misclassified by the PNN network, and l = n − m the number of test samples.
6. A PNN network-based greenhouse environment assessment method according to claim 5, wherein:
step 7.1: in the β-th iteration of the particle swarm optimization algorithm optimizing the PNN network parameters, first update the velocity vector $pv_i$ and position vector $px_i$ of each particle:

$$pv_i^{\beta+1}=\omega\cdot pv_i^{\beta}+c_1\mu_1\left(pbest_i^{\beta}-px_i^{\beta}\right)+c_2\mu_2\left(gbest^{\beta}-px_i^{\beta}\right) \qquad (16)$$

$$px_i^{\beta+1}=px_i^{\beta}+pv_i^{\beta+1} \qquad (17)$$

where ω is the inertia weight, characterizing the search capability of the particle swarm optimization algorithm; $c_1$, $c_2$ are the learning factors of the individual and global extremum points, respectively; $\mu_1$, $\mu_2$ are random numbers between 0 and 1; since the objects optimized by the PSO algorithm are the smoothing factors used by each class of pattern-layer neurons, the particle solution-space dimension is D = M;
step 7.2: the updated particle position vector represents a feasible solution for the PNN network smoothing factors; substitute the position vector $px_i^{\beta+1}$ for the smoothing-factor values, then compute the error rate $ER_i^{\beta+1}$ of the resulting PNN network in classifying the corresponding test samples, and update the particle's individual best position $pbest_i$ and objective-function minimum $p\_fitness_i$, together with the population best position gbest and objective-function minimum $g\_fitness$, according to the following update rule;
the update rule is as follows:
if $ER_i^{\beta+1}<p\_fitness_i$, then $pbest_i=px_i^{\beta+1}$ and $p\_fitness_i=ER_i^{\beta+1}$; otherwise $pbest_i$ and $p\_fitness_i$ remain unchanged;
if $p\_fitness_i<g\_fitness$, then $gbest=pbest_i$ and $g\_fitness=p\_fitness_i$; otherwise gbest and g_fitness remain unchanged;
step 7.3: when the iteration count reaches the maximum iteration count max_iter, the particle swarm optimization algorithm terminates, with gbest representing the optimal solution for the PNN network smoothing factors and g_fitness the error rate of the PNN network that uses gbest as its optimal smoothing factors in classifying the test samples; otherwise, step 7.1 is re-executed.
CN202010678726.6A 2020-07-15 2020-07-15 Greenhouse environment assessment method based on PNN network Active CN111797937B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010678726.6A CN111797937B (en) 2020-07-15 2020-07-15 Greenhouse environment assessment method based on PNN network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010678726.6A CN111797937B (en) 2020-07-15 2020-07-15 Greenhouse environment assessment method based on PNN network

Publications (2)

Publication Number Publication Date
CN111797937A (en) 2020-10-20
CN111797937B (en) 2023-06-13

Family

ID=72807092

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010678726.6A Active CN111797937B (en) 2020-07-15 2020-07-15 Greenhouse environment assessment method based on PNN network

Country Status (1)

Country Link
CN (1) CN111797937B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109120961A (en) * 2018-07-20 2019-01-01 南京邮电大学 The prediction technique of the QoE of IPTV unbalanced dataset based on PNN-PSO algorithm
CN110909802A (en) * 2019-11-26 2020-03-24 西安邮电大学 Improved PSO (particle swarm optimization) based fault classification method for optimizing PNN (portable network) smoothing factor

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7034701B1 (en) * 2000-06-16 2006-04-25 The United States Of America As Represented By The Secretary Of The Navy Identification of fire signatures for shipboard multi-criteria fire detection systems

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109120961A (en) * 2018-07-20 2019-01-01 南京邮电大学 The prediction technique of the QoE of IPTV unbalanced dataset based on PNN-PSO algorithm
CN110909802A (en) * 2019-11-26 2020-03-24 西安邮电大学 Improved PSO (particle swarm optimization) based fault classification method for optimizing PNN (portable network) smoothing factor

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Khakzad, Hamid. Improving performance of classification on severity of ill effects (SEV) index on fish using K-Means clustering algorithm with various distance metrics. Water Practice & Technology, 2019, Vol. 14, No. 1, pp. 101-117. *
Effect of environmental factors on the growth/no-growth interface of Vibrio alginolyticus in lightly salted large yellow croaker; Guo Quanyou et al.; Transactions of the Chinese Society of Agricultural Engineering; Vol. 34, No. 3; pp. 292-299 *

Also Published As

Publication number Publication date
CN111797937A (en) 2020-10-20

Similar Documents

Publication Publication Date Title
Tahir et al. A classification model for class imbalance dataset using genetic programming
CN110232416B (en) Equipment fault prediction method based on HSMM-SVM
CN110009030B (en) Sewage treatment fault diagnosis method based on stacking meta-learning strategy
CN110363230B (en) Stacking integrated sewage treatment fault diagnosis method based on weighted base classifier
CN106202952A (en) A kind of Parkinson disease diagnostic method based on machine learning
CN112557034B (en) Bearing fault diagnosis method based on PCA _ CNNS
CN113705877B (en) Real-time moon runoff forecasting method based on deep learning model
CN110880369A (en) Gas marker detection method based on radial basis function neural network and application
CN110826611A (en) Stacking sewage treatment fault diagnosis method based on weighted integration of multiple meta-classifiers
CN111988329A (en) Network intrusion detection method based on deep learning
CN116542382A (en) Sewage treatment dissolved oxygen concentration prediction method based on mixed optimization algorithm
CN115907195A (en) Photovoltaic power generation power prediction method, system, electronic device and medium
CN111209939A (en) SVM classification prediction method with intelligent parameter optimization module
CN113222035B (en) Multi-class imbalance fault classification method based on reinforcement learning and knowledge distillation
CN111797937B (en) Greenhouse environment assessment method based on PNN network
CN109408896A (en) A kind of anerobic sowage processing gas production multi-element intelligent method for real-time monitoring
CN112990593A (en) Transformer fault diagnosis and state prediction method based on CSO-ANN-EL algorithm
CN116956160A (en) Data classification prediction method based on self-adaptive tree species algorithm
CN115906959A (en) Parameter training method of neural network model based on DE-BP algorithm
CN116031879A (en) Hybrid intelligent feature selection method suitable for transient voltage stability evaluation of power system
CN114117876A (en) Feature selection method based on improved Harris eagle algorithm
CN113807005A (en) Bearing residual life prediction method based on improved FPA-DBN
Wenxuan et al. Leaf disease image classification method based on improved convolutional neural network
CN113361709A (en) Deep neural network model repairing method based on variation
Khotimah et al. Adaptive SOMMI (Self Organizing Map Multiple Imputation) base on Variation Weight for Incomplete Data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant