CN110909948A - Soil pollution prediction method and system - Google Patents

Soil pollution prediction method and system

Info

Publication number
CN110909948A
Authority
CN
China
Prior art keywords
soil pollution
prediction
soil
data
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911199940.7A
Other languages
Chinese (zh)
Inventor
王占刚
何云山
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Information Science and Technology University
Original Assignee
Beijing Information Science and Technology University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Information Science and Technology University
Priority to CN201911199940.7A
Publication of CN110909948A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • Health & Medical Sciences (AREA)
  • Development Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Educational Administration (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Primary Health Care (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a soil pollution prediction method and a soil pollution prediction system. The method comprises the following steps: generating an effective number sequence for soil pollution prediction by using a grey prediction model according to the past-year soil pollution indexes of a specific area, wherein the effective number sequence comprises model reduction values and generated sequence values; training a neural network model by using the effective number sequence; predicting soil pollution indexes of the specific area by using the trained neural network model, wherein the predicted soil pollution indexes comprise past-year predicted soil pollution indexes corresponding to the past-year soil pollution indexes; calculating the error between the past-year predicted soil pollution indexes and the past-year soil pollution indexes; and predicting a future soil pollution index of the specific area based on the error.

Description

Soil pollution prediction method and system
Technical Field
The invention relates to the field of soil pollution treatment, in particular to a soil pollution prediction method and a soil pollution prediction system.
Background
Targeted remediation of contaminated land is very important for guaranteeing food security and public health in China. Most existing approaches to soil heavy metal pollution focus on prevention. However, many factors influence soil heavy metal pollution and the data volume is large, so conventional prediction of soil heavy metal pollution is slow and inaccurate, which makes prevention ineffective and wastes money and time.
Disclosure of Invention
The invention provides a soil pollution prediction method and a soil pollution prediction system, and aims to improve the accuracy of soil pollution prediction.
According to the present invention, there is provided a soil pollution prediction method including: generating an effective number sequence for soil pollution prediction by using a grey prediction model according to the past-year soil pollution indexes of a specific area, wherein the effective number sequence comprises model reduction values and generated sequence values; training a neural network model by using the effective number sequence; predicting soil pollution indexes of the specific area by using the trained neural network model, wherein the predicted soil pollution indexes comprise past-year predicted soil pollution indexes corresponding to the past-year soil pollution indexes; calculating the error between the past-year predicted soil pollution indexes and the past-year soil pollution indexes; and predicting a future soil pollution index of the specific area based on the error.
The step of generating an effective number series for soil pollution prediction by using a grey prediction model comprises the following steps: calculating the grade ratio of the past year pollution data; selecting data for building a grey prediction model from the previous year soil pollution data based on the grade ratio; establishing a gray prediction model by using the selected data; and generating a model reduction value and a generated sequence value by using a gray prediction model.
The neural network model is deployed on a plurality of child nodes of a Hadoop distributed network framework.
The step of training a neural network model using the effective number sequence comprises: integrating the effective number sequence into a data set; dividing the data set into a plurality of sub-data sets, and broadcasting the plurality of sub-data sets from a master node of the Hadoop distributed network framework to the plurality of child nodes; and training the neural network model by using the plurality of sub-data sets on the plurality of child nodes respectively.
The data in the Hadoop distributed network framework are processed in parallel through Spark.
The step of predicting a future soil pollution index for a particular area based on the error comprises: calculating an initial state probability vector and a one-step transition probability matrix of the error through a Markov chain algorithm; and predicting the future soil pollution index of the specific area according to the initial state probability vector and the one-step transition probability matrix.
The present invention provides a soil pollution prediction system, including: a grey module configured to generate an effective number sequence for soil pollution prediction using a grey prediction model according to the past-year soil pollution indexes of a specific area, the effective number sequence including model reduction values and generated sequence values; a neural network prediction module configured to train a neural network model using the effective number sequence, and predict soil pollution indexes of the specific area using the trained neural network model, wherein the predicted soil pollution indexes include past-year predicted soil pollution indexes corresponding to the past-year soil pollution indexes; and a prediction correction module configured to calculate an error between the past-year predicted soil pollution indexes and the past-year soil pollution indexes, and predict a future soil pollution index of the specific area based on the error.
The gray module is configured to: calculating the grade ratio of the past year pollution data; selecting data for building a grey prediction model from the previous year soil pollution data based on the grade ratio; establishing a gray prediction model by using the selected data; and generating a model reduction value and a generated sequence value by using a gray prediction model.
The neural network model is deployed on a plurality of child nodes of a Hadoop distributed network framework.
The neural network prediction module is configured to: integrate the effective number sequence into a data set; divide the data set into a plurality of sub-data sets, and broadcast the plurality of sub-data sets from a master node of the Hadoop distributed network framework to the plurality of child nodes; and train the neural network model by using the plurality of sub-data sets on the plurality of child nodes respectively.
The neural network prediction module is configured to process the data in the Hadoop distributed network framework in parallel through Spark.
The prediction correction module is configured to: calculate an initial state probability vector and a one-step transition probability matrix of the error through a Markov chain algorithm; and predict the future soil pollution index of the specific area according to the initial state probability vector and the one-step transition probability matrix.
According to the soil pollution prediction method and the soil pollution prediction system, the soil heavy metal sample amount can be expanded through gray prediction, and effective data are provided for later-stage distributed neural network calculation; the parallel operation is carried out through big data Spark, and the operation speed of the neural network is improved; and the prediction result is corrected through a Markov chain, so that the prediction precision of the heavy metal pollution of the soil is higher.
Additional aspects and/or advantages of the present general inventive concept will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the general inventive concept.
Drawings
The above and other objects and features of exemplary embodiments of the present invention will become more apparent from the following description taken in conjunction with the accompanying drawings which illustrate exemplary embodiments, wherein:
Fig. 1 is a flowchart of a soil contamination prediction method according to an exemplary embodiment of the present invention.
Fig. 2 is a schematic diagram of a structure of a neural network model according to an exemplary embodiment of the present invention.
Fig. 3 is a block diagram of a soil contamination prediction system according to an exemplary embodiment of the present invention.
The present invention will hereinafter be described in detail with reference to the drawings, wherein like or similar elements are designated by like or similar reference numerals throughout.
Detailed Description
The invention relates to soil pollution remediation, and also to grey prediction, big-data distributed computing, neural network prediction and Markov chain (MC) correction. Soil pollution data are predicted with a grey prediction method in order to analyse the soil pollution state and expand the original data. Based on massive soil pollution data, distributed computation is carried out with the big-data processing engine Spark, a neural network is deployed on the child nodes of Hadoop for computation, and the output is compared with the original data and corrected with a Markov chain. The method can improve both prediction accuracy and computational efficiency.
The grey prediction method is a method for predicting a system with uncertain factors. The grey prediction is used for identifying the degree of dissimilarity of development trends among system factors, performing correlation analysis, generating and processing original data to find the rule of system change, generating a data sequence with a strong rule, and then establishing a corresponding differential equation model, so as to predict the future development trend of the object.
The artificial neural network can predict the heavy metal pollution of the soil by training a large amount of data. The artificial neural network is combined with a big data system, a big neural network is divided into countless small neural networks, parallel operation is carried out by means of Spark algorithm, and the operation efficiency can be improved.
The advantage of applying a Markov chain is that it can predict the probability that the system is in state j at time n+1, given that the system is in state i at time n. The transition probabilities of the error can be predicted with a Markov chain, so that relative error intervals describing the distribution of the error can be established, and the prediction can then be corrected by a formula to reduce the error.
By collecting data on soil heavy metal pollution (the data can be calculated with, for example, the single-factor index method for evaluating soil heavy metal pollution or the Nemerow comprehensive pollution index method), and applying a grey prediction model (for example, but not limited to, a GM(1,1) model, i.e. a one-variable model built on a first-order differential equation), generated sequence values and model reduction values corresponding to the evaluated soil pollution index are obtained. Neural network model parameters are deployed in a Hadoop distributed computing framework based on Spark to construct a distributed neural network. The generated sequence values and model reduction values of soil heavy metal pollution are then integrated and input into the distributed neural network, so that a larger data set can be divided into m equal small data sets for parallel computation; the parameters output by each slave node are then integrated to obtain global parameters, which speeds up computation. Finally, the data are corrected with a Markov chain, and data with large errors are removed and replaced, thereby improving the prediction accuracy of the model.
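The two index methods named above can be sketched as follows. This is a minimal illustration that assumes the commonly used definitions (single-factor index P_i = C_i / S_i, Nemerow index as the root mean square of the mean and maximum single-factor indexes); the concentrations, limits and function names are hypothetical and are not taken from the patent.

```python
import math


def single_factor_index(concentration, standard):
    """Single-factor index P_i = C_i / S_i for one pollutant."""
    return concentration / standard


def nemerow_index(single_indices):
    """Nemerow comprehensive index: sqrt((mean(P_i)^2 + max(P_i)^2) / 2)."""
    if not single_indices:
        raise ValueError("at least one single-factor index is required")
    mean_p = sum(single_indices) / len(single_indices)
    max_p = max(single_indices)
    return math.sqrt((mean_p ** 2 + max_p ** 2) / 2)


# Hypothetical measured concentrations and limit values (mg/kg), illustration only.
measured = {"Cd": 0.45, "Pb": 60.0, "As": 20.0}
limits = {"Cd": 0.30, "Pb": 120.0, "As": 25.0}

indices = [single_factor_index(measured[m], limits[m]) for m in measured]
composite = nemerow_index(indices)
print(round(composite, 3))
```

One such composite value per year would form the time series that the grey model takes as input.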
The following description is provided with reference to the accompanying drawings to assist in a comprehensive understanding of exemplary embodiments of the invention as defined by the claims and their equivalents. The description includes various specific details to aid understanding, but these details are to be regarded as illustrative only. Thus, one of ordinary skill in the art will recognize that: various changes and modifications may be made to the embodiments described herein without departing from the scope and spirit of the present invention. Moreover, descriptions of well-known functions and constructions may be omitted for clarity and conciseness.
In order to further improve the accuracy and speed of predicting heavy metals in regional soil, the invention provides a novel regional soil heavy metal pollution prediction method. It adopts a new model that combines grey prediction, a neural network, big-data distributed computing and Markov chain error correction, and applies this model to the prediction of soil heavy metal pollution for the first time.
Fig. 1 is a flowchart of a soil contamination prediction method according to an exemplary embodiment of the present invention.
As shown in fig. 1, in step S101, an effective number series for soil pollution prediction is generated using a gray prediction model according to a past soil pollution index of a specific area, the effective number series including a model reduction value and a generated number series value.
In the embodiment of the invention, the soil pollution indexes of the past year in a specific area can be collected, and the grade ratio of the soil pollution data of the past year is calculated; selecting data for building a grey prediction model from the previous year soil pollution data based on the grade ratio; establishing a gray prediction model by using the selected data; and generating a model reduction value and a generated sequence value by using a gray prediction model.
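The grade-ratio screening step can be sketched as below. The sketch assumes the textbook definition of the grade ratio (also called the level ratio), lambda(k) = s0(k-1) / s0(k), and uses the threshold |1 - lambda(k)| < 0.2 that the description states further on; the sample series and function names are hypothetical.

```python
def level_ratios(series):
    """Return lambda(k) = s0(k-1) / s0(k) for k = 2..n (assumed textbook definition)."""
    if any(value <= 0 for value in series):
        raise ValueError("GM(1,1) expects a strictly positive series")
    return [series[k - 1] / series[k] for k in range(1, len(series))]


def passes_level_ratio_check(series, tolerance=0.2):
    """True when every ratio satisfies |1 - lambda(k)| < tolerance."""
    return all(abs(1.0 - lam) < tolerance for lam in level_ratios(series))


# Hypothetical past-year pollution indexes, for illustration only.
history = [1.10, 1.18, 1.25, 1.31, 1.40, 1.47, 1.55, 1.62]
usable = passes_level_ratio_check(history)
print(usable)
```

A series that fails the check would typically be transformed (for example shifted by a constant) or trimmed before modeling; the description does not say which option is used.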
According to the embodiment, the collected soil pollution indexes (e.g. the Nemerow comprehensive pollution index) can first be processed with the grey model, and the soil pollution index of a piece of heavy-metal-polluted soil in a specified future year can be predicted through GM(1,1). The derivation of the GM(1,1) formulas is as follows.
The past-year soil pollution indexes are fitted by establishing a grey prediction model GM(1,1). If the fitted data pass the check, they are used to predict the soil pollution in the specified future year, and the effective data sequence generated by grey prediction is fed into a neural network for learning and prediction.
First, a time series is established from the collected past-year data on a specified heavy metal pollutant in the region:
$$s^{(0)} = \left(s^{(0)}(1), s^{(0)}(2), \ldots, s^{(0)}(n)\right)$$
where $s^{(0)}(1)$ is the pollution datum for the specified heavy metal in the first year, $s^{(0)}(2)$ is the datum for the second year, and $s^{(0)}(n)$ is the datum for the n-th year. The original data $s^{(0)}$ are accumulated once to give $s^{(1)}$, and the grey derivative of $s^{(1)}$ is defined as:
$$d(k) = s^{(0)}(k) = s^{(1)}(k) - s^{(1)}(k-1)$$
Define $i^{(1)}(k)$ as the adjacent-value generated sequence of $s^{(1)}$:
$$i^{(1)}(k) = \alpha s^{(1)}(k) + (1-\alpha) s^{(1)}(k-1)$$
The grey differential equation of GM(1,1) is then established:
$$s^{(0)}(k) + \alpha i^{(1)}(k) = \beta$$
where $s^{(0)}(k)$ is called the grey derivative, $\alpha$ is the development coefficient, $i^{(1)}(k)$ is the whitened background value, and $\beta$ is the grey action quantity.
Using regression analysis, estimates of the development coefficient and the grey action quantity can be obtained; the equation of the corresponding grey model is then:
Figure BDA0002295615700000051
Solving the above differential equation gives
Figure BDA0002295615700000052
Thus the generated sequence value is obtained
Figure BDA0002295615700000054
and the model reduction value
Figure BDA0002295615700000055
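The equation images referenced at this point are not reproduced in this text. For orientation only, the textbook GM(1,1) whitening equation, its time response and the inverse accumulation are given below in the notation used above. This is an assumption about the standard form of the model, not a transcription of the patent figures.

```latex
\begin{aligned}
\frac{d s^{(1)}(t)}{dt} + \alpha\, s^{(1)}(t) &= \beta \\
\hat{s}^{(1)}(k+1) &= \left( s^{(0)}(1) - \frac{\beta}{\alpha} \right) e^{-\alpha k} + \frac{\beta}{\alpha},
  \qquad k = 0, 1, 2, \ldots \\
\hat{s}^{(0)}(k+1) &= \hat{s}^{(1)}(k+1) - \hat{s}^{(1)}(k),
  \qquad k = 1, 2, \ldots
\end{aligned}
```

In this standard form, $\alpha$ and $\beta$ are the least-squares estimates obtained from the grey differential equation $s^{(0)}(k) + \alpha i^{(1)}(k) = \beta$ for $k = 2, \ldots, n$.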
With the grey prediction method, a new set of data, namely the model reduction values and the generated sequence values, can be fitted from the past-year soil pollution data. If the fitted model reduction values are verified to be reasonable, a new data sequence of the soil heavy metal pollution state can be built from the fitted model reduction values and the generated sequence values. The point of using grey prediction is that a small amount of data can be expanded, so that the data sequence entering neural network learning is more comprehensive.
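The data-expansion step can be sketched with a small GM(1,1) routine. The sketch assumes the textbook model: equal weights (0.5) in the background value, ordinary least squares for the two coefficients, and the exponential time response. Following the 2010 to 2017 / 2018 to 2020 example given later in the description, it treats the fitted values for observed years as model reduction values and the values beyond the last observed year as generated sequence values; that mapping, the function name `gm11` and the sample data are assumptions.

```python
import math
from itertools import accumulate


def gm11(series, horizon):
    """Fit a textbook GM(1,1) and return (alpha, beta, fitted, forecast)."""
    s0 = [float(v) for v in series]
    n = len(s0)
    if n < 4:
        raise ValueError("GM(1,1) is usually fitted to at least four points")
    s1 = list(accumulate(s0))                                # accumulated sequence s^(1)
    z = [0.5 * (s1[k] + s1[k - 1]) for k in range(1, n)]     # background values i^(1)(k)
    y = s0[1:]
    # Least squares for s0(k) = -alpha * z(k) + beta (alpha must be non-zero).
    z_mean, y_mean = sum(z) / len(z), sum(y) / len(y)
    slope = (sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
             / sum((zi - z_mean) ** 2 for zi in z))
    alpha = -slope
    beta = y_mean + alpha * z_mean
    # Time response of the whitening equation, then inverse accumulation.
    s1_hat = [(s0[0] - beta / alpha) * math.exp(-alpha * k) + beta / alpha
              for k in range(n + horizon)]
    s0_hat = [s1_hat[0]] + [s1_hat[k] - s1_hat[k - 1] for k in range(1, n + horizon)]
    return alpha, beta, s0_hat[:n], s0_hat[n:]


# Hypothetical 8-year index series growing by 10% per year, illustration only.
observed = [2.0 * 1.1 ** k for k in range(8)]
alpha, beta, fitted, forecast = gm11(observed, horizon=3)
print(round(alpha, 4), [round(v, 3) for v in forecast])
```

The fitted and forecast lists together give the expanded sequence that is passed on to the neural network stage.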
For example, the grade ratio of the collected past-year soil pollution indexes is calculated as follows:
Figure BDA0002295615700000053
The grade ratio is then checked: if $|1 - \lambda(k)| < 0.2$, the selected data $s^{(0)}$ can be used for satisfactory GM(1,1) modeling. The original data $s^{(0)}(t)$ are then accumulated to obtain $s^{(1)}(t)$; a grey prediction model is established and the differential equation is solved to obtain $s^{(1)}(t+1)$; this yields the generated sequence value
Figure BDA0002295615700000056
And model reduction value
Figure BDA0002295615700000061
Let k take 1, 2 …, n respectively, to obtain
Figure BDA0002295615700000062
By
Figure BDA0002295615700000063
Can be calculated
Figure BDA0002295615700000064
Model checking is then carried out by calculating the relative residual $\varepsilon(k)$ between the model reduction values and the originally collected data. If $|\varepsilon(k)| < 0.1$, the model is considered to meet the higher requirement; if $|\varepsilon(k)| < 0.2$, the model is considered to meet the general requirement.
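The residual check can be sketched as follows. It assumes the relative residual is (observed - restored) / observed and that the thresholds apply to the largest absolute residual; the grade labels and sample values are hypothetical.

```python
def relative_residuals(observed, restored):
    """Relative residual epsilon(k) = (observed - restored) / observed."""
    return [(o - r) / o for o, r in zip(observed, restored)]


def grade_fit(observed, restored):
    """Grade the fit by its largest absolute relative residual."""
    worst = max(abs(e) for e in relative_residuals(observed, restored))
    if worst < 0.1:
        return "high"      # meets the higher requirement
    if worst < 0.2:
        return "general"   # meets the general requirement
    return "rejected"      # hypothetical label for a failed check


# Hypothetical observed indexes and model reduction values.
observed = [1.10, 1.18, 1.25, 1.31]
restored = [1.10, 1.16, 1.27, 1.30]
print(grade_fit(observed, restored))
```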
In step S102, a neural network model is trained using the effective number sequence. For example, but not limited to, the neural network model may be deployed on multiple child nodes of a Hadoop distributed network framework.
Fig. 2 is a schematic diagram of a structure of a neural network model according to an exemplary embodiment of the present invention. As shown in FIG. 2, in an example of the invention, the neural network model may be deployed on multiple child nodes S1 through Sm of the Hadoop distributed network framework.
In step S103, the trained neural network model is used to predict soil pollution indexes of the specific area, where the predicted soil pollution indexes include the soil pollution indexes predicted in the past year corresponding to the soil pollution indexes in the past year.
In the embodiment of the present invention, the effective number sequence needs to be integrated into a data set; for example, the model reduction values obtained by grey prediction
Figure BDA0002295615700000065
and the generated sequence values
Figure BDA0002295615700000066
are integrated into a new sequence V. The integrated soil heavy metal pollution values V are then partitioned, dividing the data set V into m different sub-data sets. That is, the data set V may be divided into a plurality of sub-data sets, and the plurality of sub-data sets may be broadcast from the master node M of the Hadoop distributed network framework to the plurality of child nodes S1 to Sm; the neural network model is trained using the plurality of sub-data sets on the plurality of child nodes S1 to Sm, respectively. For example, but not limited to, data in the Hadoop distributed network framework are processed in parallel through Spark to perform big-data distributed computation, which increases the computation speed. Because the GM(1,1) model can predict the soil heavy metal pollution index for a specified future year, when the data are divided into a training set and a test set, part of the model reduction values are selected for training, and part of the model reduction values together with the generated sequence values are used for testing. The ratio of the training set to the test set is about 5:1. Finally, the outputs of the slave nodes are gathered at the master node M to obtain an output sequence Z.
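The partitioning and the roughly 5:1 split described above can be sketched in plain Python. This does not use Spark or Hadoop; it only shows the data handling, and the helper names, the contiguous partitioning scheme and the sample values are assumptions.

```python
def partition(dataset, m):
    """Split dataset into m contiguous sub-datasets whose sizes differ by at most one."""
    if not 0 < m <= len(dataset):
        raise ValueError("m must be between 1 and len(dataset)")
    size, extra = divmod(len(dataset), m)
    parts, start = [], 0
    for i in range(m):
        end = start + size + (1 if i < extra else 0)
        parts.append(dataset[start:end])
        start = end
    return parts


def train_test_split(restored, generated, train_parts=5, test_parts=1):
    """Train on leading restored values; test on the rest plus all generated values."""
    total = len(restored) + len(generated)
    n_train = min(len(restored), total * train_parts // (train_parts + test_parts))
    return restored[:n_train], restored[n_train:] + generated


# Hypothetical values: 10 model reduction values and 2 generated sequence values.
restored = [1.10, 1.17, 1.25, 1.32, 1.40, 1.48, 1.55, 1.63, 1.71, 1.80]
generated = [1.89, 1.98]
V = restored + generated

sub_datasets = partition(V, 3)                      # one sub-dataset per child node
train, test = train_test_split(restored, generated)
print([len(p) for p in sub_datasets], len(train), len(test))
```

In a real Spark deployment the partitioning would normally be delegated to the framework rather than done by hand.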
In the training process, the neural network is combined with a big-data distributed system, and the global neural network parameters, including the structure, the number of layers, the activation function g(·), the bias b and the weights ω of the neural network, are set on the master node of the Spark distributed computing framework. The generated sequence values of soil heavy metal pollution simulated by grey prediction
Figure BDA0002295615700000067
and the model reduction values
Figure BDA0002295615700000068
are, before being fed to the input layer of the neural network, combined into a new data set, defined as V. V is divided into m sub-data sets, which are broadcast to each slave node. In training, part of the model reduction values are selected as the training set, and part of the model reduction values together with the generated sequence values are used as the test set.
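The broadcast, local training and gathering of parameters can be illustrated with a single-process simulation. The description states that the parameters output by each slave node are integrated into global parameters but does not say how; simple averaging is assumed here. The one-weight linear model, the learning rate and the shard data are hypothetical stand-ins for the neural network and the pollution data.

```python
def local_train(params, shard, lr=0.1, epochs=200):
    """Plain gradient descent on mean squared error for y = w * x + b."""
    w, b = params
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in shard) / len(shard)
        grad_b = sum(2 * (w * x + b - y) for x, y in shard) / len(shard)
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b


def average_parameters(results):
    """Element-wise mean of the parameter tuples returned by the workers."""
    return tuple(sum(values) / len(values) for values in zip(*results))


# Hypothetical shards, all drawn from y = 2x + 1 (one shard per child node).
shards = [
    [(0.0, 1.0), (0.2, 1.4)],
    [(0.4, 1.8), (0.6, 2.2)],
    [(0.8, 2.6), (1.0, 3.0)],
]

global_params = (0.0, 0.0)
for _ in range(20):  # 20 rounds of broadcast, local training and gathering
    local_results = [local_train(global_params, shard) for shard in shards]
    global_params = average_parameters(local_results)

print(tuple(round(p, 3) for p in global_params))
```

On a cluster, the list comprehension over shards would be replaced by work distributed to the child nodes, with the master performing only the averaging step.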
Suppose that the neural network used consists of an input layer, a hidden layer and an output layer. The input layer consists of 3 neurons, $a_1^{(1)}$, $a_2^{(1)}$ and $a_3^{(1)}$; the hidden layer consists of 2 neurons, $a_1^{(2)}$ and $a_2^{(2)}$; and the output is defined as z. The output of the neural network can then be expressed by the following two equations:
$$z = g\left(\omega^{(2)} a^{(2)} + b\right)$$
$$a^{(2)} = g\left(\omega^{(1)} a^{(1)} + b\right)$$
The z computed and output by each slave node is gathered at the master node to obtain the output sequence Z, where $z_t$ denotes the predicted soil heavy metal pollution state at time t.
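The two equations above correspond to a forward pass through a 3-2-1 network, sketched below. The description does not name the activation function g, so a sigmoid is assumed; separate bias vectors per layer are also an assumption, and all weights and inputs are hypothetical.

```python
import math


def sigmoid(x):
    """Assumed activation g; the description leaves g unspecified."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(weights, bias, inputs, activation):
    """One fully connected layer: activation(W @ inputs + bias)."""
    return [activation(sum(w * a for w, a in zip(row, inputs)) + b)
            for row, b in zip(weights, bias)]


def forward(a1, w1, b1, w2, b2):
    """Forward pass of a 3-2-1 network."""
    a2 = dense(w1, b1, a1, sigmoid)      # a(2) = g(w(1) a(1) + b)
    (z,) = dense(w2, b2, a2, sigmoid)    # z = g(w(2) a(2) + b)
    return z


# Hypothetical weights and a hypothetical 3-feature input.
w1 = [[0.2, -0.4, 0.1], [0.5, 0.3, -0.2]]
b1 = [0.0, 0.1]
w2 = [[0.7, -0.6]]
b2 = [0.05]
z = forward([1.10, 1.18, 1.25], w1, b1, w2, b2)
print(round(z, 4))
```

Because a sigmoid output lies strictly between 0 and 1, pollution indexes would need to be scaled into that range before training and scaled back afterwards.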
In step S104, the error between the past-year predicted soil pollution indexes and the past-year soil pollution indexes is calculated. According to the embodiment of the invention, after the soil pollution indexes of the specific area are predicted, Markov chain correction can be carried out on the output sequence Z, and the relative error of the past-year predicted soil pollution indexes is calculated by comparing them with the collected past-year soil pollution indexes.
In step S105, a future soil pollution index of the specific area is predicted based on the error. In the embodiment of the invention, the initial state probability vector and the frequency transition matrix of the error can be calculated through a Markov chain algorithm, the one-step transition probability matrix is obtained through the frequency transition matrix, and then the future soil pollution index of the specific area can be predicted according to the initial state probability vector and the one-step transition probability matrix.
In an embodiment of the present invention, the calculated errors may be classified. The errors can be divided into four categories, $E_1$, $E_2$, $E_3$ and $E_4$, which represent abnormal underestimation, underestimation, overestimation and abnormal overestimation, respectively. A transition frequency matrix T of the relative error E of soil heavy metal pollution and a transition probability matrix P can thus be obtained. The one-step transition probability P is solved as follows:
Figure BDA0002295615700000071
At the same time, the frequency $F_0$ of each state in the frequency transition matrix T is calculated. In a Markov chain, the n-step transition probability is $P^{(n)} = P^n$, so the state distribution vector of soil heavy metal pollution determined by the Markov chain for the next year (next stage) is $F_n = F_0 \cdot P^n$. The prediction error of soil heavy metal pollution in the following n years can therefore be determined, and the formula
Figure BDA0002295615700000072
is used to carry out the Markov chain correction and obtain the corrected soil heavy metal predicted value $z'_t$, where $e_1$ and $e_2$ denote the upper and lower error limits of the corresponding relative error interval.
Markov chain error correction is performed on the output sequence: the output sequence Z is compared with the past-year soil pollution indexes, the error E between them is listed, and according to the calculated errors E is divided into four classes forming error intervals, namely abnormal underestimation, underestimation, overestimation and abnormal overestimation. The error e between each element z of the soil heavy metal pollution sequence Z and the original data is observed and calculated, and e is assigned to one class of the error intervals. By comparing the errors of the elements z of the sequence Z, the frequency transition matrix T of the error E of soil heavy metal pollution and its one-step transition probability matrix P can be obtained, and the frequency $F_0$ of each state in the frequency transition matrix T can be calculated. From the formula $F_n = F_0 \cdot P^n$, the state distribution vector of soil heavy metal pollution in the next n years is obtained.
The Markov chain indicates which interval the soil heavy metal pollution error is most likely to fall in over the next n years; that interval is then selected, and the soil heavy metal pollution index predicted for it is corrected by the formula, which improves the prediction.
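The error-state bookkeeping described above can be sketched as follows. The interval bounds, the sample errors, the handling of states with no observed transitions and the use of overall state frequencies as F0 are assumptions for illustration; the correction formula itself is not reproduced here because it is not available in this text.

```python
def classify(error, bounds=(-0.08, 0.0, 0.08)):
    """Map a relative error to state 0..3 (hypothetical interval bounds)."""
    for state, upper in enumerate(bounds):
        if error < upper:
            return state
    return len(bounds)


def transition_counts(states, n_states=4):
    """Frequency transition matrix T: T[i][j] counts moves from state i to j."""
    T = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        T[current][following] += 1
    return T


def transition_probabilities(T):
    """Row-normalise T into P; rows without observations fall back to uniform (assumption)."""
    return [[c / sum(row) for c in row] if sum(row) else [1.0 / len(row)] * len(row)
            for row in T]


def distribution_after(F0, P, n_steps):
    """Compute F_n = F_0 . P^n by repeated vector-matrix multiplication."""
    F = list(F0)
    for _ in range(n_steps):
        F = [sum(F[i] * P[i][j] for i in range(len(P))) for j in range(len(P))]
    return F


# Hypothetical relative errors between predicted and collected past-year indexes.
errors = [-0.12, -0.03, 0.04, 0.11, 0.02, -0.04, 0.03, -0.11]
states = [classify(e) for e in errors]
T = transition_counts(states)
P = transition_probabilities(T)
F0 = [states.count(s) / len(states) for s in range(4)]
F3 = distribution_after(F0, P, 3)
most_likely_state = max(range(4), key=lambda s: F3[s])
print(states, most_likely_state)
```

The index of the largest entry of F_n identifies the error interval whose limits would then be used in the correction step.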
The following is a brief description of the application of the soil contamination prediction method with reference to specific examples, which are only for illustration and the present invention is not limited thereto.
In an example of the present invention, soil pollution indexes (i.e., comprehensive pollution indexes) of a specific area from 2010 to 2017 may first be collected and expanded using a grey model, which yields model reduction values (i.e., comprehensive pollution indexes) for 2010 to 2017 and generated sequence values (i.e., comprehensive pollution indexes) for 2018 to 2020.
And integrating the model reduction value and the generated sequence value output by the gray model into a data set V. Then, factory building data, pollution reporting data and resident life data related to the specific area are collected, and the factory building data, the pollution reporting data, the resident life data and the data set V are used as feature vectors of the input layer of the distributed neural network to construct a distributed neural network prediction model containing a pollution fluctuation rule. Then, the soil pollution indexes in 2010-2019 are predicted by using a distributed neural network model, wherein the soil pollution indexes in 2010-2017 can be called as the soil pollution indexes predicted in the past year, and then Markov chain repairing is carried out based on the prediction result.
In this example, the past-year (i.e., 2010 to 2017) predicted soil pollution indexes may be selected from the predicted soil pollution indexes for 2010 to 2019 and compared with the original past-year (i.e., 2010 to 2017) soil pollution indexes to obtain the errors. The errors are divided into four state intervals, namely abnormal underestimation, underestimation, overestimation and abnormal overestimation, and the initial state probability vector $F_0$ and the one-step transition matrix P are calculated, so that the interval in which the error of the future soil pollution index (for example, but not limited to, 2018 to 2020) will fall can be predicted. After the interval is determined, the error is corrected using the upper and lower error limits of the interval. For example, the frequency transition matrix T and the one-step transition probability matrix P of the error E of soil heavy metal pollution can be calculated, together with the frequency $F_0$ of each state in the frequency matrix T. The future soil pollution index (for example, but not limited to, 2018 to 2020) is then calculated using the formula $F_n = F_0 \cdot P^n$.
Fig. 3 is a block diagram of a soil contamination prediction system 300 according to an exemplary embodiment of the present invention. As shown in Fig. 3, the soil contamination prediction system 300 may include a gray module 301, a neural network prediction module 302, and a prediction correction module 303.
The gray module 301 may be configured to generate an effective number series for soil pollution prediction using a gray prediction model based on the past-year soil pollution index of a specific area, the effective number series including a model reduction value and a generated sequence value. The neural network prediction module 302 may be configured to train a neural network model using the effective number series and to predict a soil pollution index of the specific area using the trained neural network model, where the predicted soil pollution index includes a past-year predicted soil pollution index corresponding to the past-year soil pollution index. The prediction correction module 303 may be configured to calculate an error between the past-year predicted soil pollution index and the past-year soil pollution index, and to predict a future soil pollution index of the specific area based on the error.
According to embodiments of the invention, the gray module 301 may be configured to: calculate the grade ratio of the past-year soil pollution data; select, based on the grade ratio, data for building a gray prediction model from the past-year soil pollution data; build the gray prediction model using the selected data; and generate the model reduction value and the generated sequence value using the gray prediction model.
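By way of example only, these four operations can be sketched with the widely used GM(1,1) gray model. The source does not name the model variant, the admissible range of the grade ratio, or the rule for selecting data, so the range (e^(-2/(n+1)), e^(2/(n+1))), the "longest trailing window" selection rule, the mapping of the accumulated fitted series to the generated sequence values and of the differenced series to the model reduction values, and all function names and sample data are assumptions.

```python
import math


def grade_ratios(x):
    """Grade (level) ratios x[k-1] / x[k] of a positive series."""
    return [x[k - 1] / x[k] for k in range(1, len(x))]


def is_admissible(x):
    """Common GM(1,1) check: every ratio lies in (e^(-2/(n+1)), e^(2/(n+1)))."""
    n = len(x)
    lo, hi = math.exp(-2.0 / (n + 1)), math.exp(2.0 / (n + 1))
    return all(lo < r < hi for r in grade_ratios(x))


def select_window(x, min_len=4):
    """Assumed selection rule: the longest trailing window that passes the check."""
    for start in range(len(x) - min_len + 1):
        if is_admissible(x[start:]):
            return x[start:]
    return None


def fit_gm11(x0):
    """Least-squares estimate of (a, b) in x0[k] + a * z[k] = b."""
    x1, total = [], 0.0
    for v in x0:                      # first-order accumulated series
        total += v
        x1.append(total)
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]  # background values
    y = x0[1:]
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(p * q for p, q in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    b = (sy - slope * sz) / m
    return -slope, b


def gm11_outputs(x0, a, b, steps):
    """Return (generated, restored) series of length `steps`; requires a != 0."""
    generated = [(x0[0] - b / a) * math.exp(-a * k) + b / a for k in range(steps)]
    restored = [generated[0]] + [generated[k] - generated[k - 1]
                                 for k in range(1, steps)]
    return generated, restored


# Hypothetical past-year indexes; the first value breaks the grade-ratio check.
series = [9.0, 1.0, 1.1, 1.21, 1.331, 1.4641]
window = select_window(series)
a, b = fit_gm11(window)
generated, restored = gm11_outputs(window, a, b, len(window) + 2)
print(window, a, b, restored)
```

The last two entries of `restored` extend beyond the fitted window, which is one way the gray model can enlarge a small sample before it is passed to the neural network.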
In embodiments of the invention, the neural network model may be deployed on multiple child nodes of a Hadoop distributed network framework. The neural network prediction module 302 may be configured to: integrate the effective number series into a data set; divide the data set into a plurality of sub data sets and broadcast the plurality of sub data sets from a master node of the Hadoop distributed network framework to the plurality of child nodes; and train the neural network model on the plurality of child nodes using the plurality of sub data sets, respectively.
The neural network prediction module 302 may further be configured to process the data in the Hadoop distributed network framework in parallel through Spark.
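The divide, distribute, and train flow described for the neural network prediction module 302 can be illustrated, purely as a single-process sketch, in plain Python. No Hadoop or Spark API is used here. The round-robin split, the single linear unit standing in for the neural network, the combination of per-node results by parameter averaging, and the sample data are all assumptions that the source does not state.

```python
def split_dataset(rows, k):
    """Divide the data set into k sub data sets (round-robin, an assumed rule)."""
    return [rows[i::k] for i in range(k)]


def train_local(rows, epochs=2000, lr=0.5):
    """Fit one linear unit y ~ w * x + b by batch gradient descent on one node."""
    w, b = 0.0, 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in rows) / n
        grad_b = sum((w * x + b - y) for x, y in rows) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def average_models(models):
    """Assumed combination step: average the parameters returned by the nodes."""
    ws, bs = zip(*models)
    return sum(ws) / len(ws), sum(bs) / len(bs)


# Hypothetical noiseless data set: y = 2x + 1 on twelve evenly spaced inputs.
rows = [(i / 11, 2 * (i / 11) + 1) for i in range(12)]

sub_datasets = split_dataset(rows, 3)                     # one sub data set per child node
local_models = [train_local(part) for part in sub_datasets]  # would run in parallel
w, b = average_models(local_models)
print(w, b)
```

In a real cluster the list comprehension over `sub_datasets` would be replaced by work executed on the child nodes; the loop here only emulates that step sequentially.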
The prediction correction module 303 may be configured to: calculate an initial state probability vector and a one-step transition probability matrix of the error through a Markov chain algorithm; and predict the future soil pollution index of the specific area according to the initial state probability vector and the one-step transition probability matrix.
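One possible way to turn a predicted state probability vector into a corrected index is sketched below. The source only states that the error is corrected using the upper and lower limits of the predicted interval; taking the most probable state, using the midpoint of its interval, and dividing by (1 + midpoint) under a relative-error convention of (predicted - original) / original are assumptions of this sketch, as are the interval limits and the names used.

```python
# Assumed (lower, upper) limits of the relative error for the four states:
# abnormal underestimation, underestimation, overestimation, abnormal overestimation.
INTERVALS = [(-0.20, -0.10), (-0.10, 0.0), (0.0, 0.10), (0.10, 0.20)]


def most_likely_state(F):
    """Index of the state with the highest probability in the vector F."""
    return max(range(len(F)), key=lambda i: F[i])


def correct(predicted, F, intervals=INTERVALS):
    """Correct a predicted index with the midpoint of the most likely error interval."""
    lower, upper = intervals[most_likely_state(F)]
    midpoint = 0.5 * (lower + upper)
    # With error = (predicted - original) / original, original = predicted / (1 + error).
    return predicted / (1.0 + midpoint)


# Hypothetical state probability vector and raw network prediction.
F1 = [0.0, 0.375, 0.5, 0.125]
corrected = correct(1.05, F1)
print(most_likely_state(F1), corrected)
```

A probability-weighted average of the interval midpoints would be an alternative to picking the single most likely state.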
The soil pollution prediction method shown in Figs. 1 and 2 is performed by the soil pollution prediction system 300 shown in Fig. 3. The descriptions given above with reference to Figs. 1 and 2 for the respective steps of the soil pollution prediction method therefore apply equally to the operations performed by the corresponding modules of the soil pollution prediction system 300. For details of those operations, refer to the corresponding descriptions of Figs. 1 and 2, which are not repeated here.
As described above, according to the soil pollution prediction method and the soil pollution prediction system, gray prediction can expand the sample size of soil heavy metal data and provide effective data for the subsequent distributed neural network computation; parallel computation with the Spark big data framework increases the computation speed of the neural network; and correcting the prediction result with a Markov chain improves the prediction accuracy for soil heavy metal pollution.
Further, it should be understood that the respective modules or units in the soil contamination prediction system according to the exemplary embodiment of the present invention may be implemented as hardware components and/or software components. Those skilled in the art may implement the various modules using, for example, Field Programmable Gate Arrays (FPGAs) or Application Specific Integrated Circuits (ASICs), depending on the processing performed by the defined various modules.
A computer-readable storage medium according to an exemplary embodiment of the present invention stores a computer program that, when executed by a processor, causes the processor to perform the soil contamination prediction method of the above-described exemplary embodiment. The computer readable storage medium is any data storage device that can store data which can be read by a computer system. Examples of computer-readable storage media include: read-only memory, random access memory, read-only optical disks, magnetic tapes, floppy disks, optical data storage devices, and carrier waves (such as data transmission through the internet via wired or wireless transmission paths).
Although a few exemplary embodiments of the present invention have been shown and described, it would be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the claims and their equivalents.

Claims (10)

1. A soil pollution prediction method, comprising:
generating an effective number series for soil pollution prediction by using a gray prediction model according to the past-year soil pollution index of a specific area, wherein the effective number series comprises a model reduction value and a generated sequence value;
training a neural network model by using the effective number series;
predicting a soil pollution index of the specific area by using the trained neural network model, wherein the predicted soil pollution index comprises a past-year predicted soil pollution index corresponding to the past-year soil pollution index;
calculating the error between the past-year predicted soil pollution index and the past-year soil pollution index;
predicting a future soil pollution index for the particular area based on the error.
2. The soil contamination prediction method of claim 1, wherein the step of generating an effective number series for soil contamination prediction using a gray prediction model comprises:
calculating the grade ratio of the past year pollution data;
selecting data for building a grey prediction model from the previous year soil pollution data based on the grade ratio;
establishing a gray prediction model by using the selected data;
and generating a model reduction value and a generated sequence value by using a gray prediction model.
3. The soil pollution prediction method of claim 1, wherein the neural network model is deployed on a plurality of sub-nodes of a Hadoop distributed network framework,
the step of training a neural network model using the effective number series comprises:
integrating the effective number series into a data set;
dividing the data set into a plurality of sub data sets, and broadcasting the plurality of sub data sets from a main node of the Hadoop distributed network framework to the plurality of sub nodes;
and training the neural network model by using the plurality of sub-data sets on the plurality of sub-nodes respectively.
4. The soil pollution prediction method of claim 3, wherein the data in the Hadoop distributed network framework are processed in parallel by Spark algorithm.
5. The soil contamination prediction method of claim 1, wherein the step of predicting a future soil contamination index for a particular area based on the error comprises:
calculating an initial state probability vector and a one-step transition probability matrix of the error through a Markov chain algorithm;
and predicting the future soil pollution index of the specific area according to the initial state probability vector and the one-step transition probability matrix.
6. A soil pollution prediction system comprising:
a grey module configured to generate an effective number series for soil pollution prediction using a grey prediction model according to a past year soil pollution index of a specific area, the effective number series including a model reduction value and a generated number series value;
a neural network prediction module configured to train a neural network model using the effective number series, and predict a soil pollution index of the specific area using the trained neural network model, wherein the predicted soil pollution index includes a past-year predicted soil pollution index corresponding to the past-year soil pollution index;
a prediction correction module configured to calculate an error between the past year predicted soil pollution index and the past year soil pollution index, and predict a future soil pollution index of the specific area based on the error.
7. The soil contamination prediction system of claim 6, wherein the gray module is configured to: calculating the grade ratio of the past year pollution data; selecting data for building a grey prediction model from the previous year soil pollution data based on the grade ratio; establishing a gray prediction model by using the selected data; and generating a model reduction value and a generated sequence value by using a gray prediction model.
8. The soil contamination prediction system of claim 6, wherein the neural network model is deployed on a plurality of sub-nodes of a Hadoop distributed network framework,
the neural network prediction module is configured to: integrate the effective number series into a data set; divide the data set into a plurality of sub data sets, and broadcast the plurality of sub data sets from a main node of the Hadoop distributed network framework to the plurality of sub-nodes; and train the neural network model by using the plurality of sub data sets on the plurality of sub-nodes respectively.
9. The soil contamination prediction system of claim 8, wherein the neural network prediction module is configured to: and processing the data in the Hadoop distributed network framework in parallel through a Spark algorithm.
10. The soil contamination prediction system of claim 6, wherein the prediction correction module is configured to: calculating an initial state probability vector and a one-step transition probability matrix of the error through a Markov chain algorithm; and predicting the future soil pollution index of the specific area according to the initial state probability vector and the one-step transition probability matrix.
CN201911199940.7A 2019-11-29 2019-11-29 Soil pollution prediction method and system Pending CN110909948A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911199940.7A CN110909948A (en) 2019-11-29 2019-11-29 Soil pollution prediction method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911199940.7A CN110909948A (en) 2019-11-29 2019-11-29 Soil pollution prediction method and system

Publications (1)

Publication Number Publication Date
CN110909948A true CN110909948A (en) 2020-03-24

Family

ID=69820662

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911199940.7A Pending CN110909948A (en) 2019-11-29 2019-11-29 Soil pollution prediction method and system

Country Status (1)

Country Link
CN (1) CN110909948A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112505128A (en) * 2020-11-30 2021-03-16 北方民族大学 Method and device for nondestructive testing of reducing sugar of wine
CN113435707A (en) * 2021-06-03 2021-09-24 大连钜智信息科技有限公司 Soil testing and formulated fertilization method based on deep learning and weighted multi-factor evaluation

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104751254A (en) * 2015-04-23 2015-07-01 国家电网公司 Line loss rate prediction method based on non-isometric weighted grey model and fuzzy clustering sorting
CN107230217A (en) * 2017-04-26 2017-10-03 中国南方电网有限责任公司超高压输电公司检修试验中心 A kind of transmission line forest fire method for early warning based on image and gray prediction
CN110222632A (en) * 2019-06-04 2019-09-10 哈尔滨工程大学 A kind of waterborne target detection method of gray prediction auxiliary area suggestion

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
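郎君, 苏小红, 周秀杰: "Air pollution index prediction based on an organic gray neural network model" *
颜廷文; 孙宝盛; 张冉: "River water quality prediction based on an equal-dimension new-information gray Markov model" *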

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112505128A (en) * 2020-11-30 2021-03-16 北方民族大学 Method and device for nondestructive testing of reducing sugar of wine
CN113435707A (en) * 2021-06-03 2021-09-24 大连钜智信息科技有限公司 Soil testing and formulated fertilization method based on deep learning and weighted multi-factor evaluation
CN113435707B (en) * 2021-06-03 2023-11-10 大连钜智信息科技有限公司 Soil testing formula fertilization method based on deep learning and weighting multi-factor evaluation

Similar Documents

Publication Publication Date Title
CN107590565B (en) Method and device for constructing building energy consumption prediction model
Dini et al. A new method for simultaneous calibration of demand pattern and Hazen-Williams coefficients in water distribution systems
Marwala Bayesian training of neural networks using genetic programming
CN103514366A (en) Urban air quality concentration monitoring missing data recovering method
CN108921359B (en) Distributed gas concentration prediction method and device
CN103810101A (en) Software defect prediction method and system
CN114492675B (en) Intelligent fault cause diagnosis method for capacitor voltage transformer
CN108491931B (en) Method for improving nondestructive testing precision based on machine learning
Samantaray et al. Sediment assessment for a watershed in arid region via neural networks
CN110909948A (en) Soil pollution prediction method and system
Mahaweerawat et al. Fault prediction in object-oriented software using neural network techniques
Wang et al. Examining dynamic interactions among experimental factors influencing hydrologic data assimilation with the ensemble Kalman filter
Shoaib et al. Input selection of wavelet-coupled neural network models for rainfall-runoff modelling
Yan et al. Pollution source positioning in a water supply network based on expensive optimization
CN115049124A (en) Deep and long tunnel water inrush prediction method based on Bayesian network
CN113379156A (en) Speed prediction method, device, equipment and storage medium
Goswami et al. Comparative assessment of six automatic optimization techniques for calibration of a conceptual rainfall—runoff model
CN108932197A (en) Software failure time forecasting methods based on parameter Bootstrap double sampling
CN117194918A (en) Air temperature prediction method and system based on self-attention echo state network
Aramane et al. Iot and neural network based multi region and simultaneous leakage detection in pipelines
KR101151013B1 (en) Method for evaluating performance of tire
Pattnaik et al. A survey on machine learning techniques used for software quality prediction
CN114880818A (en) Global gas pipe network structure-oriented neural network monitoring method and system
CN111062118B (en) Multilayer soft measurement modeling system and method based on neural network prediction layering
CN113722308A (en) Acceleration response data completion method and device based on EEMD-MultiCNN-LSTM

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination