CN110909948A - Soil pollution prediction method and system

Info

Publication number: CN110909948A
Application number: CN201911199940.7A
Authority: CN
Inventors: 王占刚; 何云山
Original assignee: Beijing Information Science and Technology University
Current assignee: Beijing Information Science and Technology University
Priority date: 2019-11-29
Filing date: 2019-11-29
Publication date: 2020-03-24

Abstract

The invention provides a soil pollution prediction method and a soil pollution prediction system. The method comprises the following steps: generating an effective number series for soil pollution prediction by using a grey prediction model according to the past-year soil pollution indexes of a specific area, wherein the effective number series comprises a model reduction value and a generated sequence value; training a neural network model by using the effective number series; predicting soil pollution indexes of the specific area by using the trained neural network model, wherein the predicted soil pollution indexes comprise past-year predicted soil pollution indexes corresponding to the past-year soil pollution indexes; calculating the error between the past-year predicted soil pollution indexes and the past-year soil pollution indexes; and predicting a future soil pollution index of the specific area based on the error.

Description

Soil pollution prediction method and system

Technical Field

The invention relates to the field of soil pollution treatment, in particular to a soil pollution prediction method and a soil pollution prediction system.

Background

Targeted remediation of polluted land is very important for safeguarding food security and public health in China. Most existing approaches to soil heavy metal pollution focus on prevention. However, many factors influence soil heavy metal pollution and the data volume is large, so traditional prediction of soil heavy metal pollution is slow and inaccurate, prevention is ineffective, and money and time are wasted.

Disclosure of Invention

The invention provides a soil pollution prediction method and a soil pollution prediction system, and aims to improve the accuracy of soil pollution prediction.

According to the present invention, there is provided a soil pollution prediction method including: generating an effective number series for soil pollution prediction by using a grey prediction model according to the past-year soil pollution indexes of a specific area, wherein the effective number series comprises a model reduction value and a generated sequence value; training a neural network model by using the effective number series; predicting soil pollution indexes of the specific area by using the trained neural network model, wherein the predicted soil pollution indexes comprise past-year predicted soil pollution indexes corresponding to the past-year soil pollution indexes; calculating the error between the past-year predicted soil pollution indexes and the past-year soil pollution indexes; and predicting a future soil pollution index of the specific area based on the error.

The step of generating an effective number series for soil pollution prediction by using a grey prediction model comprises the following steps: calculating the grade ratio of the past-year soil pollution data; selecting data for building the grey prediction model from the past-year soil pollution data based on the grade ratio; establishing the grey prediction model by using the selected data; and generating a model reduction value and a generated sequence value by using the grey prediction model.

The neural network model is deployed on a plurality of child nodes of a Hadoop distributed network framework.

The step of training a neural network model using the effective number series comprises: integrating the effective number series into a data set; dividing the data set into a plurality of sub data sets, and broadcasting the plurality of sub data sets from a master node of the Hadoop distributed network framework to the plurality of child nodes; and training the neural network model by using the plurality of sub data sets on the plurality of child nodes respectively.

The data in the Hadoop distributed network framework are processed in parallel through Spark.

The step of predicting a future soil pollution index for a particular area based on the error comprises: calculating an initial state probability vector and a one-step transition probability matrix of the error through a Markov chain algorithm; and predicting the future soil pollution index of the specific area according to the initial state probability vector and the one-step transition probability matrix.

The present invention provides a soil pollution prediction system, including: a grey module configured to generate an effective number series for soil pollution prediction using a grey prediction model according to the past-year soil pollution indexes of a specific area, the effective number series including a model reduction value and a generated sequence value; a neural network prediction module configured to train a neural network model using the effective number series, and predict soil pollution indexes of the specific area using the trained neural network model, wherein the predicted soil pollution indexes include past-year predicted soil pollution indexes corresponding to the past-year soil pollution indexes; and a prediction correction module configured to calculate an error between the past-year predicted soil pollution indexes and the past-year soil pollution indexes, and predict a future soil pollution index of the specific area based on the error.

The grey module is configured to: calculate the grade ratio of the past-year soil pollution data; select data for building the grey prediction model from the past-year soil pollution data based on the grade ratio; establish the grey prediction model by using the selected data; and generate a model reduction value and a generated sequence value by using the grey prediction model.

The neural network prediction module is configured to: integrate the effective number series into a data set; divide the data set into a plurality of sub data sets, and broadcast the plurality of sub data sets from a master node of the Hadoop distributed network framework to the plurality of child nodes; and train the neural network model by using the plurality of sub data sets on the plurality of child nodes respectively.

The neural network prediction module is configured to process the data in the Hadoop distributed network framework in parallel through Spark.

The prediction correction module is configured to: calculate an initial state probability vector and a one-step transition probability matrix of the error through a Markov chain algorithm; and predict the future soil pollution index of the specific area according to the initial state probability vector and the one-step transition probability matrix.

According to the soil pollution prediction method and the soil pollution prediction system, the soil heavy metal sample size can be expanded through grey prediction, which provides effective data for the later distributed neural network computation; parallel operation is carried out through Spark-based big data processing, which improves the operation speed of the neural network; and the prediction result is corrected through a Markov chain, so that the prediction precision for soil heavy metal pollution is higher.

Additional aspects and/or advantages of the present general inventive concept will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the general inventive concept.

Drawings

The above and other objects and features of exemplary embodiments of the present invention will become more apparent from the following description taken in conjunction with the accompanying drawings which illustrate exemplary embodiments, wherein:

Fig. 1 is a flowchart of a soil contamination prediction method according to an exemplary embodiment of the present invention.

Fig. 2 is a schematic diagram of a structure of a neural network model according to an exemplary embodiment of the present invention.

Fig. 3 is a block diagram of a soil contamination prediction system according to an exemplary embodiment of the present invention.

The present invention will hereinafter be described in detail with reference to the drawings, wherein like or similar elements are designated by like or similar reference numerals throughout.

Detailed Description

The invention relates to soil pollution remediation, and also to grey prediction, big data distributed computing, neural network prediction and Markov Chain (MC) correction. Soil pollution data are predicted by a grey prediction method, the soil pollution state is analyzed, and the original data are expanded. Based on massive soil pollution data, distributed computation is carried out with the big data processing framework Spark, a neural network is deployed on the child nodes of Hadoop to perform the computation, and the output result is compared with the original data and corrected through a Markov chain. The method can improve both prediction accuracy and computational efficiency.

The grey prediction method is a method for predicting a system with uncertain factors. The grey prediction is used for identifying the degree of dissimilarity of development trends among system factors, performing correlation analysis, generating and processing original data to find the rule of system change, generating a data sequence with a strong rule, and then establishing a corresponding differential equation model, so as to predict the future development trend of the object.

An artificial neural network can predict soil heavy metal pollution after being trained on a large amount of data. When the artificial neural network is combined with a big data system, a large neural network is divided into many small neural networks that are run in parallel by means of Spark, which improves the operation efficiency.

The advantage of applying a Markov chain is that it can predict the probability that the system is in state j at time n+1, on condition that the system is in state i at time n. The error transition probability can be predicted by a Markov chain, so that relative error intervals describing the error distribution of the values can be established, and the error can be corrected by a formula to reduce it.

Data on soil heavy metal pollution are collected (the data can be calculated by, for example, the single factor index method or the Nemerow comprehensive pollution index method for evaluating soil heavy metal pollution), and a grey prediction model (for example, but not limited to, a GM (1, 1) model, i.e., a model of one variable established with a first-order differential equation) is applied to obtain the generated sequence value and the model reduction value corresponding to the evaluated soil pollution index. Neural network model parameters are deployed in a Spark-based Hadoop distributed computing framework to construct a distributed neural network, and the generated sequence values and model reduction values of soil heavy metal pollution are then integrated and input into the distributed neural network, so that a larger data set can be divided into m equal small data sets for parallel computation; the parameters output by the slave nodes are then integrated to obtain global parameters, which accelerates the operation. Finally, the data are corrected through a Markov chain, and data with large errors are removed and replaced, thereby improving the prediction precision of the model.

The following description is provided with reference to the accompanying drawings to assist in a comprehensive understanding of exemplary embodiments of the invention as defined by the claims and their equivalents. The description includes various specific details to aid understanding, but these details are to be regarded as illustrative only. Thus, one of ordinary skill in the art will recognize that: various changes and modifications may be made to the embodiments described herein without departing from the scope and spirit of the present invention. Moreover, descriptions of well-known functions and constructions may be omitted for clarity and conciseness.

In order to further improve the accuracy and speed of prediction of heavy metals in regional soil, the invention provides a novel regional soil heavy metal pollution prediction method. It adopts a novel model that combines grey prediction, a neural network, big data distributed computation and Markov chain error correction, and applies this model to the prediction of heavy metal pollution in soil for the first time.

As shown in Fig. 1, in step S101, an effective number series for soil pollution prediction is generated using a grey prediction model according to the past-year soil pollution indexes of a specific area, the effective number series including a model reduction value and a generated sequence value.

In the embodiment of the invention, the past-year soil pollution indexes of a specific area can be collected, and the grade ratio of the past-year soil pollution data is calculated; data for building the grey prediction model are selected from the past-year soil pollution data based on the grade ratio; the grey prediction model is established by using the selected data; and a model reduction value and a generated sequence value are generated by using the grey prediction model.

According to the embodiment, the collected soil pollution indexes (e.g., the Nemerow comprehensive pollution index) can first be subjected to grey processing, and the soil pollution index of a piece of heavy-metal-polluted soil in specified future years can be predicted through GM (1, 1). The principle and formula derivation of GM (1, 1) are as follows.

The past-year soil pollution indexes are fitted by establishing a grey prediction model GM (1, 1). If the fitted data pass inspection, the soil pollution condition in the specified future years is predicted from the fitted data, and the effective data sequence generated by grey prediction is fed into a neural network for learning and prediction.

First, a time series is established from the data collected over the past years for a specified area polluted by a heavy metal:

s⁽⁰⁾ = (s⁽⁰⁾(1), s⁽⁰⁾(2), ..., s⁽⁰⁾(n))

In the formula, s⁽⁰⁾(1) denotes the heavy metal pollution data of the specified area in the first year, s⁽⁰⁾(2) denotes the data in the second year, and s⁽⁰⁾(n) denotes the data in the n-th year. The original data s⁽⁰⁾ are accumulated once to obtain s⁽¹⁾, and the grey derivative of s⁽¹⁾ is defined as:

d(k)＝s⁽⁰⁾(k)＝s⁽¹⁾(k)-s⁽¹⁾(k-1)

Define i⁽¹⁾(k) as the adjacent-value generated sequence of s⁽¹⁾:

i⁽¹⁾(k) = αs⁽¹⁾(k) + (1-α)s⁽¹⁾(k-1)

establish the gray differential equation for GM (1, 1):

s⁽⁰⁾(k)+αi⁽¹⁾(k)＝β

where s⁽⁰⁾(k) is called the grey derivative, α is the development coefficient, i⁽¹⁾(k) is the whitened background value, and β is the grey action quantity.

Using regression analysis, estimates of the development coefficient and the grey action quantity can be obtained; the equation of the corresponding grey model is then as follows:

Solving the above differential equation gives

Thus, the generated sequence value can be obtained

and the model reduction value
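The expressions that these sentences introduce are not reproduced in the text. As a point of reference only, the textbook GM (1, 1) forms are sketched below in the document's symbols (accumulated series s⁽¹⁾, development coefficient α, grey action quantity β). The hat notation and the exact indexing are assumptions and may differ from the patent's own formulas.

```latex
\begin{align}
\frac{\mathrm{d}s^{(1)}(t)}{\mathrm{d}t} + \alpha\, s^{(1)}(t) &= \beta
  && \text{(whitened differential equation)} \\
\hat{s}^{(1)}(k+1) &= \left(s^{(0)}(1) - \frac{\beta}{\alpha}\right) e^{-\alpha k} + \frac{\beta}{\alpha}
  && \text{(accumulated, generated value)} \\
\hat{s}^{(0)}(k+1) &= \hat{s}^{(1)}(k+1) - \hat{s}^{(1)}(k)
  && \text{(value restored by inverse accumulation)}
\end{align}
```

In this reading, the second line corresponds to the generated sequence value and the third to the model reduction value.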

Through the grey prediction method, a set of new data, namely the model reduction values and the generated sequence values, can be fitted from the past-year soil pollution data. If the fitted model reduction values are verified to be reasonable and feasible, a new data sequence of the soil heavy metal pollution state can be established from the fitted model reduction values and generated sequence values. The significance of using grey prediction is that a small amount of data can be expanded, so that the data sequence entering neural network learning is more comprehensive.
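The fitting procedure can be illustrated with a minimal pure-Python sketch. It assumes the textbook GM (1, 1) formulation: an adjacent-mean background value (weight 0.5) and a least-squares estimate of α and β. The function name `gm11` and the `horizon` parameter are hypothetical and do not come from the patent.

```python
import math

def gm11(s0, horizon=0):
    """Fit a textbook GM(1,1) model to s0; return (alpha, beta, restored, forecast)."""
    n = len(s0)
    # One-time accumulation: s1(k) = s0(1) + ... + s0(k)
    s1, total = [], 0.0
    for value in s0:
        total += value
        s1.append(total)
    # Background values: mean of adjacent accumulated values (assumed weight 0.5)
    bg = [0.5 * (s1[k] + s1[k - 1]) for k in range(1, n)]
    y = s0[1:]
    # Least squares for s0(k) = -alpha * bg(k) + beta (simple linear regression)
    m = n - 1
    sx, sy = sum(bg), sum(y)
    sxx = sum(x * x for x in bg)
    sxy = sum(x * v for x, v in zip(bg, y))
    slope = (m * sxy - sx * sy) / (m * sxx - sx * sx)
    beta = (sy - slope * sx) / m
    alpha = -slope  # a constant series gives alpha == 0 and is not handled here

    def s1_hat(k):
        # Time response of the whitened equation; k = 0 is the first year
        return (s0[0] - beta / alpha) * math.exp(-alpha * k) + beta / alpha

    # Inverse accumulation restores values on the original scale
    fitted = [s0[0]] + [s1_hat(k) - s1_hat(k - 1) for k in range(1, n + horizon)]
    return alpha, beta, fitted[:n], fitted[n:]
```

Under this sketch, `restored` plays the role of the fitted values for the observed years and `forecast` the values for the following `horizon` years; together they form the expanded sequence.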

For example, for the collected past-year soil pollution indexes, the grade ratio λ(k) of the data is calculated as follows:

The grade ratio is then checked: if |1 - λ(k)| < 0.2, the selected data s⁽⁰⁾ can be used for satisfactory GM (1, 1) modeling. Next, the original data s⁽⁰⁾(t) are accumulated once to obtain s⁽¹⁾(t); the grey prediction model is established and the differential equation is solved to obtain s⁽¹⁾(t + 1), which yields the generated sequence value

and the model reduction value

Letting k take 1, 2, …, n in turn gives

From

it can be calculated that

Model checking is then carried out by calculating the relative residual ε(k) between the model reduction value and the originally collected data. If |ε(k)| < 0.1, the model is considered to meet the higher requirement; if |ε(k)| < 0.2, the model is considered to meet the general requirement.
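The two checks described here can be sketched as follows. The 0.2 and 0.1 thresholds are those stated in the text; the definition λ(k) = s⁽⁰⁾(k-1)/s⁽⁰⁾(k) is the usual grey-modeling grade ratio and is assumed, since the patent's own formula is not reproduced. All function names are illustrative.

```python
def grade_ratios(s0):
    """Assumed definition: lambda(k) = s0(k-1) / s0(k) for k = 2..n."""
    return [s0[k - 1] / s0[k] for k in range(1, len(s0))]

def passes_grade_ratio(s0, tol=0.2):
    """True if |1 - lambda(k)| < tol for every k, as required before modeling."""
    return all(abs(1.0 - lam) < tol for lam in grade_ratios(s0))

def relative_residuals(original, restored):
    """Relative residual of each restored value against the collected value."""
    return [(o - r) / o for o, r in zip(original, restored)]

def accuracy_level(original, restored):
    """Classify the fit by its worst relative residual."""
    worst = max(abs(e) for e in relative_residuals(original, restored))
    if worst < 0.1:
        return "higher"
    if worst < 0.2:
        return "general"
    return "fail"
```

A series that fails `passes_grade_ratio` would need a different selection of years (or a transformation) before GM (1, 1) is applied.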

In step S102, a neural network model is trained using the effective number series. For example, but not limiting of, the neural network model may be deployed on multiple child nodes of a Hadoop distributed network framework.

Fig. 2 is a schematic diagram of a structure of a neural network model according to an exemplary embodiment of the present invention. As shown in FIG. 2, in an example of the invention, the neural network model may be deployed on multiple child nodes S1 through Sm of the Hadoop distributed network framework.

In step S103, the trained neural network model is used to predict soil pollution indexes of the specific area, where the predicted soil pollution indexes include the soil pollution indexes predicted in the past year corresponding to the soil pollution indexes in the past year.

In the embodiment of the present invention, the effective number series needs to be integrated into a data set; for example, the model reduction value obtained by grey prediction

and the generated sequence value

are integrated into a new sequence V. The integrated soil heavy metal pollution values V are then partitioned, dividing the data set V into m different sub data sets. That is, the data set V may be divided into a plurality of sub data sets, and the plurality of sub data sets may be broadcast from the master node M of the Hadoop distributed network framework to the plurality of child nodes S1 to Sm; the neural network model is then trained using the plurality of sub data sets on the plurality of child nodes S1 to Sm, respectively. For example, but not limited to, data in the Hadoop distributed network framework are processed in parallel through Spark to perform big data distributed computation, which increases the operation speed. Because the GM (1, 1) model can predict the soil heavy metal pollution index for specified future years, when the training set and the test set are divided, part of the model reduction values are selected for training, and part of the model reduction values together with the generated sequence values are used for testing. The ratio of the training set to the test set is about 5:1. Finally, the outputs of the slave nodes are aggregated at the master node M to obtain an output sequence Z.
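The split and partition steps can be sketched without any cluster dependency. The sketch assumes that V is ordered chronologically (model reduction values first, generated sequence values last), so that holding out the tail gives a test set made of late model reduction values plus generated sequence values. The function names are hypothetical.

```python
def split_train_test(values, ratio=5):
    """Hold out roughly 1/(ratio+1) of the tail as the test set (about 5:1)."""
    n_test = max(1, len(values) // (ratio + 1))
    return values[:-n_test], values[-n_test:]

def partition(values, m):
    """Split values into m contiguous sub data sets of near-equal size."""
    size, extra = divmod(len(values), m)
    parts, start = [], 0
    for i in range(m):
        end = start + size + (1 if i < extra else 0)
        parts.append(values[start:end])
        start = end
    return parts

# Hypothetical integrated sequence V: 9 restored values followed by 3 forecasts
V = [0.82, 0.85, 0.88, 0.91, 0.95, 0.98, 1.02, 1.06, 1.10, 1.14, 1.18, 1.23]
train, test = split_train_test(V)
sub_datasets = partition(train, 3)  # one sub data set per child node S1..S3
```

On an actual Spark deployment the distribution itself would typically be left to the framework (for instance, PySpark's `SparkContext.parallelize` takes a number of slices); the helper above only illustrates the partitioning logic.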

In the training process, the neural network is combined with a big data distributed system, and global neural network parameters, including the structure, the number of layers, the activation function g(·), the bias b and the weight ω of the neural network, are set in the master node of the Spark distributed computing framework. The generated sequence values of soil heavy metal pollution simulated by grey prediction

and the model reduction values

are, before being fed to the input layer of the neural network, integrated into a new data set, defined as V. V is divided into m sub data sets, which are broadcast to the slave nodes. In training, part of the model reduction values are selected as the training set, and part of the model reduction values together with the generated sequence values are used as the test set.
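The text does not specify how the results of the slave nodes are combined into global parameters at the master node. One common choice, assumed here purely for illustration, is to average the parameter vectors element by element; `average_parameters` is a hypothetical helper.

```python
def average_parameters(node_params):
    """Element-wise mean of the flat parameter vectors returned by the nodes.

    node_params: one list of floats per slave node, all of equal length.
    Plain averaging is an assumption; the patent only says the outputs are
    integrated into global parameters.
    """
    m = len(node_params)
    return [sum(values) / m for values in zip(*node_params)]

# Hypothetical weights and bias reported by three slave nodes
global_params = average_parameters([
    [0.20, -0.10, 0.05],
    [0.24, -0.14, 0.07],
    [0.22, -0.12, 0.03],
])
```

Other aggregation rules (for example, weighting each node by the size of its sub data set) would fit the same description equally well.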

Suppose that the neural network used consists of an input layer, a hidden layer and an output layer. The input layer is defined to be composed of 3 neurons, a₁⁽¹⁾, a₂⁽¹⁾ and a₃⁽¹⁾; the hidden layer is composed of 2 neurons, a₁⁽²⁾ and a₂⁽²⁾; and the output is defined as z. The output of the neural network can then be represented by the following two equations:

z = g(ω⁽²⁾·a⁽²⁾ + b)

a⁽²⁾ = g(ω⁽¹⁾·a⁽¹⁾ + b)

The outputs z computed by the slave nodes are aggregated at the master node to obtain the output sequence Z, where z_t denotes the predicted soil heavy metal pollution state at time t.
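The two equations correspond to a forward pass through a 3-2-1 network, sketched below in pure Python. The sigmoid activation and the use of separate biases per layer (`b1`, `b2`) are assumptions; the text only names a generic activation g(·) and a bias b.

```python
import math

def sigmoid(x):
    """Logistic activation, used here as an assumed choice for g."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(a1, w1, b1, w2, b2, g=sigmoid):
    """Forward pass: a2 = g(w1 * a1 + b1), then z = g(w2 * a2 + b2).

    a1: 3 input values; w1: 2x3 weight rows; b1: 2 hidden biases;
    w2: 2 output weights; b2: scalar output bias.
    """
    a2 = [g(sum(w * a for w, a in zip(row, a1)) + b) for row, b in zip(w1, b1)]
    z = g(sum(w * a for w, a in zip(w2, a2)) + b2)
    return a2, z

# Hypothetical inputs and parameters
hidden, z = forward(
    a1=[0.9, 1.0, 1.1],
    w1=[[0.2, -0.1, 0.4], [0.3, 0.1, -0.2]],
    b1=[0.0, 0.1],
    w2=[0.5, -0.5],
    b2=0.05,
)
```

Each slave node would evaluate such a pass on its own sub data set, and the resulting z values would be collected into Z.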

In step S104, an error between the past-year predicted soil pollution indexes and the past-year soil pollution indexes is calculated. According to the embodiment of the invention, after the soil pollution indexes of the specific area are predicted, Markov chain correction can be carried out on the output sequence Z, and the relative error of the past-year predicted soil pollution indexes is calculated by comparing them with the collected past-year soil pollution indexes.

In step S105, a future soil pollution index of the specific area is predicted based on the error. In the embodiment of the invention, the initial state probability vector and the frequency transition matrix of the error can be calculated through a Markov chain algorithm, the one-step transition probability matrix is obtained through the frequency transition matrix, and then the future soil pollution index of the specific area can be predicted according to the initial state probability vector and the one-step transition probability matrix.

In an embodiment of the present invention, the calculated errors may be classified. The errors can be divided into four categories, E₁, E₂, E₃ and E₄, which represent abnormal underestimation, underestimation, overestimation and abnormal overestimation, respectively. A transition frequency matrix T of the relative error E of soil heavy metal pollution and a transition probability matrix P can therefore be obtained. The one-step transition probability matrix P is solved as follows:

At the same time, the frequency F₀ of each state in the frequency transition matrix T is calculated. In a Markov chain, the n-step transition probability matrix is P⁽ⁿ⁾ = Pⁿ, so the state distribution vector of soil heavy metal pollution in a subsequent year (subsequent stage) is determined by the Markov chain as F_n = F₀·Pⁿ. The prediction error of soil heavy metal pollution in the following n years can therefore be determined, and then the formula

is used to carry out Markov chain correction and obtain the corrected predicted value z'_t of soil heavy metal pollution. In the formula, e₁ and e₂ respectively denote the upper and lower limits of the corresponding relative error interval.

Markov chain error correction is performed on the output sequence: the output sequence Z is compared with the past-year soil pollution indexes, the errors E between them are listed, and, according to the calculated errors, E is divided into four classes that form the error intervals, namely abnormal underestimation, underestimation, overestimation and abnormal overestimation. The error e between each element z of the soil heavy metal pollution sequence Z and the original data is then observed and calculated, and e is assigned to one class of error in the error intervals. By comparing the errors of the elements z of the sequence Z, the frequency transition matrix T of the error E of soil heavy metal pollution and its one-step transition probability matrix P can be obtained, and the frequency F₀ of each state in the frequency transition matrix T can be calculated. From the formula F_n = F₀·Pⁿ, the state distribution vector of soil heavy metal pollution in the next n years is obtained.
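The construction of T, P and F_n can be sketched as follows. The sketch assumes that the relative error is defined as (predicted - actual) / actual, that the class boundaries are -10 %, 0 and +10 % (hypothetical values, not taken from the patent), and that each row of P is the corresponding row of T divided by its row sum, which is the usual estimate of one-step transition probabilities.

```python
def classify(e, bounds=(-0.10, 0.0, 0.10)):
    """Map a relative error to a state index 0..3 (E1..E4); bounds are assumed."""
    for i, b in enumerate(bounds):
        if e < b:
            return i
    return len(bounds)

def transition_matrices(states, n_states=4):
    """Count transitions between consecutive states (T) and normalize rows (P)."""
    T = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(states, states[1:]):
        T[i][j] += 1
    P = []
    for row in T:
        total = sum(row)
        # A state that was never left keeps an all-zero row in this sketch
        P.append([c / total if total else 0.0 for c in row])
    return T, P

def distribution_after(F0, P, n):
    """Compute F_n = F0 * P^n by n successive vector-matrix products."""
    F = list(F0)
    for _ in range(n):
        F = [sum(F[i] * P[i][j] for i in range(len(F))) for j in range(len(P[0]))]
    return F

# Hypothetical relative errors of the past-year predictions
errors = [-0.15, -0.04, -0.02, 0.03, -0.06, 0.12, -0.01]
states = [classify(e) for e in errors]
T, P = transition_matrices(states)
F2 = distribution_after([1.0, 0.0, 0.0, 0.0], P, 2)
```

Here F₀ is taken as a one-hot vector for a known starting state; using observed state frequencies instead, as the text also suggests, only changes the input vector.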

Through the Markov chain, it is known which error interval the soil heavy metal pollution error is most likely to fall in over the next n years; that interval is then selected, and the soil heavy metal pollution index predicted for it is corrected by the formula, so that the prediction effect is better.
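The selection and correction step can be sketched as below. The patent's correction formula is not reproduced in the text, so the sketch substitutes a midpoint correction that is common in grey-Markov models: with the relative error defined as (predicted - actual) / actual, the prediction is divided by one plus the midpoint of the selected interval. Both this formula and the interval limits in `INTERVALS` are assumptions.

```python
# Hypothetical (lower, upper) relative-error limits for states E1..E4
INTERVALS = [(-0.20, -0.10), (-0.10, 0.0), (0.0, 0.10), (0.10, 0.20)]

def most_likely_state(F):
    """Index of the state with the largest probability in distribution F."""
    return max(range(len(F)), key=lambda i: F[i])

def markov_correct(z_t, interval):
    """Assumed midpoint correction: z' = z / (1 + (e1 + e2) / 2)."""
    e1, e2 = interval
    return z_t / (1.0 + 0.5 * (e1 + e2))

# Hypothetical state distribution F_n and raw prediction z_t
F_n = [0.1, 0.6, 0.2, 0.1]
state = most_likely_state(F_n)
z_corrected = markov_correct(1.05, INTERVALS[state])
```

With the opposite sign convention for the relative error, the division would become a multiplication; the structure of the step stays the same.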

The following is a brief description of the application of the soil contamination prediction method with reference to specific examples, which are only for illustration and the present invention is not limited thereto.

In an example of the present invention, the soil pollution indexes (i.e., comprehensive pollution indexes) of a specific area from 2010 to 2017 are collected, and data expansion is first performed using a grey model; model reduction values (i.e., comprehensive pollution indexes) for 2010 to 2017 and generated sequence values (i.e., comprehensive pollution indexes) for 2018 to 2020 are then obtained.

The model reduction values and the generated sequence values output by the grey model are integrated into a data set V. Then, factory construction data, pollution reporting data and resident life data related to the specific area are collected, and these data together with the data set V are used as feature vectors of the input layer of the distributed neural network to construct a distributed neural network prediction model that captures the pollution fluctuation pattern. Then, the soil pollution indexes for 2010 to 2019 are predicted by using the distributed neural network model, wherein the predicted indexes for 2010 to 2017 can be called the past-year predicted soil pollution indexes, and Markov chain correction is then carried out based on the prediction result.

In this example, the past-year predicted soil pollution indexes (i.e., 2010 to 2017) may be selected from the predicted soil pollution indexes for 2010 to 2019 and compared with the original past-year soil pollution indexes (i.e., 2010 to 2017) to obtain the errors. The errors are divided into four state intervals, namely abnormal underestimation, underestimation, overestimation and abnormal overestimation, and an initial state probability vector F₀ and a one-step transition matrix P are calculated, so that the interval in which the error of the future soil pollution index (for example, but not limited to, 2018 to 2020) falls can be predicted. After the interval is determined, the error is corrected using the upper and lower limits of the interval error. For example, a frequency transition matrix T and a one-step transition probability matrix P of the error E of soil heavy metal pollution can be calculated, and the frequency F₀ of each state in the frequency matrix T can be calculated at the same time. Using the formula F_n = F₀·Pⁿ, the future soil pollution index (e.g., without limitation, 2018 to 2020) is calculated.

Fig. 3 is a block diagram of a soil contamination prediction system 300 according to an exemplary embodiment of the present invention. As shown in Fig. 3, the soil contamination prediction system 300 may include a grey module 301, a neural network prediction module 302, and a prediction correction module 303.

The grey module 301 may be configured to generate an effective number series for soil pollution prediction using a grey prediction model based on the past-year soil pollution indexes of a specific area, the effective number series including a model reduction value and a generated sequence value. The neural network prediction module 302 may be configured to train a neural network model using the effective number series, and predict soil pollution indexes of the specific area using the trained neural network model, where the predicted soil pollution indexes include past-year predicted soil pollution indexes corresponding to the past-year soil pollution indexes. The prediction correction module 303 may be configured to calculate an error between the past-year predicted soil pollution indexes and the past-year soil pollution indexes, and to predict a future soil pollution index of the specific area based on the error.

According to embodiments of the invention, the grey module 301 may be configured to: calculate the grade ratio of the past-year soil pollution data; select data for building the grey prediction model from the past-year soil pollution data based on the grade ratio; establish the grey prediction model by using the selected data; and generate a model reduction value and a generated sequence value by using the grey prediction model.

In embodiments of the invention, the neural network model may be deployed on multiple child nodes of a Hadoop distributed network framework. The neural network prediction module 302 may be configured to: integrate the effective number series into a data set; divide the data set into a plurality of sub data sets, and broadcast the plurality of sub data sets from a master node of the Hadoop distributed network framework to the plurality of child nodes; and train the neural network model by using the plurality of sub data sets on the plurality of child nodes respectively.

The neural network prediction module 302 may be configured to process the data in the Hadoop distributed network framework in parallel through Spark.

The prediction correction module 303 may be configured to: calculate an initial state probability vector and a one-step transition probability matrix of the error through a Markov chain algorithm; and predict the future soil pollution index of the specific area according to the initial state probability vector and the one-step transition probability matrix.

Since the soil pollution prediction method shown in Figs. 1 and 2 is performed by the soil pollution prediction system 300 shown in Fig. 3, the descriptions given above with reference to Figs. 1 and 2 for the respective steps of the method also apply to the operations performed by the corresponding modules of the soil pollution prediction system 300 described with reference to Fig. 3. For details of those operations, refer to the corresponding descriptions of Figs. 1 and 2, which are not repeated here.

As described above, according to the soil pollution prediction method and the soil pollution prediction system, the soil heavy metal sample size can be expanded through grey prediction, which provides effective data for the later distributed neural network computation; parallel operation is carried out through Spark-based big data processing, which improves the operation speed of the neural network; and the prediction result is corrected through a Markov chain, so that the prediction precision for soil heavy metal pollution is higher.

Further, it should be understood that the respective modules or units in the soil contamination prediction system according to the exemplary embodiment of the present invention may be implemented as hardware components and/or software components. Those skilled in the art may implement the various modules using, for example, Field Programmable Gate Arrays (FPGAs) or Application Specific Integrated Circuits (ASICs), depending on the processing performed by the defined various modules.

A computer-readable storage medium according to an exemplary embodiment of the present invention stores a computer program that, when executed by a processor, causes the processor to perform the soil contamination prediction method of the above-described exemplary embodiment. The computer readable storage medium is any data storage device that can store data which can be read by a computer system. Examples of computer-readable storage media include: read-only memory, random access memory, read-only optical disks, magnetic tapes, floppy disks, optical data storage devices, and carrier waves (such as data transmission through the internet via wired or wireless transmission paths).

Although a few exemplary embodiments of the present invention have been shown and described, it would be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the claims and their equivalents.

Claims

1. A soil pollution prediction method, comprising:

generating an effective number series for soil pollution prediction by using a gray prediction model according to a past year soil pollution index of a specific area, wherein the effective number series comprises a model reduction value and a generated number series value;

training a neural network model by using the effective number series;

predicting a soil pollution index of the specific area by using the trained neural network model, wherein the predicted soil pollution index comprises a past year predicted soil pollution index corresponding to the past year soil pollution index;

calculating the error between the past year predicted soil pollution index and the past year soil pollution index;

predicting a future soil pollution index of the specific area based on the error.

2. The soil contamination prediction method of claim 1, wherein the step of generating an effective number series for soil contamination prediction using a gray prediction model comprises:

calculating the grade ratio of the past year pollution data;

selecting data for building a gray prediction model from the past year soil pollution data based on the grade ratio;

establishing a gray prediction model by using the selected data;

and generating a model reduction value and a generated sequence value by using a gray prediction model.

3. The soil pollution prediction method of claim 1, wherein the neural network model is deployed on a plurality of sub-nodes of a Hadoop distributed network framework,

the step of training a neural network model using the effective number series comprises:

integrating the effective number series into a data set;

dividing the data set into a plurality of sub data sets, and broadcasting the plurality of sub data sets from a main node of the Hadoop distributed network framework to the plurality of sub nodes;

and training the neural network model by using the plurality of sub-data sets on the plurality of sub-nodes respectively.

4. The soil pollution prediction method of claim 3, wherein the data in the Hadoop distributed network framework are processed in parallel through a Spark algorithm.

5. The soil contamination prediction method of claim 1, wherein the step of predicting a future soil contamination index for a particular area based on the error comprises:

calculating an initial state probability vector and a one-step transition probability matrix of the error through a Markov chain algorithm;

and predicting the future soil pollution index of the specific area according to the initial state probability vector and the one-step transition probability matrix.

6. A soil pollution prediction system comprising:

a grey module configured to generate an effective number series for soil pollution prediction using a grey prediction model according to a past year soil pollution index of a specific area, the effective number series including a model reduction value and a generated number series value;

a neural network prediction module configured to train a neural network model using the effective number series, and predict a soil pollution index of the specific area using the trained neural network model, wherein the predicted soil pollution index includes a past year predicted soil pollution index corresponding to the past year soil pollution index;

a prediction correction module configured to calculate an error between the past year predicted soil pollution index and the past year soil pollution index, and predict a future soil pollution index of the specific area based on the error.

7. The soil contamination prediction system of claim 6, wherein the gray module is configured to: calculate the grade ratio of the past year pollution data; select data for building a gray prediction model from the past year soil pollution data based on the grade ratio; establish a gray prediction model by using the selected data; and generate a model reduction value and a generated number series value by using the gray prediction model.

8. The soil contamination prediction system of claim 6, wherein the neural network model is deployed on a plurality of sub-nodes of a Hadoop distributed network framework,

9. The soil contamination prediction system of claim 8, wherein the neural network prediction module is configured to process the data in the Hadoop distributed network framework in parallel through a Spark algorithm.

10. The soil contamination prediction system of claim 6, wherein the prediction correction module is configured to: calculate an initial state probability vector and a one-step transition probability matrix of the error through a Markov chain algorithm; and predict the future soil pollution index of the specific area according to the initial state probability vector and the one-step transition probability matrix.