CN109242133B - A data processing method and system for early warning of surface disasters

Info

Publication number: CN109242133B
Application number: CN201810754439.1A
Authority: CN
Inventors: 张晓明; 戴波; 陈亚峰; 曹国清
Original assignee: Beijing Institute of Petrochemical Technology
Current assignee: Beijing Institute of Petrochemical Technology
Priority date: 2018-07-11
Filing date: 2018-07-11
Publication date: 2022-03-22
Anticipated expiration: 2038-07-11
Also published as: CN109242133A

Abstract

The present invention provides a data processing method and system for early warning of surface disasters. The data processing method for early warning of surface disasters includes: using a feature vector selection algorithm to select a first feature vector from n initial sample data, where the sample data is monitoring data obtained by a sensor; using a support vector machine regression algorithm that includes an insensitive loss function to learn the first feature vector and first newly added sample data respectively, to obtain a first prediction result and a second prediction result; adjusting the insensitive loss function according to the difference between the first prediction result and the first newly added sample data and the difference between the second prediction result and second newly added sample data; and using the support vector machine regression algorithm that includes the adjusted insensitive loss function to learn the first feature vector and obtain a target prediction result. The data processing method for surface disaster early warning provided by the embodiment of the present invention can improve the accuracy of the prediction result.

Description

Data processing method and system for ground disaster early warning

Technical Field

The invention relates to the technical field of data processing, in particular to a data processing method and system for ground surface disaster early warning.

Background

A disaster early warning result can be obtained by analyzing and processing the monitoring information detected by sensors. In practice, monitoring personnel can take precautionary measures according to the disaster early warning result, so as to prevent disasters or reduce the losses they cause.

In the related art, the sensors used to monitor ground surface disasters are of many types (such as temperature sensors, humidity sensors, pressure sensors and displacement sensors) and are numerous, so ground surface disaster monitoring data is nonlinear and high-dimensional. To reduce unnecessary sample training time, a Feature Vector Selection (FVS) algorithm is used to reduce the size of a large data set offline and to construct feature sample data based on the sample set, thereby reducing computational complexity and computation time so that early warning information can be output in time.

In the related art, a Support Vector Regression (SVR) algorithm is used to predict a disaster for the feature sample data obtained after the FVS algorithm is reduced.

However, in practice, reducing the monitoring data with the FVS algorithm introduces a large error between the feature sample data and the actual monitoring data. As a result, the disaster prediction result that the SVR algorithm outputs based on the feature sample data deviates considerably from the actual situation. The accuracy of the prediction result of the data processing method for surface disaster warning in the related art is therefore low.

Disclosure of Invention

The embodiment of the invention provides a data processing method and a data processing system for ground surface disaster early warning, which aim to solve the problem of low accuracy of a prediction result of the data processing method for ground surface disaster early warning.

In order to achieve the above object, the present invention is realized by:

In a first aspect, an embodiment of the present invention provides a data processing method for ground surface disaster warning, where the method includes:

selecting a first feature vector from n initial sample data by adopting a feature vector selection algorithm, wherein the first feature vector comprises m sample data, n is a positive integer, m is an integer smaller than n, and the sample data is monitoring data acquired by a sensor;

learning the first feature vector and the first newly added sample data respectively by adopting a support vector machine regression algorithm comprising an insensitive loss function to obtain a first prediction result and a second prediction result, wherein the first newly added sample data is newly added sample data in a first prediction period;

adjusting the insensitive loss function according to a difference value between the first prediction result and the first newly added sample data and a difference value between the second prediction result and second newly added sample data, wherein the second newly added sample data is the sample data newly added in a second prediction period, and the second prediction period is later than the first prediction period;

and learning the first feature vector by adopting a support vector machine regression algorithm comprising the adjusted insensitive loss function to obtain a target prediction result.

In a second aspect, an embodiment of the present invention further provides an early warning data processing system, where the system includes:

the first selection module is used for selecting a first feature vector from n initial sample data by adopting a feature vector selection algorithm, wherein the first feature vector comprises m sample data, n is a positive integer, m is an integer smaller than n, and the sample data is monitoring data acquired through a sensor;

the first learning module is used for learning the first feature vector and the first newly added sample data respectively by adopting a support vector machine regression algorithm comprising an insensitive loss function to obtain a first prediction result and a second prediction result, wherein the first newly added sample data is sample data newly added in a first prediction period;

an adjusting module, configured to adjust the insensitive loss function according to a difference between the first prediction result and the first newly added sample data and a difference between the second prediction result and a second newly added sample data, where the second newly added sample data is sample data newly added in a second prediction period, and the second prediction period is later than the first prediction period;

and the second learning module is used for learning the first feature vector by adopting a support vector machine regression algorithm comprising the adjusted insensitive loss function to obtain a target prediction result.

In the embodiment of the invention, a first feature vector is selected from n initial sample data by adopting a feature vector selection algorithm, wherein the first feature vector comprises m sample data, n is a positive integer, m is an integer smaller than n, and the sample data is monitoring data acquired by a sensor; the first feature vector and the first newly added sample data are learned respectively by adopting a support vector machine regression algorithm comprising an insensitive loss function to obtain a first prediction result and a second prediction result, wherein the first newly added sample data is sample data newly added in a first prediction period; the insensitive loss function is adjusted according to a difference value between the first prediction result and the first newly added sample data and a difference value between the second prediction result and second newly added sample data, wherein the second newly added sample data is the sample data newly added in a second prediction period, and the second prediction period is later than the first prediction period; and the first feature vector is learned by adopting a support vector machine regression algorithm comprising the adjusted insensitive loss function to obtain a target prediction result. Therefore, each prediction result can be verified against the sample data that is newly added after the prediction is made, and the value of the insensitive loss function is adjusted according to the verification result so as to improve the accuracy of the prediction result of the support vector machine regression algorithm, thereby improving the accuracy of the prediction result of the data processing method for ground surface disaster early warning.

Drawings

FIG. 1 is a flowchart of a data processing method for ground disaster warning according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of early warning data processing provided by an embodiment of the invention;

FIG. 3 is a schematic diagram of fuzzy subset partitioning of E1 and ΔE in a data processing method for ground disaster warning according to an embodiment of the present invention;

FIG. 4 is a schematic diagram illustrating fuzzy subset partitioning of Δε in a data processing method for ground disaster warning according to an embodiment of the present invention;

FIG. 5 is a flowchart of another data processing method for ground disaster warning according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of a disaster early warning interface in an embodiment of the invention;

FIG. 7 is a schematic diagram of a digital map in an embodiment of the invention.

Detailed Description

In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments.

The data processing method for ground surface disaster early warning provided by the embodiment of the invention can be applied to support vector machine regression learning on monitoring data detected by sensors to obtain a prediction result. A ground surface disaster early warning result can be determined from the prediction result, and monitoring personnel can take appropriate measures according to that result to prevent ground surface disasters. A ground surface disaster may be a disaster caused by the movement of rock, soil and the like on the ground surface. For example, various sensors are used to acquire surface displacement information, internal displacement information, rainfall information, soil pressure information, soil water content information, pore water pressure information, temperature information, humidity information and the like of a dumping site; this information is processed with the data processing method for ground surface disaster early warning to predict whether the dumping site is at risk of debris flow, collapse and the like, and a prediction result is obtained.

Referring to fig. 1, fig. 1 is a flowchart of a data processing method for ground disaster warning according to an embodiment of the present invention, and as shown in fig. 1, the method includes the following steps:

Step 101, selecting a first feature vector from n initial sample data by using a feature vector selection algorithm, wherein the first feature vector comprises m sample data, n is a positive integer, m is an integer smaller than n, and the sample data is monitoring data acquired through a sensor.

The feature vector selection algorithm may select m sample data (xs1, xs2, …, xsv) from the n initial sample data (x1, x2, …, xn), where v is greater than or equal to 1 and less than or equal to m, and the m sample data (xs1, xs2, …, xsv) are the selected feature vectors FV.

Thus, given a set of feature vectors for sample data, any vector in the m sample data may be represented linearly by FV, facilitating learning in steps 102-104 using a support vector machine regression algorithm.

In addition, the monitoring data may be any one or more of pressure information acquired by a pressure sensor, temperature information acquired by a temperature sensor, humidity information acquired by a humidity sensor, and the like, for example: when the method is applied to the disaster monitoring process of the refuse dump, the monitoring data can be one or more of surface displacement information, internal displacement information, rainfall information, soil pressure information, soil moisture content information, pore water pressure information, temperature information and humidity information of the refuse dump. Therefore, the monitoring data has characteristics of large data volume and high dimensionality.

Learning a large volume of monitoring data with the support vector machine regression algorithm takes a long time, so the prediction result is output too late to serve as an early warning. The amount of monitoring data therefore needs to be reduced.

In this step, the feature vector selection algorithm is adopted to reduce the initial sample data containing a large amount of sample data without changing the structure of the initial sample data, so that the operation time of the support vector machine regression algorithm in steps 102 to 104 on the first feature vector can be reduced, and the efficiency of the data processing method for ground surface disaster early warning can be improved.
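The reduction in step 101 can be illustrated with a minimal sketch. It assumes an RBF kernel and a one-pass greedy rule: a sample is kept only when the vectors already selected cannot reconstruct its feature-space image to within a tolerance. The names `rbf`, `solve` and `select_feature_vectors`, the parameters `gamma` and `tol`, and the sample values are illustrative and do not come from the patent; the one-pass rule is a simplification of the fitness-maximizing search usually associated with feature vector selection.

```python
import math

def rbf(a, b, gamma=1.0):
    """RBF kernel between two samples given as tuples of floats."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def select_feature_vectors(samples, tol=0.05, gamma=1.0):
    """Return indices of the samples kept as feature vectors (greedy, one pass)."""
    selected = [0]  # assumption: the first sample is always kept
    for i in range(1, len(samples)):
        k_ss = [[rbf(samples[a], samples[b], gamma) for b in selected]
                for a in selected]
        k_sx = [rbf(samples[a], samples[i], gamma) for a in selected]
        coeffs = solve(k_ss, k_sx)
        # Fraction of the sample's feature-space norm captured by the selected set.
        fitness = sum(c * k for c, k in zip(coeffs, k_sx))
        fitness /= rbf(samples[i], samples[i], gamma)
        if 1.0 - fitness > tol:
            selected.append(i)
    return selected

# Hypothetical one-dimensional sensor readings with near-duplicates.
samples = [(0.0,), (0.01,), (5.0,), (5.01,), (10.0,)]
print(select_feature_vectors(samples))
```

With these readings the near-duplicate samples are dropped, so the regression in the later steps would run on three samples instead of five. A larger `tol` keeps fewer samples at the cost of a coarser approximation of the initial data.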

Step 102, learning the first feature vector and the first newly added sample data respectively by adopting a support vector machine regression algorithm comprising an insensitive loss function to obtain a first prediction result and a second prediction result, wherein the first newly added sample data is sample data newly added in a first prediction period.

The value of ε in the insensitive loss function affects the accuracy of the prediction result of the support vector machine regression algorithm. When the value of ε is too large, the efficiency and accuracy of the support vector machine regression algorithm are reduced; when the value of ε is too small, overfitting occurs and the accuracy of the prediction result is likewise reduced.
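As a brief illustration of why the choice of ε matters, the sketch below implements the standard ε-insensitive loss and counts how many samples fall outside the ε tube for several values of ε. The function names and the residual values are hypothetical.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Loss is zero inside the eps tube and grows linearly outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)

def count_outside_tube(residuals, eps):
    """Number of samples whose error exceeds eps and therefore contributes loss."""
    return sum(1 for r in residuals if abs(r) > eps)

residuals = [0.05, -0.2, 0.4, -0.8]  # hypothetical prediction errors
for eps in (0.01, 0.1, 1.0):
    print(eps, count_outside_tube(residuals, eps))
```

With a very small ε every sample contributes to the loss, so the model is pushed to follow noise; with a very large ε no sample contributes, so the model receives no corrective signal from the data.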

The first prediction result is a prediction result obtained by performing a support vector machine regression algorithm on the first feature vector in a first prediction period, the second prediction result is a prediction result obtained by performing a support vector machine regression algorithm on the first newly added sample data in a second prediction period, and the second prediction period is later than the first prediction period.

Since the first prediction result and the second prediction result are not verified, their accuracy cannot be confirmed. For example, an inappropriate fitness value in the feature vector selection algorithm in step 101 may make the difference between the first feature vector and the initial sample data too large, which causes errors between the actual situation and the first and second prediction results.

Through the step, the first feature vector and the first newly added sample data can be learned by adopting a support vector machine regression algorithm, the first prediction result and the second prediction result can be obtained respectively, and an operation basis is provided for the step 103 and the step 104.
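The two learning passes of step 102 can be sketched as follows. This is a minimal stand-in, not the patent's implementation: it fits a one-dimensional linear model by subgradient descent on the ε-insensitive loss instead of a kernel SVR, and it assumes each sample is a pair `(t, value)` where `t` is the time offset within its batch. The names `fit_linear_svr` and `predict`, the hyperparameters `lr`, `lam` and `epochs`, and all data values are hypothetical.

```python
def fit_linear_svr(samples, eps, lr=0.01, lam=1e-3, epochs=200):
    """Fit value ~ w * t + b by subgradient descent on the eps-insensitive loss."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for t, y in samples:
            r = w * t + b - y
            # Subgradient of the loss: zero inside the tube, +/-1 outside it.
            g = 1.0 if r > eps else -1.0 if r < -eps else 0.0
            w -= lr * (lam * w + g * t)
            b -= lr * g
    return w, b

def predict(model, t):
    w, b = model
    return w * t + b

# Reduced initial samples (the first feature vector); values are made up.
first_feature_vector = [(0, 1.0), (1, 3.0), (2, 5.0), (3, 7.0), (4, 9.0)]
# Samples newly added during the first prediction period; values are made up.
first_new_samples = [(0, 1.2), (1, 3.1), (2, 5.3)]

model_1 = fit_linear_svr(first_feature_vector, eps=0.1)
first_prediction = predict(model_1, 5)   # forecast one step past the initial data

model_2 = fit_linear_svr(first_new_samples, eps=0.1)
second_prediction = predict(model_2, 3)  # forecast one step past the new batch

print(first_prediction, second_prediction)
```

Both forecasts are unverified at this point; they only become checkable once later monitoring data arrives, which is what the adjustment in step 103 relies on.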

Step 103, adjusting the insensitive loss function according to a difference between the first prediction result and the first newly added sample data and a difference between the second prediction result and a second newly added sample data, wherein the second newly added sample data is the newly added sample data in a second prediction period, and the second prediction period is later than the first prediction period.

In the specific application process, the monitoring data of the refuse dump are continuously increased, and the prediction result obtained in the first prediction period can be verified according to the newly increased sample data obtained after the first prediction period.

In addition, if there is a large difference between the prediction result obtained in the first prediction period and the sample data newly added after the first prediction period, and a large difference between the second prediction result and the second newly added sample data, the value of ε in the support vector machine regression algorithm may be inappropriate, and the accuracy of the prediction result of the support vector machine regression algorithm can be improved by adjusting the value of ε.

Specifically, the insensitive loss function can be adjusted by a fuzzy control method.

For example, as shown in FIG. 2, E1 and ΔE are input to a fuzzy controller, which outputs an increment Δε of ε to adjust the value of ε. The fuzzy controller divides the fuzzy variables E1 and ΔE into 7 fuzzy subsets {NL, NM, NS, ZE, PS, PM, PL}, as shown in FIG. 3. According to empirical rules and sample calculation analysis, E1 and ΔE each have the membership functions shown in FIG. 3. The fuzzy controller divides Δε into 5 fuzzy subsets {NB, NS, ZE, PS, PB}, as shown in FIG. 4, and establishes the fuzzy rules shown in Table 1 based on the relationship between the inputs E1 and ΔE and the output Δε:

TABLE 1

(ΔE, U, E1)    NL    NM    NS    ZE    PS    PM    PL

NL

NB

ZE

NS

ZE

NM

NB

PS

ZE

NS

NB

NS

PB

ZE

NS

ZE

NB

NS

ZE

PB

ZE

NS

NB

PS

NS

ZE

PB

NS

NB

PM

NS

ZE

PS

NB

PL

ZE

NS

ZE

NB

Where the first row represents the fuzzy subsets to which E1 belongs, the first column represents the fuzzy subsets to which ΔE belongs, and U represents the fuzzy rule, i.e., the fuzzy subset toward which Δε is adjusted according to the fuzzy rule. For example, when E1 belongs to the NL fuzzy subset and ΔE belongs to the PL fuzzy subset, Δε is adjusted toward the ZE fuzzy subset according to the fuzzy rule.

In addition, when E1 is in the NL interval or the PL interval and ΔE indicates that the error value E2 is greater than the error value E1, the output value of Δε is adjusted so as to decrease the value of ε.

Of course, the number of fuzzy subsets of E1, ΔE and Δε may be another number, such as 3 or 4, which is not limited herein.
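The fuzzy adjustment can be sketched with a reduced controller. The sketch below assumes three fuzzy subsets (N, ZE, P) per input instead of seven, triangular membership functions, inputs already normalized to [-1, 1], and weighted-average defuzzification. The rule values in `RULES` are hypothetical and are not the entries of Table 1; they only follow the stated tendency that ε is decreased when the error is large and still growing.

```python
def tri(x, left, peak, right):
    """Triangular membership function."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def fuzzify(x):
    """Membership of x (assumed normalized to [-1, 1]) in N, ZE and P."""
    x = max(-1.0, min(1.0, x))
    return {
        "N": tri(x, -2.0, -1.0, 0.0),
        "ZE": tri(x, -1.0, 0.0, 1.0),
        "P": tri(x, 0.0, 1.0, 2.0),
    }

# Hypothetical rule base: (subset of E1, subset of dE) -> normalized change of eps.
RULES = {
    ("N", "N"): 0.0, ("N", "ZE"): -0.5, ("N", "P"): -1.0,
    ("ZE", "N"): 0.0, ("ZE", "ZE"): 0.5, ("ZE", "P"): -0.5,
    ("P", "N"): 0.0, ("P", "ZE"): -0.5, ("P", "P"): -1.0,
}

def delta_eps(e1, de, step=0.05):
    """Increment of eps from normalized E1 and dE (weighted-average defuzzification)."""
    mu_e1, mu_de = fuzzify(e1), fuzzify(de)
    num = den = 0.0
    for (a, b), out in RULES.items():
        strength = min(mu_e1[a], mu_de[b])  # firing strength of the rule
        num += strength * out
        den += strength
    return step * num / den if den else 0.0

print(delta_eps(1.0, 1.0))  # large error that is still growing
print(delta_eps(0.0, 0.0))  # negligible, stable error
```

The parameter `step` scales the normalized output to an actual increment of ε and is likewise an assumption; in practice it would be chosen relative to the scale of the monitored quantity.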

It should be noted that the offline SVR model shown in FIG. 2 is obtained by reducing the initial sample data to the first feature vector with the feature vector selection algorithm and inputting the first feature vector to a learning machine for Support Vector Regression (SVR) offline learning.

In this step, whether the value of ε is appropriate can be determined by comparing the difference between the first prediction result and the first newly added sample data with the difference between the second prediction result and the second newly added sample data, and the value of ε is adjusted if it is not appropriate, so as to improve the accuracy of the prediction result of the support vector machine regression algorithm, thereby improving the accuracy of the prediction result of the data processing method for ground surface disaster early warning.

Step 104, learning the first feature vector by adopting a support vector machine regression algorithm comprising the adjusted insensitive loss function to obtain a target prediction result.

It should be noted that, in practice, the monitoring data may be continuously updated. The target prediction result may therefore be a prediction result obtained by learning the first feature vector and/or the first newly added sample data and/or the second newly added sample data with a support vector machine regression algorithm including the adjusted insensitive loss function. The names of the first feature vector, the first newly added sample data and the second newly added sample data may differ, and their roles may change over time. For example, if the current time is 08:01, the monitoring data acquired at 08:00 is newly added sample data, but if the current time is 09:00, the monitoring data acquired at 08:00 has become historical data.

Of course, the target prediction result may be a prediction result obtained by performing a support vector machine regression algorithm on sample data newly added after the second prediction period.

In this step, the first feature vector and the first newly added sample data are relearned by using a support vector machine regression algorithm including the adjusted ε, so that the accuracy of the obtained target prediction result is higher.

Referring to fig. 5, fig. 5 is a flowchart of another data processing method for ground disaster warning according to an embodiment of the present invention, as shown in fig. 5, the method includes the following steps:

Step 501, selecting a first feature vector from n initial sample data by using a feature vector selection algorithm, where the first feature vector includes m sample data, n is a positive integer, m is an integer smaller than n, and the sample data is monitoring data acquired by a sensor.

Step 502, learning the first feature vector and the first newly added sample data respectively by using a support vector machine regression algorithm including an insensitive loss function to obtain a first prediction result and a second prediction result, wherein the first newly added sample data is sample data newly added in a first prediction period.

Optionally, before the step of learning the first new sample data by using a support vector machine regression algorithm including an insensitive loss function, the method further includes:

selecting a second feature vector from the first newly added sample data by adopting a feature vector selection algorithm under the condition that the quantity of the sample data contained in the first newly added sample data is greater than a preset value, wherein the quantity of the sample data contained in the second feature vector is less than the quantity of the sample data contained in the first newly added sample data;

the step of learning the first newly added sample data by adopting a support vector machine regression algorithm comprising an insensitive loss function comprises the following steps:

and performing online learning on the second feature vector by adopting a support vector machine regression algorithm comprising an insensitive loss function to obtain a second prediction result.

The preset value can be set as required. For example, when the number of first newly added sample data is greater than a value such as 200, 500 or 1000, a second feature vector is selected from the first newly added sample data by adopting a feature vector selection algorithm, wherein the number of samples in the second feature vector is smaller than the number of samples in the first newly added sample data.

In addition, as shown in fig. 2, when the number of sample data included in the first newly added sample data is less than or equal to the preset value, the point-by-point learning may be performed on each sample data in the first newly added sample data, so that the accuracy of machine learning may be improved when the number of sample data included in the first newly added sample data is small.

In this embodiment, when the number of the first newly added sample data is too large, the number of samples of the first newly added sample data may be reduced by using the feature vector selection algorithm, so as to avoid the large amount of time that would otherwise be consumed in the subsequent steps by learning and verifying the first newly added sample data with the support vector machine regression algorithm.
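The choice between reducing a batch and learning it point by point can be sketched as a simple dispatch. The function `prepare_new_samples`, the mode labels and the placeholder reducer `every_fifth` are hypothetical; in the method described here the reducer would be the feature vector selection algorithm.

```python
def prepare_new_samples(new_samples, preset, reduce_fn):
    """Pick the learning mode for a batch of newly added samples."""
    if len(new_samples) > preset:
        # Backlog too large: reduce it first, then learn the reduced set online.
        return reduce_fn(new_samples), "reduced"
    # Small batch: learn every sample point by point.
    return list(new_samples), "point-by-point"

def every_fifth(samples):
    """Placeholder reducer standing in for the feature vector selection algorithm."""
    return samples[::5]

backlog = list(range(1000))  # e.g. samples accumulated during a network outage
few = list(range(20))

print(prepare_new_samples(backlog, 200, every_fifth)[1])
print(prepare_new_samples(few, 200, every_fifth)[1])
```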

Step 503, adjusting the insensitive loss function according to a difference between the first prediction result and the first newly added sample data and a difference between the second prediction result and a second newly added sample data, where the second newly added sample data is the newly added sample data in a second prediction period, and the second prediction period is later than the first prediction period.

Optionally, the step 503 may further include the following specific steps:

obtaining an absolute error E1 between the first prediction result and the earliest-generated sample data in the first newly added sample data;

obtaining an absolute error E2 between the second prediction result and the earliest-generated sample data in the second newly added sample data;

and adjusting the insensitive loss function according to a difference ΔE between E1 and E2, wherein if E1 is greater than or equal to a preset error value and ΔE indicates that E2 is greater than E1, the insensitive loss function is reduced.
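The three sub-steps above can be written as a short crisp rule. This sketch assumes that each batch of newly added samples is ordered by generation time, so index 0 is the earliest sample, and that ΔE is computed as E2 minus E1. The multiplicative factor `shrink` and the floor `eps_min` are assumptions; the text only states that the insensitive loss function is reduced.

```python
def adjust_epsilon(eps, pred1, new1, pred2, new2, err_limit,
                   shrink=0.8, eps_min=1e-3):
    """Reduce eps when the first error is large and the second error is larger."""
    e1 = abs(pred1 - new1[0])  # error against the earliest sample of batch 1
    e2 = abs(pred2 - new2[0])  # error against the earliest sample of batch 2
    delta_e = e2 - e1
    if e1 >= err_limit and delta_e > 0:
        eps = max(eps_min, eps * shrink)
    return eps, e1, e2, delta_e

# Hypothetical predictions and newly added readings.
eps, e1, e2, delta_e = adjust_epsilon(
    eps=0.5, pred1=10.0, new1=[11.0, 11.4], pred2=12.0, new2=[13.5],
    err_limit=0.8,
)
print(eps, e1, e2, delta_e)
```

When either condition fails, ε is returned unchanged, which matches the conditional wording of the rule.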

In the process of acquiring the newly added sample data, failure conditions such as network signal disconnection, server crash and the like may occur, so that the newly added sample data is not acquired in time, and the newly added data is accumulated. This results in a large amount of sample data contained in the first and second newly added sample data.

As a result, learning first newly added sample data and second newly added sample data that contain a large amount of sample data with the support vector machine regression algorithm takes a long time, so the target prediction result produced by the early warning data processing lags too far behind the actual situation, and the early warning effect cannot be achieved.

In addition, the newly added sample data in the first prediction period is generated later than the initial sample data. The first prediction result is obtained from the first feature vector, i.e., the reduced initial sample data, and the first newly added sample is the sample newly added in the first prediction period. Thus, when the first newly added sample includes a plurality of sample data, comparing the earliest-generated sample data in the first newly added sample with the first prediction result ensures that the sample data closest in time to the first prediction result is used for the comparison, which ensures the accuracy of the verification result.

In addition, the second newly added sample is data newly added in the second prediction period, and the second prediction period is later than the first prediction period. The second prediction result is a prediction result obtained from the first newly added sample data. Thus, when the second newly added sample includes a plurality of sample data, comparing the earliest-generated sample data in the second newly added sample with the second prediction result ensures that the sample data closest in time to the second prediction result is used for the comparison, which ensures the accuracy of the verification result.

In this embodiment, when the first newly added sample data or the second newly added sample data contains a large amount of sample data, the number of samples of the first newly added sample data or the second newly added sample data is reduced, so as to shorten the processing time of the data processing method for surface disaster early warning and improve the early warning efficiency.

Step 504, learning the first feature vector by adopting a support vector machine regression algorithm comprising the adjusted insensitive loss function to obtain a target prediction result.

Step 505, determining a disaster early warning result according to the magnitude relation between the target prediction result and a preset threshold value.

The preset threshold may include a plurality of thresholds, and the disaster early warning result may also include early warning levels corresponding to the preset thresholds one to one, for example: when the target prediction result is the soil pressure value of the refuse dump, obtaining a disaster early warning result of early warning level 1 under the condition that the soil pressure value is greater than a first preset threshold value; and when the soil pressure value is larger than a second preset threshold value, obtaining a disaster early warning result of early warning level 2, wherein the first preset threshold value is different from the second preset threshold value.

In the embodiment of the invention, the target prediction result is compared with the preset threshold value to determine whether the target prediction result represents the risk of disaster occurrence, so that a more visual disaster early warning result is obtained, and monitoring personnel can conveniently acquire the disaster early warning result.
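The mapping from a target prediction result to an early warning level in step 505 can be sketched as below. The function name `warning_level`, the convention that level 0 means no warning, and the threshold values are hypothetical; the text only requires that each preset threshold correspond to one early warning level.

```python
def warning_level(value, thresholds):
    """Return 0 for no warning, or the index (1-based) of the highest threshold exceeded."""
    level = 0
    for i, limit in enumerate(sorted(thresholds), start=1):
        if value > limit:
            level = i
    return level

# Hypothetical soil pressure thresholds for early warning levels 1 and 2.
soil_pressure_thresholds = [120.0, 150.0]
for predicted in (100.0, 130.0, 160.0):
    print(predicted, warning_level(predicted, soil_pressure_thresholds))
```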

Optionally, the data processing method for ground surface disaster early warning provided by the embodiment of the present invention may be applied to monitoring of a waste dump disaster, and specifically, the monitoring of the waste dump disaster may include the following steps:

acquiring a geographical monitoring map of a refuse dump;

displaying a disaster early warning interface, wherein the disaster early warning interface comprises a geographical distribution window and an early warning information window displayed on the geographical distribution window;

wherein the early warning information window displays the target prediction result obtained in the above embodiment of the data processing method for ground surface disaster early warning, the geographic distribution window displays the geographical monitoring map of the refuse dump, and monitoring point identifiers corresponding one-to-one to the installation positions of a plurality of monitoring sensors arranged in the refuse dump are displayed on the geographical monitoring map.

The geographical monitoring map can be any monitoring view of the refuse dump, such as a surveillance camera image, a GPS satellite image or an aerial image taken by an unmanned aerial vehicle.

The step of obtaining the geographical monitoring map of the refuse dump may be to obtain the real-time geographical monitoring map of the refuse dump, by web page transmission, from the server that stores the geographical monitoring map.

The above-mentioned early warning information window may also be referred to as a "latest early warning information" window, and the target prediction result and/or the disaster early warning result obtained in the above-mentioned data processing method for surface disaster early warning may be displayed in the window.

For example, the target prediction result may be a calculation result in the latest warning information window 602 as shown in fig. 6, and the disaster warning result may be a comprehensive warning level in the latest warning information window 602 as shown in fig. 6.

In addition, the plurality of monitoring sensors arranged in the refuse dump can be respectively used for detecting earth surface displacement information, internal displacement information, rainfall information, soil pressure information, soil water content information, pore water pressure information, temperature information, humidity information and the like of the refuse dump, and the detected data can be used for judging whether the refuse dump has risks of debris flow, collapse and the like.

In addition, the monitoring point identifier can adopt figures or characters with different appearances according to the type of the corresponding monitoring sensor so as to be convenient for identification.

When such a risk exists, the target prediction result indicating the risk, obtained by the data processing method for surface disaster early warning, can be displayed in the early warning information window, so that monitoring personnel can take timely measures to prevent the disaster.

In this embodiment, acquiring the geographic monitoring map of the refuse dump lets monitoring personnel check the on-site state of the refuse dump promptly and so detect disaster hazards in time. Displaying the geographic distribution window and the early warning information window simultaneously lets monitoring personnel view the early warning information and the site monitoring map together. Moreover, because the target prediction result obtained by the data processing method for surface disaster early warning has a short calculation time and high accuracy, the target prediction result displayed in the early warning information window likewise has a fast response and high accuracy.

Optionally, as shown in fig. 6, the geographic distribution window 601 is displayed on the bottom layer, the early warning information window 602 is displayed over the geographic distribution window 601, and a plurality of monitoring point identifiers are also displayed on the geographic distribution window 601. When a first operation on a target monitoring point identifier 604 is detected, an information window 603 of the target monitoring sensor corresponding to the target monitoring point identifier 604 is displayed.

The information window 603 of the target monitoring sensor is displayed on the geographic distribution window 601 and shows at least one of the following: the name, the model, the monitoring data, and the monitoring chart of the target monitoring sensor.

The monitoring data may be specific values detected by the target monitoring sensor, for example: a pressure value detected by the soil pressure monitoring sensor, and the like.
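
As a minimal sketch of how such an information window could be populated, the following assumes that each monitoring point identifier maps to one sensor record. The names `SensorInfo`, `info_window_text` and `SENSORS`, the model string "SP-01" and the reading are hypothetical and are not taken from the embodiment; the monitoring chart is omitted.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class SensorInfo:
    """Fields that the information window may show (hypothetical layout)."""
    name: str
    model: str
    latest_value: float
    unit: str


def info_window_text(sensors: Dict[int, SensorInfo], point_id: int) -> Optional[str]:
    """Return the text for the information window, or None for an unknown point."""
    sensor = sensors.get(point_id)
    if sensor is None:
        return None
    return f"{sensor.name} ({sensor.model}): {sensor.latest_value} {sensor.unit}"


# Illustrative registry: one monitoring point with a soil pressure sensor.
SENSORS = {604: SensorInfo("Soil pressure sensor", "SP-01", 35.2, "kPa")}

# Simulates the "first operation" (for example, a click) on monitoring point 604.
window_text = info_window_text(SENSORS, 604)
```

Returning None for an unknown identifier lets the caller decide whether to open a window at all.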

In this embodiment, monitoring personnel can conveniently check, at any time, the data of the monitoring sensor arranged at each monitoring point together with the on-site monitoring view. This helps them find a faulty monitoring sensor and eliminate the fault promptly, which ensures the accuracy of the method for processing the monitoring data of the refuse dump.

Referring to fig. 7, fig. 7 is a structural diagram of an early warning data processing system according to an embodiment of the present invention, where the system 700 includes:

a first selecting module 701, configured to select, by using a feature vector selection algorithm, a first feature vector from n initial sample data, where the first feature vector includes m sample data, where n is a positive integer, m is an integer smaller than n, and the sample data is monitoring data acquired by a sensor;

a first learning module 702, configured to learn the first feature vector and the first newly added sample data respectively by using a support vector machine regression algorithm including an insensitive loss function to obtain a first prediction result and a second prediction result, where the first newly added sample data is sample data newly added in a first prediction period;

an adjusting module 703, configured to adjust the insensitive loss function according to a difference between the first prediction result and the first newly added sample data and a difference between the second prediction result and a second newly added sample data, where the second newly added sample data is sample data newly added in a second prediction period, and the second prediction period is later than the first prediction period;

a second learning module 704, configured to learn the first feature vector by using a support vector machine regression algorithm that includes the adjusted insensitive loss function, so as to obtain a target prediction result.
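
This passage does not spell out the insensitive loss function. Assuming it is the standard ε-insensitive loss of support vector regression, errors no larger than ε cost nothing and larger errors are penalized linearly. The following is a minimal sketch of that loss only, not of the full regression algorithm; the function name and the sample readings are hypothetical.

```python
def epsilon_insensitive_loss(y_true: float, y_pred: float, epsilon: float) -> float:
    """Standard epsilon-insensitive loss: max(0, |y_true - y_pred| - epsilon)."""
    if epsilon < 0:
        raise ValueError("epsilon must be non-negative")
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Hypothetical displacement readings in millimetres.
# Error of about 0.3 lies inside the tube of width 0.5, so the loss is zero.
inside_tube = epsilon_insensitive_loss(10.0, 10.3, epsilon=0.5)
# Error of 1.0 lies outside the tube, so only the excess over epsilon is penalized.
outside_tube = epsilon_insensitive_loss(10.0, 11.0, epsilon=0.5)
```

A smaller ε makes the regression react to smaller deviations, which is why the adjusting module changes this quantity in response to prediction errors.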

Optionally, the system further includes:

a second selecting module, configured to select a second feature vector from the first newly added sample data by using the feature vector selection algorithm when the number of sample data included in the first newly added sample data is greater than a preset value, where the number of sample data included in the second feature vector is smaller than the number of sample data included in the first newly added sample data;

the first learning module 702 is further configured to perform online learning on the second feature vector by using a support vector machine regression algorithm including an insensitive loss function, so as to obtain a second prediction result.
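
The gating condition of the second selecting module can be sketched as follows. The feature vector selection algorithm is not reproduced in this passage, so it is passed in as a function, and `every_other` is only a placeholder stand-in, not the algorithm of the embodiment. The names `prepare_online_batch` and `preset_count` are likewise hypothetical.

```python
from typing import Callable, List, Sequence

Selector = Callable[[Sequence[float]], List[float]]


def prepare_online_batch(new_samples: Sequence[float],
                         preset_count: int,
                         select_features: Selector) -> List[float]:
    """Reduce the newly added samples only when there are more than preset_count."""
    if len(new_samples) > preset_count:
        selected = select_features(new_samples)
        # The second feature vector must hold fewer samples than its input.
        if len(selected) >= len(new_samples):
            raise ValueError("selection must return fewer samples than it was given")
        return selected
    return list(new_samples)


def every_other(samples: Sequence[float]) -> List[float]:
    """Placeholder selector (assumption): keeps every second sample."""
    return list(samples[::2])


large_batch = prepare_online_batch([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 4, every_other)
small_batch = prepare_online_batch([1.0, 2.0, 3.0], 4, every_other)
```

The returned batch is what the first learning module would then learn online.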

Optionally, the adjusting module 703 includes:

a first obtaining unit, configured to obtain an absolute error E1 between the first prediction result and sample data that is generated first in the first newly added sample;

a second obtaining unit, configured to obtain an absolute error E2 between the second prediction result and sample data that is generated first in the second newly added sample;

an adjusting unit, configured to adjust the insensitive loss function according to the difference ΔE between E1 and E2, where the insensitive loss function is reduced if E1 is greater than or equal to a preset error value and ΔE indicates that E2 is greater than E1.
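
The reduction rule of the adjusting unit can be sketched as below. The passage states only the condition under which the insensitive loss function is reduced, so this sketch assumes that the reduction acts on the parameter ε, that ε is multiplied by a factor `shrink` (a hypothetical value of 0.9), and that ε stays unchanged in every other case.

```python
def adjust_epsilon(epsilon: float, e1: float, e2: float,
                   preset_error: float, shrink: float = 0.9) -> float:
    """Reduce epsilon when E1 >= preset_error and E2 > E1; otherwise keep it.

    e1 and e2 are the absolute errors of the first and second prediction results.
    shrink is an assumed reduction factor and is not given in the text.
    """
    if not 0.0 < shrink < 1.0:
        raise ValueError("shrink must lie strictly between 0 and 1")
    delta_e = e2 - e1  # positive when the error grew from period 1 to period 2
    if e1 >= preset_error and delta_e > 0:
        return epsilon * shrink
    return epsilon


reduced = adjust_epsilon(0.5, e1=0.4, e2=0.6, preset_error=0.3)
below_preset = adjust_epsilon(0.5, e1=0.1, e2=0.6, preset_error=0.3)
error_shrinking = adjust_epsilon(0.5, e1=0.4, e2=0.2, preset_error=0.3)
```

Only the first call meets both conditions, so only that call returns a smaller ε.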

Optionally, the system further includes:

a determining module, configured to determine a disaster early warning result according to the magnitude relationship between the target prediction result and a preset threshold.
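
A minimal sketch of the determining module follows. It assumes that a target prediction result at or above the preset threshold triggers a warning; the direction of the comparison, the labels "warning" and "normal", and the numeric values are assumptions made for illustration.

```python
def disaster_warning(target_prediction: float, threshold: float) -> str:
    """Compare the target prediction result with the preset threshold."""
    # Assumption: reaching or exceeding the threshold is treated as a warning.
    return "warning" if target_prediction >= threshold else "normal"


# Hypothetical predicted displacement (mm) against a hypothetical threshold of 30 mm.
high_result = disaster_warning(32.5, threshold=30.0)
low_result = disaster_warning(12.0, threshold=30.0)
```

A graded scheme, such as the comprehensive warning level shown in the early warning information window, could be built the same way by comparing against several thresholds in ascending order.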

The early warning data processing system provided by the embodiment of the present invention can implement each step of the above data processing method for surface disaster early warning and achieve the same beneficial effects; to avoid repetition, the details are not described again here.

In the several embodiments provided in the present application, it should be understood that the disclosed method and system may be implemented in other ways. The system embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in another form.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit can be implemented in the form of hardware, or in the form of hardware plus a software functional unit.

The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions that enable a computer device (which may be a personal computer, a server, or a network device) to execute some of the steps of the methods according to the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the appended claims.

Claims

1. A data processing method for ground disaster early warning is characterized by comprising the following steps:

selecting a first feature vector from n initial sample data by adopting a feature vector selection algorithm, wherein the first feature vector comprises m sample data, n is a positive integer, m is an integer smaller than n, the sample data is monitoring data acquired by a sensor, and the monitoring data is earth surface displacement information, internal displacement information, rainfall information, soil pressure information, soil water content information, pore water pressure information, temperature information and humidity information of a refuse dump;

learning the first feature vector by adopting a support vector machine regression algorithm comprising the adjusted insensitive loss function to obtain a target prediction result;

determining a disaster early warning result of the refuse dump according to the size relation between the target prediction result and a preset threshold value;

wherein the step of adjusting the insensitive loss function according to the difference between the first prediction result and the sample data generated first in the first newly added sample data and the difference between the second prediction result and the second newly added sample data comprises:

2. The method of claim 1, wherein prior to the step of learning the first newly added sample data using a support vector machine regression algorithm including an insensitive loss function, the method further comprises:

3. An early warning data processing system, the system comprising:

a first selection module, configured to select a first feature vector from n initial sample data by adopting a feature vector selection algorithm, wherein the first feature vector comprises m sample data, n is a positive integer, m is an integer smaller than n, the sample data is monitoring data acquired by a sensor, and the monitoring data is earth surface displacement information, internal displacement information, rainfall information, soil pressure information, soil water content information, pore water pressure information, temperature information and humidity information of a refuse dump;

a second learning module, configured to learn the first feature vector by adopting a support vector machine regression algorithm comprising the adjusted insensitive loss function to obtain a target prediction result;

a determining module, configured to obtain a disaster early warning result of the refuse dump according to the magnitude relationship between the target prediction result and a preset threshold value;

wherein the adjustment module comprises:

4. The system of claim 3, further comprising:

the first learning module is further configured to perform online learning on the second feature vector by adopting a support vector machine regression algorithm including an insensitive loss function to obtain a second prediction result.