CN112950047A - Progressive identification method for suspected contaminated site - Google Patents

Progressive identification method for suspected contaminated site

Info

Publication number
CN112950047A
CN112950047A CN202110290595.9A CN202110290595A CN112950047A CN 112950047 A CN112950047 A CN 112950047A CN 202110290595 A CN202110290595 A CN 202110290595A CN 112950047 A CN112950047 A CN 112950047A
Authority
CN
China
Prior art keywords
data
pollution
technology
land
suspected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110290595.9A
Other languages
Chinese (zh)
Inventor
周睿
杨典华
展明旭
王彩云
朱云翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingshi Tianqi Beijing Technology Co ltd
Original Assignee
Jingshi Tianqi Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingshi Tianqi Beijing Technology Co ltd filed Critical Jingshi Tianqi Beijing Technology Co ltd
Priority to CN202110290595.9A priority Critical patent/CN112950047A/en
Publication of CN112950047A publication Critical patent/CN112950047A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393Score-carding, benchmarking or key performance indicator [KPI] analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/951Indexing; Web crawling techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0635Risk analysis of enterprise or organisation activities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/26Government or public services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/13Satellite images

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Data Mining & Analysis (AREA)
  • Tourism & Hospitality (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • General Engineering & Computer Science (AREA)
  • General Business, Economics & Management (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Game Theory and Decision Science (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Operations Research (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Remote Sensing (AREA)
  • Primary Health Care (AREA)
  • Astronomy & Astrophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a progressive identification method for suspected polluted sites. Based on characteristics of current industrial land in China such as basic attributes, surface texture and spatial distribution, the method combines big data technology, remote sensing technology and a pollution potential model developed by the company to progressively identify suspected polluted sites among industrial land parcels, so that suspected polluted sites can be rapidly screened and distinguished on a large scale. The invention improves on the traditional approach in China, which relies on on-site investigation and sampling, and greatly reduces the high cost of investigating and sampling suspected polluted sites on site. The suspected polluted sites identified by the method can be used to populate a spatio-temporal database of suspected polluted sites and provide a useful supplement that helps management and decision-making departments keep a full list of local suspected polluted sites.

Description

Progressive identification method for suspected contaminated site
Technical Field
The invention relates to the field of data identification, in particular to a method for gradually identifying a suspected polluted site.
Background
With social and economic development, the amount of construction land in China keeps increasing; potential pollution risk shows a trend of wide distribution and growth, and its harm to ecological safety and to people's quality of life is increasingly evident. In addition, the traditional method for identifying polluted sites relies on on-site investigation and sampling, which is time-consuming and labor-intensive and seriously reduces decision-making efficiency.
With the advent of the big data era, data has become a fundamental national strategic resource with an important influence on production, consumption and national governance capacity. Big data technology is increasingly widely applied in energy, education, scientific research, manufacturing, finance, e-government, enterprise operation management, information management and other fields, and brings new opportunities to the environmental field. It can store and process massive amounts of information, which greatly expands the sources and types of data, and it has clear advantages in addressing complex pollution problems through modern techniques such as data mining, artificial intelligence, simulation and correlation analysis. The network holds a large amount of historical data, pollution incident news and statistical data about industrial sites, and government departments publish data on polluting enterprises, dynamically updated monitoring data, remote sensing data, and data on hydrology and meteorology, land use, soil types and so on. Integrating this massive information can provide a data basis for the progressive identification of suspected polluted sites. Remote sensing is an emerging technology that is now deeply applied in human work and life. It plays an increasingly important role in fields such as agriculture, forestry, geology, geography, oceanography, hydrology, meteorology, surveying and mapping, environmental protection and military reconnaissance; it provides a new way for people to understand the land, develop resources, monitor the environment, study disasters and protect the environment; and it provides important information for addressing serious challenges such as resource shortage, environmental deterioration, rapid population growth and frequent disasters.
The spatial resolution of remote sensing data has improved from the kilometer level to the sub-meter level, the revisit interval has shortened from months to hours, the spectral resolution has advanced from multi-band to hyperspectral, and remote sensing data acquisition is moving toward real-time, accurate operation. Remote-sensing-based identification of suspected polluted sites has developed rapidly, from traditional visual interpretation and pixel-based automatic interpretation to emerging object-oriented classification, intelligent expert systems and deep learning.
On the basis of big data technology and remote sensing technology, simulation with a pollution potential model to analyze and calculate the pollution potential of industrial land can provide technical support for the progressive identification of suspected polluted sites.
The present method aims to solve the current problems that the identification of suspected polluted sites depends heavily on on-site investigation and sampling and is therefore time-consuming and labor-intensive.
Disclosure of Invention
To overcome the shortcomings of the prior art described above, the invention provides a progressive identification method for suspected polluted sites.
To solve the above technical problems, the invention adopts the following technical scheme: a progressive identification method for suspected polluted sites, comprising the following steps:
step one, acquiring multi-source heterogeneous data related to industrial land that is publicly available on the network by using big data technology, and processing and fusing the multi-source heterogeneous data;
step two, screening the industrial land by combining remote sensing technology, segmenting the original image data by using object-oriented recognition technology, and automatically learning effective features from a training set by using deep learning technology;
step three, obtaining industrial land indexes based on the big data technology and the remote sensing technology, inputting them into a pollution potential model, calculating the pollution potential value of each individual industrial land parcel, and screening out suspected polluted sites;
step four, importing the calculated pollution potential values of the industrial land into an industrial land information database, and verifying the results to confirm the locations of polluted sites.
Further, the big data technology comprises a big data acquisition technology and a big data processing technology; the big data acquisition technology, together with web crawler technology, is used to crawl the publicly available multi-source heterogeneous data related to industrial land, and the big data processing technology is used to process and fuse the multi-source heterogeneous data.
Further, the means of the data acquisition technology in step two comprise satellite remote sensing, sensors, radio frequency identification, the Internet of Things and mobile platforms.
Further, the processing and fusion of the multi-source heterogeneous data by the big data processing technology comprises the following steps: data storage, data preprocessing, deep data processing and data mining.
Further, in step two the crawled industrial land is screened by combining remote sensing technology and a remote sensing image sample library is established; object-oriented recognition technology is applied and multi-scale image segmentation is used to segment the original image data, realizing the transition from pixel-based to object-based remote sensing image classification; the deep learning technology simulates the way the human brain processes data through network learning: a pre-trained deep learning model is used to construct the basic convolution and pooling layers of the network, the model is then trained with the samples to extract the features of the various types of sensitive land, and the model is then optimized with an accuracy evaluation function to obtain the final deep learning model.
Further, the data storage is extended and encapsulated on the basis of Hadoop technology, and the multi-source heterogeneous data crawled from the network are saved as unified local data files and stored in a structured manner. The data preprocessing cleans the multi-source heterogeneous data and converts them into a single structure or a structure that is convenient to process; the data cleaning comprises missing value processing, noise data processing and inconsistent data processing; missing values are filled with a global constant, the attribute mean or the most probable value, or the record is simply ignored; noise is removed by binning, clustering, combined computer and manual inspection, and regression; inconsistent data are corrected manually; after data cleaning, all data are integrated into one data list that takes the enterprise land parcel as its object. The deep data processing comprises machine learning, intelligent algorithms, statistical analysis and system modeling; it combines the cleaned data list of enterprise land with geographic information big data to obtain the geographic attribute features of each individual enterprise land parcel, enriching the content of the list. The data mining adopts meta-analysis, data-driven mining methods and process-mechanism-based model-data fusion methods to integrate and mine the big data and obtain scientific, integrated and effective information; using the enriched data list of enterprise land, and according to the operating or shutdown status, current scale, and production and pollution discharge information of each individual enterprise land parcel, correlation analysis, regression analysis, cluster analysis and principal component analysis are used to analyze the driving factors behind the formation of individual industrial land parcels, to derive the spatial distribution hot spots of all industrial land parcels in the area, and to analyze the formation mechanism of industrial land from point to area across hierarchical scales.
Further, the pollution potential model in step three comprises model construction, an index system, a quantification standard and summary value assignment.
Further, the index system follows the basic pollution diffusion pathway of pollution source, transmission path and receptor, and is constructed with a pollution potential characteristic dimension, a transmission path dimension and a receptor risk dimension; the pollution potential characteristic dimension comprises enterprise scale, industry category, years in production, registered capital, pollutant types, whether the enterprise is a national pollution source, whether it is a full-scope heavy metal industry enterprise, whether it holds a hazardous waste operation license, whether it is listed in the local land soil pollution risk management, control and remediation directory, whether it holds a pollution discharge license, whether it has abnormal operation records, whether the media have reported pollution incidents, and administrative and environmental protection penalties; the transmission path dimension comprises terrain slope, soil texture, soil pH value, soil medium, groundwater burial depth, in-water transfer coefficient and net infiltration; the receptor risk dimension comprises population density, whether there are towns within 1 km of the boundary, whether there are water source protection areas, whether there are ecological red line areas, whether there are basic farmland protection areas, and whether there are sensitive receptors.
Further, the quantification standard estimates the raw score of each industry category across the pollution potential characteristic dimension, the transmission path dimension and the receptor risk dimension. For the pollution potential characteristic dimension, the score is based first on objective quantified data for the industry category, and then on the average amount of characteristic pollutants released to the environment by local industry categories with high pollution potential and on the professional judgment of experts; starting from the industry-category baseline, the score is further adjusted by adjustment factors for the enterprise's process flow, its potentially polluting facilities and the current state of the plant site, so as to reflect the actual pollution risk potential of the target construction land, and it is also judged according to scale, years in operation and the coefficient for changes of the operating entity. For the transmission path dimension, the risk is determined by how fast pollutants pass through the environmental medium or how long they reside in the transmission path, and the vulnerability of the groundwater environment to pollution is evaluated.
Further, in the summary value assignment, the data obtained through the big data technology and the remote sensing technology are assigned values one by one according to the weights in the index system, the pollution potential score of each industrial land parcel is obtained by weighted calculation, and on-site investigation is carried out on suspected polluted sites with a high risk level to determine whether they are polluted sites.
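The weighted calculation described above can be sketched as follows. This is only an illustration: the `Indicator` class, the weights, the 0-100 score scale and the risk-level cut-offs are hypothetical and are not taken from the patent, which does not publish its weights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indicator:
    name: str
    dimension: str  # "source", "pathway" or "receptor" (hypothetical labels)
    weight: float   # hypothetical weight within the index system
    score: float    # quantified score on an assumed 0-100 scale


def pollution_potential(indicators):
    """Return the weighted average of the indicator scores."""
    total_weight = sum(i.weight for i in indicators)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(i.weight * i.score for i in indicators) / total_weight


def risk_level(value, high=70.0, medium=40.0):
    """Map a score to a risk level; the cut-offs are illustrative only."""
    if value >= high:
        return "high"
    if value >= medium:
        return "medium"
    return "low"


# One made-up parcel with one indicator per dimension.
parcel = [
    Indicator("industry category", "source", 4.0, 90.0),
    Indicator("groundwater burial depth", "pathway", 3.0, 60.0),
    Indicator("population density", "receptor", 3.0, 50.0),
]
value = pollution_potential(parcel)
print(value, risk_level(value))  # 69.0 medium
```

Parcels whose level comes out as "high" would then be the candidates for on-site investigation.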
The invention discloses a progressive identification method for suspected polluted sites. Based on characteristics of current industrial land in China such as basic attributes, surface texture and spatial distribution, it uses big data, high-resolution remote sensing images and a pollution potential model to progressively identify suspected polluted sites, so that suspected polluted sites can be rapidly screened and distinguished on a large scale. First, the method improves on the traditional approach in China, which relies on on-site sampling and monitoring, and greatly reduces the cost of obtaining information on suspected polluted sites. Second, the suspected polluted sites identified by the method can be used to populate a spatio-temporal database of suspected polluted sites and provide a useful supplement that helps management and decision-making departments keep a full list of local suspected polluted sites.
Detailed Description
The present invention will be described in further detail with reference to specific embodiments.
A progressive identification method for suspected polluted sites proceeds as follows. First, data are acquired by using big data acquisition technology: a web crawler crawls publicly available multi-source heterogeneous data on industrial land from the network, and big data processing technology then processes and fuses the data. The data acquisition technology collects structured, semi-structured and unstructured data of many types through satellite remote sensing, sensors, radio frequency identification, the Internet of Things and mobile platforms. A web crawler is a program or script that automatically captures World Wide Web information according to certain rules: the crawler is given an initial URL, extracts and saves the required resources from that web page, and at the same time extracts the other links found on the page; it then sends requests to those links, receives the responses, parses the returned pages, and again extracts and saves the required resources. The big data processing technology processes and fuses the multi-source heterogeneous data in combination with hydrometeorological, land use, and geological and soil conditions; its main function is to analyze and process data to obtain useful information. Based on the storage, processing, analysis and integration characteristics of big data technology, the big data processing flow is divided into four steps: data storage, data preprocessing, deep data processing and data mining.
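The crawl loop described above (start from an initial URL, save what is needed, follow the extracted links) can be sketched with the Python standard library. This is a minimal sketch under stated assumptions: the keyword filter, the `max_pages` limit and the in-memory `PAGES` dictionary are hypothetical, and the `fetch` function is injected so that the example runs without network access; in real use it might wrap `urllib.request.urlopen`.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collect the href values of all <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.links.append(value)


def crawl(seed_url, fetch, keyword, max_pages=50):
    """Breadth-first crawl starting at seed_url.

    fetch(url) returns the page HTML or None (assumed interface).
    Returns {url: html} for the pages whose text contains keyword.
    """
    queue = deque([seed_url])
    seen = {seed_url}
    saved = {}
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        html = fetch(url)
        visited += 1
        if html is None:
            continue
        if keyword in html:
            saved[url] = html  # "extract and save the required resource"
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return saved


# A tiny in-memory "web" standing in for real pages (made-up content).
PAGES = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": 'industrial land notice <a href="/">home</a>',
    "http://example.test/b": "unrelated page",
}
result = crawl("http://example.test/", PAGES.get, "industrial land")
print(sorted(result))
```

The `seen` set keeps the crawler from revisiting a page when links form a cycle, as the link back to the home page does here.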
Data storage is the basis of big data processing; it offers good compatibility, scalability, advancement, stability and security, and can meet different requirements under complex conditions. The data storage is extended and encapsulated on the basis of Hadoop technology and can meet the processing requirements of unstructured and semi-structured data, complex ETL flows, and complex data mining and calculation models; the multi-source heterogeneous data crawled from the network are saved as unified local data files and stored in a structured manner.
Because the data sources are complex in structure and varied in type, the collected raw data need to be cleaned, filled, smoothed, merged, normalized and checked for consistency, so that disordered data are converted into a relatively uniform form that is convenient to process, laying a foundation for later data analysis. Data preprocessing uses tools such as DataStage, DataFlux and Informatica PowerCenter to clean the source data and convert them into a single structure or a structure that is convenient to process. Data cleaning comprises missing value processing, noise data processing and inconsistent data processing: missing values are filled with a global constant, the attribute mean or the most probable value, or the record is simply ignored; noise is removed by binning, clustering, combined computer and manual inspection, and regression; inconsistent data are corrected manually. After data cleaning, all data are integrated into one data list that takes the enterprise land parcel as its object.
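Two of the cleaning options named above, filling missing values with the attribute mean and removing noise by binning, can be illustrated as follows. This is a sketch only: the function names, the equal-frequency binning variant and the sample values are assumptions, not details given in the patent.

```python
from statistics import mean


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to compute a mean from")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def smooth_by_bin_means(values, bin_size):
    """Smooth noise by equal-frequency binning.

    Values are sorted, split into consecutive bins of bin_size items,
    and each value is replaced by the mean of its bin. The result is
    returned in the original order.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    smoothed = [0.0] * len(values)
    for start in range(0, len(order), bin_size):
        idx = order[start:start + bin_size]
        bin_mean = mean(values[i] for i in idx)
        for i in idx:
            smoothed[i] = bin_mean
    return smoothed


# Made-up attribute column with two missing entries.
raw = [4.0, None, 8.0, 15.0, 21.0, None, 24.0]
filled = fill_missing_with_mean(raw)
print(filled)
print(smooth_by_bin_means(filled, bin_size=3))
```

Mean filling keeps every record at the cost of shrinking the variance of the attribute; ignoring the record, the other option listed in the text, avoids that but loses data.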
After preprocessing, the data are processed further. Deep data processing comprises machine learning, intelligent algorithms, statistical analysis and system modeling; it combines the cleaned data list of enterprise land with geographic information big data, such as POI hot spots, topography and landforms, and soil geology, to obtain the geographic attribute features of each individual enterprise land parcel and enrich the content of the list.
After deep processing, data mining concentrates, extracts and refines the information hidden in the large volume of disordered data in order to find potentially useful information and the internal regularities of the object under study. Meta-analysis, data-driven mining methods and process-mechanism-based model-data fusion methods are used to integrate and mine the big data and obtain scientific, integrated and effective information. In this step, using the enriched data list of enterprise land and according to information such as the operating or shutdown status, current scale, and production and pollution discharge of each individual enterprise land parcel, methods such as correlation analysis, regression analysis, cluster analysis and principal component analysis are used to analyze the driving factors behind the formation of individual enterprise land parcels, to derive the spatial distribution hot spots of all enterprise land in the area, and to analyze the formation mechanism of enterprise land from point to area across hierarchical scales.
Second, the crawled industrial land is screened by combining remote sensing technology. In selecting remote sensing images, because China covers a vast territory with pronounced regional differences, the coverage of different satellite sensors and the climate differences between the south and the north need to be considered, mainly factors such as vegetation, snow cover, rainfall and cloud cover. In southern areas such as Jiangsu province and Yunnan province, summers are long with frequent rain and cloud, so good-quality images are difficult to obtain, and image data are generally selected from November to March of the following year. In northern areas such as Beijing and the Inner Mongolia Autonomous Region, weather has less influence and remote sensing data from the whole year can be selected, but the influence of vegetation on buildings and of snow cover on all sensitive land must be considered, so a period when the snow has melted and vegetation has not yet covered the ground is preferred, such as the end of February to early May each year. Taking Gejiu City in Yunnan province as an example study area, a check of the land observation satellite data service platform found the following: the PMS sensor of the GF-1 satellite has no images covering Gejiu City in 2018 or 2019; the PMS sensor of the GF-2 satellite has no images covering Gejiu City in 2018 and acquired 31 scenes covering it from 13 January 2019 to 5 June 2019, of which 11 scenes are of good quality, mainly concentrated in January and February; the PMS sensor of the GF-6 satellite acquired 16 scenes covering Gejiu City from 4 July 2018 to 3 May 2019, of which 5 scenes are of good quality with little cloud cover, acquired from the end of November 2018 to the end of February 2019; the MUX sensor of the ZY3 satellite acquired 22 scenes covering Gejiu City from 4 January 2018 to 1 May 2019, of which 10 scenes are of good quality, mainly acquired in early March. Therefore, for Gejiu City in Yunnan province, as of June 2019, GF-6 satellite remote sensing images with time phases from November 2018 to March 2019 are the more suitable choice for establishing the remote sensing image sample library.
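The seasonal selection rule above can be expressed as a simple filter over scene metadata. The sketch below is illustrative only: the `Scene` record, the cloud cover values and the 10% cloud threshold are made up for the example, and a real catalogue would expose its own metadata fields.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Scene:
    satellite: str
    acquired: date
    cloud_cover: float  # percent; hypothetical metadata field


def in_season(day, start_month, end_month):
    """True if the month of day lies in the window.

    The window may wrap over New Year, e.g. November (11) to March (3).
    """
    if start_month <= end_month:
        return start_month <= day.month <= end_month
    return day.month >= start_month or day.month <= end_month


def select_scenes(scenes, start_month, end_month, max_cloud=10.0):
    """Keep scenes inside the seasonal window with low cloud cover."""
    return [
        s for s in scenes
        if in_season(s.acquired, start_month, end_month)
        and s.cloud_cover <= max_cloud
    ]


# Made-up scene records; only the satellite name comes from the text.
scenes = [
    Scene("GF-6", date(2018, 7, 4), 65.0),
    Scene("GF-6", date(2018, 11, 28), 3.0),
    Scene("GF-6", date(2019, 2, 25), 8.0),
    Scene("GF-6", date(2019, 5, 3), 40.0),
]
southern = select_scenes(scenes, start_month=11, end_month=3)
print([s.acquired.isoformat() for s in southern])
```

For a northern study area the same function could be called with `start_month=2, end_month=5` to approximate the end-of-February to early-May window.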
Taking petrochemical enterprises as an example, samples are established mainly by building the correspondence between the various ground objects and their satellite image characteristics according to the tone, color, shape, shadow, texture, size, spatial position, pattern and so on of the remote sensing image. Photographs of the various types of sensitive land are obtained through manual field survey; the spatial positions of industrial land are obtained by visual interpretation using the field survey and POI data; the image data are labeled to obtain samples of industrial land on the remote sensing images; and a remote sensing interpretation sample feature library of industrial land is established by extracting the spectral features, geometric structure, texture features and other user-defined features of the samples. 70% of the samples in the sample library are randomly drawn as the training set for the deep learning study of suspected polluted sites, and the remaining 30% are used as the test set to evaluate the effectiveness of the model.
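The random 70%/30% division of the sample library can be sketched as below. The function name, the fixed random seed and the sample identifiers are assumptions made for the illustration.

```python
import random


def split_samples(samples, train_fraction=0.7, seed=0):
    """Randomly split samples into a training set and a test set.

    A fixed seed (an assumption here) makes the split reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical sample identifiers standing in for labelled image samples.
sample_ids = [f"sample_{i:03d}" for i in range(100)]
train, test = split_samples(sample_ids)
print(len(train), len(test))  # 70 30
```

Because the two sets are slices of one shuffled list, no sample can appear in both, which keeps the test set independent of training.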
An object-oriented identification technology is applied to segment the original image data by multi-scale image segmentation, moving remote sensing image classification from a pixel basis to an object basis, and a deep learning technology automatically learns effective features from a training set. The object-oriented identification technology is an object-based classification method: on the basis of image segmentation, it processes the image using its spectral features, texture features, shape features, spatial geometric attributes and semantic information, taking the image objects produced by segmentation as the objects of study. Image segmentation is the process of generating image objects, that is, an image segmentation algorithm divides the image into a number of objects that are homogeneous, spatially contiguous and have a specific thematic meaning. Multi-scale image segmentation is a region-merging procedure based on the minimum-heterogeneity principle, combining spectral and shape features: the difference criterion is calculated for each pair of spatially adjacent regions, the minimum difference criterion is taken as the threshold, and two adjacent regions are merged if their difference criterion value equals the threshold. The deep learning technology imitates, through network learning, the way the human brain processes data: a deep network captures the essential features of the data, and high-level features are expressed by combining low-level features. A pre-trained deep learning model is used to construct the basic convolution and pooling layers of the network, the model is then trained with samples to extract the features of the various sensitive land uses, and finally the model is optimized with an accuracy evaluation function to obtain the final deep learning model.
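The region-merging rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: each region is summarized by a mean spectral value and a pixel count, the difference criterion is taken to be the size-weighted squared difference of the means, and the shape term mentioned in the text is omitted. The names `merge_cost` and `merge_once` are hypothetical and do not come from the patent.

```python
def merge_cost(a, b):
    """Assumed difference criterion: increase in squared deviation if a and b merge.

    Each region is a tuple (mean_value, pixel_count).
    """
    (mean_a, n_a), (mean_b, n_b) = a, b
    return n_a * n_b / (n_a + n_b) * (mean_a - mean_b) ** 2


def merge_once(regions, adjacency):
    """Merge the adjacent pair whose difference criterion equals the minimum.

    regions: {region_id: (mean_value, pixel_count)}
    adjacency: set of (i, j) pairs with i < j for spatially adjacent regions
    """
    costs = {pair: merge_cost(regions[pair[0]], regions[pair[1]]) for pair in adjacency}
    threshold = min(costs.values())  # the minimum difference criterion is the threshold
    keep, drop = min(pair for pair, cost in costs.items() if cost == threshold)

    (mean_k, n_k), (mean_d, n_d) = regions[keep], regions[drop]
    merged = dict(regions)
    merged[keep] = ((mean_k * n_k + mean_d * n_d) / (n_k + n_d), n_k + n_d)
    del merged[drop]

    # Re-point neighbours of the dropped region to the surviving region.
    new_adjacency = set()
    for i, j in adjacency:
        i = keep if i == drop else i
        j = keep if j == drop else j
        if i != j:
            new_adjacency.add((min(i, j), max(i, j)))
    return merged, new_adjacency


# Hypothetical example: regions 1 and 2 are spectrally similar, region 3 is not.
regions = {1: (10.0, 4), 2: (11.0, 4), 3: (30.0, 8)}
adjacency = {(1, 2), (2, 3)}
regions, adjacency = merge_once(regions, adjacency)
```

Calling `merge_once` repeatedly until the minimum criterion exceeds a chosen scale parameter would give a multi-scale hierarchy; the stopping rule is not specified in the text and is left out here.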
Next, the industrial land indices obtained through the big data technology and the remote sensing technology are input into a pollution potential model, which calculates the pollution potential value of the industrial land, so that suspected contaminated sites are identified progressively. The pollution potential model comprises model construction, an index system, a quantification standard and summary scoring.
The pollution potential model is both the core and the main difficulty of this technology. The following materials were consulted at the initial stage of model construction: (1) relevant domestic technical specifications: the technical guidelines for soil pollution risk assessment of construction land, the technical guidelines for investigation of soil pollution status of construction land, and the like; (2) relevant domestic methods: the "Guidelines for Risk Assessment of Environmental Pollution Liability Insurance" of the Policy Research Center for Environment and Economy of the Ministry of Ecology and Environment; (3) relevant international experience: analysis of the annual releases to groundwater and soil and the Human Toxicity Potential (HTP) of enterprises in each industry category in the United States federal Toxics Release Inventory (TRI) database; classification and screening methods used in soil and groundwater pollution potential evaluation and survey planning for international factories; the EU environmental liability directive and related insurance-based sustainable development solutions, conceptual models of pollution caused by occupational activities, and the like; and the environmental damage liability insurance risk assessment practice of international reinsurers.
The index system follows the basic pollution diffusion chain of pollution source, transmission path and receptor, and comprises a pollution potential characteristic dimension, a transmission path dimension and a receptor risk dimension. The pollution potential characteristic dimension includes enterprise scale, industry category, years in production, registered capital, pollutant types, whether the enterprise is a nationally controlled pollution source, whether it is a full-scope heavy metal industry enterprise, whether it holds a hazardous waste operation license, whether it is listed in the local directory for land soil pollution risk management, control and remediation, whether it holds a pollutant discharge permit, whether it has abnormal operation records, whether pollution events have been reported by the media, and administrative and environmental protection penalties. The transmission path dimension includes terrain slope, soil texture, soil pH, soil medium, groundwater burial depth, hydraulic conductivity and net infiltration. The receptor risk dimension includes population density, whether there are towns within 1 km of the boundary, whether there are water source protection areas, whether there are ecological red line areas, whether there are basic farmland protection areas, and whether there are sensitive receptors.
The quantification standard estimates a representative original score for each industry category across the pollution potential characteristic dimension, the transmission path dimension and the receptor risk dimension. The pollution potential characteristic dimension is scored from objectively quantified data for the industry category, then from the average release of characteristic pollutants to the environment by local high-pollution-potential industry categories, combined with the professional judgment of experts; see Table 1.
TABLE 1
[Table supplied as images in the original publication: BDA0002981906050000091, BDA0002981906050000101]
Adjustment by industry category: the baseline of each industry category is further adjusted using adjustment factors for the enterprise's process flow, potentially polluting facilities and the current condition of the plant floor, so as to reflect the actual pollution risk potential of the target construction land; see Table 2.
TABLE 2
[Table supplied as an image in the original publication: BDA0002981906050000102]
High-pollution-potential processes mainly include electroplating, smelting, pyrolysis, casting, active resin production, and metal degreasing with water-soluble or oil-soluble detergents.
High-pollution-potential facilities include storage tanks for heavy-metal-containing solids, diesel tanks, heating fuel tanks, mineral oil tanks, storage facilities for hazardous solid waste, strong acid tanks, refueling facilities and the like.
The adjusted total score is bounded: it is no lower than -2 and no higher than X + 8, where X is the original representative score of the industry category.
Adjustment by scale, operating years and change of operating entity: based on the acquired data, enterprise scale is divided into five classes, including large, medium, small and micro. Each combination of scale and operating-years grade yields S2-1 (the preliminary scale and operating-years coefficient), which is multiplied by S2-2 (the adjustment coefficient for the risk impact of a change of operating entity) to obtain S2; see Tables 3 and 4.
TABLE 3
[Table supplied as images in the original publication: BDA0002981906050000103, BDA0002981906050000111]
TABLE 4
[Table supplied as an image in the original publication: BDA0002981906050000112]
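The relation S2 = S2-1 × S2-2 can be illustrated as a lookup followed by a multiplication. Because Tables 3 and 4 are not available as text, every coefficient and grade label below is a placeholder chosen for illustration, and the names `S2_1_TABLE` and `s2_coefficient` are hypothetical.

```python
# Placeholder coefficients standing in for Table 3; NOT values from the source.
# Keys are (enterprise scale, operating-years grade); grade labels are assumed.
S2_1_TABLE = {
    ("large", "over_20_years"): 1.5,
    ("medium", "10_to_20_years"): 1.2,
    ("micro", "under_10_years"): 0.8,
}


def s2_coefficient(scale, years_grade, s2_2):
    """Return S2 = S2-1 * S2-2.

    s2_2 is the operating-entity change adjustment coefficient (Table 4 in the
    source); the caller supplies it because its values are not reproduced here.
    """
    s2_1 = S2_1_TABLE[(scale, years_grade)]
    return s2_1 * s2_2


# Hypothetical example: a large, long-running enterprise with an assumed S2-2 of 1.1.
example_s2 = s2_coefficient("large", "over_20_years", 1.1)
```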
Transmission path dimension: environmental transmission pathway analysis determines the risk posed by pollutants from how fast they pass through the environmental medium or how long they reside in the transmission pathway. It draws on DRASTIC, the groundwater pollution potential evaluation system developed by the National Water Well Association under commission from the U.S. Environmental Protection Agency, to evaluate the vulnerability of the groundwater environment to pollution; see Table 5.
TABLE 5
[Table supplied as images in the original publication: BDA0002981906050000113, BDA0002981906050000121]
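For orientation, the standard DRASTIC index is a weighted sum of seven factor ratings, each rated from 1 to 10. The sketch below uses the conventional DRASTIC weights; it is not a reproduction of Table 5, whose factors and weights (for example soil pH) may differ from the standard system. The example ratings are hypothetical.

```python
# Conventional DRASTIC weights: Depth to water, net Recharge, Aquifer media,
# Soil media, Topography, Impact of vadose zone, hydraulic Conductivity.
DRASTIC_WEIGHTS = {"D": 5, "R": 4, "A": 3, "S": 2, "T": 1, "I": 5, "C": 3}


def drastic_index(ratings):
    """Weighted sum of the seven factor ratings (each expected in 1..10)."""
    for factor in DRASTIC_WEIGHTS:
        if not 1 <= ratings[factor] <= 10:
            raise ValueError(f"rating for {factor} must be between 1 and 10")
    return sum(DRASTIC_WEIGHTS[f] * ratings[f] for f in DRASTIC_WEIGHTS)


# Hypothetical ratings for one plot; higher index means higher vulnerability.
example_ratings = {"D": 7, "R": 6, "A": 8, "S": 5, "T": 10, "I": 6, "C": 4}
example_index = drastic_index(example_ratings)
```

With these weights the index ranges from 23 (all ratings 1) to 230 (all ratings 10).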
Receptor risk dimension: see Table 6.
TABLE 6
[Table supplied as an image in the original publication: BDA0002981906050000122]
Summary scoring: the data obtained through the big data technology and the remote sensing technology are scored item by item according to the weights in the index system, and the weighted calculation gives the pollution potential score of the industrial land. Only suspected contaminated sites with a high risk level (risk level 5 or 6) need a field investigation to determine whether they are contaminated sites; see Tables 7 and 8.
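TABLE 7
[Table supplied as an image in the original publication: BDA0002981906050000123]
TABLE 8
[Table supplied as an image in the original publication: BDA0002981906050000124]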
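The summary scoring step amounts to a weighted sum followed by a mapping to risk levels. The sketch below is an assumption-laden illustration: the dimension weights and the level cut points are placeholders (Tables 7 and 8 are not available as text), and the function names are hypothetical. Only the rule that levels 5 and 6 trigger a field investigation comes from the text.

```python
from bisect import bisect_right

# Placeholder weights in percent for the three dimensions; NOT from the source.
WEIGHTS_PERCENT = {"source": 50, "pathway": 30, "receptor": 20}

# Placeholder cut points separating risk levels 1..6; NOT from the source.
LEVEL_CUTS = [20, 40, 60, 75, 90]


def pollution_potential(scores):
    """Weighted pollution potential score from per-dimension scores (0..100)."""
    return sum(WEIGHTS_PERCENT[d] * scores[d] for d in WEIGHTS_PERCENT) / 100


def risk_level(total):
    """Map a total score to a risk level from 1 (lowest) to 6 (highest)."""
    return bisect_right(LEVEL_CUTS, total) + 1


def needs_field_investigation(total):
    """Per the text, only risk levels 5 and 6 require a field investigation."""
    return risk_level(total) >= 5


# Hypothetical plot with high source, pathway and receptor scores.
example_total = pollution_potential({"source": 90, "pathway": 80, "receptor": 70})
example_level = risk_level(example_total)
```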
Finally, the calculated pollution potential values of the industrial land are compared with the list of contaminated plots issued by the national ecological environment department. More than 85% of the suspected contaminated sites with high potential scores match plots on the national list, which indicates that the method verifies well and is highly usable.
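The verification described above reduces to measuring how many high-scoring sites also appear on the official list. A minimal sketch, assuming both sets are keyed by a shared plot identifier (the identifiers and the function name `match_rate` are hypothetical):

```python
def match_rate(high_potential_ids, official_ids):
    """Share of high-potential suspected sites that appear on the official list."""
    high = set(high_potential_ids)
    if not high:
        return 0.0
    return len(high & set(official_ids)) / len(high)


# Hypothetical identifiers: three of the four high-scoring plots are on the list.
example_rate = match_rate({"A", "B", "C", "D"}, {"A", "B", "C", "X"})
```

In practice the match would more likely be made on location or enterprise name than on a ready-made identifier; that matching step is outside this sketch.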
The invention discloses a progressive identification method for suspected contaminated sites. Based on the basic attributes, surface texture, spatial distribution and other characteristics of existing industrial land in China, it uses big data, high-resolution remote sensing imagery and a pollution potential model to identify suspected contaminated sites progressively, so that they can be screened and discriminated rapidly over large areas. First, the method improves on the traditional domestic approach, which relies on on-site sampling and monitoring, and greatly reduces the cost of acquiring information on suspected contaminated sites. Second, the suspected contaminated sites identified by the method can effectively populate a spatio-temporal database of suspected contaminated sites and usefully help management and decision-making departments gain a full picture of the local list of suspected contaminated sites.
The above embodiments do not limit the present invention, and the present invention is not limited to the above examples; those skilled in the art may make variations, modifications, additions or substitutions within the technical scope of the present invention.

Claims (10)

1. A progressive identification method for a suspected contaminated site, characterized by comprising the following steps:
step one, acquiring multi-source heterogeneous data on industrial land that is publicly available on the network by using a big data technology, and processing and fusing the multi-source heterogeneous data;
step two, screening the industrial land in combination with a remote sensing technology, segmenting original image data by using an object-oriented identification technology, and automatically learning effective features from a training set by using a deep learning technology;
step three, obtaining industrial land indices based on the big data technology and the remote sensing technology, inputting them into a pollution potential model, calculating the pollution potential value of each industrial plot, and screening out suspected contaminated sites;
step four, importing the calculated pollution potential values of the industrial land into an industrial land information database, and verifying the results to confirm the locations of contaminated sites.
2. The progressive identification method for the suspected contaminated site according to claim 1, wherein: in step one, the big data technology comprises a big data acquisition technology and a big data processing technology; multi-source heterogeneous data on industrial land that is publicly available on the network is crawled by means of the big data acquisition technology and web crawler technology, and is processed and fused by the big data processing technology.
3. The progressive identification method for the suspected contaminated site according to claim 2, wherein: the means of the big data acquisition technology include satellite remote sensing, sensors, radio frequency identification, the Internet of Things and mobile platforms.
4. The progressive identification method for the suspected contaminated site according to claim 2 or 3, wherein: the processing and fusing of the multi-source heterogeneous data by the big data processing technology comprises data storage, data preprocessing, data deep processing and data mining.
5. The progressive identification method for the suspected contaminated site according to claim 4, wherein: in step two, the crawled industrial land is screened in combination with the remote sensing technology, a remote sensing image sample library is established, and the object-oriented identification technology is applied to segment the original image data by multi-scale image segmentation, moving remote sensing image classification from a pixel basis to an object basis; the deep learning technology imitates, through network learning, the way the human brain processes data, uses a pre-trained deep learning model to construct the basic convolution and pooling layers of the network, then trains the model with samples to extract the features of the various sensitive land uses, and then optimizes the model with an accuracy evaluation function to obtain the final deep learning model.
6. The progressive identification method for the suspected contaminated site according to claim 5, wherein: the data storage is extended and encapsulated on the basis of Hadoop technology, and stores the multi-source heterogeneous data crawled from the network in unified local data files in a structured form; the data preprocessing cleans the multi-source heterogeneous data and converts it into a single structure or a structure that is convenient to process, the data cleaning comprising missing value processing, noise data processing and inconsistent data processing, wherein missing values are filled with a global constant, the attribute mean or a probable value, or the record is ignored; noise is removed by binning, clustering, combined computer and manual inspection, and regression; inconsistent data is corrected manually; and after cleaning, all data is integrated into a single data list with enterprise land as the object; the data deep processing comprises machine learning, intelligent algorithms, statistical analysis and system modeling, and combines the cleaned enterprise land data list with geographic information big data to obtain the geographic attribute features of each enterprise plot and enrich the content of the list; the data mining integrates and mines the big data using a Meta analysis method, data-driven mining methods and process-mechanism-based model-data fusion methods to obtain scientific, integrated and effective information; by means of correlation analysis, regression analysis, cluster analysis and principal component analysis, the enriched enterprise land data list is analyzed according to whether each enterprise plot is in production or shut down, its current scale, and its production and pollutant discharge information, the driving factors behind the formation of individual industrial plots are analyzed, spatial distribution hot spots of all industrial plots in the region are derived, and the formation mechanism of industrial land is analyzed across scales from point to area.
7. The progressive identification method for the suspected contaminated site according to claim 6, wherein: the pollution potential model in step three comprises model construction, an index system, a quantification standard and summary scoring.
8. The progressive identification method for the suspected contaminated site according to claim 7, wherein: the index system follows the basic pollution diffusion chain of pollution source, transmission path and receptor, and comprises a pollution potential characteristic dimension, a transmission path dimension and a receptor risk dimension; the pollution potential characteristic dimension includes enterprise scale, industry category, years in production, registered capital, pollutant types, whether the enterprise is a nationally controlled pollution source, whether it is a full-scope heavy metal industry enterprise, whether it holds a hazardous waste operation license, whether it is listed in the local directory for land soil pollution risk management, control and remediation, whether it holds a pollutant discharge permit, whether it has abnormal operation records, whether pollution events have been reported by the media, and administrative and environmental protection penalties; the transmission path dimension includes terrain slope, soil texture, soil pH, soil medium, groundwater burial depth, hydraulic conductivity and net infiltration; the receptor risk dimension includes population density, whether there are towns within 1 km of the boundary, whether there are water source protection areas, whether there are ecological red line areas, whether there are basic farmland protection areas, and whether there are sensitive receptors.
9. The progressive identification method for the suspected contaminated site according to claim 8, wherein: the quantification standard estimates a representative original score for each industry category across the pollution potential characteristic dimension, the transmission path dimension and the receptor risk dimension, wherein the pollution potential characteristic dimension is scored from objectively quantified data for the industry category, then from the average release of characteristic pollutants to the environment by local high-pollution-potential industry categories, combined with the professional judgment of experts; adjustment by industry category further adjusts the baseline of each industry category using adjustment factors for the enterprise's process flow, potentially polluting facilities and the current condition of the plant floor, so as to reflect the actual pollution risk potential of the target construction land; a further adjustment is made according to scale, operating years and the coefficient for change of operating entity; the transmission path dimension determines the risk posed by pollutants from how fast they pass through the environmental medium or how long they reside in the transmission pathway, and evaluates the vulnerability of the groundwater environment to pollution.
10. The progressive identification method for the suspected contaminated site according to claim 9, wherein: in the summary scoring, the data obtained through the big data technology and the remote sensing technology are scored item by item according to the weights in the index system, the pollution potential score of the industrial land is obtained by weighted calculation, and suspected contaminated sites with a high risk level are subjected to field investigation to determine whether they are contaminated sites.
CN202110290595.9A 2021-03-18 2021-03-18 Progressive identification method for suspected contaminated site Pending CN112950047A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110290595.9A CN112950047A (en) 2021-03-18 2021-03-18 Progressive identification method for suspected contaminated site

Publications (1)

Publication Number Publication Date
CN112950047A true CN112950047A (en) 2021-06-11

Family

ID=76226570

Country Status (1)

Country Link
CN (1) CN112950047A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107340364A (en) * 2017-05-31 2017-11-10 北京市环境保护监测中心 Polluted space analysis method and device based on magnanimity air pollution concentration data
CN110096490A (en) * 2018-10-18 2019-08-06 苏州科技大学 Contaminated site database and its construction method
CN111651432A (en) * 2020-06-11 2020-09-11 中科山水(北京)科技信息有限公司 Suspected contaminated site space-time information identification method
CN111666909A (en) * 2020-06-11 2020-09-15 中科山水(北京)科技信息有限公司 Suspected contaminated site space identification method based on object-oriented and deep learning
CN111915197A (en) * 2020-08-07 2020-11-10 厦门青霭信息科技有限公司 Enterprise environment damage potential assessment method based on multi-source geographic big data
CN112070056A (en) * 2020-09-17 2020-12-11 京师天启(北京)科技有限公司 Sensitive land use identification method based on object-oriented and deep learning
CN112329706A (en) * 2020-11-23 2021-02-05 京师天启(北京)科技有限公司 Mining land identification method based on remote sensing technology

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
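Wang Xiahui; Huang Guoxin; Zhu Wenhui; Ji Guohua: "Overall technical strategy for big-data-supported site pollution risk management and control", Environmental Protection (环境保护), vol. 47, no. 13, pages 14 *
Shen Minxia: "Precise haze control enters the big data era", China Meteorological News (中国气象报), pages 1 *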

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114241326A (en) * 2022-02-24 2022-03-25 自然资源部第三地理信息制图院 Progressive intelligent production method and system for ground feature elements of remote sensing images
CN116385689A (en) * 2023-06-02 2023-07-04 北京建工环境修复股份有限公司 Visual information management method, system and medium for site pollution data
CN116385689B (en) * 2023-06-02 2023-08-04 北京建工环境修复股份有限公司 Visual information management method, system and medium for site pollution data
CN117591506A (en) * 2024-01-12 2024-02-23 南京大学 Site soil and groundwater environment monitoring data cleaning method based on fusion model
CN117591506B (en) * 2024-01-12 2024-03-22 南京大学 Site soil and groundwater environment monitoring data cleaning method based on fusion model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination