CN114860875B - Data integration system and method for fixed pollution source - Google Patents

Data integration system and method for fixed pollution source

Info

Publication number
CN114860875B
Authority
CN
China
Prior art keywords
data
integrated
pollution source
registered
pollution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210443669.2A
Other languages
Chinese (zh)
Other versions
CN114860875A (en)
Inventor
毛庆国
尹�民
游勇
费新勇
彭胜巍
刘琳琳
黄为炜
蔡昌才
张德辉
何燕飞
张冬华
伍城
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Ecological Environment Intelligent Control Center
Original Assignee
Shenzhen Ecological Environment Intelligent Control Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Ecological Environment Intelligent Control Center filed Critical Shenzhen Ecological Environment Intelligent Control Center
Priority to CN202210443669.2A priority Critical patent/CN114860875B/en
Publication of CN114860875A publication Critical patent/CN114860875A/en
Application granted granted Critical
Publication of CN114860875B publication Critical patent/CN114860875B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0631 Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Databases & Information Systems (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Educational Administration (AREA)
  • Data Mining & Analysis (AREA)
  • Operations Research (AREA)
  • Artificial Intelligence (AREA)
  • Game Theory and Decision Science (AREA)
  • Quality & Reliability (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a data integration system and method for fixed pollution sources. The method comprises: acquiring to-be-integrated data of a to-be-integrated pollution source together with the data source of that data, and importing the to-be-integrated data in a data integration mode matched to the data source, wherein the to-be-integrated data comprises a name and/or an identification code of the to-be-integrated pollution source; acquiring registered data of a registered pollution source, wherein the registered data comprises the name and/or the identification code of the registered pollution source; performing keyword matching between the name and/or identification code of the to-be-integrated pollution source and the name and/or identification code of the registered pollution source, and judging whether the registered pollution source is a matched pollution source that matches the to-be-integrated pollution source; if so, integrating the to-be-integrated data into the matched pollution source data of the matched pollution source; if not, generating a newly added pollution source and newly added pollution source data from the to-be-integrated data. The invention can effectively improve data processing efficiency and reduce the difficulty of data management.

Description

Data integration system and method for fixed pollution source
Technical Field
The invention relates to the field of data processing, in particular to a data integration system and method for a fixed pollution source.
Background
With the development of technology, population growth and rising living standards, resource consumption has gradually increased and pollutant discharge continues to grow. To improve ecological environment quality, protect regional ecological safety and strengthen environmental supervision, the data of each fixed pollution source must be kept up to date so that the relevant departments can supervise, analyze and process it. However, fixed pollution source data is acquired through many channels, the acquired data is disordered and prone to errors and omissions, the data processing burden on the relevant departments is heavy, and processing efficiency is low.
Disclosure of Invention
The technical problem to be solved by the invention is that the data processing burden is heavy and the processing efficiency is low; to overcome these defects in the prior art, a data integration system and method for fixed pollution sources are provided.
The technical scheme adopted for solving the technical problems is as follows: provided is a fixed pollution source data integration method, comprising:
acquiring to-be-integrated data of a to-be-integrated pollution source and the data source of the to-be-integrated data, and importing the to-be-integrated data in a data integration mode matched to the data source, wherein the to-be-integrated data comprises a name and/or an identification code of the to-be-integrated pollution source; acquiring registered data of a registered pollution source, wherein the registered data comprises the name and/or the identification code of the registered pollution source;
performing keyword matching between the name and/or the identification code of the pollution source to be integrated and the name and/or the identification code of the registered pollution source, and judging whether the registered pollution source is a matched pollution source that matches the pollution source to be integrated;
if the registered pollution source is a matched pollution source that matches the pollution source to be integrated, integrating the data to be integrated into the matched pollution source data of the matched pollution source;
and if the registered pollution source is not a matched pollution source that matches the pollution source to be integrated, generating a newly added pollution source and newly added pollution source data from the data to be integrated.
The step of generating the newly added pollution source and the newly added pollution source data according to the pollution source data to be integrated comprises the following steps:
and acquiring the name and/or identification code of the newly added pollution source according to a preset naming rule and/or coding rule.
After the step of generating the new pollution source and the new pollution source data according to the pollution source data to be integrated, the method comprises the following steps:
generating a fixed pollution source list from all current newly added pollution sources and registered pollution sources, and judging whether the fixed pollution source list contains combinable pollution sources, that is, entries that are in fact the same pollution source;
and performing at least one of de-duplication, overwriting and deletion on the combinable pollution source data of the combinable pollution sources, and merging the combinable pollution source data into combined pollution source data.
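The merging step above can be illustrated with a minimal sketch. It assumes, purely for illustration, that combinable pollution sources are the entries sharing the same identification code (stored under a hypothetical `code` key), that later records overwrite earlier values field by field, and that empty values never overwrite existing ones; the patent does not fix any of these details.

```python
from collections import defaultdict


def merge_combinable_sources(records):
    """Merge records that share an identification code into one record each.

    Assumption: records with the same `code` describe the same pollution
    source. Later non-empty values overwrite earlier ones; empty values
    (None or "") are dropped, which also removes duplicate fields.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record["code"]].append(record)

    merged = []
    for group in groups.values():
        combined = {}
        for record in group:
            for field, value in record.items():
                if value not in (None, ""):
                    combined[field] = value
        merged.append(combined)
    return merged
```

Grouping by code keeps the first-seen order of the sources, so the resulting list can replace the original fixed pollution source list directly.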
Wherein, after the step of generating a fixed pollution source list according to all the current newly added pollution sources and registered pollution sources, the method comprises the following steps:
and constructing a pollution source file for each fixed pollution source in the fixed pollution source list, wherein the pollution source file comprises at least one of a name, an address, an administrative area, an industry, an enterprise, a code, a management attribute, a supervision flow chart, pollution source emissions and monitoring information.
The to-be-integrated data further comprises an enterprise to be integrated, to which the to-be-integrated pollution source belongs; the registered data also includes a registered business to which the registered pollution source belongs;
the step of integrating the data to be integrated into the matching pollution source data of the matching pollution source comprises the following steps:
acquiring an association relationship between the enterprise to be integrated and the registered enterprise, wherein the association relationship is any one of a superior relationship, a subordinate relationship, a same-superior relationship and a substantially-the-same relationship;
integrating the data to be integrated into the matched pollution source data of the matched pollution source according to the association relationship, and adding corresponding association identifiers and association links to the data to be integrated and the matched pollution source data.
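The four association relationships and the tagging step can be sketched as follows. This is an illustration under stated assumptions: the relationship is read from the perspective of the enterprise to be integrated, `parent_of` is a hypothetical lookup from an enterprise to its superior, and the field names `association`, `linked_to` and `linked_from` are invented here to stand in for the association identifier and association link.

```python
from enum import Enum


class Association(Enum):
    SUPERIOR = "superior"
    SUBORDINATE = "subordinate"
    SAME_SUPERIOR = "same_superior"
    SUBSTANTIALLY_SAME = "substantially_same"


def classify_association(incoming, registered, parent_of):
    """Classify how the incoming enterprise relates to the registered one.

    `parent_of` maps an enterprise name to its superior (hypothetical data).
    Returns None when no relationship can be established.
    """
    if incoming == registered:
        return Association.SUBSTANTIALLY_SAME
    if parent_of.get(registered) == incoming:
        return Association.SUPERIOR
    if parent_of.get(incoming) == registered:
        return Association.SUBORDINATE
    shared = parent_of.get(incoming)
    if shared is not None and shared == parent_of.get(registered):
        return Association.SAME_SUPERIOR
    return None


def link_records(incoming_record, matched_record, association):
    """Add an association identifier and mutual links to both records."""
    incoming_record["association"] = association.value
    incoming_record["linked_to"] = matched_record["id"]
    matched_record.setdefault("linked_from", []).append(incoming_record["id"])
```

Keeping the links on both records lets either side be used as the starting point when the enterprise knowledge graph is built later.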
The step of obtaining the association relationship between the enterprise to be integrated and the registered enterprise includes:
and constructing an enterprise data knowledge graph of the matched pollution source data and the data to be integrated according to the association relation.
Wherein the data to be integrated comprises a data source of the pollution source to be integrated, the registered data comprises a data source of the registered pollution source, and the data source comprises at least one of a providing unit, a system and a sharing mode;
the step of judging whether the registered pollution sources are matched pollution sources matched with the pollution sources to be integrated or not includes:
if the name and/or the identification code of the pollution source to be integrated is completely consistent with the name and/or the identification code of the registered pollution source, the registered pollution source is used as the matched pollution source;
and if the name and/or the identification code of the pollution source to be integrated is partially consistent with the name and/or the identification code of the registered pollution source, acquiring the data sources of the pollution source to be integrated and the registered pollution source, and judging whether the registered pollution source is a matched pollution source matched with the pollution source to be integrated according to the data sources.
The step of judging whether the registered pollution sources are matched pollution sources matched with the pollution sources to be integrated according to the data sources comprises the following steps:
judging whether the data source of the pollution source to be integrated is consistent with the data source of the registered pollution source;
if the data source of the pollution source to be integrated is completely consistent with the data source of the registered pollution source, acquiring the data characteristics of the data to be integrated and the data characteristics of the registered pollution source, and judging whether the data characteristics of the data to be integrated and the data characteristics of the registered pollution source are consistent or not;
if the data characteristics of the data to be integrated are consistent with the data characteristics of the registered pollution sources, the registered pollution sources are used as the matched pollution sources;
if the data characteristics of the data to be integrated are inconsistent with the data characteristics of the registered pollution sources, judging whether the reliability of the data to be integrated meets the preset requirement, and if the reliability of the data to be integrated meets the preset requirement, taking the pollution sources to be integrated as the newly added pollution sources.
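The decision sequence in the preceding steps can be summarized in a short sketch. The inputs are assumed to be precomputed flags, and the `UNRESOLVED` outcome is an assumption added here for the branches the text leaves open (a partial match with differing data sources, or unreliable data with inconsistent characteristics); the patent does not say how those cases are handled.

```python
from enum import Enum


class Decision(Enum):
    MATCHED = "matched"
    NEW_SOURCE = "new_source"
    UNRESOLVED = "unresolved"  # assumption: branches the text does not specify


def decide(name_match, same_data_source, same_features, reliable):
    """Apply the matching rules to one (to-be-integrated, registered) pair.

    name_match is "full", "partial" or "none" for the name/identification
    code comparison; the other arguments are booleans.
    """
    if name_match == "full":
        return Decision.MATCHED
    if name_match == "none":
        return Decision.NEW_SOURCE
    # Partial match: fall back to the data source, then the data characteristics.
    if same_data_source:
        if same_features:
            return Decision.MATCHED
        if reliable:
            return Decision.NEW_SOURCE
    return Decision.UNRESOLVED
```

Returning an explicit third outcome, rather than forcing a match or a new registration, leaves room for manual review of the ambiguous cases.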
Wherein the data characteristics of the data to be integrated comprise at least one of the emission peak value, emission valley value, emission period, emission trend, peak emission time and valley emission time of the data to be integrated;
the data characteristics of the registered data comprise at least one of the emission peak value, emission valley value, emission period, emission trend, peak emission time and valley emission time of the registered data.
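As a rough illustration of how such data characteristics might be derived and compared, the sketch below computes a few of them from a time-ordered series of readings. The `(time, value)` tuple layout, the first-versus-last trend estimate and the 10% relative tolerance are all assumptions made for the example; the emission period is omitted because estimating it needs a longer series and a method the text does not specify.

```python
def emission_features(readings):
    """Derive simple characteristics from non-empty, time-ordered (time, value) pairs."""
    peak_time, peak = max(readings, key=lambda r: r[1])
    valley_time, valley = min(readings, key=lambda r: r[1])
    first, last = readings[0][1], readings[-1][1]
    # Crude trend estimate: compare the first and last readings only.
    trend = "rising" if last > first else "falling" if last < first else "flat"
    return {
        "peak": peak,
        "valley": valley,
        "peak_time": peak_time,
        "valley_time": valley_time,
        "trend": trend,
    }


def features_consistent(a, b, tolerance=0.1):
    """Treat two feature sets as consistent if peak and valley agree within
    a relative tolerance and the trend labels are equal (assumed criterion)."""
    def close(x, y):
        return abs(x - y) <= tolerance * max(abs(x), abs(y), 1e-9)

    return (
        close(a["peak"], b["peak"])
        and close(a["valley"], b["valley"])
        and a["trend"] == b["trend"]
    )
```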
After the step of generating the new pollution source and the new pollution source data according to the pollution source data to be integrated, the method comprises the following steps:
and performing a data quality audit on the registered data, the newly added pollution source data or the integrated matched pollution source data, and deleting data that fails the quality audit, wherein the quality audit comprises a data integrity audit, a data validity audit and a data reliability audit.
Wherein, the data integrity audit includes: verifying whether the data content includes the necessary data items;
the data validity audit includes: checking whether the data content is in the data validity period;
the data reliability audit includes: and checking whether the data source belongs to an official data source.
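The three audits can be sketched as simple checks on a record. The required items, the set of official sources and the field names `valid_until` and `source` are hypothetical placeholders chosen for the example, not values given in the patent.

```python
from datetime import date

REQUIRED_ITEMS = ("name", "code", "emission_type")           # hypothetical
OFFICIAL_SOURCES = {"environment_bureau", "permit_system"}   # hypothetical


def audit(record, today):
    """Return the list of audits the record fails (empty list means it passes)."""
    failures = []
    # Integrity: every necessary data item must be present and non-empty.
    if any(not record.get(item) for item in REQUIRED_ITEMS):
        failures.append("integrity")
    # Validity: the record must still be within its validity period.
    valid_until = record.get("valid_until")
    if valid_until is None or valid_until < today:
        failures.append("validity")
    # Reliability: the data must come from an official data source.
    if record.get("source") not in OFFICIAL_SOURCES:
        failures.append("reliability")
    return failures


def keep_qualified(records, today):
    """Drop every record that fails at least one audit."""
    return [r for r in records if not audit(r, today)]
```

Returning the names of the failed audits, instead of a single boolean, makes it possible to log why a record was deleted.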
The step of importing the data to be integrated in the data integration mode matched with the data source comprises the following steps:
entering the data to be integrated into a preset form matched to the fixed pollution source data type, and collecting the data from the completed preset form; and/or
acquiring the data to be integrated at preset intervals, automatically checking the data to be integrated, and importing the data to be integrated by data copying; and/or
when to-be-integrated data of a newly added to-be-integrated pollution source is detected, acquiring that data through a standard data interface, converting it into a standard format, and then importing it; and/or
importing the data to be integrated through a data import tool and a data management tool.
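Selecting an import mode from the data source can be sketched as a lookup table. The four modes mirror the alternatives listed above; the source-kind labels on the left are hypothetical names introduced for this example, since the text describes the kinds of data only in prose.

```python
from enum import Enum


class ImportMode(Enum):
    FORM_ENTRY = "preset form entry"
    PERIODIC_COPY = "periodic check and copy"
    STANDARD_INTERFACE = "standard data interface"
    ETL_TOOL = "data import and management tools"


# Hypothetical mapping from a data-source kind to its import mode.
MODE_BY_SOURCE_KIND = {
    "manual": ImportMode.FORM_ENTRY,
    "fixed_format_feed": ImportMode.PERIODIC_COPY,
    "business_system": ImportMode.STANDARD_INTERFACE,
    "historical": ImportMode.ETL_TOOL,
}


def select_import_mode(source_kind):
    """Return the import mode matched to a data-source kind."""
    try:
        return MODE_BY_SOURCE_KIND[source_kind]
    except KeyError:
        raise ValueError(f"no import mode configured for {source_kind!r}") from None
```

Raising an error for an unknown kind, rather than silently picking a default, keeps unvetted data from entering through the wrong path.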
The technical scheme adopted for solving the technical problems is as follows: there is provided a stationary pollution source data integration system comprising:
the acquisition module is used for acquiring to-be-integrated data of the to-be-integrated pollution source and a data source of the to-be-integrated data, importing the to-be-integrated data in a data integration mode of matching the data sources, wherein the to-be-integrated data comprises a name and/or an identification code of the to-be-integrated pollution source, acquiring registered data of a registered pollution source, and the registered data comprises the name and/or the identification code of the registered pollution source;
the matching module is used for carrying out keyword matching on the name and/or the identification code of the pollution source to be integrated and the name and/or the identification code of the registered pollution source, and judging whether the registered pollution source is a matched pollution source matched with the pollution source to be integrated or not;
the integration module is used for integrating the data to be integrated into the matched pollution source data of the matched pollution source if the registered pollution source is a matched pollution source that matches the pollution source to be integrated;
and the adding module is used for generating a newly added pollution source and newly added pollution source data from the data to be integrated if the registered pollution source is not a matched pollution source that matches the pollution source to be integrated.
The technical scheme adopted for solving the technical problems is as follows: there is provided a stationary pollution source data integration system comprising a memory storing a computer program and a processor executing the computer program to carry out the steps of the method as described above.
Wherein the memory stores a computer program which, when executed by a processor, causes the processor to perform the steps of the method as described above.
Compared with the prior art, the invention has the following beneficial effects: keyword matching is performed between the name and/or identification code of the pollution source to be integrated and the name and/or identification code of the registered pollution source to judge whether the registered pollution source is a matched pollution source that matches the pollution source to be integrated. If so, the data to be integrated is integrated into the matched pollution source data of the matched pollution source; if not, a newly added pollution source and newly added pollution source data are generated from the data to be integrated. Data that in substance corresponds to the same fixed pollution source can thus be integrated, which avoids the disordered data and high management difficulty caused by differing data sources.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Wherein:
FIG. 1 is a flow chart of a first embodiment of a method for integrating stationary pollution source data according to the present invention;
FIG. 2 is a flow chart of a second embodiment of a method for integrating stationary pollution source data according to the present invention;
FIG. 3 is a flow chart of a third embodiment of a method for integrating stationary pollution source data according to the present invention;
FIG. 4 is a flow chart of a fourth embodiment of a method for integrating stationary pollution source data according to the present invention;
FIG. 5 is a flow chart of a fifth embodiment of a method for integrating stationary pollution source data according to the present invention;
FIG. 6 is a schematic diagram of a fixed pollution source data integration system according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a fixed pollution source data integration system according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of an embodiment of a storage medium according to the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, fig. 1 is a flowchart illustrating a first embodiment of a method for integrating fixed pollution source data according to the present invention. The method for integrating the fixed pollution source data provided by the invention comprises the following steps:
S101: acquiring to-be-integrated data of a to-be-integrated pollution source and the data source of the to-be-integrated data, importing the to-be-integrated data in a data integration mode matched to the data source, and acquiring registered data of registered pollution sources.
In a specific implementation scenario, the to-be-integrated data of the to-be-integrated pollution source and the data source of that data are acquired, and the to-be-integrated data is imported in a data integration mode matched to the data source. The to-be-integrated data may be collected, compiled and reported by the enterprise where the to-be-integrated fixed pollution source is located, collected by the supervision department that supervises the to-be-integrated fixed pollution source, or acquired automatically by equipment such as sensors.
In one implementation scenario, the data to be integrated is historical data or data integrated or collected by other software, and it is acquired automatically and periodically using an ETL data import tool and a corresponding ETL management tool; for example, the data to be integrated is packaged and then imported. Because such data comes from other software or from historical records, its data format or acquisition standard may differ from the format or standard of the present system. After it is acquired, the data to be integrated is therefore cleaned and screened, and its data quality is judged. If the quality is unqualified, it can be improved by means of an artificial neural network, for example by filling in missing values and correcting errors and format mappings.
In one implementation scenario, the data to be integrated is business data, such as environmental approvals and permits, that is generated mainly in the course of daily business processing. A standard data interface is provided for this data: when to-be-integrated data of a newly added to-be-integrated pollution source is detected, it is acquired through the standard data interface, converted into a standard format, and then imported. Furthermore, by modifying the business functions to conform to the standard interface, the data to be integrated can be imported automatically and in real time while staff handle their normal business.
In one implementation scenario, the data acquisition department, acquisition mode and acquisition format of the data to be integrated are relatively fixed, so no intermediate-format data standard needs to be defined; after the data to be integrated is acquired, it is automatically checked and then imported by data copying.
In one implementation scenario, the data to be integrated is data that has no supporting business system, such as public codes, basic pollution source information, environmental quality monitoring points and section information. This data is entered into a preset form matched to the fixed pollution source data type, and the data is then collected from the completed form. For example, a neural network may be trained to select the matching preset form, and the data to be integrated is filled into that form.
By acquiring the to-be-integrated data of the to-be-integrated pollution source together with its data source, and importing the data in a data integration mode matched to that data source, an import method suited to the reliability and data characteristics of each data source is applied, which helps ensure the reliability and validity of the imported data.
The pollution source to be integrated may be a fixed pollution source already registered in the pollution source management system, or a fixed pollution source newly added by an enterprise. The data to be integrated includes the name and/or identification code of the pollution source to be integrated. The identification code is preferred when the pollution source to be integrated has one; if it does not (for example, a newly added fixed pollution source that has not been registered or has not yet applied for an identification code), the name of the pollution source to be integrated is acquired instead. The name may include at least one of the location, emission type, main pollutant, affiliated enterprise, affiliated industry and affiliated administrative area of the pollution source to be integrated.
Registered data of registered pollution sources is acquired. Registered pollution sources are fixed pollution sources that were previously registered in the pollution source management system, and the registered data is data about them that was previously collected or compiled, including at least one of the location, emission type, emission amount, main pollutant, affiliated enterprise, affiliated project, affiliated industry, affiliated administrative area, pollution discharge license, real-time monitoring data and supervision information of the registered pollution source. The identification code of a registered pollution source is obtained according to a preset coding rule when the fixed pollution source is registered. The name of a registered pollution source may include at least one of its location, emission type, main pollutant, affiliated enterprise, affiliated industry and affiliated administrative area.
S102: keyword matching is carried out on the name and/or the identification code of the pollution source to be integrated and the name and/or the identification code of the registered pollution source, and whether the registered pollution source is a matched pollution source matched with the pollution source to be integrated is judged; if yes, step S103 is executed, and if no, step S104 is executed.
In a specific implementation scenario, keyword matching is performed between the name and/or identification code of the pollution source to be integrated and the name and/or identification code of the registered pollution source, and it is judged whether the registered pollution source is a matched pollution source that matches the pollution source to be integrated. For example, it may be checked whether the two names and/or identification codes agree at one or more fixed positions, such as whether the second, third, fifth and tenth digits of the two identification codes are identical. As another example, it may be checked whether the parts of the two names that denote the affiliated enterprise, the emission type and the main pollutant are consistent.
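The two example checks can be sketched as follows. The positions (2, 3, 5, 10) come from the example in the text and are counted from 1; representing a name as a dictionary with `enterprise`, `emission_type` and `main_pollutant` keys is an assumption made for illustration, since the text does not say how names are parsed.

```python
def codes_match_at(code_a, code_b, positions=(2, 3, 5, 10)):
    """True if both codes have identical characters at the given 1-based positions."""
    longest = max(positions)
    if len(code_a) < longest or len(code_b) < longest:
        return False  # a code too short to compare cannot match
    return all(code_a[p - 1] == code_b[p - 1] for p in positions)


def names_match_on(name_a, name_b,
                   fields=("enterprise", "emission_type", "main_pollutant")):
    """True if both parsed names carry the same value for every listed field."""
    return all(
        name_a.get(field) is not None and name_a.get(field) == name_b.get(field)
        for field in fields
    )
```

For instance, `codes_match_at("A12B3XXXX9ZZ", "Q12C3YYYY9WW")` is true because the two codes agree at positions 2, 3, 5 and 10 even though they differ elsewhere.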
S103: integrating the data to be integrated into the matched pollution source data of the matched pollution source.
In one particular implementation scenario, the registered pollution source is a matched pollution source that matches the pollution source to be integrated, and the data to be integrated is integrated into the matched pollution source data of the matched pollution source. The data to be integrated may be screened and cleaned before integration. It may be appended to the matched pollution source data, it may replace part of the matched pollution source data, or only selected parts of it may be added to the matched pollution source data. In one implementation scenario, the time and content of the integration, the unit that performed it, the responsible person and other details are recorded for later data tracing.
In one implementation scenario, a matching data format of the matching pollution source data is obtained, and data mapping is performed on the data to be integrated according to the matching data format, so that the data to be integrated has the matching data format, and the overall data format is consistent, and is convenient to read and store.
In one implementation scenario, the data items of the matched pollution source data and the data items of the data to be integrated are obtained, and the data corresponding to each data item is compared: if the data differs, the value from the data to be integrated is filled in; if it is the same, nothing is modified. Alternatively, when the data differs, the size of the difference is evaluated: if the difference is smaller than a preset threshold, the value is filled in; if it is larger, the item is flagged and the handling is confirmed manually.
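The threshold variant of this comparison can be sketched as below. The use of a relative difference, the 10% default threshold and the choice to flag differing non-numeric values for manual confirmation are assumptions made for the example.

```python
def reconcile(matched, incoming, threshold=0.1):
    """Compare data items and return (updated_record, items_flagged_for_review).

    Equal values are left alone, missing values are filled in, numeric values
    within the relative threshold are overwritten, and anything else is flagged.
    """
    updated = dict(matched)
    flagged = []
    for item, new_value in incoming.items():
        old_value = matched.get(item)
        if old_value == new_value:
            continue
        if old_value is None:
            updated[item] = new_value
        elif isinstance(old_value, (int, float)) and isinstance(new_value, (int, float)):
            relative = abs(new_value - old_value) / max(abs(old_value), 1e-9)
            if relative < threshold:
                updated[item] = new_value
            else:
                flagged.append(item)
        else:
            # Assumption: differing non-numeric values always need manual review.
            flagged.append(item)
    return updated, flagged
```

The original record is copied rather than modified, so the caller can still inspect the pre-integration state when confirming flagged items.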
In one implementation scenario, the pollution source to be integrated and a registered pollution source are actually the same pollution source, but they were collected and reported by different departments: the registered pollution source already has an identification code assigned according to a preset coding rule, while the pollution source to be integrated has only a name. When the two are judged to match by keyword matching, the data to be integrated is integrated into the registered data of the matched registered pollution source, which avoids the disordered data and high management difficulty caused by differing data sources.
S104: generating a new pollution source and new pollution source data according to the pollution source data to be integrated.
In a specific implementation scenario, if none of the registered pollution sources matches the pollution source to be integrated, the pollution source to be integrated has not been registered before. It therefore needs to be registered in the pollution source management system as a new pollution source, with the data to be integrated serving as the new pollution source data of the new pollution source. The data format of the registered data can be obtained and the newly added pollution source data mapped into it, so that the newly added pollution source data and the registered data share a uniform format, which facilitates data management.
In this implementation scenario, the name and/or identification code of the newly added pollution source is obtained according to a preset naming rule and/or coding rule, so that the newly added pollution source can be managed together with the registered pollution sources. In this implementation scenario, coding fixed pollution sources according to unified rules addresses the multiple origins, fragmented information and management difficulty of traditional pollution source data; meanwhile, data interfacing between the fixed pollution source codes and the various pollution source management service systems across the city allows pollution source information to be dynamically updated and shared in real time.
As can be seen from the above description, in this embodiment keyword matching is performed between the name and/or identification code of the pollution source to be integrated and the name and/or identification code of the registered pollution source, to judge whether the registered pollution source is a matched pollution source for the pollution source to be integrated. If yes, the data to be integrated is integrated into the matched pollution source data of the matched pollution source; if no, a new pollution source and new pollution source data are generated according to the pollution source data to be integrated. Data that substantially corresponds to the same fixed pollution source can thus be integrated, avoiding the disordered data and greater management difficulty caused by differing data sources.
Referring to fig. 2, fig. 2 is a flowchart illustrating a second embodiment of a method for integrating fixed pollution source data according to the present invention. The method for integrating the fixed pollution source data provided by the invention comprises the following steps:
S201: obtaining the data to be integrated of a pollution source to be integrated and the data source of the data to be integrated, importing the data to be integrated using a data integration mode that matches the data source, and obtaining the registered data of registered pollution sources.
S202: performing keyword matching between the name and/or identification code of the pollution source to be integrated and the name and/or identification code of the registered pollution source, and judging whether the registered pollution source is a matched pollution source for the pollution source to be integrated. If yes, step S203 is executed; if no, step S204 is executed.
S203: integrating the data to be integrated into the matched pollution source data of the matched pollution source.
S204: generating a new pollution source and new pollution source data according to the pollution source data to be integrated.
In a specific implementation scenario, steps S201 to S204 are substantially identical to steps S101 to S104 in the first embodiment of the method for integrating fixed pollution source data provided by the present invention, and will not be described herein.
S205: generating a fixed pollution source list according to all current newly added pollution sources and registered pollution sources, and judging whether substantially identical combinable pollution sources exist in the fixed pollution source list. If yes, step S206 is executed.
In a specific implementation scenario, a fixed pollution source list is generated according to all current newly added pollution sources and registered pollution sources. The fixed pollution source list includes the name and identification code of each newly added or registered pollution source, together with its newly added pollution source data or registered data. It is then determined whether there are substantially identical combinable pollution sources in the fixed pollution source list. For example, a certain fixed pollution source is a wastewater pollution source, and different departments collect emission data for different pollutants of that source, so a single fixed pollution source may end up recorded as two fixed pollution sources. Whether such fixed pollution sources exist in the list can be judged from the location, enterprise, industry, emission type and emission amount of each fixed pollution source. If the locations, industries, emission types and emission amounts of two or more fixed pollution sources are the same or close, these fixed pollution sources may be considered combinable pollution sources.
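The "same or close" test for combinable pollution sources can be sketched as a pairwise comparison. The planar coordinates `x` and `y`, the distance limit and the emission tolerance below are all assumed values chosen for illustration; the description does not define what counts as close.

```python
from itertools import combinations
from math import hypot


def is_combinable(a, b, max_distance=50.0, max_emission_ratio=0.1):
    """Return True if two list entries look like the same fixed pollution source."""
    same_category = (a["industry"] == b["industry"]
                     and a["emission_type"] == b["emission_type"])
    distance = hypot(a["x"] - b["x"], a["y"] - b["y"])
    larger = max(a["emission"], b["emission"])
    emission_close = (larger == 0
                      or abs(a["emission"] - b["emission"]) / larger <= max_emission_ratio)
    return same_category and distance <= max_distance and emission_close


def find_combinable_pairs(sources):
    """Return the id pairs of all entries that are candidates for merging."""
    return [(a["id"], b["id"])
            for a, b in combinations(sources, 2)
            if is_combinable(a, b)]
```

Comparing every pair is quadratic in the length of the list, which is acceptable for a sketch but would likely need spatial indexing for a city-wide list.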
S206: performing at least one of de-duplication, coverage and deletion on the combinable pollution source data of the combinable pollution sources, and combining the combinable pollution source data into combined pollution source data.
In one specific implementation scenario, at least one of de-duplication, coverage and deletion is performed on the combinable pollution source data of the combinable pollution sources, which is then combined into combined pollution source data. For example, the combinable pollution sources are sewage discharge pollution sources, and the plurality of combinable pollution sources each include sewage discharge amount data, such as discharge peak and valley values, discharge average value, discharge time and discharge pause time. If data is duplicated, de-duplication is performed. If there is obviously erroneous data (for example, data that differs greatly from the other data or is obviously unreasonable), it is deleted. If there are similar data, one of them is selected to cover the rest; for example, an intermediate value may be selected to cover the extreme values.
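For a single numeric data item, the three operations can be illustrated with the sketch below. The outlier rule (a value is dropped when it deviates from the median by more than a fixed ratio) is an assumption standing in for "obviously erroneous", and the median is used as the intermediate value.

```python
from statistics import median


def merge_values(values, outlier_ratio=0.5):
    """Merge several recorded values of one data item into a single value.

    Expects a non-empty list of numbers.
    """
    unique = sorted(set(values))                 # de-duplication
    mid = median(unique)
    kept = [v for v in unique
            if abs(v - mid) <= outlier_ratio * abs(mid)]   # deletion of outliers
    return median(kept)                          # coverage by the intermediate value
```

With recorded values 100, 100, 102, 98 and 500, the duplicate 100 is removed, 500 is dropped as an outlier, and the remaining values are covered by 100.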
S207: constructing a pollution source file for each fixed pollution source in the fixed pollution source list, wherein the pollution source file comprises at least one of a name, an address, an administrative area, an industry, an enterprise, an identification code, a management attribute, a supervision flow chart, pollution source emission and monitoring information.
In one specific implementation scenario, a pollution source file is built for each fixed pollution source in the fixed pollution source list, the pollution source file including at least one of a name, an address, the administrative district to which the pollution source belongs, the industry to which it belongs, the enterprise to which it belongs, an identification code, a management attribute, a supervision flow chart, a pollution source emission amount, and monitoring information. After the pollution source file is generated, a corresponding link is added to the fixed pollution source list. When a user browsing the list wants to know more about a specific fixed pollution source, the user can click the corresponding link to access the pollution source file and obtain detailed information, without having to search for and collect the data personally, which greatly improves the efficiency of data management and data review.
As can be seen from the foregoing description, in this embodiment a fixed pollution source list is generated according to all newly added pollution sources and registered pollution sources; at least one of de-duplication, coverage and deletion is performed on the data of substantially identical combinable pollution sources in the list, which is combined into combined pollution source data; and a pollution source file is built for each fixed pollution source in the list. Erroneous or redundant data can thus be effectively removed, saving storage space and improving data quality, while reducing the time users need to organize data and improving the efficiency of data management.
Referring to fig. 3, fig. 3 is a flowchart illustrating a third embodiment of a method for integrating fixed pollution source data according to the present invention. The method for integrating the fixed pollution source data provided by the invention comprises the following steps:
S301: obtaining the data to be integrated of a pollution source to be integrated and the data source of the data to be integrated, importing the data to be integrated using a data integration mode that matches the data source, and obtaining the registered data of registered pollution sources.
S302: performing keyword matching between the name and/or identification code of the pollution source to be integrated and the name and/or identification code of the registered pollution source, and judging whether the registered pollution source is a matched pollution source for the pollution source to be integrated. If yes, step S303 is executed; if no, step S304 is executed.
S303: integrating the data to be integrated into the matched pollution source data of the matched pollution source.
S304: generating a new pollution source and new pollution source data according to the pollution source data to be integrated.
In a specific implementation scenario, steps S301 to S304 are substantially identical to steps S101 to S104 in the first embodiment of the method for integrating fixed pollution source data provided by the present invention, and will not be described herein.
S305: performing a data quality audit on the registered data, the newly added pollution source data or the integrated matched pollution source data, and deleting data that fails the quality audit, wherein the quality audit comprises a data integrity audit, a data validity audit and a data reliability audit.
In a specific implementation scenario, a data quality audit is performed on all data in the current pollution source management system, and data that fails the audit is deleted, which ensures the data quality of the system and facilitates data management. The quality audit comprises a data integrity audit, a data validity audit and a data reliability audit. The data integrity audit checks whether the registered data, the newly added pollution source data or the integrated matched pollution source data includes the necessary data items. The necessary data items may be set by a user and may include the data collection time, the department that uploaded the data, the responsible person, the project to which the data corresponds, and the like. If a piece of data lacks a necessary data item, its reliability is doubtful and it may be deleted. Further, different types of data may have different necessary data items; for example, for real-time monitored pollutant discharge data, the necessary data items include at least one of the detection time, the detection position and the detection standard of the detected pollutant.
The data validity audit checks whether the registered data, the newly added pollution source data or the integrated matched pollution source data is within its validity period. For example, the treatment standards for some pollutants have a validity period, and the projects corresponding to some pollutants have a validity period; data within the validity period can be retained, and data beyond it can be deleted to save storage space.
The data reliability audit checks whether the registered data, the newly added pollution source data or the integrated matched pollution source data comes from an officially certified data source. Uploading some data may require review by at least one superior department or superior leader; if the data has not passed that review process, it cannot be considered to come from an officially certified data source. In one implementation scenario, identification codes, identification watermarks, tags and the like may be added to officially certified data so that it can be quickly identified.
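The three audits can be combined into one pass over the records, as in the sketch below. The required item names, the `valid_until` date field and the boolean `certified` tag are hypothetical stand-ins for the user-defined necessary data items, the validity period and the official certification mark; records without a `valid_until` value are assumed here to have no expiry.

```python
from datetime import date

# Hypothetical set of necessary data items; in the description these are user-defined.
REQUIRED_ITEMS = ("collected_at", "department", "responsible_person", "project")


def integrity_ok(record):
    """Integrity audit: every necessary data item is present and non-empty."""
    return all(record.get(item) not in (None, "") for item in REQUIRED_ITEMS)


def validity_ok(record, today):
    """Validity audit: the record has no expiry date or has not yet expired."""
    valid_until = record.get("valid_until")
    return valid_until is None or today <= valid_until


def reliability_ok(record):
    """Reliability audit: the record carries the certification tag."""
    return record.get("certified") is True


def quality_audit(records, today):
    """Split records into those that pass all three audits and those that fail."""
    passed, failed = [], []
    for record in records:
        ok = (integrity_ok(record)
              and validity_ok(record, today)
              and reliability_ok(record))
        (passed if ok else failed).append(record)
    return passed, failed
```

Returning the failed records rather than dropping them immediately leaves room for marking and manual confirmation instead of outright deletion.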
In other implementation scenarios, other audits can be performed on the registered data, the newly added pollution source data or the integrated matched pollution source data, including an audit of the data format, an audit of data permissions and an audit of data correctness. The data format audit checks whether the data format meets preset requirements, such as whether the data has been mapped to the specified format or is in a specified disallowed format. The data permission audit checks whether the permission settings of the data are compliant, for example whether the permissions to review, modify, download, upload and forward the data comply with relevant regulations such as data security management regulations. The data correctness audit checks whether the data is correct; for example, data of the same fixed pollution source from an adjacent or identical period can be obtained and compared, and if the difference is small the data is judged to be correct. Alternatively, historical data of the same pollution source can be input into a neural network to obtain predicted data, which is compared with the current data; if the difference is small the data is judged to be correct.
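The comparison against data from adjacent periods can be sketched as below; the neural network variant is not shown. The mean of the neighbouring values serves as the baseline and the 20% tolerance is an assumed figure, since the description only requires that the difference be small.

```python
from statistics import mean


def is_plausible(current, neighbours, tolerance=0.2):
    """Return True if current is close to the mean of adjacent-period values.

    Expects neighbours to be a non-empty list of numbers.
    """
    baseline = mean(neighbours)
    if baseline == 0:
        return current == 0
    return abs(current - baseline) / abs(baseline) <= tolerance
```

The same function could be reused with a predicted value as the single neighbour if a forecasting model is available.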
In other implementation scenarios, data that fails the quality audit is marked, and a user decides whether to delete it. Alternatively, data that fails the quality audit is traced: its data source, the department that uploaded it, the responsible person and so on are checked, and if data managed by a department or responsible person fails the quality audit multiple times, that department or responsible person is warned. If data from a data source fails the quality audit multiple times, it is judged whether the data source is replaceable; if so, the data source is removed and an alternative data source is adopted to provide the data. The quality audit is also performed again on other data provided by that data source, so as to effectively ensure data reliability. In other implementation scenarios, the data source may be intelligently analyzed to find the reasons for the poor quality of the data it provides, for example inaccurate monitoring data caused by a sensor that has been in service for a long time, or a data transmission mode that is prone to coding errors.
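Tracing repeated failures back to their origin amounts to counting failed records per data source, as in the sketch below. The `data_source` key and the limit of three failures are assumptions; the description says only "multiple times".

```python
from collections import Counter


def sources_to_warn(failed_records, max_failures=3):
    """Return the data sources whose records failed the audit max_failures times or more."""
    counts = Counter(record["data_source"] for record in failed_records)
    return sorted(source for source, n in counts.items() if n >= max_failures)
```

The same count keyed on department or responsible person would identify who should receive a warning.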
As can be seen from the above description, in this embodiment, data quality audit is performed on the fixed pollution source data, and data with unqualified quality audit is deleted, so that the data quality can be effectively improved, and thus the efficiency of data management is effectively improved.
Referring to fig. 4, fig. 4 is a flowchart illustrating a fourth embodiment of a method for integrating fixed pollution source data according to the present invention. The method for integrating the fixed pollution source data provided by the invention comprises the following steps:
S401: obtaining the data to be integrated of a pollution source to be integrated and the data source of the data to be integrated, importing the data to be integrated using a data integration mode that matches the data source, and obtaining the registered data of registered pollution sources.
S402: performing keyword matching between the name and/or identification code of the pollution source to be integrated and the name and/or identification code of the registered pollution source, and judging whether the registered pollution source is a matched pollution source for the pollution source to be integrated. If yes, step S403 is executed.
In a specific implementation scenario, steps S401 to S402 are substantially identical to steps S101 to S102 in the first embodiment of the method for integrating fixed pollution source data provided by the present invention, and will not be described herein.
S403: acquiring the association relation between the enterprise to be integrated and the registered enterprise, wherein the association relation comprises any one of a superior-subordinate relation, a parallel relation of being subordinate to the same superior, and a substantially identical relation.
In a specific implementation scenario, the data to be integrated further includes the enterprise to be integrated to which the pollution source to be integrated belongs, and the registered data further includes the registered enterprise to which the registered pollution source belongs. The association relation between the enterprise to be integrated and the registered enterprise can be provided by either enterprise or obtained by searching related information on the internet. The association relation includes any one of a superior-subordinate relation, a parallel relation of being subordinate to the same superior, and a substantially identical relation. For example, if Company B is a subsidiary of Company A, Company A and Company B are in a superior-subordinate relation. If Company B and Company C are both subsidiaries of Company A, Company B and Company C are in a parallel relation of being subordinate to the same superior. If Company D and Company C have the same address and the same fixed pollution source, they can be considered substantially identical.
S404: integrating the data to be integrated into the matched pollution source data of the matched pollution source according to the association relation, and adding corresponding association identification and association link on the data to be integrated and the matched pollution source data.
In a specific implementation scenario, the data to be integrated is integrated into the matched pollution source data of the matched pollution source according to the association relation. For example, if the enterprise to be integrated and the registered enterprise of the matched pollution source are in a substantially identical relation, the data to be integrated is added to the matched pollution source data, and the parts that are identical to or duplicate the matched pollution source data are deleted. If they are in a superior-subordinate relation, the data to be integrated is added to the matched pollution source data as a whole, as the data of the enterprise to be integrated. If they are in a parallel relation of being subordinate to the same superior, the data to be integrated is kept as a whole as independent data of the enterprise to be integrated and stored in parallel with the matched pollution source data.
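The three cases above can be sketched as a dispatch on the association relation. The in-memory `store` (a dict mapping a key to a list of records), the relation labels and the shape of the returned association link are all assumptions made for illustration.

```python
def integrate_by_relation(store, matched_key, incoming_key, incoming, relation):
    """Integrate incoming records into store according to the association relation.

    Mutates store in place and returns a simple association link.
    """
    existing = store[matched_key]
    if relation == "same":
        # Substantially identical: add the records, skipping duplicates.
        for record in incoming:
            if record not in existing:
                existing.append(record)
    elif relation == "superior_subordinate":
        # Superior-subordinate: attach the incoming data as one whole block.
        existing.append({"enterprise": incoming_key, "records": list(incoming)})
    elif relation == "same_superior":
        # Parallel under the same superior: store independently, side by side.
        store[incoming_key] = list(incoming)
    else:
        raise ValueError(f"unknown relation: {relation}")
    return {"from": incoming_key, "to": matched_key, "relation": relation}
```

The returned dict plays the role of the association identification and link that is attached to both sets of data.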
In this implementation scenario, corresponding association identifications and/or association links are added to the data to be integrated and the matched pollution source data. When a user consults the data to be integrated or the matched pollution source data, the user can therefore see what other data is associated with the data currently being consulted and can reach that associated data through the association links, which greatly facilitates information acquisition and data management.
S405: constructing a pollution knowledge graph of the matched pollution source data and the data to be integrated according to the association relation.
In a specific implementation scenario, a pollution data knowledge graph of the matched pollution source data and the data to be integrated is constructed according to the association relation. The pollution data knowledge graph can be constructed by taking each piece of data (including the data to be integrated and the matched pollution source data), each enterprise (including the enterprises corresponding to the data to be integrated and to the matched pollution source) and each pollution source (including the pollution source to be integrated and the matched pollution source) as nodes. Through the pollution data knowledge graph the user can see the relations among the nodes more clearly, and when searching only needs to enter the content of any node to obtain the data of the other nodes related to it, which greatly facilitates use and removes the need for manual searching and sorting. Furthermore, the pollution relation between each pollution source and each enterprise can be derived from the pollution knowledge graph, and the key industries and terminal enterprises for pollution control can be identified.
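A knowledge graph with data, enterprises and pollution sources as nodes can be pictured as an undirected labelled graph. The class below is a minimal sketch using an adjacency dict; the node naming scheme (`enterprise:A`, `source:S1`, `data:D1`) and the relation labels are hypothetical.

```python
from collections import defaultdict


class PollutionGraph:
    """Undirected labelled graph over data, enterprise and pollution source nodes."""

    def __init__(self):
        self.edges = defaultdict(set)

    def add_relation(self, node_a, node_b, relation):
        """Record a relation between two nodes in both directions."""
        self.edges[node_a].add((relation, node_b))
        self.edges[node_b].add((relation, node_a))

    def related(self, node):
        """Return the sorted names of all nodes directly related to node."""
        return sorted(other for _, other in self.edges.get(node, ()))
```

Looking up any one node, for example `graph.related("source:S1")`, then yields the enterprises and data linked to it without a manual search.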
As can be seen from the above description, in this embodiment, the association relationship between the enterprise to be integrated and the registered enterprise is obtained, the data to be integrated is integrated into the matching pollution source data of the matching pollution source according to the association relationship, and the corresponding association identifier and association link are added to the data to be integrated and the matching pollution source data, so that the user can conveniently and quickly obtain the related data, and the efficiency of data management and searching is improved.
Referring to fig. 5, fig. 5 is a flowchart illustrating a fifth embodiment of a method for integrating fixed pollution source data according to the present invention. The method for integrating the fixed pollution source data provided by the invention comprises the following steps:
S501: obtaining the data to be integrated of a pollution source to be integrated and the data source of the data to be integrated, importing the data to be integrated using a data integration mode that matches the data source, and obtaining the registered data of registered pollution sources.
In one specific implementation scenario, the data to be integrated includes the data source of the pollution source to be integrated, the registered data includes the data source of the registered pollution source, and the data source includes at least one of a providing unit, the system to which it belongs, and a sharing mode. The providing unit includes the enterprise or department to which the pollution source to be integrated or the registered pollution source belongs; the system to which it belongs includes the system of the enterprise to be integrated or of the pollution source to be integrated. The sharing mode includes unconditional sharing, conditional sharing and no sharing.
The data to be integrated includes the name and/or identification code of the pollution source to be integrated. When the pollution source to be integrated has an identification code, the identification code is preferred. If it has no identification code (for example, a newly added fixed pollution source that has not been registered or has not yet applied for an identification code), the name of the pollution source to be integrated is acquired instead; the name may include at least one of the location, emission type, main pollutants, affiliated enterprise and affiliated administrative area of the pollution source to be integrated. The identification code of a registered pollution source is obtained according to a preset coding rule when the fixed pollution source is registered. The name of a registered pollution source may include at least one of the location, emission type, main pollutants, affiliated enterprise and affiliated administrative area of the registered pollution source.
S502: performing keyword matching between the name and/or identification code of the pollution source to be integrated and the name and/or identification code of the registered pollution source, and judging whether the name and/or identification code of the pollution source to be integrated is completely consistent with the name and/or identification code of the registered pollution source. If yes, step S503 is executed; if no, step S504 is executed.
In a specific implementation scenario, step S502 is substantially identical to step S102 of the first embodiment of the method for integrating fixed pollution source data provided in the present invention, and will not be described herein.
S503: taking the registered pollution source as the matched pollution source.
In this implementation scenario, if the name and/or identification code of the pollution source to be integrated is completely consistent with the name and/or identification code of the registered pollution source, the registered pollution source may be considered substantially the same as the pollution source to be integrated, and the registered pollution source is taken as the matched pollution source.
S504: judging whether the data source of the pollution source to be integrated is consistent with the data source of the registered pollution source. If yes, step S505 is executed; if no, step S507 is executed.
In a specific implementation scenario, the name and/or identification code of the pollution source to be integrated is only partially consistent with that of the registered pollution source; further, the consistent part lies at preset designated positions, so the pollution source to be integrated and the registered pollution source may match each other. It is then judged whether the data source of the pollution source to be integrated is consistent with the data source of the registered pollution source. If the data sources are consistent, the probability that the two match is high. If the data sources are inconsistent, for example the two belong to different enterprises and different systems, the two do not match, and the pollution source to be integrated can be taken as a new pollution source.
S505: acquiring the data characteristics of the data to be integrated and the data characteristics of the registered pollution source, and judging whether they are consistent. If yes, step S503 is executed; if no, step S506 is executed.
In one particular implementation scenario, data characteristics of data to be integrated and data characteristics of registered pollution sources are acquired. The data characteristics of the data to be integrated comprise at least one of emission peak value, emission valley value, emission period, emission trend, peak emission time and valley emission time of the data to be integrated; the data characteristic of the registered data includes at least one of emission peak, emission trough, emission period, emission trend, peak emission time, trough emission time of the registered data. If the data to be integrated and the registered data correspond to the same pollutant and the data features are the same, the probability of matching the two is high. Otherwise, the two do not match.
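A few of the listed data characteristics can be extracted and compared as in the sketch below. It covers only peak, valley and their times, takes the series as `(time, value)` pairs, and uses an assumed 10% tolerance on the values; emission period and trend are left out.

```python
def emission_features(series):
    """Extract peak/valley values and times from a non-empty list of (time, value) pairs."""
    peak_time, peak = max(series, key=lambda point: point[1])
    valley_time, valley = min(series, key=lambda point: point[1])
    return {"peak": peak, "valley": valley,
            "peak_time": peak_time, "valley_time": valley_time}


def features_consistent(a, b, tolerance=0.1):
    """Return True if two feature dicts agree on times and roughly on values."""
    for key in ("peak", "valley"):
        larger = max(abs(a[key]), abs(b[key]))
        if larger and abs(a[key] - b[key]) / larger > tolerance:
            return False
    return a["peak_time"] == b["peak_time"] and a["valley_time"] == b["valley_time"]
```

Two series for the same pollutant whose features pass this check would be treated as likely describing the same pollution source.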
S506: judging whether the reliability of the data to be integrated meets the preset requirement. If yes, step S507 is executed.
In a specific implementation scenario, if the data characteristics of the data to be integrated and those of the registered pollution source are inconsistent, the two do not match, and it is judged whether the reliability of the data to be integrated meets the preset requirement. The reliability judgment may detect whether the data to be integrated was uploaded by designated personnel of a designated unit through a designated channel. It may also detect whether the data to be integrated carries a preset data detection mark; if so, the data has undergone data quality detection and its reliability is higher. It may also detect whether the sensor that collected the data to be integrated is a designated sensor.
S507: taking the pollution source to be integrated as a newly added pollution source.
In a specific implementation scenario, if the reliability of the data to be integrated meets the preset requirement, the pollution source to be integrated is taken as a new pollution source and the data to be integrated as the new pollution source data. In other implementation scenarios, if the reliability of the data to be integrated does not meet the preset requirement, the data to be integrated is deleted.
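The branching of steps S502 to S507 can be summarized as a small decision function. The four boolean inputs are assumed to have been computed by the earlier checks, and treating data that fails the reliability requirement as discarded is an interpretation adopted for this sketch.

```python
def classify(fully_consistent, same_data_source, same_features, reliable):
    """Return "match", "new" or "discard" following the S502-S507 flow."""
    if fully_consistent:          # S502 yes -> S503
        return "match"
    if not same_data_source:      # S504 no -> S507
        return "new"
    if same_features:             # S505 yes -> S503
        return "match"
    # S506: unmatched data is registered as new only if it is reliable.
    return "new" if reliable else "discard"
```

Each return value corresponds to one terminal outcome of the flowchart: integration into the matched pollution source, registration as a newly added pollution source, or deletion of the data.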
As can be seen from the above description, in this embodiment whether the registered pollution source is a matched pollution source for the pollution source to be integrated is judged according to the data source, and further according to whether the data characteristics of the data to be integrated and of the registered pollution source are consistent, which effectively improves the accuracy and reliability of the judgment result.
Referring to fig. 6, fig. 6 is a schematic structural diagram of an embodiment of a fixed pollution source data integration system according to the present invention. The fixed pollution source data integration system 10 comprises an acquisition module 11, a matching module 12, an integration module 13 and a new addition module 14.
The acquisition module 11 is configured to acquire to-be-integrated data of a to-be-integrated pollution source and a data source of the to-be-integrated data, import the to-be-integrated data in a data integration manner matched with the data source, wherein the to-be-integrated data includes a name and/or an identification code of the to-be-integrated pollution source, acquire registered data of a registered pollution source, and the registered data includes the name and/or the identification code of the registered pollution source; the matching module 12 is configured to perform keyword matching on the name and/or the identification code of the pollution source to be integrated and the name and/or the identification code of the registered pollution source, and determine whether the registered pollution source is a matching pollution source matched with the pollution source to be integrated; the integration module 13 is configured to integrate the data to be integrated into the matching pollution source data of the matching pollution source if the registered pollution source is the matching pollution source matched with the pollution source to be integrated; the adding module 14 is configured to generate an added pollution source and added pollution source data according to the pollution source data to be integrated if the registered pollution source is not a matched pollution source matched with the pollution source to be integrated.
The new addition module 14 is configured to obtain a name and/or an identification code of the newly added pollution source according to a preset naming rule and/or coding rule.
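As an illustration only, a preset coding rule could be a prefix, a district abbreviation, and a zero-padded serial number. The rule itself, the function name `make_code_generator`, and the district abbreviations are hypothetical; the patent does not specify the naming or coding rules.

```python
import itertools


def make_code_generator(prefix: str, width: int = 4, start: int = 1):
    """Return a function that issues sequential identification codes."""
    counter = itertools.count(start)

    def next_code(district: str) -> str:
        # Hypothetical rule: PREFIX-DISTRICT-SERIAL, serial zero-padded to `width`.
        return f"{prefix}-{district}-{next(counter):0{width}d}"

    return next_code
```

Calling the returned function once per newly added pollution source yields codes with a shared, monotonically increasing serial number.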
The new addition module 14 is configured to generate a fixed pollution source list from all newly added pollution sources and registered pollution sources, and to determine whether the list contains combinable pollution sources that are substantially the same; at least one of de-duplication, overwriting, and deletion is performed on the combinable pollution source data of the combinable pollution sources, and the combinable pollution source data are merged into combined pollution source data.
The new addition module 14 is configured to construct a pollution source file for each fixed pollution source in the fixed pollution source list, where the pollution source file includes at least one of a name, an address, the administrative district, industry, and enterprise to which the pollution source belongs, an identification code, a management attribute, a supervision flow chart, a pollution source emission amount, and monitoring information.
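The merging of combinable pollution sources can be sketched as follows. The criterion for "substantially the same" (same identification code, or same name when no code exists) and the dictionary layout keyed by reporting period are assumptions for illustration, not definitions from the patent. Later entries overwrite earlier values for the same period, and identical periods are stored once.

```python
def merge_combinable(sources: list) -> list:
    """Merge sources that share a code (or a name, if the code is empty).

    Each source is a dict with 'name', 'code', and 'records', where
    'records' maps a reporting period to a value (hypothetical layout).
    """
    merged = {}
    for src in sources:
        key = (src["code"] or src["name"]).strip().lower()
        if key not in merged:
            merged[key] = {
                "name": src["name"],
                "code": src["code"],
                "records": dict(src["records"]),
            }
        else:
            # De-duplication and overwriting in one step: identical periods
            # collapse to one entry, and the later value wins.
            merged[key]["records"].update(src["records"])
    return list(merged.values())
```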
The data to be integrated further includes the enterprise to be integrated to which the pollution source to be integrated belongs; the registered data further includes the registered enterprise to which the registered pollution source belongs. The integration module 13 is configured to obtain an association relationship between the enterprise to be integrated and the registered enterprise, where the association relationship includes any one of a superior relationship, a subordinate relationship, a parallel relationship under the same superior, and a substantially-same relationship; to integrate the data to be integrated into the matching pollution source data of the matching pollution source according to the association relationship; and to add corresponding association identifiers and/or association links to the data to be integrated and the matching pollution source data.
The integration module 13 is configured to construct a pollution data knowledge graph of the matching pollution source data and the data to be integrated according to the association relationship.
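One simple way to represent the four association relationships and the resulting graph is an adjacency map of labeled edges. The sketch below is an assumption about representation only: the patent does not describe the graph's data model, and the names `Relation`, `INVERSE`, and `build_graph` are hypothetical. A triple `(a, rel, b)` is read as "a is the `rel` of b", and the inverse edge is stored on `b`.

```python
from enum import Enum


class Relation(Enum):
    SUPERIOR = "superior"
    SUBORDINATE = "subordinate"
    SAME_SUPERIOR = "same_superior"
    SUBSTANTIALLY_SAME = "substantially_same"


# Superior and subordinate are inverses; the other two are symmetric.
INVERSE = {
    Relation.SUPERIOR: Relation.SUBORDINATE,
    Relation.SUBORDINATE: Relation.SUPERIOR,
    Relation.SAME_SUPERIOR: Relation.SAME_SUPERIOR,
    Relation.SUBSTANTIALLY_SAME: Relation.SUBSTANTIALLY_SAME,
}


def build_graph(links) -> dict:
    """Build {enterprise: {(relation_label, other_enterprise), ...}}."""
    graph = {}
    for a, rel, b in links:
        graph.setdefault(a, set()).add((rel.value, b))
        graph.setdefault(b, set()).add((INVERSE[rel].value, a))
    return graph
```

The stored edges can double as the association identifiers or links attached to each data set, since every edge names both the related enterprise and the kind of relationship.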
The data to be integrated comprises a data source of the pollution source to be integrated, the registered data comprises a data source of the registered pollution source, and the data source comprises at least one of a providing unit, a system and a sharing mode. The integration module 13 is configured to take the registered pollution source as a matching pollution source if the name and/or identification code of the pollution source to be integrated is completely consistent with the name and/or identification code of the registered pollution source; the matching module 12 is configured to obtain data sources of the pollution source to be integrated and the registered pollution source if the name and/or the identification code of the pollution source to be integrated is partially consistent with the name and/or the identification code of the registered pollution source, and determine whether the registered pollution source is a matching pollution source matching the pollution source to be integrated according to the data sources.
The matching module 12 is configured to determine whether the data source of the pollution source to be integrated is consistent with the data source of the registered pollution source; if the two data sources are completely consistent, the data features of the data to be integrated and the data features of the registered pollution source are obtained, and it is determined whether these data features are consistent. The integration module 13 is configured to take the registered pollution source as a matching pollution source if the data features of the data to be integrated are consistent with the data features of the registered pollution source. The new addition module 14 is configured to determine whether the reliability of the data to be integrated meets a preset requirement if the data features of the data to be integrated are inconsistent with the data features of the registered pollution source, and to take the pollution source to be integrated as a newly added pollution source if the reliability meets the preset requirement.
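The decision cascade described above can be summarized as a small function. This is a sketch under stated assumptions: the inputs are taken as already-computed flags, and the outcome `"unspecified"` marks branches for which this passage gives no rule (a partial name match with different data sources, or inconsistent features with insufficient reliability). The function name `decide` and the string labels are hypothetical.

```python
def decide(name_match: str, same_source: bool = False,
           features_consistent: bool = False, reliable: bool = False) -> str:
    """Return 'match', 'new', or 'unspecified' for one candidate pair."""
    if name_match not in ("full", "partial", "none"):
        raise ValueError(f"unknown name_match: {name_match!r}")
    if name_match == "full":
        return "match"        # names/codes completely consistent
    if name_match == "none":
        return "new"          # not a matching source: add as a new source
    # Partial name/code match: fall back to data source, then data features.
    if not same_source:
        return "unspecified"  # not covered by the passage above
    if features_consistent:
        return "match"
    return "new" if reliable else "unspecified"
```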
The data features of the data to be integrated include at least one of the emission peak value, emission valley value, emission period, emission trend, peak emission time, and valley emission time of the data to be integrated; the data features of the registered data include at least one of the emission peak value, emission valley value, emission period, emission trend, peak emission time, and valley emission time of the registered data.
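A few of these features can be computed from a time series and compared as in the sketch below. The series layout (pairs of hour and emission value), the crude first-versus-last trend estimate, and the 10% relative tolerance are illustrative assumptions; the patent does not state how the features are extracted or how consistency is judged. Emission period is omitted for brevity.

```python
def extract_features(series: list) -> dict:
    """series: non-empty list of (hour, emission) pairs in time order."""
    peak_time, peak = max(series, key=lambda p: p[1])
    valley_time, valley = min(series, key=lambda p: p[1])
    first, last = series[0][1], series[-1][1]
    trend = "rising" if last > first else "falling" if last < first else "flat"
    return {"peak": peak, "valley": valley, "peak_time": peak_time,
            "valley_time": valley_time, "trend": trend}


def features_consistent(a: dict, b: dict, rel_tol: float = 0.1) -> bool:
    """Compare two feature dicts; magnitudes may differ by rel_tol."""
    def close(x: float, y: float) -> bool:
        return abs(x - y) <= rel_tol * max(abs(x), abs(y), 1e-9)

    return (close(a["peak"], b["peak"])
            and close(a["valley"], b["valley"])
            and a["peak_time"] == b["peak_time"]
            and a["valley_time"] == b["valley_time"]
            and a["trend"] == b["trend"])
```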
The integration module 13 is configured to perform a data quality audit on the registered data, the newly added pollution source data, or the integrated matching pollution source data, and to delete data that fail the quality audit, where the quality audit includes a data integrity audit, a data validity audit, and a data reliability audit.
The data integrity audit includes verifying whether the data content includes the necessary data items; the data validity audit includes checking whether the data content is within its validity period; the data reliability audit includes checking whether the data source is an officially certified data source.
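The three audits can be sketched as independent checks over a record, with failing records dropped. The required item names, the set of officially certified sources, and the `valid_from`/`valid_until` fields are hypothetical placeholders; the patent does not enumerate them.

```python
from datetime import date

REQUIRED_ITEMS = {"name", "code", "emission"}                    # hypothetical
OFFICIAL_SOURCES = {"environment_bureau", "online_monitoring"}   # hypothetical


def audit(record: dict, today: date) -> dict:
    """Run the integrity, validity, and reliability audits on one record."""
    present = (k for k, v in record.items() if v not in (None, ""))
    return {
        # Integrity: every necessary data item is present and non-empty.
        "integrity": REQUIRED_ITEMS.issubset(present),
        # Validity: today falls inside the record's validity period.
        "validity": record.get("valid_from", date.min) <= today
                    <= record.get("valid_until", date.max),
        # Reliability: the data source is on the certified list.
        "reliability": record.get("source") in OFFICIAL_SOURCES,
    }


def filter_qualified(records: list, today: date) -> list:
    """Keep only records that pass all three audits."""
    return [r for r in records if all(audit(r, today).values())]
```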
The acquisition module 11 is configured to: enter the data to be integrated into a preset form matched with the fixed pollution source data type, and acquire the data entered in the preset form; and/or acquire the data to be integrated, automatically review the data to be integrated, and import the data to be integrated by data replication; and/or, when to-be-integrated data of a newly added pollution source to be integrated is detected, acquire that data through a standard data interface, convert it into a standard format, and then import it; and/or import the data to be integrated through a data import tool and a data management tool.
Referring to fig. 7, fig. 7 is a schematic structural diagram of another embodiment of a fixed pollution source data integration system according to the present invention. The fixed pollution source data integration system 20 includes a processor 21 and a memory 22. The processor 21 is coupled to the memory 22. The memory 22 stores a computer program which, when executed by the processor 21, implements the method shown in fig. 1 to 5. The detailed method is described above and is not repeated here.
Referring to fig. 8, fig. 8 is a schematic structural diagram of a storage medium according to an embodiment of the invention. The storage medium 30 stores at least one computer program 31, and the computer program 31 is configured to be executed by a processor to implement the method shown in fig. 1 to 5; the detailed method is described above and is not repeated here. In one embodiment, the storage medium 30 may be a memory chip, a hard disk, or a removable hard disk in a terminal, or another readable and writable storage device such as a flash drive or an optical disk, and may also be a server or the like.
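Claim 1 ties each import mode to a kind of data, which can be expressed as a lookup table. The sketch below follows that mapping; the key and value strings and the function name `choose_import_mode` are hypothetical labels, not identifiers from the patent.

```python
IMPORT_MODES = {
    # Historical data, or data integrated/collected by other software:
    # periodic automatic import via import and management tools.
    "historical": "import_tool",
    "other_software": "import_tool",
    # Business data such as environmental approvals and permits.
    "approval_permit": "standard_interface",
    # Online monitoring data.
    "online_monitoring": "data_replication",
    # Public codes, basic source information, monitoring points, sections.
    "public_code": "preset_form",
    "basic_info": "preset_form",
    "quality_point": "preset_form",
    "section_info": "preset_form",
}


def choose_import_mode(data_kind: str) -> str:
    """Return the import mode label for a kind of to-be-integrated data."""
    try:
        return IMPORT_MODES[data_kind]
    except KeyError:
        raise ValueError(f"no import mode defined for {data_kind!r}") from None
```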
Those skilled in the art will appreciate that all or part of the processes of the methods in the above embodiments may be implemented by a computer program instructing relevant hardware. The program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered within the scope of this description.
The foregoing examples represent only a few embodiments of the present application. They are described in relative detail, but are not thereby to be construed as limiting the scope of the present application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the spirit of the present application, and these all fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims (13)

1. A method for integrating fixed pollution source data, comprising:
acquiring to-be-integrated data of a to-be-integrated pollution source and a data source of the to-be-integrated data, and importing the to-be-integrated data in a data integration mode matched with the data source, wherein the to-be-integrated data comprises a name and/or an identification code of the to-be-integrated pollution source;
when the data to be integrated is historical data or data integrated or collected by other software, automatically and periodically acquiring the data to be integrated by using a data warehouse technology data import tool and a corresponding data warehouse technology management tool;
when the data to be integrated is business data including environmental approvals and permits, providing a standard data interface, and acquiring the newly added data to be integrated of the pollution source to be integrated through the standard data interface;
when the data to be integrated is online monitoring data, importing the data to be integrated by data replication;
when the data to be integrated is at least one of public codes, basic pollution source information, environmental quality monitoring points, and section information, entering the data to be integrated into a preset form matched with the fixed pollution source data type, and acquiring the data entered in the preset form;
acquiring registered data of a registered pollution source, wherein the registered data comprises a name and/or an identification code of the registered pollution source;
keyword matching is carried out on the name and/or the identification code of the pollution source to be integrated and the name and/or the identification code of the registered pollution source, and whether the registered pollution source is a matched pollution source matched with the pollution source to be integrated is judged;
if the registered pollution source is a matched pollution source matched with the pollution source to be integrated, integrating the data to be integrated into matched pollution source data of the matched pollution source;
if the registered pollution source is not a matched pollution source matched with the pollution source to be integrated, generating a newly added pollution source and newly added pollution source data according to the pollution source data to be integrated;
wherein the data to be integrated comprises a data source of the pollution source to be integrated, the registered data comprises a data source of the registered pollution source, and the data source comprises at least one of a providing unit, a system and a sharing mode;
the step of judging whether the registered pollution sources are matched pollution sources matched with the pollution sources to be integrated or not includes:
if the name and/or the identification code of the pollution source to be integrated is completely consistent with the name and/or the identification code of the registered pollution source, the registered pollution source is used as the matched pollution source;
if the name and/or the identification code of the pollution source to be integrated is partially consistent with the name and/or the identification code of the registered pollution source, acquiring the data sources of the pollution source to be integrated and the registered pollution source, and judging whether the registered pollution source is a matched pollution source matched with the pollution source to be integrated according to the data sources;
the step of judging whether the registered pollution sources are matched pollution sources matched with the pollution sources to be integrated according to the data sources comprises the following steps:
judging whether the data source of the pollution source to be integrated is consistent with the data source of the registered pollution source;
if the data source of the pollution source to be integrated is completely consistent with the data source of the registered pollution source, acquiring the data characteristics of the data to be integrated and the data characteristics of the registered pollution source, and judging whether the data characteristics of the data to be integrated and the data characteristics of the registered pollution source are consistent or not;
if the data characteristics of the data to be integrated are consistent with the data characteristics of the registered pollution sources, the registered pollution sources are used as the matched pollution sources;
if the data characteristics of the data to be integrated are inconsistent with the data characteristics of the registered pollution sources, judging whether the reliability of the data to be integrated meets the preset requirement, and if the reliability of the data to be integrated meets the preset requirement, taking the pollution sources to be integrated as the newly added pollution sources.
2. The method of claim 1, wherein the step of generating the newly added pollution source and the newly added pollution source data from the pollution source data to be integrated comprises:
acquiring the name and/or identification code of the newly added pollution source according to a preset naming rule and/or coding rule.
3. The method for integrating fixed pollution source data according to claim 1, wherein after the step of generating the newly added pollution source and the newly added pollution source data from the pollution source data to be integrated, the method comprises:
generating a fixed pollution source list according to all the current newly added pollution sources and registered pollution sources, and judging whether the fixed pollution source list contains substantially the same combinable pollution sources;
performing at least one of de-duplication, overwriting, and deletion on the combinable pollution source data of the combinable pollution sources, and merging the combinable pollution source data into combined pollution source data.
4. The method for integrating fixed pollution source data according to claim 3, wherein after said step of generating a fixed pollution source list from all current newly added pollution sources and registered pollution sources, the method comprises:
constructing a pollution source file for each fixed pollution source in the fixed pollution source list, wherein the pollution source file comprises at least one of a name, an address, an administrative area, an industry, an enterprise, an identification code, a management attribute, a supervision flow chart, pollution source emission, and monitoring information.
5. The method for integrating fixed pollution source data according to claim 1, wherein the data to be integrated further comprises an enterprise to be integrated to which the pollution source to be integrated belongs; the registered data also includes a registered business to which the registered pollution source belongs;
the step of integrating the data to be integrated into the matching pollution source data of the matching pollution source comprises the following steps:
acquiring an association relationship between the enterprise to be integrated and the registered enterprise, wherein the association relationship comprises any one of a superior relationship, a subordinate relationship, a same superior relationship and a substantially same relationship;
and integrating the data to be integrated into the matching pollution source data of the matching pollution source according to the association relation, and adding corresponding association identifiers and/or association links on the data to be integrated and the matching pollution source data.
6. The method for integrating fixed pollution source data according to claim 5, wherein said step of acquiring the association relationship between the enterprise to be integrated and the registered enterprise comprises:
constructing a pollution data knowledge graph of the matched pollution source data and the data to be integrated according to the association relation.
7. The method of claim 1, wherein the data characteristics of the data to be integrated include at least one of emission peaks, emission valleys, emission periods, emission trends, peak emission times, valley emission times of the data to be integrated;
the data characteristic of the registered data includes at least one of emission peak, emission trough, emission period, emission trend, peak emission time, trough emission time of the registered data.
8. The method for integrating fixed pollution source data according to claim 1, wherein after the step of generating the newly added pollution source and the newly added pollution source data from the pollution source data to be integrated, the method comprises:
performing a data quality audit on the registered data, the newly added pollution source data, or the integrated matched pollution source data, and deleting the data that fail the quality audit, wherein the quality audit comprises a data integrity audit, a data validity audit, and a data reliability audit.
9. The method for integrating fixed pollution source data according to claim 8, wherein
the data integrity audit includes: verifying whether the data content includes the necessary data items;
the data validity audit includes: checking whether the data content is within the data validity period;
the data reliability audit includes: checking whether the data source is an officially certified data source.
10. The method for integrating fixed pollution source data according to claim 1, wherein said step of importing said data to be integrated in a data integration manner matched with said data source comprises:
entering the data to be integrated into a preset form matched with the fixed pollution source data type, and acquiring the data entered in the preset form; and/or
acquiring the data to be integrated, automatically reviewing the data to be integrated, and importing the data to be integrated by data replication; and/or
when the to-be-integrated data of a newly added pollution source to be integrated is detected, acquiring the to-be-integrated data of the newly added pollution source to be integrated through a standard data interface, converting it into a standard format, and then importing it; and/or
importing the data to be integrated through a data import tool and a data management tool.
11. A fixed pollution source data integration system, comprising:
an acquisition module, configured to acquire to-be-integrated data of a to-be-integrated pollution source and a data source of the to-be-integrated data, and to import the to-be-integrated data in a data integration mode matched with the data source, wherein the to-be-integrated data comprises a name and/or an identification code of the to-be-integrated pollution source;
when the data to be integrated is historical data or data integrated or collected by other software, automatically and periodically acquiring the data to be integrated by using a data warehouse technology data import tool and a corresponding data warehouse technology management tool;
when the data to be integrated is business data including environmental approvals and permits, providing a standard data interface, and acquiring the newly added data to be integrated of the pollution source to be integrated through the standard data interface;
when the data to be integrated is online monitoring data, importing the data to be integrated by data replication;
when the data to be integrated is at least one of public codes, basic pollution source information, environmental quality monitoring points, and section information, entering the data to be integrated into a preset form matched with the fixed pollution source data type, and acquiring the data entered in the preset form;
acquiring registered data of a registered pollution source, wherein the registered data comprises a name and/or an identification code of the registered pollution source;
the matching module is used for carrying out keyword matching on the name and/or the identification code of the pollution source to be integrated and the name and/or the identification code of the registered pollution source, and judging whether the registered pollution source is a matched pollution source matched with the pollution source to be integrated or not;
the integration module is used for integrating the data to be integrated into the matching pollution source data of the matching pollution source if the registered pollution source is the matching pollution source matched with the pollution source to be integrated;
the new adding module is used for generating a new adding pollution source and new adding pollution source data according to the pollution source data to be integrated if the registered pollution source is not a matched pollution source matched with the pollution source to be integrated;
wherein the data to be integrated comprises a data source of the pollution source to be integrated, the registered data comprises a data source of the registered pollution source, and the data source comprises at least one of a providing unit, a system and a sharing mode;
the matching module is also used for:
if the name and/or the identification code of the pollution source to be integrated is completely consistent with the name and/or the identification code of the registered pollution source, the registered pollution source is used as the matched pollution source;
if the name and/or the identification code of the pollution source to be integrated is partially consistent with the name and/or the identification code of the registered pollution source, acquiring the data sources of the pollution source to be integrated and the registered pollution source, and judging whether the registered pollution source is a matched pollution source matched with the pollution source to be integrated according to the data sources;
the matching module is also used for:
judging whether the data source of the pollution source to be integrated is consistent with the data source of the registered pollution source;
if the data source of the pollution source to be integrated is completely consistent with the data source of the registered pollution source, acquiring the data characteristics of the data to be integrated and the data characteristics of the registered pollution source, and judging whether the data characteristics of the data to be integrated and the data characteristics of the registered pollution source are consistent or not;
if the data characteristics of the data to be integrated are consistent with the data characteristics of the registered pollution sources, the registered pollution sources are used as the matched pollution sources;
if the data characteristics of the data to be integrated are inconsistent with the data characteristics of the registered pollution sources, judging whether the reliability of the data to be integrated meets the preset requirement, and if the reliability of the data to be integrated meets the preset requirement, taking the pollution sources to be integrated as the newly added pollution sources.
12. A fixed pollution source data integration system, comprising a memory storing a computer program and a processor executing the computer program to perform the steps of the method of any of claims 1 to 11.
13. The fixed pollution source data integration system of claim 12, wherein the memory stores a computer program that, when executed by a processor, causes the processor to perform the steps of the method of any one of claims 1 to 11.
CN202210443669.2A 2022-04-26 2022-04-26 Data integration system and method for fixed pollution source Active CN114860875B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210443669.2A CN114860875B (en) 2022-04-26 2022-04-26 Data integration system and method for fixed pollution source

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210443669.2A CN114860875B (en) 2022-04-26 2022-04-26 Data integration system and method for fixed pollution source

Publications (2)

Publication Number Publication Date
CN114860875A CN114860875A (en) 2022-08-05
CN114860875B true CN114860875B (en) 2023-06-20

Family

ID=82634224

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210443669.2A Active CN114860875B (en) 2022-04-26 2022-04-26 Data integration system and method for fixed pollution source

Country Status (1)

Country Link
CN (1) CN114860875B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115238658B (en) * 2022-09-22 2023-01-31 中科三清科技有限公司 Data processing method and device, storage medium and electronic equipment

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108664480A (en) * 2017-03-27 2018-10-16 北京国双科技有限公司 A kind of multi-data source user information integration method and device
CN109376210A (en) * 2018-10-24 2019-02-22 海口金政信息科技有限公司 A kind of intelligence pollution sources dynamic management system and method
CN110851667A (en) * 2019-09-25 2020-02-28 中国移动通信集团河南有限公司 Integrated analysis method and tool for multi-source large data
CN110852601A (en) * 2019-11-07 2020-02-28 佛山市南海区环境技术中心 Big data application method and system for environmental monitoring law enforcement decision
CN111708773A (en) * 2020-08-13 2020-09-25 江苏宝和数据股份有限公司 Multi-source scientific and creative resource data fusion method
CN113297448A (en) * 2021-05-13 2021-08-24 中国电波传播研究所(中国电子科技集团公司第二十二研究所) Open-source electric wave environment data acquisition method based on web crawler and computer readable storage medium
CN113312342A (en) * 2021-05-26 2021-08-27 北京航空航天大学 Scientific and technological resource integration system based on multi-source database
CN113360599A (en) * 2021-05-18 2021-09-07 苏州海赛人工智能有限公司 Multi-source heterogeneous information convergence cooperative processing platform based on content identification
CN113792160A (en) * 2021-09-17 2021-12-14 南京大创师智能科技有限公司 Knowledge graph expansion and fusion method for multi-source data

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050165752A1 (en) * 2004-01-28 2005-07-28 Sun Microsystems, Inc. Synchronizing and consolidating information from multiple source systems of a distributed enterprise information system
CN107247787A (en) * 2017-06-15 2017-10-13 山东浪潮云服务信息科技有限公司 A kind of sorting technique based on multisource data fusion
CN112732713A (en) * 2020-12-29 2021-04-30 郑州信大捷安信息技术股份有限公司 Data integration method and system based on same user in multiple data sources

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Interactive buffer analysis query based on WebGIS; Ji Jie et al.; Computer Applications and Software; Vol. 29 (No. 3); 235-238 *

Also Published As

Publication number Publication date
CN114860875A (en) 2022-08-05

Similar Documents

Publication Publication Date Title
Barron et al. A comprehensive framework for intrinsic OpenStreetMap quality analysis
CN105868373B (en) Method and device for processing key data of power business information system
Rodseth et al. A revised approach for estimating informally disposed domestic waste in rural versus urban South Africa and implications for waste management
Chapman et al. Developing Standards for Improved Data Quality and for Selecting Fit for Use Biodiversity Data.
CN109063178B (en) Method and device for automatically expanding self-help analysis report
CN109241223B (en) Behavior track identification method and system
CN104731816A (en) Method and device for processing abnormal business data
CN114860875B (en) Data integration system and method for fixed pollution source
CN113469857A (en) Data processing method and device, electronic equipment and storage medium
Turner Defining and measuring traffic data quality: White paper on recommended approaches
CN115794839B (en) Data collection method based on Php+Mysql system, computer equipment and storage medium
CN111078512A (en) Alarm record generation method and device, alarm equipment and storage medium
CN112817958A (en) Electric power planning data acquisition method and device and intelligent terminal
CN112948504B (en) Data acquisition method and device, computer equipment and storage medium
CN115982429B (en) Knowledge management method and system based on flow control
CN116260866A (en) Government information pushing method and device based on machine learning and computer equipment
CN116506186A (en) Big data layering analysis method for network security level protection evaluation data
KR101415528B1 (en) Apparatus and Method for processing data error for distributed system
CN114580945A (en) Water environment data analysis method, device, equipment and storage medium
Taylor et al. A provenance maturity model
CN114218383A (en) Method, device and application for judging repeated events
KR20180077397A (en) System for constructing software project relationship and method thereof
CN111914147A (en) Suspected actual control person credit investigation method and system for enterprise
KR100693370B1 (en) Duplicated database merge purge arrangement apparatus and the Method Thereof
CN116775508B (en) Garbage cleaning method and system for android mobile phone

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant