CN115827797A - Environmental data analysis and integration method and system based on big data - Google Patents

Publication number
CN115827797A
Authority
CN
China
Prior art keywords
data
analysis
historical
neural network
environmental
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211534650.5A
Other languages
Chinese (zh)
Inventor
谭立球
周敏
唐宇光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
HUNAN CREATOR INFORMATION TECHNOLOGIES CO LTD
Original Assignee
HUNAN CREATOR INFORMATION TECHNOLOGIES CO LTD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by HUNAN CREATOR INFORMATION TECHNOLOGIES CO LTD
Priority to CN202211534650.5A
Publication of CN115827797A
Legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides an environmental data analysis and integration method and system based on big data, relating to the technical field of environmental big data. The method comprises: obtaining historical data of environmental data to be detected; preprocessing the historical data and performing lightweight processing on the preprocessed historical data to obtain new, lightweight environmental data; performing clustering analysis on the environmental data to obtain an analysis result; constructing a neural network architecture model and training it according to the analysis result; and dividing the environmental data into data of different dimensions for storage according to the trained neural network architecture model. The method reduces manual workload and operation errors, improves integration efficiency, and enables refined, platform-based analysis and integration of environmental data under big data conditions.

Description

Environmental data analysis and integration method and system based on big data
Technical Field
The invention relates to the technical field of environmental big data, in particular to an environmental data analysis and integration method and system based on big data.
Background
As an important link in smart environmental protection, environmental big data integration technology must carry the concept through from terminal sensors to application services. Environmental data comprises single-dimensional data resources from various environmental protection fields such as water, atmosphere, soil, secondary pollution and pollution sources. Such single-dimensional data cannot meet the application requirements of environmental protection personnel, so the original, disordered pollution, monitoring and evaluation data must be processed scientifically. However, given the complexity and variety of environmental data, manual integration often has a high error rate and large errors, the accuracy and efficiency of data collection, extraction and data sharing services cannot fully meet the requirements of actual environmental protection work, and the intelligence level of environmental data integration systems still has considerable room for improvement.
Disclosure of Invention
The present invention is directed to a method and a system for analyzing and integrating environmental data based on big data, so as to solve the above-mentioned problems. In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
in a first aspect, the present application provides a big data-based environmental data analysis and integration method, including:
acquiring historical data of environmental data to be detected, wherein the historical data comprises track data left on the back end when a user accesses a front-end page, and the track data comprises energy consumption data, water source consumption data, waste data and carbon emission data;
preprocessing the historical data, and carrying out lightweight processing on the preprocessed historical data to obtain new environment data subjected to lightweight processing;
performing clustering analysis on the environmental data to obtain an analysis result;
constructing a neural network architecture model, and training the neural network architecture model according to the analysis result;
and dividing the environment data into different dimensional data for storage according to the trained neural network architecture model.
Preferably, preprocessing the historical data and performing lightweight processing on the preprocessed historical data to obtain the new environment data subjected to lightweight processing includes:
respectively extracting word frequency based on preset historical data and attributes of the historical data to obtain a first historical phrase, wherein the first historical phrase is a phrase with the word frequency higher than the preset historical data in the historical data;
classifying based on the attribute of the first historical phrase to obtain a second historical phrase;
analyzing the historical data to obtain analyzed data analysis information;
traversing the data analysis information and judging, based on the second historical phrase, whether each piece of data analysis information is related to the track data, wherein if a piece of data analysis information is related to the track data, it is retained; and if it is not related to the track data, it is rejected and the next piece of data analysis information is traversed, until all of the data analysis information has been traversed, so as to obtain the new environment data subjected to lightweight processing.
Preferably, the clustering analysis is performed on the environmental data to obtain an analysis result, which includes:
performing clustering analysis on the environment data by adopting a preset clustering model, and sampling and aggregating local characteristics of nodes to generate node representation;
combining an attention mechanism, calculating weights for the nodes by an inner product operation, and adaptively and automatically learning features;
and according to the constraint and integration of the clustering model to the graph embedding, training and verifying the clustering model, and adjusting and optimizing network parameters to obtain an analysis result.
Preferably, the building a neural network architecture model, and training the neural network architecture model according to the analysis result includes:
determining mark information corresponding to each piece of environmental data in the analysis result;
generating a neural network model for multi-label classification according to the marking information and preset weight coefficients based on the neural network architecture model, wherein the neural network model is used for acquiring a multi-label feature map;
performing convolution operation and normalization operation on the multi-label feature map to obtain a weighted multi-label feature map;
and obtaining a classification result according to the multi-label feature map.
Preferably, the dividing the environmental data into different dimensional data according to the trained neural network architecture model for storage further includes:
constructing a basic index system of the environment data based on the neural network architecture model;
determining an environmental data analysis index according to the basic index system;
and combining the basic index system, the environmental data analysis index and the neural network architecture model to construct an operation strategy, wherein the operation strategy is used for presenting the data analyzed according to the environmental data of the big data.
In a second aspect, the present application further provides an environmental data analysis and integration system based on big data, including an acquisition module, a preprocessing module, an analysis module, a construction module, and a storage module, where:
an acquisition module: configured to acquire historical data of environmental data to be detected, wherein the historical data comprises track data left on the back end when a user accesses a front-end page, and the track data comprises energy consumption data, water source consumption data, waste data and carbon emission data;
a preprocessing module: configured to preprocess the historical data and perform lightweight processing on the preprocessed historical data to obtain new environment data subjected to lightweight processing;
an analysis module: configured to perform clustering analysis on the environment data to obtain an analysis result;
a construction module: configured to construct a neural network architecture model and train the neural network architecture model according to the analysis result;
a storage module: configured to divide the environment data into different dimensional data for storage according to the trained neural network architecture model.
In a third aspect, the present application further provides an environmental data analysis and integration apparatus based on big data, including:
a memory for storing a computer program;
a processor for implementing the steps of the big data based environmental data analysis and integration method when executing the computer program.
In a fourth aspect, the present application further provides a readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the above-mentioned big data based environmental data analysis and integration method.
The beneficial effects of the invention are as follows. Lightweight data is obtained by preprocessing the historical data, clustering analysis is performed on the data, and a neural network is constructed to divide and store the data. This greatly improves the security, scalability, flexibility and intelligence of environmental data applications, makes later retrieval of environmental data more convenient, and offers users more flexible and diverse mechanisms for data integration, sharing and interaction. It meets the practical need, at the current stage of environmental protection informatization, to classify, integrate and share massive data; it reduces manual workload and operation errors, improves integration efficiency, and enables refined, platform-based analysis and integration of environmental data under big data conditions. By virtue of the technical advantages of cloud computing, big data and the like, an ecological environment data resource center is formed, data cleaning, integration and value mining are achieved, the level of ecological environment management is raised, comprehensive decision-making capacity for the ecological environment is strengthened, and the level of data sharing, information disclosure and government services is improved.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the embodiments of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
FIG. 1 is a schematic flow chart illustrating a method for analyzing and integrating environmental data based on big data according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an environmental data analysis and integration system based on big data according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an environmental data analysis and integration apparatus based on big data according to an embodiment of the present invention.
In the figures: 701, acquisition module; 702, preprocessing module; 7021, extraction unit; 7022, first classification unit; 7023, analysis unit; 7024, traversal unit; 703, analysis module; 7031, generation unit; 7032, calculation unit; 7033, integration unit; 704, construction module; 7041, determination unit; 7042, second classification unit; 7043, operation unit; 7044, acquisition unit; 705, storage module; 7051, first construction unit; 7052, determination unit; 7053, second construction unit; 800, environmental data analysis and integration apparatus based on big data; 801, processor; 802, memory; 803, multimedia component; 804, I/O interface; 805, communication component.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined or explained in subsequent figures. Meanwhile, in the description of the present invention, the terms "first", "second", and the like are used only for distinguishing the description, and are not construed as indicating or implying relative importance.
Example 1:
the embodiment provides an environmental data analysis and integration method based on big data. It should be noted that the technology suitable for big data includes a Massively Parallel Processing (MPP) database, data mining, a distributed file system, a distributed database, a cloud computing platform, the internet, and an extensible storage system.
Referring to fig. 1, it is shown that the method includes step S100, step S200, step S300, step S400 and step S500.
S100, acquiring historical data of the environmental data to be detected, wherein the historical data comprises track data left at the back end when a user accesses a front end page, and the track data comprises energy consumption data, water source consumption data, waste data and carbon emission data.
The method covers, but is not limited to, four business areas: water, air, pollution sources and emergency response. A water environment risk assessment model, a water-related sensitive point analysis model, a water pollution diffusion analysis model, an atmospheric pollution diffusion analysis model, a pollution source supervision business model and the like are built; analysis and assessment are carried out using the rich data gathered by the big data resource center; and the analysis results are published externally for business systems to call.
It is understood that in this step, the front-end page includes a website, an app, a mini program, and the like. The track data includes all operations performed by the user on the front-end page, such as registering on and logging in to a website, browsing a page, clicking a page button, and entering content in an input box. For example, a user's actions on a front-end page, such as browsing, clicking and entering content, may be clustered into an action dimension, and the user's accesses to the front end may be clustered into a network channel access dimension.
Following the big data approach, monitoring and supervision data related to environmental business is integrated to build a database. The advantages of big data in exchanging and analyzing massive data are fully used to strengthen the sharing and exchange of ecological environment data across departments and industries, and to support environmental business models and algorithms in mining that data in depth, turning data resources into information resources. This improves decision support for ecological environment protection and management, provides effective support for pollution prevention and control, environmental supervision, emergency response and similar work, and maximizes the value of the data.
The historical data is held in an ecological environment big data resource center, which adopts a cloud architecture design with four layers from bottom to top: an infrastructure layer, a data resource layer, a platform service layer and an application layer. The infrastructure layer comprises computing resources, storage resources, network resources and a Hadoop big data cluster; it provides virtualized resources, distributed storage, parallel computing power and the like according to business requirements, and supports the operation of each platform or system. The data resource layer mainly comprises thematic libraries for environmental elements such as atmosphere, water, sound and pollution sources; its data comes from the various business systems, working reports of environmental departments, data crawled from the Internet and the like, and is stored, according to its characteristics, using a combination of the Hadoop big data platform and relational databases. The platform service layer mainly comprises a big data management platform, a big data analysis platform and a geographic information platform, which together handle the receiving, storage, management and use of ecological environment data and support data analysis through algorithm models built on business data. The application layer of the data resource center mainly comprises a unified resource portal and a data monitoring center, and provides query, statistics and monitoring of data resource usage.
S200, preprocessing the historical data, and performing light weight processing on the preprocessed historical data to obtain new environment data after the light weight processing.
It is understood that in this step, it includes:
respectively extracting word frequency based on preset historical data and attributes of the historical data to obtain a first historical phrase, wherein the first historical phrase is a phrase with the word frequency higher than the preset historical data in the historical data;
classifying based on the attribute of the first historical phrase to obtain a second historical phrase;
analyzing the historical data to obtain analyzed data analysis information;
It can be understood that, in this step, an extraction tool (such as the TF-IDF keyword algorithm or the TextRank keyword algorithm) extracts, from the word frequency and the attributes of the historical data, keyword groups that concisely summarize the historical data. The summary keyword groups are classified by part of speech, those with the same part of speech forming a group of historical phrases, and the timestamps corresponding to all the historical phrases are arranged in time series to obtain the second historical phrase.
Traversing the data analysis information and judging, based on the second historical phrase, whether each piece of data analysis information is related to the track data, wherein if a piece of data analysis information is related to the track data, it is retained; and if it is not related to the track data, it is rejected and the next piece of data analysis information is traversed, until all of the data analysis information has been traversed, so as to obtain the new environment data subjected to lightweight processing.
It should be noted that after the whole process is finished, a basic data list of the data analysis information is generated and the track data are correlated with one another, so that repeated data can be removed quickly. This greatly reduces redundancy in the environment data and compresses the model data, making both the historical data and the neural network model lightweight.
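The extraction and filtering steps of S200 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes the historical data has already been tokenized, uses a plain TF-IDF score as the extraction tool, and treats "related to the track data" as simple keyword overlap. The function names `tfidf_keywords` and `filter_related` are hypothetical.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms of each tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    result = []
    for doc in docs:
        tf = Counter(doc)
        scores = {
            term: (count / len(doc)) * (math.log(n_docs / df[term]) + 1.0)
            for term, count in tf.items()
        }
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        result.append(ranked[:top_k])
    return result


def filter_related(records, keywords):
    """Keep records sharing at least one token with the keyword set (assumed relevance test)."""
    keywords = set(keywords)
    return [r for r in records if keywords & set(r.lower().split())]


docs = [["energy", "energy", "water"], ["waste", "carbon", "water"]]
top_terms = tfidf_keywords(docs, top_k=1)

records = [
    "energy consumption rose in march",
    "user changed avatar color",
    "carbon emission report filed",
]
kept = filter_related(records, {"energy", "water", "waste", "carbon"})
```

Records that match none of the keywords are dropped, which is the sense in which the retained data is "lightweight" in this sketch.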
It should be noted that the method and the system can make full use of the respective advantages of the Hadoop big data platform and relational databases to design a big data platform storage scheme that integrates multiple storage technologies. Taking IoT data access and query requirements into account, and with the aim of guaranteeing data query and analysis performance: structured data of moderate volume generated by OLTP (online transaction processing) systems is stored in an Oracle relational database; structured data of very large volume, or requiring complex model analysis, is additionally extracted on a schedule into the Hive big data component for interactive or offline analysis; semi-structured data is stored in the HBase big data component; unstructured data is stored in the HDFS big data component; and time series data is stored in the Hive big data component.
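The routing rules of this storage scheme can be summarized in a short Python sketch. It only encodes the decision described above and does not connect to any database; the `Store` enum, the `choose_store` function and the string labels for data kinds are hypothetical names introduced for illustration.

```python
from enum import Enum


class Store(Enum):
    ORACLE = "oracle"
    HIVE = "hive"
    HBASE = "hbase"
    HDFS = "hdfs"


def choose_store(kind, very_large=False, complex_analysis=False):
    """Pick a storage backend for a dataset according to the rules in the text."""
    if kind == "structured":
        # Very large or analysis-heavy structured data goes to Hive,
        # moderate-volume OLTP data stays in the relational database.
        if very_large or complex_analysis:
            return Store.HIVE
        return Store.ORACLE
    if kind == "semi_structured":
        return Store.HBASE
    if kind == "unstructured":
        return Store.HDFS
    if kind == "time_series":
        return Store.HIVE
    raise ValueError(f"unknown data kind: {kind!r}")
```

For example, `choose_store("structured", very_large=True)` selects Hive, while `choose_store("unstructured")` selects HDFS.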
S300, carrying out clustering analysis on the environment data to obtain an analysis result.
It is understood that in this step, among others:
performing clustering analysis on the environment data by adopting a preset clustering model, and sampling and aggregating local characteristics of nodes to generate node representation;
it should be noted that, the MR image information is input into the clustering model to obtain an optimal solution, and differences of different categories and similarities of the same category are analyzed.
Combining an attention mechanism, calculating weights for the nodes by an inner product operation, and adaptively and automatically learning features;
Specifically, an attention mechanism is combined and an end-to-end neural network model is constructed based on an encoder; all possibilities among the features are traversed and their correlations and weights are analyzed, a feature with higher variance being more important and one with lower variance less important. It should be noted that, after clustering, the data is optimized by assigning values through adaptive automatic feature learning: the closer a data point is to a cluster center, the greater its importance, which improves cluster cohesion.
And according to the constraint and integration of the clustering model to the graph embedding, training and verifying the clustering model, and adjusting and optimizing network parameters to obtain an analysis result.
Specifically, the clustering model is trained and verified, parameters are adjusted at any time, and the accuracy of big data environment data analysis can be improved.
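The node sampling, aggregation and inner-product attention described in S300 can be illustrated with a small pure-Python sketch. It is an assumption-laden toy rather than the clustering model of the embodiment: each node samples a fixed number of neighbors, scores them by the inner product with its own feature vector, normalizes the scores with a softmax, and takes the weighted sum as its new representation. The function name `attention_aggregate` and its parameters are hypothetical.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attention_aggregate(features, neighbors, num_samples=2, seed=0):
    """Return one aggregated representation per node.

    features:  list of feature vectors, indexed by node id
    neighbors: dict mapping node id -> list of neighbor node ids
    """
    rng = random.Random(seed)
    result = {}
    for node, nbrs in neighbors.items():
        # Sample a fixed-size neighborhood (local features only).
        if len(nbrs) > num_samples:
            sampled = rng.sample(nbrs, num_samples)
        else:
            sampled = list(nbrs)
        ids = [node] + sampled
        # Attention scores from inner products, normalized by a stable softmax.
        scores = [dot(features[node], features[i]) for i in ids]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        dim = len(features[node])
        result[node] = [
            sum(w * features[i][d] for w, i in zip(weights, ids))
            for d in range(dim)
        ]
    return result


features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
embeddings = attention_aggregate(features, neighbors)
```

In a trained model the inner product would be taken between learned projections of the features; the raw features are used here only to keep the sketch short.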
S400, constructing a neural network architecture model, and training the neural network architecture model according to the analysis result.
It is understood that, in this step, the following are included:
determining mark information corresponding to each piece of environmental data in the analysis result;
generating a neural network model for multi-label classification according to the marking information and preset weight coefficients based on the neural network architecture model, wherein the neural network model is used for acquiring a multi-label feature map;
performing convolution operation and normalization operation on the multi-label feature map to obtain a weighted multi-label feature map;
specifically, according to the multi-label features, a multi-label classification result of the image to be classified is obtained, and the multi-label features are weighted by using the weight coefficient, so that the attention degree to the important features can be improved, and the robustness of multi-label classification is improved.
And obtaining a classification result according to the multi-label feature map.
It should be noted that, the label information is input into the neural network, that is, the image to be classified is input into the neural network model, the neural network model obtains the weighted multi-label features according to the multi-label features and the weight coefficients, and performs multi-label classification on the image to be classified according to the weighted multi-label features, thereby outputting the probability of each label. And obtaining multi-label classification results according to the probability of each label.
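The weighting and per-label probability output of S400 can be sketched as follows. This is a simplified stand-in, assuming a flat feature vector and one linear scoring unit per label instead of the convolution and normalization layers of the embodiment; `multilabel_predict`, `label_params` and the threshold of 0.5 are hypothetical choices.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def multilabel_predict(features, weights, label_params, threshold=0.5):
    """Weight the features, then output an independent probability per label.

    label_params maps each label to (coefficients, bias) of a linear unit.
    """
    # Preset weight coefficients emphasize the more important features.
    weighted = [f * w for f, w in zip(features, weights)]
    probs = {}
    for label, (coeffs, bias) in label_params.items():
        logit = sum(c * x for c, x in zip(coeffs, weighted)) + bias
        probs[label] = sigmoid(logit)
    # A sample may receive several labels, or none.
    chosen = sorted(label for label, p in probs.items() if p >= threshold)
    return probs, chosen


probs, chosen = multilabel_predict(
    features=[1.0, 2.0],
    weights=[0.5, 1.0],
    label_params={
        "water": ([1.0, 1.0], -1.0),
        "air": ([-1.0, -1.0], 0.0),
    },
)
```

Using a sigmoid per label, rather than a single softmax over all labels, is what lets one record carry several labels at once.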
S500, dividing the environment data into different dimension data according to the trained neural network architecture model and storing the dimension data.
It is understood that the method further comprises the following steps:
constructing a basic index system of the environment data based on the neural network architecture model;
determining an environmental data analysis index according to the basic index system;
and constructing an operation strategy by combining the basic index system, the environmental data analysis index and the neural network architecture model, wherein the operation strategy is used for presenting the data analyzed according to the environmental data of the big data.
It should be noted that a unified, standardized and sharable basic index system of environment data, divided by theme, can be constructed on the basis of the neural network architecture model. This avoids redundant and repeated construction of environment data, avoids data silos and inconsistency, and gives full play to the unique advantages of the enterprise big data center in data volume and diversity. The basic index systems are interlinked, and an operation strategy can be established from the analysis indexes and the model, so that the environment data can be integrated and analyzed more conveniently.
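Dividing the environment data into different dimensions for storage, as in S500, can be pictured with the sketch below. The `classify` argument is a hypothetical callable standing in for the trained neural network model, and the keyword-based `toy_classify` is an assumption used only to make the example runnable.

```python
from collections import defaultdict


def partition_by_dimension(records, classify):
    """Group records into per-dimension buckets using the labels from classify."""
    buckets = defaultdict(list)
    for record in records:
        labels = classify(record) or ["unclassified"]
        # A record with several labels is stored under each of its dimensions.
        for label in labels:
            buckets[label].append(record)
    return dict(buckets)


def toy_classify(record):
    """Stand-in for the trained model: label by keyword presence."""
    known = ("energy", "water", "waste", "carbon")
    return [k for k in known if k in record]


buckets = partition_by_dimension(
    ["energy and water usage", "carbon report", "login event"],
    toy_classify,
)
```

Each bucket could then be written to the backend chosen for that dimension.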
Example 2:
As shown in FIG. 2, the present embodiment provides an environmental data analysis and integration system based on big data, which includes an acquisition module 701, a preprocessing module 702, an analysis module 703, a construction module 704, and a storage module 705, where:
the acquisition module 701: configured to acquire historical data of environmental data to be detected, wherein the historical data comprises track data left on the back end when a user accesses a front-end page, and the track data comprises energy consumption data, water source consumption data, waste data and carbon emission data;
the preprocessing module 702: configured to preprocess the historical data and perform lightweight processing on the preprocessed historical data to obtain new environment data subjected to lightweight processing;
the analysis module 703: configured to perform clustering analysis on the environment data to obtain an analysis result;
the construction module 704: configured to construct a neural network architecture model and train the neural network architecture model according to the analysis result;
the storage module 705: configured to divide the environment data into different dimensional data for storage according to the trained neural network architecture model.
Specifically, the preprocessing module 702 includes an extracting unit 7021, a first classifying unit 7022, an analyzing unit 7023, and a traversing unit 7024, where:
extraction unit 7021: configured to respectively extract word frequency based on preset historical data and attributes of the historical data to obtain a first historical phrase, wherein the first historical phrase is a phrase with the word frequency higher than the preset historical data in the historical data;
first classification unit 7022: configured to classify based on the attribute of the first historical phrase to obtain a second historical phrase;
analysis unit 7023: configured to analyze the historical data to obtain analyzed data analysis information;
traversal unit 7024: configured to traverse the data analysis information and judge, based on the second historical phrase, whether each piece of data analysis information is related to the track data, wherein if a piece of data analysis information is related to the track data, it is retained; and if it is not related to the track data, it is rejected and the next piece of data analysis information is traversed, until all of the data analysis information has been traversed, so as to obtain the new environment data subjected to lightweight processing.
Specifically, the analysis module 703 includes a generating unit 7031, a computing unit 7032, and an integration unit 7033, where:
generating unit 7031: configured to perform cluster analysis on the environmental data using a preset clustering model, and to sample and aggregate local features of nodes to generate node representations;
computing unit 7032: configured to calculate, in combination with an attention mechanism, weights for the nodes by an inner product operation, so that features are learned adaptively and automatically;
integration unit 7033: configured to constrain and integrate the graph embedding according to the clustering model, to train and validate the clustering model, and to tune the network parameters to obtain the analysis result.
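The node-level operations of units 7031 and 7032 (neighbor sampling, aggregation of local features, and inner-product attention weights) can be sketched as follows. The clustering model itself is not specified in the source and is not shown. Softmax normalization of the inner-product scores, the inclusion of the node itself among its neighbors, and all function names are assumptions made for this sketch.

```python
import math
import random


def sample_neighbors(adj, node, k, rng):
    """Sample at most k neighbors of a node (all of them if there are fewer)."""
    nbrs = adj[node]
    return list(nbrs) if len(nbrs) <= k else rng.sample(nbrs, k)


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attention_aggregate(features, adj, node, k=2, rng=None):
    """Aggregate sampled neighbor features with inner-product attention."""
    if rng is None:
        rng = random.Random(0)
    nbrs = sample_neighbors(adj, node, k, rng) + [node]  # include the node itself
    scores = [dot(features[node], features[n]) for n in nbrs]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]  # softmax normalization (assumption)
    dim = len(features[node])
    rep = [sum(w * features[n][d] for w, n in zip(weights, nbrs)) for d in range(dim)]
    return rep, dict(zip(nbrs, weights))


features = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0]}
adj = {0: [1, 2], 1: [0], 2: [0]}
representation, weights = attention_aggregate(features, adj, node=0)
```

Node 1 has the same feature vector as node 0, so its inner-product score, and therefore its attention weight, is larger than that of node 2.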
Specifically, the building module 704 includes a determining unit 7041, a second classification unit 7042, an operation unit 7043, and an obtaining unit 7044, where:
determining unit 7041: configured to determine the marking information corresponding to each piece of environmental data in the analysis result;
second classification unit 7042: configured to generate, based on the neural network architecture model, a neural network model for multi-label classification according to the marking information and preset weight coefficients, wherein the neural network model is used for acquiring a multi-label feature map;
operation unit 7043: configured to perform a convolution operation and a normalization operation on the multi-label feature map to obtain a weighted multi-label feature map;
obtaining unit 7044: configured to obtain a classification result according to the multi-label feature map.
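The convolution and normalization steps applied to the multi-label feature map can be illustrated with a small pure-Python sketch. The source does not specify the kernel, the normalization function, or the decision rule. A one-dimensional sliding-window kernel, softmax normalization, and a sigmoid threshold of 0.5 are assumptions chosen here to keep the example short.

```python
import math


def conv1d(signal, kernel):
    """Valid-mode sliding-window convolution (cross-correlation, as in most deep-learning code)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def softmax(xs):
    """Normalize scores into weights that sum to 1."""
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_feature_map(feature_map, kernel):
    """Convolve each label's row, then reweight it by its normalized response."""
    weighted = {}
    for label, row in feature_map.items():
        conv = conv1d(row, kernel)
        weights = softmax(conv)
        weighted[label] = [w * c for w, c in zip(weights, conv)]
    return weighted


def classify(weighted, threshold=0.5):
    """Assign every label whose pooled, sigmoid-squashed score exceeds the threshold."""
    return {
        label
        for label, row in weighted.items()
        if 1.0 / (1.0 + math.exp(-sum(row))) > threshold
    }


# Hypothetical feature map: one row of activations per label.
feature_map = {"energy": [2.0, 2.0, 2.0], "waste": [-2.0, -2.0, -2.0]}
weighted = weighted_feature_map(feature_map, kernel=[0.5, 0.5])
labels = classify(weighted)
```

Because each label is thresholded independently, a record can receive several labels at once, which is what distinguishes multi-label classification from single-label classification.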
The storage module 705 further includes a first constructing unit 7051, a determining unit 7052, and a second constructing unit 7053, where:
first constructing unit 7051: configured to construct a basic index system for the environmental data based on the neural network architecture model;
determining unit 7052: configured to determine environmental data analysis indexes according to the basic index system;
second constructing unit 7053: configured to construct an operation strategy by combining the basic index system, the environmental data analysis indexes and the neural network architecture model, wherein the operation strategy is used for presenting the data obtained by the big data based environmental data analysis.
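As a rough illustration of dimension-wise storage followed by index construction, the sketch below routes records into per-dimension stores and derives simple indexes from them. The source does not define the index system. The `classify` callable stands in for the trained neural network architecture model, and the record layout (`kind`, `value`), the count/total basic indexes, and the per-dimension mean as an analysis index are all assumptions.

```python
from collections import defaultdict


def store_by_dimension(records, classify):
    """Divide records into per-dimension stores using a classifier callable."""
    store = defaultdict(list)
    for rec in records:
        store[classify(rec)].append(rec)
    return dict(store)


def basic_indices(store):
    """Basic index system (assumption): record count and total value per dimension."""
    return {
        dim: {"count": len(recs), "total": sum(r["value"] for r in recs)}
        for dim, recs in store.items()
    }


def analysis_indices(basic):
    """Analysis index derived from the basic indexes (assumption): mean value per dimension."""
    return {dim: idx["total"] / idx["count"] for dim, idx in basic.items()}


# Hypothetical records; the lambda stands in for the trained model.
records = [
    {"kind": "energy", "value": 120.0},
    {"kind": "energy", "value": 80.0},
    {"kind": "water", "value": 3.0},
]
store = store_by_dimension(records, classify=lambda rec: rec["kind"])
basic = basic_indices(store)
analysis = analysis_indices(basic)
```

Keeping each dimension in its own store means the indexes never mix quantities with different units, such as kilowatt-hours and cubic meters.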
It should be noted that, regarding the system in the above embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated herein.
Example 3:
Corresponding to the above method embodiment, this embodiment further provides a big data based environmental data analysis and integration device. The device described below and the big data based environmental data analysis and integration method described above may be read with reference to each other.
Fig. 3 is a block diagram illustrating a big-data based environmental data analysis integration apparatus 800 according to an example embodiment. As shown in Fig. 3, the big-data based environmental data analysis integration apparatus 800 may include a processor 801 and a memory 802. The big-data based environmental data analysis integration apparatus 800 may further include one or more of a multimedia component 803, an I/O interface 804, and a communication component 805.
The processor 801 is configured to control the overall operation of the big-data based environmental data analysis and integration apparatus 800, so as to complete all or part of the steps of the big-data based environmental data analysis and integration method. The memory 802 is used to store various types of data to support the operation of the apparatus 800; such data may include, for example, instructions for any application or method operating on the apparatus 800, as well as application-related data such as contact data, sent and received messages, pictures, audio, and video. The memory 802 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk. The multimedia component 803 may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is used for outputting and/or inputting audio signals. For example, the audio component may include a microphone for receiving external audio signals. The received audio signal may further be stored in the memory 802 or transmitted through the communication component 805. The audio component also includes at least one speaker for outputting audio signals. The I/O interface 804 provides an interface between the processor 801 and other interface modules, such as a keyboard, a mouse, or buttons. These buttons may be virtual buttons or physical buttons. The communication component 805 is used for wired or wireless communication between the apparatus 800 and other apparatuses. The wireless communication may be, for example, Wi-Fi, Bluetooth, Near Field Communication (NFC), 2G, 3G, or 4G, or a combination of one or more of them; accordingly, the communication component 805 may include a Wi-Fi module, a Bluetooth module, and an NFC module.
In an exemplary embodiment, the big-data based environmental data analysis integration apparatus 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components for performing the big-data based environmental data analysis integration method.
In another exemplary embodiment, a computer-readable storage medium including program instructions is also provided; the program instructions, when executed by a processor, implement the steps of the big data based environmental data analysis integration method described above. For example, the computer-readable storage medium may be the memory 802 described above, which includes program instructions executable by the processor 801 of the big data based environmental data analysis integration apparatus 800 to perform the big data based environmental data analysis integration method described above.
Example 4:
Corresponding to the above method embodiment, this embodiment further provides a readable storage medium. The readable storage medium described below and the big data based environmental data analysis and integration method described above may be read with reference to each other.
A readable storage medium stores a computer program which, when executed by a processor, implements the steps of the big data based environmental data analysis and integration method of the above method embodiment.
The readable storage medium may be a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other readable storage medium capable of storing program code.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
The above description covers only specific embodiments of the present invention, but the scope of the present invention is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A big data based environmental data analysis and integration method, characterized by comprising the following steps:
acquiring historical data of environmental data to be detected, wherein the historical data comprises track data left on the back end when a user accesses a front-end page, and the track data comprises energy consumption data, water resource consumption data, waste data and carbon emission data;
preprocessing the historical data, and performing lightweight processing on the preprocessed historical data to obtain new, lightweight-processed environmental data;
performing cluster analysis on the environmental data to obtain an analysis result;
constructing a neural network architecture model, and training the neural network architecture model according to the analysis result;
and dividing the environmental data into data of different dimensions for storage according to the trained neural network architecture model.
2. The big data based environmental data analysis and integration method according to claim 1, wherein preprocessing the historical data and performing lightweight processing on the preprocessed historical data to obtain new, lightweight-processed environmental data comprises:
extracting word frequencies based on preset historical data and on attributes of the historical data, respectively, to obtain a first historical phrase, wherein the first historical phrase is a phrase whose word frequency in the historical data is higher than in the preset historical data;
classifying based on the attributes of the first historical phrase to obtain a second historical phrase;
analyzing the historical data to obtain data analysis information;
traversing the data analysis information, and judging, based on the second historical phrase, whether each piece of data analysis information is related to the track data, wherein a piece of data analysis information that is related to the track data is retained, and a piece that is not related to the track data is rejected, after which the next piece is traversed, until all of the data analysis information has been traversed, so as to obtain new, lightweight-processed environmental data.
3. The big data based environmental data analysis and integration method according to claim 1, wherein performing cluster analysis on the environmental data to obtain an analysis result comprises:
performing cluster analysis on the environmental data using a preset clustering model, and sampling and aggregating local features of nodes to generate node representations;
calculating, in combination with an attention mechanism, weights for the nodes by an inner product operation, so that features are learned adaptively and automatically;
and constraining and integrating the graph embedding according to the clustering model, training and validating the clustering model, and tuning the network parameters to obtain the analysis result.
4. The big data based environmental data analysis and integration method according to claim 1, wherein constructing the neural network architecture model and training the neural network architecture model according to the analysis result comprises:
determining the marking information corresponding to each piece of environmental data in the analysis result;
generating, based on the neural network architecture model, a neural network model for multi-label classification according to the marking information and preset weight coefficients, wherein the neural network model is used for acquiring a multi-label feature map;
performing convolution operation and normalization operation on the multi-label feature map to obtain a weighted multi-label feature map;
and obtaining a classification result according to the multi-label feature map.
5. The big data based environmental data analysis and integration method according to claim 1, wherein, after the environmental data is divided into data of different dimensions for storage according to the trained neural network architecture model, the method further comprises:
constructing a basic index system for the environmental data based on the neural network architecture model;
determining environmental data analysis indexes according to the basic index system;
and constructing an operation strategy by combining the basic index system, the environmental data analysis indexes and the neural network architecture model, wherein the operation strategy is used for presenting the data obtained by the big data based environmental data analysis.
6. A big data based environmental data analysis and integration system, comprising:
an acquisition module: configured to acquire historical data of environmental data to be detected, wherein the historical data comprises track data left on the back end when a user accesses a front-end page, and the track data comprises energy consumption data, water resource consumption data, waste data and carbon emission data;
a preprocessing module: configured to preprocess the historical data and to perform lightweight processing on the preprocessed historical data to obtain new, lightweight-processed environmental data;
an analysis module: configured to perform cluster analysis on the environmental data to obtain an analysis result;
a building module: configured to construct a neural network architecture model and to train the neural network architecture model according to the analysis result;
a storage module: configured to divide the environmental data into data of different dimensions for storage according to the trained neural network architecture model.
7. The big data based environmental data analysis and integration system of claim 6, wherein the preprocessing module comprises:
an extraction unit: configured to extract word frequencies based on preset historical data and on attributes of the historical data, respectively, to obtain a first historical phrase, wherein the first historical phrase is a phrase whose word frequency in the historical data is higher than in the preset historical data;
a first classification unit: configured to classify based on the attributes of the first historical phrase to obtain a second historical phrase;
an analysis unit: configured to analyze the historical data to obtain data analysis information;
a traversal unit: configured to traverse the data analysis information and to judge, based on the second historical phrase, whether each piece of data analysis information is related to the track data, wherein a piece of data analysis information that is related to the track data is retained, and a piece that is not related to the track data is rejected, after which the next piece is traversed, until all of the data analysis information has been traversed, so as to obtain new, lightweight-processed environmental data.
8. The big data based environmental data analysis and integration system of claim 6, wherein the analysis module comprises:
a generating unit: configured to perform cluster analysis on the environmental data using a preset clustering model, and to sample and aggregate local features of nodes to generate node representations;
a computing unit: configured to calculate, in combination with an attention mechanism, weights for the nodes by an inner product operation, so that features are learned adaptively and automatically;
an integration unit: configured to constrain and integrate the graph embedding according to the clustering model, to train and validate the clustering model, and to tune the network parameters to obtain the analysis result.
9. The big data based environmental data analysis and integration system of claim 6, wherein the building module comprises:
a determining unit: configured to determine the marking information corresponding to each piece of environmental data in the analysis result;
a second classification unit: configured to generate, based on the neural network architecture model, a neural network model for multi-label classification according to the marking information and preset weight coefficients, wherein the neural network model is used for acquiring a multi-label feature map;
an operation unit: configured to perform a convolution operation and a normalization operation on the multi-label feature map to obtain a weighted multi-label feature map;
an obtaining unit: configured to obtain a classification result according to the multi-label feature map.
10. The big data based environmental data analysis and integration system of claim 6, wherein the storage module further comprises:
a first constructing unit: configured to construct a basic index system for the environmental data based on the neural network architecture model;
a determining unit: configured to determine environmental data analysis indexes according to the basic index system;
a second constructing unit: configured to construct an operation strategy by combining the basic index system, the environmental data analysis indexes and the neural network architecture model, wherein the operation strategy is used for presenting the data obtained by the big data based environmental data analysis.
CN202211534650.5A 2022-12-02 2022-12-02 Environmental data analysis and integration method and system based on big data Pending CN115827797A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211534650.5A CN115827797A (en) 2022-12-02 2022-12-02 Environmental data analysis and integration method and system based on big data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211534650.5A CN115827797A (en) 2022-12-02 2022-12-02 Environmental data analysis and integration method and system based on big data

Publications (1)

Publication Number Publication Date
CN115827797A true CN115827797A (en) 2023-03-21

Family

ID=85544852

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211534650.5A Pending CN115827797A (en) 2022-12-02 2022-12-02 Environmental data analysis and integration method and system based on big data

Country Status (1)

Country Link
CN (1) CN115827797A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116303678A (en) * 2023-03-23 2023-06-23 成都瀚宇游科技有限公司 Big data integrated analysis method and system based on neural network
CN116739317A (en) * 2023-08-15 2023-09-12 山东宇飞传动技术有限公司 Mining winch automatic management and dispatching platform, method, equipment and medium
CN116739317B (en) * 2023-08-15 2023-10-31 山东宇飞传动技术有限公司 Mining winch automatic management and dispatching platform, method, equipment and medium

Similar Documents

Publication Publication Date Title
EP3819792A2 (en) Method, apparatus, device, and storage medium for intention recommendation
CN111125460B (en) Information recommendation method and device
Wu et al. Big data analytics= machine learning+ cloud computing
US9535902B1 (en) Systems and methods for entity resolution using attributes from structured and unstructured data
CN115827797A (en) Environmental data analysis and integration method and system based on big data
Nasridinov et al. A decision tree-based classification model for crime prediction
Dave et al. Different clustering algorithms for Big Data analytics: A review
Sekhar et al. Optimized focused web crawler with natural language processing based relevance measure in bioinformatics web sources
Ranganathan et al. Action rules for sentiment analysis on twitter data using spark
Ranganathan et al. Actionable pattern discovery for sentiment analysis on twitter data in clustered environment
Minervini et al. Leveraging the schema in latent factor models for knowledge graph completion
Wang et al. Data mining applications in big data
Niu Optimization of teaching management system based on association rules algorithm
Dritsas et al. Aspect-based community detection of cultural heritage streaming data
Dass et al. Amelioration of Big Data analytics by employing Big Data tools and techniques
CN116467291A (en) Knowledge graph storage and search method and system
Zhao et al. Collecting, managing and analyzing social networking data effectively
Portugal et al. Towards a provenance-aware spatial-temporal architectural framework for massive data integration and analysis
Hirchoua et al. Topic hierarchies for knowledge capitalization using hierarchical Dirichlet processes in big data context
Saxena et al. An iterative MapReduce framework for sports-based tweet clustering
Narayanasamy et al. Crisis and disaster situations on social media streams: An ontology-based knowledge harvesting approach
Sukumar et al. Knowledge Graph Generation for Unstructured Data Using Data Processing Pipeline
CN112835852B (en) Character duplicate name disambiguation method, system and equipment for improving filing-by-filing efficiency
CN115269851B (en) Article classification method, apparatus, electronic device, storage medium and program product
Feng et al. Construction of Legal Reporting Information Platform Based on Natural Optimization Algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination