CN115934958A - Information data storage integration system based on big data - Google Patents

Information data storage integration system based on big data

Info

Publication number
CN115934958A
Authority
CN
China
Prior art keywords
data
wind power
module
power data
big
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211485733.XA
Other languages
Chinese (zh)
Inventor
魏艳华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN202211485733.XA
Publication of CN115934958A
Legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04: INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S: SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00: Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50: Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Abstract

The invention discloses a big data-based information data storage and integration system, which runs a method comprising: S101, acquiring related wind power data through big data; S102, processing the acquired data; S103, further integrating the processed data to construct a knowledge graph; and S104, storing the integrated wind power data in a database. The related wind power data mainly comprise sensor data transmitted back by the wind turbine, as well as operation and maintenance data, periodic inspection data, equipment purchase data, and maintenance data generated around the wind turbine. The step of processing the acquired data comprises processing the acquired unstructured data. The system offers secure storage and efficient integrated management.

Description

Information data storage integration system based on big data
Technical Field
The invention relates to the technical field of big data, in particular to an information data storage and integration system based on big data.
Background
As conventional energy sources are gradually depleted, clean energy sources represented by wind power generation are being vigorously developed around the world. As the related technologies mature and the wind power industry develops, the data generated in the industry have reached new heights in both variety of sources and volume. With the rise of big data technology, people have come to realize that wind power data hold great value, and that managing and using them efficiently would greatly benefit the wind power industry. In practice, however, wind power data are stored in isolation in each business department because of differences in format, type, and use. Data from different sources, with different structures and of different types are difficult to integrate, which prevents cross-business, cross-time, and cross-type panoramic data mining and analysis and creates major obstacles for wind power data storage and management. It is therefore necessary to design a big data-based information data storage and integration system that provides secure storage and efficient integrated management.
Disclosure of Invention
The present invention is directed to an information data storage integration system based on big data, so as to solve the problems in the background art.
In order to solve the technical problems, the invention provides the following technical scheme: the big data-based information data storage and integration system runs a method comprising the following steps:
acquiring related wind power data by big data;
carrying out data processing on the acquired data;
further integrating the processed data to construct a knowledge graph;
and storing the integrated wind power data in a database.
According to the technical scheme, the step of acquiring the related wind power data by the big data comprises the following steps:
the related wind power data mainly comprise sensor data transmitted back by the wind turbine, as well as operation and maintenance data, periodic inspection data, equipment purchase data, and maintenance data generated around the wind turbine.
According to the above technical solution, the step of processing the acquired data includes:
processing the acquired unstructured data;
segmenting the wind power data text into individual words by combining a statistics-based directed graph probability model with wind power terminology;
matching the words segmented in the previous step one by one against a domain term set;
performing dimension reduction operation on the extracted entities/attributes through a word embedding algorithm;
and further extracting the relation of the wind power data through a language representation model.
According to the above technical solution, the step of further integrating the processed data to construct a knowledge graph comprises:
integrating the entities, the relations and the attributes which are automatically extracted from the wind power data into an entity-relation-entity triple;
and after the triple is obtained, the triple is imported into a graph database, and the constructed wind power data knowledge graph is output.
According to the technical scheme, the step of storing the integrated wind power data in the database comprises the following steps:
the system adopts a role authorization mode, a super administrator or a system administrator with a role modification authority grants role authority to each role, and then the user account is associated to a required role, so that the operation authority of the role is obtained.
According to the above technical solution, the system comprises:
the wind power data acquisition module is used for acquiring wind power data;
the wind power data processing module is used for processing the acquired wind power data;
the wind power data integration module is used for integrating the processed wind power data;
and the safe storage utilization module is used for safely storing the integrated wind power data.
According to the technical scheme, the wind power data processing module comprises:
the unstructured data processing module is used for processing unstructured data;
the unstructured data processing module comprises:
the entity attribute extraction unit is used for extracting entities and attributes in the data;
the part-of-speech dimensionality reduction unit is used for performing dimensionality reduction on the extracted entity and attribute according to the part-of-speech;
and the relation extraction unit is used for extracting the relation in the data.
According to the technical scheme, the wind power data integration module comprises:
the data integration module is used for integrating data;
the knowledge graph construction module is used for constructing a knowledge graph according to the integrated data;
the error detection module is used for carrying out error detection on the output knowledge graph;
and the periodic updating module is used for periodically updating the knowledge graph according to the set period.
According to the above technical solution, the secure storage utilization module includes:
the safe storage module is used for safely storing the wind power data;
and the authority management module is used for carrying out authority management on the system.
Compared with the prior art, the invention has the following beneficial effects. The invention provides a wind power data acquisition module, a wind power data processing module, a wind power data integration module, and a safe storage utilization module, and acquires related wind power data through big data. The entities, relations, and attributes automatically extracted from the wind power data are integrated into "entity-relation-entity" triples, so that the multi-source heterogeneous data scattered across the system are effectively integrated into a complete data view and stored securely in a database. A large amount of wind power data can thus be used effectively: wind turbine staff can check the operating state of the wind turbine in real time, management efficiency improves, and the complete integrated knowledge graph gives staff effective data support for mining and using wind power data. In addition, the accuracy of the integrated knowledge graph is checked against preset rules among entities to determine whether node conflicts or missing relations exist. When a user needs to retrieve and use stored data, the user's permission must be confirmed by an administrator, and every operation the user performs is recorded in the system log, which keeps the data securely stored and guards against both accidental and malicious operations by staff.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a flowchart of a big data-based information data storage and integration method according to a first embodiment of the present invention;
FIG. 2 is a schematic diagram of the module configuration of a big data-based information data storage and integration system according to a second embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The first embodiment is as follows:
FIG. 1 is a flowchart of a big data-based information data storage and integration method according to the first embodiment of the present invention. The method may be executed by the big data-based information data storage and integration system provided by the embodiments of the present invention, which includes a plurality of software and hardware modules. As shown in FIG. 1, the method specifically includes the following steps:
S101, acquiring related wind power data by big data;
in some embodiments of the invention, the related wind power data mainly comprise sensor data transmitted back by the wind turbine that represent its operating state, as well as operation and maintenance data, periodic inspection data, equipment purchase data, maintenance data, and the like generated around the wind turbine.
For example, in the embodiment of the present invention, the sensor data transmitted back by the wind turbine come from a large number of sensors, such as temperature sensors, acceleration sensors, pressure sensors, and vibration sensors, which monitor the operating state of each component of the wind turbine; the readings are collected by the supervisory control and data acquisition (SCADA) system.
Illustratively, in the embodiment of the invention, wind power data are large in scale, fast-moving, varied in type, and high in value. Integrating and storing the data efficiently therefore lets wind turbine staff check the operating state of the turbine in real time and improves management efficiency. The complete integrated knowledge graph gives staff effective data support for mining and using wind power data, which helps with intelligent fault diagnosis, equipment purchasing, and monitoring of turbine health, saves time, speeds up finding the cause of a fault or failure, improves product quality, and extends product service life.
S102, data processing is carried out on the obtained data;
in some embodiments of the invention, the acquired unstructured data is processed.
Exemplarily, in the embodiment of the invention, unstructured data mainly refers to text-type wind power data, which exists in a natural language form, and elements such as entities, relations and attributes in a text need to be extracted before a knowledge graph is constructed.
In the embodiment of the invention, entity/attribute extraction means extracting the words that represent entities and attributes from unstructured wind power data. The text is first segmented: the wind power data text is cut into individual words by combining a statistics-based directed graph probability model with wind power terminology.
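The segmentation step above can be illustrated with a minimal sketch. It assumes a dictionary-driven directed acyclic graph of candidate words whose most probable path is chosen by dynamic programming; the patent does not name a specific model, so this is one possible reading. The dictionary `WORD_FREQ`, its counts, and the function names are hypothetical, and unspaced English text stands in for the original Chinese text.

```python
import math

# Hypothetical frequency dictionary: general words plus wind power terms.
WORD_FREQ = {
    "gear": 50, "box": 40, "gearbox": 30, "oil": 60,
    "temperature": 45, "high": 80, "oiltemperature": 5,
}
TOTAL = sum(WORD_FREQ.values())


def build_dag(text):
    """Map each start index to the end indices of dictionary words starting there."""
    dag = {}
    for i in range(len(text)):
        ends = [j for j in range(i + 1, len(text) + 1) if text[i:j] in WORD_FREQ]
        # Fall back to a single character when no dictionary word starts at i.
        dag[i] = ends or [i + 1]
    return dag


def segment(text):
    """Choose the maximum log-probability path through the DAG."""
    dag = build_dag(text)
    n = len(text)
    best = {n: (0.0, n)}  # index -> (best log-probability to the end, next cut)
    for i in range(n - 1, -1, -1):
        best[i] = max(
            (math.log(WORD_FREQ.get(text[i:j], 1) / TOTAL) + best[j][0], j)
            for j in dag[i]
        )
    words, i = [], 0
    while i < n:
        j = best[i][1]
        words.append(text[i:j])
        i = j
    return words


print(segment("gearboxoiltemperaturehigh"))
```

Adding a domain term such as "gearbox" to the dictionary is what lets the segmenter keep it whole instead of splitting it into "gear" and "box".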
In the embodiment of the invention, the words segmented in the previous step are matched one by one against the domain term set; once a word matches a target item, it is used as an entity node, an attribute name, or an attribute value of the knowledge graph.
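The term matching step above can be sketched as a dictionary lookup. The term set `DOMAIN_TERMS`, its role labels, and the sample words are hypothetical; the patent does not specify how the term set is stored.

```python
# Hypothetical domain term set: term -> role in the knowledge graph.
DOMAIN_TERMS = {
    "gearbox": "entity",
    "generator": "entity",
    "oil temperature": "attribute_name",
    "high": "attribute_value",
}


def match_terms(words):
    """Return (word, role) pairs for words found in the domain term set."""
    matched = []
    for word in words:
        role = DOMAIN_TERMS.get(word)
        if role is not None:  # words outside the term set are skipped
            matched.append((word, role))
    return matched


words = ["the", "gearbox", "oil temperature", "is", "high"]
print(match_terms(words))
```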
In the embodiment of the invention, a dimension reduction operation is performed on the extracted entities/attributes through a word embedding algorithm. Because wind power data text is recorded in natural language, each business department records data according to its own business requirements and the data follow no unified specification; in addition, operation and maintenance personnel, out of personal habit, may not strictly follow the specification when recording faults. As a result, the same entity or attribute often appears under different names. The dimension reduction operation is therefore used to identify different expressions that carry the same meaning: words with similar semantics lie close together in the vector space, so words expressing the same entity or attribute can be grouped into sets. A standardized word is then selected from each synonym set, based on the set of professional terms in the wind power field, as the entity or attribute of the knowledge graph, which facilitates subsequent data integration and knowledge graph construction.
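The synonym grouping described above can be sketched with cosine similarity over word vectors. The three-dimensional vectors below are hand-written toy values, not the output of a trained embedding model, and the threshold, the greedy grouping strategy, and all names are assumptions made for illustration only.

```python
import math

# Toy 3-dimensional embeddings; real vectors would come from a trained model.
EMBEDDINGS = {
    "gearbox": (0.90, 0.10, 0.00),
    "gear box": (0.88, 0.12, 0.02),
    "transmission case": (0.85, 0.15, 0.05),
    "blade": (0.00, 0.20, 0.95),
    "rotor blade": (0.05, 0.22, 0.93),
}
STANDARD_TERMS = {"gearbox", "blade"}  # hypothetical domain term set


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def group_synonyms(embeddings, threshold=0.95):
    """Greedily place each word into the first group whose seed word is similar enough."""
    groups = []
    for word, vec in embeddings.items():
        for group in groups:
            if cosine(vec, embeddings[group[0]]) >= threshold:
                group.append(word)
                break
        else:
            groups.append([word])
    return groups


def normalize(groups):
    """Map every word to the standardized term of its group."""
    mapping = {}
    for group in groups:
        standard = next((w for w in group if w in STANDARD_TERMS), group[0])
        for word in group:
            mapping[word] = standard
    return mapping


groups = group_synonyms(EMBEDDINGS)
print(normalize(groups))
```

With these toy vectors, the different spellings used by different departments collapse onto one standardized term per group.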
In the embodiment of the invention, relation extraction is further performed on the wind power data through a language representation model, which determines from the wind power data text whether a relation exists between entities and, if so, its type. Specifically, a masked language model is embedded in the language representation model, and the context on both sides of a word is considered at the same time, thereby achieving a deep bidirectional language representation that fuses left and right context information.
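To make the masked language model idea concrete, the sketch below builds one training example by hiding a chosen token, so that a model would have to predict it from the words on both sides. It shows only the input construction, not the model or the relation classifier; the `[MASK]` placeholder, the function name, and the sample sentence are assumptions.

```python
MASK = "[MASK]"  # placeholder token; the actual symbol depends on the model used


def make_masked_example(tokens, mask_positions):
    """Return (masked tokens, labels), where labels hold the hidden words."""
    masked, labels = [], []
    for i, token in enumerate(tokens):
        if i in mask_positions:
            masked.append(MASK)
            labels.append(token)   # the model is trained to recover this word
        else:
            masked.append(token)
            labels.append(None)    # no prediction is required at this position
    return masked, labels


tokens = ["gearbox", "bearing", "overheating", "caused", "generator", "shutdown"]
masked, labels = make_masked_example(tokens, {3})
print(masked)
```

Recovering the hidden word here depends on both the left context ("overheating") and the right context ("generator shutdown"), which is the bidirectional property the paragraph above refers to.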
S103, further integrating the processed data to construct a knowledge graph;
in some embodiments of the invention, the entities, relationships and attributes automatically extracted from the wind power data are integrated into a triple of "entity-relationship-entity", and the multi-source heterogeneous data dispersed in the system is effectively integrated to form a complete data view, so that a large amount of wind power data can be effectively utilized.
In some embodiments of the invention, after the triple is obtained, the triple is imported into a graph database, and the constructed wind power data knowledge graph is output.
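Importing triples into a graph store can be sketched as follows. The `KnowledgeGraph` class is a small in-memory stand-in for a graph database, since the patent does not name a product; the class, its methods, and the sample triples are hypothetical.

```python
from collections import defaultdict


class KnowledgeGraph:
    """A tiny in-memory stand-in for a graph database."""

    def __init__(self):
        self.edges = defaultdict(list)  # head -> [(relation, tail), ...]
        self.nodes = set()

    def add_triple(self, head, relation, tail):
        self.nodes.update((head, tail))
        if (relation, tail) not in self.edges[head]:  # skip duplicate triples
            self.edges[head].append((relation, tail))

    def neighbors(self, head, relation=None):
        """Return tails linked from head, optionally filtered by relation."""
        return [t for r, t in self.edges.get(head, []) if relation in (None, r)]


triples = [
    ("turbine_01", "has_component", "gearbox"),
    ("gearbox", "has_fault", "oil overheating"),
    ("oil overheating", "repaired_by", "replace oil cooler"),
    ("turbine_01", "has_component", "gearbox"),  # duplicate from a second source
]
kg = KnowledgeGraph()
for head, relation, tail in triples:
    kg.add_triple(head, relation, tail)

print(kg.neighbors("gearbox", "has_fault"))
```

Skipping duplicate triples on import is one simple way for records from several departments to merge into a single view of the same entity.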
Illustratively, in the embodiment of the invention, when the knowledge graph is output, different colors are marked on the graph according to the classification of the nodes, so that an observer can distinguish various classification nodes conveniently, classification screening can be performed according to the colors in data query, the working efficiency of workers is improved, and meanwhile, the scattered wind power data is fully mined and utilized.
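The color marking and color-based screening above can be sketched as a lookup from node category to color. The palette, the categories, and the node names are hypothetical.

```python
# Hypothetical category-to-color palette.
CATEGORY_COLORS = {"equipment": "blue", "fault": "red", "maintenance": "green"}

NODES = {
    "gearbox": "equipment",
    "generator": "equipment",
    "oil overheating": "fault",
    "replace oil cooler": "maintenance",
}


def color_nodes(nodes):
    """Assign each node the color of its category (gray if unknown)."""
    return {name: CATEGORY_COLORS.get(cat, "gray") for name, cat in nodes.items()}


def filter_by_color(colored, color):
    """Return node names with the given color, in sorted order."""
    return sorted(name for name, c in colored.items() if c == color)


colored = color_nodes(NODES)
print(filter_by_color(colored, "blue"))
```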
In some embodiments of the invention, a user can set the update period independently. The latest wind power data within the period are obtained and the graph is updated: corresponding nodes are added as the data change, so the graph keeps evolving as the data grow. Meanwhile, the accuracy of the integrated knowledge graph is checked against preset rules among entities to determine whether node conflicts or missing relations exist. If an error is detected, the corresponding node relation is updated immediately and the missing relation is completed, further improving the graph.
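The rule-based accuracy check above can be sketched as two passes over the triples: one looks for relations that connect the wrong kinds of nodes (a conflict), and one looks for nodes that lack a required relation (a missing relation). The patent does not give the preset rules, so `RELATION_RULES`, `REQUIRED_RELATIONS`, and the sample data are hypothetical.

```python
# Hypothetical schema: relation -> (required head type, required tail type).
RELATION_RULES = {
    "has_component": ("turbine", "component"),
    "has_fault": ("component", "fault"),
}
# Hypothetical completeness rule: every node of this type needs this outgoing relation.
REQUIRED_RELATIONS = {"fault": "repaired_by"}


def check_graph(node_types, triples):
    """Return (conflicts, missing) found by the preset rules."""
    conflicts, missing = [], []
    for head, relation, tail in triples:
        rule = RELATION_RULES.get(relation)
        if rule and (node_types.get(head), node_types.get(tail)) != rule:
            conflicts.append((head, relation, tail))
    for node, node_type in node_types.items():
        required = REQUIRED_RELATIONS.get(node_type)
        if required and not any(h == node and r == required for h, r, _ in triples):
            missing.append((node, required))
    return conflicts, missing


node_types = {"turbine_01": "turbine", "gearbox": "component", "overheating": "fault"}
triples = [
    ("turbine_01", "has_component", "gearbox"),
    ("gearbox", "has_fault", "overheating"),
    ("overheating", "has_component", "gearbox"),  # violates the type rule
]
conflicts, missing = check_graph(node_types, triples)
print(conflicts, missing)
```

A scheduler could run such a check after each periodic update and pass the reported items to whatever repair step the system uses.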
S104, storing the integrated wind power data in a database;
in some embodiments of the present invention, the system adopts a role authorization manner, wherein each role is granted with role authority by a super administrator or a system administrator with role modification authority, and then the user account is associated with a required role, so as to obtain the operation authority of the role.
Illustratively, in the embodiment of the present invention, when a user needs to retrieve and use stored data, the user's permission must be confirmed by an administrator, and every operation the user performs is recorded in the system log, which keeps the data securely stored and guards against both accidental and malicious operations by staff.
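The role authorization and operation logging described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `AccessControl` class, the permission names such as `modify_roles` and `read_data`, and the log entry layout are all assumptions.

```python
from datetime import datetime, timezone


class AccessControl:
    def __init__(self):
        self.role_permissions = {}  # role -> set of permitted operations
        self.user_roles = {}        # user -> role
        self.audit_log = []         # one entry per attempted operation

    def grant(self, admin, role, permissions):
        """Only a user whose role includes 'modify_roles' may change a role."""
        if not self.is_allowed(admin, "modify_roles"):
            raise PermissionError(f"{admin} may not modify roles")
        self.role_permissions.setdefault(role, set()).update(permissions)

    def assign(self, user, role):
        """Associate a user account with a role."""
        self.user_roles[user] = role

    def is_allowed(self, user, operation):
        role = self.user_roles.get(user)
        return operation in self.role_permissions.get(role, set())

    def perform(self, user, operation):
        """Check permission and record the attempt, whether or not it succeeds."""
        allowed = self.is_allowed(user, operation)
        self.audit_log.append(
            (datetime.now(timezone.utc).isoformat(), user, operation, allowed)
        )
        return allowed


acl = AccessControl()
acl.role_permissions["super_admin"] = {"modify_roles"}  # bootstrap role
acl.assign("root", "super_admin")
acl.grant("root", "analyst", {"read_data"})
acl.assign("alice", "analyst")
```

Logging denied attempts as well as successful ones is a design choice made here so that attempted misuse also leaves a trace.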
The second embodiment is as follows:
the second embodiment of the present invention provides a big data-based information data storage and integration system. FIG. 2 is a schematic diagram of the module configuration of the system provided by the second embodiment. As shown in FIG. 2, the system includes:
the wind power data acquisition module is used for acquiring wind power data;
the wind power data processing module is used for processing the acquired wind power data;
the wind power data integration module is used for integrating the processed wind power data;
and the safe storage utilization module is used for safely storing the integrated wind power data.
In some embodiments of the invention, the wind power data processing module comprises:
the unstructured data processing module is used for processing unstructured data;
the unstructured data processing module comprises:
the entity attribute extraction unit is used for extracting entities and attributes in the data;
the part-of-speech dimensionality reduction unit is used for performing dimensionality reduction on the extracted entity and attribute according to the part-of-speech;
and the relation extraction unit is used for extracting the relation in the data.
In some embodiments of the invention, the wind power data integration module comprises:
the data integration module is used for integrating data;
the knowledge graph construction module is used for constructing a knowledge graph according to the integrated data;
the error detection module is used for carrying out error detection on the output knowledge graph;
and the periodic updating module is used for periodically updating the knowledge graph according to the set period.
In some embodiments of the invention, the secure storage utilization module comprises:
the safe storage module is used for safely storing the wind power data;
and the authority management module is used for carrying out authority management on the system.
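The module layout above can be sketched as a pipeline in which the four top-level modules run in order. Each method body is a deliberately simple placeholder (whitespace clean-up, de-duplication, in-memory storage) rather than the processing the patent describes, and the class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Pipeline:
    """Chains the four top-level modules in the order they are listed."""
    stored: list = field(default_factory=list)

    def acquire(self, sources):
        # Acquisition module: gather raw records from every source.
        return [record for source in sources for record in source]

    def process(self, records):
        # Processing module: placeholder clean-up (strip and drop empty text).
        return [r.strip() for r in records if r.strip()]

    def integrate(self, records):
        # Integration module: placeholder de-duplication preserving order.
        return list(dict.fromkeys(records))

    def store(self, records):
        # Safe storage utilization module: placeholder that keeps records in memory.
        self.stored.extend(records)
        return len(self.stored)

    def run(self, sources):
        return self.store(self.integrate(self.process(self.acquire(sources))))


pipeline = Pipeline()
count = pipeline.run([["gearbox oil hot ", ""], ["gearbox oil hot", "blade crack"]])
print(count, pipeline.stored)
```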
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes may be made in the embodiments and/or equivalents thereof without departing from the spirit and scope of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (9)

1. The information data storage integration system based on big data is characterized in that the method operated by the system comprises the following steps:
acquiring related wind power data by big data;
carrying out data processing on the acquired data;
further integrating the processed data to construct a knowledge graph;
and storing the integrated wind power data in a database.
2. The big data based information data storage consolidation system of claim 1, wherein: the step of acquiring the related wind power data by the big data comprises the following steps:
the related wind power data mainly comprise sensor data transmitted back by the wind turbine, as well as operation and maintenance data, periodic inspection data, equipment purchase data, and maintenance data generated around the wind turbine.
3. The big data based information data storage consolidation system of claim 1, wherein: the step of processing the acquired data comprises:
processing the acquired unstructured data;
segmenting the wind power data text into individual words by combining a statistics-based directed graph probability model with wind power terminology;
matching the words segmented in the previous step one by one against a domain term set;
performing dimension reduction operation on the extracted entities/attributes through a word embedding algorithm;
and further extracting the relation of the wind power data through a language representation model.
4. The big data based information data storage integration system of claim 1, wherein: the step of further integrating the processed data to construct a knowledge graph comprises:
integrating the entities, relations and attributes automatically extracted from the wind power data into a triple of 'entity-relation-entity';
and after the triple is obtained, the triple is imported into a graph database, and the constructed wind power data knowledge graph is output.
5. The big data based information data storage consolidation system of claim 1, wherein: the step of storing the integrated wind power data in a database comprises the following steps:
the system adopts a role authorization mode, a super administrator or a system administrator with a role modification authority grants role authority to each role, and then the user account is associated to a required role, so that the operation authority of the role is obtained.
6. The big data based information data storage integration system of claim 1, wherein: the system comprises:
the wind power data acquisition module is used for acquiring wind power data;
the wind power data processing module is used for processing the acquired wind power data;
the wind power data integration module is used for integrating the processed wind power data;
and the safe storage utilization module is used for safely storing the integrated wind power data.
7. The big data based information data storage consolidation system of claim 6, wherein: the wind power data processing module comprises:
the unstructured data processing module is used for processing unstructured data;
the unstructured data processing module comprises:
the entity attribute extraction unit is used for extracting entities and attributes in the data;
the part-of-speech dimensionality reduction unit is used for performing dimensionality reduction on the extracted entity and attribute according to the part-of-speech;
and the relation extraction unit is used for extracting the relation in the data.
8. The big data based information data storage consolidation system of claim 7, wherein: the wind power data integration module comprises:
the data integration module is used for integrating data;
the knowledge graph construction module is used for constructing a knowledge graph according to the integrated data;
the error detection module is used for carrying out error detection on the output knowledge graph;
and the periodic updating module is used for periodically updating the knowledge graph according to the set period.
9. The big data based information data storage consolidation system of claim 8, wherein: the secure storage utilization module includes:
the safe storage module is used for safely storing the wind power data;
and the authority management module is used for carrying out authority management on the system.
CN202211485733.XA 2022-11-24 2022-11-24 Information data storage integration system based on big data Pending CN115934958A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211485733.XA CN115934958A (en) 2022-11-24 2022-11-24 Information data storage integration system based on big data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211485733.XA CN115934958A (en) 2022-11-24 2022-11-24 Information data storage integration system based on big data

Publications (1)

Publication Number Publication Date
CN115934958A true CN115934958A (en) 2023-04-07

Family

ID=86653339

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211485733.XA Pending CN115934958A (en) 2022-11-24 2022-11-24 Information data storage integration system based on big data

Country Status (1)

Country Link
CN (1) CN115934958A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination