CN113407723A - Multi-source heterogeneous power load data fusion method, device, equipment and storage medium - Google Patents

Publication number
CN113407723A
Authority
CN
China
Prior art keywords
data
fusion
heterogeneous
source
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110809255.2A
Other languages
Chinese (zh)
Inventor
夏刚
胡勇胜
陈金鑫
邓盛名
邓鹏程
李贤名
周乐
王翔
余斌
李华喜
罗红祥
丁旭
康志远
马腾飞
谭曜堃
刘茗溪
黄孔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Wuling Power Technology Co Ltd
Wuling Power Corp Ltd
Original Assignee
Hunan Wuling Power Technology Co Ltd
Wuling Power Corp Ltd
Application filed by Hunan Wuling Power Technology Co Ltd, Wuling Power Corp Ltd filed Critical Hunan Wuling Power Technology Co Ltd
Priority to CN202110809255.2A priority Critical patent/CN113407723A/en
Publication of CN113407723A publication Critical patent/CN113407723A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393Score-carding, benchmarking or key performance indicator [KPI] analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply

Abstract

The application relates to a multi-source heterogeneous power load data fusion method, device, equipment and storage medium. The method comprises the following steps: acquiring multi-source heterogeneous texts of power load data and normalizing their format; extracting key characters from the heterogeneous texts to construct a knowledge dictionary, and obtaining an object database of the heterogeneous texts through multi-source object name classification; using the MapReduce parallel programming model to match the knowledge dictionary with the object names and then match the object names with the object data; and performing longitudinal parameter fusion and transverse parameter fusion on the multi-source matching results after Reduce processing. Through multi-level processing of multi-source heterogeneous power grid load big data, the application provides a complete power big data strategy covering both fusion and evaluation, and the data fusion is practical and efficient.

Description

Multi-source heterogeneous power load data fusion method, device, equipment and storage medium
Technical Field
The application relates to the field of big data, in particular to a multi-source heterogeneous power load data fusion method, device, equipment and storage medium.
Background
Power load data are large in scale, varied in type and fast-changing, and are typical big data. At present, each department usually establishes and maintains its own model parameter library according to its professional requirements; because a cooperative management mechanism is lacking, consistency among these model parameter libraries is difficult to ensure.
Patent document CN107402976A discloses a power grid multi-source data fusion method and system based on a multi-element heterogeneous model. It establishes a unified model of each source system's data and calculates the matching degree between models through model traversal and comparison, thereby automatically integrating and fusing more than 90% of the data; however, it does not provide a method specific to power load big data, so its pertinence is weak. Patent document CN103617557A proposes a multi-source heterogeneous power grid operation parameter analysis system, but it introduces no big data processing technology for massive parameter processing, so its data processing efficiency needs to be improved. Further technical innovation is therefore needed to improve both the pertinence of load data fusion and the processing technology.
Disclosure of Invention
In view of the foregoing, it is necessary to provide a multi-source heterogeneous power load data fusion method, apparatus, device, and storage medium for solving the above technical problems.
In a first aspect, an embodiment of the present invention provides a multi-source heterogeneous power load data fusion method, including the following steps:
acquiring a multi-source heterogeneous text of power load data, and performing format normalization processing on the heterogeneous text;
extracting key characters from the heterogeneous text to construct a knowledge dictionary, and obtaining an object database of the heterogeneous text through multi-source object name classification;
matching the knowledge dictionary with the object name by adopting a MapReduce programming model parallel processing technology, and then matching the object name with object data;
and performing longitudinal parameter fusion and transverse parameter fusion on the multi-source matching result after Reduce processing.
Further, after the heterogeneous texts are fused, the method further includes evaluating the fused data, and the evaluating of the fused data includes:
completing online correction of the parameters by checking the data against the knowledge dictionary and the object database in real time;
comparing the data after fusion with the data before fusion, and eliminating error values and abnormal values generated in the fusion process;
ensuring the uniqueness of the data of the same object, and eliminating the redundancy generated by repeated data in the fusion process;
and evaluating the time effectiveness and the space effectiveness to ensure that the fused data are real-time and comprehensive.
Further, the extracting key characters from the heterogeneous text to construct a knowledge dictionary, and obtaining an object database of the heterogeneous text by multi-source object name classification includes:
acquiring, from a Web Service interface of a power system, multi-source heterogeneous texts in formats including Excel files, DAT files and CIM files;
obtaining the knowledge dictionary by extracting key characters and screening names of the normalized heterogeneous texts to remove duplicates;
and classifying the normalized heterogeneous text by multiple object names and sorting corresponding numerical values to obtain the object database.
Further, the performing longitudinal parameter fusion and transverse parameter fusion on the multi-source matching result after Reduce processing includes:
eliminating parameter differences of the same professional department at different levels, and completing longitudinal parameter fusion by introducing a difference function;
and completing transverse parameter fusion through parameter fusion between different professional departments at the same scheduling level.
In another aspect, an embodiment of the present invention further provides a multi-source heterogeneous power load data fusion system, including:
the load data preprocessing module is used for acquiring a multi-source heterogeneous text of the power load data and carrying out format normalization processing on the heterogeneous text;
the data classification module is used for extracting key characters from the heterogeneous text to construct a knowledge dictionary and obtaining an object database of the heterogeneous text through multi-source object name classification;
the data matching module is used for matching the knowledge dictionary with the object names by adopting a MapReduce programming model parallel processing technology and then matching the object names with the object data;
and the parameter fusion module is used for performing longitudinal parameter fusion and transverse parameter fusion on the multi-source matching result after Reduce processing.
Further, the system further comprises a fusion evaluation module, wherein the fusion evaluation module is used for:
completing online correction of the parameters by checking the data against the knowledge dictionary and the object database in real time;
comparing the data after fusion with the data before fusion, and eliminating error values and abnormal values generated in the fusion process;
ensuring the uniqueness of the data of the same object, and eliminating the redundancy generated by repeated data in the fusion process;
and evaluating the time effectiveness and the space effectiveness to ensure that the fused data are real-time and comprehensive.
Further, the data classification module includes a text normalization unit, and the text normalization unit is configured to:
acquiring, from a Web Service interface of a power system, multi-source heterogeneous texts in formats including Excel files, DAT files and CIM files;
obtaining the knowledge dictionary by extracting key characters and screening names of the normalized heterogeneous texts to remove duplicates;
and classifying the normalized heterogeneous text by multiple object names and sorting corresponding numerical values to obtain the object database.
Further, the parameter fusion module includes a classification fusion unit, and the classification fusion unit is configured to:
eliminating parameter differences of the same professional department at different levels, and completing longitudinal parameter fusion by introducing a difference function;
and completing transverse parameter fusion through parameter fusion between different professional departments at the same scheduling level.
The embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor, and when the processor executes the computer program, the following steps are implemented:
acquiring a multi-source heterogeneous text of power load data, and performing format normalization processing on the heterogeneous text;
extracting key characters from the heterogeneous text to construct a knowledge dictionary, and obtaining an object database of the heterogeneous text through multi-source object name classification;
matching the knowledge dictionary with the object name by adopting a MapReduce programming model parallel processing technology, and then matching the object name with object data;
and performing longitudinal parameter fusion and transverse parameter fusion on the multi-source matching result after Reduce processing.
An embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the following steps:
acquiring a multi-source heterogeneous text of power load data, and performing format normalization processing on the heterogeneous text;
extracting key characters from the heterogeneous text to construct a knowledge dictionary, and obtaining an object database of the heterogeneous text through multi-source object name classification;
matching the knowledge dictionary with the object name by adopting a MapReduce programming model parallel processing technology, and then matching the object name with object data;
and performing longitudinal parameter fusion and transverse parameter fusion on the multi-source matching result after Reduce processing.
The multi-source heterogeneous power load data fusion method, device, equipment and storage medium comprise a fusion strategy for power load data and a corresponding data fusion quality evaluation method. The object database and the knowledge dictionary in the fusion strategy structurally separate the object names of the multi-source load data from the object values, MapReduce parallel processing is used to improve the matching efficiency, and longitudinal parameter fusion and transverse parameter fusion are applied to the results after Reduce processing. The evaluation method evaluates the data fusion quality and, at the same time, checks the fused data against the object database and the knowledge dictionary in real time, so that online correction is realized. By processing multi-source heterogeneous power grid load big data at multiple levels, the embodiment of the invention provides a complete power big data strategy covering both fusion and evaluation, and the data fusion is practical and efficient.
Drawings
FIG. 1 is a schematic flow chart diagram of a multi-source heterogeneous power load data fusion method in one embodiment;
FIG. 2 is a schematic flow chart diagram illustrating a method for evaluating fused data according to one embodiment;
FIG. 3 is a flow diagram that illustrates the construction of a knowledge dictionary and database in one embodiment;
FIG. 4 is a schematic flow diagram of longitudinal and lateral data fusion in one embodiment;
FIG. 5 is a block diagram of a multi-source heterogeneous power load data fusion system in one embodiment;
FIG. 6 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In one embodiment, as shown in fig. 1, a multi-source heterogeneous power load data fusion method is provided, which includes the following steps:
step 101, acquiring a multi-source heterogeneous text of power load data, and performing format normalization processing on the heterogeneous text;
step 102, extracting key characters from the heterogeneous text to construct a knowledge dictionary, and classifying by multi-source object names to obtain an object database of the heterogeneous text;
step 103, matching the knowledge dictionary with the object name by adopting a MapReduce programming model parallel processing technology, and then matching the object name with object data;
and step 104, performing longitudinal parameter fusion and transverse parameter fusion on the multi-source matching result after Reduce processing.
Specifically, the method comprises a fusion strategy for power load data and a corresponding data fusion quality evaluation method. The object database and the knowledge dictionary in the fusion strategy structurally separate the object names of the multi-source load data from the object values, and MapReduce parallel processing is used to improve the matching efficiency. MapReduce is a programming model for parallel operation on large-scale data sets (larger than 1 TB): the user specifies a Map function that maps a set of key-value pairs into a set of new key-value pairs, and a concurrent Reduce function that merges all mapped values sharing the same key. Longitudinal parameter fusion and transverse parameter fusion are then applied to the results after Reduce processing. The evaluation method evaluates the data fusion quality and, at the same time, checks the fused data against the object database and the knowledge dictionary in real time, so that online correction is realized. By processing multi-source heterogeneous power grid load big data at multiple levels, the embodiment of the invention provides a complete power big data strategy covering both fusion and evaluation, and the data fusion is practical and efficient.
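As an illustration of the matching in step 103, the following is a minimal single-process sketch of the Map, shuffle and Reduce phases. It is not the patented implementation and does not use a real MapReduce framework; the dictionary contents, the alias-based matching rule and all function names (map_record, shuffle, reduce_group, run) are assumptions introduced only for this example.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical knowledge dictionary: canonical object name -> known aliases.
KNOWLEDGE_DICT = {
    "Bus_A.load": {"bus_a.load", "busa load", "bus-a/load"},
    "Bus_B.load": {"bus_b.load", "busb load"},
}

# Reverse index used by the Map phase: alias -> canonical object name.
ALIAS_TO_NAME = {
    alias: name for name, aliases in KNOWLEDGE_DICT.items() for alias in aliases
}


def map_record(record):
    """Map: emit (object_name, (source, value)) if the raw name is in the dictionary."""
    source, raw_name, value = record
    name = ALIAS_TO_NAME.get(raw_name.strip().lower())
    if name is not None:
        yield name, (source, value)


def shuffle(pairs):
    """Group mapped pairs by key, as a framework does between Map and Reduce."""
    groups = defaultdict(list)
    for key, val in pairs:
        groups[key].append(val)
    return groups


def reduce_group(name, values):
    """Reduce: merge all values that share one object name into {source: value}."""
    return name, dict(values)


def run(records):
    mapped = chain.from_iterable(map_record(r) for r in records)
    return dict(reduce_group(k, v) for k, v in shuffle(mapped).items())


# Illustrative records: (source system, raw object name, value).
records = [
    ("SCADA", "Bus_A.load", 101.5),
    ("PMU", "BusA load", 101.9),
    ("SCADA", "bus_b.load", 55.0),
    ("PMU", "unknown", 1.0),  # not in the dictionary, so it is dropped
]
matched = run(records)
```

In this sketch the Reduce step keeps one value per source for each object name, so a later record from the same source overwrites an earlier one; a real deployment would have to decide how such repeats are handled.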
In one embodiment, as shown in fig. 2, the evaluation method of the fused data includes:
step 201, completing the online correction of parameters by checking the data of the knowledge dictionary and the object database in real time;
step 202, comparing the data after fusion with the data before fusion, and eliminating error values and abnormal values generated in the fusion process;
step 203, ensuring the uniqueness of the same object data, and eliminating redundancy generated by repeated data in the fusion process;
and step 204, evaluating the time effectiveness and the space effectiveness to ensure that the fusion data is real-time and comprehensive.
Specifically, after the data fusion is completed, the data fusion quality needs to be evaluated in terms of integrity, accuracy, uniqueness and effectiveness, and the parameters are corrected online by checking the data against the knowledge dictionary and the object database in real time. The integrity evaluation comprises object quantity evaluation and data quantity evaluation, and ensures that both the number of objects and the corresponding data are complete after data fusion; the accuracy evaluation compares the data after fusion with the data before fusion to prevent the fusion process from introducing error values and abnormal values; the uniqueness evaluation ensures that the data of the same object are unique and prevents redundancy caused by repeated data; the effectiveness evaluation comprises time effectiveness and space effectiveness, and ensures that the data are real-time and comprehensive. After these four characteristics of the data are evaluated, the object names are checked against the knowledge dictionary and the object data are checked against the object database in real time, and the fused data are corrected online.
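The four evaluation aspects can be sketched as simple checks over a fusion result. The sketch below is only one possible reading: the record layout, the tolerance band used for accuracy, the age limit used for time effectiveness and the function name evaluate_fusion are all assumptions, and space effectiveness is not modeled.

```python
from collections import Counter


def evaluate_fusion(before, fused, now, max_age_s=60.0, rel_tol=0.05):
    """Report integrity, accuracy, uniqueness and time-effectiveness problems.

    before: dict mapping object name -> list of source values before fusion
    fused:  list of (name, timestamp_s, value) rows produced by fusion
    """
    fused_names = [name for name, _, _ in fused]

    # Integrity: every object present before fusion should still be present.
    missing = sorted(set(before) - set(fused_names))

    # Accuracy (assumed rule): a fused value should stay within a tolerance
    # band around the range spanned by its pre-fusion source values.
    inaccurate = set()
    for name, _, value in fused:
        sources = before.get(name)
        if not sources:
            inaccurate.add(name)  # fused object with no pre-fusion data
            continue
        lo, hi = min(sources), max(sources)
        margin = rel_tol * max(abs(lo), abs(hi))
        if not (lo - margin <= value <= hi + margin):
            inaccurate.add(name)

    # Uniqueness: each object should appear only once after fusion.
    duplicates = sorted(n for n, c in Counter(fused_names).items() if c > 1)

    # Time effectiveness (assumed rule): rows older than max_age_s are stale.
    stale = sorted({name for name, ts, _ in fused if now - ts > max_age_s})

    return {
        "missing": missing,
        "inaccurate": sorted(inaccurate),
        "duplicates": duplicates,
        "stale": stale,
    }


# Illustrative data only.
before = {"Bus_A.load": [101.5, 101.9], "Bus_B.load": [55.0], "Bus_C.load": [10.0]}
fused = [
    ("Bus_A.load", 1000.0, 101.7),
    ("Bus_B.load", 900.0, 80.0),
    ("Bus_B.load", 1000.0, 55.0),
]
report = evaluate_fusion(before, fused, now=1010.0)
```

Objects flagged in the report would then be the candidates for the online correction against the knowledge dictionary and the object database.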
In one embodiment, as shown in fig. 3, the process of knowledge dictionary and database construction includes:
step 301, acquiring, from a Web Service interface of a power system, multi-source heterogeneous texts in formats including Excel files, DAT files and CIM files;
step 302, extracting key characters and screening names of the normalized heterogeneous texts to remove duplicates to obtain the knowledge dictionary;
and step 303, classifying the normalized heterogeneous texts through multiple object names and sorting corresponding numerical values to obtain the object database.
Specifically, the multi-source heterogeneous text format obtained through the Web Service interface comprises an Excel file, a DAT file, a CIM file and the like; the knowledge dictionary is obtained by extracting key characters from the normalized text data and screening and de-duplicating names; the object database is obtained by classifying the normalized text data by multiple object names and sorting corresponding numerical values; MapReduce parallel processing divides the object database into n object sets for parallel matching, so that the efficiency of matching the object names with the knowledge dictionary and matching the object data with the object names is improved.
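The construction described above can be sketched as follows, assuming the Excel, DAT and CIM inputs have already been normalized into (source, raw name, value) records. The key-character rule (lowercase ASCII letters and digits only), the round-robin split into n object sets and the names normalize_name, build_stores and partition are hypothetical choices made for this example.

```python
import re
from collections import defaultdict


def normalize_name(raw):
    """Reduce a raw object name to its key characters (assumed: ASCII letters and digits)."""
    return re.sub(r"[^a-z0-9]+", "", raw.lower())


def build_stores(records):
    """Build the knowledge dictionary and the object database.

    records: iterable of (source, raw_name, value) taken from normalized texts.
    """
    knowledge_dict = defaultdict(set)  # key characters -> distinct raw names
    object_db = defaultdict(list)      # key characters -> [(source, value), ...]
    for source, raw_name, value in records:
        key = normalize_name(raw_name)
        knowledge_dict[key].add(raw_name)  # the set removes duplicate names
        object_db[key].append((source, value))
    return dict(knowledge_dict), dict(object_db)


def partition(object_db, n):
    """Split the object database into n object sets for parallel matching."""
    parts = [{} for _ in range(n)]
    for i, key in enumerate(sorted(object_db)):
        parts[i % n][key] = object_db[key]
    return parts


# Illustrative records only.
records = [
    ("excel", "Bus_A.load", 101.5),
    ("dat", "BUS-A load", 101.9),
    ("cim", "Bus_B.load", 55.0),
]
knowledge_dict, object_db = build_stores(records)
object_sets = partition(object_db, 2)
```

Keeping names in one store and values in the other is what allows the matching of names to the dictionary and of values to names to be run separately on each object set.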
In one embodiment, as shown in fig. 4, the process of vertical and horizontal data fusion of data includes:
step 401, eliminating parameter differences of the same professional department at different levels, and completing longitudinal parameter fusion by introducing a difference function;
and step 402, completing transverse parameter fusion through parameter fusion between different professional departments at the same scheduling level.
Specifically, longitudinal and transverse parameter fusion covers a wide range of multi-source heterogeneous power load big data, including data from PMUs (phasor measurement units), SCADA (supervisory control and data acquisition) systems, fault recorders and user acquisition systems; in addition, MapReduce parallel processing is introduced for the processing of power load big data, which improves the data processing efficiency; in the evaluation after fusion, the quality of the data fusion result is evaluated from four aspects, and online correction is completed through real-time checking against the knowledge dictionary and the object database.
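The two fusion directions can be sketched as below. The text does not give the form of the difference function, so the relative difference used here, the rule of discarding values that deviate from the highest-level value by more than a threshold, and the use of a median across departments are all assumptions for illustration, as are the function names.

```python
from statistics import median


def relative_difference(a, b):
    """Assumed difference function: |a - b| scaled by the larger magnitude."""
    scale = max(abs(a), abs(b))
    return 0.0 if scale == 0 else abs(a - b) / scale


def fuse_longitudinal(values_by_level, threshold=0.02):
    """Fuse one department's parameter across levels (highest level first).

    Values whose difference from the highest-level value exceeds the
    threshold are discarded; the remaining values are averaged.
    """
    reference = values_by_level[0]
    kept = [v for v in values_by_level
            if relative_difference(reference, v) <= threshold]
    return sum(kept) / len(kept)


def fuse_transverse(value_by_department):
    """Fuse departments at the same level (assumed rule: take the median)."""
    return median(value_by_department.values())


# Illustrative values of one parameter, per department and per level.
params = {
    "dispatch": [100.0, 101.0, 120.0],  # 120.0 deviates too far and is dropped
    "protection": [100.4, 100.6],
    "planning": [99.0],
}
per_department = {dept: fuse_longitudinal(vals) for dept, vals in params.items()}
fused_value = fuse_transverse(per_department)
```

Running the longitudinal step first yields one value per department, which the transverse step then reconciles across departments.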
It should be understood that, although the steps in the above flowcharts are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the order of the steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in the above flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 5, there is provided a multi-source heterogeneous power load data fusion system, comprising:
the load data preprocessing module 501 is configured to obtain a multi-source heterogeneous text of power load data, and perform format normalization processing on the heterogeneous text;
the data classification module 502 is used for extracting key characters from the heterogeneous text to construct a knowledge dictionary, and obtaining an object database of the heterogeneous text through multi-source object name classification;
the data matching module 503 is configured to match the knowledge dictionary with the object names by using a MapReduce programming model parallel processing technology, and then match the object names with the object data;
and the parameter fusion module 504 is configured to perform longitudinal parameter fusion and transverse parameter fusion on the multi-source matching result after the Reduce processing.
In one embodiment, the system further comprises a fusion evaluation module, which is used for:
completing online correction of the parameters by checking the data against the knowledge dictionary and the object database in real time;
comparing the data after fusion with the data before fusion, and eliminating error values and abnormal values generated in the fusion process;
ensuring the uniqueness of the data of the same object, and eliminating the redundancy generated by repeated data in the fusion process;
and evaluating the time effectiveness and the space effectiveness to ensure that the fused data are real-time and comprehensive.
In one embodiment, as shown in fig. 5, the data classification module 502 includes a text normalization unit 5021, which is configured to:
acquiring, from a Web Service interface of a power system, multi-source heterogeneous texts in formats including Excel files, DAT files and CIM files;
obtaining the knowledge dictionary by extracting key characters and screening names of the normalized heterogeneous texts to remove duplicates;
and classifying the normalized heterogeneous text by multiple object names and sorting corresponding numerical values to obtain the object database.
In one embodiment, as shown in fig. 5, the parameter fusion module 504 includes a classification fusion unit 5041, and the classification fusion unit 5041 is configured to:
eliminating parameter differences of the same professional department at different levels, and completing longitudinal parameter fusion by introducing a difference function;
and completing transverse parameter fusion through parameter fusion between different professional departments at the same scheduling level.
For specific limitations of the multi-source heterogeneous power load data fusion system, reference may be made to the above limitations on the multi-source heterogeneous power load data fusion method, which is not described herein again. All or part of each module in the multi-source heterogeneous power load data fusion system can be realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
FIG. 6 is a diagram illustrating an internal structure of a computer device in one embodiment. As shown in fig. 6, the computer device includes a processor, a memory, a network interface, an input device, and a display screen connected through a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program that, when executed by the processor, causes the processor to implement the multi-source heterogeneous power load data fusion method. The internal memory may also store a computer program that, when executed by the processor, causes the processor to perform the multi-source heterogeneous power load data fusion method. The display screen of the computer device may be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer device may be a touch layer covering the display screen, a key, a track ball or a touch pad arranged on the housing of the computer device, or an external keyboard, touch pad or mouse.
Those skilled in the art will appreciate that the architecture shown in fig. 6 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
acquiring a multi-source heterogeneous text of power load data, and performing format normalization processing on the heterogeneous text;
extracting key characters from the heterogeneous text to construct a knowledge dictionary, and obtaining an object database of the heterogeneous text through multi-source object name classification;
matching the knowledge dictionary with the object name by adopting a MapReduce programming model parallel processing technology, and then matching the object name with object data;
and performing longitudinal parameter fusion and transverse parameter fusion on the multi-source matching result after Reduce processing.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
completing online correction of the parameters by checking the data against the knowledge dictionary and the object database in real time;
comparing the data after fusion with the data before fusion, and eliminating error values and abnormal values generated in the fusion process;
ensuring the uniqueness of the data of the same object, and eliminating the redundancy generated by repeated data in the fusion process;
and evaluating the time effectiveness and the space effectiveness to ensure that the fused data are real-time and comprehensive.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
acquiring, from a Web Service interface of a power system, multi-source heterogeneous texts in formats including Excel files, DAT files and CIM files;
obtaining the knowledge dictionary by extracting key characters and screening names of the normalized heterogeneous texts to remove duplicates;
and classifying the normalized heterogeneous text by multiple object names and sorting corresponding numerical values to obtain the object database.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
eliminating parameter differences of the same professional department at different levels, and completing longitudinal parameter fusion by introducing a difference function;
and completing transverse parameter fusion through parameter fusion between different professional departments at the same scheduling level.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
acquiring a multi-source heterogeneous text of power load data, and performing format normalization processing on the heterogeneous text;
extracting key characters from the heterogeneous text to construct a knowledge dictionary, and obtaining an object database of the heterogeneous text through multi-source object name classification;
matching the knowledge dictionary with the object names by parallel processing under the MapReduce programming model, and then matching the object names with the object data;
and performing longitudinal parameter fusion and transverse parameter fusion on the multi-source matching result after Reduce processing.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
completing online correction of the parameters through real-time verification of the data in the knowledge dictionary and the object database;
comparing the data before fusion with the data after fusion, and eliminating error values and abnormal values generated in the fusion process;
ensuring the uniqueness of the data of the same object, and eliminating the redundancy generated by repeated data in the fusion process;
and evaluating the temporal effectiveness and the spatial effectiveness, so that the data are fused in real time and comprehensively.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
acquiring multi-source heterogeneous texts, in formats including Excel files, DAT files and CIM files, from a Web Service interface of a power system;
obtaining the knowledge dictionary by extracting key characters from the normalized heterogeneous texts and screening the names to remove duplicates;
and classifying the normalized heterogeneous texts by the multiple object names and sorting the corresponding numerical values to obtain the object database.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
eliminating parameter differences of the same professional department at different levels, and completing longitudinal parameter fusion by introducing a difference function;
and completing transverse parameter fusion through parameter fusion between different professional departments at the same scheduling level.
Those skilled in the art will understand that all or part of the processes of the methods in the above embodiments can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, any combination that involves no contradiction should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application. Although they are described specifically and in detail, they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of this patent shall be subject to the appended claims.

Claims (10)

1. A multi-source heterogeneous power load data fusion method is characterized by comprising the following steps:
acquiring a multi-source heterogeneous text of power load data, and performing format normalization processing on the heterogeneous text;
extracting key characters from the heterogeneous text to construct a knowledge dictionary, and obtaining an object database of the heterogeneous text through multi-source object name classification;
matching the knowledge dictionary with the object names by parallel processing under the MapReduce programming model, and then matching the object names with the object data;
and performing longitudinal parameter fusion and transverse parameter fusion on the multi-source matching result after Reduce processing.
2. The method of claim 1, further comprising, after fusing the heterogeneous text, an evaluation of fused data, the evaluation of fused data comprising:
completing online correction of the parameters through real-time verification of the data in the knowledge dictionary and the object database;
comparing the data before fusion with the data after fusion, and eliminating error values and abnormal values generated in the fusion process;
ensuring the uniqueness of the data of the same object, and eliminating the redundancy generated by repeated data in the fusion process;
and evaluating the temporal effectiveness and the spatial effectiveness, so that the data are fused in real time and comprehensively.
3. The method of claim 2, wherein the extracting key characters from the heterogeneous text to construct a knowledge dictionary, and obtaining an object database of the heterogeneous text by multi-source object name classification comprises:
acquiring multi-source heterogeneous texts, in formats including Excel files, DAT files and CIM files, from a Web Service interface of a power system;
obtaining the knowledge dictionary by extracting key characters from the normalized heterogeneous texts and screening the names to remove duplicates;
and classifying the normalized heterogeneous texts by the multiple object names and sorting the corresponding numerical values to obtain the object database.
4. The method according to claim 2, wherein the performing longitudinal parameter fusion and transverse parameter fusion on the Reduce-processed multi-source matching result comprises:
eliminating parameter differences of the same professional department at different levels, and completing longitudinal parameter fusion by introducing a difference function;
and completing transverse parameter fusion through parameter fusion between different professional departments at the same scheduling level.
5. A multi-source heterogeneous power load data fusion system, comprising:
a load data preprocessing module, configured to acquire a multi-source heterogeneous text of power load data and perform format normalization processing on the heterogeneous text;
a data classification module, configured to extract key characters from the heterogeneous text to construct a knowledge dictionary, and to obtain an object database of the heterogeneous text through multi-source object name classification;
a data matching module, configured to match the knowledge dictionary with the object names by parallel processing under the MapReduce programming model, and then to match the object names with the object data;
and a parameter fusion module, configured to perform longitudinal parameter fusion and transverse parameter fusion on the multi-source matching result after Reduce processing.
6. The multi-source heterogeneous power load data fusion system of claim 5, further comprising a fusion assessment module configured for:
completing online correction of the parameters through real-time verification of the data in the knowledge dictionary and the object database;
comparing the data before fusion with the data after fusion, and eliminating error values and abnormal values generated in the fusion process;
ensuring the uniqueness of the data of the same object, and eliminating the redundancy generated by repeated data in the fusion process;
and evaluating the temporal effectiveness and the spatial effectiveness, so that the data are fused in real time and comprehensively.
7. The multi-source heterogeneous power load data fusion system of claim 5, wherein the data classification module comprises a text normalization unit configured for:
acquiring multi-source heterogeneous texts, in formats including Excel files, DAT files and CIM files, from a Web Service interface of a power system;
obtaining the knowledge dictionary by extracting key characters from the normalized heterogeneous texts and screening the names to remove duplicates;
and classifying the normalized heterogeneous texts by the multiple object names and sorting the corresponding numerical values to obtain the object database.
8. The multi-source heterogeneous power load data fusion system of claim 5, wherein the parameter fusion module comprises a classification fusion unit configured for:
eliminating parameter differences of the same professional department at different levels, and completing longitudinal parameter fusion by introducing a difference function;
and completing transverse parameter fusion through parameter fusion between different professional departments at the same scheduling level.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 4 are implemented when the computer program is executed by the processor.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 4.
CN202110809255.2A 2021-07-16 2021-07-16 Multi-source heterogeneous power load data fusion method, device, equipment and storage medium Pending CN113407723A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110809255.2A CN113407723A (en) 2021-07-16 2021-07-16 Multi-source heterogeneous power load data fusion method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110809255.2A CN113407723A (en) 2021-07-16 2021-07-16 Multi-source heterogeneous power load data fusion method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113407723A true CN113407723A (en) 2021-09-17

Family

ID=77686754

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110809255.2A Pending CN113407723A (en) 2021-07-16 2021-07-16 Multi-source heterogeneous power load data fusion method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113407723A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113836940A (en) * 2021-09-26 2021-12-24 中国南方电网有限责任公司 Knowledge fusion method and device in electric power metering field and computer equipment
CN114970667A (en) * 2022-03-30 2022-08-30 国网吉林省电力有限公司 Multi-source heterogeneous energy data fusion method
CN116303392A (en) * 2023-03-02 2023-06-23 重庆市规划和自然资源信息中心 Multi-source data table management method for real estate registration data

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103617557A (en) * 2013-11-06 2014-03-05 广东电网公司电力科学研究院 Multi-source heterogeneous power grid operation parameter analysis system
CN105184424A (en) * 2015-10-19 2015-12-23 国网山东省电力公司菏泽供电公司 Mapreduced short period load prediction method of multinucleated function learning SVM realizing multi-source heterogeneous data fusion
CN107402976A (en) * 2017-07-03 2017-11-28 国网山东省电力公司经济技术研究院 Power grid multi-source data fusion method and system based on multi-element heterogeneous model
CN108170752A (en) * 2017-12-21 2018-06-15 山东合天智汇信息技术有限公司 metadata management method and system based on template
CN109086573A (en) * 2018-07-30 2018-12-25 东北师范大学 Multi-source biology big data convergence platform
CN109165202A (en) * 2018-07-04 2019-01-08 华南理工大学 A kind of preprocess method of multi-source heterogeneous big data
CN110781249A (en) * 2019-10-16 2020-02-11 华电国际电力股份有限公司技术服务分公司 Knowledge graph-based multi-source data fusion method and device for thermal power plant
CN111897875A (en) * 2020-07-31 2020-11-06 平安科技(深圳)有限公司 Fusion processing method and device for urban multi-source heterogeneous data and computer equipment
CN112214928A (en) * 2020-09-27 2021-01-12 贵州电网有限责任公司 Multi-source data processing and fusing method and system for low-voltage power distribution network
CN113051249A (en) * 2021-03-22 2021-06-29 江苏杰瑞信息科技有限公司 Cloud service platform design method based on multi-source heterogeneous big data fusion

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113836940A (en) * 2021-09-26 2021-12-24 中国南方电网有限责任公司 Knowledge fusion method and device in electric power metering field and computer equipment
CN113836940B (en) * 2021-09-26 2024-04-12 南方电网数字电网研究院股份有限公司 Knowledge fusion method and device in electric power metering field and computer equipment
CN114970667A (en) * 2022-03-30 2022-08-30 国网吉林省电力有限公司 Multi-source heterogeneous energy data fusion method
CN114970667B (en) * 2022-03-30 2024-03-29 国网吉林省电力有限公司 Multi-source heterogeneous energy data fusion method
CN116303392A (en) * 2023-03-02 2023-06-23 重庆市规划和自然资源信息中心 Multi-source data table management method for real estate registration data
CN116303392B (en) * 2023-03-02 2023-09-01 重庆市规划和自然资源信息中心 Multi-source data table management method for real estate registration data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210917