CN109710647B - Power grid standing book data fusion method and device based on keyword search - Google Patents

Publication number
CN109710647B
Authority
CN
China
Prior art keywords
data
object group
measuring point
matching
data object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811640460.5A
Other languages
Chinese (zh)
Other versions
CN109710647A (en)
Inventor
陈冠缘
田翔
周刚
马凯
罗颖婷
黄勇
鄂盛龙
徐思尧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Power Grid Co Ltd
Electric Power Research Institute of Guangdong Power Grid Co Ltd
Original Assignee
Guangdong Power Grid Co Ltd
Electric Power Research Institute of Guangdong Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Power Grid Co Ltd and Electric Power Research Institute of Guangdong Power Grid Co Ltd
Priority to CN201811640460.5A
Publication of CN109710647A
Application granted
Publication of CN109710647B
Legal status: Active

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The application discloses a power grid ledger data fusion method and device based on keyword search. The method combines exact keyword matching with fuzzy matching on initial letters. Keyword extraction first removes identification words in the data naming information that are irrelevant to region information and equipment information, and primary data matching and association is performed by keyword comparison. Data that fails the primary matching then undergoes secondary fuzzy matching on initial letters. This reduces the probability that human input errors cause data matching to fail and create new data islands, and addresses the data island phenomenon caused by abnormal data naming.

Description

Power grid standing book data fusion method and device based on keyword search
Technical Field
The present application relates to the field of data fusion, and in particular, to a data fusion method and apparatus based on keyword search.
Background
With the maturing of big data technology, local power grid operation and maintenance departments have gradually built power quality monitoring systems on power grid big data by establishing standard databases and fusing data. However, the database systems are managed by different teams, and before fusion their data are independent and heterogeneous. Because of managers' personal preferences or human errors in information entry, even the same data object may be named differently in different database systems. During data fusion, data objects with abnormal names are difficult to merge with their counterparts or to associate with other valid data objects, so new data islands form.
For data islands caused by abnormal data naming, the existing processing method can only check, compare, and correct the abnormal data manually, one item at a time. This is time-consuming and labor-intensive, so the existing handling of data islands caused by abnormal data naming is inefficient.
Disclosure of Invention
The application provides a power grid ledger data fusion method and device based on keyword search, which address the technical problem that the existing processing method can only check, compare, and correct abnormal data manually, one item at a time, which wastes time and labor and makes the handling of data islands caused by abnormal data naming inefficient.
In view of this, the first aspect of the present application provides a power grid ledger data fusion method based on keyword search, including:
acquiring measuring point ledger data from each database platform, and performing keyword extraction on the naming information of the measuring point ledger data to obtain a data name keyword set corresponding to each piece of measuring point ledger data;
performing primary matching on the data name keyword sets according to their keyword elements, and associating measuring point ledger data whose keyword matching results are consistent into a data object group;
extracting the initials of each data name keyword set that has not been merged, performing secondary matching between those initials and the initials of the elements of each data object group, and, if the initials are consistent, adding the data name keyword set to that data object group;
and uniformly updating the naming information of the measuring point ledger data in each data object group according to the keyword elements and a preset data naming template.
Preferably, after adding the data name keyword set to the data object group, the method further includes:
acquiring the management region topological relation of each piece of measuring point ledger data in the data object group, checking by data comparison whether it is consistent with the reference management region topological relation of the data object group, and removing the current measuring point ledger data from the data object group if the two are inconsistent.
Preferably, after the naming information of the measuring point ledger data in the same data object group is uniformly updated, the method further includes:
counting the remaining measuring point ledger data that has not been merged into any data object group, and merging all of it into an undefined data object group.
Preferably, the data name keyword set specifically includes: measuring point region information, measuring point equipment type information, and measuring point equipment parameter information.
Preferably, the measuring point equipment types specifically include: transformer substations, power transmission lines, distribution transformer equipment, and user-side equipment.
The second aspect of the present application provides a power grid ledger data fusion device based on keyword search, including:
a preprocessing module, configured to acquire measuring point ledger data from each database platform and perform keyword extraction on the naming information of the measuring point ledger data to obtain a data name keyword set corresponding to each piece of measuring point ledger data;
a primary association module, configured to perform primary matching on the data name keyword sets according to their keyword elements and associate measuring point ledger data whose keyword matching results are consistent into a data object group;
a secondary association module, configured to extract the initials of each data name keyword set that has not been merged, perform secondary matching between those initials and the initials of the elements of each data object group, and add the data name keyword set to that data object group if the initials are consistent;
and a data association processing module, configured to uniformly update the naming information of the measuring point ledger data in each data object group according to the keyword elements and a preset data naming template.
Preferably, the device further includes:
a checking module, configured to acquire the management region topological relation of each piece of measuring point ledger data in the data object group, check by data comparison whether it is consistent with the reference management region topological relation of the data object group, and remove the current measuring point ledger data from the data object group if the two are inconsistent.
Preferably, the device further includes:
a residual data counting module, configured to count the remaining measuring point ledger data that has not been merged into any data object group and merge all of it into an undefined data object group.
Preferably, the data name keyword set specifically includes: measuring point region information, measuring point equipment type information, and measuring point equipment parameter information.
Preferably, the measuring point equipment types specifically include: transformer substations, power transmission lines, distribution transformer equipment, and user-side equipment.
As can be seen from the above technical solutions, the present application has the following advantages:
The application provides a power grid ledger data fusion method based on keyword search, which includes: acquiring measuring point ledger data from each database platform, and performing keyword extraction on the naming information of the measuring point ledger data to obtain a data name keyword set corresponding to each piece of measuring point ledger data; performing primary matching on the data name keyword sets according to their keyword elements, and associating measuring point ledger data whose keyword matching results are consistent into a data object group; extracting the initials of each data name keyword set that has not been merged, performing secondary matching between those initials and the initials of the elements of each data object group, and, if the initials are consistent, adding the data name keyword set to that data object group; and uniformly updating the naming information of the measuring point ledger data in each data object group according to the keyword elements and a preset data naming template.
The method combines exact keyword matching with fuzzy matching on initials. Keyword extraction removes identification words in the data naming information that are irrelevant to region information and equipment information, and primary data matching and association is performed by keyword comparison. Data that fails the primary matching then undergoes secondary fuzzy matching on initials. This reduces the probability that human input errors cause data matching to fail and create new data islands, and addresses the data islands caused by abnormal data naming.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can derive other drawings from them without inventive effort.
Fig. 1 is a schematic flowchart of a first embodiment of a power grid ledger data fusion method based on keyword search according to the present application;
fig. 2 is a schematic flowchart of a second embodiment of a power grid ledger data fusion method based on keyword search according to the present application;
fig. 3 is a schematic structural diagram of a power grid ledger data fusion device based on keyword search according to the present application.
Detailed Description
The embodiments of the application provide a power grid ledger data fusion method and device based on keyword search, which address the technical problem that the existing processing method can only check, compare, and correct abnormal data manually, one item at a time, which wastes time and labor and makes the handling of data islands caused by abnormal data naming inefficient.
In order to make the objects, features and advantages of the present invention more apparent and understandable, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the embodiments described below are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, an embodiment of the present application provides a power grid ledger data fusion method based on keyword search, including:
Step 101, acquiring measuring point ledger data from each database platform, and performing keyword extraction on the naming information of the measuring point ledger data to obtain a data name keyword set corresponding to each piece of measuring point ledger data;
After the measuring point ledger data is obtained from each database platform and before data matching and association, keyword extraction preprocessing is performed on its naming information. Words that identify an organizational unit, such as power supply bureau, power grid, limited company, limited responsibility, bureau, company, power supply station, city, county, and province, are removed from the naming information, and the retained region information and equipment information are combined to obtain the data name keyword set corresponding to the measuring point ledger data.
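The preprocessing in step 101 can be illustrated with a minimal Python sketch. It is not the patented implementation: the English word list stands in for the original Chinese organizational terms, and the function name `extract_keywords` and the whitespace tokenization are assumptions made only for this illustration.

```python
from __future__ import annotations

import re

# Organizational words named in the description. The English spellings are
# stand-ins for the original Chinese terms (an assumption of this sketch).
ORG_WORDS = [
    "power supply bureau", "power supply station", "limited company",
    "limited responsibility", "power grid", "bureau", "company",
    "city", "county", "province",
]


def extract_keywords(name: str) -> frozenset[str]:
    """Strip organizational words and return the remaining tokens as a set."""
    text = name.lower()
    # Remove longer phrases first so "power supply bureau" is handled
    # before the shorter word "bureau".
    for word in sorted(ORG_WORDS, key=len, reverse=True):
        text = re.sub(r"\b" + re.escape(word) + r"\b", " ", text)
    return frozenset(text.split())


# Two differently worded names reduce to the same keyword set.
a = extract_keywords("Sheep Street Power Supply Bureau 35kV Sheep Street Station")
b = extract_keywords("35kV sheep street station")
print(a == b)
```

Returning a set rather than a list makes the later comparison insensitive to word order, which is one possible reading of "consistent keyword matching results".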
Step 102, performing primary matching on the data name keyword sets according to their keyword elements, and associating measuring point ledger data whose keyword matching results are consistent into a data object group;
It should be noted that, after the data name keyword sets are obtained, preliminary matching is performed by keyword matching, either single-word matching or phrase matching, according to the keyword elements in each set. If the keyword matching results are consistent, the measuring point ledger data belong to the same measuring point object, and the data are added to one data object group for data association.
Because keyword extraction has already removed irrelevant words, primary matching compares only the effective information, such as region information. This avoids matching errors caused by non-standard naming when database managers enter data, and gives a first improvement in the data matching rate.
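As a hedged sketch of the primary matching in step 102, the grouping below treats "consistent keyword matching results" as exact equality of keyword sets. That interpretation, the record identifiers, and the function name `primary_match` are assumptions of this example.

```python
from __future__ import annotations

from collections import defaultdict


def primary_match(
    records: dict[str, frozenset[str]],
) -> dict[frozenset[str], list[str]]:
    """Group record ids whose data name keyword sets are identical."""
    groups: dict[frozenset[str], list[str]] = defaultdict(list)
    for record_id, keywords in records.items():
        groups[keywords].append(record_id)
    return dict(groups)


# Hypothetical records from two database platforms, A and B.
records = {
    "A-1": frozenset({"35kv", "sheep", "street", "station"}),
    "B-7": frozenset({"sheep", "street", "35kv", "station"}),
    "B-9": frozenset({"500kv", "buddha", "mountain", "station"}),
}
groups = primary_match(records)
for keywords, members in groups.items():
    print(sorted(keywords), members)
```

Groups with a single member are candidates for the secondary matching that follows.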
Step 103, extracting the initials of each data name keyword set that has not been merged, performing secondary matching between those initials and the initials of the elements of each data object group, and, if the initials are consistent, adding the data name keyword set to that data object group;
It should be noted that, after the data object group corresponding to each measuring point object is obtained through preliminary matching, the measuring point ledger data that was not matched to any measuring point object in the preliminary matching step is collected. Most such data fails to match because a database manager entered a character with a similar pinyin when entering the data, so secondary matching on initials is applied to this data.
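The secondary matching in step 103 can be sketched as follows. The sketch assumes each name has already been romanized into pinyin syllables by some upstream step, which is outside this example; the data layout and the names `initials` and `secondary_match` are hypothetical.

```python
from __future__ import annotations


def initials(syllables: list[str]) -> str:
    """First letter of each romanized syllable, e.g. ['ke', 'kong'] -> 'kk'."""
    return "".join(s[0].lower() for s in syllables if s)


def secondary_match(
    unmatched: dict[str, list[str]],
    groups: dict[str, dict],
) -> list[str]:
    """Add each unmatched record to the group whose initials agree.

    Returns the ids of records that still match no group.
    """
    index = {initials(g["syllables"]): g for g in groups.values()}
    leftover = []
    for record_id, syllables in unmatched.items():
        group = index.get(initials(syllables))
        if group is None:
            leftover.append(record_id)
        else:
            group["members"].append(record_id)
    return leftover


# Hypothetical data: B-7 contains one mistyped syllable ("kuang" for "kong"),
# which leaves the initials unchanged.
groups = {
    "G1": {
        "syllables": ["ke", "kong", "bian", "nong", "jia", "cun", "gong", "bian"],
        "members": ["A-1"],
    }
}
unmatched = {
    "B-7": ["ke", "kuang", "bian", "nong", "jia", "cun", "gong", "bian"],
    "B-9": ["da", "xian"],
}
leftover = secondary_match(unmatched, groups)
print(groups["G1"]["members"], leftover)
```

Matching on initials alone is deliberately loose, which is why a later consistency check on the grouped data is useful.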
Step 104, uniformly updating the naming information of the measuring point ledger data in each data object group according to the keyword elements and a preset data naming template.
After the secondary matching, the naming information of the classified measuring point ledger data is uniformly corrected according to the preset data naming template and the keyword information in the data name keyword set, so that the naming information of all data belonging to the same measuring point object is consistent.
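A minimal sketch of the uniform renaming in step 104 is given below. The template string follows the "voltage class + name + station" pattern that the description specifies for substations; the field names, record layout, and function name `rename_group` are assumptions for illustration.

```python
from __future__ import annotations


def rename_group(
    members: list[dict],
    keywords: dict[str, str],
    template: str,
) -> list[dict]:
    """Return copies of the members with one standard name applied to all."""
    standard_name = template.format(**keywords)
    return [{**member, "name": standard_name} for member in members]


# Hypothetical group whose members were named differently in two systems.
members = [
    {"id": "A-1", "name": "35kV sheep street transformer substation"},
    {"id": "B-7", "name": "sheep street 35kV station"},
]
renamed = rename_group(
    members,
    keywords={"voltage": "35kV", "name": "sheep street"},
    template="{voltage} {name} station",
)
print([m["name"] for m in renamed])
```

Returning new dictionaries instead of editing in place keeps the original names available for auditing.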
This embodiment combines exact keyword matching with fuzzy matching on initials. Keyword extraction removes identification words in the data naming information that are irrelevant to region information and equipment information, and primary data matching and association is performed by keyword comparison. Data that fails the primary matching then undergoes secondary fuzzy matching on initials. This reduces the probability that human input errors cause data matching to fail and create new data islands, and addresses the technical problem that the existing processing method can only check, compare, and correct abnormal data manually, one item at a time, which wastes time and labor and makes the handling of data islands caused by abnormal data naming inefficient.
The foregoing is a detailed description of the first embodiment of the power grid ledger data fusion method based on keyword search provided by the present application; the following is a detailed description of the second embodiment.
Referring to fig. 2, an embodiment of the present application provides a power grid ledger data fusion method based on keyword search, including:
Step 201, acquiring measuring point ledger data from each database platform, and performing keyword extraction on the naming information of the measuring point ledger data to obtain a data name keyword set corresponding to each piece of measuring point ledger data;
After the measuring point ledger data is obtained from each database platform and before data matching and association, keyword extraction preprocessing is performed on its naming information. Words that identify an organizational unit, such as power supply bureau, power grid, limited company, limited responsibility, bureau, company, power supply station, city, county, and province, are removed from the naming information, and the retained region information and equipment information are combined to obtain the data name keyword set corresponding to the measuring point ledger data.
Step 202, performing primary matching on the data name keyword sets according to their keyword elements, and associating measuring point ledger data whose keyword matching results are consistent into a data object group;
It should be noted that, after the data name keyword sets are obtained, preliminary matching is performed by keyword matching according to the keyword elements in each set. If the keyword matching results are consistent, the measuring point ledger data belong to the same measuring point object, and the data are added to one data object group for data association.
Because keyword extraction has already removed irrelevant words, primary matching compares only the effective information, such as region information. This avoids matching errors caused by non-standard naming when database managers enter data, and gives a first improvement in the data matching rate.
Step 203, extracting the initials of each data name keyword set that has not been merged, performing secondary matching between those initials and the initials of the elements of each data object group, and, if the initials are consistent, adding the data name keyword set to that data object group;
It should be noted that, after the data object group corresponding to each measuring point object is obtained through preliminary matching, the measuring point ledger data that was not matched to any measuring point object in the preliminary matching step is collected. Most such data fails to match because a database manager entered a character with a similar pinyin when entering the data, so secondary matching on initials is applied to this data.
More specifically, the data name keyword set includes: measuring point region information, measuring point equipment type information, and measuring point equipment parameter information.
More specifically, the measuring point equipment types include: transformer substations, power transmission lines, distribution transformer equipment, and user-side equipment.
1. Transformer substation fusion
Substation ledger fusion is similar to unit fusion, and a substation can be regarded as one type of unit. Some substations are directly managed by prefecture-level city bureaus and some by branch (county) bureaus, so substations are listed separately.
Substations are generally named after the place where they are located. For example, the 35kV sheep street transformer substation is built in a place called sheep street. The lower the voltage level, the smaller the administrative region its place name refers to; the higher the voltage level, the larger the region, for example: 500kV Buddha mountain station.
To manage substation ledgers effectively, the specified substation naming standard is: voltage class + name + station. For example, the 35kV sheep street transformer substation becomes 35kV sheep street station after conversion to the standard name.
Substations are fused based on the prefecture-level city and the name. To prevent and reduce fusion errors, a substation list compiled by each city bureau can be established, and fusion and error correction are carried out according to the unit fusion algorithm in 3.4.2.3.1.
2. Line fusion
Lines in this part include buses and feeders. A bus collects electric energy and is the source of the outgoing lines. A feeder supplies power to large users and to 10kV/0.4kV distribution transformers and is a key link in power quality evaluation. The dispatching automation system, voltage monitoring system, marketing system, power quality system, and others all have monitoring points on buses and power supply lines that collect data such as voltage, current, active power, and reactive power, so line ledger fusion is critical.
Unified line naming is the basis for multi-system data fusion. The specified line naming rule is: substation name + voltage level + name + line, where the substation name follows the substation naming standard, for example: 35kV sheep street station 10kV large line.
The core of multi-system line fusion is the line name, and lines are matched and fused based on the name. The following algorithm may be used:
(1) Establish an equivalent character table that records the correspondence between certain keywords and the symbols that stand for them, for example: bus bar, bus, and M; 2 and II; number and #.
(2) Match the line names. Replaceable content in a name is first replaced according to the equivalent character table and converted to the standard representation. For example, "110kV base pond station 10kV-2 section bus" is replaced by "110kV base pond station 10kV II bus". The name can then be split into keywords: substation name, line voltage level, and line name. The substation names are matched first; if they do not match, the candidate is skipped, and if they match, the voltage levels are matched next, and so on in sequence. More suitable algorithms can also be introduced based on this idea.
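Steps (1) and (2) can be sketched in Python as below. The equivalence table contains only the example pairs named in the text, rendered in English; the assumption that each name is already split into (substation, voltage, line) fields, and the names `normalize` and `match_line`, are illustrative only.

```python
from __future__ import annotations

import re

# Equivalent character table: each pattern is rewritten to one standard form.
# Only the pairs mentioned in the description are included.
EQUIVALENTS = [
    (re.compile(r"\bbus\s*bar\b", re.IGNORECASE), "bus"),
    (re.compile(r"\bM\b"), "bus"),
    (re.compile(r"\b2\b"), "II"),
    (re.compile(r"\bnumber\b", re.IGNORECASE), "#"),
]


def normalize(field: str) -> str:
    """Apply the equivalence table, collapse whitespace, and ignore case."""
    for pattern, standard in EQUIVALENTS:
        field = pattern.sub(standard, field)
    return " ".join(field.split()).casefold()


def match_line(a: tuple[str, str, str], b: tuple[str, str, str]) -> bool:
    """Compare substation name, voltage level, and line name in that order."""
    for field_a, field_b in zip(a, b):
        if normalize(field_a) != normalize(field_b):
            return False  # stop at the first field that disagrees
    return True


same = match_line(
    ("base pond station", "10kV", "2 section bus bar"),
    ("base pond station", "10kV", "II section M"),
)
different = match_line(
    ("base pond station", "10kV", "II section bus"),
    ("sheep street station", "10kV", "II section bus"),
)
print(same, different)
```

Checking the substation name first means most non-matching candidates are rejected after a single comparison.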
(3) Error correction measures. To prevent the matching rate from dropping because of human input errors, Chinese error correction is introduced and applied to ledgers that the conventional algorithm cannot match. Chinese input errors mainly arise from visually similar characters, similar pinyin, and the like, and the following algorithms may be used.
a. Name distance similarity. The similarity of two strings is computed from their distance. For example, the distance between "10kV reliable line rural village public transformer" and "10kV controllable rural village public transformer" is 2 (counted in characters of the original Chinese names), while the distance between "10kV reliable line rural village public transformer" and itself is 0. Edit distance and longest prefix matching are used, and a distance threshold is set, for example, allowing 2 characters to differ, 1 character to differ, or 6 characters to differ.
b. Error correction based on pinyin initials. Most Chinese input errors come from homophones, so errors can be corrected using the full pinyin and the initials. For example, requiring the initials to be identical: the initials of "controllable rural village public transformer" are kkbnjcgb, and system B contains an entry whose initials are also kkbnjcgb, so the two are recognized as the same ledger entry.
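The name distance in item a can be illustrated with a standard Levenshtein edit distance. This is a textbook dynamic-programming version, not necessarily the variant the applicants used, and the default threshold of 2 in `same_ledger` is only an example value.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, and substitutions cost 1."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,  # delete char_a
                    current[j - 1] + 1,  # insert char_b
                    previous[j - 1] + (char_a != char_b),  # substitute
                )
            )
        previous = current
    return previous[-1]


def same_ledger(a: str, b: str, max_distance: int = 2) -> bool:
    """Treat two names as the same entry when they differ by few characters."""
    return edit_distance(a, b) <= max_distance


print(edit_distance("kitten", "sitting"))  # 3
print(same_ledger("10kV large line", "10kV largo line"))  # True
```

Because Python strings are sequences of Unicode code points, the same function counts differing Chinese characters when given the original names.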
3. Transformer fusion
Transformers can be divided by location into in-station transformers in substations and low-voltage distribution transformers (specifically 10kV/0.4kV). In-station transformers can be fused with a lower error rate on the basis of the substation, and low-voltage distribution transformers on the basis of the power supply line.
In-station transformers are generally named 1# main transformer, 2# main transformer, and so on, and a substation generally has no more than 3 transformers, so the ledgers can be matched well simply by establishing a naming rule and recording the substation to which each transformer belongs. The naming rule for in-station transformers is: substation name + number + # main transformer.
Low-voltage distribution transformers are named after their power supply area and can be divided into special (dedicated) transformers and public transformers, for example: power community special transformer and civil bureau public transformer. In fact, distribution transformer ledgers can be fused better by adding the supplying substation and the power supply line, so they are named as: substation name + power supply line name + public transformer (or special transformer). Because most systems do not include the substation and line names in the low-voltage distribution transformer name, matching first extracts and matches the power supply line field in the distribution transformer ledger table and then matches the transformer name. The name of the supplying substation can be obtained through the power supply line.
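The two-stage matching described for low-voltage distribution transformers, power supply line field first and transformer name second, might look like the sketch below. The field names `supply_line`, `name`, and `id` and the function name `match_transformers` are hypothetical, and exact string equality stands in for whatever comparison a real system would use.

```python
from __future__ import annotations


def match_transformers(
    system_a: list[dict],
    system_b: list[dict],
) -> list[tuple[str, str]]:
    """Pair records that share a power supply line and then a transformer name."""
    # Index system B by supply line, then by transformer name.
    index: dict[str, dict[str, str]] = {}
    for record in system_b:
        index.setdefault(record["supply_line"], {})[record["name"]] = record["id"]

    pairs = []
    for record in system_a:
        candidates = index.get(record["supply_line"], {})
        if record["name"] in candidates:
            pairs.append((record["id"], candidates[record["name"]]))
    return pairs


# Hypothetical ledger rows from two systems.
system_a = [
    {"id": "A-1", "supply_line": "10kV large line", "name": "civil bureau public transformer"},
    {"id": "A-2", "supply_line": "10kV large line", "name": "power community special transformer"},
]
system_b = [
    {"id": "B-1", "supply_line": "10kV large line", "name": "civil bureau public transformer"},
    {"id": "B-2", "supply_line": "10kV other line", "name": "power community special transformer"},
]
pairs = match_transformers(system_a, system_b)
print(pairs)
```

Restricting candidates to the same power supply line keeps identically named transformers on different lines from being merged.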
Error correction measures. To prevent the matching rate from dropping because of human input errors, Chinese error correction is introduced and applied to ledgers that the conventional algorithm cannot match. Chinese input errors mainly arise from visually similar characters, similar pinyin, and the like, and the following algorithms may be used.
(1) Name distance similarity. The similarity of two strings is computed from their distance. For example, the distance between "10kV reliable line rural village public transformer" and "10kV controllable rural village public transformer" is 2 (counted in characters of the original Chinese names), while the distance between "10kV reliable line rural village public transformer" and itself is 0. Edit distance and longest prefix matching are used, and a distance threshold is set, for example, allowing 2 characters to differ, 1 character to differ, or 6 characters to differ.
(2) Error correction based on pinyin initials. Most Chinese input errors come from homophones, so errors can be corrected using the full pinyin and the initials. For example, requiring the initials to be identical: the initials of "controllable rural village public transformer" are kkbnjcgb, and system B contains an entry whose initials are also kkbnjcgb, so the two are recognized as the same ledger entry.
4. User account fusion
The power system has installed intelligent electric energy meters, load control terminals, voltage monitors, FTUs, and other equipment on the user side on a large scale, and the data are distributed across the metering system, marketing system, voltage system, and others, so users are one type of measuring point.
Users can be classified in many ways, such as by load type, power supply capacity, or load characteristics. This scheme classifies users into three categories: commercial users, industrial users, and residential users.
User ledger fusion first establishes standard naming rules. Industrial and commercial users are named by their registered enterprise names, and residential users by the householder's name.
Users are matched and fused mainly by power supply line or low-voltage distribution transformer together with the name: the power supply line or low-voltage distribution transformer name is matched first, and the final match is made on the user name.
Error correction measures. To prevent the matching rate from dropping because of human input errors, Chinese error correction is introduced and applied to ledgers that the conventional algorithm cannot match. Chinese error correction handles mistyped words by means of similar-looking characters, pinyin and the like; the following algorithms may be used as a reference.
(1) Name distance similarity. The similarity of two strings is measured by their edit distance. For example, the distance between "10kV reliable line rural village public transformer" and "10kV controllable transformer rural village public transformer" is 2, while the distance between "10kV reliable line rural village public transformer" and itself is 0. Given the edit distance, a distance threshold is set in combination with longest prefix matching, for example, 2 characters may be allowed to differ, or 1 character, or 6 characters.
(2) Error correction based on pinyin initials. Most Chinese input errors come from homophones, so errors can be corrected using the full pinyin and the pinyin initials, for example by requiring the initials to be identical. The initials of "controllable transformer rural village public transformer" are kkbnjcgb, and system B contains an entry for the village public transformer whose initials are also kkbnjcgb; the two are therefore recognized as the same measurement point object.
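The initials comparison in (2) can be illustrated with a short Python sketch. To stay self-contained it assumes each name has already been converted to a list of pinyin syllables; a real system would need a character-to-pinyin conversion step, which is not shown. The function names and the syllable lists are illustrative assumptions.

```python
from typing import List


def pinyin_initials(syllables: List[str]) -> str:
    """Join the first letter of every pinyin syllable, e.g. ['ke', 'kong'] -> 'kk'."""
    return "".join(s[0].lower() for s in syllables if s)


def same_object_by_initials(a: List[str], b: List[str]) -> bool:
    """Fuzzy match: two names refer to the same object if their initials agree."""
    return pinyin_initials(a) == pinyin_initials(b)


# Assumed romanization of the example name; its initials are kkbnjcgb.
system_a = ["ke", "kong", "bian", "nong", "jia", "cun", "gong", "bian"]
# Hypothetical entry from system B with one mistyped, similar-sounding syllable.
system_b = ["ke", "kuang", "bian", "nong", "jia", "cun", "gong", "bian"]

print(pinyin_initials(system_a))
print(same_object_by_initials(system_a, system_b))
```

Matching on initials alone is deliberately loose, which is why the text applies it only to entries that the conventional matching has already failed to pair.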
Step 204: acquire the management region topological relation of each piece of measurement point ledger data in the data object group; check by data comparison whether it is consistent with the reference management region topological relation of the data object group; and, if it is inconsistent, remove the current measurement point ledger data from the data object group.
It should be noted that identical names may occur in lower-level regions, in which case checking and screening must be performed using the regional management topology of the measurement point object. Specifically, the management region topological relation data of the measurement point ledger data is obtained after the secondary matching is complete. For example, suppose regions A and B are at the same level and each contains a region named C. If the data object group for a measurement point in region C under A is found to contain a region C measurement point that belongs to region B, the measurement point data belonging to region B is removed from that data object group.
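The topology check can be sketched as below. This is an illustration under stated assumptions: each record is assumed to carry its management region chain as a tuple in a hypothetical `region_path` field, and the group's reference topology is assumed to be given as the same kind of tuple.

```python
from typing import Dict, List, Tuple


def check_topology(group: List[Dict], reference_path: Tuple[str, ...]):
    """Split a data object group into records whose region chain matches
    the group's reference chain and records that must be removed."""
    kept = [r for r in group if r["region_path"] == reference_path]
    removed = [r for r in group if r["region_path"] != reference_path]
    return kept, removed


# Regions A and B both contain a sub-region named C.
group_c_under_a = [
    {"id": "p1", "region_path": ("A", "C")},
    {"id": "p2", "region_path": ("B", "C")},  # same low-level name, wrong parent
    {"id": "p3", "region_path": ("A", "C")},
]
kept, removed = check_topology(group_c_under_a, ("A", "C"))
print([r["id"] for r in kept], [r["id"] for r in removed])
```

Comparing the whole chain rather than only the lowest-level name is what separates the two regions named C.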
Step 205: uniformly update the naming information of the measurement point ledger data within each data object group according to the keyword elements and the preset data naming template.
After the secondary matching, the naming information of the classified measurement point ledger data is uniformly corrected according to the preset data naming template and the keyword information in the data name keyword set, ensuring that the naming information of data belonging to the same measurement point object is consistent.
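A minimal sketch of the renaming step follows. The source does not give the format of the preset naming template, so the template string, the keyword field names (`region`, `equipment_type`, `parameter`) and the sample values are assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical naming template built from the three keyword categories.
NAMING_TEMPLATE = "{region}-{equipment_type}-{parameter}"


def unify_names(group: List[Dict], keywords: Dict[str, str],
                template: str = NAMING_TEMPLATE) -> List[Dict]:
    """Return copies of the group's records that all carry the same standard name."""
    standard_name = template.format(**keywords)
    return [{**record, "name": standard_name} for record in group]


group = [
    {"id": "A-1", "name": "10kV rural village public transformer"},
    {"id": "B-7", "name": "rural village pub. transformer (10kV)"},
]
keywords = {"region": "rural village",
            "equipment_type": "public transformer",
            "parameter": "10kV"}
for record in unify_names(group, keywords):
    print(record["id"], record["name"])
```

Because the name is rebuilt from the keyword set instead of copied from one source system, every record in a group ends up with an identical name.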
Step 206: count the remaining measurement point ledger data that has not been merged into any data object group, and merge all of it into an undefined data object group.
It should be noted that measurement point ledger data that is still unmatched after both rounds of matching is placed in the undefined data object group, and it is decided manually whether to delete or correct it.
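Collecting the leftovers can be sketched as follows. The function name `collect_undefined`, the record identifiers and the shape of the returned dictionary are illustrative assumptions; the manual review that follows is outside the sketch.

```python
from typing import Dict, List


def collect_undefined(all_record_ids: List[str],
                      groups: Dict[str, List[str]]) -> Dict:
    """Gather every record id that is in no data object group into one
    'undefined' group and report how many there are."""
    merged = set().union(*groups.values())
    remaining = [rid for rid in all_record_ids if rid not in merged]
    return {"group": "undefined", "count": len(remaining), "records": remaining}


all_ids = ["A-1", "B-7", "C-3", "C-9", "D-2"]
groups = {"village transformer": ["A-1", "B-7", "C-3"]}
print(collect_undefined(all_ids, groups))
```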
This embodiment combines exact keyword matching with fuzzy first-letter matching. Keyword extraction removes identification words in the data naming information that are unrelated to region information and equipment information; primary matching and association are performed by keyword comparison; and secondary fuzzy matching by first letters is then applied to the data that the primary matching failed to match. This reduces the probability that human input errors cause matching failures and create new data islands, and addresses the problem of data islands caused by irregular data naming. In addition, by introducing the management region topological relation check and the statistics mechanism for remaining measurement point ledger data, this embodiment further refines the data fusion method.
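The combination of exact and fuzzy matching summarized above can be sketched end to end. This is a simplified reading, not the patented implementation: records are assumed to arrive with a pre-extracted `keywords` tuple and a pre-computed `initials` string, and a record is treated as "not merged" when no other record shares its exact keywords. All names and sample values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def fuse(records: List[Dict]) -> Tuple[Dict, List[Dict]]:
    """Primary matching on exact keyword tuples, then secondary matching
    of the leftovers on first letters."""
    by_keywords = defaultdict(list)
    for rec in records:
        by_keywords[rec["keywords"]].append(rec)

    # Primary matching: records sharing identical keywords form a group.
    groups = {k: v for k, v in by_keywords.items() if len(v) > 1}
    unmerged = [v[0] for v in by_keywords.values() if len(v) == 1]

    # Secondary matching: compare first letters with those of each group.
    initials_to_group = {recs[0]["initials"]: k for k, recs in groups.items()}
    still_unmatched = []
    for rec in unmerged:
        key = initials_to_group.get(rec["initials"])
        if key is not None:
            groups[key].append(rec)
        else:
            still_unmatched.append(rec)
    return groups, still_unmatched


records = [
    {"id": "A-1", "keywords": ("village", "transformer", "10kV"), "initials": "kkbnjcgb"},
    {"id": "B-7", "keywords": ("village", "transformer", "10kV"), "initials": "kkbnjcgb"},
    {"id": "C-3", "keywords": ("village", "transfomer", "10kV"), "initials": "kkbnjcgb"},
    {"id": "C-9", "keywords": ("town", "line", "110kV"), "initials": "zxl"},
]
groups, still_unmatched = fuse(records)
print({k: [r["id"] for r in v] for k, v in groups.items()})
print([r["id"] for r in still_unmatched])
```

In the sample, record C-3 has a mistyped keyword and fails the exact comparison, but its first letters agree with the existing group, so the secondary pass recovers it.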
The foregoing is a detailed description of the second embodiment of the power grid ledger data fusion method based on keyword search; the following is a detailed description of an embodiment of the power grid ledger data fusion device based on keyword search.
Referring to fig. 3, an embodiment of the present application provides a power grid ledger data fusion apparatus based on keyword search, including:
the preprocessing module 301 is configured to acquire measurement point ledger data in each database platform, and perform keyword extraction processing on naming information of the measurement point ledger data to obtain a data name keyword set corresponding to each measurement point ledger data;
the primary association module 302 is configured to perform primary matching on each data name keyword set according to keyword elements in the data name keyword set, and associate measurement point ledger data with consistent keyword matching results to a data object group;
a secondary association module 303, configured to extract the first letter of the data name keyword set that is not merged, perform secondary matching between the first letter of the data name keyword set and the first letter of each data object group element, and add the data name keyword set to the data object group if the matching result between the first letter of the data name keyword set and the first letter of the data object group element is consistent;
and the data association processing module 304 is configured to uniformly update the naming information of the measurement point ledger data in the same data object group according to the keyword element and a preset data naming template.
More specifically, the device further comprises:
the checking module 305 is configured to obtain a management region topological relation of each measurement point ledger data in the data object group, check, through data comparison, consistency between the management region topological relation of each measurement point ledger data in the data object group and a reference management region topological relation of the data object group, and remove the current measurement point ledger data from the data object group if the management region topological relations are not consistent.
More specifically, the device further comprises:
the residual data counting module 306, configured to count the remaining measurement point ledger data that has not been merged into any data object group, and to merge all of the remaining measurement point ledger data into the undefined data object group.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The terms "first," "second," "third," "fourth," and the like in the description of the application and the above-described figures, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program code.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (6)

1. A power grid ledger data fusion method based on keyword search, characterized by comprising the following steps:
acquiring measurement point ledger data in each database platform, and performing keyword extraction on the naming information of the measurement point ledger data to obtain a data name keyword set corresponding to each piece of measurement point ledger data, wherein the data name keyword set specifically comprises: measurement point region information, measurement point equipment type information and measurement point equipment parameter information;
performing primary matching on the data name keyword sets according to the keyword elements in each data name keyword set, and associating measurement point ledger data whose keyword matching results are consistent into a data object group;
extracting the first letters of each data name keyword set that has not been merged, performing secondary matching between the first letters of the data name keyword set and the first letters of the elements of each data object group, and, if the matching result is consistent, adding the data name keyword set to the data object group;
acquiring the management region topological relation of each piece of measurement point ledger data in the data object group, checking by data comparison whether the management region topological relation of each piece of measurement point ledger data in the data object group is consistent with the reference management region topological relation of the data object group, and, if it is not consistent, removing the current measurement point ledger data from the data object group;
and uniformly updating the naming information of the measurement point ledger data within each data object group according to the keyword elements and a preset data naming template.
2. The method of claim 1, wherein after uniformly updating the naming information of the measurement point ledger data within each data object group, the method further comprises:
counting the remaining measurement point ledger data that has not been merged into any data object group, and merging all of the remaining measurement point ledger data into an undefined data object group.
3. The method according to claim 1, wherein the measurement point equipment types specifically include: transformer substation, power transmission line, distribution transformer equipment and user-side equipment.
4. A power grid ledger data fusion device based on keyword search, characterized by comprising:
a preprocessing module, configured to acquire measurement point ledger data in each database platform, and to perform keyword extraction on the naming information of the measurement point ledger data to obtain a data name keyword set corresponding to each piece of measurement point ledger data, wherein the data name keyword set specifically comprises: measurement point region information, measurement point equipment type information and measurement point equipment parameter information;
a primary association module, configured to perform primary matching on the data name keyword sets according to the keyword elements in each data name keyword set, and to associate measurement point ledger data whose keyword matching results are consistent into a data object group;
a secondary association module, configured to extract the first letters of each data name keyword set that has not been merged, to perform secondary matching between the first letters of the data name keyword set and the first letters of the elements of each data object group, and, if the matching result is consistent, to add the data name keyword set to the data object group;
a data association processing module, configured to uniformly update the naming information of the measurement point ledger data within each data object group according to the keyword elements and a preset data naming template;
further comprising:
a checking module, configured to acquire the management region topological relation of each piece of measurement point ledger data in the data object group, to check by data comparison whether the management region topological relation of each piece of measurement point ledger data in the data object group is consistent with the reference management region topological relation of the data object group, and, if it is not consistent, to remove the current measurement point ledger data from the data object group.
5. The apparatus of claim 4, further comprising:
a residual data counting module, configured to count the remaining measurement point ledger data that has not been merged into any data object group, and to merge all of the remaining measurement point ledger data into an undefined data object group.
6. The apparatus according to claim 4, wherein the measurement point equipment types include: transformer substation, power transmission line, distribution transformer equipment and user-side equipment.
CN201811640460.5A 2018-12-29 2018-12-29 Power grid standing book data fusion method and device based on keyword search Active CN109710647B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811640460.5A CN109710647B (en) 2018-12-29 2018-12-29 Power grid standing book data fusion method and device based on keyword search

Publications (2)

Publication Number Publication Date
CN109710647A CN109710647A (en) 2019-05-03
CN109710647B true CN109710647B (en) 2021-06-25

Family

ID=66260238

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811640460.5A Active CN109710647B (en) 2018-12-29 2018-12-29 Power grid standing book data fusion method and device based on keyword search

Country Status (1)

Country Link
CN (1) CN109710647B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant