CN107330125A - Method for integrating massive unstructured distribution network data based on knowledge graph technology - Google Patents

Method for integrating massive unstructured distribution network data based on knowledge graph technology

Info

Publication number
CN107330125A
Authority
CN
China
Prior art keywords
data
entity
illustrative plates
knowledge collection
local
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710593929.3A
Other languages
Chinese (zh)
Other versions
CN107330125B (en)
Inventor
曹敏
邹京希
唐立军
赵旭
周年荣
魏玲
沈鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electric Power Research Institute of Yunnan Power System Ltd
Original Assignee
Electric Power Research Institute of Yunnan Power System Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electric Power Research Institute of Yunnan Power System Ltd
Priority to CN201710593929.3A
Publication of CN107330125A
Application granted
Publication of CN107330125B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology

Abstract

The present invention discloses a method for integrating massive unstructured distribution network data based on knowledge graph technology. A data acquisition unit collects the unstructured distribution network data of each information system and performs quality analysis and data cleaning on it separately for each system. From the processed data, a local data index based on a local knowledge graph is built for each information system and sent to a data management center through a big data connector. The data management center then builds a global data index based on a global knowledge graph. By moving the collection, quality analysis, and cleaning of distributed multi-source heterogeneous data forward to the individual information systems, the invention reduces the data fusion computation, storage pressure, and data scheduling burden of the data management center. Integrating the data sources through the global data index based on the global knowledge graph facilitates data query and extraction and reduces the workload of the data management center.

Description

Method for integrating massive unstructured distribution network data based on knowledge graph technology
Technical field
The present invention relates to the technical field of data fusion and integration, and in particular to a method for integrating massive unstructured distribution network data based on knowledge graph technology.
Background
A power grid includes information systems such as the marketing system, the production system, the power dispatching data acquisition and monitoring system, and electric energy meters. To strengthen grid operation capability and to expand the capability and quality of power customer service, the mass data coming from distribution network equipment must be collected efficiently and quickly, combined with business system data from the marketing system, the production system, and so on, and then effectively identified and filtered, so that the final output is data that benefits power operation and improves customer service quality and service level.
The distribution network data collected from the information systems falls into two classes. One is structured data, such as numeric or symbolic data; the other is unstructured data, such as user speech, images, and text. The existing approach to integrating unstructured distribution network data is to set up a unified data center platform, copy the collected unstructured data to that platform using technologies such as data adapters, and then clean and integrate the data there, thereby meeting the need for frequent data exchange between departments.
However, on the one hand, this approach generally performs centralized data cleaning at the data center, so the cleaning workload at the data center is large and integration is slow, which cannot meet the integration requirements of massive unstructured data. On the other hand, the unstructured data of the individual information systems differs in business logic, data format, and storage. Once the data has been transferred to the data center platform, these differences not only hinder classified storage of the mass data but also complicate data extraction and query, greatly increasing the workload of the data center platform.
Summary of the invention
To solve the above technical problems, the present invention provides a method for integrating massive unstructured distribution network data based on knowledge graph technology.
According to an embodiment of the present invention, a method for integrating massive unstructured distribution network data based on knowledge graph technology is provided, including:
Collecting the unstructured distribution network data of each information system through a data acquisition unit, and performing quality analysis and data cleaning on the unstructured distribution network data of each information system separately;
Building, from the processed unstructured distribution network data of each information system, a local data index based on a local knowledge graph, the local data index based on the local knowledge graph including the local knowledge graph of each information system and a local data index table;
Sending the local data index based on the local knowledge graph to a data management center through a big data connector;
Building, by the data management center, a global data index based on a global knowledge graph, the global data index based on the global knowledge graph including the global knowledge graph and a global data index table.
Further, the step of building the local data index based on the local knowledge graph from the processed unstructured distribution network data of each information system includes:
Performing entity extraction on the processed unstructured distribution network data of each information system to obtain an entity library for that data, the entity library including the entity, class, and attribute information of the unstructured distribution network data of each information system;
Building the local knowledge graph according to the relations between the entities in the entity library;
Building the local data index table using the entity name of each entity in the entity library as the key, the local data index table including local index information corresponding to each entity in the entity library, the local index information including the attributes, the instances, the text to which the entity belongs, the data source name, and the database to which the entity belongs.
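The local data index table described above can be sketched as a dictionary keyed by entity name. This is a minimal illustration only: the record layout of `entity_library`, the field names, and the function `build_local_index` are assumptions made for the example and are not specified by the patent.

```python
from dataclasses import dataclass, field


@dataclass
class LocalIndexEntry:
    """One row of a local data index table (hypothetical layout)."""
    attributes: dict = field(default_factory=dict)
    instances: list = field(default_factory=list)
    texts: list = field(default_factory=list)
    data_source: str = ""  # name of the information system holding the entity
    database: str = ""     # database holding the entity's unstructured data


def build_local_index(entity_library, data_source):
    """Build {entity_name: LocalIndexEntry} from extracted entity records."""
    index = {}
    for record in entity_library:
        entry = index.setdefault(
            record["name"], LocalIndexEntry(data_source=data_source)
        )
        entry.attributes.update(record.get("attributes", {}))
        entry.instances.extend(record.get("instances", []))
        entry.texts.extend(record.get("texts", []))
        entry.database = record.get("database", entry.database)
    return index


# Illustrative entity library for one information system (made-up values).
entity_library = [
    {
        "name": "transformer_01",
        "attributes": {"voltage_level": "10kV"},
        "instances": ["inspection_2017"],
        "texts": ["report_a.txt"],
        "database": "db_production",
    },
]
local_index = build_local_index(entity_library, data_source="production_system")
```

Keying the table by entity name keeps a lookup to a single dictionary access, which matches the role the entity name plays as the key in the description.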
Further, the step of building, by the data management center, the global data index based on the global knowledge graph includes:
Performing conflict detection on the local knowledge graphs of the information systems, the conflict detection including entity name conflict detection, hypernym-hyponym relation conflict detection, single-valued attribute conflict detection, and multi-valued attribute conflict detection;
Eliminating the conflict if a conflict exists between the local knowledge graphs of the information systems;
Unifying the local index information of each entity in the local data index tables according to the entities, classes, attribute values, and hypernym-hyponym relations of the local knowledge graphs obtained while detecting and eliminating conflicts, and building the global knowledge graph;
Building the mapping relations between the global knowledge graph and the local knowledge graph of each information system;
Building the global data index table according to the mapping relations and the local data index tables, using the entity name of each entity in the entity library as the key, the global data index table including global index information corresponding to each entity in the entity library, the global index information including the membership relation, the conflicts triggered, the local index information, and the local knowledge graph to which the entity belongs.
Further, the step of eliminating the conflict if a conflict exists between the local knowledge graphs of the information systems includes:
Creating a priority for the local knowledge graph of each information system;
If an entity name conflict or a hypernym-hyponym relation conflict exists between the local knowledge graphs of the information systems, selecting the entity name or hypernym-hyponym relation of the highest-priority local knowledge graph as the entity name or hypernym-hyponym relation of the global knowledge graph, and modifying the entity name and hypernym-hyponym relation of the corresponding local knowledge graphs;
Traversing the single-valued attributes in each local knowledge graph; if a single-valued attribute is found to have multiple values, selecting the attribute value of the highest-priority local knowledge graph as the value of that attribute in the global knowledge graph, and modifying the attribute value of the corresponding local knowledge graphs;
If the values of a multi-valued attribute are found to be inconsistent across the local knowledge graphs, merging the attribute values of all local knowledge graphs to form the attribute value of the global knowledge graph, and modifying the attribute value of the corresponding local knowledge graphs.
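The two attribute rules above (take the highest-priority value for a single-valued attribute, merge all values for a multi-valued attribute) can be sketched as follows. The function names, the system names, and the convention that a smaller number means a higher priority are assumptions for this example; the patent does not fix how priorities are encoded.

```python
def resolve_single_valued(values_by_graph, priority):
    """Return the value held by the highest-priority local graph.

    Assumption: a smaller priority number means a higher priority.
    """
    best_graph = min(values_by_graph, key=lambda graph: priority[graph])
    return values_by_graph[best_graph]


def resolve_multi_valued(values_by_graph):
    """Merge the values from all local graphs into one set."""
    merged = set()
    for values in values_by_graph.values():
        merged |= set(values)
    return merged


# Hypothetical priorities and conflicting attribute values.
priority = {"marketing": 1, "production": 2, "scada": 3}
rated_voltage = {"production": "10kV", "scada": "10.5kV"}
contact_numbers = {"marketing": ["111"], "production": ["111", "222"]}

global_voltage = resolve_single_valued(rated_voltage, priority)
global_contacts = resolve_multi_valued(contact_numbers)
```

After resolution, the description also calls for writing the chosen value back into the local graphs; that write-back is omitted here for brevity.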
Further, the step of performing entity extraction on the processed unstructured distribution network data of each information system includes:
Judging whether the processed unstructured distribution network data of each information system is text data;
If the processed unstructured distribution network data of an information system is text data, extracting the entity, class, and attribute information using preset rules and a dictionary-based method;
If the processed unstructured distribution network data of an information system is not text data, converting it into text;
Segmenting the text into words, analyzing the syntactic structure of its sentences and the dependency relations between the words in each sentence using a syntactic parsing algorithm based on natural language processing, and then extracting the entity, class, and attribute information.
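For the text-data branch, extraction by preset rules and a dictionary can be sketched as below. The dictionary contents, the single regular-expression rule, and the function `extract` are hypothetical; the patent does not disclose its actual rules or dictionary, and the syntactic-parsing branch is not covered by this sketch.

```python
import re

# Hypothetical dictionary mapping known entity names to their classes.
ENTITY_DICTIONARY = {
    "ring main unit": "equipment",
    "circuit breaker": "equipment",
}

# Hypothetical rule: "<attribute name> is <value>".
ATTRIBUTE_RULE = re.compile(r"(voltage level|status)\s+is\s+(\S+)", re.IGNORECASE)


def extract(text):
    """Return (entities, attributes) found by dictionary lookup and one rule."""
    lowered = text.lower()
    entities = [
        (name, cls) for name, cls in ENTITY_DICTIONARY.items() if name in lowered
    ]
    attributes = {
        match.group(1).lower(): match.group(2).rstrip(".,;")
        for match in ATTRIBUTE_RULE.finditer(text)
    }
    return entities, attributes


entities, attributes = extract(
    "The circuit breaker voltage level is 10kV, status is closed."
)
```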
Further, the step of building the local knowledge graph according to the relations between the entities in the entity library includes:
Computing inner products over all subsequences of a given length in the character string sequences of the text-form unstructured distribution network data, to calculate the similarity between sentences;
Using the kernel of the character string sequences as the kernel of a support vector machine for statistical learning, to obtain the relations between the entities in the entity library, and building the local knowledge graph using the triple shown in the following formula:
$G_L = (E, R, S)$
where $G_L$ is the local knowledge graph; $E = \{e_1, e_2, \ldots, e_{|E|}\}$ is the set of entities in the entity library, containing $|E|$ distinct entities; $R = \{r_1, r_2, \ldots, r_{|R|}\}$ is the set of entity relations in the entity library, containing $|R|$ distinct entity relations; and $S$ is the set of triples in the local knowledge graph.
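A local knowledge graph of this form can be held as a set of triples from which the entity set and relation set are derived. The sketch below assumes each triple has the shape (head entity, relation, tail entity); that shape, the `Triple` type, and the sample relations are assumptions for illustration, since the source text does not spell out the structure of an element of S.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """Assumed triple shape: (head entity, relation, tail entity)."""
    head: str
    relation: str
    tail: str


def build_graph(triples):
    """Return (E, R, S): entity set, relation set, and triple set."""
    S = set(triples)
    E = {t.head for t in S} | {t.tail for t in S}
    R = {t.relation for t in S}
    return E, R, S


# Made-up relations between distribution network entities.
E, R, S = build_graph([
    Triple("transformer_01", "located_in", "district_A"),
    Triple("meter_17", "connected_to", "transformer_01"),
    Triple("meter_17", "located_in", "district_A"),
])
```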
Further, the entity name conflict detection method includes:
Calculating the similarity between an entity A of one local knowledge graph and an entity B of another local knowledge graph according to the following formula:
$Sim(A, B) = Dis(L_A, L_B) + Dis(S_A, S_B)$
where $Sim(A, B)$ is the similarity between entity A and entity B; $Dis(L_A, L_B)$ is the distance between the class $L_A$ of entity A and the class $L_B$ of entity B; and $Dis(S_A, S_B)$ is the distance between the attributes $S_A$ of entity A and the attributes $S_B$ of entity B;
If the similarity between entity A and entity B is greater than a threshold, judging whether the entity names of entity A and entity B are the same;
If entity A and entity B have the same entity name, the detection result is that an entity name conflict exists.
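The check above can be sketched once concrete distance functions are chosen. The patent does not define Dis, so this example assumes a 0/1 distance between classes and a Jaccard distance between attribute-name sets; both choices, the threshold value, and all function names are illustrative assumptions.

```python
def class_distance(class_a, class_b):
    """Assumed class distance: 0 if the classes match, 1 otherwise."""
    return 0.0 if class_a == class_b else 1.0


def attribute_distance(attrs_a, attrs_b):
    """Assumed attribute distance: Jaccard distance between attribute names."""
    a, b = set(attrs_a), set(attrs_b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def has_name_conflict(entity_a, entity_b, threshold):
    """Apply Sim(A, B) = Dis(L_A, L_B) + Dis(S_A, S_B) and the name test."""
    sim = class_distance(entity_a["class"], entity_b["class"]) + attribute_distance(
        entity_a["attributes"], entity_b["attributes"]
    )
    return sim > threshold and entity_a["name"] == entity_b["name"]


# Two entities from different local graphs sharing the name "T1".
a = {"name": "T1", "class": "equipment", "attributes": ["voltage_level", "location"]}
b = {"name": "T1", "class": "user", "attributes": ["tariff", "location"]}
conflict = has_name_conflict(a, b, threshold=1.0)
```

Because Sim is defined as a sum of distances, a large value indicates that the two entities differ, so a shared name at that point signals one name referring to two different things.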
Further, the hypernym-hyponym relation conflict detection method includes:
Extracting the hypernym-hyponym relation graph of an entity A in one local knowledge graph;
Finding, in the other local knowledge graphs, the set of entities that have a hypernym-hyponym relation with entity A, and extracting the hypernym-hyponym relation graph of each entity in that set;
Obtaining the merged hypernym-hyponym relation graph according to the following formula:
$G = G_A \cup G_{q_1} \cup G_{q_2} \cup \ldots \cup G_{q_n}$
where $G$ is the merged hypernym-hyponym relation graph; $G_A$ is the hypernym-hyponym relation graph of entity A; $G_{q_1}, G_{q_2}, \ldots, G_{q_n}$ are the hypernym-hyponym relation graphs of the entities in the set; and $n$ is the number of entities in the set;
Deleting every vertex with in-degree zero from the merged hypernym-hyponym relation graph, together with its outgoing edges, and repeating until no vertex with in-degree zero remains in the merged graph;
If all nodes in the merged hypernym-hyponym relation graph have been deleted, the detection result is that no hypernym-hyponym relation conflict exists; if at least one node remains in the merged graph, the detection result is that a hypernym-hyponym relation conflict exists.
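Repeatedly removing vertices with in-degree zero is the standard way to test a directed graph for cycles (Kahn's algorithm), so the detection step can be sketched as below. Representing each relation graph as a set of (hypernym, hyponym) edge pairs is an assumption of this example, as are the function names and sample entities.

```python
from collections import defaultdict, deque


def merge_graphs(*edge_sets):
    """Union several relation graphs, each given as a set of edge pairs."""
    merged = set()
    for edges in edge_sets:
        merged |= set(edges)
    return merged


def has_hypernym_conflict(edges):
    """Return True if vertices remain after removing all in-degree-zero ones."""
    out_edges = defaultdict(set)
    in_degree = defaultdict(int)
    vertices = set()
    for parent, child in edges:
        vertices.update((parent, child))
        if child not in out_edges[parent]:
            out_edges[parent].add(child)
            in_degree[child] += 1

    queue = deque(v for v in vertices if in_degree[v] == 0)
    removed = 0
    while queue:
        vertex = queue.popleft()
        removed += 1
        # Deleting the vertex also deletes its outgoing edges.
        for child in out_edges[vertex]:
            in_degree[child] -= 1
            if in_degree[child] == 0:
                queue.append(child)
    return removed < len(vertices)


# Graph of entity A from one system, and a contradicting graph from another.
g_a = {("equipment", "switch"), ("switch", "breaker")}
g_q1 = {("breaker", "equipment")}
conflict_found = has_hypernym_conflict(merge_graphs(g_a, g_q1))
```

Any vertex left over must lie on or behind a cycle, which means some entity would be both above and below another in the hierarchy.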
Further, the method also includes: updating the local data index based on the local knowledge graph and the global data index based on the global knowledge graph according to the unstructured distribution network data of new devices and/or new users.
Further, the step of updating the local data index based on the local knowledge graph and the global data index based on the global knowledge graph according to the unstructured distribution network data of new devices and/or new users includes:
Obtaining the unstructured distribution network data of the new devices and/or new users, and extracting the entity, class, and attribute information of that data;
Judging whether the entities and classes of the unstructured distribution network data of the new devices and/or new users match the entities and classes in some local knowledge graph;
If the judgment result is a match, fusing the entities of the unstructured distribution network data of the new devices and/or new users with that local knowledge graph, updating the corresponding entity attributes and the hypernym-hyponym relations between entities, and updating the local data index table and the global data index based on the global knowledge graph according to the fused local knowledge graph;
If the judgment result is no match, creating new entities and classes, and updating the local data index based on the local knowledge graph and the global data index based on the global knowledge graph according to the new entities and classes.
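The match-or-create branch of the update can be sketched as follows. The sketch stores a local graph as a dictionary keyed by (entity name, class name) and treats a match as an exact key match; the storage layout, the matching rule, and the function name are assumptions, and refreshing the index tables and hypernym-hyponym relations is left out.

```python
def update_local_graph(local_graph, new_record):
    """Fuse a new record into a local graph, or create a new entity.

    local_graph: {(entity_name, class_name): {attribute: value}}
    Returns "merged" when the entity and class already exist, else "created".
    """
    key = (new_record["name"], new_record["class"])
    if key in local_graph:
        # Match: fuse the new attribute values into the existing entity.
        local_graph[key].update(new_record["attributes"])
        return "merged"
    # No match: create a new entity (and, implicitly, its class).
    local_graph[key] = dict(new_record["attributes"])
    return "created"


# Made-up local graph and two incoming records.
local_graph = {("meter_17", "equipment"): {"status": "online"}}
first = update_local_graph(
    local_graph,
    {"name": "meter_17", "class": "equipment", "attributes": {"firmware": "2.1"}},
)
second = update_local_graph(
    local_graph,
    {"name": "user_9", "class": "user", "attributes": {"tariff": "residential"}},
)
```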
As can be seen from the above technical solution, the present invention provides a method for integrating massive unstructured distribution network data based on knowledge graph technology. Big data connectors and data acquisition units are deployed in each information system, such as the marketing system, the production system, the power dispatching data acquisition and monitoring system, and electric energy meters, so that the collection, quality analysis, and cleaning of distributed multi-source heterogeneous data are moved forward to the individual information systems. This reduces the data fusion computation, storage pressure, and data scheduling burden of the data management center. The data acquisition unit performs data sampling, quality analysis, and data cleaning on the unstructured distribution network data of each information system, such as user speech, pictures, and text. The processed data is used to build the local knowledge graph and local data index table of each information system, which are transmitted to the data management center through the big data connector. The data management center detects and eliminates conflicts between the local knowledge graphs and builds a global knowledge graph and a global data index table that apply to all the data, so that the data sources are integrated through the global knowledge graph and the global data index table. When newly added data is integrated, the global knowledge graph can be used to optimize the integration: the collected unstructured distribution network data of new devices and/or new users is used to update the local data index based on the local knowledge graph and the global data index based on the global knowledge graph. As more devices and data are integrated, the local knowledge graphs and the global knowledge graph are continuously updated, which facilitates subsequent distributed retrieval and query of the mass data, big data analysis, and so on.
Brief description of the drawings
Fig. 1 is a flow chart of building the distributed multi-source heterogeneous data index according to one embodiment of the invention;
Fig. 2 is a flow chart of a method for integrating massive unstructured distribution network data based on knowledge graph technology according to one embodiment of the invention;
Fig. 3 is a flow chart of the method for building the local data index based on the local knowledge graph according to one embodiment of the invention;
Fig. 4 is a schematic diagram of the local data index table according to one embodiment of the invention;
Fig. 5 is a schematic diagram of local data indexing based on the local knowledge graph according to one embodiment of the invention;
Fig. 6 is a flow chart of the method for building the global data index based on the global knowledge graph according to one embodiment of the invention;
Fig. 7 is a schematic diagram of the global data index table according to one embodiment of the invention;
Fig. 8 is a flow chart of the method for eliminating conflicts between the local knowledge graphs according to one embodiment of the invention;
Fig. 9 is a flow chart of the method for extracting entities from unstructured distribution network data according to one embodiment of the invention;
Fig. 10 is a flow chart of a method for integrating massive unstructured distribution network data based on knowledge graph technology according to another embodiment of the invention;
Fig. 11 is a flow chart of the method for updating the knowledge graphs according to another embodiment of the invention.
Detailed description
To help those skilled in the art better understand the technical solution of the present application, the technical solution in the embodiments of the present invention is described clearly and completely below with reference to the accompanying drawings.
Fig. 1 is a flow chart of building the distributed multi-source heterogeneous data index according to one embodiment of the invention. It involves multiple information systems, such as smart meters, the SCADA (Supervisory Control And Data Acquisition) system, the marketing system, and the production system. Each information system is equipped with a data acquisition unit and a big data connector. The data acquisition unit collects the unstructured distribution network data of its information system and performs quality analysis and data cleaning on it, finding and correcting identifiable errors in the data, which includes checking data consistency and handling invalid values and missing values. For example, the data acquisition units collect and process the meter data of smart meters; the telemetry, remote control, and remote regulation data of the SCADA system; the user information data of the marketing system; and the equipment information data of the production system. The big data connector transmits the local data index based on the local knowledge graph to the data management center.
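The cleaning performed by a data acquisition unit can be illustrated with a small sketch that handles invalid values, missing values, and duplicate records. It is a simplification: records that fail a check are dropped rather than corrected, and the field names, the list of invalid markers, and the function `clean_records` are assumptions for the example.

```python
def clean_records(records, required_fields, invalid_values=("", "N/A", None)):
    """Drop records with invalid or missing required fields, and duplicates."""
    cleaned = []
    seen = set()
    for record in records:
        # Treat known invalid markers as missing values.
        normalized = {
            key: (None if value in invalid_values else value)
            for key, value in record.items()
        }
        if any(normalized.get(name) is None for name in required_fields):
            continue
        # Consistency check: keep one record per combination of required fields.
        identity = tuple(normalized[name] for name in required_fields)
        if identity in seen:
            continue
        seen.add(identity)
        cleaned.append(normalized)
    return cleaned


# Made-up meter records containing a duplicate, an invalid value, and a gap.
raw = [
    {"meter_id": "M1", "reading": "12.5"},
    {"meter_id": "M1", "reading": "12.5"},
    {"meter_id": "M2", "reading": "N/A"},
    {"meter_id": "", "reading": "3.0"},
]
cleaned = clean_records(raw, required_fields=["meter_id", "reading"])
```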
In the present invention, the unstructured distribution network data of the information systems has a distributed, multi-source, heterogeneous architecture. Because the collection, quality analysis, and cleaning of the distributed multi-source heterogeneous data are moved forward to the individual information systems, the data management center does not need to perform these operations, which reduces its data fusion computation, storage pressure, and data scheduling burden.
Fig. 2 shows a method for integrating massive unstructured distribution network data based on knowledge graph technology according to one embodiment of the invention, including:
Step S10: collect the unstructured distribution network data of each information system through a data acquisition unit, and perform quality analysis and data cleaning on the unstructured distribution network data of each information system separately.
In the present invention, the unstructured distribution network data comes from different information systems, and its data structures and types vary, for example user voice data, images, and/or text data. The unstructured distribution network data of the information systems therefore has a distributed, multi-source, heterogeneous architecture. Because the collection, quality analysis, and cleaning of the distributed multi-source heterogeneous data are moved forward to the individual information systems, the data management center does not need to perform these operations, which reduces its data fusion computation, storage pressure, and data scheduling burden.
Step S20: from the processed unstructured distribution network data of each information system, build a local data index based on a local knowledge graph. The local data index based on the local knowledge graph includes the local knowledge graph of each information system and a local data index table.
To eliminate the differences in business logic, data format, and storage between the data of the information systems, the unstructured distribution network data of each information system needs to be abstracted into knowledge such as entities, attributes, and relations between entities. The local knowledge graph and the local data index table are built from this knowledge, which yields the local data index based on the local knowledge graph.
Step S30: send the local data index based on the local knowledge graph to the data management center through the big data connector.
The big data connector may be an Oracle big data connector or another standard database big data connector.
Step S40: build, by the data management center, a global data index based on a global knowledge graph. The global data index based on the global knowledge graph includes the global knowledge graph and a global data index table.
As shown in Fig. 3, step S20 includes:
S201: perform entity extraction on the processed unstructured distribution network data of each information system to obtain an entity library for that data. The entity library includes the entity, class, and attribute information of the unstructured distribution network data of each information system.
S202: build the local knowledge graph according to the hypernym-hyponym relations of the entities in the entity library.
The local knowledge graph built here is not a general-purpose knowledge graph but a specialized knowledge graph for the power distribution network. The class refers to the category of an entity, such as user entity or equipment entity; the entity refers to an entity name under a class, such as a user name, a device name, or a manufacturer name; the attribute refers to the information and data collected for an entity.
Device names mainly include overhead lines, cables, poles and towers, distribution transformers, disconnectors, circuit breakers, reclosers, sectionalizers, pole-mounted load switches, ring main units, voltage regulators, and reactive power compensation capacitors, as well as auxiliary facilities such as the feeder terminal unit (Feeder Terminal Unit, FTU), the data acquisition and monitoring terminal unit (Distribution Terminal Unit, DTU), the distribution transformer monitoring terminal unit (Transformer Terminal Unit, TTU), and the remote terminal unit (Remote Terminal Unit, RTU).
The archive information, outage information, electricity price information, and electricity charge information extracted from the information systems, together with the user information returned by the mobile phone app, serve as attributes of user entities. Equipment archives, device type, voltage level, transformer district, location information, GIS information, electric energy meter data, four-branch electricity consumption status, and status information serve as attributes of equipment entities.
S203: build the local data index table using the entity name of each entity in the entity library as the key. The local data index table includes local index information corresponding to each entity in the entity library, and the local index information includes the attributes, the instances, the text to which the entity belongs, the data source name, and the database to which the entity belongs. The data source name is the name of the information system where the entity resides, and the database is the one holding the unstructured distribution network data corresponding to the entity. A database may contain multiple data blocks in which the data is stored.
Fig. 4 is a schematic diagram of the local data index table according to one embodiment of the invention. The first column of the table holds the entity name of each entity in the entity library of an information system. With the entity name as the key, the entities in the entity library are listed and distinguished from one another. Each row of the table gives the attributes, instances, text, data source name, and database corresponding to the entity of that row.
Fig. 5 is a schematic diagram of local data indexing based on the local knowledge graph according to one embodiment of the invention, illustrated with entity name 1. When the data of entity name 1 under text 2 needs to be indexed, the local knowledge graph and local data index table of the information system show that the corresponding database of entity name 1 under text 2 is database 1. The lookup then continues inside database 1, where data block 1, data block 2, and data block n are the corresponding target data blocks, and the required unstructured data is retrieved. When the data of entity name 1 under instance 1 needs to be indexed, the local knowledge graph and local data index table show that the corresponding database of entity name 1 under instance 1 is database 2, which is dedicated to storing the data of entity name 1 under instance 1. It follows that local data indexing based on the local knowledge graph allows the target data a user needs to be located conveniently, quickly, and with high accuracy.
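The two lookups walked through for Fig. 5 can be sketched with two dictionaries: one resolving an entity name and a text or instance to a database, and one resolving that database to its data blocks. The dictionary layouts and the function `locate` are hypothetical; only the names entity 1, text 2, instance 1, databases 1 and 2, and data blocks 1, 2, and n come from the figure description.

```python
# Hypothetical local index: entity name -> {(kind, item): database}.
local_index = {
    "entity_1": {
        ("text", "text_2"): "database_1",
        ("instance", "instance_1"): "database_2",
    },
}

# Hypothetical block catalogue: (database, entity name, item) -> data blocks.
data_blocks = {
    ("database_1", "entity_1", "text_2"): ["block_1", "block_2", "block_n"],
    ("database_2", "entity_1", "instance_1"): ["block_1"],
}


def locate(entity_name, kind, item):
    """Resolve an entity's text or instance to its database and data blocks."""
    database = local_index[entity_name][(kind, item)]
    blocks = data_blocks.get((database, entity_name, item), [])
    return database, blocks


text_hit = locate("entity_1", "text", "text_2")
instance_hit = locate("entity_1", "instance", "instance_1")
```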
The local knowledge graphs abstracted from the unstructured distribution network data of the individual information systems and data sources are independent of one another. The result is a set of "information islands" with many kinds of systems and scattered information, which are difficult to bring together for retrieval and analysis. A unified intermediary is therefore needed to enable data sharing and integration across the application systems. Specifically, as shown in Fig. 6, step S40 includes:
S401: perform conflict detection on the local knowledge graphs of the information systems. The conflict detection includes entity name conflict detection, hypernym-hyponym relation conflict detection, single-valued attribute conflict detection, and multi-valued attribute conflict detection.
For entities extracted from different data sources such as the marketing system, the production system, the SCADA system, and smart meters, it is unavoidable that different names refer to the same thing or that the same name refers to different entities. When the data are integrated, some conflicts inevitably exist among the local knowledge graphs. Conflict detection must therefore be performed on each local knowledge graph so that conflicts can be eliminated in a targeted way, equivalent entities identified and merged, and redundant and contradictory knowledge removed, yielding an accurate global knowledge graph.
S402: if conflicts exist among the local knowledge graphs of the information systems, eliminate the conflicts.
Once the conflicts among the local knowledge graphs of the information systems have been eliminated, an accurate global knowledge graph can be generated. This allows the unstructured distribution network data of the information systems to be integrated more effectively and makes it easier for the data control center to manage, query, and index the integrated data.
S403: based on the entities, classes, attribute values, and hypernym-hyponym relations of the local knowledge graphs obtained during conflict detection and elimination, unify the local index information of each entity in the local data index tables and build the global knowledge graph.
S404: build the mapping relations between the global knowledge graph and the local knowledge graph of each information system.
That is, through conflict detection and elimination among the local knowledge graphs, the index of each entity across all local knowledge graphs is unified. Then, at global scope, the index of each local knowledge graph within the global knowledge graph is built and data mapping relations across local knowledge graphs are established. On the basis of the local data index tables, information such as the local knowledge graph to which it belongs and the conflicts it triggered is added for each entity extracted from the data sources, producing a data index that spans the local knowledge graphs and thereby achieving cross-system, cross-database data integration.
S405: according to the mapping relations and the local data index tables, build the global data index table with the entity name of each entity in the entity library as the key. The global data index table includes global index information corresponding to each entity in the entity library; the global index information includes the belonging relation, the conflicts triggered, the local index information, and the local knowledge graph to which the entity belongs. Fig. 7 is a schematic diagram of the global data index table.
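Step S405 can be pictured as folding the per-system index tables into one table keyed by entity name. The sketch below is only an assumption about how such a table might be laid out: the function name `build_global_index`, the keys `local_graphs`, `conflicts`, and `local_info`, and the sample data are hypothetical, and the belonging relation is omitted for brevity.

```python
def build_global_index(local_indexes, conflicts=None):
    """Merge local index tables into a global table keyed by entity name.

    local_indexes: {graph_id: {entity_name: local_index_info}}
    conflicts:     {entity_name: [conflict descriptions recorded in S401/S402]}
    """
    conflicts = conflicts or {}
    global_index = {}
    for graph_id, table in local_indexes.items():
        for name, info in table.items():
            entry = global_index.setdefault(
                name, {"local_graphs": [], "conflicts": [], "local_info": {}})
            entry["local_graphs"].append(graph_id)  # owning local knowledge graph
            entry["local_info"][graph_id] = info    # untouched local index info
    for name, items in conflicts.items():
        if name in global_index:
            global_index[name]["conflicts"].extend(items)
    return global_index


local_indexes = {
    "marketing": {"meter_1": {"database": "db_a"}},
    "production": {"meter_1": {"database": "db_b"},
                   "transformer_7": {"database": "db_b"}},
}
global_index = build_global_index(
    local_indexes, {"meter_1": ["single-valued attribute"]})
```

Keeping the local index information nested under the owning graph's identifier means a global lookup can still be routed back to the right system and database.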
As shown in Fig. 8, step S402 includes:
S4021: assign a priority to the local knowledge graph of each information system.
S4022: if an entity name conflict or a hypernym-hyponym relation conflict exists among the local knowledge graphs of the information systems, select the entity name or hypernym-hyponym relation of the highest-priority local knowledge graph as the entity name or hypernym-hyponym relation of the global knowledge graph, and modify the entity names and hypernym-hyponym relations of the corresponding local knowledge graphs.
When an entity name conflict or hypernym-hyponym relation conflict is detected, the entity name or relation of the highest-priority local knowledge graph is selected as the entity name or relation of the global knowledge graph and incorporated into the global knowledge graph, and the entity names and relations of the corresponding local knowledge graphs are modified accordingly, making entity names and hypernym-hyponym relations globally consistent. Whenever local knowledge graphs conflict, the entity names and hypernym-hyponym relations of the global knowledge graph prevail.
S4023: traverse the single-valued attributes in each local knowledge graph; if a single-valued attribute is found to have multiple values, select the attribute value of the highest-priority local knowledge graph as the value of that attribute in the global knowledge graph, and modify the attribute value of the corresponding local knowledge graphs.
When a single-valued attribute is found to have multiple values, the value from the highest-priority local knowledge graph is selected as the value of that attribute in the global knowledge graph, the attribute is incorporated into the global knowledge graph, and the attribute value of the corresponding local knowledge graphs is modified, making single-valued attributes globally consistent. Whenever local knowledge graphs conflict, the attribute value of the global knowledge graph prevails.
S4024: if the values of a multi-valued attribute are found to be inconsistent across the local knowledge graphs, merge the attribute values of all local knowledge graphs to form the attribute value of the global knowledge graph, and modify the attribute value of the corresponding local knowledge graphs.
For a multi-valued attribute, if the attribute values are found to be inconsistent across local knowledge graphs, the values from all local knowledge graphs are merged to form the attribute of the global knowledge graph, and the attribute value of the corresponding local knowledge graphs is modified, making multi-valued attributes globally consistent. Whenever local knowledge graphs conflict, the attribute value of the global knowledge graph prevails.
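The two attribute rules in S4023 and S4024 reduce to "highest priority wins" and "take the union". The sketch below illustrates them under simple assumptions: each local knowledge graph is a plain dictionary of attributes, priorities are integers where a larger number means higher priority, and all function names and sample values are hypothetical.

```python
def resolve_single_valued(values_by_graph, priority):
    """Pick the value held by the highest-priority local graph.

    values_by_graph: {graph_id: value}; priority: {graph_id: int}, larger wins.
    """
    winner = max(values_by_graph, key=lambda g: priority[g])
    return values_by_graph[winner]


def resolve_multi_valued(values_by_graph):
    """Merge the value collections of all local graphs into one global set."""
    merged = set()
    for values in values_by_graph.values():
        merged |= set(values)
    return merged


def apply_resolution(local_graphs, attr, global_value):
    """Write the agreed global value back to every local graph holding attr."""
    for graph in local_graphs.values():
        if attr in graph:
            graph[attr] = global_value


priority = {"marketing": 1, "production": 2}
local_graphs = {
    "marketing": {"rated_voltage": "10kV"},
    "production": {"rated_voltage": "10.5kV"},
}
chosen = resolve_single_valued(
    {g: attrs["rated_voltage"] for g, attrs in local_graphs.items()}, priority)
apply_resolution(local_graphs, "rated_voltage", chosen)
```

Writing the chosen value back to the local graphs is what keeps later lookups consistent regardless of which system answers them.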
As shown in Fig. 9, step S201 includes:
S2011: determine whether the processed unstructured distribution network data of each information system is text data.
Unstructured distribution network data may take different forms, such as user speech, images, and/or text, and the entity extraction method differs by data type.
S2012: if the processed unstructured distribution network data of each information system is text data, extract entities, classes, and attribute information according to preset rules and dictionary methods.
For text data with relatively fixed formats, such as equipment files, operation manuals, and standards in the production system, entities, classes, and attribute information are extracted using rule-based and dictionary-based methods. Power grid experts are invited to formulate entity extraction rules suited to the power grid industry, and dictionary methods are used to extract entities such as device names, device types, person names, place names, institutional terms, and specific times from the text, together with their classes and attribute information.
S2013: if the processed unstructured distribution network data of each information system is not text data, convert the processed unstructured distribution network data of each information system into text.
S2014: segment the text into words, analyze the syntactic structure of the text and the dependency relations between words in each sentence using a syntactic parsing algorithm based on natural language processing, and then extract entities, classes, and attribute information.
When the unstructured distribution network data is user voice data, it is converted into text using a speech conversion technique based on hidden Markov models (HMM); when it is an image, the characters in the image are converted into text using an image recognition technique based on support vector machines. The text is then segmented using a natural language word segmentation technique based on string matching, after which the entities, classes, and attributes are extracted. That is, the text is first segmented, a natural language processing parsing algorithm analyzes the syntactic structure of each sentence and the dependency relations between its words, and the entities, classes, and attributes are then identified.
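The text names word segmentation "based on string matching" without fixing a particular algorithm. One common member of that family is forward maximum matching, sketched below as an assumption rather than as the method the patent uses; the function name, the dictionary contents, and the sample string are hypothetical.

```python
def forward_max_match(text, dictionary, max_len=None):
    """Segment text by greedily matching the longest dictionary word first."""
    if max_len is None:
        max_len = max((len(word) for word in dictionary), default=1)
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


domain_words = {"trans", "transformer", "fault"}
tokens = forward_max_match("transformerfault", domain_words)
```

Because the longest match is tried first, a domain dictionary containing full equipment names keeps them intact instead of splitting them into shorter words.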
Once the extraction of entities, attributes, and so on is complete, the entity library is obtained. On this basis, an entity relation extraction technique using a support vector machine model with a string sequence kernel identifies the relation between two entities and establishes the connections between entities. Step S202 includes:
Take the inner product over all subsequences of a given length in the character string sequences of the textualized unstructured distribution network data to compute the similarity between sentences;
Use the string sequence kernel as the kernel of the support vector machine for statistical learning to obtain the relations among the entities in the entity library, and build the local knowledge graph using the triple shown in the following formula:
$G_L = (E, R, S)$
where $G_L$ is the local knowledge graph; $E = \{e_1, e_2, \ldots, e_{|E|}\}$ is the set of entities in the entity library, containing $|E|$ distinct entities; $R = \{r_1, r_2, \ldots, r_{|R|}\}$ is the set of entity relations in the entity library, containing $|R|$ distinct entity relations; and $S$ denotes the set of triples in the local knowledge graph.
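To make the "inner product over subsequences of a given length" concrete, the sketch below uses a deliberately simplified stand-in: a spectrum kernel that counts only contiguous substrings of length k. A full string subsequence kernel also admits gapped subsequences, usually with a decay weight, and is not shown here. The function names and the default k are assumptions.

```python
import math
from collections import Counter


def spectrum(s, k):
    """Count every contiguous substring of length k in s."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))


def spectrum_kernel(s, t, k=2):
    """Inner product of the two substring-count vectors."""
    a, b = spectrum(s, k), spectrum(t, k)
    return sum(count * b[sub] for sub, count in a.items())


def normalized_kernel(s, t, k=2):
    """Cosine-style normalization so that identical strings score 1.0."""
    denom = math.sqrt(spectrum_kernel(s, s, k) * spectrum_kernel(t, t, k))
    return spectrum_kernel(s, t, k) / denom if denom else 0.0
```

A matrix of such pairwise values over the training sentences could be handed to any SVM implementation that accepts a precomputed kernel.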
The basic forms of a triple are chiefly (entity 1, relation, entity 2) and (concept, attribute, attribute value). Through the set of triples, a mapping can be established between any entity and the raw data in which it resides; this mapping is realized by the local data index table. For each entity extracted from a data source, an index table is built with the entity name as the key. The index table includes a series of data-related information such as attributes, data source name, belonging relation, source database, source table, source text, instance, and the local knowledge graph to which the entity belongs. With the local data index table, data can be located rapidly within a single distribution network information system and then queried and extracted.
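The structure $G_L = (E, R, S)$ can be represented directly in code. The following is a small sketch that derives E and R from a set of (entity 1, relation, entity 2) triples; the `Triple` type, the function name, and the sample triples are hypothetical, and attribute triples are left out to keep the example short.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    head: str      # entity 1
    relation: str
    tail: str      # entity 2


def build_local_graph(triples):
    """Return (E, R, S): entity set, relation set, and triple set."""
    S = set(triples)
    E = {t.head for t in S} | {t.tail for t in S}
    R = {t.relation for t in S}
    return E, R, S


E, R, S = build_local_graph([
    Triple("feeder_12", "supplies", "transformer_7"),
    Triple("transformer_7", "located_in", "substation_3"),
    Triple("feeder_12", "supplies", "transformer_7"),  # duplicate collapses
])
```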
In step S401, the entity name conflict detection method includes:
Compute the similarity between an entity A of a given local knowledge graph and an entity B of another local knowledge graph according to the following formula;
$\mathrm{Sim}(A, B) = \mathrm{Dis}(L_A, L_B) + \mathrm{Dis}(S_A, S_B)$
where $\mathrm{Sim}(A, B)$ is the similarity of entity A and entity B; $\mathrm{Dis}(L_A, L_B)$ is the distance between the class $L_A$ of entity A and the class $L_B$ of entity B; and $\mathrm{Dis}(S_A, S_B)$ is the distance between the attributes $S_A$ of entity A and the attributes $S_B$ of entity B;
If the similarity of entity A and entity B is greater than a threshold, determine whether the entity names of entity A and entity B are identical;
If the entity name of entity A is identical to that of entity B, the detection result is that an entity name conflict exists.
An index, namely the local data index table, is built over the entities, entity classes, and attributes of each local knowledge graph. Then, for an entity A in one local knowledge graph, a candidate entity B is looked up in the indexes of the other local knowledge graphs and the similarity $\mathrm{Sim}(A, B)$ is computed. If the class $L_A$ and attributes $S_A$ of the entity in the current local knowledge graph are highly similar to the class $L_B$ and attributes $S_B$ of some entity B in another local knowledge graph but the entity names differ, an entity name conflict is detected.
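The check can be sketched as follows. Several choices here are assumptions, not taken from the text: the two Dis terms are computed as Jaccard overlap of class sets and attribute-name sets (so a larger value means more alike), the threshold of 1.5 is arbitrary, and the conflict rule follows the reading in the paragraph above, where highly similar entities carry different names.

```python
def jaccard(a, b):
    """Overlap of two sets in [0, 1]; two empty sets count as identical."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0


def sim(entity_a, entity_b):
    """Sim(A, B) = Dis(L_A, L_B) + Dis(S_A, S_B), with Dis taken as Jaccard."""
    return (jaccard(entity_a["classes"], entity_b["classes"])
            + jaccard(entity_a["attributes"], entity_b["attributes"]))


def name_conflict(entity_a, entity_b, threshold=1.5):
    """Flag entities that look alike by class and attributes but differ in name."""
    return (sim(entity_a, entity_b) > threshold
            and entity_a["name"] != entity_b["name"])


a = {"name": "distribution transformer", "classes": {"equipment"},
     "attributes": {"capacity", "voltage", "location"}}
b = {"name": "dist. transformer", "classes": {"equipment"},
     "attributes": {"capacity", "voltage", "location"}}
c = {"name": "meter", "classes": {"equipment"}, "attributes": {"reading"}}
```

With these inputs `name_conflict(a, b)` flags a conflict, while `name_conflict(a, c)` does not because the attribute sets share nothing.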
In step S401, the hypernym-hyponym relation conflict detection method includes:
Extract the hypernym-hyponym relation graph of an entity A in a given local knowledge graph;
Find, in the other local knowledge graphs, the set of entities that have a hypernym-hyponym relation with entity A, and extract the hypernym-hyponym relation graph of each entity in that set;
Obtain the merged hypernym-hyponym relation graph according to the following formula;
$G = G_A \cup G_{q_1} \cup G_{q_2} \cup \cdots \cup G_{q_n}$
where $G$ is the merged hypernym-hyponym relation graph; $G_A$ is the hypernym-hyponym relation graph of entity A; $G_{q_1}, G_{q_2}, \ldots, G_{q_n}$ are the hypernym-hyponym relation graphs of the respective entities in the hypernym-hyponym entity set; and $n$ is the number of entities in that set;
Delete every vertex with in-degree zero from the merged hypernym-hyponym relation graph together with its outgoing edges, and repeat until no further vertex can be output from the merged graph;
If all nodes in the merged hypernym-hyponym relation graph have been deleted, the detection result is that no hypernym-hyponym relation conflict exists; if at least one node remains in the merged graph, the detection result is that a hypernym-hyponym relation conflict exists.
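Repeatedly removing zero in-degree vertices is the standard topological-sort test for cycles, so the steps above can be sketched with Kahn's algorithm. The representation is an assumption: each relation graph is a set of (hypernym, hyponym) edges, and the function names and sample edges are hypothetical.

```python
from collections import deque


def merge_graphs(*graphs):
    """G = G_A ∪ G_q1 ∪ ... ∪ G_qn, each graph a set of (hypernym, hyponym) edges."""
    merged = set()
    for graph in graphs:
        merged |= set(graph)
    return merged


def has_hierarchy_conflict(edges):
    """Return True if vertices remain after removing all zero in-degree ones."""
    nodes = {node for edge in edges for node in edge}
    indegree = {node: 0 for node in nodes}
    children = {node: [] for node in nodes}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
    queue = deque(node for node in nodes if indegree[node] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children[node]:  # drop the outgoing edges of node
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    return removed < len(nodes)  # leftover vertices lie on or behind a cycle


g_a = {("equipment", "transformer")}
g_q1 = {("transformer", "distribution transformer")}
g_q2 = {("transformer", "equipment")}  # contradicts g_a
```

Merging `g_a` with `g_q1` yields a clean hierarchy, while adding `g_q2` makes "equipment" and "transformer" each the hypernym of the other, which the check reports as a conflict.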
As shown in Fig. 10, in the mass unstructured distribution network data integration method based on knowledge graph technology according to another embodiment of the invention, the following is further included after step S203:
S50: according to the unstructured distribution network data of new equipment and/or new users, update the data local index based on the local knowledge graph and the data global index based on the global knowledge graph.
The data control center is responsible for maintaining and updating the global knowledge graph, the local knowledge graphs, the global data index table, and the local data index tables, and for managing the exchange of data. Updating the data local index based on the local knowledge graph and the data global index based on the global knowledge graph according to the unstructured distribution network data of new equipment and/or new users keeps the data integrated by the data control center timely and accurate. When new distribution network equipment and information systems are added, the integration can adapt to the changing state of the distribution network and data can be managed centrally. When the data related to a given entity needs to be queried, the global data index table yields the data association information and the source database, thereby integrating the data across the information systems.
Specifically, as shown in Fig. 11, step S50 includes:
S501: obtain the unstructured distribution network data of the new equipment and/or new users, and extract the entities, classes, and attribute information of that data;
S502: determine whether the entities and classes of the unstructured distribution network data of the new equipment and/or new users match the entities and classes in one of the local knowledge graphs;
S503: if they match, fuse the entities of the unstructured distribution network data of the new equipment and/or new users with that local knowledge graph, update the corresponding entity attributes and the hypernym-hyponym relations between entities, and update the local data index table and the data global index based on the global knowledge graph according to the fused local knowledge graph;
S504: if they do not match, create new entities and classes, and update the data local index based on the local knowledge graph and the data global index based on the global knowledge graph according to the new entities and classes.
The entities, classes, and attributes in the global knowledge graph come from multiple local knowledge graphs, so they are general and are highly effective at recognizing distribution network data. Using the global and local knowledge graphs, the entities and attributes of a newly added data source can be extracted quickly, which improves the speed and accuracy of integrating new data sources and optimizes data integration. For entities that the knowledge graphs cannot recognize, the corresponding entities, classes, and attributes are extracted and matched against the classes and entities in the existing knowledge graphs. If the degree of matching is high, they are fused and the entity attributes and hypernym-hyponym relations between entities are updated; otherwise new classes are created. The data local index based on the local knowledge graph and the data global index based on the global knowledge graph are then updated, thereby optimizing the knowledge graphs.
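Steps S502 to S504 amount to a match-then-merge-or-create routine. The sketch below is a simplified assumption of that flow: a match is taken to mean "same entity name and at least one shared class", the graph is a list of entity dictionaries, the index maps entity names to data sources, and all names are hypothetical. The source does not specify the matching criterion beyond a high degree of matching.

```python
def integrate_entity(graph, index, entity):
    """Fuse entity into graph if it matches a known entity, else create it.

    graph:  list of {"name": str, "classes": set, "attributes": dict}
    index:  {entity_name: [data sources]}
    entity: same shape as a graph item plus a "source" key
    """
    for known in graph:
        same_name = known["name"] == entity["name"]
        if same_name and known["classes"] & entity["classes"]:
            # S503: fuse classes and attributes into the existing entity.
            known["classes"] |= entity["classes"]
            known["attributes"].update(entity["attributes"])
            outcome = "merged"
            break
    else:
        # S504: nothing matched, so register a new entity and its classes.
        graph.append({"name": entity["name"],
                      "classes": set(entity["classes"]),
                      "attributes": dict(entity["attributes"])})
        outcome = "created"
    # Either way, the index must learn where the new data lives.
    index.setdefault(entity["name"], []).append(entity["source"])
    return outcome


graph = [{"name": "meter_1", "classes": {"smart meter"},
          "attributes": {"phase": "single"}}]
index = {"meter_1": ["marketing"]}
first = integrate_entity(graph, index, {
    "name": "meter_1", "classes": {"smart meter"},
    "attributes": {"firmware": "2.1"}, "source": "new_substation"})
second = integrate_entity(graph, index, {
    "name": "charger_9", "classes": {"ev charger"},
    "attributes": {}, "source": "new_substation"})
```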
As can be seen from the above technical solution, the invention provides a mass unstructured distribution network data integration method based on knowledge graph technology. Big data connectors and data acquisition units are deployed in each information system, such as the marketing system, the production system, the power dispatching data acquisition and monitoring system, and electric energy meters. The collection, quality analysis, and data cleaning of the distributed multi-source heterogeneous data are thereby moved forward into each information system, which reduces the data fusion computation, storage pressure, and data scheduling burden of the data control center. The data acquisition unit performs data sampling, quality analysis, and data cleaning on the unstructured distribution network data of each information system, such as user speech, pictures, and text; the processed unstructured distribution network data is used to build the local knowledge graph and local data index table of each information system, which are transferred to the data control center through the big data connector. The data control center detects and eliminates the conflicts among the local knowledge graphs and builds a global knowledge graph and a global data index table applicable to all the data, so that the data sources are integrated using the global knowledge graph and the global data index table. When new data are integrated, the global knowledge graph can be used to optimize the integration, and the collected unstructured distribution network data of new equipment and/or new users is used to update the data local index based on the local knowledge graph and the data global index based on the global knowledge graph. As the integrated equipment and data grow, the constructed local knowledge graphs and global knowledge graph are continually updated, which facilitates subsequent work such as querying and retrieving massive distribution network data and big data analysis.
Other embodiments of the invention will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the invention indicated by the claims.
It should be understood that the invention is not limited to the precise structure described above and shown in the drawings, and that various modifications and changes can be made without departing from its scope. The scope of the invention is limited only by the appended claims.

Claims (10)

1. A mass unstructured distribution network data integration method based on knowledge graph technology, characterized by comprising:
collecting the unstructured distribution network data of each information system through a data acquisition unit, and performing quality analysis and data cleaning on the unstructured distribution network data of each information system respectively;
building a data local index based on a local knowledge graph according to the processed unstructured distribution network data of each information system, the data local index based on the local knowledge graph including the local knowledge graph and the local data index table of each information system;
sending the data local index based on the local knowledge graph to a data control center through a big data connector;
building, by the data control center, a data global index based on a global knowledge graph, the data global index based on the global knowledge graph including a global knowledge graph and a global data index table.
2. The method according to claim 1, characterized in that the step of building the data local index based on the local knowledge graph according to the processed unstructured distribution network data of each information system includes:
performing entity extraction on the processed unstructured distribution network data of each information system to obtain an entity library of the unstructured distribution network data of each information system, the entity library including the entities, classes, and attribute information of the unstructured distribution network data of each information system;
building the local knowledge graph according to the hypernym-hyponym relations of the entities in the entity library;
building the local data index table with the entity name of each entity in the entity library as the key, the local data index table including local index information corresponding to each entity in the entity library, the local index information including attributes, instance, source text, data source name, and source database.
3. The method according to claim 1, characterized in that the step of building, by the data control center, the data global index based on the global knowledge graph includes:
performing conflict detection on the local knowledge graphs of the information systems, the conflict detection including entity name conflict detection, hypernym-hyponym relation conflict detection, single-valued attribute conflict detection, and multi-valued attribute conflict detection;
if conflicts exist among the local knowledge graphs of the information systems, eliminating the conflicts;
based on the entities, classes, attribute values, and hypernym-hyponym relations of the local knowledge graphs obtained during conflict detection and elimination, unifying the local index information of each entity in the local data index tables and building the global knowledge graph;
building the mapping relations between the global knowledge graph and the local knowledge graph of each information system;
according to the mapping relations and the local data index tables, building the global data index table with the entity name of each entity in the entity library as the key, the global data index table including global index information corresponding to each entity in the entity library, the global index information including the belonging relation, the conflicts triggered, the local index information, and the local knowledge graph to which the entity belongs.
4. The method according to claim 3, characterized in that the step of eliminating the conflicts if conflicts exist among the local knowledge graphs of the information systems includes:
assigning a priority to the local knowledge graph of each information system;
if an entity name conflict or a hypernym-hyponym relation conflict exists among the local knowledge graphs of the information systems, selecting the entity name or hypernym-hyponym relation of the highest-priority local knowledge graph as the entity name or hypernym-hyponym relation of the global knowledge graph, and modifying the entity names and hypernym-hyponym relations of the corresponding local knowledge graphs;
traversing the single-valued attributes in each local knowledge graph, and if a single-valued attribute is found to have multiple values, selecting the attribute value of the highest-priority local knowledge graph as the value of that attribute in the global knowledge graph, and modifying the attribute value of the corresponding local knowledge graphs;
if the values of a multi-valued attribute are found to be inconsistent across the local knowledge graphs, merging the attribute values of all local knowledge graphs to form the attribute value of the global knowledge graph, and modifying the attribute value of the corresponding local knowledge graphs.
5. The method according to claim 2, characterized in that the step of performing entity extraction on the processed unstructured distribution network data of each information system includes:
determining whether the processed unstructured distribution network data of each information system is text data;
if the processed unstructured distribution network data of each information system is text data, extracting entities, classes, and attribute information according to preset rules and dictionary methods;
if the processed unstructured distribution network data of each information system is not text data, converting the processed unstructured distribution network data of each information system into text;
segmenting the text into words, analyzing the syntactic structure of the text and the dependency relations between words in each sentence using a syntactic parsing algorithm based on natural language processing, and then extracting entities, classes, and attribute information.
6. The method according to claim 2, characterized in that the step of building the local knowledge graph according to the relations of the entities in the entity library includes:
taking the inner product over all subsequences of a given length in the character string sequences of the textualized unstructured distribution network data to compute the similarity between sentences;
using the string sequence kernel as the kernel of a support vector machine for statistical learning to obtain the relations among the entities in the entity library, and building the local knowledge graph using the triple shown in the following formula:
$G_L = (E, R, S)$
where $G_L$ is the local knowledge graph; $E = \{e_1, e_2, \ldots, e_{|E|}\}$ is the set of entities in the entity library, containing $|E|$ distinct entities; $R = \{r_1, r_2, \ldots, r_{|R|}\}$ is the set of entity relations in the entity library, containing $|R|$ distinct entity relations; and $S$ denotes the set of triples in the local knowledge graph.
7. The method according to claim 3, characterized in that the entity name conflict detection method includes:
computing the similarity between an entity A of a given local knowledge graph and an entity B of another local knowledge graph according to the following formula;
$\mathrm{Sim}(A, B) = \mathrm{Dis}(L_A, L_B) + \mathrm{Dis}(S_A, S_B)$
where $\mathrm{Sim}(A, B)$ is the similarity of entity A and entity B; $\mathrm{Dis}(L_A, L_B)$ is the distance between the class $L_A$ of entity A and the class $L_B$ of entity B; and $\mathrm{Dis}(S_A, S_B)$ is the distance between the attributes $S_A$ of entity A and the attributes $S_B$ of entity B;
if the similarity of entity A and entity B is greater than a threshold, determining whether the entity names of entity A and entity B are identical;
if the entity name of entity A is identical to that of entity B, the detection result is that an entity name conflict exists.
8. The method according to claim 3, characterized in that the hypernym-hyponym relation conflict detection method includes:
extracting the hypernym-hyponym relation graph of an entity A in a given local knowledge graph;
finding, in the other local knowledge graphs, the set of entities that have a hypernym-hyponym relation with entity A, and extracting the hypernym-hyponym relation graph of each entity in that set;
obtaining the merged hypernym-hyponym relation graph according to the following formula;
$G = G_A \cup G_{q_1} \cup G_{q_2} \cup \cdots \cup G_{q_n}$
where $G$ is the merged hypernym-hyponym relation graph; $G_A$ is the hypernym-hyponym relation graph of entity A; $G_{q_1}, G_{q_2}, \ldots, G_{q_n}$ are the hypernym-hyponym relation graphs of the respective entities in the hypernym-hyponym entity set; and $n$ is the number of entities in that set;
deleting every vertex with in-degree zero from the merged hypernym-hyponym relation graph together with its outgoing edges, and repeating until no further vertex can be output from the merged graph;
if all nodes in the merged hypernym-hyponym relation graph have been deleted, the detection result is that no hypernym-hyponym relation conflict exists; if at least one node remains in the merged graph, the detection result is that a hypernym-hyponym relation conflict exists.
9. The method according to claim 2, characterized in that the method further includes: according to the unstructured distribution network data of new equipment and/or new users, updating the data local index based on the local knowledge graph and the data global index based on the global knowledge graph.
10. The method according to claim 9, characterized in that the step of updating the data local index based on the local knowledge graphs and the data global index based on the global knowledge graph according to the unstructured distribution network data of the new device and/or new user comprises:
Acquiring the unstructured distribution network data of the new device and/or new user, and extracting the entity, class and attribute information from that data;
Judging whether the entities and classes of the unstructured distribution network data of the new device and/or new user match the entities and classes in one of the local knowledge graphs;
If the judgment result is a match, fusing the entities of the unstructured distribution network data of the new device and/or new user into that local knowledge graph, updating the corresponding entity attributes and the hyponymy relations between entities, and updating the local data index table and the data global index based on the global knowledge graph according to the fused local knowledge graph;
If the judgment result is a mismatch, creating new entities and classes, and updating, according to the new entities and classes, the data local index based on the local knowledge graphs and the data global index based on the global knowledge graph.
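The match-or-create update flow of claim 10 can be sketched in Python as below. This is a rough illustration under stated assumptions, not the claimed method itself: the dictionary layouts, the function name integrate_record, the rule that a record "matches" a local graph when that graph already contains its class, and the choice to hold a newly created class in a new local graph are all hypothetical. Updating the hyponymy relations between entities is omitted for brevity.

```python
def integrate_record(record, local_graphs, local_index, global_index):
    """Fuse one extracted record into the local graphs and refresh both indexes.

    record:        {"entity": str, "class": str, "attrs": dict}  (assumed layout)
    local_graphs:  {graph_name: {"classes": set, "entities": dict}}
    local_index:   {graph_name: set_of_entity_names}
    global_index:  {entity_name: graph_name}
    Returns True if an existing local graph matched, False otherwise.
    """
    entity, cls = record["entity"], record["class"]

    # Judge whether some local graph already knows this class (assumed match rule).
    target = next(
        (name for name, g in local_graphs.items() if cls in g["classes"]),
        None,
    )
    matched = target is not None

    if not matched:
        # No match: create the new class, held here in a new local graph.
        target = f"local:{cls}"
        local_graphs[target] = {"classes": {cls}, "entities": {}}

    # Fuse the entity and update its attributes.
    graph = local_graphs[target]
    node = graph["entities"].setdefault(entity, {"class": cls, "attrs": {}})
    node["attrs"].update(record["attrs"])

    # Update the local index and the global index.
    local_index.setdefault(target, set()).add(entity)
    global_index[entity] = target
    return matched


# Hypothetical example data.
local_graphs = {"substation-1": {"classes": {"transformer"}, "entities": {}}}
local_index, global_index = {}, {}
integrate_record(
    {"entity": "T-101", "class": "transformer", "attrs": {"rated_kv": 10}},
    local_graphs, local_index, global_index,
)
integrate_record(
    {"entity": "M-7", "class": "smart meter", "attrs": {}},
    local_graphs, local_index, global_index,
)
```

In this sketch the global index maps each entity to the local graph that holds it, so a lookup can be routed to a single local index instead of scanning every local graph.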
CN201710593929.3A 2017-07-20 2017-07-20 Mass unstructured distribution network data integration method based on knowledge graph technology Active CN107330125B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710593929.3A CN107330125B (en) 2017-07-20 2017-07-20 Mass unstructured distribution network data integration method based on knowledge graph technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710593929.3A CN107330125B (en) 2017-07-20 2017-07-20 Mass unstructured distribution network data integration method based on knowledge graph technology

Publications (2)

Publication Number Publication Date
CN107330125A (en) 2017-11-07
CN107330125B (en) 2020-06-30

Family

ID=60226885

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710593929.3A Active CN107330125B (en) 2017-07-20 2017-07-20 Mass unstructured distribution network data integration method based on knowledge graph technology

Country Status (1)

Country Link
CN (1) CN107330125B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103488724A (en) * 2013-09-16 2014-01-01 复旦大学 Book-oriented reading field knowledge map construction method
CN105630901A (en) * 2015-12-21 2016-06-01 清华大学 Knowledge graph representation learning method
US20160328443A1 (en) * 2015-05-06 2016-11-10 Vero Analytics, Inc. Knowledge Graph Based Query Generation
CN106447346A (en) * 2016-08-29 2017-02-22 北京中电普华信息技术有限公司 Method and system for construction of intelligent electric power customer service system
CN106886543A (en) * 2015-12-16 2017-06-23 清华大学 The knowledge mapping of binding entity description represents learning method and system
CN106897273A (en) * 2017-04-12 2017-06-27 福州大学 A kind of network security dynamic early-warning method of knowledge based collection of illustrative plates

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
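Wu Yunbing (吴运兵): "Research on knowledge graph construction methods based on multiple data sources" (基于多数据源的知识图谱构建方法研究), Journal of Fuzhou University (Natural Science Edition) (《福州大学学报(自然科学版)》) *
Hu Fanghuai (胡芳槐): "Research on Chinese knowledge graph construction methods based on multiple data sources" (基于多种数据源的中文知识图谱构建方法研究), China Doctoral Dissertations Full-text Database (Electronic Journal) (《中国博士学位论文全文数据库(电子期刊)》) *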

Cited By (50)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107908738A (en) * 2017-11-15 2018-04-13 昆明能讯科技有限责任公司 The implementation method of enterprise-level knowledge mapping search engine based on power specialty dictionary
CN108595449A (en) * 2017-11-23 2018-09-28 北京科东电力控制系统有限责任公司 The structure and application process of dispatch automated system knowledge mapping
CN108304493A (en) * 2018-01-10 2018-07-20 深圳市腾讯计算机系统有限公司 A kind of the hypernym method for digging and device of knowledge based collection of illustrative plates
CN108304493B (en) * 2018-01-10 2020-06-12 深圳市腾讯计算机系统有限公司 Hypernym mining method and device based on knowledge graph
WO2019137033A1 (en) * 2018-01-12 2019-07-18 扬州大学 Automatic construction method for software bug oriented domain knowledge graph
US11386136B2 (en) 2018-01-12 2022-07-12 Yangzhou University Automatic construction method of software bug knowledge graph
CN110297910A (en) * 2018-03-23 2019-10-01 国际商业机器公司 Manage distributed knowledge figure
CN109284393A (en) * 2018-08-28 2019-01-29 合肥工业大学 A kind of fusion method for family tree character attribute title
CN109284393B (en) * 2018-08-28 2020-11-06 合肥工业大学 Fusion method for family tree character attribute names
CN109241078A (en) * 2018-08-30 2019-01-18 中国地质大学(武汉) A kind of knowledge mapping hoc queries method based on hybrid database
CN109241078B (en) * 2018-08-30 2021-07-20 中国地质大学(武汉) Knowledge graph organization query method based on mixed database
CN109189938A (en) * 2018-08-31 2019-01-11 北京字节跳动网络技术有限公司 Method and apparatus for updating knowledge mapping
CN109284394A (en) * 2018-09-12 2019-01-29 青岛大学 A method of Company Knowledge map is constructed from multi-source data integration visual angle
CN109446343A (en) * 2018-11-05 2019-03-08 上海德拓信息技术股份有限公司 A kind of method of public safety knowledge mapping building
CN109446343B (en) * 2018-11-05 2020-10-27 上海德拓信息技术股份有限公司 Public safety knowledge graph construction method
CN109446385B (en) * 2018-11-14 2022-06-14 中国科学院计算技术研究所 Method for establishing network resource equipment map and using method of equipment map
CN109446385A (en) * 2018-11-14 2019-03-08 中国科学院计算技术研究所 A kind of method of equipment map that establishing Internet resources and the application method of the equipment map
CN109582958B (en) * 2018-11-20 2023-07-18 厦门大学深圳研究院 Disaster story line construction method and device
CN109582958A (en) * 2018-11-20 2019-04-05 厦门大学深圳研究院 A kind of disaster story line construction method and device
CN109766445A (en) * 2018-12-13 2019-05-17 平安科技(深圳)有限公司 A kind of knowledge mapping construction method and data processing equipment
CN109766445B (en) * 2018-12-13 2024-03-26 平安科技(深圳)有限公司 Knowledge graph construction method and data processing device
CN109783605A (en) * 2018-12-14 2019-05-21 天津大学 A kind of science service interconnection method based on Bayesian inference technology
CN109783605B (en) * 2018-12-14 2021-05-11 天津大学 Scientific and technological service docking method based on Bayesian inference technology
CN109685684A (en) * 2018-12-26 2019-04-26 武汉大学 A kind of low-voltage network topological structure method of calibration of knowledge based map
CN109933582A (en) * 2019-03-11 2019-06-25 国家电网有限公司 Data processing method and device
CN110019150A (en) * 2019-04-11 2019-07-16 软通动力信息技术有限公司 A kind of data administering method, system and electronic equipment
CN111858948A (en) * 2019-04-30 2020-10-30 杭州海康威视数字技术股份有限公司 Ontology construction method and device, electronic equipment and storage medium
CN110457482A (en) * 2019-06-06 2019-11-15 福建奇点时空数字科技有限公司 A kind of intelligent information service system of knowledge based map
CN110275966A (en) * 2019-07-01 2019-09-24 科大讯飞(苏州)科技有限公司 A kind of Knowledge Extraction Method and device
CN110275966B (en) * 2019-07-01 2021-10-01 科大讯飞(苏州)科技有限公司 Knowledge extraction method and device
CN110427471A (en) * 2019-07-26 2019-11-08 四川长虹电器股份有限公司 A kind of natural language question-answering method and system of knowledge based map
CN110489475A (en) * 2019-08-14 2019-11-22 广东电网有限责任公司 A kind of multi-source heterogeneous data processing method, system and relevant apparatus
CN111026874A (en) * 2019-11-22 2020-04-17 海信集团有限公司 Data processing method and server of knowledge graph
CN111046115A (en) * 2019-12-24 2020-04-21 四川文轩教育科技有限公司 Knowledge graph-based heterogeneous database interconnection management method
CN111046115B (en) * 2019-12-24 2023-08-08 四川文轩教育科技有限公司 Heterogeneous database interconnection management method based on knowledge graph
CN111639082B (en) * 2020-06-08 2022-12-23 成都信息工程大学 Object storage management method and system of billion-level node scale knowledge graph based on Ceph
CN111639082A (en) * 2020-06-08 2020-09-08 成都信息工程大学 Object storage management method and system of billion-level node scale knowledge graph based on Ceph
CN112241458B (en) * 2020-10-13 2022-10-28 北京百分点科技集团股份有限公司 Text knowledge structuring processing method, device, equipment and readable storage medium
CN112241458A (en) * 2020-10-13 2021-01-19 北京百分点信息科技有限公司 Text knowledge structuring processing method, device, equipment and readable storage medium
CN112256882A (en) * 2020-10-16 2021-01-22 美林数据技术股份有限公司 Multi-similarity-based cross-system network entity fusion method
CN112200382A (en) * 2020-10-27 2021-01-08 支付宝(杭州)信息技术有限公司 Training method and device of risk prediction model
CN112307172A (en) * 2020-10-31 2021-02-02 平安科技(深圳)有限公司 Semantic parsing equipment, method, terminal and storage medium
CN112307172B (en) * 2020-10-31 2023-08-01 平安科技(深圳)有限公司 Semantic analysis device, semantic analysis method, terminal and storage medium
CN112287123A (en) * 2020-11-19 2021-01-29 国网湖南省电力有限公司 Entity alignment method and device based on edge type attention mechanism
CN112650865B (en) * 2021-01-27 2021-11-09 南威软件股份有限公司 Method and system for solving multi-region license data conflict based on flexible rule
CN112650865A (en) * 2021-01-27 2021-04-13 南威软件股份有限公司 Method and system for solving multi-region license data conflict based on flexible rule
CN113159320A (en) * 2021-03-08 2021-07-23 北京航空航天大学 Scientific and technological resource data integration method and device based on knowledge graph
CN113157697A (en) * 2021-04-19 2021-07-23 山东艺术学院 Mingqing custom music score database system
CN113177095A (en) * 2021-04-29 2021-07-27 北京明略软件系统有限公司 Enterprise knowledge management method, system, electronic equipment and storage medium
CN117556059A (en) * 2024-01-12 2024-02-13 天津滨电电力工程有限公司 Detection and correction method based on knowledge fusion and reasoning charging station data

Also Published As

Publication number Publication date
CN107330125B (en) 2020-06-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant