CN107330125A - Method for integrating massive unstructured distribution network data based on knowledge graph technology - Google Patents
- Publication number
- CN107330125A (application number CN201710593929.3A)
- Authority
- CN
- China
- Prior art keywords
- data
- entity
- graph
- knowledge graph
- local
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/316—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
Abstract
The present invention discloses a method for integrating massive unstructured distribution network data based on knowledge graph technology. A data acquisition unit collects the unstructured distribution network data of each information system and performs quality analysis and data cleaning on the data of each system separately. From the processed data of each information system, a local data index based on a local knowledge graph is built and sent to a data management center through a big data connector. The data management center then builds a global data index based on a global knowledge graph. By moving the collection, quality analysis and cleaning of distributed multi-source heterogeneous data forward to each information system, the invention reduces the data fusion computation, storage pressure and data scheduling burden of the data management center. Integrating the data sources through the global data index based on the global knowledge graph facilitates data query and extraction and reduces the workload of the data management center.
Description
Technical field
The present invention relates to the technical field of data fusion and integration, and in particular to a method for integrating massive unstructured distribution network data based on knowledge graph technology.
Background art
A power grid comprises information systems such as the marketing system, the production system, the distribution data acquisition and monitoring system, and electric energy meters. To strengthen grid operation capability and to expand the capability and quality of customer service, mass data from distribution network equipment must be collected efficiently and rapidly, combined with business system data from the marketing and production systems, and effectively identified and filtered, so that the final output is data that benefits power operation and improves customer service quality and service level.
The distribution network data collected from the information systems fall into two classes. One is structured data, such as numeric or symbolic data; the other is unstructured data, such as user speech, images and text. The existing approach to integrating unstructured distribution network data is to set up a unified data center platform, copy the collected unstructured data to that platform using technologies such as data adapters, and then clean and integrate the data there, thereby meeting the demand for frequent data exchange between departments.
However, this approach has two drawbacks. First, data cleaning is generally performed centrally at the data center, so the cleaning workload there is large and integration is slow, which cannot meet the requirements of integrating massive unstructured data. Second, the unstructured data of the information systems differ in business logic, data format and storage. Once the data have been transferred to the data center platform, these differences hinder the classified storage of mass data, complicate data extraction and query, and greatly increase the workload of the data center platform.
Summary of the invention
To solve the above technical problems, the present invention provides a method for integrating massive unstructured distribution network data based on knowledge graph technology.
According to an embodiment of the present invention, a method for integrating massive unstructured distribution network data based on knowledge graph technology is provided, comprising:
collecting the unstructured distribution network data of each information system through a data acquisition unit, and performing quality analysis and data cleaning on the unstructured distribution network data of each information system separately;
building, from the processed unstructured distribution network data of each information system, a local data index based on a local knowledge graph, the local data index comprising the local knowledge graph and a local data index table of each information system;
sending the local data index based on the local knowledge graph to a data management center through a big data connector;
building, by the data management center, a global data index based on a global knowledge graph, the global data index comprising the global knowledge graph and a global data index table.
Further, the step of building the local data index based on the local knowledge graph from the processed unstructured distribution network data of each information system comprises:
performing entity extraction on the processed unstructured distribution network data of each information system to obtain an entity library for that data, the entity library comprising the entities, classes and attribute information of the unstructured distribution network data of each information system;
building the local knowledge graph according to the relations between the entities in the entity library;
building the local data index table with the entity name of each entity in the entity library as the key, the local data index table comprising local index information corresponding to each entity in the entity library, the local index information comprising attributes, instances, source text, data source name and source database.
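The index table described in this step can be pictured as a dictionary keyed by entity name. The following is a minimal sketch only: the field names, the input record layout and the helper `build_local_index` are illustrative assumptions, not structures defined by the patent.

```python
from dataclasses import dataclass, field


@dataclass
class LocalIndexEntry:
    """One row of a local data index table (field names are assumptions)."""
    attributes: dict = field(default_factory=dict)
    instances: list = field(default_factory=list)
    source_texts: list = field(default_factory=list)
    data_source_name: str = ""   # name of the information system
    database: str = ""           # database holding the entity's raw data


def build_local_index(entity_library, data_source_name):
    """Build the index table with each entity name as the key."""
    index = {}
    for entity in entity_library:
        index[entity["name"]] = LocalIndexEntry(
            attributes=dict(entity.get("attributes", {})),
            instances=list(entity.get("instances", [])),
            source_texts=list(entity.get("texts", [])),
            data_source_name=data_source_name,
            database=entity.get("database", ""),
        )
    return index


# Hypothetical entity library for one information system.
entity_library = [
    {"name": "transformer-01", "attributes": {"voltage_level": "10kV"},
     "instances": ["inspection-2017"], "texts": ["report.txt"],
     "database": "db1"},
]
local_index = build_local_index(entity_library, "production system")
```

Keying by entity name keeps lookups direct; the patent leaves the physical storage of the table open, so a relational table with the entity name as primary key would serve equally well.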
Further, the step of building, by the data management center, the global data index based on the global knowledge graph comprises:
performing conflict detection on the local knowledge graphs of the information systems, the conflict detection comprising entity name conflict detection, hypernym-hyponym relation conflict detection, single-valued attribute conflict detection and multi-valued attribute conflict detection;
eliminating any conflict that exists between the local knowledge graphs of the information systems;
unifying the local index information of each entity in the local data index tables according to the entities, classes, attribute values and hypernym-hyponym relations of the local knowledge graphs obtained while detecting and eliminating conflicts, and building the global knowledge graph;
building mapping relations between the global knowledge graph and the local knowledge graph of each information system;
building the global data index table according to the mapping relations and the local data index tables, with the entity name of each entity in the entity library as the key, the global data index table comprising global index information corresponding to each entity in the entity library, the global index information comprising membership relations, triggered conflicts, the local index information and the local knowledge graph to which the entity belongs.
Further, the step of eliminating any conflict that exists between the local knowledge graphs of the information systems comprises:
assigning a priority to the local knowledge graph of each information system;
if an entity name conflict or a hypernym-hyponym relation conflict exists between the local knowledge graphs of the information systems, selecting the entity name or hypernym-hyponym relation of the highest-priority local knowledge graph as the entity name or hypernym-hyponym relation of the global knowledge graph, and modifying the entity name and hypernym-hyponym relation of the corresponding local knowledge graphs;
traversing the single-valued attributes in each local knowledge graph and, if a single-valued attribute is found to have multiple values, selecting the attribute value of the highest-priority local knowledge graph as the value of that attribute in the global knowledge graph, and modifying the attribute value of the corresponding local knowledge graphs;
if the values of a multi-valued attribute are found to be inconsistent across the local knowledge graphs, merging the attribute values of all the local knowledge graphs to form the attribute value of the global knowledge graph, and modifying the attribute value of the corresponding local knowledge graphs.
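The two attribute rules above (highest priority wins for single-valued attributes, union for multi-valued attributes) can be sketched as follows. The function names, the dictionary layout and the convention that a larger number means a higher priority are assumptions made for illustration.

```python
def resolve_single_valued(values_by_graph, priority):
    """Return the global value of a single-valued attribute.

    values_by_graph: {graph_name: value} as seen in each local graph.
    priority: {graph_name: int}; larger means higher priority (assumption).
    """
    distinct = set(values_by_graph.values())
    if len(distinct) <= 1:
        # No conflict: all graphs agree (or no graph has a value).
        return next(iter(distinct), None)
    best_graph = max(values_by_graph, key=lambda g: priority[g])
    return values_by_graph[best_graph]


def resolve_multi_valued(values_by_graph):
    """Return the global value of a multi-valued attribute: the union."""
    merged = set()
    for values in values_by_graph.values():
        merged |= set(values)
    return merged


# Hypothetical priorities and conflicting values.
priority = {"marketing": 1, "production": 2}
voltage = resolve_single_valued(
    {"marketing": "10kV", "production": "35kV"}, priority)
phones = resolve_multi_valued(
    {"marketing": ["555-0100"], "production": ["555-0100", "555-0199"]})
```

After the global value is chosen, the patent also writes it back into the corresponding local graphs; that write-back is omitted here.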
Further, the step of performing entity extraction on the processed unstructured distribution network data of each information system comprises:
determining whether the processed unstructured distribution network data of each information system are text data;
if the processed unstructured distribution network data are text data, extracting entities, classes and attribute information using preset rules and a dictionary-based method;
if the processed unstructured distribution network data are not text data, converting the processed unstructured distribution network data into text;
performing word segmentation on the text, analyzing the syntactic structure of its sentences and the dependency relations between the words in each sentence using a parsing algorithm based on natural language processing, and then extracting entities, classes and attribute information.
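For the text branch, extraction by preset rules and a dictionary might look like the sketch below. The dictionary contents, the voltage rule and the function name are hypothetical; the patent does not list its rules or dictionaries, and the non-text branch (conversion to text, word segmentation and dependency parsing) is not covered here.

```python
import re

# Hypothetical dictionary mapping known terms to their class.
DICTIONARY = {
    "distribution transformer": "equipment",
    "circuit breaker": "equipment",
}

# Hypothetical preset rule: a number followed by "kV" is a voltage level.
VOLTAGE_RULE = re.compile(r"(\d+(?:\.\d+)?)\s*kV", re.IGNORECASE)


def extract_from_text(text):
    """Extract entities with classes (dictionary) and attributes (rule)."""
    lowered = text.lower()
    entities = [
        {"name": term, "class": cls}
        for term, cls in DICTIONARY.items()
        if term in lowered
    ]
    attributes = {
        "voltage_level_kv": [float(v) for v in VOLTAGE_RULE.findall(text)],
    }
    return entities, attributes


entities, attributes = extract_from_text(
    "The 10 kV distribution transformer tripped.")
```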
Further, the step of building the local knowledge graph according to the relations between the entities in the entity library comprises:
computing the inner product over all subsequences of a given length in the character string sequences of the unstructured distribution network data in text form, to calculate the similarity between sentences;
using this string-sequence kernel as the kernel of a support vector machine for statistical learning to obtain the relations between the entities in the entity library, and building the local knowledge graph as the triple given by:
$$G_L = (E, R, S)$$
where $G_L$ is the local knowledge graph; $E = \{e_1, e_2, \ldots, e_{|E|}\}$ is the set of entities in the entity library, containing $|E|$ distinct entities; $R = \{r_1, r_2, \ldots, r_{|R|}\}$ is the set of entity relations in the entity library, containing $|R|$ distinct entity relations; and $S \subseteq E \times R \times E$ denotes the set of triples in the local knowledge graph.
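To make the kernel idea and the triple $(E, R, S)$ concrete, here is a minimal sketch. It deliberately simplifies: it uses a contiguous k-gram (spectrum) kernel rather than a kernel over arbitrary subsequences, and it does not train a support vector machine. The function names and the example relation are assumptions.

```python
from collections import Counter


def spectrum_kernel(s, t, k=2):
    """Inner product of the k-gram count vectors of two strings.

    A simplified stand-in for the string-sequence kernel in the text:
    only contiguous substrings of length k are counted.
    """
    grams_s = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    grams_t = Counter(t[i:i + k] for i in range(len(t) - k + 1))
    return sum(count * grams_t[g] for g, count in grams_s.items()
               if g in grams_t)


def build_local_graph(triples):
    """Assemble G_L = (E, R, S) from (head, relation, tail) triples."""
    entities = {h for h, _, _ in triples} | {t for _, _, t in triples}
    relations = {r for _, r, _ in triples}
    return entities, relations, set(triples)


similarity = spectrum_kernel("abab", "abb", k=2)
E, R, S = build_local_graph([("feeder-1", "supplies", "transformer-01")])
```

A Gram matrix filled with such kernel values could be passed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`; which learner the patent's authors used is not stated.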
Further, the entity name conflict detection comprises:
calculating the similarity between an entity A of one local knowledge graph and an entity B of another local knowledge graph according to:
$$\mathrm{Sim}(A, B) = \mathrm{Dis}(L_A, L_B) + \mathrm{Dis}(S_A, S_B)$$
where $\mathrm{Sim}(A, B)$ is the similarity of entity A and entity B; $\mathrm{Dis}(L_A, L_B)$ is the distance between the class $L_A$ of entity A and the class $L_B$ of entity B; and $\mathrm{Dis}(S_A, S_B)$ is the distance between the attribute $S_A$ of entity A and the attribute $S_B$ of entity B;
if the similarity of entity A and entity B is greater than a threshold, determining whether the entity names of entity A and entity B are identical;
if the entity names of entity A and entity B are identical, the detection result is that an entity name conflict exists.
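The check can be sketched as below, following the decision rule exactly as the text states it. The patent does not define $\mathrm{Dis}$; the sketch plugs in Jaccard overlap of label sets as a stand-in, so that larger values mean closer entities. That choice, the record layout and the function names are all assumptions.

```python
def jaccard(a, b):
    """Overlap of two label sets, used here as a stand-in for Dis."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def sim(entity_a, entity_b, dis=jaccard):
    """Sim(A, B) = Dis(L_A, L_B) + Dis(S_A, S_B)."""
    return (dis(entity_a["classes"], entity_b["classes"])
            + dis(entity_a["attributes"], entity_b["attributes"]))


def has_name_conflict(entity_a, entity_b, threshold, dis=jaccard):
    """Apply the two-stage test: similarity first, then name comparison."""
    if sim(entity_a, entity_b, dis) <= threshold:
        return False
    return entity_a["name"] == entity_b["name"]


# Hypothetical entities from two local knowledge graphs.
a = {"name": "T1", "classes": ["equipment"],
     "attributes": ["voltage", "location"]}
b = {"name": "T1", "classes": ["equipment"], "attributes": ["voltage"]}
conflict = has_name_conflict(a, b, threshold=1.0)
```

Because `dis` is a parameter, a different class or attribute measure can be swapped in without touching the decision logic.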
Further, the hypernym-hyponym relation conflict detection comprises:
extracting the hypernym-hyponym relation graph of an entity A in one local knowledge graph;
finding, in the other local knowledge graphs, the set of entities that have a hypernym-hyponym relation with entity A, and extracting the hypernym-hyponym relation graph of each entity in that set;
obtaining the merged hypernym-hyponym relation graph according to:
$$G = G_A \cup G_{q_1} \cup G_{q_2} \cup \cdots \cup G_{q_n}$$
where $G$ is the merged hypernym-hyponym relation graph; $G_A$ is the hypernym-hyponym relation graph of entity A; $G_{q_1}, G_{q_2}, \ldots, G_{q_n}$ are respectively the hypernym-hyponym relation graphs of the entities in the set; and $n$ is the number of entities in the set;
repeatedly deleting every vertex whose in-degree is zero, together with its outgoing edges, from the merged hypernym-hyponym relation graph, until no vertex with in-degree zero remains to be output;
if all the nodes of the merged hypernym-hyponym relation graph have been deleted, the detection result is that no hypernym-hyponym relation conflict exists; if at least one node remains in the merged hypernym-hyponym relation graph, the detection result is that a hypernym-hyponym relation conflict exists.
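The deletion procedure above amounts to a topological sort used as a cycle test: a vertex survives only if it lies on, or downstream of, a cycle. A minimal sketch, assuming each relation graph is given as a collection of `(hypernym, hyponym)` edges (the representation and function name are assumptions):

```python
from collections import defaultdict, deque


def has_hypernym_conflict(edge_sets):
    """Merge the relation graphs and report whether any vertex survives
    repeated removal of in-degree-zero vertices (i.e. a cycle exists)."""
    edges = set()
    for edge_set in edge_sets:          # G = G_A ∪ G_q1 ∪ ... ∪ G_qn
        edges |= set(edge_set)

    successors = defaultdict(set)
    in_degree = defaultdict(int)
    nodes = set()
    for upper, lower in edges:
        nodes.update((upper, lower))
        successors[upper].add(lower)
        in_degree[lower] += 1

    queue = deque(n for n in nodes if in_degree[n] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in successors[node]:  # delete the outgoing edges
            in_degree[child] -= 1
            if in_degree[child] == 0:
                queue.append(child)

    return removed < len(nodes)


# Hypothetical graphs: the second one closes a cycle.
g_a = [("equipment", "switch"), ("switch", "breaker")]
g_q = [("breaker", "equipment")]
conflict = has_hypernym_conflict([g_a, g_q])
```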
Further, the method also comprises: updating the local data index based on the local knowledge graph and the global data index based on the global knowledge graph according to the unstructured distribution network data of new equipment and/or new users.
Further, the step of updating the local data index based on the local knowledge graph and the global data index based on the global knowledge graph according to the unstructured distribution network data of new equipment and/or new users comprises:
obtaining the unstructured distribution network data of the new equipment and/or new users, and extracting the entities, classes and attribute information of that data;
determining whether the entities and classes of the unstructured distribution network data of the new equipment and/or new users match the entities and classes in any local knowledge graph;
if they match, fusing the entities of the unstructured distribution network data of the new equipment and/or new users into the matching local knowledge graph, updating the corresponding entity attributes and the hypernym-hyponym relations between entities, and updating the local data index table and the global data index based on the global knowledge graph according to the fused local knowledge graph;
if they do not match, creating new entities and classes, and updating the local data index based on the local knowledge graph and the global data index based on the global knowledge graph according to the new entities and classes.
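The match-or-create branch of the update step can be sketched as follows. The graph layout, the `(class, name)` matching key and the `home_graph` parameter (the local graph that receives an unmatched entity) are assumptions; the patent does not say how a match is decided or where a new entity is placed, and the index refresh that follows is left to the caller.

```python
def integrate_new_entity(local_graphs, new_entity, home_graph):
    """Fuse a newly extracted entity into the local knowledge graphs.

    local_graphs: {graph_name: {(class, name): {attribute: value}}}
    Returns (graph_name, matched) so the caller knows which local and
    global index entries need refreshing.
    """
    key = (new_entity["class"], new_entity["name"])
    for graph_name, graph in local_graphs.items():
        if key in graph:
            # Match: merge the new attribute values into the entity.
            graph[key].update(new_entity["attributes"])
            return graph_name, True
    # No match: create the entity (and implicitly its class).
    graph = local_graphs.setdefault(home_graph, {})
    graph[key] = dict(new_entity["attributes"])
    return home_graph, False


# Hypothetical state and a new reading for a known device.
graphs = {"production": {("equipment", "T1"): {"voltage": "10kV"}}}
where, matched = integrate_new_entity(
    graphs,
    {"class": "equipment", "name": "T1", "attributes": {"status": "online"}},
    home_graph="production",
)
```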
As can be seen from the above technical solution, the present invention provides a method for integrating massive unstructured distribution network data based on knowledge graph technology. A big data connector and a data acquisition unit are deployed in each information system, such as the marketing system, the production system, the distribution data acquisition and monitoring system, and the electric energy meters. The collection, quality analysis and cleaning of distributed multi-source heterogeneous data are moved forward to each information system, which reduces the data fusion computation, storage pressure and data scheduling burden of the data management center. The data acquisition unit performs data sampling, quality analysis and data cleaning on the unstructured distribution network data of each information system, such as user speech, pictures and text. The processed data are used to build the local knowledge graph and local data index table of each information system, which are transferred to the data management center through the big data connector. The data management center detects and eliminates conflicts between the local knowledge graphs and builds a global knowledge graph and a global data index table that cover all the data, so that the data sources are integrated through the global knowledge graph and the global data index table. When newly added data are integrated, the global knowledge graph can be used to optimize the integration: the unstructured distribution network data collected from new equipment and/or new users are used to update the local data index based on the local knowledge graph and the global data index based on the global knowledge graph. As more equipment and data are integrated, the local and global knowledge graphs are continuously updated, which facilitates subsequent work such as searching and querying massive distribution network data and big data analysis.
Brief description of the drawings
Fig. 1 is a flow chart of the construction of the distributed multi-source heterogeneous data index according to an embodiment of the present invention;
Fig. 2 is a flow chart of a method for integrating massive unstructured distribution network data based on knowledge graph technology according to an embodiment of the present invention;
Fig. 3 is a flow chart of the method for building the local data index based on the local knowledge graph according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of the local data index table according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of the local data index based on the local knowledge graph according to an embodiment of the present invention;
Fig. 6 is a flow chart of the method for building the global data index based on the global knowledge graph according to an embodiment of the present invention;
Fig. 7 is a schematic diagram of the global data index table according to an embodiment of the present invention;
Fig. 8 is a flow chart of the method for eliminating conflicts between the local knowledge graphs according to an embodiment of the present invention;
Fig. 9 is a flow chart of the method for entity extraction from unstructured distribution network data according to an embodiment of the present invention;
Fig. 10 is a flow chart of a method for integrating massive unstructured distribution network data based on knowledge graph technology according to another embodiment of the present invention;
Fig. 11 is a flow chart of the method for updating the knowledge graphs according to another embodiment of the present invention.
Detailed description of the embodiments
To help those skilled in the art better understand the technical solutions in this application, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings.
Fig. 1 shows the flow of building the distributed multi-source heterogeneous data index according to an embodiment of the present invention. It involves multiple information systems, such as smart meters, the SCADA (Supervisory Control And Data Acquisition, here distribution data acquisition and monitoring) system, the marketing system and the production system. Each information system is equipped with a data acquisition unit and a big data connector. The data acquisition unit collects the unstructured distribution network data of its information system and performs quality analysis and data cleaning on it, finding and correcting identifiable errors in the data, including checking data consistency and handling invalid and missing values. For example, the data acquisition units collect and process the meter data of smart meters; the telemetry, remote control and remote regulation data of the SCADA system; the user information data of the marketing system; and the equipment information data of the production system. The big data connector transmits the local data index based on the local knowledge graph to the data management center.
In the present invention, the unstructured distribution network data of the information systems are distributed, multi-source and heterogeneous. Moving the collection, quality analysis and cleaning of these data forward to each information system means the data management center does not need to perform these operations, which helps reduce its data fusion computation, storage pressure and data scheduling burden.
Fig. 2 shows a method for integrating massive unstructured distribution network data based on knowledge graph technology according to an embodiment of the present invention, comprising:
Step S10: collecting the unstructured distribution network data of each information system through a data acquisition unit, and performing quality analysis and data cleaning on the unstructured distribution network data of each information system separately.
In the present invention, the unstructured distribution network data come from different information systems, and their structures and types vary, for example user voice data, images and/or text data. The unstructured distribution network data of the information systems are therefore distributed, multi-source and heterogeneous. Moving the collection, quality analysis and cleaning of these data forward to each information system means the data management center does not need to perform these operations, which helps reduce its data fusion computation, storage pressure and data scheduling burden.
Step S20: building, from the processed unstructured distribution network data of each information system, a local data index based on a local knowledge graph, the local data index comprising the local knowledge graph and a local data index table of each information system.
To eliminate the differences between the data of the information systems in business logic, data format and storage, the unstructured distribution network data of each information system must be abstracted into knowledge such as entities, attributes and relations between entities. From this knowledge the local knowledge graph and the local data index table are built, which together form the local data index based on the local knowledge graph.
Step S30: sending the local data index based on the local knowledge graph to the data management center through a big data connector.
The big data connector may be an Oracle big data connector or another standard database big data connector.
Step S40: building, by the data management center, a global data index based on a global knowledge graph, the global data index comprising the global knowledge graph and a global data index table.
As shown in Fig. 3, step S20 comprises:
S201: performing entity extraction on the processed unstructured distribution network data of each information system to obtain an entity library for that data, the entity library comprising the entities, classes and attribute information of the unstructured distribution network data of each information system.
S202: building the local knowledge graph according to the hypernym-hyponym relations between the entities in the entity library.
The local knowledge graph built here is not a general-purpose knowledge graph but a domain-specific knowledge graph for the power distribution network. A class is the category of an entity, such as user entity or equipment entity. An entity is an entity name under a class, such as a user name, an equipment name or a manufacturer name. An attribute is the information and data collected for an entity.
Equipment names mainly include overhead line, cable, tower, distribution transformer, disconnector, circuit breaker, recloser, sectionalizer, pole-mounted load switch, ring main unit, voltage regulator and reactive power compensation capacitor, as well as auxiliary facilities such as the feeder terminal unit (Feeder Terminal Unit, FTU), the data acquisition and monitoring terminal unit (Distribution Terminal Unit, DTU), the distribution transformer monitoring terminal unit (Transformer Terminal Unit, TTU) and the remote terminal unit (Remote Terminal Unit, RTU).
The archive information, outage information, electricity price information, electricity bill information and user information returned by the mobile app, all extracted from the information systems, serve as attributes of user entities. Equipment archives, equipment type, voltage level, the transformer district to which the equipment belongs, location information, GIS information, electric energy meter data, four-branch electricity consumption and status information serve as attributes of equipment entities.
S203: building the local data index table with the entity name of each entity in the entity library as the key, the local data index table comprising local index information corresponding to each entity in the entity library, the local index information comprising attributes, instances, source text, data source name and source database. Here, the data source name is the name of the information system in which the entity resides, and the source database is the database that holds the unstructured distribution network data corresponding to the entity; a database may contain multiple data blocks that store data.
Fig. 4 is a schematic diagram of the local data index table according to an embodiment of the present invention. The first column of the table holds the entity names of the entities in the entity library of one information system; with the entity name as the key, the entities in the library are listed and distinguished. Each row then lists the attributes, instances, source text, data source name and source database corresponding to the entity in that row.
Fig. 5 is a schematic diagram of the local data index based on the local knowledge graph according to an embodiment of the present invention, illustrated with entity name 1 as an example. To index the data of entity name 1 under text 2, the local knowledge graph and local data index table of the information system are consulted to find that the database holding the data of entity name 1 under text 2 is database 1. Within database 1, data block 1, data block 2 and data block n are then found to be the target data blocks, and the required unstructured data are thus indexed. To index the data of entity name 1 under instance 1, the local knowledge graph and local data index table are consulted to find that the corresponding database is database 2, which is dedicated to storing the data of entity name 1 under instance 1. It follows that the local data index based on the local knowledge graph can retrieve the target data a user needs quickly, conveniently and accurately.
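The two lookups walked through for Fig. 5 can be sketched as a nested dictionary access. The layout (entity name, then text or instance, then database and data blocks) and the use of an empty block list to mean "the whole dedicated database" are assumptions made for illustration.

```python
# Hypothetical index mirroring the Fig. 5 walkthrough.
local_index = {
    "entity name 1": {
        "text 2": {
            "database": "database 1",
            "blocks": ["data block 1", "data block 2", "data block n"],
        },
        # Empty block list: the whole database is dedicated (assumption).
        "instance 1": {"database": "database 2", "blocks": []},
    },
}


def lookup(index, entity_name, facet):
    """Return (database, data blocks) for an entity under a text/instance,
    or None when the index has no such entry."""
    entry = index.get(entity_name, {}).get(facet)
    if entry is None:
        return None
    return entry["database"], list(entry["blocks"])


database, blocks = lookup(local_index, "entity name 1", "text 2")
```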
The local knowledge graphs that the information systems and data sources abstract from their unstructured distribution network data are independent of one another. They form "information islands" with many systems and scattered information, which are difficult to bring together for retrieval and analysis. A unified intermediary is therefore needed to share and integrate data between the application systems. Specifically, as shown in Fig. 6, step S40 comprises:
S401: performing conflict detection on the local knowledge graphs of the information systems, the conflict detection comprising entity name conflict detection, hypernym-hyponym relation conflict detection, single-valued attribute conflict detection and multi-valued attribute conflict detection.
Entities extracted from different data sources, such as the marketing system, the production system, the SCADA system and smart meters, will inevitably include cases where different names refer to the same thing or the same name refers to different entities. During data integration, some conflicts between the local knowledge graphs are therefore unavoidable. The local knowledge graphs must be checked so that conflicts can be eliminated in a targeted way, equivalent entities identified and merged, and redundant and contradictory knowledge removed, yielding an accurate global knowledge graph.
S402, if conflicts exist between the local knowledge graphs of the information systems, eliminate the conflicts.
Once the conflicts between the local knowledge graphs of the information systems have been eliminated, an accurate global knowledge graph can be generated. The unstructured distribution network data of the information systems can then be integrated more effectively, which makes it easier for the data control center to manage, query, and index the integrated data.
S403, based on the entities, classes, attribute values, and hypernym-hyponym relations of the local knowledge graphs obtained while detecting and eliminating conflicts, unify the local index information of each entity in the local data index tables and build the global knowledge graph.
S404, build the mapping relations between the global knowledge graph and the local knowledge graphs of the information systems.
That is, the conflict detection and elimination process unifies each entity's index across all local knowledge graphs. Then, at global scope, an index of each local knowledge graph within the global knowledge graph is built, establishing data mapping relations across the local knowledge graphs. On the basis of the local data index tables, information such as the local knowledge graph an entity belongs to and the conflicts it triggered is added for each entity extracted from the data sources. This produces a data index that spans the local knowledge graphs and enables cross-system, cross-database data integration.
S405, according to the mapping relations and the local data index tables, build the global data index table using the entity name of each entity in the entity library as the key. The global data index table contains the global index information corresponding to each entity in the entity library. The global index information includes the relations the entity belongs to, the conflicts it triggered, the local index information, and the local knowledge graphs it belongs to. Fig. 7 is a schematic diagram of the global data index table.
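Steps S403 to S405 can be pictured as merging per-system index tables into one table keyed by entity name. The following is a minimal Python sketch under stated assumptions: the function name `build_global_index`, the field names, and the sample system and entity names are hypothetical and do not come from the patent; they only illustrate how each global row could record the local graphs an entity belongs to, its local index entries, and the conflicts it triggered.

```python
from collections import defaultdict


def build_global_index(local_indexes, conflicts=None):
    """Merge local index tables into one table keyed by entity name.

    local_indexes: graph id -> {entity name -> local index entry}
    conflicts:     entity name -> list of conflicts triggered during S401/S402
    (both layouts are assumptions made for this sketch)
    """
    conflicts = conflicts or {}
    global_index = defaultdict(
        lambda: {"local_graphs": [], "local_entries": {}, "conflicts": []}
    )
    for graph_id, table in local_indexes.items():
        for entity_name, entry in table.items():
            row = global_index[entity_name]
            row["local_graphs"].append(graph_id)       # which local graphs hold the entity
            row["local_entries"][graph_id] = entry     # keep the local index information
    for entity_name, triggered in conflicts.items():
        global_index[entity_name]["conflicts"].extend(triggered)
    return dict(global_index)


# Hypothetical sample data for two information systems.
local_indexes = {
    "marketing": {
        "transformer_T1": {"database": "crm", "attributes": ["capacity"]},
    },
    "production": {
        "transformer_T1": {"database": "pms", "attributes": ["capacity", "voltage"]},
        "feeder_F3": {"database": "pms", "attributes": ["length"]},
    },
}
global_index = build_global_index(
    local_indexes, {"transformer_T1": ["single_value:capacity"]}
)
```

Looking up an entity name in `global_index` then tells a caller which local systems to query, which is the role the description assigns to the global data index table.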
As shown in Fig. 8, step S402 includes:
S4021, assign a priority to the local knowledge graph of each information system.
S4022, if an entity-name conflict or a hypernym-hyponym relation conflict exists between the local knowledge graphs of the information systems, select the entity name or hypernym-hyponym relation of the highest-priority local knowledge graph as the entity name or hypernym-hyponym relation of the global knowledge graph, and modify the entity names and hypernym-hyponym relations of the corresponding local knowledge graphs.
When an entity-name conflict or a hypernym-hyponym relation conflict is detected, the entity name or relation from the highest-priority local knowledge graph is adopted for the global knowledge graph, and the entity or relation is added to the global knowledge graph. The entity names and hypernym-hyponym relations of the corresponding local knowledge graphs are modified to match, so that entity names and hypernym-hyponym relations are globally consistent. When local knowledge graphs conflict, the entity names and hypernym-hyponym relations of the global knowledge graph prevail.
S4023, traverse the single-valued attributes in each local knowledge graph. If a single-valued attribute is found to have multiple values, select the attribute value of the highest-priority local knowledge graph as the value of that attribute in the global knowledge graph, and modify the attribute value in the corresponding local knowledge graphs.
When multiple values are detected for a single-valued attribute, the value from the highest-priority local knowledge graph is adopted as the value of that attribute in the global knowledge graph, and the attribute is added to the global knowledge graph. The attribute values of the corresponding local knowledge graphs are modified to match, so that single-valued attributes are globally consistent. When local knowledge graphs conflict, the attribute value of the global knowledge graph prevails.
S4024, if the values of a multi-valued attribute are found to be inconsistent across the local knowledge graphs, merge the attribute values of all local knowledge graphs to form the attribute value of the global knowledge graph, and modify the attribute value in the corresponding local knowledge graphs.
For a multi-valued attribute, if its values are found to be inconsistent between local knowledge graphs, the values from all local knowledge graphs are merged to form the attribute of the global knowledge graph, and the attribute values of the corresponding local knowledge graphs are modified to match, so that multi-valued attributes are globally consistent. When local knowledge graphs conflict, the attribute value of the global knowledge graph prevails.
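The resolution rules in S4021 to S4024 reduce to two policies: the highest-priority graph wins for single-valued attributes (and, analogously, for entity names and hypernym-hyponym relations), while multi-valued attributes are merged by union. A minimal Python sketch of the attribute case follows. The function name `resolve_attribute`, the data layout, and the sample values are assumptions made for illustration only.

```python
def resolve_attribute(values_by_graph, priority, multi_valued):
    """Return (global_value, updated_local_values) for one attribute.

    values_by_graph: graph id -> value (single-valued) or set of values (multi-valued)
    priority:        graph ids ordered from highest to lowest priority
    """
    if multi_valued:
        # S4024: merge the values of all local graphs.
        global_value = set()
        for values in values_by_graph.values():
            global_value |= set(values)
    else:
        distinct = set(values_by_graph.values())
        if len(distinct) <= 1:
            # No conflict: all graphs already agree (or there is no value).
            global_value = next(iter(distinct), None)
        else:
            # S4023: the highest-priority graph that has a value wins.
            for graph_id in priority:
                if graph_id in values_by_graph:
                    global_value = values_by_graph[graph_id]
                    break
            else:
                raise ValueError("no graph in the priority list supplies a value")
    # Write the global value back so every local graph is consistent with it.
    updated = {
        graph_id: set(global_value) if multi_valued else global_value
        for graph_id in values_by_graph
    }
    return global_value, updated


priority = ["production", "marketing"]  # hypothetical priority order
voltage, voltage_locals = resolve_attribute(
    {"marketing": "10kV", "production": "35kV"}, priority, multi_valued=False
)
owners, owner_locals = resolve_attribute(
    {"marketing": {"dept_a"}, "production": {"dept_b"}}, priority, multi_valued=True
)
```

How priorities are assigned is not specified in the description; the fixed list above is only a placeholder for whatever ordering the data control center defines.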
As shown in Fig. 9, step S201 includes:
S2011, determine whether the processed unstructured distribution network data of each information system is text data.
Unstructured distribution network data can take different forms, such as user speech, images, and/or text, and the entity extraction method differs by data type.
S2012, if the processed unstructured distribution network data of each information system is text data, extract entity, class, and attribute information using preset rules and a dictionary-based method.
For text data with a relatively fixed format, such as the equipment files, operation manuals, and standards in the production system, the entities, classes, and attribute information are extracted with rule-based and dictionary-based methods. Power grid experts are invited to formulate entity extraction rules suited to the power grid industry, and a dictionary-based method extracts entities such as device names, device types, person names, place names, organization names, and specific times from the text, together with their class and attribute information.
S2013, if the processed unstructured distribution network data of each information system is not text data, convert it into text.
S2014, segment the text into words, analyze the syntactic structure of the text and the dependencies between the words in each sentence using a parsing algorithm based on natural language processing, and then extract entity, class, and attribute information.
When the unstructured distribution network data is user speech, it is converted into text using speech conversion based on a hidden Markov model (HMM). When it is an image, the characters in the image are converted into text using image recognition based on a support vector machine (SVM). The text is then segmented into words using a string-matching-based natural language word segmentation technique, and the entities, classes, and attributes are extracted: the text is first segmented, a natural-language-processing parsing algorithm then analyzes the syntactic structure of each sentence and the dependencies between its words, and the entities, classes, and attributes are identified.
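The string-matching word segmentation and dictionary lookup described above can be sketched in Python as follows. The description does not say which string-matching strategy is used; forward maximum matching is assumed here as one common choice. The function names, the dictionary contents, and the sample sentence are all hypothetical.

```python
def forward_max_match(text, dictionary):
    """Segment text by always taking the longest dictionary term at the cursor."""
    max_len = max(len(term) for term in dictionary)
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first; a single character is the fallback.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


def extract_entities(text, dictionary):
    """Return (entity, class) pairs for every token found in the dictionary."""
    return [
        (token, dictionary[token])
        for token in forward_max_match(text, dictionary)
        if token in dictionary
    ]


# Hypothetical domain dictionary: term -> class.
DICTIONARY = {
    "配电变压器": "device type",   # distribution transformer
    "变压器": "device type",       # transformer
    "城南变电站": "place",         # Chengnan substation
    "故障": "event",               # fault
}
# "Chengnan substation distribution transformer fault"
tokens = forward_max_match("城南变电站配电变压器故障", DICTIONARY)
entities = extract_entities("城南变电站配电变压器故障", DICTIONARY)
```

Because the longest match is tried first, the longer term for "distribution transformer" is preferred over the shorter term for "transformer". Syntactic and dependency parsing, which the description applies after segmentation, are outside the scope of this sketch.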
Once the entities, attributes, and so on have been extracted, the entity library is obtained. On this basis, the relation between two entities is identified using an entity relation extraction technique based on a support vector machine model with a string sequence kernel, which establishes the links between entities. Step S202 includes:
Taking inner products over the subsequences of a given length in the string sequences of the textualized unstructured distribution network data, to calculate the similarity between sentences;
Performing statistical learning with the string sequence kernel as the kernel of the support vector machine to obtain the relations between the entities in the entity library, and building the local knowledge graph as the triple shown in the following formula:
$G_L = (E, R, S)$
where $G_L$ is the local knowledge graph; $E = \{e_1, e_2, \ldots, e_{|E|}\}$ is the set of entities in the entity library, containing $|E|$ distinct entities; $R = \{r_1, r_2, \ldots, r_{|R|}\}$ is the set of entity relations in the entity library, containing $|R|$ distinct entity relations; and $S \subseteq E \times R \times E$ is the set of triples in the local knowledge graph.
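The sentence similarity used in step S202 is an inner product over fixed-length subsequences of the two strings. A minimal Python sketch follows. It counts contiguous substrings only (a p-spectrum kernel), which is a simplification assumed for illustration and not necessarily the exact kernel of the patent; the function names and the default length `p=2` are hypothetical.

```python
import math
from collections import Counter


def spectrum_features(text, p):
    """Count every contiguous substring of length p in the text."""
    return Counter(text[i:i + p] for i in range(len(text) - p + 1))


def spectrum_kernel(a, b, p=2, normalize=True):
    """Inner product of the two substring-count vectors, optionally normalized."""
    fa, fb = spectrum_features(a, p), spectrum_features(b, p)
    dot = sum(count * fb[gram] for gram, count in fa.items())
    if not normalize:
        return dot
    norm = math.sqrt(
        sum(c * c for c in fa.values()) * sum(c * c for c in fb.values())
    )
    return dot / norm if norm else 0.0


identical = spectrum_kernel("abc", "abc")
disjoint = spectrum_kernel("abc", "xyz")
raw = spectrum_kernel("abab", "ab", normalize=False)
```

A kernel of this shape could be evaluated on all sentence pairs and handed to a support vector machine as a precomputed kernel matrix; the training data and relation labels that such a classifier would need are not described in this section.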
The basic forms of a triple are (entity 1, relation, entity 2) and (concept, attribute, attribute value). Through the triple set, a mapping can be established between any entity and the raw data in which it appears; this mapping is implemented by the local data index table. For each entity extracted from a data source, an index table is built with the entity name as the key. The index table contains a series of data-related information, such as attributes, data source name, the relations the entity belongs to, source database, source table, source text, instances, and the local knowledge graph the entity belongs to. With the local data index table, data can be located quickly within a single distribution network information system and then queried and extracted.
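The triple form $G_L = (E, R, S)$ and the entity-keyed local data index table can be sketched as plain Python data structures. The class name `LocalKnowledgeGraph`, the method names, the index field names, and the sample values are assumptions chosen for illustration; the patent lists the kinds of information the index holds but not a concrete layout.

```python
from dataclasses import dataclass, field


@dataclass
class LocalKnowledgeGraph:
    """Local knowledge graph G_L = (E, R, S) plus its local data index table."""
    graph_id: str
    entities: set = field(default_factory=set)    # E
    relations: set = field(default_factory=set)   # R
    triples: set = field(default_factory=set)     # S, a subset of E x R x E
    index: dict = field(default_factory=dict)     # entity name -> index entry

    def add_triple(self, head, relation, tail):
        """Add (entity 1, relation, entity 2) and keep E and R in sync."""
        self.entities.update((head, tail))
        self.relations.add(relation)
        self.triples.add((head, relation, tail))

    def index_entity(self, name, **location):
        """Record where the raw data for an entity lives (hypothetical fields)."""
        entry = self.index.setdefault(name, {"graph": self.graph_id})
        entry.update(location)
        return entry


graph = LocalKnowledgeGraph("production")
graph.add_triple("transformer_T1", "installed_at", "substation_S2")
graph.index_entity(
    "transformer_T1",
    data_source="pms",
    database="assets",
    table="equipment",
    attributes=["capacity", "voltage"],
)
```

Keeping the index alongside the triples means an entity found through the graph can be traced straight back to its source database and table.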
In step S401, the entity-name conflict detection method includes:
Calculating the similarity between an entity A of one local knowledge graph and an entity B of another local knowledge graph according to the following formula:
$\mathrm{Sim}(A, B) = \mathrm{Dis}(L_A, L_B) + \mathrm{Dis}(S_A, S_B)$
where $\mathrm{Sim}(A, B)$ is the similarity of entity A and entity B; $\mathrm{Dis}(L_A, L_B)$ is the distance between the class $L_A$ of entity A and the class $L_B$ of entity B; and $\mathrm{Dis}(S_A, S_B)$ is the distance between the attributes $S_A$ of entity A and the attributes $S_B$ of entity B;
If the similarity of entity A and entity B is greater than a threshold, determining whether the entity names of entity A and entity B are the same;
If the entity names of entity A and entity B are the same, the detection result is that an entity-name conflict exists.
An index, namely the local data index table, is built in each local knowledge graph over the entities and their classes and attributes. Then, for an entity A in one local knowledge graph, candidate entities B are looked up in the indexes of the other local knowledge graphs and the similarity Sim(A, B) is calculated. If the class $L_A$ and attributes $S_A$ of the entity in the current local knowledge graph are very similar to the class $L_B$ and attributes $S_B$ of some entity B in another local knowledge graph, but the entity names differ, an entity-name conflict is detected.
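The check in the paragraph above, where entities with very similar classes and attributes but different names are flagged, can be sketched in Python. The patent does not define how Dis is computed; this sketch assumes a Jaccard overlap score for each term (higher means closer), so that Sim ranges from 0 to 2. The function names, the threshold of 1.5, and the sample entities are hypothetical.

```python
def jaccard(a, b):
    """Overlap score between two collections, from 0.0 (disjoint) to 1.0 (equal)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0


def similarity(entity_a, entity_b):
    """Sim(A, B) = Dis(L_A, L_B) + Dis(S_A, S_B), with Jaccard assumed for Dis."""
    return (
        jaccard(entity_a["classes"], entity_b["classes"])
        + jaccard(entity_a["attributes"], entity_b["attributes"])
    )


def find_name_conflicts(entity_a, other_entities, threshold=1.5):
    """Names of entities that look like entity_a but are named differently."""
    return [
        b["name"]
        for b in other_entities
        if b["name"] != entity_a["name"] and similarity(entity_a, b) > threshold
    ]


entity_a = {
    "name": "distribution transformer",
    "classes": {"equipment"},
    "attributes": {"capacity", "voltage"},
}
other_graph = [
    {"name": "dist. transformer", "classes": {"equipment"},
     "attributes": {"capacity", "voltage"}},
    {"name": "feeder", "classes": {"line"}, "attributes": {"length"}},
]
conflicts = find_name_conflicts(entity_a, other_graph)
```

In practice the threshold would have to be tuned on real data, and a class hierarchy distance could replace the set overlap used here for the class term.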
In step S401, the hypernym-hyponym relation conflict detection method includes:
Extracting the hypernym-hyponym relation graph of an entity A in one local knowledge graph;
Finding, in the other local knowledge graphs, the set of entities that have hypernym-hyponym relations related to entity A, and extracting the hypernym-hyponym relation graph of each entity in that set;
Obtaining the merged hypernym-hyponym relation graph according to the following formula:
$G = G_A \cup G_{q_1} \cup G_{q_2} \cup \cdots \cup G_{q_n}$
where $G$ is the merged hypernym-hyponym relation graph; $G_A$ is the hypernym-hyponym relation graph of entity A; $G_{q_1}, G_{q_2}, \ldots, G_{q_n}$ are the hypernym-hyponym relation graphs of the entities in the entity set; and $n$ is the number of entities in the entity set;
Deleting every vertex with in-degree zero from the merged hypernym-hyponym relation graph together with its outgoing edges, and repeating until no further vertex can be output from the merged graph;
If all nodes in the merged hypernym-hyponym relation graph have been deleted, the detection result is that no hypernym-hyponym relation conflict exists; if at least one node remains in the merged graph, the detection result is that a hypernym-hyponym relation conflict exists.
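Repeatedly deleting zero in-degree vertices is the standard topological-sort test for cycles: if any vertex survives, the merged graph contains a cycle, meaning some entity ends up as both hypernym and hyponym of another. A minimal Python sketch follows, assuming each relation graph is given as a set of (hypernym, hyponym) edges; the function name and the sample edges are hypothetical.

```python
from collections import defaultdict, deque


def has_hypernym_conflict(*edge_sets):
    """Merge the edge sets and report whether the merged graph contains a cycle."""
    edges = set().union(*edge_sets)          # G = G_A ∪ G_q1 ∪ ... ∪ G_qn
    out_edges = defaultdict(set)
    in_degree = defaultdict(int)
    for hypernym, hyponym in edges:
        out_edges[hypernym].add(hyponym)
        in_degree[hyponym] += 1
        in_degree[hypernym] += 0             # make sure every vertex is registered

    # Start from all vertices with in-degree zero and delete them one by one.
    queue = deque(v for v, degree in in_degree.items() if degree == 0)
    removed = 0
    while queue:
        vertex = queue.popleft()
        removed += 1
        for child in out_edges[vertex]:      # deleting the vertex removes its out-edges
            in_degree[child] -= 1
            if in_degree[child] == 0:
                queue.append(child)

    # Vertices that could never reach in-degree zero lie on or behind a cycle.
    return removed < len(in_degree)


g_a = {("equipment", "transformer")}
g_q1 = {("transformer", "distribution transformer")}
g_q2 = {("distribution transformer", "equipment")}   # contradicts the two above

consistent = has_hypernym_conflict(g_a, g_q1)
conflicting = has_hypernym_conflict(g_a, g_q1, g_q2)
```

The check runs in time linear in the number of vertices and edges of the merged graph, which matters when many local graphs contribute relations for the same entity.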
Fig. 10 shows a massive unstructured distribution network data integration method based on knowledge graph technology according to a further embodiment of the invention. After step S203, the method further includes:
S50, according to the unstructured distribution network data of new devices and/or new users, update the local data index based on the local knowledge graph and the global data index based on the global knowledge graph.
The data control center is responsible for maintaining and updating the global knowledge graph, the local knowledge graphs, the global data index table, and the local data indexes, and for managing data exchange. Updating the local data index based on the local knowledge graph and the global data index based on the global knowledge graph according to the unstructured distribution network data of new devices and/or new users keeps the data integrated at the data control center timely and accurate. When new distribution network devices and information systems are added, the integration can adapt to the changing state of the distribution network and keep data management centralized. When data related to an entity needs to be queried, the global data index table returns the data's associated information and the database it belongs to, thereby integrating the data of the information systems.
Specifically, as shown in Fig. 11, step S50 includes:
S501, obtain the unstructured distribution network data of new devices and/or new users, and extract the entity, class, and attribute information of that data;
S502, determine whether the entities and classes of the unstructured distribution network data of the new devices and/or new users match the entities and classes in one of the local knowledge graphs;
S503, if they match, fuse the entities of the unstructured distribution network data of the new devices and/or new users into that local knowledge graph, update the corresponding entity attributes and the hypernym-hyponym relations between entities, and update the local data index table and the global data index based on the global knowledge graph according to the fused local knowledge graph;
S504, if they do not match, create new entities and classes, and update the local data index based on the local knowledge graph and the global data index based on the global knowledge graph according to the new entities and classes.
The entities, classes, and attributes in the global knowledge graph come from multiple local knowledge graphs. They are general and are highly effective at recognizing distribution network data. Using the global and local knowledge graphs to quickly extract the entities and attributes of a newly added data source improves the speed and accuracy of integrating that source, which optimizes data integration. For entities that the knowledge graphs cannot recognize, the corresponding entities, classes, and attributes are extracted and matched against the classes and entities in the existing knowledge graphs. If the matching degree is high, they are fused and the entity attributes and the hypernym-hyponym relations between entities are updated; otherwise a new class is created. The local data index based on the local knowledge graph and the global data index based on the global knowledge graph are then updated, which optimizes the knowledge graphs.
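The match-then-fuse-or-create flow of S501 to S504 can be sketched in Python. The patent does not define how the matching degree is measured; this sketch assumes a match means the same class plus an attribute overlap at or above a threshold. The function names, the threshold of 0.5, the data layout, and the sample entities are all hypothetical.

```python
def attribute_overlap(a, b):
    """Share of attributes two entities have in common, from 0.0 to 1.0."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0


def integrate(new_entity, graph, graph_id, global_index, threshold=0.5):
    """Fuse new_entity into graph if it matches a known entity, else create it.

    graph:        entity name -> {"class": str, "attributes": set}
    global_index: entity name -> list of graph ids that contain the entity
    Returns the name under which the entity is stored.
    """
    for name, known in graph.items():
        same_class = known["class"] == new_entity["class"]
        overlap = attribute_overlap(known["attributes"], new_entity["attributes"])
        if same_class and overlap >= threshold:
            # S503: fuse into the matching entity and extend its attributes.
            known["attributes"] |= set(new_entity["attributes"])
            stored_name = name
            break
    else:
        # S504: nothing matched, so create a new entity (and implicitly its class).
        stored_name = new_entity["name"]
        graph[stored_name] = {
            "class": new_entity["class"],
            "attributes": set(new_entity["attributes"]),
        }
    # Keep the global index in step with the local graph.
    graphs = global_index.setdefault(stored_name, [])
    if graph_id not in graphs:
        graphs.append(graph_id)
    return stored_name


graph = {"transformer_T1": {"class": "transformer",
                            "attributes": {"capacity", "voltage"}}}
global_index = {"transformer_T1": ["production"]}

fused = integrate(
    {"name": "T1 transformer", "class": "transformer",
     "attributes": {"capacity", "voltage", "oil_temp"}},
    graph, "production", global_index,
)
created = integrate(
    {"name": "smart_meter_M9", "class": "meter", "attributes": {"reading"}},
    graph, "production", global_index,
)
```

Updating hypernym-hyponym relations after a fusion, which S503 also requires, is left out here to keep the sketch short.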
As the above technical solution shows, the invention provides a massive unstructured distribution network data integration method based on knowledge graph technology. Big data connectors and data acquisition units are deployed in each information system, such as the marketing system, the production system, the power dispatching data acquisition and monitoring system, and electric energy meters. Moving the collection, quality analysis, and data cleaning of distributed multi-source heterogeneous data forward into each information system reduces the data fusion computation, storage pressure, and data scheduling burden of the data control center. The data acquisition unit performs data sampling, quality analysis, and data cleaning on the unstructured distribution network data of each information system, such as user speech, pictures, and text. The processed unstructured distribution network data is used to build the local knowledge graph and local data index of each information system, which are transmitted to the data control center through the big data connector. The data control center detects and eliminates the conflicts between the local knowledge graphs and builds a global knowledge graph and a global data index table that apply to all the data, so that the data sources are integrated through the global knowledge graph and the global data index table. When new data is integrated, the global knowledge graph can be used to optimize the integration, and the collected unstructured distribution network data of new devices and/or new users is used to update the local data index based on the local knowledge graph and the global data index based on the global knowledge graph. As more devices and data are integrated, the local and global knowledge graphs are continuously updated, which facilitates subsequent querying and big data analysis of massive distribution network data.
Other embodiments of the invention will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention that follow its general principles and include common knowledge or customary technical means in the art that are not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the invention indicated by the claims.
It should be understood that the invention is not limited to the precise structure described above and shown in the drawings, and that various modifications and changes can be made without departing from its scope. The scope of the invention is limited only by the appended claims.
Claims (10)
1. A massive unstructured distribution network data integration method based on knowledge graph technology, characterized by comprising:
Collecting the unstructured distribution network data of each information system with a data acquisition unit, and performing quality analysis and data cleaning on the unstructured distribution network data of each information system separately;
Building, from the processed unstructured distribution network data of each information system, a local data index based on a local knowledge graph, the local data index based on the local knowledge graph comprising the local knowledge graph and the local data index table of each information system;
Sending the local data index based on the local knowledge graph to a data control center through a big data connector;
Building, by the data control center, a global data index based on a global knowledge graph, the global data index based on the global knowledge graph comprising the global knowledge graph and a global data index table.
2. The method according to claim 1, characterized in that the step of building the local data index based on the local knowledge graph from the processed unstructured distribution network data of each information system comprises:
Performing entity extraction on the processed unstructured distribution network data of each information system to obtain an entity library of the unstructured distribution network data of each information system, the entity library comprising the entity, class, and attribute information of the unstructured distribution network data of each information system;
Building the local knowledge graph according to the hypernym-hyponym relations of the entities in the entity library;
Building the local data index table using the entity name of each entity in the entity library as the key, the local data index table comprising the local index information corresponding to each entity in the entity library, the local index information comprising attributes, instances, source text, data source name, and source database.
3. The method according to claim 1, characterized in that the step of building, by the data control center, the global data index based on the global knowledge graph comprises:
Performing conflict detection on the local knowledge graphs of the information systems, the conflict detection comprising entity-name conflict detection, hypernym-hyponym relation conflict detection, single-valued attribute conflict detection, and multi-valued attribute conflict detection;
If conflicts exist between the local knowledge graphs of the information systems, eliminating the conflicts;
Based on the entities, classes, attribute values, and hypernym-hyponym relations of the local knowledge graphs obtained while detecting and eliminating conflicts, unifying the local index information of each entity in the local data index tables, and building the global knowledge graph;
Building the mapping relations between the global knowledge graph and the local knowledge graphs of the information systems;
According to the mapping relations and the local data index tables, building the global data index table using the entity name of each entity in the entity library as the key, the global data index table comprising the global index information corresponding to each entity in the entity library, the global index information comprising the relations the entity belongs to, the conflicts it triggered, the local index information, and the local knowledge graphs it belongs to.
4. The method according to claim 3, characterized in that the step of eliminating the conflicts if conflicts exist between the local knowledge graphs of the information systems comprises:
Assigning a priority to the local knowledge graph of each information system;
If an entity-name conflict or a hypernym-hyponym relation conflict exists between the local knowledge graphs of the information systems, selecting the entity name or hypernym-hyponym relation of the highest-priority local knowledge graph as the entity name or hypernym-hyponym relation of the global knowledge graph, and modifying the entity names and hypernym-hyponym relations of the corresponding local knowledge graphs;
Traversing the single-valued attributes in each local knowledge graph, and if a single-valued attribute is found to have multiple values, selecting the attribute value of the highest-priority local knowledge graph as the value of that attribute in the global knowledge graph, and modifying the attribute value of the corresponding local knowledge graphs;
If the values of a multi-valued attribute are found to be inconsistent across the local knowledge graphs, merging the attribute values of all local knowledge graphs to form the attribute value of the global knowledge graph, and modifying the attribute value of the corresponding local knowledge graphs.
5. The method according to claim 2, characterized in that the step of performing entity extraction on the processed unstructured distribution network data of each information system comprises:
Determining whether the processed unstructured distribution network data of each information system is text data;
If the processed unstructured distribution network data of each information system is text data, extracting entity, class, and attribute information using preset rules and a dictionary-based method;
If the processed unstructured distribution network data of each information system is not text data, converting the processed unstructured distribution network data of each information system into text;
Segmenting the text into words, analyzing the syntactic structure of the text and the dependencies between the words in each sentence using a parsing algorithm based on natural language processing, and then extracting entity, class, and attribute information.
6. The method according to claim 2, characterized in that the step of building the local knowledge graph according to the relations of the entities in the entity library comprises:
Taking inner products over the subsequences of a given length in the string sequences of the textualized unstructured distribution network data, to calculate the similarity between sentences;
Performing statistical learning with the string sequence kernel as the kernel of a support vector machine to obtain the relations between the entities in the entity library, and building the local knowledge graph as the triple shown in the following formula:
$G_L = (E, R, S)$
where $G_L$ is the local knowledge graph; $E = \{e_1, e_2, \ldots, e_{|E|}\}$ is the set of entities in the entity library, containing $|E|$ distinct entities; $R = \{r_1, r_2, \ldots, r_{|R|}\}$ is the set of entity relations in the entity library, containing $|R|$ distinct entity relations; and $S \subseteq E \times R \times E$ is the set of triples in the local knowledge graph.
7. The method according to claim 3, characterized in that the entity-name conflict detection method comprises:
Calculating the similarity between an entity A of one local knowledge graph and an entity B of another local knowledge graph according to the following formula:
$\mathrm{Sim}(A, B) = \mathrm{Dis}(L_A, L_B) + \mathrm{Dis}(S_A, S_B)$
where $\mathrm{Sim}(A, B)$ is the similarity of entity A and entity B; $\mathrm{Dis}(L_A, L_B)$ is the distance between the class $L_A$ of entity A and the class $L_B$ of entity B; and $\mathrm{Dis}(S_A, S_B)$ is the distance between the attributes $S_A$ of entity A and the attributes $S_B$ of entity B;
If the similarity of entity A and entity B is greater than a threshold, determining whether the entity names of entity A and entity B are the same;
If the entity names of entity A and entity B are the same, the detection result is that an entity-name conflict exists.
8. The method according to claim 3, characterized in that the hypernym-hyponym relation conflict detection method comprises:
Extracting the hypernym-hyponym relation graph of an entity A in one local knowledge graph;
Finding, in the other local knowledge graphs, the set of entities that have hypernym-hyponym relations related to entity A, and extracting the hypernym-hyponym relation graph of each entity in that set;
Obtaining the merged hypernym-hyponym relation graph according to the following formula:
$G = G_A \cup G_{q_1} \cup G_{q_2} \cup \cdots \cup G_{q_n}$
where $G$ is the merged hypernym-hyponym relation graph; $G_A$ is the hypernym-hyponym relation graph of entity A; $G_{q_1}, G_{q_2}, \ldots, G_{q_n}$ are the hypernym-hyponym relation graphs of the entities in the entity set; and $n$ is the number of entities in the entity set;
Deleting every vertex with in-degree zero from the merged hypernym-hyponym relation graph together with its outgoing edges, and repeating until no further vertex can be output from the merged graph;
If all nodes in the merged hypernym-hyponym relation graph have been deleted, the detection result is that no hypernym-hyponym relation conflict exists; if at least one node remains in the merged graph, the detection result is that a hypernym-hyponym relation conflict exists.
9. The method according to claim 2, characterized in that the method further comprises: updating the local data index based on the local knowledge graph and the global data index based on the global knowledge graph according to the unstructured distribution network data of new devices and/or new users.
10. The method according to claim 9, characterized in that the step of updating the local data index based on the local knowledge graph and the global data index based on the global knowledge graph according to the unstructured distribution network data of new devices and/or new users comprises:
Obtaining the unstructured distribution network data of new devices and/or new users, and extracting the entity, class, and attribute information of that data;
Determining whether the entities and classes of the unstructured distribution network data of the new devices and/or new users match the entities and classes in one of the local knowledge graphs;
If they match, fusing the entities of the unstructured distribution network data of the new devices and/or new users into that local knowledge graph, updating the corresponding entity attributes and the hypernym-hyponym relations between entities, and updating the local data index table and the global data index based on the global knowledge graph according to the fused local knowledge graph;
If they do not match, creating new entities and classes, and updating the local data index based on the local knowledge graph and the global data index based on the global knowledge graph according to the new entities and classes.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710593929.3A CN107330125B (en) | 2017-07-20 | 2017-07-20 | Mass unstructured distribution network data integration method based on knowledge graph technology |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710593929.3A CN107330125B (en) | 2017-07-20 | 2017-07-20 | Mass unstructured distribution network data integration method based on knowledge graph technology |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107330125A (en) | 2017-11-07 |
CN107330125B (en) | 2020-06-30 |
Family
ID=60226885
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710593929.3A Active CN107330125B (en) | 2017-07-20 | 2017-07-20 | Mass unstructured distribution network data integration method based on knowledge graph technology |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107330125B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103488724A (en) * | 2013-09-16 | 2014-01-01 | 复旦大学 | Book-oriented reading field knowledge map construction method |
CN105630901A (en) * | 2015-12-21 | 2016-06-01 | 清华大学 | Knowledge graph representation learning method |
US20160328443A1 (en) * | 2015-05-06 | 2016-11-10 | Vero Analytics, Inc. | Knowledge Graph Based Query Generation |
CN106447346A (en) * | 2016-08-29 | 2017-02-22 | 北京中电普华信息技术有限公司 | Method and system for construction of intelligent electric power customer service system |
CN106886543A (en) * | 2015-12-16 | 2017-06-23 | 清华大学 | Knowledge graph representation learning method and system incorporating entity descriptions |
CN106897273A (en) * | 2017-04-12 | 2017-06-27 | 福州大学 | Dynamic network security early-warning method based on knowledge graph |
Non-Patent Citations (2)
Title |
---|
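吴运兵 (Wu Yunbing): "Research on knowledge graph construction methods based on multiple data sources", Journal of Fuzhou University (Natural Science Edition) * |
胡芳槐 (Hu Fanghuai): "Research on Chinese knowledge graph construction methods based on multiple data sources", China Doctoral Dissertations Full-text Database (Electronic Journal) * |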
Cited By (50)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107908738A (en) * | 2017-11-15 | 2018-04-13 | 昆明能讯科技有限责任公司 | Implementation method of an enterprise-level knowledge graph search engine based on an electric power professional dictionary |
CN108595449A (en) * | 2017-11-23 | 2018-09-28 | 北京科东电力控制系统有限责任公司 | Construction and application method of a dispatching automation system knowledge graph |
CN108304493A (en) * | 2018-01-10 | 2018-07-20 | 深圳市腾讯计算机系统有限公司 | Hypernym mining method and device based on knowledge graph |
CN108304493B (en) * | 2018-01-10 | 2020-06-12 | 深圳市腾讯计算机系统有限公司 | Hypernym mining method and device based on knowledge graph |
WO2019137033A1 (en) * | 2018-01-12 | 2019-07-18 | 扬州大学 | Automatic construction method for software bug oriented domain knowledge graph |
US11386136B2 (en) | 2018-01-12 | 2022-07-12 | Yangzhou University | Automatic construction method of software bug knowledge graph |
CN110297910A (en) * | 2018-03-23 | 2019-10-01 | 国际商业机器公司 | Managing a distributed knowledge graph |
CN109284393A (en) * | 2018-08-28 | 2019-01-29 | 合肥工业大学 | Fusion method for family tree character attribute names |
CN109284393B (en) * | 2018-08-28 | 2020-11-06 | 合肥工业大学 | Fusion method for family tree character attribute names |
CN109241078A (en) * | 2018-08-30 | 2019-01-18 | 中国地质大学(武汉) | Knowledge graph organization query method based on mixed database |
CN109241078B (en) * | 2018-08-30 | 2021-07-20 | 中国地质大学(武汉) | Knowledge graph organization query method based on mixed database |
CN109189938A (en) * | 2018-08-31 | 2019-01-11 | 北京字节跳动网络技术有限公司 | Method and apparatus for updating a knowledge graph |
CN109284394A (en) * | 2018-09-12 | 2019-01-29 | 青岛大学 | Method for constructing an enterprise knowledge graph from a multi-source data integration perspective |
CN109446343A (en) * | 2018-11-05 | 2019-03-08 | 上海德拓信息技术股份有限公司 | Public safety knowledge graph construction method |
CN109446343B (en) * | 2018-11-05 | 2020-10-27 | 上海德拓信息技术股份有限公司 | Public safety knowledge graph construction method |
CN109446385B (en) * | 2018-11-14 | 2022-06-14 | 中国科学院计算技术研究所 | Method for establishing network resource equipment map and using method of equipment map |
CN109446385A (en) * | 2018-11-14 | 2019-03-08 | 中国科学院计算技术研究所 | Method for establishing network resource equipment map and using method of equipment map |
CN109582958B (en) * | 2018-11-20 | 2023-07-18 | 厦门大学深圳研究院 | Disaster story line construction method and device |
CN109582958A (en) * | 2018-11-20 | 2019-04-05 | 厦门大学深圳研究院 | Disaster story line construction method and device |
CN109766445A (en) * | 2018-12-13 | 2019-05-17 | 平安科技(深圳)有限公司 | Knowledge graph construction method and data processing device |
CN109766445B (en) * | 2018-12-13 | 2024-03-26 | 平安科技(深圳)有限公司 | Knowledge graph construction method and data processing device |
CN109783605A (en) * | 2018-12-14 | 2019-05-21 | 天津大学 | Scientific and technological service docking method based on Bayesian inference technology |
CN109783605B (en) * | 2018-12-14 | 2021-05-11 | 天津大学 | Scientific and technological service docking method based on Bayesian inference technology |
CN109685684A (en) * | 2018-12-26 | 2019-04-26 | 武汉大学 | Low-voltage distribution network topology verification method based on knowledge graph |
CN109933582A (en) * | 2019-03-11 | 2019-06-25 | 国家电网有限公司 | Data processing method and device |
CN110019150A (en) * | 2019-04-11 | 2019-07-16 | 软通动力信息技术有限公司 | Data governance method, system and electronic device |
CN111858948A (en) * | 2019-04-30 | 2020-10-30 | 杭州海康威视数字技术股份有限公司 | Ontology construction method and device, electronic equipment and storage medium |
CN110457482A (en) * | 2019-06-06 | 2019-11-15 | 福建奇点时空数字科技有限公司 | Intelligent information service system based on knowledge graph |
CN110275966A (en) * | 2019-07-01 | 2019-09-24 | 科大讯飞(苏州)科技有限公司 | Knowledge extraction method and device |
CN110275966B (en) * | 2019-07-01 | 2021-10-01 | 科大讯飞(苏州)科技有限公司 | Knowledge extraction method and device |
CN110427471A (en) * | 2019-07-26 | 2019-11-08 | 四川长虹电器股份有限公司 | Natural language question-answering method and system based on knowledge graph |
CN110489475A (en) * | 2019-08-14 | 2019-11-22 | 广东电网有限责任公司 | Multi-source heterogeneous data processing method, system and related apparatus |
CN111026874A (en) * | 2019-11-22 | 2020-04-17 | 海信集团有限公司 | Data processing method and server of knowledge graph |
CN111046115A (en) * | 2019-12-24 | 2020-04-21 | 四川文轩教育科技有限公司 | Knowledge graph-based heterogeneous database interconnection management method |
CN111046115B (en) * | 2019-12-24 | 2023-08-08 | 四川文轩教育科技有限公司 | Heterogeneous database interconnection management method based on knowledge graph |
CN111639082B (en) * | 2020-06-08 | 2022-12-23 | 成都信息工程大学 | Object storage management method and system of billion-level node scale knowledge graph based on Ceph |
CN111639082A (en) * | 2020-06-08 | 2020-09-08 | 成都信息工程大学 | Object storage management method and system of billion-level node scale knowledge graph based on Ceph |
CN112241458B (en) * | 2020-10-13 | 2022-10-28 | 北京百分点科技集团股份有限公司 | Text knowledge structuring processing method, device, equipment and readable storage medium |
CN112241458A (en) * | 2020-10-13 | 2021-01-19 | 北京百分点信息科技有限公司 | Text knowledge structuring processing method, device, equipment and readable storage medium |
CN112256882A (en) * | 2020-10-16 | 2021-01-22 | 美林数据技术股份有限公司 | Multi-similarity-based cross-system network entity fusion method |
CN112200382A (en) * | 2020-10-27 | 2021-01-08 | 支付宝(杭州)信息技术有限公司 | Training method and device of risk prediction model |
CN112307172A (en) * | 2020-10-31 | 2021-02-02 | 平安科技(深圳)有限公司 | Semantic parsing equipment, method, terminal and storage medium |
CN112307172B (en) * | 2020-10-31 | 2023-08-01 | 平安科技(深圳)有限公司 | Semantic analysis device, semantic analysis method, terminal and storage medium |
CN112287123A (en) * | 2020-11-19 | 2021-01-29 | 国网湖南省电力有限公司 | Entity alignment method and device based on edge type attention mechanism |
CN112650865B (en) * | 2021-01-27 | 2021-11-09 | 南威软件股份有限公司 | Method and system for solving multi-region license data conflict based on flexible rule |
CN112650865A (en) * | 2021-01-27 | 2021-04-13 | 南威软件股份有限公司 | Method and system for solving multi-region license data conflict based on flexible rule |
CN113159320A (en) * | 2021-03-08 | 2021-07-23 | 北京航空航天大学 | Scientific and technological resource data integration method and device based on knowledge graph |
CN113157697A (en) * | 2021-04-19 | 2021-07-23 | 山东艺术学院 | Mingqing custom music score database system |
CN113177095A (en) * | 2021-04-29 | 2021-07-27 | 北京明略软件系统有限公司 | Enterprise knowledge management method, system, electronic equipment and storage medium |
CN117556059A (en) * | 2024-01-12 | 2024-02-13 | 天津滨电电力工程有限公司 | Detection and correction method based on knowledge fusion and reasoning charging station data |
Also Published As
Publication number | Publication date |
---|---|
CN107330125B (en) | 2020-06-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107330125A (en) | Mass unstructured distribution network data integration method based on knowledge graph technology | |
CN108491378B (en) | Intelligent response system for operation and maintenance of electric power information | |
CN106447346A (en) | Method and system for construction of intelligent electric power customer service system | |
CN107766483A (en) | Interactive question-answering method and system based on knowledge graph | |
CN109446305A (en) | Construction method and system of an intelligent tourism customer service system | |
CN107169079A (en) | Domain text knowledge extraction method based on DeepDive | |
CN113157860B (en) | Electric power equipment maintenance knowledge graph construction method based on small-scale data | |
CN114077674A (en) | Power grid dispatching knowledge graph data optimization method and system | |
CN115438199A (en) | Knowledge platform system based on smart city scene data middling platform technology | |
CN115809833A (en) | Intelligent monitoring method and device for capital construction project based on portrait technology | |
CN115660464A (en) | Intelligent equipment maintenance method and terminal based on big data and physical ID | |
CN116028646A (en) | Power grid dispatching field knowledge graph construction method based on machine learning | |
CN114091912A (en) | Method for analyzing topological transaction of medium-voltage power grid by using knowledge graph | |
Yin et al. | Sentence-BERT and k-means based clustering technology for scientific and technical literature | |
Wang et al. | Automatic scoring of Chinese fill-in-the-blank questions based on improved P-means | |
CN116108203A (en) | Method, system, storage medium and equipment for constructing power grid panoramic dispatching knowledge graph and managing power grid equipment | |
CN115563968A (en) | Water and electricity transportation and inspection knowledge natural language artificial intelligence system and method | |
CN114792140A (en) | Transformer substation defect analysis system based on knowledge graph | |
CN113642835A (en) | Work ticket generation method based on data similarity and terminal | |
Xinjie et al. | A Construction Method for the Knowledge Graph of Power Grid Supervision Business | |
Banerjee et al. | Automatic Standardization of Data Based on Machine Learning and Natural Language Processing | |
CN117131184B (en) | Site soil pollution question-answering system and question-answering method based on knowledge graph | |
CN113111189B (en) | Interpretable power grid operation risk assessment method and device | |
CN115374108B (en) | Knowledge graph technology-based data standard generation and automatic mapping method | |
Wang et al. | Research on Construction and Application of Knowledge Mapping of Intelligent Transportation Inspection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||