Disclosure of Invention
In view of the above problems, the present invention provides a method and an apparatus for constructing a knowledge graph in the field of oil and gas exploration and development, which overcome or at least partially solve the above problems, and the technical solution is as follows:
a method for constructing a knowledge graph in the field of oil and gas exploration and development comprises the following steps:
obtaining a pre-established ontology library of a field of oil and gas exploration and development, wherein the ontology library comprises an ontology constructed according to a first knowledge system of a first aspect of the field of oil and gas exploration and development, the ontology library further comprising: an ontology constructed according to a second knowledge system in a second aspect of the field of oil and gas exploration and development;
obtaining a first data source in a first aspect of the hydrocarbon exploration development field, obtaining a second data source in a second aspect of the hydrocarbon exploration development field;
acquiring unstructured data with knowledge labels in the first data source, and performing machine learning on the unstructured data with knowledge labels to acquire a first knowledge point extraction model, wherein the knowledge labels in the first data source are matched with the ontology base; extracting knowledge points from unstructured data without knowledge labels in the first data source by using the first knowledge point extraction model; extracting knowledge points from the structured data and the semi-structured data in the first data source according to the ontology library;
acquiring unstructured data with knowledge labels in the second data source, and performing machine learning on the unstructured data with knowledge labels to acquire a second knowledge point extraction model, wherein the knowledge labels in the second data source are matched with the ontology base; extracting knowledge points from the unstructured data without knowledge labels in the second data source by using the second knowledge point extraction model; extracting knowledge points from the structured data and the semi-structured data in the second data source according to the ontology library;
constructing a first knowledge graph in a first aspect of the hydrocarbon exploration development area from the knowledge points extracted from the first data source, and constructing a second knowledge graph in a second aspect of the hydrocarbon exploration development area from the knowledge points extracted from the second data source;
fusing the first and second knowledge-graphs into a third knowledge-graph.
Optionally, fusing the first knowledge graph and the second knowledge graph into a third knowledge graph includes:
for any entity in the first knowledge-graph: comparing the attributes of the entity with the attributes of the entities in the second knowledge graph respectively, and determining the similarity between the entity and each entity in the second knowledge graph according to the attribute comparison result; and fusing the entities with the similarity higher than a preset threshold into one entity.
Optionally, fusing the first knowledge graph and the second knowledge graph into a third knowledge graph includes:
clustering the entities in the first knowledge graph and the entities in the second knowledge graph to obtain a plurality of entity clusters;
at least part of the entities in the same entity cluster are fused into one entity.
Optionally, the method further includes:
obtaining first metadata of the first data source;
establishing a corresponding relation between the first metadata and the first data source;
receiving an operation instruction of a user on the first data source according to the corresponding relation;
and processing the first data source according to the operation instruction.
Optionally, the method further includes:
determining whether the first data source has changed and, if so, extracting knowledge points from the first data source again and constructing, from the newly extracted knowledge points, a fourth knowledge graph of the first aspect of the field of oil and gas exploration and development;
merging the third knowledge-graph and the fourth knowledge-graph into a fifth knowledge-graph.
Optionally, the constructing a first knowledge-graph in a first aspect of the hydrocarbon exploration development area from the knowledge points extracted from the first data source and a second knowledge-graph in a second aspect of the hydrocarbon exploration development area from the knowledge points extracted from the second data source comprises:
constructing a first knowledge graph of the first aspect of the oil and gas exploration and development field from the knowledge labels in the first data source and the knowledge points extracted from the first data source, and constructing a second knowledge graph of the second aspect of the oil and gas exploration and development field from the knowledge labels in the second data source and the knowledge points extracted from the second data source.
A device for constructing a knowledge graph in the field of oil and gas exploration and development comprises: an ontology library construction unit, a data source obtaining unit, a first extraction unit, a second extraction unit, a graph construction unit and a graph fusion unit,
the ontology library construction unit is configured to obtain a pre-established ontology library of the oil and gas exploration and development field, where the ontology library includes an ontology constructed according to a first knowledge system of a first aspect of the oil and gas exploration and development field, and the ontology library further includes: an ontology constructed according to a second knowledge system in a second aspect of the field of oil and gas exploration and development;
the data source obtaining unit is used for obtaining a first data source in a first aspect of the oil and gas exploration and development field and obtaining a second data source in a second aspect of the oil and gas exploration and development field;
the first extraction unit is configured to obtain unstructured data with knowledge labels in the first data source, perform machine learning on the unstructured data with knowledge labels, and obtain a first knowledge point extraction model, where the knowledge labels in the first data source are matched with the ontology base; extracting knowledge points from unstructured data without knowledge labels in the first data source by using the first knowledge point extraction model; extracting knowledge points from the structured data and the semi-structured data in the first data source according to the ontology library;
the second extraction unit is configured to obtain unstructured data with knowledge labels in the second data source, perform machine learning on the unstructured data with knowledge labels, and obtain a second knowledge point extraction model, where the knowledge labels in the second data source are matched with the ontology base; extracting knowledge points from the unstructured data without knowledge labels in the second data source by using the second knowledge point extraction model; extracting knowledge points from the structured data and the semi-structured data in the second data source according to the ontology library;
the graph construction unit is used for constructing a first knowledge graph of a first aspect of the oil and gas exploration and development field according to the knowledge points extracted from the first data source, and constructing a second knowledge graph of a second aspect of the oil and gas exploration and development field according to the knowledge points extracted from the second data source;
the graph fusion unit is used for fusing the first knowledge graph and the second knowledge graph into a third knowledge graph.
Optionally, the graph fusion unit is specifically configured to: for any entity in the first knowledge graph: compare the attributes of the entity with the attributes of the entities in the second knowledge graph respectively, and determine the similarity between the entity and each entity in the second knowledge graph according to the attribute comparison result; and fuse the entities whose similarity is higher than a preset threshold into one entity.
An apparatus, comprising: at least one processor, and at least one memory, bus connected with the processor; the processor and the memory complete mutual communication through the bus; the processor is used for calling the program instructions in the memory so as to execute the construction method of the knowledge graph in any one of the oil and gas exploration and development fields.
A storage medium, wherein computer executable instructions are stored in the storage medium, and when the computer executable instructions are loaded and executed by a processor, the method for constructing the knowledge graph in any oil and gas exploration and development field is realized.
By means of the technical scheme, the method and the device for constructing the knowledge graph in the oil and gas exploration and development field can obtain a pre-established ontology base of the oil and gas exploration and development field, obtain a first data source in the first aspect of the oil and gas exploration and development field, and obtain a second data source in the second aspect of the oil and gas exploration and development field; acquiring unstructured data with knowledge labels in the first data source, performing machine learning on the unstructured data with the knowledge labels to acquire a first knowledge point extraction model, and extracting knowledge points from the unstructured data without the knowledge labels in the first data source by using the first knowledge point extraction model; extracting knowledge points from the structured data and the semi-structured data in the first data source according to the ontology library; acquiring unstructured data with knowledge labels in the second data source, performing machine learning on the unstructured data with the knowledge labels to acquire a second knowledge point extraction model, and extracting knowledge points from the unstructured data without the knowledge labels in the second data source by using the second knowledge point extraction model; extracting knowledge points from the structured data and the semi-structured data in the second data source according to the ontology library; constructing a first knowledge graph in a first aspect of the hydrocarbon exploration development area from the knowledge points extracted from the first data source, and constructing a second knowledge graph in a second aspect of the hydrocarbon exploration development area from the knowledge points extracted from the second data source; fusing the first and second knowledge-graphs into a third knowledge-graph. 
The invention can extract knowledge points separately from the unstructured data, semi-structured data and structured data of different aspects according to a pre-established ontology library of the oil and gas exploration and development field, and thereby construct a knowledge graph for each of multiple aspects. The knowledge graph obtained by the invention fuses the knowledge points in the unstructured data, semi-structured data and structured data of the different aspects of the oil and gas exploration and development field, thereby accomplishing knowledge management for the field. Through the knowledge graph constructed by the invention, a user can conveniently use the knowledge of the oil and gas exploration and development field.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As shown in fig. 1, a method for constructing a knowledge graph in the field of oil and gas exploration and development provided by an embodiment of the present invention may include:
s100, obtaining a pre-established ontology library of the oil and gas exploration and development field, wherein the ontology library comprises an ontology constructed according to a first knowledge system of the first aspect of the oil and gas exploration and development field, and the ontology library further comprises: an ontology constructed according to a second knowledge system in a second aspect of the field of oil and gas exploration and development.
The field of oil and gas exploration and development can be divided into multiple aspects according to different division schemes, for example: geological analysis, well logging, well drilling, production process control, production line layout and management, fault handling and the like. These aspects can in turn be divided into narrower aspects; for example, fault handling can be further divided into blowout fault handling, water injection faults, oil pump faults and the like. Those skilled in the art can divide the field according to actual needs, and the invention is not limited in this respect.
A knowledge system specifies how knowledge is named and classified. For example, as shown in fig. 2, knowledge of surface production can be divided into station library, single well and pipeline, and knowledge of the station library can be further divided into a crude oil treatment subsystem, a sewage treatment subsystem, a crude oil export subsystem, a water injection subsystem and the like. Knowledge of the crude oil treatment subsystem can in turn be subdivided into device A, device B, device C, etc. Of course, a device may be further divided into its components or parts. A knowledge system can be displayed in various ways; fig. 3 is a schematic diagram of the display of the knowledge system for geological properties, in which a user can click a branch of the knowledge system to view its detailed naming and classification.
The ontology library in step S100 includes an ontology constructed from a first knowledge system of a first aspect of the field of oil and gas exploration and development and an ontology constructed from a second knowledge system of a second aspect of that field. The invention can therefore construct an ontology for each aspect from that aspect's knowledge system and put the ontologies of all aspects into one ontology library. In this way, the ontology library fuses the knowledge systems of the various aspects.
Specifically, the method can construct an ontology node for each type of knowledge in a knowledge system, and establish the connection relationship between ontology nodes according to the relationship of each type of knowledge in the knowledge system. For example, for the knowledge system shown in fig. 2, the present invention can construct ontology nodes and connection relationships between nodes as shown in fig. 4.
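The construction of ontology nodes from a knowledge system can be sketched as follows. This is a minimal illustration only: the nested-dictionary encoding of the knowledge system and the names `OntologyNode`, `build_ontology` and `edges` are assumptions made for the example and do not come from the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class OntologyNode:
    name: str
    children: list = field(default_factory=list)


def build_ontology(name, subtree):
    """Create one ontology node per knowledge category, recursively."""
    node = OntologyNode(name)
    for child_name, child_subtree in subtree.items():
        node.children.append(build_ontology(child_name, child_subtree))
    return node


def edges(node):
    """Yield (parent, child) pairs, i.e. the connections between ontology nodes."""
    for child in node.children:
        yield (node.name, child.name)
        yield from edges(child)


# Hypothetical encoding of the surface production knowledge system of fig. 2.
surface_production = {
    "station library": {
        "crude oil treatment subsystem": {"device A": {}, "device B": {}, "device C": {}},
        "sewage treatment subsystem": {},
        "crude oil export subsystem": {},
        "water injection subsystem": {},
    },
    "single well": {},
    "pipeline": {},
}

root = build_ontology("surface production", surface_production)
edge_list = list(edges(root))
```

Each category becomes one node, and each parent-child relationship in the knowledge system becomes one connection between nodes.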
Specifically, the invention can establish a relatively complete knowledge system for the field of oil and gas exploration and development, and divide it into a plurality of aspect-specific knowledge systems according to the aspects contained in the field. For example, the first knowledge system and the second knowledge system may both belong to the same knowledge system (i.e., the relatively complete knowledge system).
Specifically, the invention can also set the constraints on knowledge points as constraints on the ontologies in the ontology library or on the attributes of those ontologies. For example, if the ontology is water pressure, the type constraint on its attribute value may be that the value is a number. Of course, constraints are not limited to value types and may also include value ranges, units, etc.
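A constraint of this kind might be represented and checked as in the sketch below. The class `AttributeConstraint`, the function `satisfies`, and the example range and unit for water pressure are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AttributeConstraint:
    numeric: bool = True
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    unit: Optional[str] = None


def satisfies(constraint, value, unit=None):
    """Return True if a candidate attribute value meets the constraint."""
    if constraint.numeric and (isinstance(value, bool) or not isinstance(value, (int, float))):
        return False
    if constraint.unit is not None and unit != constraint.unit:
        return False
    if constraint.minimum is not None and value < constraint.minimum:
        return False
    if constraint.maximum is not None and value > constraint.maximum:
        return False
    return True


# Hypothetical constraint for the "water pressure" ontology: a number in MPa.
water_pressure = AttributeConstraint(numeric=True, minimum=0.0, maximum=100.0, unit="MPa")
```

A knowledge point whose value fails the check (for example a non-numeric water pressure) could then be rejected or flagged before it enters the knowledge graph.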
S200, obtaining a first data source in the first aspect of the oil and gas exploration and development field, and obtaining a second data source in the second aspect of the oil and gas exploration and development field.
Wherein, the data source may include: structured data, unstructured data and semi-structured data, wherein the structured data can be a database, the semi-structured data can be data with partial structure such as tables, encyclopedia data and the like, and the unstructured data can be data such as text and the like.
In particular, the present invention can obtain structured data, unstructured data and semi-structured data in different ways. For example, for structured data such as a database, the invention can obtain the data in at least one table of the database using information such as the database name, the name of the device hosting the database, the port of the database service, the table name, the user name and the password. The invention can obtain semi-structured data from tables, and can store data that is not originally held in a table into a table according to its data structure. For unstructured data, the method can obtain the complete data directly, without preprocessing. Optionally, unstructured data may be crawled from a network, for example by crawling unstructured data sources of the first aspect according to crawling keywords of the first aspect.
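Reading a table from a structured data source could look like the following sketch. An in-memory SQLite database stands in for the real database server, so the host, port, user name and password fields of the hypothetical `SourceConfig` are carried only to mirror the connection information listed above and are not used by SQLite; the table and column names are likewise assumptions.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class SourceConfig:
    database: str
    host: str
    port: int
    table: str
    user: str
    password: str


def read_table(conn, table):
    """Return every row of `table` as a dict keyed by column name."""
    # Table names cannot be bound as SQL parameters, so validate before interpolating.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]


# Stand-in data source; a production system would connect to its own database server.
config = SourceConfig(":memory:", "localhost", 0, "fault_records", "reader", "secret")
conn = sqlite3.connect(config.database)
conn.execute("CREATE TABLE fault_records (well TEXT, fault_type TEXT)")
conn.execute("INSERT INTO fault_records VALUES ('well_1', 'oil pump fault')")
rows = read_table(conn, config.table)
```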
S300, acquiring unstructured data with knowledge labels in the first data source, and performing machine learning on the unstructured data with the knowledge labels to acquire a first knowledge point extraction model, wherein the knowledge labels in the first data source are matched with the ontology base; extracting knowledge points from unstructured data without knowledge labels in the first data source by using the first knowledge point extraction model; and extracting knowledge points from the structured data and the semi-structured data in the first data source according to the ontology library.
S400, acquiring unstructured data with knowledge labels in the second data source, and performing machine learning on the unstructured data with the knowledge labels to acquire a second knowledge point extraction model, wherein the knowledge labels in the second data source are matched with the ontology base; extracting knowledge points from the unstructured data without knowledge labels in the second data source by using the second knowledge point extraction model; and extracting knowledge points from the structured data and the semi-structured data in the second data source according to the ontology library.
It will be appreciated that for structured and semi-structured data, the present invention can extract knowledge points directly from the data structure. For example, knowledge points can be extracted according to the fields in a table (e.g., the various failure type knowledge points are extracted from the values of the field 'failure type').
However, compared with structured and semi-structured data, unstructured data lacks a rigorous data structure, which makes it difficult to extract knowledge points from it. To address this problem, the present invention extracts knowledge points from unstructured data through a machine learning model.
The method can first perform machine learning on the unstructured data with knowledge labels to obtain a knowledge point extraction model. The invention builds a separate knowledge point extraction model for each aspect of the oil and gas exploration and development field, so that each model is more specialized and the knowledge points it extracts are more accurate. Because the knowledge labels match the ontology library, and the ontology library integrates the knowledge systems of the various aspects of the field, the knowledge points extracted by the models of different aspects can be associated with one another; for example, models of different aspects may extract the same or corresponding knowledge points. Subsequent steps can then fuse the knowledge graphs according to these associated knowledge points.
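Field-based extraction from tabular data can be sketched as below. The function `extract_from_rows`, the choice of a key column to identify the entity, and the mapping from column names to ontology attributes are assumptions made for the example.

```python
def extract_from_rows(rows, key_field, field_to_attribute):
    """Turn each table row into (entity, attribute, value) knowledge points."""
    points = []
    for row in rows:
        entity = row[key_field]
        for column, attribute in field_to_attribute.items():
            value = row.get(column)
            if value not in (None, ""):  # skip empty cells
                points.append((entity, attribute, value))
    return points


# Hypothetical rows and column-to-attribute mapping.
rows = [
    {"well": "well_1", "failure type": "oil pump fault", "depth_m": 2300},
    {"well": "well_2", "failure type": "water injection fault", "depth_m": None},
]
field_to_attribute = {"failure type": "failure type", "depth_m": "depth"}
points = extract_from_rows(rows, "well", field_to_attribute)
```

Because the columns already carry the structure, no learned model is needed for this kind of data; the mapping to the ontology does the work.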
Specifically, the knowledge labels may include at least one of entity labels, entity relationship labels and entity attribute labels.
Specifically, the knowledge points extracted by the knowledge point extraction model of the present invention may exist in the form of triples, for example: (entity, entity relationship, entity), (entity, attribute, attribute value), and the like.
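The train-then-extract flow for unstructured data can be illustrated with a deliberately simple stand-in. The disclosure does not name a learning algorithm; the sketch below learns a phrase-to-label lexicon by counting annotated phrases, which is only a toy substitute for a real sequence labeling model. The functions `train` and `extract`, the label names and the example sentences are all hypothetical.

```python
from collections import Counter, defaultdict


def train(labelled_examples):
    """Learn a phrase -> most frequent label lexicon from annotated text."""
    counts = defaultdict(Counter)
    for _text, annotations in labelled_examples:
        for phrase, label in annotations:
            counts[phrase.lower()][label] += 1
    return {phrase: c.most_common(1)[0][0] for phrase, c in counts.items()}


def extract(model, text):
    """Return (phrase, label) pairs for learned phrases found in unlabelled text."""
    lowered = text.lower()
    found = [(lowered.find(p), p, label) for p, label in model.items() if p in lowered]
    return [(p, label) for _pos, p, label in sorted(found)]


# Hypothetical labelled data whose labels match ontology categories.
labelled = [
    ("The oil pump on well 7 failed.", [("oil pump", "Equipment")]),
    ("A blowout occurred during drilling.", [("blowout", "FaultType")]),
]
model = train(labelled)
result = extract(model, "Inspect the oil pump after any blowout.")
```

One such model would be trained per aspect, each from that aspect's own labelled data.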
S500, constructing a first knowledge graph in the first aspect of the oil and gas exploration and development field according to the knowledge points extracted from the first data source, and constructing a second knowledge graph in the second aspect of the oil and gas exploration and development field according to the knowledge points extracted from the second data source.
The invention extracts knowledge separately from the data sources of the different aspects and constructs the knowledge graph of each aspect from the knowledge points extracted from that aspect's data sources, so that each knowledge graph is more specialized and effective for its aspect. This avoids the mutual interference that arises when the data sources of all aspects are mixed together for knowledge extraction.
Specifically, the process of constructing the knowledge graph according to the knowledge points may include:
establishing entity nodes corresponding to entities in the knowledge points;
establishing a connection relation between entity nodes according to the entity relation in the knowledge points;
and setting the attribute and the attribute value of the entity node according to the attribute and the attribute value in the knowledge point.
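The three construction steps above can be sketched as follows. The in-memory `KnowledgeGraph` class, the function `build_graph` and the example triples are assumptions for illustration; a real system would more likely write to a graph database.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    attributes: dict = field(default_factory=dict)  # entity -> {attribute: value}
    relations: set = field(default_factory=set)     # (head, relation, tail)

    def add_entity(self, name):
        self.attributes.setdefault(name, {})


def build_graph(relation_points, attribute_points):
    graph = KnowledgeGraph()
    # Steps 1 and 2: create entity nodes and connect them by their relationships.
    for head, relation, tail in relation_points:
        graph.add_entity(head)
        graph.add_entity(tail)
        graph.relations.add((head, relation, tail))
    # Step 3: set attributes and attribute values on the entity nodes.
    for entity, attribute, value in attribute_points:
        graph.add_entity(entity)
        graph.attributes[entity][attribute] = value
    return graph


graph = build_graph(
    relation_points=[("station library", "contains", "water injection subsystem")],
    attribute_points=[("water injection subsystem", "water pressure", 12.5)],
)
```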
Because the knowledge labels carry a large number of credible knowledge points, the knowledge labels can be used for constructing a knowledge graph at the same time. Optionally, step S500 may specifically include:
constructing a first knowledge graph of the first aspect of the oil and gas exploration and development field from the knowledge labels in the first data source and the knowledge points extracted from the first data source, and constructing a second knowledge graph of the second aspect of the oil and gas exploration and development field from the knowledge labels in the second data source and the knowledge points extracted from the second data source.
S600, fusing the first knowledge graph and the second knowledge graph into a third knowledge graph.
Specifically, the invention can identify entities in the two knowledge graphs that are highly similar or identical, and fuse the knowledge graphs by merging those entities.
Step S600 can be carried out in various ways, two of which are given below by way of example:
First, step S600 may include:
for any entity in the first knowledge-graph: comparing the attributes of the entity with the attributes of the entities in the second knowledge graph respectively, and determining the similarity between the entity and each entity in the second knowledge graph according to the attribute comparison result; and fusing the entities with the similarity higher than a preset threshold into one entity.
It can be understood that entities have attributes; if the attributes of two entities are highly similar, the invention can determine that the two entities are the same entity and fuse them. The preset threshold may be 95%, but may also be another value.
By fusing the entities in the two knowledge graphs, the two knowledge graphs can be fused into one knowledge graph.
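Threshold-based entity fusion can be sketched as below. The disclosure does not fix a similarity measure; Jaccard similarity over (attribute, value) pairs is used here purely as an assumption, and each graph is reduced to a hypothetical mapping from entity name to attribute dictionary.

```python
def attribute_similarity(a, b):
    """Jaccard similarity over (attribute, value) pairs; values must be hashable."""
    pairs_a, pairs_b = set(a.items()), set(b.items())
    if not pairs_a and not pairs_b:
        return 0.0
    return len(pairs_a & pairs_b) / len(pairs_a | pairs_b)


def fuse(first, second, threshold=0.95):
    """Merge entities of `second` into matching entities of `first`."""
    fused = {name: dict(attrs) for name, attrs in first.items()}
    merged = set()
    for name_a, attrs_a in first.items():
        for name_b, attrs_b in second.items():
            if attribute_similarity(attrs_a, attrs_b) > threshold:
                fused[name_a].update(attrs_b)
                merged.add(name_b)
    # Entities of the second graph with no match are carried over unchanged.
    for name_b, attrs_b in second.items():
        if name_b not in merged:
            key = name_b if name_b not in fused else f"{name_b} (second graph)"
            fused[key] = dict(attrs_b)
    return fused


# Hypothetical entities from two aspect-specific graphs.
first = {"pump P-101": {"type": "oil pump", "station": "S1"}}
second = {
    "P101": {"type": "oil pump", "station": "S1"},
    "well W-7": {"type": "single well"},
}
fused = fuse(first, second)
```

Here "P101" matches "pump P-101" on every attribute and is merged into it, while "well W-7" has no counterpart and is kept as a separate entity.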
Second, step S600 may include:
clustering the entities in the first knowledge graph and the entities in the second knowledge graph to obtain a plurality of entity clusters;
at least part of the entities in the same entity cluster are determined as the same entity.
Through clustering, the method can classify the entities by using more characteristics of the entities, and has higher accuracy.
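Cluster-based fusion can be illustrated with the sketch below. The clustering algorithm and the features are not specified in the disclosure; this example assumes single-linkage clustering over a feature set built from name tokens and attribute pairs, with a hypothetical threshold of 0.5. Entities are keyed by (graph, name) so that entities from both graphs can be pooled.

```python
from itertools import combinations


def features(name, attrs):
    """Combine name tokens and attribute pairs into one feature set."""
    tokens = set(name.lower().replace("-", " ").split())
    return tokens | {f"{k}={v}" for k, v in attrs.items()}


def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0


def cluster(entities, threshold=0.5):
    """Single-linkage clustering: link entities whose similarity >= threshold."""
    parent = {key: key for key in entities}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    feats = {key: features(key[1], attrs) for key, attrs in entities.items()}
    for a, b in combinations(entities, 2):
        if jaccard(feats[a], feats[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for key in entities:
        clusters.setdefault(find(key), set()).add(key)
    return list(clusters.values())


# Hypothetical pooled entities from the first and second knowledge graphs.
entities = {
    ("first", "oil pump P-101"): {"station": "S1"},
    ("second", "pump P-101"): {"station": "S1"},
    ("second", "well W-7"): {"type": "single well"},
}
clusters = cluster(entities)
```

Entities that end up in the same cluster are candidates for fusion into one entity.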
Optionally, the user may manage at least one of the first, second, and third knowledge-graphs, for example: adding, deleting, modifying entities/entity relationships/entity attributes in the knowledge-graph.
The method for constructing the knowledge graph in the oil and gas exploration and development field can obtain a pre-established ontology base in the oil and gas exploration and development field, obtain a first data source in the first aspect of the oil and gas exploration and development field, and obtain a second data source in the second aspect of the oil and gas exploration and development field; acquiring unstructured data with knowledge labels in the first data source, performing machine learning on the unstructured data with the knowledge labels to acquire a first knowledge point extraction model, and extracting knowledge points from the unstructured data without the knowledge labels in the first data source by using the first knowledge point extraction model; extracting knowledge points from the structured data and the semi-structured data in the first data source according to the ontology library; acquiring unstructured data with knowledge labels in the second data source, performing machine learning on the unstructured data with the knowledge labels to acquire a second knowledge point extraction model, and extracting knowledge points from the unstructured data without the knowledge labels in the second data source by using the second knowledge point extraction model; extracting knowledge points from the structured data and the semi-structured data in the second data source according to the ontology library; constructing a first knowledge graph in a first aspect of the hydrocarbon exploration development area from the knowledge points extracted from the first data source, and constructing a second knowledge graph in a second aspect of the hydrocarbon exploration development area from the knowledge points extracted from the second data source; fusing the first and second knowledge-graphs into a third knowledge-graph. 
The invention can respectively extract knowledge points from unstructured data, semi-structured data and structured data in different aspects according to a pre-established ontology base in the field of oil and gas exploration and development, thereby respectively constructing knowledge maps in multiple aspects. The knowledge map obtained by the invention fuses the knowledge points in the unstructured data, the semi-structured data and the structured data in different aspects of the oil and gas exploration and development field, thereby completing the knowledge management in the oil and gas exploration and development field. Through the knowledge graph constructed by the invention, a user can conveniently use the knowledge in the field of oil and gas exploration and development.
As shown in fig. 5, another method for constructing a knowledge graph in the field of oil and gas exploration and development provided by the embodiment of the present invention may further include:
S001, obtaining first metadata of the first data source.
Wherein the metadata of the data source may include: title of data source, data volume, source, author, keyword, time of release, category, etc. The invention can obtain the metadata of the data source from the data source.
S002, establishing a corresponding relation between the first metadata and the first data source.
Optionally, after the corresponding relationship is established, the metadata and the data source may be correspondingly displayed to the user, so that the user can conveniently search and operate the data source according to the metadata (such as a title of the data source).
Of course, the user may also make modifications to the metadata.
S003, receiving an operation instruction of the user on the first data source according to the corresponding relation.
The operation instruction can be a deletion operation, a modification operation and the like on the data source.
Over time, portions of the data source may no longer be needed (e.g., they are no longer timely), at which point they may be deleted. The data source may also contain errors, which can be corrected by modification. Of course, in addition to deleting and modifying data sources, users may also add new data sources.
S004, processing the first data source according to the operation instruction.
The execution sequence of the steps in fig. 5 and the steps shown in fig. 1 is not limited in the present invention.
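Steps S001 to S004 can be sketched as a small registry that links metadata to data sources and applies user operations. The class names, the subset of metadata fields and the operation names "delete" and "modify" are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Metadata:
    title: str
    author: str
    category: str


class DataSourceRegistry:
    def __init__(self):
        self._metadata = {}  # source id -> Metadata
        self._content = {}   # source id -> data source content

    def register(self, source_id, metadata, content):
        """Establish the correspondence between metadata and data source."""
        self._metadata[source_id] = metadata
        self._content[source_id] = content

    def find_by_title(self, title):
        """Look up data sources through their metadata."""
        return [sid for sid, m in self._metadata.items() if m.title == title]

    def content(self, source_id):
        return self._content.get(source_id)

    def apply(self, source_id, operation, new_content=None):
        """Process the data source according to the user's operation instruction."""
        if operation == "delete":
            del self._metadata[source_id]
            del self._content[source_id]
        elif operation == "modify":
            self._content[source_id] = new_content
        else:
            raise ValueError(f"unsupported operation: {operation}")


registry = DataSourceRegistry()
registry.register("doc-1", Metadata("Blowout report", "unknown", "fault handling"), "original text")
ids = registry.find_by_title("Blowout report")
registry.apply(ids[0], "modify", "corrected text")
```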
It is to be understood that the present invention may perform the extraction of knowledge again to update the knowledge-graph after the data source is changed. Therefore, the knowledge graph is updated synchronously with the updating of the data source, and the timeliness of the knowledge graph can be effectively kept.
On the basis of the method embodiment shown in fig. 1, another method for constructing a knowledge graph in the field of oil and gas exploration and development provided by the embodiment of the invention may further include:
determining whether the first data source has changed and, if so, extracting knowledge points from the first data source again and constructing, from the newly extracted knowledge points, a fourth knowledge graph of the first aspect of the field of oil and gas exploration and development;
merging the third knowledge-graph and the fourth knowledge-graph into a fifth knowledge-graph.
Optionally, when the data source changes, the knowledge point extraction model can learn the changed data source so as to update the knowledge point extraction model.
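The update flow can be sketched as follows. How a change is detected is not stated in the disclosure; comparing content hashes is one possible approach and is assumed here. Graphs are reduced to sets of triples and merging to a set union, which is a simplification of the fusion described for step S600. The functions `fingerprint`, `refresh` and `toy_extract` are hypothetical.

```python
import hashlib


def fingerprint(documents):
    """Hash the content of a data source so that changes can be detected."""
    digest = hashlib.sha256()
    for doc in sorted(documents):
        digest.update(doc.encode("utf-8"))
        digest.update(b"\x00")  # separator between documents
    return digest.hexdigest()


def refresh(third_graph, documents, last_fingerprint, extract):
    """Rebuild the aspect graph and merge it in only if the source changed."""
    current = fingerprint(documents)
    if current == last_fingerprint:
        return third_graph, current
    fourth_graph = set()
    for doc in documents:
        fourth_graph |= set(extract(doc))
    return third_graph | fourth_graph, current  # the "fifth" knowledge graph


def toy_extract(doc):
    """Stand-in extractor: each document is a 'head|relation|tail' string."""
    head, relation, tail = doc.split("|")
    return [(head, relation, tail)]


third = {("well W-7", "has fault", "blowout")}
docs_v1 = ["well W-7|has fault|blowout"]
fp_v1 = fingerprint(docs_v1)
docs_v2 = docs_v1 + ["well W-7|located in|block B"]
fifth, fp_v2 = refresh(third, docs_v2, fp_v1, toy_extract)
```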
It will be appreciated that, after the third knowledge graph is obtained, the user may use it for a variety of operations, such as data query, knowledge question answering, knowledge reasoning, measure recommendation and the like. The specific applications are various, and the invention is not limited herein.
Corresponding to the method shown in fig. 1, an embodiment of the present invention further provides an apparatus for constructing a knowledge graph in the field of oil and gas exploration and development. As shown in fig. 6, the apparatus may include: an ontology library construction unit 100, a data source obtaining unit 200, a first extraction unit 300, a second extraction unit 400, a graph construction unit 500 and a graph fusion unit 600.
The ontology library construction unit 100 is configured to obtain a pre-established ontology library of the oil and gas exploration and development field, where the ontology library includes an ontology constructed according to a first knowledge system of a first aspect of the oil and gas exploration and development field, and the ontology library further includes: an ontology constructed according to a second knowledge system in a second aspect of the field of oil and gas exploration and development.
The data source obtaining unit 200 is configured to obtain a first data source in a first aspect of the hydrocarbon exploration development area and obtain a second data source in a second aspect of the hydrocarbon exploration development area.
The first extraction unit 300 is configured to obtain unstructured data with knowledge labels in the first data source, perform machine learning on the unstructured data with knowledge labels, and obtain a first knowledge point extraction model, where the knowledge labels in the first data source are matched with the ontology library; extracting knowledge points from unstructured data without knowledge labels in the first data source by using the first knowledge point extraction model; and extracting knowledge points from the structured data and the semi-structured data in the first data source according to the ontology library.
The second extraction unit 400 is configured to obtain unstructured data with knowledge labels in the second data source, perform machine learning on the unstructured data with knowledge labels, and obtain a second knowledge point extraction model, where the knowledge labels in the second data source are matched with the ontology library; extracting knowledge points from the unstructured data without knowledge labels in the second data source by using the second knowledge point extraction model; and extracting knowledge points from the structured data and the semi-structured data in the second data source according to the ontology library.
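As a non-limiting illustration, extracting knowledge points from structured data according to the ontology library may be sketched as follows. The sketch assumes that the ontology library can be reduced to a mapping from a concept to the table columns that carry its key, attributes and relations. The `ONTOLOGY` mapping, the column names and the function `extract_from_row` are hypothetical and are not taken from the embodiment.

```python
# Hypothetical ontology fragment: one concept and the columns that describe it.
ONTOLOGY = {
    "Well": {
        "key": "well_name",                             # column identifying the entity
        "attributes": {"depth_m": "has_depth_m"},       # column -> attribute predicate
        "relations": {"block": ("located_in", "Block")},  # column -> (predicate, target concept)
    }
}


def extract_from_row(row, concept, ontology=ONTOLOGY):
    """Turn one structured row into (head, relation, tail) knowledge points."""
    spec = ontology[concept]
    subject = row[spec["key"]]
    triples = [(subject, "is_a", concept)]
    for column, predicate in spec["attributes"].items():
        if row.get(column) is not None:
            triples.append((subject, predicate, row[column]))
    for column, (predicate, target_concept) in spec["relations"].items():
        if row.get(column) is not None:
            triples.append((subject, predicate, row[column]))
            triples.append((row[column], "is_a", target_concept))
    return triples


row = {"well_name": "Well-A", "depth_m": 3200, "block": "Block-1", "operator": "n/a"}
triples = extract_from_row(row, "Well")
```

Columns that the ontology does not mention, such as `operator` in this example, are ignored, so that only knowledge points matching the ontology library are produced.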
The graph construction unit 500 is configured to construct a first knowledge graph in a first aspect of the oil and gas exploration and development field based on the knowledge points extracted from the first data source, and to construct a second knowledge graph in a second aspect of the oil and gas exploration and development field based on the knowledge points extracted from the second data source.
Optionally, the graph construction unit 500 is specifically configured to:
construct a first knowledge graph in a first aspect of the oil and gas exploration and development field from the knowledge labels in the first data source and the knowledge points extracted from the first data source, and construct a second knowledge graph in a second aspect of the oil and gas exploration and development field from the knowledge labels in the second data source and the knowledge points extracted from the second data source.
The graph fusion unit 600 is configured to fuse the first knowledge graph and the second knowledge graph into a third knowledge graph.
Optionally, the graph fusion unit 600 is specifically configured to: for any entity in the first knowledge graph, compare the attributes of the entity with the attributes of each entity in the second knowledge graph, and determine the similarity between the entity and each entity in the second knowledge graph according to the attribute comparison result; and fuse the entities whose similarity is higher than a preset threshold into one entity.
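As a non-limiting illustration, the attribute comparison and threshold-based fusion described above may be sketched as follows. The embodiment does not prescribe a particular similarity measure; the sketch assumes Jaccard similarity over (attribute, value) pairs, represents each knowledge graph as a dictionary from an entity identifier to its attributes, and uses the hypothetical names `attribute_similarity` and `fuse_graphs`.

```python
def attribute_similarity(attrs_a, attrs_b):
    """Jaccard similarity over (attribute, value) pairs; values must be hashable."""
    pairs_a, pairs_b = set(attrs_a.items()), set(attrs_b.items())
    if not pairs_a and not pairs_b:
        return 0.0
    return len(pairs_a & pairs_b) / len(pairs_a | pairs_b)


def fuse_graphs(first, second, threshold=0.5):
    """Fuse entities of two graphs whose similarity is higher than the threshold.

    first, second: dict mapping an entity identifier to its attribute dict.
    Returns a list of attribute dicts, one per resulting entity.
    """
    fused, matched = [], set()
    for attrs_a in first.values():
        merged = dict(attrs_a)
        for id_b, attrs_b in second.items():
            if attribute_similarity(attrs_a, attrs_b) > threshold:
                # On conflicting values the first graph's value is kept.
                merged = {**attrs_b, **merged}
                matched.add(id_b)
        fused.append(merged)
    # Entities of the second graph that matched nothing are carried over unchanged.
    fused.extend(dict(attrs) for id_b, attrs in second.items() if id_b not in matched)
    return fused


first = {"W1": {"name": "Well-A", "block": "Block-1", "type": "oil well"}}
second = {
    "E7": {"name": "Well-A", "block": "Block-1", "depth_m": 3200},
    "E8": {"name": "Well-B", "block": "Block-2"},
}
fused = fuse_graphs(first, second, threshold=0.4)
```

In this example W1 and E7 share two of four distinct attribute pairs, giving a similarity of 0.5, which is higher than the chosen threshold of 0.4, so the two are fused into one entity while E8 remains separate.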
Optionally, the graph fusion unit 600 is specifically configured to:
clustering the entities in the first knowledge graph and the entities in the second knowledge graph to obtain a plurality of entity clusters; at least part of the entities in the same entity cluster are fused into one entity.
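As a non-limiting illustration, the clustering-based fusion described above may be sketched as follows. The embodiment does not prescribe a clustering algorithm; the sketch assumes single-link clustering in which two entities are linked when their names are sufficiently similar, measured with `difflib.SequenceMatcher` from the Python standard library. The names `name_similarity`, `cluster_entities` and `fuse_cluster` are hypothetical, and for simplicity every entity in a cluster is fused, whereas the embodiment only requires that at least part of them be fused.

```python
from difflib import SequenceMatcher
from itertools import combinations


def name_similarity(entity_a, entity_b):
    """Case-insensitive string similarity of the two entity names, in [0, 1]."""
    return SequenceMatcher(None, entity_a["name"].lower(), entity_b["name"].lower()).ratio()


def cluster_entities(entities, threshold=0.8):
    """Single-link clustering: entities joined by a chain of similar pairs share a cluster."""
    parent = list(range(len(entities)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps the trees shallow
            i = parent[i]
        return i

    for i, j in combinations(range(len(entities)), 2):
        if name_similarity(entities[i], entities[j]) >= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for i, entity in enumerate(entities):
        clusters.setdefault(find(i), []).append(entity)
    return list(clusters.values())


def fuse_cluster(cluster):
    """Fuse a cluster into one entity, keeping the first value seen per attribute."""
    fused = {}
    for entity in cluster:
        for key, value in entity.items():
            fused.setdefault(key, value)
    return fused


first_graph_entities = [
    {"name": "Well-A", "block": "Block-1"},
    {"name": "Sandstone reservoir", "porosity": 0.18},
]
second_graph_entities = [{"name": "WELL-A", "depth_m": 3200}]

clusters = cluster_entities(first_graph_entities + second_graph_entities)
fused_entities = [fuse_cluster(cluster) for cluster in clusters]
```

Compared with pairwise threshold matching, clustering places entities from both graphs into groups before any fusion takes place, so that more than two records of the same real-world object can be fused in a single step.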
In other embodiments of the present invention, the apparatus shown in fig. 6 may further include: a metadata obtaining unit, a correspondence establishing unit, an instruction receiving unit and a processing unit, wherein:
the metadata obtaining unit is configured to obtain first metadata of the first data source;
the correspondence establishing unit is configured to establish a correspondence between the first metadata and the first data source;
the instruction receiving unit is configured to receive, according to the correspondence, an operation instruction of a user on the first data source;
and the processing unit is configured to process the first data source according to the operation instruction.
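As a non-limiting illustration, the correspondence between metadata and a data source, and the handling of a user's operation instruction through that correspondence, may be sketched as follows. The metadata fields (`name`, `aspect`, `fmt`), the operations (`append`, `clear`, `count`) and the class names are hypothetical; the embodiment does not specify them.

```python
from dataclasses import dataclass, field


@dataclass
class DataSource:
    records: list = field(default_factory=list)


@dataclass(frozen=True)
class Metadata:
    name: str    # assumed unique name of the data source
    aspect: str  # aspect of the field the data source belongs to
    fmt: str     # e.g. "structured", "semi-structured" or "unstructured"


class SourceRegistry:
    """Holds the correspondence between metadata and data sources."""

    def __init__(self):
        self._by_name = {}

    def register(self, metadata, source):
        self._by_name[metadata.name] = (metadata, source)

    def find(self, **criteria):
        """Return the metadata entries whose fields equal all given criteria."""
        return [
            meta for meta, _ in self._by_name.values()
            if all(getattr(meta, key) == value for key, value in criteria.items())
        ]

    def execute(self, name, operation, payload=None):
        """Apply a user's operation instruction to the data source found via its metadata."""
        _, source = self._by_name[name]
        if operation == "append":
            source.records.append(payload)
        elif operation == "clear":
            source.records.clear()
        elif operation == "count":
            return len(source.records)
        else:
            raise ValueError(f"unsupported operation: {operation}")
        return None


registry = SourceRegistry()
registry.register(
    Metadata(name="drilling_reports", aspect="drilling", fmt="unstructured"),
    DataSource(records=["report 1"]),
)
registry.execute("drilling_reports", "append", "report 2")
count = registry.execute("drilling_reports", "count")
```

In this arrangement the user addresses a data source only through its metadata, so the storage location or format of the underlying data may change without affecting how operation instructions are issued.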
In other embodiments of the present invention, the apparatus shown in fig. 6 may further include: a change determining unit, a third extraction unit and a re-fusion unit, wherein:
the change determining unit is configured to determine whether the first data source has changed, and if so, to trigger the third extraction unit;
the third extraction unit is configured to extract knowledge points from the first data source again and to construct, from the newly extracted knowledge points, a fourth knowledge graph of the first aspect of the oil and gas exploration and development field;
and the re-fusion unit is configured to fuse the third knowledge graph and the fourth knowledge graph into a fifth knowledge graph.
The device for constructing the knowledge graph in the oil and gas exploration and development field, provided by the embodiment of the invention, can obtain a pre-established ontology library in the oil and gas exploration and development field, obtain a first data source in the first aspect of the oil and gas exploration and development field, and obtain a second data source in the second aspect of the oil and gas exploration and development field; acquiring unstructured data with knowledge labels in the first data source, performing machine learning on the unstructured data with the knowledge labels to acquire a first knowledge point extraction model, and extracting knowledge points from the unstructured data without the knowledge labels in the first data source by using the first knowledge point extraction model; extracting knowledge points from the structured data and the semi-structured data in the first data source according to the ontology library; acquiring unstructured data with knowledge labels in the second data source, performing machine learning on the unstructured data with the knowledge labels to acquire a second knowledge point extraction model, and extracting knowledge points from the unstructured data without the knowledge labels in the second data source by using the second knowledge point extraction model; extracting knowledge points from the structured data and the semi-structured data in the second data source according to the ontology library; constructing a first knowledge graph in a first aspect of the hydrocarbon exploration development area from the knowledge points extracted from the first data source, and constructing a second knowledge graph in a second aspect of the hydrocarbon exploration development area from the knowledge points extracted from the second data source; fusing the first and second knowledge-graphs into a third knowledge-graph. 
The invention can respectively extract knowledge points from unstructured data, semi-structured data and structured data in different aspects according to a pre-established ontology base in the field of oil and gas exploration and development, thereby respectively constructing knowledge maps in multiple aspects. The knowledge map obtained by the invention fuses the knowledge points in the unstructured data, the semi-structured data and the structured data in different aspects of the oil and gas exploration and development field, thereby completing the knowledge management in the oil and gas exploration and development field. Through the knowledge graph constructed by the invention, a user can conveniently use the knowledge in the field of oil and gas exploration and development.
The apparatus for constructing a knowledge graph in the field of oil and gas exploration and development comprises a processor and a memory, wherein the ontology library construction unit, the data source obtaining unit, the first extraction unit, the second extraction unit, the graph construction unit, the graph fusion unit and the like are stored in the memory as program units, and the processor executes the program units stored in the memory to realize the corresponding functions.
The processor comprises a kernel, and the kernel calls the corresponding program unit from the memory. One or more kernels may be provided, and the knowledge graph is constructed by adjusting the parameters of the kernel.
An embodiment of the invention provides a storage medium on which a program is stored, wherein the program, when executed by a processor, implements the above method for constructing a knowledge graph in the field of oil and gas exploration and development.
An embodiment of the invention provides a processor configured to run a program, wherein the program, when run, executes the above method for constructing a knowledge graph in the field of oil and gas exploration and development.
As shown in fig. 7, an embodiment of the present invention provides a device 70, where the device 70 includes at least one processor 701, and at least one memory 702 and a bus 703 connected to the processor 701; the processor 701 and the memory 702 communicate with each other through the bus 703; the processor 701 is configured to call program instructions in the memory 702 to perform the above-described method for constructing a knowledge graph in the field of oil and gas exploration and development. The device herein may be a server, a PC, a tablet computer, a mobile phone, etc.
The present application further provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with the following method steps:
obtaining a pre-established ontology library of a field of oil and gas exploration and development, wherein the ontology library comprises an ontology constructed according to a first knowledge system of a first aspect of the field of oil and gas exploration and development, the ontology library further comprising: an ontology constructed according to a second knowledge system in a second aspect of the field of oil and gas exploration and development;
obtaining a first data source in a first aspect of the hydrocarbon exploration development field, obtaining a second data source in a second aspect of the hydrocarbon exploration development field;
acquiring unstructured data with knowledge labels in the first data source, and performing machine learning on the unstructured data with knowledge labels to acquire a first knowledge point extraction model, wherein the knowledge labels in the first data source are matched with the ontology base; extracting knowledge points from unstructured data without knowledge labels in the first data source by using the first knowledge point extraction model; extracting knowledge points from the structured data and the semi-structured data in the first data source according to the ontology library;
acquiring unstructured data with knowledge labels in the second data source, and performing machine learning on the unstructured data with knowledge labels to acquire a second knowledge point extraction model, wherein the knowledge labels in the second data source are matched with the ontology base; extracting knowledge points from the unstructured data without knowledge labels in the second data source by using the second knowledge point extraction model; extracting knowledge points from the structured data and the semi-structured data in the second data source according to the ontology library;
constructing a first knowledge graph in a first aspect of the hydrocarbon exploration development area from the knowledge points extracted from the first data source, and constructing a second knowledge graph in a second aspect of the hydrocarbon exploration development area from the knowledge points extracted from the second data source;
fusing the first and second knowledge-graphs into a third knowledge-graph.
Optionally, the fusing of the first knowledge graph and the second knowledge graph into a third knowledge graph includes:
for any entity in the first knowledge-graph: comparing the attributes of the entity with the attributes of the entities in the second knowledge graph respectively, and determining the similarity between the entity and each entity in the second knowledge graph according to the attribute comparison result; and fusing the entities with the similarity higher than a preset threshold into one entity.
Optionally, the fusing of the first knowledge graph and the second knowledge graph into a third knowledge graph includes:
clustering the entities in the first knowledge graph and the entities in the second knowledge graph to obtain a plurality of entity clusters;
at least part of the entities in the same entity cluster are fused into one entity.
Optionally, the method further includes:
obtaining first metadata of the first data source;
establishing a corresponding relation between the first metadata and the first data source;
receiving an operation instruction of a user on the first data source according to the corresponding relation;
and processing the first data source according to the operation instruction.
Optionally, the method further includes:
determining whether the first data source has changed, and if so, extracting knowledge points from the first data source again and constructing, from the newly extracted knowledge points, a fourth knowledge graph of the first aspect of the field of oil and gas exploration and development;
fusing the third knowledge graph and the fourth knowledge graph into a fifth knowledge graph.
Optionally, the constructing a first knowledge-graph in a first aspect of the hydrocarbon exploration development area from the knowledge points extracted from the first data source and a second knowledge-graph in a second aspect of the hydrocarbon exploration development area from the knowledge points extracted from the second data source comprises:
constructing a first knowledge graph in a first aspect of the oil and gas exploration and development field from the knowledge labels in the first data source and the knowledge points extracted from the first data source, and constructing a second knowledge graph in a second aspect of the oil and gas exploration and development field from the knowledge labels in the second data source and the knowledge points extracted from the second data source.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a device includes one or more processors (CPUs), memory, and a bus. The device may also include input/output interfaces, network interfaces, and the like.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM), and the memory includes at least one memory chip. The memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer-readable medium does not include a transitory computer-readable medium such as a modulated data signal or a carrier wave.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.