CN114817553A - Knowledge graph construction method, knowledge graph construction system and computing equipment - Google Patents

Knowledge graph construction method, knowledge graph construction system and computing equipment

Info

Publication number
CN114817553A
Authority
CN
China
Prior art keywords: knowledge, event, text, extraction result, extraction
Prior art date
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Application number: CN202111396510.1A
Other languages: Chinese (zh)
Inventors: 代旭东, 李宝善, 盛志超, 方昕, 刘俊华, 陈志刚
Current Assignee: iFlytek Co Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Original Assignee: iFlytek Co Ltd
Application filed by iFlytek Co Ltd
Priority to CN202111396510.1A
Publication of CN114817553A
Legal status: Pending

Classifications

    • G06F16/367 — Information retrieval; creation of semantic tools: Ontology
    • G06F16/35 — Information retrieval of unstructured textual data: Clustering; Classification
    • G06F40/211 — Natural language analysis; parsing: Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G06F40/295 — Natural language analysis; recognition of textual entities: Named entity recognition
    • G06F40/30 — Natural language analysis: Semantic analysis
    • G06N3/02 — Computing arrangements based on biological models: Neural networks


Abstract

The invention discloses a knowledge graph construction method, a knowledge graph construction system and a computing device, wherein the method comprises the following steps: acquiring a text serving as raw data, and performing chapter-level knowledge extraction and sentence-level event extraction on the text to obtain a knowledge extraction result and an event extraction result; and performing knowledge fusion on the knowledge extraction result and the event extraction result to obtain a knowledge graph. The invention changes the knowledge in the knowledge graph from the traditional static knowledge triple into composite knowledge comprising static knowledge quintuples and dynamic event knowledge, and uses chapter-level element extraction together with element association, which can greatly improve the extraction efficiency and recall rate of the quintuples and makes the knowledge more strongly structured.

Description

Knowledge graph construction method, knowledge graph construction system and computing equipment
Technical Field
The present invention relates to the technical field of artificial intelligence, and more particularly, to a knowledge graph construction method, a knowledge graph construction system, and a computing device.
Background
At present, with the continuous development of intelligent information services, knowledge graphs have been widely applied in fields such as intelligent search, intelligent question answering, personalized recommendation, intelligence analysis, and anti-fraud. A knowledge graph effectively processes and integrates data from complicated documents into simple and clear (entity, relation, entity) triples, and finally aggregates a great deal of knowledge, thereby enabling fast response and reasoning over that knowledge.
The existing knowledge graph construction method is realized through sentence-level relation extraction, but a triple implicit in some data may not lie within the range of a single sentence, so existing sentence-level relation extraction usually loses much structured information. In addition, knowledge in a conventional knowledge graph is generally static, while human society is generally dynamic; static triple information does not account for dynamically changing knowledge and may therefore contain errors. Furthermore, traditional methods construct the knowledge graph only by extraction, without a relatively mature graph cleansing and automatic supplementation mechanism, so the accuracy of the constructed knowledge graph is not high and partial attributes or relations are easily missing.
Therefore, a new knowledge graph construction method, knowledge graph construction system and computing device are needed to solve the above problems.
Disclosure of Invention
In this summary, concepts in a simplified form are introduced that are further described in the detailed description. This summary of the invention is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
According to an aspect of the present invention, there is provided a method of knowledge-graph construction, the method comprising: acquiring a text serving as original data, and performing chapter-level knowledge extraction and sentence-level event extraction on the text to obtain a knowledge extraction result and an event extraction result; and performing knowledge fusion on the knowledge extraction result and the event extraction result to obtain a knowledge graph.
In one embodiment, wherein extracting chapter-level knowledge from the text comprises: based on a preset category label, performing knowledge extraction on the text by using a sliding window to extract a knowledge element corresponding to the category label in the text; and associating the extracted knowledge elements to obtain the knowledge extraction result.
In one embodiment, wherein knowledge extraction of the text using a sliding window comprises: sliding the sliding window on the text, and calculating the prediction probability of each word in the text in the sliding window for each category label; calculating the average value of all the prediction probabilities of the same word at all the positions of the sliding window to serve as the final prediction probability of the word; and selecting the knowledge elements from the text according to preset knowledge element selection conditions based on the final prediction probability of each word.
In one embodiment, associating the extracted knowledge elements comprises: processing each word of the knowledge elements to obtain a word vector of each word; processing the word vectors of all the words corresponding to the knowledge elements to obtain element vectors of the knowledge elements; performing association judgment on the element vectors of the knowledge elements to obtain association results aiming at the knowledge elements; and combining the knowledge elements according to the association result to obtain the knowledge extraction result.
In one embodiment, sentence-level event extraction of the text comprises: predicting whether each word in each sentence of the text is a trigger word and the type of the trigger word based on a preset event trigger word label; predicting event parameters and parameter types thereof corresponding to the trigger words in the sentences based on preset event parameter tags by using the predicted trigger words, trigger word types and positions of the trigger words; and combining the trigger words and the event parameters to obtain the event extraction result.
In one embodiment, the knowledge fusion of the knowledge extraction result and the event extraction result comprises: performing entity linking processing on the knowledge extraction result and the event extraction result to obtain an entity linking result, wherein the entity linking result indicates whether each entity in the knowledge extraction result and the event extraction result is the same entity and whether each entity is the same entity with an existing entity in the knowledge graph; recombining the knowledge extraction result and the event extraction result based on the entity link result to obtain graph knowledge for the knowledge graph.
In one embodiment, the knowledge fusion of the knowledge extraction result and the event extraction result further comprises: deducing new map knowledge based on the map knowledge and/or existing knowledge in the knowledge map; updating the knowledge-graph with the new graph knowledge.
In one embodiment, the method further comprises: performing intra-chapter coreference chain identification on the event extraction results to determine, according to preset intra-chapter event coreference feature types, whether the event extraction results of the text point to the same event; performing cross-chapter event coreference resolution on the event extraction results to determine, according to preset cross-chapter event coreference feature types, whether an event extraction result of the text and an existing event in the knowledge graph point to the same event; and removing duplicate events based on the determinations.
In one embodiment, the method further comprises a knowledge cleansing step for correcting or removing redundant or erroneous information, wherein the knowledge cleansing step comprises redundant expression merging, obvious error correction and removal, similar entity inference and supplementation, and synonymous attribute name merging.
According to another aspect of the present invention, there is provided a knowledge-graph building system, the system comprising: the composite knowledge extraction module is used for extracting chapter-level knowledge and sentence-level events of the acquired text serving as the original data so as to obtain a knowledge extraction result and an event extraction result; and the composite knowledge fusion module is used for carrying out knowledge fusion on the knowledge extraction result and the event extraction result so as to obtain a knowledge graph.
According to a further embodiment of the invention, a computing device is provided, comprising a memory and a processor, the memory having stored thereon a computer program, which, when executed by the processor, causes the processor to carry out the method as described above.
According to a further embodiment of the invention, a computer-readable medium is provided, on which a computer program is stored, which computer program, when executed, performs the method as described above.
According to the knowledge graph construction method, the knowledge graph construction system and the computing device of the present invention, the knowledge in the knowledge graph is changed from the traditional static knowledge triple into composite knowledge comprising static knowledge quintuples and dynamic event knowledge, and chapter-level element extraction is used together with element association, so that the extraction efficiency and recall rate of the quintuples can be greatly improved and the knowledge becomes more strongly structured.
Drawings
The following drawings of the invention are included to provide a further understanding of the invention. The drawings illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention.
In the drawings:
fig. 1 is a schematic structural block diagram of an electronic device implementing a knowledge graph construction method, a knowledge graph construction system, and a computing device according to an embodiment of the present invention.
FIG. 2 is a flowchart of exemplary steps of a method of knowledge-graph construction, according to one embodiment of the present invention.
FIG. 3 shows a schematic diagram of knowledge extraction of text using a sliding window, according to one embodiment of the invention.
FIG. 4 shows a schematic block diagram of a knowledge-graph building system according to one embodiment of the present invention.
FIG. 5 shows a schematic block diagram of a computing device according to an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, exemplary embodiments according to the present invention will be described in detail below with reference to the accompanying drawings. It is to be understood that the described embodiments are merely a subset of embodiments of the invention and not all embodiments of the invention, with the understanding that the invention is not limited to the example embodiments described herein. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the invention described herein without inventive step, shall fall within the scope of protection of the invention.
As described above, the existing knowledge graph construction method may lose much structured data information, does not consider dynamically changing knowledge, and may have errors.
Therefore, the present invention provides a method for constructing a knowledge graph, the method comprising: acquiring a text serving as raw data, and performing chapter-level knowledge extraction and sentence-level event extraction on the text to obtain a knowledge extraction result and an event extraction result; and performing knowledge fusion on the knowledge extraction result and the event extraction result to obtain a knowledge graph.
According to the knowledge graph construction method, the concept of the knowledge graph is changed from the traditional static knowledge triple into the composite knowledge containing the static knowledge quintuple and the dynamic event knowledge, and the chapter-level element extraction and element association are used, so that the extraction efficiency and recall rate of the quintuple can be greatly improved, and the structured attribute of the knowledge is stronger.
The method of constructing a knowledge graph, the system for constructing a knowledge graph, and the computing device according to the present invention are described in detail below with reference to specific embodiments.
First, an electronic apparatus 100 for implementing the knowledge graph construction method, the knowledge graph construction system, and the computing apparatus according to the embodiments of the present invention is described with reference to fig. 1.
In one embodiment, the electronic device 100 may be, for example, a laptop, a desktop computer, a tablet computer, a learning machine, a mobile device (such as a smartphone, a telephone watch, etc.), an embedded computer, a tower server, a rack server, a blade server, or any other suitable electronic device.
In one embodiment, the electronic device 100 may include at least one processor 102 and at least one memory 104.
The memory 104 may be a volatile memory, such as a Random Access Memory (RAM), a cache memory (cache), a Dynamic Random Access Memory (DRAM) (including stacked DRAMs), or a High Bandwidth Memory (HBM), or may be a non-volatile memory, such as a Read Only Memory (ROM), a flash memory, a 3D Xpoint, or the like. In one embodiment, some portions of memory 104 may be volatile memory, while other portions may be non-volatile memory (e.g., using a two-level memory hierarchy). The memory 104 is used to store a computer program that, when executed, enables the client functionality (implemented by the processor) of the embodiments of the invention described below and/or other desired functionality.
Processor 102 may be a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a microprocessor, or another processing unit having data processing capabilities and/or instruction execution capabilities. The processor 102 may be communicatively coupled to any suitable number or variety of components, peripherals, modules, or devices via a communication bus. In one embodiment, the communication bus may be implemented using any suitable protocol, such as Peripheral Component Interconnect (PCI), Peripheral Component Interconnect Express (PCIe), Accelerated Graphics Port (AGP), HyperTransport, or any other bus or one or more point-to-point communication protocols.
The electronic device 100 may also include an input device 106 and an output device 108. The input device 106 is a device for receiving user input, and may include a keyboard, a mouse, a touch pad, a microphone, and the like. In addition, the input device 106 may be any interface for receiving information. The output device 108 may output various information (e.g., images or sounds) to an external (e.g., user), which may include one or more of a display, speakers, and the like. The output device 108 may be any other device having an output function, such as a printer.
An exemplary step flow diagram of a knowledge-graph construction method 200 according to one embodiment of the invention is described below with reference to FIG. 2. As shown in FIG. 2, the knowledge-graph construction method 200 may include the steps of:
in step S210, a text as raw data is acquired, and chapter-level knowledge extraction and sentence-level event extraction are performed on the text to obtain a knowledge extraction result and an event extraction result.
In one embodiment, the raw data may include structured data, semi-structured data, and unstructured data; the present invention is primarily used to process unstructured data.
Structured data refers to data that can be stored directly in a relational database, i.e., in the form of traditional triples, such as (Mao Dun, original name, Shen Dehong).
Semi-structured data is a form of structured data that does not conform to the structure of a relational database or other associated data tables, but contains relevant tags to separate semantic elements and to organize records and fields into layers. It is therefore also referred to as a self-describing structure. Common semi-structured data formats include XML, JSON, and the like.
Unstructured data refers to plain text data without structured information.
In one embodiment, any pre-trained language model known in the art (e.g., the BERT (Bidirectional Encoder Representations from Transformers) model, RoBERTa model, ALBERT model, NEZHA model, XLNet model, GPT model, UniLM model, etc.) may be used for chapter-level knowledge extraction of text, and the invention is not limited in this respect. However, the pre-trained language models known in the art cannot by themselves solve the problem of chapter-level knowledge extraction; therefore, on the basis of a pre-trained language model, the present invention applies some additional processing in the training stage and adopts a sliding-window strategy to solve the problem of chapter-level knowledge extraction.
Specifically, the processing of the pre-trained language model in the training phase includes:
(1) when the text is too long, a sliding window strategy is used for dividing the long text (for example, more than 512 characters) into a plurality of text segments, and each text segment is combined with a question to form a piece of training data;
(2) Referring to the SQuAD reading-comprehension task, the start and end positions of the answer within the body text are predicted. Four cases are handled:
-the sliding window contains the complete answer: the start and end positions each point to a specific Tok position;
-the sliding window contains no answer segment: both the start and end positions are the [CLS] position;
-the sliding window contains the front part of the answer: the start position is a specific Tok position and the end position is the [CLS] position;
-the sliding window contains the rear part of the answer: the start position is the [CLS] position and the end position is a specific Tok position.
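The four labeling cases above can be sketched in code. This is an illustrative sketch, not the patent's implementation; the convention that each window reserves position 0 for [CLS] and shifts body tokens by one is our assumption:

```python
CLS = 0  # assumed position of the [CLS] token within each window


def span_labels(win_start, win_end, ans_start, ans_end):
    """Map a global answer span to window-local start/end labels.

    win_start/win_end: global token range covered by the sliding window
    ans_start/ans_end: global token range of the answer (inclusive)
    Returns (start_label, end_label) in window-local coordinates, using
    the [CLS] position when an endpoint falls outside the window.
    """
    offset = 1  # window tokens are shifted by one for the leading [CLS]
    inside = lambda p: win_start <= p <= win_end
    if ans_end < win_start or ans_start > win_end:
        # Case 2: no part of the answer lies in this window
        return CLS, CLS
    # Case 1: both endpoints inside -> (Tok, Tok)
    # Case 3: only the front part inside -> (Tok, CLS)
    # Case 4: only the rear part inside -> (CLS, Tok)
    start = ans_start - win_start + offset if inside(ans_start) else CLS
    end = ans_end - win_start + offset if inside(ans_end) else CLS
    return start, end
```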
In one embodiment, extracting chapter-level knowledge from text may include the steps of:
in step a1, based on the preset category label, using a sliding window to extract knowledge elements corresponding to the category label in the text;
in step a2, the extracted knowledge elements are associated to obtain a knowledge extraction result.
In one embodiment, the category labels may be set in levels, such as primary labels, secondary labels, tertiary labels, and so on. Exemplary category labels are shown in Table 1 below:
Table 1 (the table is provided as an image in the original document and is not reproduced here)
Among these, the secondary labels network location, WeChat, Sina, QQ, Taobao, occupation, and graduation school can be grouped.
Referring now to FIG. 3, FIG. 3 illustrates a diagram of knowledge extraction of text using a sliding window, according to one embodiment of the invention.
As shown in fig. 3, for a chapter scene of long text, the size of the sliding window (shown as a solid rectangle in fig. 3) and the sliding step may be preset so that the window slides over the entire text of the chapter. In fig. 3, taking a 1000-word text (including constructed tokens such as [CLS] and [SEP]) as an example, the preset sliding window size is 512 words and the sliding step is 128 words; sliding yields a total of 8 windows (not all shown in the figure), whose starting positions are words 1, 129, 257, 385, 513, 641, 769 and 897, respectively.
When the sliding window is used for sliding on a long text as original data, the prediction probability of each word in the text in the sliding window is calculated for each preset class label, then the average value (shown by a dotted rectangle in fig. 3) of the prediction probabilities of the same word at each position of the sliding window is calculated to serve as the final prediction probability of the word, and then the knowledge element is selected from the text according to the preset knowledge element selection condition based on the final prediction probability of each word.
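The window layout and the per-token probability averaging described above can be sketched as follows. This is an illustrative sketch; representing the model output as a mapping from window start offsets to per-token probability vectors is our assumption:

```python
import numpy as np


def window_starts(n_tokens, stride=128):
    """Start offsets (0-indexed) of the sliding windows over a chapter;
    for the 1000-token example above, stride 128 gives 8 windows."""
    return list(range(0, n_tokens, stride))


def average_token_probs(n_tokens, window_probs):
    """Average each token's predicted probability over every window that
    contains it; window_probs maps a window's start offset to its
    per-token probability vector (windows near the end may be shorter)."""
    total = np.zeros(n_tokens)
    count = np.zeros(n_tokens)
    for start, probs in window_probs.items():
        end = start + len(probs)
        total[start:end] += probs
        count[start:end] += 1
    return total / np.maximum(count, 1)
```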
In one embodiment, after obtaining the final prediction probability for each word, knowledge elements may be selected from the raw data according to the following exemplary knowledge element selection conditions:
- output the 10 groups of start-position and end-position probabilities with the highest final prediction probability;
- compare against the [CLS] position probability, and remove any of the 10 groups whose final prediction probability is lower than the [CLS] position probability;
- from the remaining groups, remove those whose final prediction probability is lower than one tenth of the highest prediction probability.
In addition, when text segments formed from different groups of start and end positions overlap, the overlapping portion is trimmed from the earlier segment; for example, the two groups of start and end positions <5,10> and <8,12> may be modified to <5,7> and <8,12>.
And after the final groups of starting and ending positions are obtained according to the knowledge element selection conditions, the corresponding text segments are the selected knowledge elements.
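The selection conditions and the overlap trimming above can be sketched as follows. This is an illustrative sketch; the `(start, end, prob)` span representation is an assumption, not the patent's data format:

```python
def select_spans(candidates, cls_prob, top_k=10):
    """Apply the selection conditions: keep the top_k highest-scoring
    (start, end, prob) spans, drop those scoring below the [CLS]
    (no-answer) probability, then drop those below one tenth of the
    best remaining span's probability."""
    kept = sorted(candidates, key=lambda c: c[2], reverse=True)[:top_k]
    kept = [c for c in kept if c[2] >= cls_prob]
    if not kept:
        return []
    best = kept[0][2]
    return [c for c in kept if c[2] >= best / 10]


def trim_overlaps(spans):
    """Trim the earlier of two overlapping spans, as in the example
    where <5,10> and <8,12> become <5,7> and <8,12>."""
    spans = sorted(spans)
    out = []
    for s, e in spans:
        if out and out[-1][1] >= s:
            out[-1] = (out[-1][0], s - 1)
        out.append((s, e))
    return out
```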
In one embodiment, any language characterization model known in the art (e.g., BERT model) may be used to correlate the extracted knowledge elements, as the invention is not limited in this respect.
In one embodiment, associating the extracted knowledge elements may comprise the steps of:
in step b1, each word of the extracted knowledge elements is processed to obtain a word vector representation for each word.
Illustratively, the word vectors for each word are obtained using a pre-training model or other model (e.g., a BERT model).
In step b2, the word vectors of all the words corresponding to each knowledge element are processed to obtain an element vector of the knowledge element.
For example, the word vectors of all words may be processed using a neural network algorithm + pooling to obtain the element vector of the knowledge element.
In step b3, the element vectors of the knowledge elements are subjected to association determination, and association results for the knowledge elements are obtained.
For example, any classification network (e.g., sigmoid network) known in the art may be used to perform the association determination on the element vectors of the two knowledge elements, which is not limited in the present invention.
For example, one or more knowledge elements may be taken as the center and the other knowledge elements subjected to binary classification, thereby obtaining association results for the respective knowledge elements. For example, with a person name as the center, all other knowledge elements are classified into two categories; if the extracted knowledge elements contain no person name, association determination is not performed and those knowledge elements are treated as invalid.
Illustratively, for secondary labels that need to be grouped, such as network location, WeChat, Sina, QQ, Taobao, occupation, graduation school, etc., it is determined whether other knowledge elements are associated with them, preferably centered on the content of the first row of the tertiary labels.
In step b4, the knowledge elements are combined according to the association result to obtain a knowledge extraction result.
Illustratively, the knowledge extraction results may include triplets and/or quintuples of knowledge elements.
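Steps b1-b4 can be sketched at a high level as follows. This is an illustrative sketch: mean pooling and a single-layer sigmoid classifier are simple stand-ins for the neural network and classification network mentioned above, and the weights `w` and bias `b` would come from training:

```python
import numpy as np


def element_vector(word_vecs):
    """Mean-pool the word vectors of a knowledge element into one
    element vector (a simple stand-in for the neural-network-plus-
    pooling processing of step b2)."""
    return np.mean(word_vecs, axis=0)


def association_score(center_vec, other_vec, w, b):
    """Step b3: binary association score for two element vectors via a
    sigmoid classifier over their concatenation; w and b are assumed
    trained parameters, illustrative only."""
    x = np.concatenate([center_vec, other_vec])
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))
```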
Table 2 below shows exemplary knowledge extraction results.
Table 2 (the table is provided as an image in the original document and is not reproduced here)
In one embodiment, after the knowledge extraction result is obtained, post-processing may be performed on the knowledge extraction result. Illustratively, the post-processing may include:
(1) if the content of a knowledge element is associated with a plurality of knowledge elements, then, according to the association probability, the group or groups of associated content whose association probability is greater than a preset threshold are retained;
(2) if two knowledge elements to be predicted appear in a plurality of sliding windows and each of those windows gives a prediction result, the group of results with the highest association probability is taken as the final prediction result.
In one embodiment, sentence-level event extraction on text may comprise the steps of:
in step c1, it is predicted whether each word in each sentence of the text is a trigger word and its trigger word type based on a preset event trigger word tag.
In one embodiment, the event trigger word tag may be customized as needed, which is not limited in the present invention.
Table 3 below illustrates example event trigger word tags according to one embodiment.
Table 3 (the table is provided as an image in the original document and is not reproduced here)
In one embodiment, the trained trigger word model may be used to predict trigger words and their trigger word types. The trigger word model may be obtained by training any suitable model known in the art (e.g., BERT semantic representation model, etc.), and the present invention is not limited thereto.
In one embodiment, a semantic representation model in the trigger word model (e.g., a trained BERT semantic representation model) may be used to obtain a semantic representation of each word in a sentence; each word is then classified (classification uses a sequence labeling scheme, with BIOES tags combined with the trigger word category names), and trigger word sequences whose output probability exceeds a predetermined threshold are taken as the trigger word prediction result. It should be understood that there may be multiple trigger words in a sentence.
In one embodiment, an example process by which the trigger word model obtains and uses the semantic representation of each word in a sentence is as follows: acquire the semantic representation of [CLS] and of each word in the sentence, then output a sequence labeling result for each word (BIOES tags combined with the trigger word category names), and take the trigger word sequences whose output probability exceeds a preset threshold as the trigger word prediction result.
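Decoding a BIOES tag sequence into trigger word spans, as described above, can be sketched as follows. This is an illustrative sketch; the category name "X" is hypothetical, since the actual trigger word labels are those of Table 3:

```python
def decode_bioes(tokens, tags):
    """Decode a BIOES tag sequence (e.g. 'B-X', 'I-X', 'E-X', 'S-X',
    'O') into (text, category) spans; category names are illustrative."""
    spans, buf, typ = [], [], None
    for tok, tag in zip(tokens, tags):
        prefix, _, name = tag.partition('-')
        if prefix == 'S':                     # single-token trigger
            spans.append((tok, name))
            buf, typ = [], None
        elif prefix == 'B':                   # begin a multi-token trigger
            buf, typ = [tok], name
        elif prefix == 'I' and buf:           # inside a trigger
            buf.append(tok)
        elif prefix == 'E' and buf:           # end of a trigger
            buf.append(tok)
            spans.append((''.join(buf), typ))
            buf, typ = [], None
        else:                                 # 'O' or inconsistent tags
            buf, typ = [], None
    return spans
```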
In step c2, using the predicted trigger word, trigger word type and trigger word position, predicting the event parameter and its parameter type corresponding to the trigger word in the sentence based on the preset event parameter tag.
In one embodiment, the event parameter tags may be customized as desired, but are not limited in this respect.
Example event parameter tags according to one embodiment are shown in Table 4 below.
Table 4 (the table is provided as an image in the original document and is not reproduced here)
In one embodiment, the event parameters and their parameter types corresponding to the predicted trigger words may be predicted using a trained parameter model. The parameter model may be obtained by training any suitable model known in the art (e.g., a BERT semantic representation model, etc.), and the invention is not limited in this respect.
In one embodiment, the parameter model differs from the trigger word model only by one additional flag bit at the input: the flag bit at the position of the trigger word currently being processed is 1 and the flag bits at all other positions are 0, thereby indicating which trigger word is currently being processed. After the semantic representation model in the parameter model, the subsequent steps are similar to the prediction process of the trigger word model.
In one embodiment, an example process by which the parameter model predicts the event parameters corresponding to a predicted trigger word and their parameter types is as follows: set the Segment value at the positions corresponding to the predicted trigger word to 1 and at all other positions to 0; obtain the semantic representation of each word in the sentence using, for example, a BERT semantic representation model; for each word's semantic representation, output a sequence labeling result (BIOES tags combined with parameter categories); and take any sequence whose output probability exceeds the predetermined threshold as the parameter prediction result.
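The extra flag bit described above can be sketched as a simple 0/1 Segment vector; the helper name and the inclusive-span convention are our assumptions for illustration:

```python
# Sketch (assumption): building the extra "Segment" flag described above --
# 1 at the positions of the trigger currently being processed, 0 elsewhere --
# so the parameter model knows which trigger the predicted arguments attach to.
def build_trigger_segment(seq_len, trigger_start, trigger_end):
    """Return a 0/1 flag vector of length seq_len; [trigger_start, trigger_end] inclusive."""
    return [1 if trigger_start <= i <= trigger_end else 0
            for i in range(seq_len)]
```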
In step c3, the predicted trigger words and event parameters are combined to obtain an event extraction result.
In one embodiment, the event extraction result may include triples and/or quintuples composed of the trigger words and event parameters.
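One possible way to combine a predicted trigger with its predicted parameters into such a result is sketched below; the record schema and the (trigger, role, argument) triple layout are assumptions, as the patent does not fix the exact format:

```python
# Sketch (assumed schema): combine a predicted trigger with its predicted
# parameters into one event record, from which (trigger, role, argument)
# triples can be read off.
def combine_event(trigger, trigger_type, params):
    """params: list of (argument_text, role) pairs predicted for this trigger."""
    return {
        "trigger": trigger,
        "type": trigger_type,
        "triples": [(trigger, role, arg) for arg, role in params],
    }
```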
In one embodiment, after the event extraction result is obtained, event coreference resolution processing may be performed on it. The goal of event coreference resolution is to determine whether mentions in the text point to the same event, and to remove duplicate events when they do.
In one embodiment, the event coreference resolution process may include two steps. The first is intra-chapter mention chain identification, which determines whether the event extraction results of a text point to the same event according to preset intra-chapter event coreference feature types. The second is cross-chapter event coreference resolution based on the mention chains, which determines whether an event extraction result of the text and an existing event in the knowledge graph point to the same event according to preset cross-chapter event coreference feature types.
In one embodiment, feature types for the event coreference resolution process may first be preset, including intra-chapter event coreference feature types and cross-chapter event coreference feature types. It is then determined, according to the intra-chapter feature types and a predetermined judgment rule, whether the event extraction results of the text point to the same event; and, according to the cross-chapter feature types and the predetermined judgment rule, whether an event extraction result of the text and an existing event in the knowledge graph point to the same event.
In an embodiment, the judgment rule may be preset as needed; for example, if a predetermined number of features (2, 3, 4, etc.) are the same, the two events may be judged to be the same event. The invention is not limited in this respect.
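A minimal sketch of this feature-count rule follows; the feature names and the threshold value of 3 are illustrative assumptions:

```python
# Sketch of the judgment rule above (threshold value is illustrative):
# two events are judged coreferent when at least `threshold` of the preset
# coreference features agree.
def same_event(features_a, features_b, threshold=3):
    """features_*: dict mapping feature name -> feature value."""
    matches = sum(1 for k in features_a
                  if k in features_b and features_a[k] == features_b[k])
    return matches >= threshold
```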
Most currently known intra-chapter event coreference resolution studies depend on the choice of feature space. Over the past decade, many scholars have introduced different features tailored to the specifics of the event coreference resolution task. Compared with traditional natural language processing, feature engineering here is more discrete in character, because a mention is composed of attributes, arguments, and the like.
Exemplary designs of the intra-chapter event coreference feature types and the cross-chapter event coreference feature types are given in Tables Five and Six below, respectively.
Table Five: Intra-chapter event coreference features
(table content provided as an image in the original document)
Table Six: Cross-chapter event coreference features
(table content provided as an image in the original document)
For ease of understanding, the features of Table Six are described below:
1) Mention chain attribute features
Each of two mention chains corresponds to one type and one subtype. When extracting type features, judging whether the types and subtypes of the two mention chains are consistent eliminates the influence of chains of different types on performance. When extracting generalization type features, because each mention chain may carry several different generalization types, the possible generalization types are enumerated and encoded, and the encoded values are used as generalization type features. Meanwhile, to judge whether the generalization types of two mention chains are consistent, the most frequent generalization type within each chain is selected as that chain's main generalization type, and generalization consistency between the chains is judged from the two main generalization types.
2) Mention chain trigger word features
The trigger word, as the core of an event's composition, plays an important role in the event coreference resolution task, and much work has been built around it. Based on traditional feature engineering, and considering that a mention chain may consist of multiple mentions, a trigger word set is extracted from each mention chain. The numbers of identical and differing trigger words are obtained by comparing whether the two sets share trigger words; the root word and part of speech of each trigger word are obtained using NLTK interfaces, and their consistency is judged respectively; and, to allow for differences in narration across articles, similarity between trigger words is computed as a trigger word similarity feature.
3) Mention chain argument features
Arguments are an important component of an event and play a key role in understanding event information: through argument information, the time, place, participants, and so on of an event can be known. Because the mentions in a mention chain all point to the same event, the argument information contained in each mention also points to the arguments of that same event; by collecting and combining the different argument roles and head words across the mentions of one chain, the arguments of the chain are obtained. Combining the different arguments and head words of different mentions greatly enriches the information contained in the chain's arguments. In the argument feature extraction part, the number of identical argument roles between two chains is compared as one feature, and whether the head words under identical argument roles overlap is compared as a further feature.
4) Mention chain distribution features
The distribution of mention chains in a text also helps with cross-chapter event coreference resolution. The more mentions a chain contains, the more important the event it points to is within the chapter; meanwhile, the event tends to be mentioned repeatedly in other articles under the same topic, consistent with news texts chasing and reporting hot events. From the mention frequency of a chain, a chain length ratio feature and a chain text importance feature are derived, the latter computed as the proportion of the chain's mentions among all mentions in the text. In addition, the positions of a chain within the text indicate its importance within the chapter; on this basis, the relative position of each mention in the chapter and the relative position of its sentence are obtained, and head-to-tail difference ratio features of the mention relative positions and of the sentence relative positions are derived from the position distribution within the chain.
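Two of the distribution features above can be sketched as follows; the function name, the token-offset representation of mentions, and the exact normalization are our assumptions:

```python
# Sketch (assumptions noted in the lead-in) of two distribution features:
# the chain's "text importance" (share of all mentions in the text that belong
# to this chain) and the head-to-tail difference of mention relative positions.
def chain_distribution_features(chain_positions, total_mentions, doc_len):
    """chain_positions: token offsets of this chain's mentions, sorted ascending."""
    importance = len(chain_positions) / total_mentions
    rel = [p / doc_len for p in chain_positions]   # relative positions in [0, 1)
    head_tail_diff = rel[-1] - rel[0]              # spread of the chain in the text
    return importance, head_tail_diff
```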
In step S220, the knowledge extraction result and the event extraction result are subjected to knowledge fusion to obtain a knowledge graph.
Knowledge fusion may include the following steps: performing entity linking processing on the knowledge extraction result and the event extraction result to obtain an entity linking result, where the entity linking result indicates whether entities in the knowledge extraction result and the event extraction result are the same entity and whether they are the same as existing entities in the knowledge graph; and recombining the knowledge extraction result and the event extraction result based on the entity linking result to obtain graph knowledge for the knowledge graph.
In one embodiment, the entity linking process includes: comparing the entities and attributes in the knowledge extraction result with those in the existing knowledge graph to determine whether they are the same entity, and comparing the entities and attributes in the event extraction result with those in the existing knowledge graph to determine whether they are the same entity. In other words, entity linking in the present invention applies not only to the triples or quintuples obtained by knowledge extraction; it must also be determined whether entity-type parameters in the event extraction result are the same entities as entities in existing triples or quintuples.
In one embodiment, the entity linking process uses an unsupervised entity alignment algorithm that jointly exploits relationship triples and attribute triples. The algorithm first performs iterative entity alignment using the attribute triples; the resulting alignments serve as training data for an embedding model, which obtains a further set of alignments from the relationship triples; finally, the two sets of alignment results are merged by a regression model into the final entity linking result.
In one embodiment, the entity linking process for the knowledge extraction result and the event extraction result may include the steps of:
in step a, the entity data in the knowledge base is preprocessed: the triples in the knowledge graph are divided into attribute triples and relationship triples according to the type of the object, and rules are formulated for the attribute triples to normalize the attribute values.
In step b, entity alignment processing is performed by using the attribute triples.
In step c, entity alignment processing is performed using the relationship triplets.
In step d, the two results are merged using a regression model: the output of the iterative model serves as training data, the regression model learns a weight, and the two result sets are merged accordingly, rather than assigning the weight manually. This makes better use of the distribution characteristics of the relationships and attributes of the entities in each data set; intuitively, a fixed weight is unreasonable because the number and quality of the relationships and attributes owned by entities differ across data sets.
In one embodiment, entity alignment using attribute triples may include: computing the attribute value similarity for each attribute shared by two entities (using, for example, edit distance or cosine distance), averaging the similarities over all shared attributes, and using the resulting average attribute value similarity to measure the similarity of the two entities, thereby obtaining an entity alignment result.
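The averaging step above can be sketched as follows. The patent mentions edit distance or cosine distance; using `difflib`'s similarity ratio here is our stand-in implementation choice, not the patent's:

```python
import difflib

# Sketch of the attribute-based similarity above: average a string-similarity
# score over the attributes that two entities share. difflib's ratio() is a
# stand-in for the edit-distance / cosine-distance measures the text mentions.
def entity_similarity(attrs_a, attrs_b):
    """attrs_*: dict mapping attribute name -> attribute value (strings)."""
    shared = set(attrs_a) & set(attrs_b)
    if not shared:
        return 0.0
    sims = [difflib.SequenceMatcher(None, attrs_a[k], attrs_b[k]).ratio()
            for k in shared]
    return sum(sims) / len(sims)
```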
Because different knowledge graphs share too few identical attributes, many attribute pairs that are expressed differently but have the same meaning can be found from the aligned entity pairs, and these new attribute pairs can in turn be used to find more entity pairs. Thus, in one embodiment, attribute alignment is performed simultaneously with entity alignment to find more entity pairs. In other words, the two tasks of entity alignment and attribute alignment interact and are executed iteratively together, alleviating the entity alignment problems caused by the diversity of attribute names.
In one embodiment, entity alignment using relationship triples may include: setting a relationship threshold and screening high-quality entity pairs from the results of attribute-based alignment to serve as training data for a model (for example, an embedding model) that performs entity alignment using the relationship triples, thereby obtaining a further set of entity alignment results.
In one embodiment, a regression model may be used to merge the two sets of entity alignment results into the entity linking result. Merging with a regression model better exploits the distribution characteristics of each data set's entity relationships and attributes, making the entity linking result more reasonable and accurate.
In one embodiment, knowledge fusion may further include: inferring new graph knowledge based on the obtained graph knowledge and/or the existing knowledge in the knowledge graph, and updating the knowledge graph with the new graph knowledge. The process of using existing knowledge (triples) in the graph to derive new relationships between entities, or new attributes (triples) of entities, is called knowledge reasoning. For example, given two triples in the knowledge graph, &lt;Zhang San, wife, Li Si&gt; and &lt;Zhang San, mother, Wang Wu&gt;, the triple &lt;Li Si, mother-in-law, Wang Wu&gt; can be obtained through knowledge reasoning.
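The family-relation example above can be sketched as a single hand-written inference rule; the rule encoding and function name are illustrative, since the patent allows any inference method:

```python
# Sketch (illustrative rule, not the patent's method): if X's wife is Y and
# X's mother is Z, infer that Y's mother-in-law is Z.
def infer_mother_in_law(triples):
    """triples: iterable of (head, relation, tail). Returns newly inferred triples."""
    wives   = {h: t for h, r, t in triples if r == "wife"}
    mothers = {h: t for h, r, t in triples if r == "mother"}
    return {(wives[h], "mother-in-law", mothers[h])
            for h in wives if h in mothers}
```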
Illustratively, any inference method known in the art may be employed to infer new graph knowledge, such as methods based on Tableaux calculation, logical programming rewriting, first-order query rewriting, production rules, and the like, without limitation.
In one embodiment, the knowledge-graph construction method 200 may further include a knowledge cleansing step to correct or remove redundant or erroneous information. The knowledge cleansing step may include redundant expression merging, obvious error correction and removal, similar entity inference and attribute name supplementation, synonymous attribute name merging, and the like; the invention is not limited in this respect.
In one embodiment, the knowledge cleansing may be performed using any model or neural network known in the art, and the invention is not limited in this respect.
In one embodiment, redundant expression merging may be implemented using an attribute value information parsing model. Taking household registration location as an example, the model may parse information such as province, city, district (county), town, village, residential community, building, and house number, and then merge expressions with a high degree of repetition. The attribute value information parsing model generally comprises two parts: entity boundary identification, which judges whether a character string forms a complete entity, and entity classification, which classifies identified entities into different preset classes.
For obvious error correction and removal, check rules can be set for attribute value data with clear constraints such as format requirements and value ranges, so as to correct or remove attribute value data that fails the check. For example, a domestic mobile phone number has 11 digits; if a stored number is 1336836, the data is obviously wrong and can be corrected or removed.
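A minimal sketch of such a check rule follows, using the 11-digit mobile number example; the exact pattern (11 digits starting with 1) is our assumption for illustration:

```python
import re

# Sketch of the check-rule idea above: a format rule for the 11-digit mainland
# mobile number example. The leading-1 constraint is an illustrative assumption.
PHONE_RULE = re.compile(r"^1\d{10}$")

def is_valid_phone(value):
    """Return True if the attribute value passes the format check rule."""
    return bool(PHONE_RULE.match(value))
```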
In one embodiment, a graph neural network with inductive reasoning capability, such as the Graph Inductive Learning (GraIL) model, may be used to perform similar entity inference and supplement attribute names; the invention is not limited in this respect.
An example process for similar entity inference and supplementing attribute names using the GraIL model is briefly described below.
First, the GraIL model extracts an enclosing subgraph around the fact to be judged: k-hop neighborhood subgraphs are extracted around the head and tail nodes of the fact, and the intersection of the two neighborhood subgraphs yields the enclosing subgraph around the fact.
Second, the extracted enclosing subgraph structure is used to encode features for the nodes in the subgraph. Here, the feature representation is obtained by measuring the distance between each node in the subgraph and the target nodes: for a node i in a subgraph with target nodes u and v, the feature is the tuple (d(i, u), d(i, v)), where d(·,·) denotes the shortest distance between two nodes. As a special case, the nodes u and v themselves are represented by (0, 1) and (1, 0), respectively.
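This distance-tuple labelling can be sketched with plain BFS on an adjacency dict; the function names and undirected-graph representation are our assumptions:

```python
from collections import deque

# Sketch of the node labelling above: each node i gets the tuple
# (d(i, u), d(i, v)) of shortest-path distances to the two target nodes,
# computed by BFS; u and v themselves get the special labels (0,1) and (1,0).
def bfs_distances(adj, source):
    dist, queue = {source: 0}, deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj.get(node, ()):
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def node_labels(adj, u, v):
    du, dv = bfs_distances(adj, u), bfs_distances(adj, v)
    labels = {i: (du.get(i), dv.get(i)) for i in adj}
    labels[u], labels[v] = (0, 1), (1, 0)   # special case stated in the text
    return labels
```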
Finally, the fact to be judged is evaluated: the GraIL model reasons over the enclosing subgraph and scores the plausibility of the fact.
Because no pre-trained node features are required at any point in the process, GraIL has the property of inductive learning.
An example design of the loss function when training the network parameters of the GraIL model is as follows:
L = Σ_i max(0, score(n_i) - score(p_i) + γ), where γ is a margin
where n_i and p_i denote a negative and a positive sample, respectively; the positive samples come from the training data set, and the negative samples are generated by randomly replacing the head or tail entity of a positive sample. The higher the score of negative samples and the lower the score of positive samples, the greater the loss. The goal of optimization is for all plausible facts to score high and implausible facts to score low, so that the GraIL model gains the ability to judge correct facts.
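A minimal pure-Python sketch of such a margin-based ranking loss (symbol names ours) follows:

```python
# Sketch of the margin-based loss described above: penalise sample pairs where
# the negative sample's score is not at least `margin` below the positive's.
def margin_ranking_loss(pos_scores, neg_scores, margin=1.0):
    """pos_scores, neg_scores: aligned lists of positive/negative sample scores."""
    return sum(max(0.0, n - p + margin)
               for p, n in zip(pos_scores, neg_scores))
```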
When the trained GraIL model is used to infer a target attribute, a candidate set can be generated from the attribute to be predicted, each fact in the candidate set is scored for plausibility, and the attribute in the highest-scoring fact is the predicted attribute.
The essence of synonymous attribute name merging is finding attribute names with the same meaning, i.e., a text similarity test. In one embodiment, a Sentence-BERT (SBERT) model may be employed for synonymous attribute name merging. This model structure uses siamese and triplet network structures to generate semantically meaningful sentence embedding vectors, embedding semantically similar sentences as nearby vectors so that similarity calculations (cosine similarity, Manhattan distance, Euclidean distance) can be performed. When searching for the most similar sentence pairs, the computation cost of this structure is as low as about 5 seconds (computing cosine similarity takes about 0.01 s) while precision remains unchanged. The SBERT model can thus handle new tasks such as similarity comparison, clustering, and semantics-based information retrieval.
For fine-tuning, the Sentence-BERT model may employ siamese and triplet networks to update the weight parameters so that the generated sentence vectors are semantically meaningful.
The Sentence-BERT model depends on specific training data; in one embodiment, one of the following objective functions may be employed: a classification objective function, a regression objective function, or a triplet objective function.
When the triplet objective function is adopted, given an anchor sentence a, a positive sentence p, and a negative sentence n, the triplet loss tunes the network so that the distance between a and p is smaller than the distance between a and n. Mathematically, the following loss function is minimized:
max(||s_a - s_p|| - ||s_a - s_n|| + ε, 0)
where s_a, s_p, and s_n denote the sentence embedding vectors of a, p, and n, ||·|| denotes a distance metric, and the margin parameter ε ensures that s_p is at least ε closer to s_a than s_n is.
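The triplet objective above can be sketched directly; here the embeddings are plain Python lists with Euclidean distance, rather than a real SBERT encoder's output:

```python
import math

# Sketch of the triplet objective above, using Euclidean distance between
# toy embedding vectors s_a (anchor), s_p (positive), s_n (negative).
def triplet_loss(s_a, s_p, s_n, eps=1.0):
    dist = lambda x, y: math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))
    return max(dist(s_a, s_p) - dist(s_a, s_n) + eps, 0.0)
```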
In another embodiment, the invention provides a knowledge-graph building system. Referring to FIG. 4, FIG. 4 shows a schematic block diagram of a knowledge-graph building system 400 according to one embodiment of the present invention. As shown in FIG. 4, the knowledge-graph building system 400 may include a composite knowledge extraction module 410 and a composite knowledge fusion module 420. The composite knowledge extraction module 410 is configured to perform chapter-level knowledge extraction and sentence-level event extraction on the obtained text as the original data to obtain a knowledge extraction result and an event extraction result. The composite knowledge fusion module 420 is configured to perform knowledge fusion on the knowledge extraction result and the event extraction result to obtain a knowledge graph.
Those skilled in the art can understand the specific implementation method of the knowledge graph constructing system 400 according to the embodiment of the present invention in combination with the foregoing content, and for brevity, detailed descriptions of specific details are omitted here.
In yet another embodiment, the invention provides a computing device. Referring to FIG. 5, FIG. 5 illustrates a schematic block diagram of a computing device 500, according to an embodiment of the invention. As shown in fig. 5, the computing device 500 may include a memory 510 and a processor 520, wherein the memory 510 has stored thereon a computer program that, when executed by the processor 520, causes the processor 520 to perform the method 200 of knowledge-graph construction as described above.
Those skilled in the art can understand the detailed operations of the computing device 500 according to the embodiments of the present invention in combination with the foregoing descriptions, and for the sake of brevity, detailed details are not repeated here, and only some main operations of the processor 520 are described as follows:
acquiring a text serving as original data, and performing chapter-level knowledge extraction and sentence-level event extraction on the text to obtain a knowledge extraction result and an event extraction result; and
and carrying out knowledge fusion on the knowledge extraction result and the event extraction result to obtain a knowledge graph.
In yet another embodiment, the present invention provides a computer readable medium having stored thereon a computer program which, when executed, performs the method 200 of knowledge-graph construction as described in the previous embodiments. Any tangible, non-transitory computer-readable medium may be used including magnetic storage devices (hard disks, floppy disks, etc.), optical storage devices (CD-ROMs, DVDs, blu-ray discs, etc.), flash memory, and/or the like. These computer program instructions may be loaded onto a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions which execute on the computer or other programmable data processing apparatus create means for implementing the functions specified. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including means for implementing the function specified. The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified.
The invention has the following beneficial effects:
(1) The invention changes the conception of the knowledge graph from traditional static knowledge triples to composite knowledge comprising static knowledge quintuples and dynamic event knowledge, and uses chapter-level element extraction together with element association, which can greatly improve the extraction efficiency and recall of quintuples and gives the knowledge stronger structural attributes.
(2) Entity parameters in the event extraction results are used for entity linking and cross-chapter event coreference resolution, associating events with entity information to form a composite knowledge graph.
(3) The invention has a complete automated knowledge cleansing and updating process, which can effectively improve the quality of the final knowledge graph.
Although the illustrative embodiments have been described herein with reference to the accompanying drawings, it is to be understood that the foregoing illustrative embodiments are merely exemplary and are not intended to limit the scope of the invention thereto. Various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the scope or spirit of the present invention. All such changes and modifications are intended to be included within the scope of the present invention as set forth in the appended claims.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the invention and aiding in the understanding of one or more of the various inventive aspects. However, the method of the present invention should not be construed to reflect the intent: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
It will be understood by those skilled in the art that all of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where such features are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The usage of the words first, second and third, etcetera do not indicate any ordering. These words may be interpreted as names.
The above description is only for the specific embodiment of the present invention or the description thereof, and the protection scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and the changes or substitutions should be covered within the protection scope of the present invention. The protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (12)

1. A method of knowledge graph construction, the method comprising:
acquiring a text serving as original data, and performing chapter-level knowledge extraction and sentence-level event extraction on the text to obtain a knowledge extraction result and an event extraction result; and
and carrying out knowledge fusion on the knowledge extraction result and the event extraction result to obtain a knowledge graph.
2. The method of claim 1, wherein the chapter-level knowledge extraction of the text comprises:
based on a preset category label, performing knowledge extraction on the text by using a sliding window to extract a knowledge element corresponding to the category label in the text;
and associating the extracted knowledge elements to obtain the knowledge extraction result.
3. The method of claim 2, wherein knowledge extraction of the text using a sliding window comprises:
sliding the sliding window on the text, and calculating the prediction probability of each word in the text in the sliding window for each category label;
calculating the average value of all the prediction probabilities of the same word at all the positions of the sliding window to serve as the final prediction probability of the word;
and selecting the knowledge elements from the text according to preset knowledge element selection conditions based on the final prediction probability of each word.
4. The method of claim 2, wherein correlating the extracted knowledge elements comprises:
processing each word of the knowledge elements to obtain a word vector of each word;
processing the word vectors of all the words corresponding to the knowledge elements to obtain element vectors of the knowledge elements;
performing association judgment on the element vectors of the knowledge elements to obtain association results aiming at the knowledge elements;
and combining the knowledge elements according to the association result to obtain the knowledge extraction result.
5. The method of claim 1, wherein performing sentence-level event extraction on the text comprises:
predicting, based on preset event trigger word labels, whether each word in each sentence of the text is a trigger word, and the type of the trigger word;
predicting, based on preset event parameter labels and using the predicted trigger words, the types of the trigger words and the positions of the trigger words, the event parameters corresponding to the trigger words in the sentences and the parameter types thereof;
and combining the trigger words and the event parameters to obtain the event extraction result.
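The two-stage pipeline of claim 5 (trigger prediction first, then argument prediction conditioned on each trigger) can be sketched as follows; `predict_trigger` and `predict_arguments` are hypothetical stand-ins for the patent's models:

```python
def extract_events(sentence_tokens, predict_trigger, predict_arguments):
    """Sentence-level event extraction in two stages (claim 5):
    1) predict trigger words and their types;
    2) predict event parameters using each trigger, its type and position."""
    events = []
    for pos, token in enumerate(sentence_tokens):
        trigger_type = predict_trigger(token, pos, sentence_tokens)
        if trigger_type is None:  # not a trigger word
            continue
        args = predict_arguments(sentence_tokens, token, trigger_type, pos)
        # Combine trigger and parameters into one event extraction result.
        events.append({"trigger": token, "type": trigger_type, "arguments": args})
    return events
```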
6. The method of claim 1, wherein knowledge fusing the knowledge extraction results and the event extraction results comprises:
performing entity linking processing on the knowledge extraction result and the event extraction result to obtain an entity linking result, wherein the entity linking result indicates whether entities in the knowledge extraction result and the event extraction result are the same entity, and whether each entity is the same as an existing entity in the knowledge graph;
recombining the knowledge extraction result and the event extraction result based on the entity link result to obtain graph knowledge for the knowledge graph.
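The entity linking step of claim 6 can be sketched minimally as follows; `same_entity` is a hypothetical predicate standing in for the patent's (unspecified) entity-linking model:

```python
def link_entities(mentions, existing_entities, same_entity):
    """Link entity mentions from the extraction results to existing graph
    entities (claim 6). A mention that matches no existing entity is kept
    as a new entity of its own."""
    linked = {}
    for mention in mentions:
        match = next((e for e in existing_entities if same_entity(mention, e)), None)
        linked[mention] = match if match is not None else mention
    return linked
```

The extraction results would then be recombined around the linked entities to yield graph knowledge.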
7. The method of claim 6, wherein knowledge fusing the knowledge extraction results and the event extraction results further comprises:
deducing new graph knowledge based on the graph knowledge and/or existing knowledge in the knowledge graph;
updating the knowledge graph with the new graph knowledge.
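Claim 7 leaves the inference mechanism open. One common instance is rule-based transitive inference over triples; the single `located_in` rule below is illustrative only, not from the patent:

```python
def infer_located_in(triples):
    """Deduce new graph knowledge by transitivity over 'located_in' triples,
    an illustrative instance of the inference step in claim 7."""
    known = set(triples)
    new = set()
    for (a, r1, b) in known:
        for (b2, r2, c) in known:
            if r1 == r2 == "located_in" and b == b2 and a != c:
                fact = (a, "located_in", c)
                if fact not in known:
                    new.add(fact)
    return new
```

The deduced facts would then be written back to update the knowledge graph.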
8. The method of claim 5, wherein the method further comprises:
performing intra-chapter coreference chain identification on the event extraction results, so as to determine, according to preset intra-chapter event coreference feature types, whether the event extraction results of the text point to the same event;
performing cross-chapter event coreference resolution on the event extraction results, so as to determine, according to preset cross-chapter event coreference feature types, whether an event extraction result of the text and an existing event in the knowledge graph point to the same event;
and removing duplicate events based on the determination.
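Once coreference resolution has mapped every extraction to a canonical event, the deduplication of claim 8 reduces to keeping one representative per key. A minimal sketch, where `coref_key` is a hypothetical stand-in for the patent's intra- and cross-chapter coreference features:

```python
def deduplicate_events(events, coref_key):
    """Remove duplicate events (claim 8): events that coreference resolution
    maps to the same canonical key are collapsed to the first occurrence."""
    seen = {}
    for event in events:
        key = coref_key(event)
        if key not in seen:
            seen[key] = event
    return list(seen.values())
```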
9. The method of claim 1, further comprising a knowledge cleansing step for correcting or removing redundant or erroneous information, wherein the knowledge cleansing step comprises: merging redundant expressions, correcting and removing explicit errors, inferring similar entities and supplementing attribute names, and merging synonymous attribute names.
10. A knowledge graph building system, the system comprising:
the composite knowledge extraction module is used for performing chapter-level knowledge extraction and sentence-level event extraction on an acquired text serving as original data, so as to obtain a knowledge extraction result and an event extraction result;
and the composite knowledge fusion module is used for performing knowledge fusion on the knowledge extraction result and the event extraction result, so as to obtain a knowledge graph.
11. A computing device, characterized in that the computing device comprises a memory and a processor, the memory having stored thereon a computer program which, when executed by the processor, causes the processor to carry out the method according to any one of claims 1-9.
12. A computer-readable medium, characterized in that a computer program is stored on the computer-readable medium, which computer program, when executed, performs the method according to any one of claims 1-9.
CN202111396510.1A 2021-11-23 2021-11-23 Knowledge graph construction method, knowledge graph construction system and computing equipment Pending CN114817553A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111396510.1A CN114817553A (en) 2021-11-23 2021-11-23 Knowledge graph construction method, knowledge graph construction system and computing equipment

Publications (1)

Publication Number Publication Date
CN114817553A true CN114817553A (en) 2022-07-29

Family

ID=82525856

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111396510.1A Pending CN114817553A (en) 2021-11-23 2021-11-23 Knowledge graph construction method, knowledge graph construction system and computing equipment

Country Status (1)

Country Link
CN (1) CN114817553A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115292523A (en) * 2022-08-04 2022-11-04 中国科学院空天信息创新研究院 Spatiotemporal information reasoning method based on graph representation learning
CN115292523B (en) * 2022-08-04 2023-09-22 中国科学院空天信息创新研究院 Spatial-temporal information reasoning method based on graph representation learning
CN116069955A (en) * 2023-03-06 2023-05-05 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Space-time knowledge extraction method based on MDTA model
CN116383413A (en) * 2023-06-05 2023-07-04 湖南云略信息技术有限公司 Knowledge graph updating method and system based on medical data extraction
CN116383413B (en) * 2023-06-05 2023-08-29 湖南云略信息技术有限公司 Knowledge graph updating method and system based on medical data extraction
CN116776886A (en) * 2023-08-15 2023-09-19 浙江同信企业征信服务有限公司 Information extraction method, device, equipment and storage medium
CN116776886B (en) * 2023-08-15 2023-12-05 浙江同信企业征信服务有限公司 Information extraction method, device, equipment and storage medium
CN116882494A (en) * 2023-09-07 2023-10-13 山东山大鸥玛软件股份有限公司 Method and device for establishing non-supervision knowledge graph oriented to professional text
CN116882494B (en) * 2023-09-07 2023-11-28 山东山大鸥玛软件股份有限公司 Method and device for establishing non-supervision knowledge graph oriented to professional text
CN117131935A (en) * 2023-10-25 2023-11-28 浙商期货有限公司 Knowledge graph construction method oriented to futures field

Similar Documents

Publication Publication Date Title
CN114817553A (en) Knowledge graph construction method, knowledge graph construction system and computing equipment
WO2021093755A1 (en) Matching method and apparatus for questions, and reply method and apparatus for questions
CN110705301B (en) Entity relationship extraction method and device, storage medium and electronic equipment
CN109815336B (en) Text aggregation method and system
CN108268580A (en) The answering method and device of knowledge based collection of illustrative plates
CN110968699A (en) Logic map construction and early warning method and device based on event recommendation
CN112434535B (en) Element extraction method, device, equipment and storage medium based on multiple models
CN108959474B (en) Entity relation extraction method
CN112580357A (en) Semantic parsing of natural language queries
CN111125295A (en) Method and system for obtaining food safety question answers based on LSTM
US20240143644A1 (en) Event detection
CN115982379A (en) User portrait construction method and system based on knowledge graph
CN115438274A (en) False news identification method based on heterogeneous graph convolutional network
Samad et al. Effect of text processing steps on twitter sentiment classification using word embedding
CN114742016A (en) Chapter-level event extraction method and device based on multi-granularity entity differential composition
CN110969005A (en) Method and device for determining similarity between entity corpora
CN114218951A (en) Entity recognition model training method, entity recognition method and device
CN113360654A (en) Text classification method and device, electronic equipment and readable storage medium
WO2023083176A1 (en) Sample processing method and device and computer readable storage medium
CN114817510B (en) Question and answer method, question and answer data set generation method and device
CN116127013A (en) Personal sensitive information knowledge graph query method and device
CN109993190B (en) Ontology matching method and device and computer storage medium
CN115795060A (en) Entity alignment method based on knowledge enhancement
CN115688779A (en) Address recognition method based on self-supervision deep learning
WO2022227196A1 (en) Data analysis method and apparatus, computer device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination