CN116127087A - Knowledge graph construction method and device, electronic equipment and storage medium - Google Patents

Knowledge graph construction method and device, electronic equipment and storage medium

Info

Publication number
CN116127087A
CN116127087A CN202211558415.1A
Authority
CN
China
Prior art keywords
text
model
training
classification
relation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211558415.1A
Other languages
Chinese (zh)
Inventor
殷悦迪 (Yin Yuedi)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Health Insurance Company of China Ltd
Original Assignee
Ping An Health Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Health Insurance Company of China Ltd filed Critical Ping An Health Insurance Company of China Ltd
Priority to CN202211558415.1A priority Critical patent/CN116127087A/en
Publication of CN116127087A publication Critical patent/CN116127087A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F 16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/3331 Query processing
    • G06F 16/334 Query execution
    • G06F 16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/295 Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention provides a knowledge graph construction method and device, electronic equipment, and a computer-readable storage medium. The knowledge graph construction method comprises the following steps: acquiring an image material to be processed and recognizing it to obtain text slices of the image material; inputting the text of each text slice, the coordinates corresponding to the text, and the image material into a fully trained first multimodal model to obtain a sequence labeling result corresponding to the text; obtaining a first text relation classification according to the sequence labeling result and a fully trained second multimodal model; obtaining a text subject entity, a text object, and a second text relation classification according to the text containing the first text relation classification and a fully trained relation extraction model; and constructing a knowledge graph according to the text subject entity, the text object, and the second text relation classification. The knowledge graph construction method enables knowledge graph construction from insurance image materials.

Description

Knowledge graph construction method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a knowledge graph construction method, a knowledge graph construction device, an electronic device, and a computer readable storage medium.
Background
In the age of artificial intelligence, traditional data processing methods can no longer meet human needs for information integration and knowledge acquisition. Knowledge graph technology, one of the important foundations of artificial intelligence, has gained widespread attention in recent years for its powerful semantic information processing capability and its support for knowledge reasoning and analysis.
Abundant business knowledge exists in many application scenarios. In an insurance scenario, for example, it is mainly hidden in the image materials uploaded by users for underwriting review and claims; this valuable business knowledge acts only on the corresponding underwriting and claim cases and is not stored in structured form. Structuring it through knowledge graph technology would turn it into important reference data for applications such as enterprise product design and risk control. However, the prior art lacks a knowledge graph construction scheme for insurance image materials.
Disclosure of Invention
The invention aims to provide a knowledge graph construction method, a knowledge graph construction device, electronic equipment, and a computer-readable storage medium, so as to solve the technical problem that the prior art lacks a knowledge graph construction scheme for insurance image materials.
One technical scheme of the invention provides a knowledge graph construction method comprising the following steps:
acquiring an image material to be processed, recognizing it to obtain text slices of the image material, and inputting the text of each text slice, the coordinates corresponding to the text, and the image material into a fully trained first multimodal model to obtain a sequence labeling result corresponding to the text;
obtaining a first text relation classification according to the sequence labeling result corresponding to the text and a fully trained second multimodal model;
obtaining a text subject entity, a text object, and a second text relation classification according to the text containing the first text relation classification and a fully trained relation extraction model;
and constructing a knowledge graph according to the text subject entity, the text object, and the second text relation classification.
Further, the training process of the first multimodal model includes:
taking a text sample, the coordinates corresponding to the text sample, and a first image material sample as input of a first preset multimodal pre-training model; taking the text representations output by the first preset multimodal pre-training model as input of a linear layer; taking the representations output by the linear layer as input of a CRF layer; and having the CRF layer output the BIO sequence labeling result corresponding to the text sample.
Further, the sequence labeling result corresponding to the text includes text labeled with a BIO sequence. Correspondingly, obtaining the first text relation classification according to the sequence labeling result corresponding to the text and the fully trained second multimodal model includes:
inputting the BIO-labeled text, the coordinates corresponding to the BIO-labeled text, and the image material containing the BIO-labeled text into the fully trained second multimodal model, outputting a relation classification result, and obtaining the first text relation classification from the relation classification result.
Further, the training process of the second multimodal model includes:
taking a BIO-labeled text sample, the coordinates corresponding to the BIO-labeled text sample, and a second image material sample as input of a second preset multimodal pre-training model, so that the model outputs representation vectors for the text characters; applying a linear-layer transformation to the representation vectors of the first characters of the two entities whose relation is to be classified, to obtain transformed representations; taking the transformed representations as input of a Biaffine layer; and having the Biaffine layer output the relation classification result.
Further, obtaining the text subject entity, the text object, and the second text relation classification according to the text containing the first text relation classification and the fully trained relation extraction model includes:
performing fuzzy matching on the key entities in the text containing the first text relation classification to obtain fuzzy-matched text, and inputting the fuzzy-matched text into the fully trained relation extraction model to obtain the text subject entity, the text object, and the second text relation classification.
Further, before obtaining the text subject entity, the text object, and the second text relation classification according to the text containing the first text relation classification and the fully trained relation extraction model, a relation extraction model is constructed from a BERT text representation layer, a subject entity extraction layer, and an object extraction and relation classification layer, with a Conditional Layer Normalization network structure serving as the residual connection of the relation extraction model;
the training process of the relation extraction model includes:
taking text samples containing the first text relation classification as input of the relation extraction model, and the corresponding text subject entity, text object, and second text relation classification as its output; performing model training with the back-propagation algorithm, with the value of the joint loss function over the text subject entity and the text object as the training target; and stopping training when the joint loss has not decreased for a number of rounds greater than or equal to a preset count, so as to obtain the fully trained relation extraction model.
Further, constructing a knowledge graph according to the text subject entity, the text object, and the second text relation classification includes:
taking the text subject entity, the text object, and the second text relation classification as text to be standardized, inputting the text to be standardized into a fully trained similar-text retrieval model, outputting text with standardized entity names, and constructing the knowledge graph from the text with standardized entity names;
the training process of the similar-text retrieval model includes: forming a training data set from samples of text to be standardized and the code-base standard names corresponding to those samples; training a BERT model with the training data set; representing the corresponding code-base standard names with the trained BERT model to generate representation vectors; and generating a faiss index file from the representation vectors to obtain the similar-text retrieval model.
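The retrieval step of the similar-text model can be pictured with a minimal sketch. Here, hand-made toy vectors stand in for the BERT representation vectors, and a brute-force cosine-similarity search stands in for the faiss index; all names and numbers are illustrative, not taken from the patent.

```python
import math

def cosine(a, b):
    """Cosine similarity between two plain-list vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nearest_standard_name(query_vec, index):
    """Return the code-base standard name whose representation vector is
    most similar to the query vector (brute-force stand-in for faiss)."""
    return max(index, key=lambda name: cosine(query_vec, index[name]))

# Toy stand-ins for representation vectors of code-base standard names.
index = {
    "hypertension": [0.9, 0.1, 0.0],
    "diabetes": [0.1, 0.9, 0.2],
}
# A query vector close to the "hypertension" entry.
print(nearest_standard_name([0.8, 0.2, 0.1], index))
```

In the patented scheme the vectors would come from the trained BERT model and the search from the faiss index file; this sketch only shows the nearest-neighbor logic of the standardization step.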
The invention also provides a knowledge graph construction device comprising a data preprocessing module, a first relation classification module, a second relation classification module, and a graph construction module;
the data preprocessing module is used for acquiring an image material to be processed, recognizing it to obtain text slices of the image material, and inputting the text of each text slice, the coordinates corresponding to the text, and the image material into a fully trained first multimodal model to obtain a sequence labeling result corresponding to the text;
the first relation classification module is used for obtaining a first text relation classification according to the sequence labeling result corresponding to the text and a fully trained second multimodal model;
the second relation classification module is used for obtaining a text subject entity, a text object, and a second text relation classification according to the text containing the first text relation classification and a fully trained relation extraction model;
and the graph construction module is used for constructing a knowledge graph according to the text subject entity, the text object, and the second text relation classification.
Another technical scheme of the invention provides an electronic device including a memory and a processor, the memory storing a computer program executable by the processor, and the processor implementing the knowledge graph construction method of any one of the above schemes when executing the computer program.
Another technical scheme of the invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the knowledge graph construction method of any one of the above schemes.
The beneficial effects of the invention are as follows: an image material to be processed is acquired and recognized to obtain text slices; the text of each text slice, the coordinates corresponding to the text, and the image material are input into a fully trained first multimodal model to obtain a sequence labeling result corresponding to the text; a first text relation classification is obtained according to the sequence labeling result and a fully trained second multimodal model; a text subject entity, a text object, and a second text relation classification are obtained according to the text containing the first text relation classification and a fully trained relation extraction model; and a knowledge graph is constructed according to the text subject entity, the text object, and the second text relation classification. In this way, knowledge graph construction from insurance image materials is realized, and a relatively high-quality knowledge graph can be constructed comprehensively and rapidly.
Drawings
Fig. 1 is a schematic flow chart of a knowledge graph construction method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an original image material according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an image material including a plurality of text slices according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of text slice content provided in an embodiment of the present invention;
FIG. 5 is a schematic diagram of a network structure of a first multimodal model according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of labeling results provided by an embodiment of the present invention;
FIG. 7 is a schematic diagram of an image material including sequence annotations according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a network structure of a second multimodal model according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of an image material including a first text relationship classification according to an embodiment of the present invention;
fig. 10 is a schematic diagram of a network structure of a CASREL model according to an embodiment of the present invention;
fig. 11 is a schematic structural diagram of a knowledge graph construction device according to an embodiment of the present invention;
fig. 12 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In the description of this application, the words "first," "second," and the like are used solely for the purpose of distinguishing between descriptions and not necessarily for the purpose of indicating or implying a relative importance or order. The terms "comprising," "including," "having," and variations thereof herein mean "including but not limited to," unless otherwise specifically noted.
It should be noted that, in the embodiments of the present application, "and/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A alone, both A and B, or B alone. The character "/", unless otherwise specified, generally indicates an "or" relationship between the associated objects.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the invention. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
Fig. 1 is a flow chart of a knowledge graph construction method according to an embodiment of the present invention. It should be noted that, if the results are substantially the same, the knowledge graph construction method of the present invention is not limited to the flow sequence shown in fig. 1. As shown in fig. 1, the knowledge graph construction method mainly includes the following steps:
S1, acquiring an image material to be processed, recognizing it to obtain text slices of the image material, and inputting the text of each text slice, the coordinates corresponding to the text, and the image material into a fully trained first multimodal model to obtain a sequence labeling result corresponding to the text;
the image material to be processed may be medical record material uploaded by a user for underwriting review and/or a claim case, mainly including physical examination reports, outpatient records, admission summaries, examination reports, laboratory test reports, and the like. Medical record materials contain rich medical knowledge and treatment courses, such as which examinations or tests a user underwent at a hospital due to discomfort in a particular body part, or whether treating a certain disease required surgery or chemotherapy. Through OCR (optical character recognition) technology, the text slices in the image material are located, the text in each slice is recognized, and the coordinates and text of all text lines in the image material are obtained.
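The OCR output described above can be pictured with a small sketch. The field names `text`, `position`, and `id` follow the text-slice content shown in FIG. 4; the sample values and the reading-order heuristic are illustrative assumptions, not part of the patent.

```python
# Each text slice carries the recognized text, its bounding-box
# coordinates [x1, y1, x2, y2] in the image, and a serial number.
slices = [
    {"id": 1, "text": "某某医院", "position": [120, 30, 300, 60]},    # hospital name
    {"id": 2, "text": "姓名：某某某", "position": [40, 100, 200, 130]},
    {"id": 0, "text": "入院记录", "position": [150, 70, 260, 95]},
]

def reading_order(slices, row_tolerance=20):
    """Sort slices roughly top-to-bottom, then left-to-right, by bucketing
    the top edge y1 into rows and ordering each row by the left edge x1."""
    return sorted(slices,
                  key=lambda s: (s["position"][1] // row_tolerance,
                                 s["position"][0]))

for s in reading_order(slices):
    print(s["id"], s["text"])
```

As the document notes, coordinate order alone recovers only this reading sequence; recovering the paragraph structure requires the multimodal models described below.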
S2, obtaining a first text relation classification according to the sequence labeling result corresponding to the text and a fully trained second multimodal model; the first multimodal model and the second multimodal model may be different multimodal models;
S3, obtaining a text subject entity, a text object, and a second text relation classification according to the text containing the first text relation classification and a fully trained relation extraction model;
and S4, constructing a knowledge graph according to the text subject entity, the text object, and the second text relation classification.
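Step S4 can be sketched as collecting (subject, relation, object) triples and indexing them by subject. The triples below are illustrative examples of the kind of output steps S1 to S3 might produce, not data from the patent.

```python
from collections import defaultdict

def build_graph(triples):
    """Index (subject, relation, object) triples as subject -> [(relation, object)]."""
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        graph[subj].append((rel, obj))
    return graph

# Illustrative triples of the kind produced by entity and relation extraction.
triples = [
    ("patient_0001", "admission_diagnosis", "hypertension"),
    ("patient_0001", "treated_at", "Some Hospital"),
    ("hypertension", "treated_by", "antihypertensive therapy"),
]
graph = build_graph(triples)
print(graph["patient_0001"])
```

A production system would persist these edges in a graph database rather than an in-memory dict; the sketch only shows the triple-to-graph step.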
According to this embodiment of the invention, an image material to be processed is acquired and recognized to obtain text slices; the text of each text slice, the coordinates corresponding to the text, and the image material are input into a fully trained first multimodal model to obtain a sequence labeling result corresponding to the text; a first text relation classification is obtained according to the sequence labeling result and a fully trained second multimodal model; a text subject entity, a text object, and a second text relation classification are obtained according to the text containing the first text relation classification and a fully trained relation extraction model; and a knowledge graph is constructed according to the text subject entity, the text object, and the second text relation classification. Knowledge graph construction from insurance image materials is thereby realized, and a relatively high-quality knowledge graph can be constructed comprehensively and rapidly.
In an alternative embodiment, the training process of the first multimodal model includes:
taking a text sample, the coordinates corresponding to the text sample, and a first image material sample as input of a first preset multimodal pre-training model; taking the text representations output by the first preset multimodal pre-training model as input of a linear layer; taking the representations output by the linear layer as input of a CRF layer; and having the CRF layer output the BIO sequence labeling result corresponding to the text sample.
It should be noted that sequence labeling based on plain text performs poorly on image-material labeling tasks, mainly because it ignores the layout information of the image material, such as the typical fonts and positions of keyword entities like "name" and "discharge diagnosis". Rule-based restoration of image-material paragraphs from text and coordinate information has low accuracy and performs poorly on cases outside the rules' coverage. OCR analysis reliably yields the coordinates of each text slice and its corresponding text string, but the paragraph information of the text is lost, and restoring the paragraphs of an image report from text information alone has limited effect; jointly modeling the layout information and the text information of the image material restores the paragraphs more accurately and effectively.
In one embodiment, paragraph merging involves two parts: sequence labeling and relation classification. A schematic diagram of an original image material from an actual user's claim and/or underwriting application is shown in FIG. 2; after OCR recognition, the text of the original image material is divided into a plurality of text slices, as shown in FIG. 3; as shown in FIG. 4, the content of a text slice includes the text content "text", the coordinates "position" of the text in the image, and the serial number "id" of the slice. Using a BIO labeling scheme, the first multimodal network identifies key, value, hospital, and other entities in the image material, and slices are divided and merged based on the entity recognition result. As shown in FIG. 5, the network structure of the first multimodal model comprises a LayoutLMv3 network structure (the first preset multimodal pre-training model network structure) and a CRF layer. The model is trained with the back-propagation algorithm, with minimizing the CRF loss function as the training target; an early-stop mechanism is added, so that training stops when the CRF loss has not decreased for a number of rounds greater than or equal to a preset count (e.g., 3 rounds), and the model with the minimum CRF loss during training is output as the optimal model.
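The early-stop mechanism just described (stop once the loss has failed to decrease for a preset number of rounds, keeping the best model seen so far) can be sketched as follows; the class name, patience value, and loss numbers are illustrative.

```python
class EarlyStopping:
    """Stop training when the loss has not decreased for `patience` rounds."""
    def __init__(self, patience=3):
        self.patience = patience
        self.best_loss = float("inf")
        self.rounds_without_improvement = 0

    def step(self, loss):
        """Record one round's loss; return True when training should stop."""
        if loss < self.best_loss:
            self.best_loss = loss          # this round's model is the best so far
            self.rounds_without_improvement = 0
        else:
            self.rounds_without_improvement += 1
        return self.rounds_without_improvement >= self.patience

stopper = EarlyStopping(patience=3)
for epoch_loss in [0.9, 0.7, 0.71, 0.72, 0.73]:   # loss stops improving after 0.7
    if stopper.step(epoch_loss):
        break
print(stopper.best_loss)
```

In the patented scheme, the checkpoint associated with `best_loss` would be the "optimal model" that is output; the same mechanism applies to the CRF loss, the cross-entropy loss, and the joint loss mentioned elsewhere in the document.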
As shown in FIG. 5, in actual labeling, the text to be extracted, the coordinates corresponding to the text, and the image (the original image material) serve as input to LayoutLMv3; LayoutLMv3 outputs token representations of the text; the representations, after transformation by a Linear layer, serve as the emission scores of the CRF layer; and the CRF layer decodes the BIO sequence labeling result corresponding to the input text.
In another embodiment, the entities in the image material are labeled as sequences, as shown in Table 1.
Table 1 Entity types in the image material
Entity type    Entity description
key            Keyword
value          Value corresponding to a key
hospital       Hospital
other          Other
Slice labeling is performed on the OCR recognition results to obtain the labeling results; a schematic diagram of the labeling results is shown in FIG. 6, in which the "label" field marks the slice text as a key entity. The BIO labeling results are shown in Table 2.
Table 2 BIO labeling result for the text "姓名：某某某" ("full name: so-and-so")
姓        名        某        某        某
B-key     I-key     B-value   I-value   I-value
In the BIO labeling, for an input text such as "姓名：某某某", the output sequence labeling result is "B-key I-key other B-value I-value I-value", with the separator character labeled other.
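Decoding such a BIO tag sequence back into typed entities can be sketched as follows, using the "姓名：某某某" labeling result above; the function name is illustrative, not from the patent.

```python
def decode_bio(chars, tags):
    """Merge characters whose BIO tags form B-/I- runs into (type, text) entities."""
    entities, cur_type, cur_text = [], None, ""
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):                       # a new entity begins
            if cur_type:
                entities.append((cur_type, cur_text))
            cur_type, cur_text = tag[2:], ch
        elif tag.startswith("I-") and cur_type == tag[2:]:
            cur_text += ch                             # continue the current entity
        else:                                          # "other" or inconsistent I- tag
            if cur_type:
                entities.append((cur_type, cur_text))
            cur_type, cur_text = None, ""
    if cur_type:
        entities.append((cur_type, cur_text))
    return entities

chars = list("姓名：某某某")
tags = ["B-key", "I-key", "other", "B-value", "I-value", "I-value"]
print(decode_bio(chars, tags))
```

Running this on the Table 2 example yields one key entity ("姓名") and one value entity ("某某某"), which is exactly the pairing the relation classification stage then confirms.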
In specific implementation, as shown in FIG. 3, the text in the original image material is divided into a plurality of text slices; based only on the slice coordinate order, the slice sequence can be restored, but the paragraph information and structure information cannot. Image material containing sequence annotations can be obtained through the sequence labeling of the first multimodal model; as shown in FIG. 7, this labeling marks key-type entities such as "name", "date of admission", and "admission diagnosis", and value-type entities such as "male" and "37 years old".
In an optional embodiment, the sequence labeling result corresponding to the text includes text labeled with a BIO sequence. Correspondingly, obtaining the first text relation classification according to the sequence labeling result and the fully trained second multimodal model includes:
inputting the BIO-labeled text, the coordinates corresponding to the BIO-labeled text, and the image material containing the BIO-labeled text into the fully trained second multimodal model, outputting a relation classification result, and obtaining the first text relation classification from the relation classification result.
In a specific embodiment, the network structure of the second multimodal model includes a LayoutLMv3 network (the second preset multimodal pre-training model) and a Biaffine layer, as shown in FIG. 8. The second multimodal model performs the relation classification task.
In an alternative embodiment, the training process of the second multimodal model includes:
and taking the text sample marked by the BIO sequence, the corresponding coordinates of the text sample marked by the BIO sequence and the second image material sample as input of a second preset multi-mode pre-training model, enabling the second preset multi-mode pre-training model to output the representation vector of the text character, respectively carrying out linear layer transformation on the representation vectors of the first character in the two to-be-classified relation entities to obtain transformed representation, taking the transformed representation as input of a Biaffine layer, and outputting a relation classification result by the Biaffine layer.
In a specific embodiment, during training and prediction of the second multimodal model, after the entity type of each text slice is known, several (e.g., three) key entities nearest to each value entity are sampled to form candidate entity pairs to be classified; the Euclidean distance between the coordinates of the value entity (its upper-left corner point) and the key entity (its lower-right corner point) is computed and used as the measure of the distance between the value entity and each nearby key entity.
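The candidate-pair sampling can be sketched as below. The corner convention (value entity's upper-left corner against each key entity's lower-right corner) follows the description above; the [x1, y1, x2, y2] box format and all coordinates are illustrative assumptions.

```python
import math

def nearest_keys(value_box, key_boxes, k=3):
    """Return the k key entities nearest to a value entity, measured as the
    Euclidean distance from the value box's upper-left corner to each
    key box's lower-right corner. Boxes are [x1, y1, x2, y2]."""
    vx, vy = value_box[0], value_box[1]           # upper-left of the value box
    def dist(item):
        kx, ky = item[1][2], item[1][3]           # lower-right of the key box
        return math.hypot(vx - kx, vy - ky)
    return [name for name, _ in sorted(key_boxes.items(), key=dist)[:k]]

key_boxes = {
    "姓名":     [40, 100, 100, 130],
    "入院日期": [40, 150, 140, 180],
    "入院诊断": [40, 200, 140, 230],
    "出院诊断": [40, 400, 140, 430],
}
print(nearest_keys([110, 100, 200, 130], key_boxes, k=3))
```

Each value entity is then paired with these k nearest keys, and the Biaffine classifier decides which pair actually forms a key-value relation.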
The second multimodal model is trained with the back-propagation algorithm, with the value of the cross-entropy loss (CE loss) function as the training target; an early-stop mechanism is added, so that training stops when the CE loss has not decreased for a number of rounds greater than or equal to a preset count (e.g., 3 rounds), and the model with the minimum cross-entropy loss during training is output as the optimal model.
In practical application, as shown in FIG. 8, the text of a text slice, the coordinates corresponding to the text in the image, and the image (picture) serve as input to LayoutLMv3 (the second preset multimodal pre-training model), which outputs representation vectors for the text characters. The representation vectors of the first characters of the two entities to be classified (entity 1 and entity 2) are taken out and each transformed by a Linear layer; the transformed representations serve as input to the Biaffine layer, which outputs a classification result of 1 or 0 indicating whether the two entities form a key-value relation. The first text relation classification is obtained from these relation classification results, i.e., the classification of which entities in the text form key-value relations.
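The Biaffine classification over the two entities' first-character representations can be sketched in plain Python. Tiny 2-dimensional vectors and hand-set identity weights stand in for the learned Linear and Biaffine parameters; everything numeric here is an illustrative assumption.

```python
import math

def linear(x, W, b):
    """y = W x + b for plain-list weights."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def biaffine_score(h1, h2, U, w, b):
    """s = h1^T U h2 + w . [h1; h2] + b"""
    bilinear = sum(h1[i] * U[i][j] * h2[j]
                   for i in range(len(h1)) for j in range(len(h2)))
    lin = sum(wi * xi for wi, xi in zip(w, h1 + h2))
    return bilinear + lin + b

def is_key_value_pair(h_key, h_value):
    # Hand-set illustrative parameters; in the model these are learned.
    W1, b1 = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]   # Linear layer for entity 1
    W2, b2 = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]   # Linear layer for entity 2
    U = [[1.0, 0.0], [0.0, 1.0]]
    w, b = [0.0, 0.0, 0.0, 0.0], -0.5
    s = biaffine_score(linear(h_key, W1, b1), linear(h_value, W2, b2), U, w, b)
    return 1 if 1 / (1 + math.exp(-s)) > 0.5 else 0  # sigmoid threshold at 0.5

print(is_key_value_pair([1.0, 0.0], [1.0, 0.0]))
```

With these toy weights, aligned representations score as a key-value pair (1) and orthogonal ones do not (0), mirroring the 1/0 output of the Biaffine layer in FIG. 8.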
In specific implementation, the first text relation classification can be obtained according to the sequence labeling result corresponding to the text and the fully trained second multimodal model; a schematic diagram of image material containing the first text relation classification is shown in FIG. 9, where a line between boxes represents a first text relation classification, for example, the value corresponding to the key "gender" is "male". The first text relation classification structures the text sequence and thereby realizes paragraph understanding of the image material.
In an alternative embodiment, obtaining the text main entity, the text object and the second text relationship classification according to the text containing the first text relationship classification and the fully trained relationship extraction model includes:
performing fuzzy matching on the key entities in the text containing the first text relationship classification to obtain fuzzy-matched text, and inputting the fuzzy-matched text into the fully trained relationship extraction model to obtain the text main entity, the text object and the second text relationship classification.
In a specific embodiment, the text containing the first text relationship classification, that is, the text in which the key-value relationship of each entity has been classified (the structured text paragraph), can be given a paragraph topic classification based on a fuzzy matching algorithm applied to the text content of the key entities. Fuzzy matching of key-entity texts is implemented with the fuzzywuzzy toolkit. For example, descriptions of a department include "department", "doctor department", "executing department", "visiting department", "admitting department" and the like; when OCR (Optical Character Recognition) produces wrong characters in a key entity, or the accumulated corpus has no classification exactly matching the key entity, the similarity between the key-entity text and each department description can be computed with fuzzywuzzy, and the key entity can be classified under the department description with the highest similarity. Key-value pairs of the key-value type can directly return the paragraph topic classification extraction result, such as the key-value pairs "days in hospital: 7 days" and "gender: male". Paragraph types such as admission records and hospitalization course require further extraction tasks based on textual relationships.
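The key-entity fuzzy matching step can be illustrated as follows. The patent uses the fuzzywuzzy toolkit; this sketch substitutes the Python standard library's difflib.SequenceMatcher as a stand-in similarity measure, and the candidate key names and cutoff are illustrative assumptions:

```python
import difflib

# Standard key-entity names accumulated in the corpus (illustrative)
standard_keys = ["department", "doctor department", "admission diagnosis",
                 "past history", "discharge diagnosis"]

def classify_key(ocr_key, candidates, cutoff=0.6):
    """Return the standard key most similar to a possibly OCR-corrupted key,
    or None if no candidate is similar enough."""
    scored = [(difflib.SequenceMatcher(None, ocr_key, c).ratio(), c)
              for c in candidates]
    best_score, best = max(scored)
    return best if best_score >= cutoff else None

# "admissiom diagnosia" simulates OCR character errors
print(classify_key("admissiom diagnosia", standard_keys))
```

fuzzywuzzy's `fuzz.ratio` would be used in place of `SequenceMatcher.ratio` in the described implementation; the matching logic is the same shape.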
In an optional embodiment, before obtaining the text main entity, the text object and the second text relationship classification according to the text containing the first text relationship classification and a fully trained relationship extraction model, the method further comprises: constructing a relationship extraction model from a BERT text representation layer, a main entity extraction layer, and an object extraction and relationship classification layer, and using a Conditional Layer Normalization network structure as the residual connection mode of the relationship extraction model;
the training process of the relation extraction model comprises the following steps:
inputting a text sample containing the first text relationship classification as input of the relationship extraction model, and taking the corresponding text main entity, text object and second text relationship classification as output of the relationship extraction model; performing model training with a back-propagation algorithm, taking the value of the combined loss function of the text main entity and the text object as the model training target, and stopping training when the value of the combined loss function has not decreased for a preset number of training rounds, so as to obtain a fully trained relationship extraction model.
In one embodiment, relationship extraction is performed on the paragraph text to obtain a relationship table, as shown in Table 3.
TABLE 3 relationship Table
User case information rich in node relationships, as shown in Table 3, can be structured from the paragraph text (the text containing the first text relationship classification). For example, from "assays and auxiliary examinations: ultrasound examination at a hospital in the Jizhou area: fatty liver (moderate), left kidney stones", the examination name "ultrasound examination" and the corresponding examination results "fatty liver (moderate)" and "left kidney stones" can be extracted.
In another embodiment, a CASREL model is used as the basic model of the relationship extraction model; a network structure diagram of the CASREL model is shown in FIG. 10, which includes a BERT text representation layer (BERT Encoder), a main entity extraction layer (Subject) and an object extraction and relationship classification layer (Relations), where h_N is the output of the BERT text representation layer, v_sub is the main entity output by the main entity extraction layer, and k represents the k-th main entity. The relationship extraction model thus comprises: a BERT text representation layer for outputting representation vectors of the text; a main entity extraction layer for identifying main entities in the text; and an object extraction and relationship classification layer for extracting the objects corresponding to each main entity and performing relationship classification. The residual connection mode in the basic model is changed to Conditional Layer Normalization to optimize the network structure. During model training and prediction, each text in a paragraph (text containing the first text relationship classification) is taken as input, the main entities (text main entities) in the text are identified, and for each main entity the corresponding object (text object) and relationship classification (second text relationship classification) are extracted.
The relationship extraction model is trained with a back-propagation algorithm, taking reduction of the value of the combined loss function of the main entity and the object as the model training target, where the loss functions of the main entity and the object are both cross-entropy loss functions, and the combined loss function is loss = w1 × subject_loss + w2 × object_loss, with subject_loss being the loss function of the main entity, object_loss being the loss function of the object, and w1 and w2 being loss-function weights satisfying w1 + w2 = 1. An early-stopping mechanism is added: training is stopped when the value of the combined loss function has not decreased for a preset number of training rounds (for example, 3 rounds), and the model with the minimum value of the combined loss function during training is output as the optimal model.
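The combined loss and the early-stopping rule described above can be sketched as follows (a minimal illustration; the per-epoch loss values and the equal weights are made-up numbers — in practice subject_loss and object_loss would be cross-entropy values computed during training):

```python
def combined_loss(subject_loss, object_loss, w1=0.5, w2=0.5):
    # The patent requires w1 + w2 = 1
    assert abs(w1 + w2 - 1.0) < 1e-9
    return w1 * subject_loss + w2 * object_loss

def train_with_early_stopping(loss_per_epoch, patience=3):
    """Return (best_epoch, best_loss); stop once the loss has not
    decreased for `patience` consecutive epochs."""
    best, best_epoch, stale = float("inf"), -1, 0
    for epoch, loss in enumerate(loss_per_epoch):
        if loss < best:
            best, best_epoch, stale = loss, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_epoch, best

# Simulated (subject_loss, object_loss) pairs per epoch (illustrative)
losses = [combined_loss(s, o) for s, o in
          [(0.9, 0.8), (0.6, 0.5), (0.4, 0.45),
           (0.41, 0.46), (0.42, 0.47), (0.43, 0.5)]]
print(train_with_early_stopping(losses))
```

Here the loss bottoms out at epoch 2 and fails to improve for the next three epochs, so training stops and the epoch-2 model would be kept as the optimal model.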
In a specific implementation, the text input to the relationship extraction model is "past history: gastric polypectomy was performed several months ago; a review gastroscopy showed erosive gastritis; occasional subxiphoid discomfort." The relationship extraction model identifies two main entities, the surgery name "gastric polypectomy" and the examination name "gastroscopy"; the main entity identification results are shown in Table 4. After object extraction and relationship classification are performed for the two main entities respectively, the object and relationship classification corresponding to main entity 1 are obtained as shown in Table 5, and the object and relationship classification corresponding to main entity 2 as shown in Table 6.
TABLE 4 Master entity identification results
TABLE 5 Classification of objects and relationships corresponding to Master entity 1
TABLE 6 Classification of objects and relationships corresponding to Main entity 2
In an alternative embodiment, constructing a knowledge graph according to the text main entity, the text object and the second text relationship classification includes:
taking the text main entity, the text object and the second text relationship classification as the text to be standardized, inputting the text to be standardized into a similar text retrieval model with complete training, outputting text with standardized entity names, and constructing a knowledge graph according to the text with standardized entity names;
the training process of the similar text retrieval model comprises the steps of forming a training data set by using a sample of the text to be standardized and a code base standard name corresponding to the sample of the text to be standardized, training the BERT model by using the training data set, representing the corresponding code base standard name by using the trained BERT model, generating a representation vector, and generating a faiss index file according to the representation vector to obtain the similar text retrieval model.
It should be noted that claim settlement and/or underwriting image materials may come from different hospitals, and different hospitals describe the same entity in different ways; for example, "left kidney stone" and "right kidney stone" are both coded to the ICD code "N20.000", whose corresponding disease is "kidney stone". This requires name standardization for entity names such as diseases, operations, examination/assay names and hospital names; that is, the extracted entity names need to be coded to the corresponding standard names in the code base.
In one embodiment, a similar text retrieval model is built using the SimBERT model and the Faiss toolkit. Taking diseases as an example, the disease texts of all ICD-10 entries are built into a Faiss index library; when an extracted disease name is input, the corresponding ICD-10 disease and its ICD code can be output, realizing disease name standardization. The input of the SimBERT model is a text pair, as shown in Table 7.
TABLE 7 text pairs
text: malignant tumor of esophageal squamous cell and malignant tumor of esophageal junction gland
synonyms: malignant tumor at the joint of esophagus and stomach
In Table 7, text is a disease description extracted from an image material (the text main entity, text object and second text relationship classification), and synonyms is the disease name after code matching (the code base standard name). On the basis of the SimBERT model, SimBERT is fine-tuned on service-scenario data; that is, a training data set is formed from samples of the text to be standardized and the code base standard names corresponding to those samples, and the BERT model (SimBERT model) is trained with this training data set. All standard disease names are then represented as sentence vectors by the fine-tuned SimBERT (the trained BERT model), and a Faiss index file is generated based on the Faiss toolkit. The text to be standardized is retrieved against the Faiss index file, and the text with standardized entity names is output.
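The retrieval step over the Faiss index can be illustrated with a pure-Python stand-in: Faiss's inner-product index over L2-normalized vectors behaves like cosine-similarity search, which the sketch below reproduces for three toy "sentence vectors" (the names and vector values are illustrative assumptions; in the patent the vectors come from the fine-tuned SimBERT and the index is built with the Faiss toolkit):

```python
import math

# Standard disease names and their toy "sentence vectors"
standard_names = ["kidney stone", "fatty liver", "erosive gastritis"]
index_vectors = [[0.9, 0.1, 0.0],
                 [0.1, 0.9, 0.1],
                 [0.0, 0.1, 0.9]]

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

# L2-normalized index: inner product now equals cosine similarity,
# mirroring a Faiss IndexFlatIP built over normalized vectors
index = [normalize(v) for v in index_vectors]

def retrieve(query_vec):
    """Return the standard name whose vector is most similar to the query."""
    q = normalize(query_vec)
    sims = [sum(a * b for a, b in zip(row, q)) for row in index]
    return standard_names[max(range(len(sims)), key=sims.__getitem__)]

# Toy vector for the extracted entity "left kidney stones"
print(retrieve([0.85, 0.2, 0.05]))
```

With real SimBERT vectors, the retrieved standard name would then carry the ICD code used to standardize the extracted entity name.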
In another embodiment, the structured case data results (the text with standardized entity names) may be written into the graph database, so that the graph data in the graph database include basic information of the user, such as name, gender and age; disease information, such as having a certain disease in a certain year; examination/assay information; surgical information; medication information; chemotherapy information and the like, and the knowledge graph is constructed from the graph data in the graph database. It should be noted that, before the structured case data results are written into the graph database, results with low confidence may be audited to improve the accuracy of the graph data in the graph database, where the confidence of the structured result of the case data (image material) may be obtained according to the accuracy of model algorithms such as the first multi-modal model, the second multi-modal model, the relationship extraction model and the similar text retrieval model.
According to the knowledge graph construction method provided by the embodiment of the invention, the image material to be processed is obtained, the image material to be processed is identified, a text slice of the image material to be processed is obtained, the text of the text slice, the coordinates corresponding to the text and the image material to be processed are input into a first multi-modal model with complete training, and a sequence labeling result corresponding to the text is obtained; obtaining a first text relation classification according to the sequence labeling result corresponding to the text and a second multi-modal model which is complete in training; obtaining a text main entity, a text object and a second text relation classification according to the text comprising the first text relation classification and a trained complete relation extraction model; constructing a knowledge graph according to the text main entity, the text object and the second text relation classification; the knowledge graph construction of the insurance image material can be realized, and the knowledge graph with higher quality can be comprehensively and rapidly constructed.
It should be noted that for multi-line paragraph text or list-structured text, performing OCR recognition directly and then structuring the text makes it difficult to restore the document structure; moreover, after OCR recognition of an image material there are often wrong characters or punctuation recognition errors, and if the document structure is restored based only on text features, the restoration effect is strongly affected by the OCR recognition quality. The knowledge graph construction method provided by the embodiment of the invention structures the image material through multi-modal features, integrating image layout features, coordinate information and text features, and can effectively restore the text structure of the image material; compared with restoring the text structure by considering only text features or only coordinate-information rules, this is more effective. The embodiment of the invention also uses a relationship extraction model comprising a CASREL network and Conditional Layer Normalization to convert a discrete relationship classification task into a triple search task, so that multiple relationships can be extracted from a text at the same time, effectively solving the relationship overlapping problem. Because the same entity has multiple description modes, and because wrong characters may exist after OCR recognition, the embodiment of the invention uses a similar text retrieval model built from the SimBERT model and the Faiss toolkit to standardize entity names of hospitals, diseases, examinations, assays and the like. Nodes or relationships with lower confidence in the structured result of the image material can be audited and corrected, and finally the structured image material is written into the graph database to update the graph data.
Image materials in insurance scenarios can become important reference data for scenarios such as enterprise product design, risk control, claim settlement automation and underwriting automation, and are an important knowledge and financial resource of insurance enterprises. By structuring the knowledge in image materials and constructing a domain knowledge graph, the various pressures on insurance enterprises in risk management and control can be relieved to a certain extent, and damage to enterprise interests caused by speculators exploiting loopholes in the enterprise's operating process can be avoided; meanwhile, supervision of the insurance industry can be optimized through the knowledge graph, so that regulators can accurately grasp the overall operating condition of insurance enterprises and take precautions in advance. In addition, the knowledge graph can alleviate problems in transactions such as information asymmetry for consumers and the high cost of obtaining product information, providing a new approach for optimizing the consumer service experience.
The knowledge graph construction method provided by the embodiment of the invention can be implemented based on artificial intelligence, acquiring and processing related data with artificial intelligence technology, and realizing unattended knowledge graph construction. Here, artificial intelligence (AI) is the theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results.
Artificial intelligence infrastructure technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics and the like. Artificial intelligence software technologies mainly include directions such as computer vision, robotics, biometric recognition, speech processing, natural language processing, and machine learning/deep learning.
Fig. 11 is a schematic structural diagram of a knowledge graph construction device according to an embodiment of the present invention, and as shown in fig. 11, the knowledge graph construction device 110 includes a data preprocessing module 111, a first relationship classification module 112, a second relationship classification module 113, and a graph construction module 114;
the data preprocessing module 111 is configured to obtain an image material to be processed, identify the image material to be processed, obtain a text slice of the image material to be processed, input a text of the text slice, coordinates corresponding to the text, and the image material to be processed into a first multi-modal model with complete training, and obtain a sequence labeling result corresponding to the text;
the first relationship classification module 112 is configured to obtain a first text relationship classification according to the sequence labeling result corresponding to the text and a second multi-modal model that is complete in training;
The second relationship classification module 113 is configured to obtain a text host entity, a text object, and a second text relationship classification according to the text including the first text relationship classification and a trained complete relationship extraction model;
the map construction module 114 is configured to construct a knowledge map according to the text host entity, the text object, and the second text relationship classification.
In an optional embodiment, the knowledge graph construction apparatus 110 further includes a first training module, where the first training module is configured to train a first multi-modal model, and a training process of the first multi-modal model includes:
taking a text sample, coordinates corresponding to the text sample and a first image material sample as input of a first preset multi-mode pre-training model, taking output text of the first preset multi-mode pre-training model as input of a linear layer, taking representation output by the linear layer as input of a CRF layer, and outputting a BIO sequence labeling result corresponding to the text sample by the CRF layer.
In an optional embodiment, the sequence labeling result corresponding to the text includes text labeled with a BIO sequence; accordingly, the first relationship classification module 112 is further configured to input the text labeled with the BIO sequence, the coordinates corresponding to the text labeled with the BIO sequence, and the image material where the text labeled with the BIO sequence is located into a second multi-modal model with complete training, output a relationship classification result, and obtain a first text relationship classification according to the relationship classification result.
In an optional embodiment, the knowledge graph construction apparatus 110 further includes a second training module, where the second training module is configured to train a second multi-modal model, and a training process of the second multi-modal model includes:
and taking the text sample marked by the BIO sequence, the corresponding coordinates of the text sample marked by the BIO sequence and the second image material sample as input of a second preset multi-mode pre-training model, enabling the second preset multi-mode pre-training model to output the representation vector of the text character, respectively carrying out linear layer transformation on the representation vectors of the first character in the two to-be-classified relation entities to obtain transformed representation, taking the transformed representation as input of a Biaffine layer, and outputting a relation classification result by the Biaffine layer.
In an optional implementation manner, the second relationship classification module 113 is further configured to perform fuzzy matching on the key entity in the text that includes the first text relationship classification, obtain a text after fuzzy matching, and input the text after fuzzy matching into the trained complete relationship extraction model, so as to obtain a text main entity, a text object, and a second text relationship classification.
In an optional embodiment, the knowledge graph construction device 110 further includes a relationship extraction model construction module and a third training module, where the relationship extraction model construction module is configured to construct a relationship extraction model according to the BERT text representation layer, the main entity extraction layer, the object extraction and the relationship classification layer, and take a Conditional Layer Normalization network structure as a residual connection mode of the relationship extraction model;
the third training module is configured to train a relationship extraction model, where a training process of the relationship extraction model includes:
inputting a text sample containing the first text relationship classification as input of the relationship extraction model, and taking the corresponding text main entity, text object and second text relationship classification as output of the relationship extraction model; performing model training with a back-propagation algorithm, taking the value of the combined loss function of the text main entity and the text object as the model training target, and stopping training when the value of the combined loss function has not decreased for a preset number of training rounds, so as to obtain a fully trained relationship extraction model.
In an optional implementation manner, the map construction module 114 is further configured to take the text main entity, the text object and the second text relationship classification as the text to be standardized, input the text to be standardized into a similar text retrieval model with complete training, output text with standardized entity names, and construct a knowledge graph according to the text with standardized entity names;
the training process of the similar text retrieval model includes: forming a training data set from samples of the text to be standardized and the code base standard names corresponding to those samples; training the BERT model with the training data set; representing the corresponding code base standard names with the trained BERT model to generate representation vectors; and generating a Faiss index file from the representation vectors to obtain the similar text retrieval model.
According to the knowledge graph construction device provided by the embodiment of the invention, the data preprocessing module 111 is used for acquiring the image material to be processed, identifying the image material to be processed to obtain a text slice of the image material to be processed, and inputting the text of the text slice, the coordinates corresponding to the text and the image material to be processed into a first multi-modal model with complete training to obtain a sequence labeling result corresponding to the text; the first relation classification module 112 obtains a first text relation classification according to the sequence labeling result corresponding to the text and the trained second multi-modal model; the second relation classification module 113 obtains a text host entity, a text object and a second text relation classification according to the text containing the first text relation classification and a trained complete relation extraction model; the map construction module 114 constructs a knowledge map according to the text host entity, the text object, and the second text relationship classification; the knowledge graph construction of the insurance image material can be realized, and the knowledge graph with higher quality can be comprehensively and rapidly constructed.
Fig. 12 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in fig. 12, the electronic device 120 includes a processor 121 and a memory 122 communicatively coupled to the processor 121.
The memory 122 stores program instructions for implementing the knowledge graph construction method of any of the above embodiments.
The processor 121 is configured to execute program instructions stored in the memory 122 for knowledge graph construction.
The processor 121 may also be referred to as a CPU (Central Processing Unit). The processor 121 may be an integrated circuit chip with signal processing capabilities. The processor 121 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
An embodiment of the invention provides a storage medium storing program instructions capable of implementing all of the above methods; the storage medium may be non-volatile or volatile. The program instructions may be stored on the storage medium in the form of a software product, including instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, as well as terminal devices such as a computer, a server, a mobile phone or a tablet.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of modules is merely a logical function division, and there may be additional divisions of actual implementation, e.g., multiple modules or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or modules, which may be in electrical, mechanical, or other forms.
In addition, each functional module in the embodiments of the present invention may be integrated in one processing unit, each module may exist alone physically, or two or more modules may be integrated in one unit. The integrated units may be implemented in hardware or as software functional units.

The foregoing describes only embodiments of the present invention and does not limit the patent scope of the invention; any equivalent structural or process changes made using the contents of the specification and the accompanying drawings, or any direct or indirect application in other related technical fields, are likewise covered by the patent protection scope of the invention.
While the invention has been described with respect to the above embodiments, it should be noted that modifications can be made by those skilled in the art without departing from the inventive concept, and these are all within the scope of the invention.

Claims (10)

1. A knowledge graph construction method, characterized by comprising the following steps:
acquiring an image material to be processed, identifying the image material to be processed to obtain a text slice of the image material to be processed, and inputting a text of the text slice, coordinates corresponding to the text and the image material to be processed into a first multi-modal model with complete training to obtain a sequence labeling result corresponding to the text;
obtaining a first text relation classification according to the sequence labeling result corresponding to the text and a second multi-modal model which is complete in training;
obtaining a text main entity, a text object and a second text relation classification according to the text comprising the first text relation classification and a trained complete relation extraction model;
and constructing a knowledge graph according to the text main entity, the text object and the second text relation classification.
2. The knowledge graph construction method according to claim 1, wherein the training process of the first multimodal model includes:
Taking a text sample, coordinates corresponding to the text sample and a first image material sample as input of a first preset multi-mode pre-training model, taking output text of the first preset multi-mode pre-training model as input of a linear layer, taking representation output by the linear layer as input of a CRF layer, and outputting a BIO sequence labeling result corresponding to the text sample by the CRF layer.
3. The knowledge graph construction method according to claim 2, wherein the sequence labeling result corresponding to the text includes a text labeled with a BIO sequence, and the obtaining of the first text relationship classification according to the sequence labeling result corresponding to the text and the second multi-modal model with complete training includes:
inputting the text marked by the BIO sequence, the corresponding coordinates of the text marked by the BIO sequence and the image material of the text marked by the BIO sequence into a second multi-modal model with complete training, outputting a relation classification result, and obtaining a first text relation classification according to the relation classification result.
4. The knowledge graph construction method according to claim 3, wherein the training process of the second multi-modal model includes:
And taking the text sample marked by the BIO sequence, the corresponding coordinates of the text sample marked by the BIO sequence and the second image material sample as input of a second preset multi-mode pre-training model, enabling the second preset multi-mode pre-training model to output the representation vector of the text character, respectively carrying out linear layer transformation on the representation vectors of the first character in the two to-be-classified relation entities to obtain transformed representation, taking the transformed representation as input of a Biaffine layer, and outputting a relation classification result by the Biaffine layer.
5. The knowledge graph construction method according to claim 1, wherein obtaining a text main entity, a text object, and a second text relation classification according to the text containing the first text relation classification and the fully trained relation extraction model comprises:
performing fuzzy matching on the key entities in the text containing the first text relation classification to obtain a fuzzy-matched text, and inputting the fuzzy-matched text into the fully trained relation extraction model to obtain the text main entity, the text object, and the second text relation classification.
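The fuzzy-matching step can be illustrated with the standard library's difflib. This is only a sketch: the entity list and similarity threshold are invented, and the patent does not specify which matching algorithm is used.

```python
import difflib

# Hypothetical key-entity list; in practice this would come from the
# domain vocabulary the relation extraction model was trained on.
KEY_ENTITIES = ["hypertension", "diabetes mellitus", "aspirin"]

def fuzzy_match(mention, threshold=0.6):
    """Map a noisy mention to the closest key entity, or None if nothing
    is similar enough (difflib uses SequenceMatcher ratios internally)."""
    best = difflib.get_close_matches(mention, KEY_ENTITIES,
                                     n=1, cutoff=threshold)
    return best[0] if best else None

print(fuzzy_match("hypertenson"))   # an OCR error drops one letter
print(fuzzy_match("unrelated"))     # no key entity is close enough
```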
6. The knowledge graph construction method according to claim 1, further comprising, before obtaining the text main entity, the text object, and the second text relation classification according to the text containing the first text relation classification and the fully trained relation extraction model: constructing the relation extraction model from a BERT text representation layer, a main-entity extraction layer, and an object extraction and relation classification layer, with a Conditional Layer Normalization network structure used as the residual connection of the relation extraction model;
The training process of the relation extraction model comprises the following steps:
taking a text sample containing the first text relation classification as input to the relation extraction model, and taking the corresponding text main entity, text object, and second text relation classification as output of the relation extraction model; performing model training with a back propagation algorithm, taking the value of the joint loss function over the text main entity and the text object as the training objective, and stopping training when the number of training iterations reaches a preset number and the value of the joint loss function no longer decreases, so as to obtain the fully trained relation extraction model.
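The Conditional Layer Normalization structure named in claim 6 can be sketched as follows. This is an illustrative NumPy version with made-up shapes and random weights, not the patent's implementation: standard layer normalization, except that the gain and bias are generated from a condition vector (here, standing in for the main-entity representation), which lets the residual path be conditioned on the extracted main entity.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 6   # invented hidden size

# Projections that turn the condition vector into a gain and a bias.
W_gamma = rng.normal(scale=0.1, size=(dim, dim))
W_beta = rng.normal(scale=0.1, size=(dim, dim))

def conditional_layer_norm(x, cond, eps=1e-5):
    """Normalize x, then scale/shift with parameters derived from cond."""
    normed = (x - x.mean()) / (x.std() + eps)
    gamma = 1.0 + W_gamma @ cond   # condition-dependent gain around 1
    beta = W_beta @ cond           # condition-dependent bias around 0
    return gamma * normed + beta

x = rng.normal(size=dim)      # a token representation
cond = rng.normal(size=dim)   # stand-in for the main-entity representation
out = conditional_layer_norm(x, cond)
print(out.shape)
```

Changing the condition vector changes the normalization's gain and bias, so the same token representation is transformed differently for different main entities.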
7. The knowledge graph construction method according to claim 1, wherein constructing a knowledge graph according to the text main entity, the text object, and the second text relation classification comprises:
taking the text main entity, the text object, and the second text relation classification as text to be standardized, inputting the text to be standardized into a fully trained similar text retrieval model, outputting text with standardized entity names, and constructing the knowledge graph according to the text with standardized entity names;
the training process of the similar text retrieval model comprises: forming a training data set from samples of the text to be standardized and the code base standard names corresponding to those samples; training a BERT model with the training data set; representing the corresponding code base standard names with the trained BERT model to generate representation vectors; and generating a faiss index file from the representation vectors to obtain the similar text retrieval model.
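The retrieval step of the similar text retrieval model can be sketched as below. This is a stand-in illustration only: deterministic pseudo-random vectors replace the trained BERT representations, and a brute-force inner-product search replaces the faiss index, so only the index-then-lookup pattern is shown; the standard-name list is invented.

```python
import hashlib
import numpy as np

# Hypothetical code-base standard names.
STANDARD_NAMES = ["type 2 diabetes", "essential hypertension",
                  "acute bronchitis"]

def embed(text):
    """Stand-in for the BERT representation: a deterministic unit vector
    per string (seeded from a hash, so only exact names round-trip)."""
    seed = int(hashlib.sha256(text.encode()).hexdigest()[:8], 16)
    v = np.random.default_rng(seed).normal(size=16)
    return v / np.linalg.norm(v)

# "Index build": embed every standard name once, as faiss would store them.
index = np.stack([embed(name) for name in STANDARD_NAMES])

def standardize(query):
    """Return the standard name whose vector is closest to the query's."""
    scores = index @ embed(query)   # inner-product (cosine) similarity
    return STANDARD_NAMES[int(np.argmax(scores))]

print(standardize("type 2 diabetes"))   # an exact name maps to itself
```

With real BERT embeddings, near-duplicate surface forms would also land on the correct standard name; the random stand-in vectors only guarantee that for exact matches.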
8. A knowledge graph construction device, characterized by comprising a data preprocessing module, a first relation classification module, a second relation classification module, and a graph construction module;
the data preprocessing module is used for acquiring an image material to be processed, recognizing the image material to be processed to obtain text slices of the image material to be processed, and inputting the text of the text slices, the coordinates corresponding to the text, and the image material to be processed into a fully trained first multi-modal model to obtain a sequence labeling result corresponding to the text;
the first relation classification module is used for obtaining a first text relation classification according to the sequence labeling result corresponding to the text and a fully trained second multi-modal model;
the second relation classification module is used for obtaining a text main entity, a text object, and a second text relation classification according to the text containing the first text relation classification and a fully trained relation extraction model;
and the graph construction module is used for constructing a knowledge graph according to the text main entity, the text object, and the second text relation classification.
9. An electronic device, comprising a memory and a processor, the memory storing a computer program executable by the processor, wherein the processor implements the knowledge graph construction method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the knowledge graph construction method according to any one of claims 1 to 7.
CN202211558415.1A 2022-12-06 2022-12-06 Knowledge graph construction method and device, electronic equipment and storage medium Pending CN116127087A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211558415.1A CN116127087A (en) 2022-12-06 2022-12-06 Knowledge graph construction method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116127087A 2023-05-16

Family

ID=86299918

Country Status (1)

Country Link
CN (1) CN116127087A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116991983A (en) * 2023-09-27 2023-11-03 之江实验室 Event extraction method and system for company information text
CN116991983B (en) * 2023-09-27 2024-02-02 之江实验室 Event extraction method and system for company information text

Similar Documents

Publication Publication Date Title
Lakshmanan et al. Machine learning design patterns
Deshmukh et al. Towards accurate duplicate bug retrieval using deep learning techniques
US10860654B2 (en) System and method for generating an answer based on clustering and sentence similarity
US11663417B2 (en) Data processing method, electronic device, and storage medium
US20190197119A1 (en) Language-agnostic understanding
Jiang et al. De-identification of medical records using conditional random fields and long short-term memory networks
US20200134398A1 (en) Determining intent from multimodal content embedded in a common geometric space
CN110162771B (en) Event trigger word recognition method and device and electronic equipment
Wanyan et al. Deep learning with heterogeneous graph embeddings for mortality prediction from electronic health records
CN116304307A (en) Graph-text cross-modal retrieval network training method, application method and electronic equipment
CN115687647A (en) Notarization document generation method and device, electronic equipment and storage medium
Dobson Interpretable Outputs: Criteria for Machine Learning in the Humanities.
Javanmardi et al. Caps captioning: a modern image captioning approach based on improved capsule network
CN107943881B (en) Question bank generating method, server and computer readable storage medium
Cao et al. Scenegate: Scene-graph based co-attention networks for text visual question answering
Tan et al. A pipeline approach to context-aware handwritten text recognition
Dendek et al. Evaluation of features for author name disambiguation using linear support vector machines
CN117009516A (en) Converter station fault strategy model training method, pushing method and device
US20230267322A1 (en) Method and system for aspect-level sentiment classification by merging graphs
CN115859984A (en) Medical named entity recognition model training method, device, equipment and medium
CN115358817A (en) Intelligent product recommendation method, device, equipment and medium based on social data
Razis et al. Latent twitter image information for social analytics
US20170293863A1 (en) Data analysis system, and control method, program, and recording medium therefor
CN114067343A (en) Data set construction method, model training method and corresponding device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination