CN115168620A - Self-supervised joint learning method for knowledge graph entity alignment


Info

Publication number
CN115168620A
Authority
CN
China
Prior art keywords
entity
alignment
similarity
embedding
entities
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211098589.4A
Other languages
Chinese (zh)
Inventor
王永恒
金雄男
蒋雷
王芷霖
王超
巫英才
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Zhejiang Lab
Original Assignee
Zhejiang University ZJU
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU, Zhejiang Lab
Priority to CN202211098589.4A
Publication of CN115168620A
Legal status: Pending (current)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30: Information retrieval of unstructured textual data
    • G06F 16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F 16/367: Ontology
    • G06F 40/00: Handling natural language data
    • G06F 40/10: Text processing
    • G06F 40/194: Calculation of difference between files
    • G06F 40/20: Natural language analysis
    • G06F 40/279: Recognition of textual entities
    • G06F 40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/295: Named entity recognition
    • G06F 40/30: Semantic analysis
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements using pattern recognition or machine learning
    • G06V 10/74: Image or video pattern matching; Proximity measures in feature spaces
    • G06V 10/761: Proximity, similarity or dissimilarity measures
    • G06V 10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA], independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V 10/82: Arrangements using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Animal Behavior & Ethology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a self-supervised joint learning method for knowledge graph entity alignment, which comprises the following steps. Step one: learn the image features of entities with a pre-trained deep image learning model, measure image similarity by calculating the Euclidean distance between image features, and select the entity pairs with the highest similarity as the seed alignment. Step two: under the supervision of the seed alignment from step one, embed the knowledge graphs into a low-dimensional vector space suitable for computer processing, using the multi-modal information of the knowledge graphs and based on knowledge embedding models. Step three: based on the knowledge graph embeddings from step two, calculate the multi-modal interaction similarity vectors of the entities, learn the weight of each modality through modality fusion to generate the final entity embeddings, measure the alignment likelihood between entities by calculating the cosine distance between entity embeddings, and output a list of aligned entities. The invention performs entity alignment in a self-supervised joint learning manner, requires no manual intervention in the whole process, and ensures the scalability of the system.

Description

Self-supervised joint learning method for knowledge graph entity alignment
Technical Field
The invention belongs to the field of artificial intelligence, and relates to a self-supervised joint learning method for knowledge graph entity alignment.
Background
Endowing machines with human knowledge is an important research direction in the field of artificial intelligence. Knowledge graphs are a form of expressing structured human knowledge and have attracted widespread attention in academia and industry. With the development of knowledge modeling, extraction, and construction technologies, more and more knowledge graphs are constructed and published on the Web. Typical knowledge graphs include DBpedia, YAGO and Wikidata. Knowledge graphs are widely applied in artificial intelligence systems such as intelligent question answering, recommendation systems and information retrieval.
In knowledge graph-based applications, information interaction between knowledge graph systems is often required to obtain data or implement specific functions. Even within a single knowledge graph system, the information sources typically come from different domains. For example, assume that a cargo ship full of GPU chips suddenly disappears from the monitoring system. To assess the potential impact and locate the cargo ship, knowledge graphs from various fields are needed, covering companies, industries, logistics, satellites, drones, and so on. However, cross-domain knowledge graphs are typically heterogeneous. Within the same domain, knowledge graphs constructed by different organizations are also often heterogeneous. In addition, the complexity of human knowledge and the variability of subjective views of the world make it impractical to build an all-inclusive, unified knowledge graph. Therefore, interoperability difficulties between knowledge graphs caused by heterogeneity are ubiquitous.
Knowledge fusion is an effective method to solve the above problems. Knowledge fusion aims at establishing relationships between heterogeneous knowledge graphs so that the knowledge graphs can communicate and cooperate with each other. According to the type of heterogeneity, knowledge fusion can be divided into ontology-level fusion and entity-level fusion. In practical applications, since knowledge graph entities are usually large in scale, entity-level fusion, especially entity alignment, has become a major task of knowledge fusion in recent years.
The goal of entity alignment is to establish equivalence relationships between entities that are typically contained in different knowledge graphs. Studies of entity alignment can be broadly divided into three types: techniques from the Semantic Web research community, methods from the database research community, and methods based on knowledge graph embedding. The first two have the limitation that they can only align entities if the knowledge graph contains a certain amount of attribute information. In contrast, the embedding-based method is applicable to most scenarios and delivers competitive performance.
In academia, embedding-based entity alignment is a popular research topic. SEA, OTEA and AKE embed knowledge graphs based on knowledge triples and align them through transformations. IPTransE and RSN4EA employ path-based knowledge graph embedding and perform alignment in a parameter-sharing manner. MuGNN and KECG utilize neighbor-based embedding and perform alignment in a calibrated manner. Still other entity alignment methods utilize auxiliary information such as attributes and text. However, these approaches employ supervised or semi-supervised learning based on manual labeling for entity alignment. Manual annotation is a time-consuming and costly task that limits the applicability and scalability of (semi-)supervised entity alignment approaches.
More recently, researchers have proposed self-supervised entity alignment methods that generate seed alignments without manual intervention. The self-supervised entity alignment method EVA uses a pre-trained image learning model to generate the seed alignment, and then provides a joint learning method that embeds structure, image, relation and attribute information. However, it ignores the description and type information of entities, which limits its alignment accuracy.
Disclosure of Invention
In order to solve the technical problems in the prior art, the invention provides a self-supervised joint learning method for knowledge graph entity alignment, which effectively utilizes the multi-modal information carried by knowledge graphs and combines it with knowledge embedding models to fuse knowledge graphs automatically and efficiently. The specific technical scheme is as follows:
a self-supervision joint learning method facing knowledge graph entity alignment comprises the following steps:
firstly, learning the image characteristics of an entity by using an image pre-training deep learning model, measuring the similarity of images by calculating Euclidean distance between the image characteristics, and selecting the entity pair with the highest similarity as a seed alignment according to the image similarity;
secondly, under the supervision of seed alignment in the first step, embedding the knowledge graph into a computer low-dimensional vector space by using structural information, entity description, attribute values and entity type information of the knowledge graph based on a graph convolution network, a language pre-training model and a hierarchical type embedding model;
and step three, based on the embedding of the knowledge graph in the step two, calculating similarity vectors of paired description interaction, neighbor description interaction, attribute interaction and type interaction of the entities, learning the weight of each modality through modality fusion, generating final entity embedding, finally, calculating cosine distance based on the entity embedding to measure the alignment possibility between the entities, and outputting an alignment entity list.
Further, step one specifically comprises: adopting a ResNet-152 model pre-trained on the recognition task of the ImageNet image database as the pre-trained deep image learning model, extracting the output of the first layer of the model as the features of each entity image, and measuring the similarity between entity image pairs using the Euclidean distance between the extracted features; finally, selecting the top-k entity pairs with the highest similarity as the seed alignment according to the image similarity.
Further, step two specifically comprises the following substeps:
step (2.1), embedding the entities using the pre-trained language model based on the text descriptions and attribute information of the entities, so that entities with similar text descriptions are adjacent in the vector space;
step (2.2), embedding the knowledge graph using the graph convolutional network based on the structural information of the knowledge graph, to enhance neighbor information;
step (2.3), embedding the entities through the hierarchical type embedding model based on the type information of the entities, so that entities with similar types are adjacent in the vector space.
Further, the step (2.1) specifically comprises:
first, a training data set D = {(e, e+, e-)} is constructed based on the seed alignment, where e+ is the entity corresponding to entity e, i.e., a similar entity, and e- is a randomly selected entity dissimilar to e;
then, the text descriptions of the entities are fed into the pre-trained language model BERT for fine-tuning of the knowledge graph embedding;
finally, the CLS embedding of the pre-trained language model BERT is filtered through a multi-layer perceptron to obtain the entity embedding h(e), where CLS is the special classification token used in BERT.
The margin loss function L_d used in fine-tuning is as follows:

L_d = Σ_{(e, e+, e-) ∈ D} max(0, d(e, e+) - d(e, e-) + m)

where d(e, e') is the ℓ2 distance between h(e) and h(e'), with e' being e+ or e-, m represents the margin enforced between a similar entity pair and a dissimilar entity pair, and a smaller d(e, e') means that the two entities are more similar in the description embedding.
Further, the step (2.2) is specifically:
based on the graph convolutional network GCN, entity embeddings aggregate neighbor entity information through information propagation, with the following propagation rule:

H^(l+1) = σ( D~^(-1/2) A~ D~^(-1/2) H^(l) W^(l) )

where A~ = A + I_N represents the adjacency matrix with self-connections added, I_N is the identity matrix, D~ is the diagonal degree matrix of A~ with D~_ii = Σ_j A~_ij, W^(l) denotes the layer-specific trainable weight matrix, σ denotes the activation function, and H^(l) ∈ R^(N×D) denotes the activation matrix at the l-th layer, where N is the number of entities and D is the dimension of the entity vectors.
Further, the step (2.3) is specifically:
a hierarchical type-aware distance function d_T(e, e') calculates the distance between two entities in the type embedding space, and the vector representations of the entities in the type embedding space are learned based on the hierarchical type embedding model HTE.
The hierarchical type-aware distance function is parameterized by a distance parameter λ and decreases as the type similarity sim(t_e, t_e') between the two entities increases, where the normalized type similarity value is computed from the information content of the nearest common parent type as follows:

sim(t_e, t_e') = max_{t ∈ S(e, e')} IC(t)

where S(e, e') denotes the set of types that are parent classes of both t_e and t_e', and IC(t) denotes the information content value of type t; the more specific a type, the higher its information content value.
Further, step three specifically comprises the following substeps:
step (3.1), based on the entity text description embeddings of step (2.1), calculating the cosine distance between two entity vectors and generating a one-dimensional similarity vector for each entity pair, i.e., the pairwise description similarity vector;
step (3.2), based on the entity neighbor information embeddings of step (2.2), calculating the neighbor description similarity vector;
step (3.3), based on the entity attribute information embeddings of step (2.1), calculating the attribute similarity vector;
step (3.4), based on the entity type embeddings of step (2.3), calculating the cosine distance between two entity vectors and generating a one-dimensional similarity vector for each entity pair, i.e., the type similarity vector;
step (3.5), under the supervision of the seed alignment from step one, jointly learning the pairwise description, neighbor description, attribute and type similarities to generate the final similarity between entity pairs, and finding the corresponding aligned entities based on the entity similarity;
step (3.6), adopting a greedy alignment strategy to select, for each entity, the candidate entity with the highest alignment probability to generate an aligned entity pair, deleting the aligned entities from their corresponding knowledge graphs, and repeating step (3.6) until the entity set of one knowledge graph is empty.
Further, the step (3.2) is specifically: first, a neighbor similarity matrix is constructed, in which the entry in row i and column j represents the cosine similarity between the i-th neighbor of entity e and the j-th neighbor of entity e'; secondly, a max-pooling operation is applied to each row and each column of the neighbor matrix to select the most relevant neighbor pairs; then, features are extracted from the collected similarities through an RBF kernel aggregation function; finally, the row and column aggregation vectors are concatenated to represent the neighbor description similarity vector.
Further, the step (3.3) is specifically: first, an attribute similarity matrix is constructed, in which the entry in row i and column j represents the cosine similarity between the i-th attribute of entity e and the j-th attribute of entity e'; secondly, max-pooling and RBF kernel aggregation functions are applied to each row and column of the attribute matrix; finally, the row and column aggregation vectors are concatenated to represent the attribute similarity vector.
Further, the specific process of the step (3.5) is as follows: based on the entity similarity vector v(e, e'), the alignment likelihood p(e, e') between two entities is learned through a multi-layer perceptron; the greater the value of p(e, e'), the greater the probability that the two entities are aligned, where v(e, e') is generated by fusing the vectors of steps (3.1) to (3.4) in a weighted concatenation manner.
Beneficial effects:
The invention aims to break through the constraint of manual labeling in fusing large-scale knowledge graphs. It provides a self-supervised joint learning method for entity alignment that fully and effectively utilizes the multi-modal information carried by knowledge graphs and combines it with knowledge embedding models to fuse knowledge graphs automatically and efficiently, performing entity alignment in a self-supervised joint learning manner with no manual intervention in the whole process, thereby ensuring scalability.
Drawings
FIG. 1 is a schematic flow chart of the self-supervised joint learning method for knowledge graph entity alignment according to the present invention;
FIG. 2 is a schematic diagram of the self-supervised joint learning framework for knowledge graph entity alignment according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the type hierarchy of the knowledge graph DBpedia according to an embodiment of the invention;
FIG. 4 is a schematic structural diagram of the self-supervised joint learning device for knowledge graph entity alignment according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and technical effects of the present invention more clearly apparent, the present invention is further described in detail below with reference to the accompanying drawings and examples.
With the development of pre-trained language models, textual descriptions can play an important role in entity alignment. Furthermore, entities of completely different types are unlikely to be aligned, and the types in a knowledge graph are typically hierarchical, meaning that the similarity between types is measurable.
As shown in fig. 2, the entity alignment framework based on knowledge graph embedding mainly includes a seed alignment generation module, a knowledge graph embedding module, and an alignment interaction module. The seed alignment generation module collects initial alignment labels, i.e., it generates an initial set of entity pairs with the same meaning; the knowledge graph embedding module maps the high-dimensional knowledge graph to a low-dimensional space for computer processing, with the embedding process executed under the supervision of the seed alignment; the alignment interaction module measures the similarity between entity pairs based on the embeddings and generates the final alignment list.
As shown in fig. 1, the invention provides a self-supervised joint learning method for entity alignment. The seed alignment generation module automatically generates highly reliable entity pairs with the same semantics as the seed alignment, according to the images of the entities; the knowledge graph embedding module maps high-dimensional graph data to a low-dimensional vector space that is easy for a computer to process, under the supervision of the automatically generated seed alignment; the alignment interaction module judges, based on the embeddings, whether a pair of entities in different knowledge graphs is the same. The alignment interaction module is an interaction model driven by an alignment strategy, which guides the order of the alignment process. The method specifically comprises the following steps:
step one, learning the image characteristics of an entity by using an image pre-training deep learning model, measuring the similarity of images by calculating Euclidean distances between the image characteristics, and selecting the entity pair with the highest similarity as a seed alignment according to the image similarity.
Specifically, the seed alignment generation module learns image features with a pre-trained deep learning model to measure the similarity of images. The self-supervised joint learning method for entity alignment, Self-EA, adopts a ResNet-152 model pre-trained on the recognition task of the ImageNet image database. ImageNet, one of the best-known image databases, is organized according to the WordNet hierarchy and contains more than 14 million images. The ResNet-152 model is a deep residual learning network with a depth of 152 layers. A simple feed-forward pass is performed on the ResNet-152 model, and the output of the first layer is extracted as the features of each entity image; then, the similarity between entity image pairs is measured using the Euclidean distance between the extracted features; finally, the top-k entity pairs with the highest similarity are selected as the seed alignment according to the image similarity.
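As an illustration of this seed alignment generation, the following is a minimal sketch in Python, assuming torchvision's ImageNet-pretrained ResNet-152 with its classification head removed (a common way to obtain image features; the patent's exact extraction layer may differ, and all function names here are illustrative):

```python
import torch
import torchvision.models as models
import torchvision.transforms as T

# Load ResNet-152 pre-trained on ImageNet and drop the final classification
# layer, keeping the network as an image feature extractor.
resnet = models.resnet152(weights=models.ResNet152_Weights.IMAGENET1K_V1)
feature_extractor = torch.nn.Sequential(*list(resnet.children())[:-1]).eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def image_features(pil_images):
    """Feed-forward pass; returns one feature vector per entity image."""
    batch = torch.stack([preprocess(img) for img in pil_images])
    return feature_extractor(batch).flatten(1)   # shape (n, 2048)

@torch.no_grad()
def seed_alignment(feats_g1, feats_g2, k):
    """Select the top-k most similar cross-graph entity pairs as seeds,
    measuring image similarity by Euclidean distance between features."""
    dist = torch.cdist(feats_g1, feats_g2)       # (n1, n2) pairwise L2 distances
    idx = dist.flatten().topk(k, largest=False).indices  # smallest distance = most similar
    return [(i // dist.size(1), i % dist.size(1)) for i in idx.tolist()]
```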
WordNet is an English lexical database based on cognitive linguistics, designed jointly by psychologists, linguists and computer engineers at Princeton University. What most significantly distinguishes WordNet from other standard dictionaries is that it divides the vocabulary into five major classes: nouns, verbs, adjectives, adverbs, and function words. In practice, WordNet contains only nouns, verbs, adjectives and adverbs; function words are usually part of the syntactic component of a language, and WordNet ignores this smaller closed set of English words. WordNet also differs from ordinary dictionaries in its organization: it is built around the synonym set (synset) as its basic unit, and a user can find a suitable word in a synonym set to express a known concept.
Step two: in the knowledge graph embedding module, the knowledge graphs are embedded into a low-dimensional vector space that is easy for a computer to process, using the structural information, entity descriptions, attribute values and entity type information of the knowledge graphs, based on a graph convolutional network, a pre-trained language model and a hierarchical type embedding model, while preserving as much of the original entity information as possible. This step specifically comprises the following substeps:
step (2.1) based on the text description and attribute of the entity, the entity is embedded by using a language pre-training model, so that entities with similar text description are adjacent in a vector space, and the specific process is as follows:
first, a training data set is constructed based on seed alignment
Figure 111515DEST_PATH_IMAGE002
Figure 379685DEST_PATH_IMAGE004
Is an entity
Figure 434229DEST_PATH_IMAGE006
The corresponding entities, i.e. similar entities,
Figure 644499DEST_PATH_IMAGE008
is randomly selected from
Figure 417283DEST_PATH_IMAGE004
The dissimilar entities of (a);
then, the text description of the entity is led into a language pre-training model BERT for fine adjustment of knowledge graph embedding;
finally, using the multilayer perceptron to filter CLS embedding of the language pre-training model BERT to obtain embedding of the entity
Figure 223565DEST_PATH_IMAGE010
CLS is a special class label used in model BERT;
pair of margin classification loss functions for use in fine tuning
Figure 132615DEST_PATH_IMAGE012
The following were used:
Figure DEST_PATH_IMAGE014A
wherein the content of the first and second substances,
Figure DEST_PATH_IMAGE066
is obtained by
Figure 343148DEST_PATH_IMAGE010
And
Figure 852495DEST_PATH_IMAGE018
in between
Figure 462468DEST_PATH_IMAGE020
The distance is initialized and the distance is initialized,
Figure 226025DEST_PATH_IMAGE024
representing the margin employed between a similar entity pair and a dissimilar entity pair,
Figure 528830DEST_PATH_IMAGE026
smaller means that the two entities are more similar in describing embeddings.
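A minimal sketch of this description-embedding step, assuming HuggingFace Transformers with a multilingual BERT checkpoint and the triplet form of the margin loss reconstructed above; the embedding dimension and helper names are illustrative:

```python
import torch
import torch.nn.functional as F
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
bert = BertModel.from_pretrained("bert-base-multilingual-cased")
mlp = torch.nn.Linear(768, 300)  # filters the CLS embedding into the entity embedding h(e)

def embed(descriptions):
    """h(e): MLP applied to BERT's CLS token embedding of the entity description."""
    enc = tokenizer(descriptions, padding=True, truncation=True, return_tensors="pt")
    cls = bert(**enc).last_hidden_state[:, 0]    # CLS token representation
    return mlp(cls)

def margin_loss(desc_e, desc_pos, desc_neg, margin=1.0):
    """L_d = sum of max(0, d(e, e+) - d(e, e-) + m), with d the L2 distance."""
    h_e, h_pos, h_neg = embed(desc_e), embed(desc_pos), embed(desc_neg)
    d_pos = F.pairwise_distance(h_e, h_pos)      # distance to the similar entity
    d_neg = F.pairwise_distance(h_e, h_neg)      # distance to the dissimilar entity
    return F.relu(d_pos - d_neg + margin).sum()
```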
Step (2.2): based on the structural information of the knowledge graph, the graph convolutional network is used to embed the knowledge graph and enhance neighbor information. The specific process is as follows:
based on the graph convolutional network GCN, entity embeddings aggregate neighbor entity information through information propagation, with the following propagation rule:

H^(l+1) = σ( D~^(-1/2) A~ D~^(-1/2) H^(l) W^(l) )

where A~ = A + I_N represents the adjacency matrix with self-connections added, I_N is the identity matrix, D~ is the diagonal degree matrix of A~ with D~_ii = Σ_j A~_ij, W^(l) denotes the layer-specific trainable weight matrix, and σ denotes the activation function. In addition, H^(l) ∈ R^(N×D) denotes the activation matrix at the l-th layer, where N is the number of entities and D is the dimension of the entity vectors.
Step (2.3): based on the type information of the entities, the entities are embedded through the hierarchical type embedding model HTE, so that entities with similar types are adjacent in the vector space. The specific process is as follows:
a hierarchical type-aware distance function d_T(e, e') calculates the distance between two entities in the type embedding space, and the vector representations of the entities in the type embedding space are learned based on the hierarchical type embedding model HTE. The hierarchical type-aware distance function is parameterized by a distance parameter λ and decreases as the type similarity sim(t_e, t_e') between the two entities increases, where the normalized type similarity value is computed from the information content of the nearest common parent type as follows:

sim(t_e, t_e') = max_{t ∈ S(e, e')} IC(t)

where S(e, e') denotes the set of types that are parent classes of both t_e and t_e', and IC(t) denotes the information content value of type t; the more specific a type, the higher its information content value.
As shown in fig. 3, which depicts part of the type hierarchy of the knowledge graph DBpedia, the relationship between types in the figure is the subclass relation, and the numerical values represent information content values: the more abstract a type, the lower its information content value; conversely, the more specific a type, the higher its information content value. For example, the nearest common parent type of SportsLeague and SambaSchool is Organisation, so their type similarity is 0.531, the information content of Organisation; by contrast, two types whose nearest common parent is the most general owl:Thing have a type similarity of 0, the information content of owl:Thing.
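The nearest-common-parent similarity can be illustrated on a toy hierarchy as follows; apart from Organisation's information content of 0.531 taken from fig. 3, the types and IC values below are illustrative assumptions:

```python
# Toy type hierarchy: child -> parent (owl:Thing is the root).
PARENT = {
    "SportsLeague": "Organisation",
    "SambaSchool": "Organisation",
    "Organisation": "Agent",
    "Person": "Agent",
    "Agent": "owl:Thing",
}
# Normalized information content: abstract types low, specific types high.
IC = {"owl:Thing": 0.0, "Agent": 0.2, "Organisation": 0.531,
      "SportsLeague": 0.9, "SambaSchool": 0.9, "Person": 0.7}

def ancestors(t):
    """The type itself plus all of its parent types up to the root."""
    out = {t}
    while t in PARENT:
        t = PARENT[t]
        out.add(t)
    return out

def type_sim(t1, t2):
    """sim(t1, t2): highest information content among common parent types."""
    common = ancestors(t1) & ancestors(t2)
    return max(IC[t] for t in common)

print(type_sim("SportsLeague", "SambaSchool"))  # 0.531 (nearest common parent: Organisation)
print(type_sim("SportsLeague", "Person"))       # 0.2   (nearest common parent: Agent)
```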
Step three: in the alignment interaction module, based on the knowledge graph embeddings from step two, the similarity vectors for the pairwise description interaction, neighbor description interaction, attribute interaction and type interaction of entities are calculated; the weight of each modality is learned through modality fusion to generate the final entity embeddings; finally, the cosine distance between entity embeddings is calculated to measure the alignment likelihood between entities, and a list of aligned entities is output. This step specifically comprises the following substeps:
and (3.1) based on the entity text description embedding in the step (2.1), calculating the cosine distance between two entity vectors, and generating a one-dimensional similarity vector for each entity, namely calculating to obtain a pair description similarity vector.
Step (3.2): based on the entity neighbor information embeddings of step (2.2), the neighbor description similarity vector is calculated. The specific process is as follows: first, a neighbor similarity matrix is constructed, in which the entry in row i and column j represents the cosine similarity between the i-th neighbor of entity e and the j-th neighbor of entity e'; secondly, a max-pooling operation is applied to each row and each column of the neighbor matrix to select the most relevant neighbor pairs; then, features are extracted from the collected similarities through an RBF kernel aggregation function; finally, the row and column aggregation vectors are concatenated to represent the neighbor description similarity vector.
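A sketch of this neighbor-interaction computation; the RBF kernel means and width are illustrative hyperparameter assumptions:

```python
import torch

def rbf_aggregate(sims, mus=torch.linspace(0, 1, 11), sigma=0.1):
    """Turn a vector of similarities into kernel features: one soft count per RBF kernel."""
    k = torch.exp(-(sims.unsqueeze(-1) - mus) ** 2 / (2 * sigma ** 2))  # (n, K)
    return torch.log(k.sum(dim=0) + 1e-6)                               # (K,)

def neighbor_similarity_vector(nbr_e, nbr_e2):
    """nbr_e: (n, d) neighbor embeddings of e; nbr_e2: (m, d) neighbor embeddings of e'."""
    a = torch.nn.functional.normalize(nbr_e, dim=1)
    b = torch.nn.functional.normalize(nbr_e2, dim=1)
    S = a @ b.T                        # S[i, j]: cosine similarity of the i-th / j-th neighbors
    row_max = S.max(dim=1).values      # max-pooling over each row: best match per neighbor of e
    col_max = S.max(dim=0).values      # max-pooling over each column: best match per neighbor of e'
    # Concatenated row and column aggregation vectors = neighbor description similarity vector.
    return torch.cat([rbf_aggregate(row_max), rbf_aggregate(col_max)])
```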
Step (3.3): based on the entity attribute embeddings of step (2.1), the attribute similarity vector is calculated. The specific process is as follows: first, an attribute similarity matrix is constructed, in which the entry in row i and column j represents the cosine similarity between the i-th attribute of entity e and the j-th attribute of entity e'; secondly, max-pooling and RBF kernel aggregation functions are applied to each row and column; finally, the row and column aggregation vectors are concatenated to represent the attribute similarity vector.
Step (3.4): based on the entity type embeddings of step (2.3), the cosine distance between two entity vectors is calculated, and a one-dimensional similarity vector is generated for each entity pair, i.e., the type similarity vector.
Step (3.5): under the supervision of the seed alignment, the pairwise description, neighbor description, attribute and type similarities are jointly learned to generate the final similarity between entity pairs, and the corresponding aligned entities are found based on the entity similarity. The specific process is as follows: based on the entity similarity vector v(e, e'), the alignment likelihood p(e, e') between two entities is learned through a multi-layer perceptron; the greater the value of p(e, e'), the greater the probability that the two entities are aligned, where v(e, e') is generated by fusing the vectors of steps (3.1) to (3.4) in a weighted concatenation manner.
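The weighted concatenation and multi-layer perceptron of step (3.5) can be sketched as follows; the layer sizes and the sigmoid output are illustrative assumptions:

```python
import torch

class AlignmentScorer(torch.nn.Module):
    """Fuses the four similarity vectors and predicts the alignment likelihood p(e, e')."""
    def __init__(self, dims):                             # dims: sizes of the 4 similarity vectors
        super().__init__()
        self.weights = torch.nn.Parameter(torch.ones(4))  # learned modality weights
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(sum(dims), 64), torch.nn.ReLU(),
            torch.nn.Linear(64, 1), torch.nn.Sigmoid(),
        )

    def forward(self, desc_v, nbr_v, attr_v, type_v):
        # Weighted concatenation of the pairwise-description, neighbor-description,
        # attribute, and type similarity vectors from steps (3.1) to (3.4).
        v = torch.cat([w * x for w, x in zip(self.weights,
                                             (desc_v, nbr_v, attr_v, type_v))], dim=-1)
        return self.mlp(v).squeeze(-1)                    # higher value = more likely aligned
```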
Step (3.6): a greedy alignment strategy is adopted to select, for each entity, the candidate entity with the highest alignment probability and generate an aligned entity pair; the aligned entities are deleted from their corresponding knowledge graphs, and step (3.6) is repeated until the entity set of one knowledge graph is empty.
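A minimal sketch of this greedy alignment strategy, operating on a matrix of pairwise alignment probabilities (a global-greedy variant; tie-breaking and probability thresholds are omitted):

```python
import torch

def greedy_align(P):
    """P[i, j]: alignment probability between entity i of KG1 and entity j of KG2.
    Repeatedly pick the most probable remaining pair and remove both entities."""
    P = P.clone()
    pairs = []
    n1, n2 = P.shape
    for _ in range(min(n1, n2)):        # stops once one graph has no entities left
        idx = int(P.argmax())
        i, j = divmod(idx, n2)
        pairs.append((i, j))
        P[i, :] = -1.0                  # delete the aligned entities from their graphs
        P[:, j] = -1.0
    return pairs
```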
The invention makes full use of the information carried by knowledge graphs themselves and performs entity alignment in a self-supervised joint learning manner, with no manual intervention in the whole process, ensuring the scalability of the system. Experimental results on the cross-lingual entity alignment benchmark data set DBP15K show that, for the Chinese-English, Japanese-English and French-English knowledge graph fusion tasks, the alignment accuracy (Hits@1) of the invention reaches 96.2%, 97.0% and 99.4%, respectively. Compared with the latest self-supervised entity alignment method EVA, the alignment accuracy of the invention is improved by 23.5% on average, and it is even slightly higher (by 0.07%) than the latest supervised learning methods based on manual labeling. These results demonstrate the high accuracy and scalability of the invention.
The average image coverage of the entities in the benchmark data set DBP15K is 71.3%, which provides a good condition for the invention's automatic, image-based seed alignment collection. However, the invention also performs excellently when a knowledge graph contains only a few images: the experimental results show that with only 800 images in the Japanese-English setting, the alignment accuracy of the invention reaches 93.4%, and with only 100 images in the French-English setting, the alignment accuracy reaches 97.8%. In the case where the knowledge graph has no images at all, the invention can still perform entity alignment by taking manually labeled seed alignments as input, and achieves excellent alignment accuracy. Specifically, with 800 manually labeled seed alignments, the alignment accuracy is 96% in the Japanese-English setting; with 100 manually labeled seed alignments, the alignment accuracy is 97.8% in the French-English setting, which demonstrates the versatility of the invention.
Corresponding to the foregoing embodiments of the self-supervised joint learning method for knowledge graph entity alignment, the invention also provides embodiments of a self-supervised joint learning device for knowledge graph entity alignment.
Referring to fig. 4, an embodiment of the present invention provides a self-supervised joint learning device for knowledge graph entity alignment, which includes one or more processors configured to implement the self-supervised joint learning method for knowledge graph entity alignment of the foregoing embodiments.
The embodiment of the self-supervised joint learning device for knowledge graph entity alignment can be applied to any equipment with data processing capability, such as a computer. The device embodiments may be implemented by software, by hardware, or by a combination of hardware and software. Taking a software implementation as an example, the device, as a logical device, is formed by the processor of the equipment reading the corresponding computer program instructions from non-volatile memory into memory and running them. In terms of hardware, fig. 4 shows a hardware structure diagram of the equipment with data processing capability on which the self-supervised joint learning device for knowledge graph entity alignment is located; in addition to the processor, memory, network interface and non-volatile memory shown in fig. 4, the equipment may also include other hardware according to its actual functions, which is not described again here.
The specific details of the implementation process of the functions and actions of each unit in the above device are the implementation processes of the corresponding steps in the above method, and are not described herein again.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the invention. One of ordinary skill in the art can understand and implement it without inventive effort.
An embodiment of the present invention further provides a computer-readable storage medium, on which a program is stored, where the program, when executed by a processor, implements a self-supervised joint learning method oriented to knowledge-graph entity alignment in the foregoing embodiments.
The computer readable storage medium may be an internal storage unit, such as a hard disk or a memory, of any data processing capability device described in any of the foregoing embodiments. The computer readable storage medium may also be an external storage device such as a plug-in hard disk, a Smart Media Card (SMC), an SD Card, a Flash memory Card (Flash Card), etc. provided on the device. Further, the computer readable storage medium may include both internal storage units and external storage devices of any data processing capable device. The computer-readable storage medium is used for storing the computer program and other programs and data required by the arbitrary data processing-capable device, and may also be used for temporarily storing data that has been output or is to be output.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way. Although the foregoing has described the practice of the present invention in detail, it will be apparent to those skilled in the art that modifications may be made to the practice of the invention as described in the foregoing examples, or that certain features may be substituted in the practice of the invention. All changes, equivalents and modifications which come within the spirit and scope of the invention are desired to be protected.

Claims (10)

1. A self-supervised joint learning method for knowledge graph entity alignment, characterized by comprising the following steps:
step one, learning the image features of entities using a pre-trained deep image learning model, measuring image similarity by calculating the Euclidean distance between image features, and selecting the entity pairs with the highest similarity as the seed alignment according to the image similarity;
step two, under the supervision of the seed alignment from step one, embedding the knowledge graphs into a low-dimensional vector space suitable for computer processing, using the structural information, entity descriptions, attribute values and entity type information of the knowledge graphs, based on a graph convolutional network, a pre-trained language model and a hierarchical type embedding model;
step three, based on the knowledge graph embeddings from step two, calculating similarity vectors for the pairwise description interaction, neighbor description interaction, attribute interaction and type interaction of entities, and learning the weight of each modality through modality fusion to generate the final entity embeddings; finally, calculating the cosine distance between entity embeddings to measure the alignment likelihood between entities, and outputting a list of aligned entities.
2. The method of claim 1, wherein step one is specifically: adopting a ResNet-152 model pre-trained on the recognition task of the ImageNet image database as the pre-trained deep image learning model, extracting the output of the first layer of the model as the features of each entity image, and measuring the similarity between entity image pairs using the Euclidean distance between the extracted features; finally, selecting the top-k entity pairs with the highest similarity as the seed alignment according to the image similarity.
3. The self-supervised joint learning method for knowledge graph entity alignment according to claim 2, wherein step two specifically comprises the following substeps:
step (2.1), embedding the entities using the pre-trained language model based on the text descriptions and attribute information of the entities, so that entities with similar text descriptions are adjacent in the vector space;
step (2.2), embedding the knowledge graph using the graph convolutional network based on the structural information of the knowledge graph, to enhance neighbor information;
step (2.3), embedding the entities through the hierarchical type embedding model based on the type information of the entities, so that entities with similar types are adjacent in the vector space.
4. The self-supervised joint learning method for knowledge graph entity alignment according to claim 3, wherein the step (2.1) is specifically:
first, a training data set D = {(e, e+, e-)} is constructed based on the seed alignment, where e+ is the entity corresponding to entity e, i.e., a similar entity, and e- is a randomly selected entity dissimilar to e;
then, the text descriptions of the entities are fed into the pre-trained language model BERT for fine-tuning of the knowledge graph embedding;
finally, the CLS embedding of the pre-trained language model BERT is filtered through a multi-layer perceptron to obtain the entity embedding h(e), where CLS is the special classification token used in BERT;
the margin loss function L_d used in fine-tuning is as follows:

L_d = Σ_{(e, e+, e-) ∈ D} max(0, d(e, e+) - d(e, e-) + m)

where d(e, e') is the ℓ2 distance between h(e) and h(e'), with e' being e+ or e-, m represents the margin enforced between a similar entity pair and a dissimilar entity pair, and a smaller d(e, e') means that the two entities are more similar in the description embedding.
5. The self-supervised joint learning method for knowledge graph entity alignment according to claim 4, wherein the step (2.2) is specifically:
based on the graph convolutional network GCN, entity embeddings aggregate neighbor entity information through information propagation, with the following propagation rule:

H^(l+1) = σ( D~^(-1/2) A~ D~^(-1/2) H^(l) W^(l) )

where A~ = A + I_N represents the adjacency matrix with self-connections added, I_N is the identity matrix, D~ is the diagonal degree matrix of A~ with D~_ii = Σ_j A~_ij, W^(l) denotes the layer-specific trainable weight matrix, σ denotes the activation function, and H^(l) ∈ R^(N×D) denotes the activation matrix at the l-th layer, where N is the number of entities and D is the dimension of the entity vectors.
6. The self-supervised joint learning method for knowledge graph entity alignment according to claim 5, wherein the step (2.3) is specifically:
a hierarchical type-aware distance function d_T(e, e') calculates the distance between two entities in the type embedding space, and the vector representations of the entities in the type embedding space are learned based on the hierarchical type embedding model HTE;
the hierarchical type-aware distance function is parameterized by a distance parameter λ and decreases as the type similarity sim(t_e, t_e') between the two entities increases, where the normalized type similarity value is computed from the information content of the nearest common parent type as follows:

sim(t_e, t_e') = max_{t ∈ S(e, e')} IC(t)

where S(e, e') denotes the set of types that are parent classes of both t_e and t_e', and IC(t) denotes the information content value of type t; the more specific a type, the higher its information content value.
7. The self-supervised joint learning method for knowledge graph entity alignment according to claim 6, wherein step three specifically comprises the following substeps:
step (3.1), based on the entity text description embeddings of step (2.1), calculating the cosine distance between two entity vectors and generating a one-dimensional similarity vector for each entity pair, i.e., the pairwise description similarity vector;
step (3.2), based on the entity neighbor information embeddings of step (2.2), calculating the neighbor description similarity vector;
step (3.3), based on the entity attribute information embeddings of step (2.1), calculating the attribute similarity vector;
step (3.4), based on the entity type embeddings of step (2.3), calculating the cosine distance between two entity vectors and generating a one-dimensional similarity vector for each entity pair, i.e., the type similarity vector;
step (3.5), under the supervision of the seed alignment from step one, jointly learning the pairwise description, neighbor description, attribute and type similarities to generate the final similarity between entity pairs, and finding the corresponding aligned entities based on the entity similarity;
step (3.6), adopting a greedy alignment strategy to select, for each entity, the candidate entity with the highest alignment probability to generate an aligned entity pair, deleting the aligned entities from their corresponding knowledge graphs, and repeating step (3.6) until the entity set of one knowledge graph is empty.
8. The self-supervised joint learning method for knowledge graph entity alignment according to claim 7, wherein the step (3.2) is specifically: first, a neighbor similarity matrix is constructed, in which the entry in row i and column j represents the cosine similarity between the i-th neighbor of entity e and the j-th neighbor of entity e'; secondly, a max-pooling operation is applied to each row and each column of the neighbor matrix to select the most relevant neighbor pairs; then, features are extracted from the collected similarities through an RBF kernel aggregation function; finally, the row and column aggregation vectors are concatenated to represent the neighbor description similarity vector.
9. The self-supervised joint learning method for knowledge graph entity alignment according to claim 7, wherein the step (3.3) is specifically: first, an attribute similarity matrix is constructed, in which the entry in row i and column j represents the cosine similarity between the i-th attribute of entity e and the j-th attribute of entity e'; secondly, max-pooling and RBF kernel aggregation functions are applied to each row and column of the attribute matrix; finally, the row and column aggregation vectors are concatenated to represent the attribute similarity vector.
10. The self-supervised joint learning method for knowledge graph entity alignment according to claim 7, wherein the specific process of the step (3.5) is as follows: based on the entity similarity vector v(e, e'), the alignment likelihood p(e, e') between two entities is learned through a multi-layer perceptron; the greater the value of p(e, e'), the greater the probability that the two entities are aligned, where v(e, e') is generated by fusing the vectors of steps (3.1) to (3.4) in a weighted concatenation manner.
CN202211098589.4A (filed 2022-09-09, priority 2022-09-09): Self-supervised joint learning method for knowledge graph entity alignment; status Pending; published as CN115168620A

Priority Applications (1)

CN202211098589.4A: priority date 2022-09-09; filing date 2022-09-09; title: Self-supervised joint learning method for knowledge graph entity alignment

Publications (1)

CN115168620A: published 2022-10-11

Family

ID: 83482423

Family Applications (1)

CN202211098589.4A: priority date 2022-09-09; filing date 2022-09-09; title: Self-supervised joint learning method for knowledge graph entity alignment; status Pending

Country Status (1)

Country: CN; publication CN115168620A, Pending

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111753101A (en) * 2020-06-30 2020-10-09 华侨大学 Knowledge graph representation learning method integrating entity description and type
CN112200317A (en) * 2020-09-28 2021-01-08 西南电子技术研究所(中国电子科技集团公司第十研究所) Multi-modal knowledge graph construction method
CN112948597A (en) * 2021-01-06 2021-06-11 中国人民解放军国防科技大学 Unsupervised knowledge graph entity alignment method and unsupervised knowledge graph entity alignment equipment
CN112800770A (en) * 2021-04-15 2021-05-14 南京樯图数据研究院有限公司 Entity alignment method based on heteromorphic graph attention network
CN113656596A (en) * 2021-08-18 2021-11-16 中国人民解放军国防科技大学 Multi-modal entity alignment method based on triple screening fusion
CN114647715A (en) * 2022-04-07 2022-06-21 杭州电子科技大学 Entity recognition method based on pre-training language model
CN114969367A (en) * 2022-05-30 2022-08-30 大连民族大学 Cross-language entity alignment method based on multi-aspect subtask interaction

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116069956A (en) * 2023-03-29 2023-05-05 之江实验室 Drug knowledge graph entity alignment method and device based on mixed attention mechanism
CN116069956B (en) * 2023-03-29 2023-07-04 之江实验室 Drug knowledge graph entity alignment method and device based on mixed attention mechanism
CN116431835A (en) * 2023-06-06 2023-07-14 中汽数据(天津)有限公司 Automatic knowledge graph construction method, equipment and medium in automobile authentication field
CN116431835B (en) * 2023-06-06 2023-09-15 中汽数据(天津)有限公司 Automatic knowledge graph construction method, equipment and medium in automobile authentication field
CN116561346A (en) * 2023-07-06 2023-08-08 北京邮电大学 Entity alignment method and device based on graph convolution network and information fusion
CN116561346B (en) * 2023-07-06 2023-10-31 北京邮电大学 Entity alignment method and device based on graph convolution network and information fusion
CN117407689A (en) * 2023-12-14 2024-01-16 之江实验室 Entity alignment-oriented active learning method and device and electronic device
CN117407689B (en) * 2023-12-14 2024-04-19 之江实验室 Entity alignment-oriented active learning method and device and electronic device


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
RJ01: Rejection of invention patent application after publication (application publication date: 20221011)