CN113177105A - Word embedding-based multi-source heterogeneous water conservancy field data fusion method

Info

Publication number
CN113177105A
Authority
CN
China
Prior art keywords
candidate
similarity
pair
attribute
water conservancy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110490308.9A
Other languages
Chinese (zh)
Inventor
胡伟
高祥涛
朱向荣
陆小明
高凤宁
司存友
曹帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Province Hydrology And Water Resources Investigation Bureau
Nanjing University
Original Assignee
Jiangsu Province Hydrology And Water Resources Investigation Bureau
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Province Hydrology And Water Resources Investigation Bureau and Nanjing University
Priority to CN202110490308.9A
Publication of CN113177105A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a word embedding-based method for fusing multi-source heterogeneous water conservancy field data. Given multi-source heterogeneous water conservancy field data, the data is first constructed into a water conservancy knowledge graph. Next, a word embedding model is used to generate a vector representation for each entity or attribute in the knowledge graph. Then, for every pair of entities or attributes, similarity is calculated at three levels: the Chinese literal, the English literal, and the vector representation. Finally, the three similarities are combined to obtain a similarity score for the two candidate entities or attributes. A preset similarity score threshold and an upper limit on the number of candidate similar entries constrain the number of similar entities or attributes, yielding the entity pairs or attribute pairs finally determined to match. The method finds similar entity pairs and similar attribute pairs in multi-source heterogeneous water conservancy field data and reduces the complexity of data retrieval for water conservancy practitioners.

Description

Word embedding-based multi-source heterogeneous water conservancy field data fusion method
Technical Field
The invention relates to the technical field of knowledge graphs, and in particular to a word embedding-based multi-source heterogeneous water conservancy field data fusion method.
Background
In 2012, Google first proposed the concept of the knowledge graph, which improves search quality by structuring information about search targets. In terms of content, a knowledge graph consists mainly of interconnected entities and their attributes; in essence, it can be viewed as a knowledge base built on a semantic network, in which each piece of knowledge is represented by a triple. For example, the triple (Yangcheng Lake, location, Suzhou) represents a real-world fact: Yangcheng Lake is located in Suzhou. Because many real-world scenarios are well suited to representation as knowledge graphs, their construction and application have become a research hotspot in recent years. A large number of high-quality knowledge graphs, such as Freebase, have emerged and are widely used in real-world applications.
"Water is formless yet takes ten thousand shapes", and the management and use of water is a millennia-old problem bearing on people's livelihood. Because of its inherent continuity in time and wide distribution in space, the water conservancy field continuously generates massive amounts of field data, and such data is particularly well suited to management with a knowledge graph. Problems such as flood control and drainage, water environment, water resources, and water ecology require extensive knowledge and complex reasoning, and a knowledge graph can serve experts and ordinary practitioners in the water conservancy field as a powerful tool for storing, managing, and using knowledge.
Traditionally, the water conservancy industry has relied on keyword-based search, which makes it difficult to retrieve information by using the relations between objects. In addition, the same entity or attribute may be expressed with different text in different data sources, so keyword-based search handles retrieval over multi-source heterogeneous data poorly.
Disclosure of Invention
Purpose of the invention: in view of the problems and deficiencies of the prior art, the invention aims to provide a word embedding-based multi-source heterogeneous water conservancy field data fusion method that finds similar entities and attributes for the entities and attributes in multi-source heterogeneous water conservancy field data, assists in linking and fusing that data, improves the recall of water conservancy field data retrieval, and improves the information retrieval efficiency of water conservancy practitioners.
Technical scheme: to achieve the above purpose, the invention adopts a word embedding-based multi-source heterogeneous water conservancy field data fusion method comprising the following steps:
(1.1) for the currently given water conservancy field data, separating the entities from the attributes to generate candidate entity pairs and candidate attribute pairs;
(1.2) for each candidate entity pair and candidate attribute pair generated in step (1.1), calculating the similarity of the two entities or attributes at the Chinese literal level, the English literal level, and the vector representation level;
(1.3) combining the Chinese literal, English literal, and vector representation similarities calculated in step (1.2) to calculate the similarity of each entity pair and each attribute pair;
(1.4) comparing the similarity calculated in step (1.3) with a preset threshold, filtering out the candidate entity pairs and candidate attribute pairs whose similarity is below the threshold, and retaining those whose similarity is above the threshold, thereby screening out the matching entity pairs and matching attribute pairs.
Further, the candidate entity pair consists of two candidate entities, the candidate attribute pair consists of two candidate attributes, and step (1.2) comprises the following steps:
(2.1) calculating the string similarity of the Chinese names of the two candidate entities or candidate attributes using the Jaccard index;
(2.2) calculating the string similarity of the English names of the two candidate entities or candidate attributes using the edit distance;
(2.3) calculating the similarity of the embedding vectors of the two candidate entities or candidate attributes using cosine similarity.
Further, step (2.3) comprises the following steps:
(3.1) for the candidate entities and candidate attributes generated in step (1.1), obtaining their vector representations using a CBoW word vector model;
(3.2) from the vector representations obtained in step (3.1), extracting the vector representations of the current candidate entity pair or candidate attribute pair and calculating their cosine similarity.
Further, step (1.3) comprises the following steps:
(4.1) determining, according to the characteristics of water conservancy field data, the weights of the Chinese literal, English literal, and vector representation similarities in the similarity of the entity pair and of the attribute pair, such that the three weights sum to 1;
(4.2) using the weights determined in step (4.1), calculating the weighted average of the Chinese literal, English literal, and vector representation similarities as the similarity of the entity pair or the attribute pair.
Beneficial effects: (1) Similar entities and attributes in multi-source heterogeneous water conservancy data are matched, which assists the fusion of that data. (2) The method can be used as a component in traditional keyword-based water conservancy data retrieval to improve recall and thereby the efficiency of data retrieval for water conservancy practitioners.
Drawings
FIG. 1 is an overall flow chart of the present invention.
Detailed Description
The present invention is further illustrated by the following figures and specific examples, which are to be understood as illustrative only and not as limiting the scope of the invention, which is to be given the full breadth of the appended claims and any and all equivalent modifications thereof which may occur to those skilled in the art upon reading the present specification.
To better exploit the aggregation effect of knowledge, multi-source heterogeneous data needs to be linked and fused. Word embedding techniques from machine learning can project the entities and attributes of different knowledge graphs into a unified low-dimensional vector space and thereby link and fuse multi-source heterogeneous data.
The invention manages water conservancy field data with a knowledge graph and uses word vectors to match similar entity pairs and similar attribute pairs in the graph, thereby fusing multi-source heterogeneous water conservancy field data. The water conservancy knowledge graph is divided into an entity part and an attribute part, and the fusion of multi-source heterogeneous data and the matching of similar concepts are carried out on each part separately. First, for the two candidate entities or candidate attributes in each candidate pair, the similarity is calculated at the Chinese literal level, the English literal level, and the vector representation level. Then the weighted average of the three similarities is calculated to obtain the similarity of the candidate entity pair or candidate attribute pair. Finally, the candidate entity pairs and candidate attribute pairs are filtered using a preset similarity threshold and a preset upper limit on the number of matches to obtain the matching entity pairs and matching attribute pairs.
The overall process of the invention is shown in FIG. 1 and comprises four parts: dividing the current knowledge graph into candidate entity pairs and candidate attribute pairs; calculating the Chinese literal, English literal, and vector representation similarities for each candidate entity pair or candidate attribute pair; calculating the weighted average of the three similarities as the similarity of the pair; and screening out the matching entity pairs or matching attribute pairs according to a preset threshold.
The specific implementation methods are respectively described as follows:
1. Partitioning the current knowledge graph into candidate entity pairs and candidate attribute pairs
For a given knowledge graph, the head and tail entities are separated from the attributes in the triples to generate an entity set and an attribute set. Within the entity set, the entities are paired two by two, the literal similarity of their names is calculated, and entity pairs whose similarity is below a threshold are filtered out directly. Likewise, within the attribute set, the attributes are paired two by two, the literal similarity of their names is calculated, and attribute pairs whose similarity is below the threshold are filtered out directly. In the invention, the literal similarity is calculated using the Jaccard index, and the similarity threshold is set to 0.4. The Jaccard index, also called intersection over union, measures the similarity of finite sample sets and is defined as the ratio of the size of the intersection of two sets to the size of their union. It is calculated as follows:
J(A,B)=|A∩B|÷|A∪B|
In the formula, A and B represent the two sets whose Jaccard index is to be calculated, J(A,B) represents their Jaccard index, |A∩B| represents the size of the intersection of the two sets, and |A∪B| represents the size of their union.
In the invention, the Chinese names of the two candidate entities or candidate attributes are each split into a set of Chinese terms, and the Jaccard index of the two term sets is taken as the similarity at the Chinese literal level.
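The partitioning and pre-filtering described in this step can be sketched as follows. This is a minimal illustration and not the applicants' implementation: the function names are hypothetical, the split into "Chinese terms" is assumed to be character-level (the text does not specify the segmentation), and pairs scoring exactly at the threshold are kept, which the text leaves open.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Jaccard index J(A,B) = |A∩B| / |A∪B| over the character sets of two names.

    Assumption: each name is split into single characters.
    """
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0  # both names empty: define the similarity as 0
    return len(set_a & set_b) / len(union)


def split_triples(triples):
    """Separate head/tail entities from attributes in (head, attribute, tail) triples."""
    entities, attributes = set(), set()
    for head, attribute, tail in triples:
        entities.update((head, tail))
        attributes.add(attribute)
    return entities, attributes


def candidate_pairs(names, threshold=0.4):
    """Pair names two by two and drop pairs whose Jaccard index is below the threshold."""
    pairs = []
    for x, y in combinations(sorted(names), 2):
        score = jaccard(x, y)
        if score >= threshold:
            pairs.append((x, y, score))
    return pairs
```

For example, "阳澄湖" and "阳澄西湖" share three of four distinct characters, so their Jaccard index is 0.75 and the pair survives the 0.4 pre-filter.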
2. Calculating the Chinese literal, English literal, and vector representation similarities for a candidate entity pair or candidate attribute pair
The Chinese literal similarity of a candidate entity pair or candidate attribute pair is calculated using the Jaccard index. The English literal similarity is calculated using the edit distance. The edit distance measures the degree of difference between two strings as the minimum number of operations required to transform one string into the other. The invention uses the Levenshtein distance, in which the atomic edit operations are deleting, inserting, and replacing a single character. The edit distance of two candidate entities or candidate attributes is normalized to a similarity measure between 0 and 1 as follows:
S(C,D)=1-L(C,D)÷max(|C|,|D|)
In the formula, C and D represent the two strings whose similarity is to be measured, S(C,D) represents their similarity based on the edit distance, L(C,D) represents their edit distance, and max(|C|,|D|) represents the length of the longer of the two strings. Dividing the edit distance by the length of the longer string gives a degree of difference normalized to between 0 and 1; subtracting that value from 1 gives the similarity.
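The normalized edit-distance similarity S(C,D) can be illustrated with a short sketch. The function names are hypothetical, and returning 1.0 for two empty strings is an assumption made here to avoid division by zero; the text does not cover that case.

```python
def levenshtein(c: str, d: str) -> int:
    """Minimum number of single-character deletions, insertions and replacements."""
    if len(c) < len(d):
        c, d = d, c  # keep the shorter string in the inner loop
    previous = list(range(len(d) + 1))
    for i, ch_c in enumerate(c, start=1):
        current = [i]
        for j, ch_d in enumerate(d, start=1):
            cost = 0 if ch_c == ch_d else 1
            current.append(min(
                previous[j] + 1,         # delete ch_c
                current[j - 1] + 1,      # insert ch_d
                previous[j - 1] + cost,  # replace (or keep when equal)
            ))
        previous = current
    return previous[-1]


def edit_similarity(c: str, d: str) -> float:
    """S(C,D) = 1 - L(C,D) / max(|C|,|D|), a value between 0 and 1."""
    longest = max(len(c), len(d))
    if longest == 0:
        return 1.0  # assumption: two empty names count as identical
    return 1.0 - levenshtein(c, d) / longest
```

For "kitten" and "sitting" the Levenshtein distance is 3 and the longer string has 7 characters, so the similarity is 1 - 3/7.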
A word embedding model is trained on the given knowledge graph to obtain vector representations of the entities and attributes. The vector representations of the two entities or attributes in a candidate entity pair or candidate attribute pair are then retrieved, and their similarity is calculated using cosine similarity. Cosine similarity measures the similarity of two vectors by the cosine of the angle between them, which can be derived from the Euclidean dot product formula:
cos(θ)=(E·F)÷(|E|·|F|)
In the formula, E and F represent the two vectors for which the cosine of the included angle is to be calculated, θ represents the angle between the two vectors, E·F represents their dot product, |E|·|F| represents the product of their lengths, and cos(θ) represents the cosine of the angle between them.
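The cosine formula above can be written directly in code. This is a small sketch using plain Python lists as vectors; the function name is hypothetical, and returning 0.0 for a zero-length vector is an assumption, since the formula is undefined in that case.

```python
import math


def cosine_similarity(e, f):
    """cos(θ) = (E·F) / (|E|·|F|) for two equal-length numeric vectors."""
    if len(e) != len(f):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(e, f))
    norm_e = math.sqrt(sum(x * x for x in e))
    norm_f = math.sqrt(sum(y * y for y in f))
    if norm_e == 0 or norm_f == 0:
        return 0.0  # assumption: treat a zero vector as dissimilar to everything
    return dot / (norm_e * norm_f)
```

Note that the cosine lies between -1 and 1, whereas the two literal similarities lie between 0 and 1. The text does not say whether negative values are clipped or rescaled before the three similarities are combined.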
The representation learning model used in the invention is the CBoW (Continuous Bag-of-Words) model of the word2vec algorithm. The word2vec algorithm is based on the distributional hypothesis, under which the word frequencies of a document reflect its subject and two words that appear in similar contexts have similar semantics. The CBoW model predicts the center word from its context: the training input is the word vectors of the context words surrounding a given word, and the output is the word vector of that center word.
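To make the CBoW input/output relationship concrete, the sketch below builds (context, center) training examples from a token sequence. It only illustrates how the examples are formed and does not train a model. The function name and the window size are hypothetical, and treating each triple as a three-token sequence is an assumption; the text does not state how the training corpus is derived from the knowledge graph.

```python
def cbow_examples(tokens, window=2):
    """Return (context_words, center_word) pairs for CBoW-style training.

    For each position, the context is up to `window` tokens on either side,
    and the model's target is the token at that position.
    """
    examples = []
    for i, center in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:  # skip tokens that have no neighbours
            examples.append((context, center))
    return examples


# Assumption: one triple is serialized as one short "sentence".
triple = ["Yangcheng Lake", "location", "Suzhou"]
examples = cbow_examples(triple, window=1)
```

In practice a library implementation would normally be used for training; in gensim, for instance, `Word2Vec` selects the CBoW architecture when `sg=0`.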
3. Calculating the weighted average of the three similarities as the similarity of the candidate entity pair or candidate attribute pair
Step 2 yields the similarity of each candidate entity pair or candidate attribute pair at the Chinese literal level, the English literal level, and the vector representation level. To obtain the overall similarity of the pair, a weighted average of the three is calculated. Writing the Chinese literal similarity as a, the English literal similarity as b, the vector representation similarity as c, and the weighted average as d:
d=α*a+β*b+γ*c
In the formula, α, β, and γ are the weights of the three similarities in the final similarity of the candidate entity pair or candidate attribute pair; in the invention they are set to 0.6, 0.3, and 0.1, respectively.
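A minimal sketch of the weighted combination, using the weights stated above as defaults. The function name and the explicit check that the weights sum to 1 are illustrative additions.

```python
def combined_similarity(a, b, c, alpha=0.6, beta=0.3, gamma=0.1):
    """d = α·a + β·b + γ·c, with a, b, c the Chinese literal, English literal
    and vector representation similarities."""
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return alpha * a + beta * b + gamma * c
```

For example, with a = 0.75, b = 0.5 and c = 0.9 the result is 0.6·0.75 + 0.3·0.5 + 0.1·0.9 = 0.69, which would pass the 0.6 threshold used in the next step.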
4. Screening out matching entity pairs or matching attribute pairs according to a preset threshold
Step 3 yields the similarity of each candidate entity pair or candidate attribute pair. Because the invention aims to fuse multi-source heterogeneous water conservancy field data using word embedding, candidate entity pairs and candidate attribute pairs with low similarity must be filtered out with a preset threshold to improve accuracy. The preset similarity threshold is 0.6: candidate entity pairs and candidate attribute pairs with similarity below the threshold are filtered out, and those with similarity above the threshold are retained.
Because some entities or attributes have too many similar counterparts, an upper limit on the number of matches per entity or attribute is used as a further constraint. The limit is set to 10 in the invention: for each entity or attribute, only the candidates whose similarity exceeds the threshold of 0.6 and ranks in the top 10 are retained as the final matching entity pairs and matching attribute pairs.
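The double constraint (threshold plus per-entry upper limit) can be sketched as below. The function name and the input format, a list of (name, name, score) tuples, are hypothetical. Pairs scoring exactly at the threshold are kept here, which is an assumption because the text only distinguishes "lower" from "higher".

```python
from collections import defaultdict


def select_matches(scored_pairs, threshold=0.6, limit=10):
    """Keep, for each entity or attribute, at most `limit` partners whose
    similarity reaches `threshold`, ordered by descending similarity."""
    by_name = defaultdict(list)
    for x, y, score in scored_pairs:
        if score >= threshold:
            # A pair is a match candidate for both of its members.
            by_name[x].append((y, score))
            by_name[y].append((x, score))
    return {
        name: sorted(partners, key=lambda item: item[1], reverse=True)[:limit]
        for name, partners in by_name.items()
    }


scored = [("A", "B", 0.9), ("A", "C", 0.5), ("A", "D", 0.7)]
matches = select_matches(scored, limit=1)
```

With `limit=1`, entry "A" keeps only its best partner "B", and "C" does not appear at all because its only pair falls below the threshold.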
The word embedding-based multi-source heterogeneous water conservancy field data fusion method can match similar entities and similar attributes in multi-source heterogeneous water conservancy field data, which improves the recall of water conservancy information retrieval, while the double constraint of a threshold and an upper limit improves the accuracy of the retrieval results. Several examples are given in Table 1 below, where the first column is the target entity or attribute and the second column lists the entities or attributes with higher similarity (in descending order of similarity).
Table 1: Examples of matching entity pairs and matching attribute pairs in the invention
[Table 1 is an image in the original publication (Figure BDA0003051702160000061); its contents are not reproduced in this text.]

Claims (4)

1. A word embedding-based multi-source heterogeneous water conservancy field data fusion method, characterized by comprising the following steps:
(1.1) for the currently given water conservancy field data, separating the entities from the attributes to generate candidate entity pairs and candidate attribute pairs;
(1.2) for each candidate entity pair and candidate attribute pair generated in step (1.1), calculating the similarity of the two entities or attributes at the Chinese literal level, the English literal level, and the vector representation level;
(1.3) combining the Chinese literal, English literal, and vector representation similarities calculated in step (1.2) to calculate the similarity of each entity pair and each attribute pair;
(1.4) comparing the similarity calculated in step (1.3) with a preset threshold, filtering out the candidate entity pairs and candidate attribute pairs whose similarity is below the threshold, and retaining those whose similarity is above the threshold, thereby screening out the matching entity pairs and matching attribute pairs.
2. The word embedding-based multi-source heterogeneous water conservancy field data fusion method according to claim 1, wherein the candidate entity pair consists of two candidate entities, the candidate attribute pair consists of two candidate attributes, and step (1.2) comprises the following steps:
(2.1) calculating the string similarity of the Chinese names of the two candidate entities or candidate attributes using the Jaccard index;
(2.2) calculating the string similarity of the English names of the two candidate entities or candidate attributes using the edit distance;
(2.3) calculating the similarity of the embedding vectors of the two candidate entities or candidate attributes using cosine similarity.
3. The word embedding-based multi-source heterogeneous water conservancy field data fusion method according to claim 2, wherein step (2.3) comprises the following steps:
(3.1) for the candidate entities and candidate attributes generated in step (1.1), obtaining their vector representations using a CBoW word vector model;
(3.2) from the vector representations obtained in step (3.1), extracting the vector representations of the current candidate entity pair or candidate attribute pair and calculating their cosine similarity.
4. The word embedding-based multi-source heterogeneous water conservancy field data fusion method according to claim 1, wherein step (1.3) comprises the following steps:
(4.1) determining, according to the characteristics of water conservancy field data, the weights of the Chinese literal, English literal, and vector representation similarities in the similarity of the entity pair and of the attribute pair, such that the three weights sum to 1;
(4.2) using the weights determined in step (4.1), calculating the weighted average of the Chinese literal, English literal, and vector representation similarities as the similarity of the entity pair or the attribute pair.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110490308.9A CN113177105A (en) 2021-05-06 2021-05-06 Word embedding-based multi-source heterogeneous water conservancy field data fusion method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110490308.9A CN113177105A (en) 2021-05-06 2021-05-06 Word embedding-based multi-source heterogeneous water conservancy field data fusion method

Publications (1)

Publication Number Publication Date
CN113177105A true CN113177105A (en) 2021-07-27

Family

ID=76928445

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110490308.9A Pending CN113177105A (en) 2021-05-06 2021-05-06 Word embedding-based multi-source heterogeneous water conservancy field data fusion method

Country Status (1)

Country Link
CN (1) CN113177105A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108710663A (en) * 2018-05-14 2018-10-26 北京大学 A kind of data matching method and system based on ontology model
CN111428044A (en) * 2020-03-06 2020-07-17 中国平安人寿保险股份有限公司 Method, device, equipment and storage medium for obtaining supervision identification result in multiple modes
CN112100356A (en) * 2020-09-17 2020-12-18 武汉纺织大学 Knowledge base question-answer entity linking method and system based on similarity

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
钱双双: "金融领域的知识图谱构建与应用", 《中国优秀硕士学位论文全文数据库信息科技辑》, pages 138 - 2941 *
高凤宁 等: "面向智能搜索应用的水利知识图谱构建", 《江苏水利》, pages 59 - 64 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112966027A (en) * 2021-03-22 2021-06-15 青岛科技大学 Entity association mining method based on dynamic probe
CN112966027B (en) * 2021-03-22 2022-10-21 青岛科技大学 Entity association mining method based on dynamic probe
CN114676237A (en) * 2022-03-15 2022-06-28 平安科技(深圳)有限公司 Sentence similarity determining method and device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
CN107133213B (en) Method and system for automatically extracting text abstract based on algorithm
Shi et al. Deep adaptively-enhanced hashing with discriminative similarity guidance for unsupervised cross-modal retrieval
US20170161619A1 (en) Concept-Based Navigation
WO2023065617A1 (en) Cross-modal retrieval system and method based on pre-training model and recall and ranking
CN111324765A (en) Fine-grained sketch image retrieval method based on depth cascade cross-modal correlation
CN110263659A (en) A kind of finger vein identification method and system based on triple loss and lightweight network
WO2024131111A1 (en) Intelligent writing method and apparatus, device, and nonvolatile readable storage medium
CN113177105A (en) Word embedding-based multi-source heterogeneous water conservancy field data fusion method
CN113051368B (en) Double-tower model training method, retrieval device and electronic equipment
CN112347761B (en) BERT-based drug relation extraction method
CN113032541B (en) Answer extraction method based on bert and fusing sentence group retrieval
CN109492678A (en) A kind of App classification method of integrated shallow-layer and deep learning
CN112084312B (en) Intelligent customer service system constructed based on knowledge graph
CN113158674B (en) Method for extracting key information of documents in artificial intelligence field
CN114661914A (en) Contract examination method, device, equipment and storage medium based on deep learning and knowledge graph
CN116227594A (en) Construction method of high-credibility knowledge graph of medical industry facing multi-source data
CN117113982A (en) Big data topic analysis method based on embedded model
CN117151052B (en) Patent query report generation method based on large language model and graph algorithm
CN116701665A (en) Deep learning-based traditional Chinese medicine ancient book knowledge graph construction method
CN110674293A (en) Text classification method based on semantic migration
CN112800259B (en) Image generation method and system based on edge closure and commonality detection
CN113656556B (en) Text feature extraction method and knowledge graph construction method
CN106384127B (en) The method and system of comparison point pair and binary descriptor are determined for image characteristic point
CN112948544B (en) Book retrieval method based on deep learning and quality influence
CN111199154B (en) Fault-tolerant rough set-based polysemous word expression method, system and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210727
