CN112015911A - Method for searching massive knowledge maps - Google Patents

Info

Publication number
CN112015911A
Authority
CN
China
Prior art keywords
knowledge graph
data matrix
knowledge
visual data
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010857339.9A
Other languages
Chinese (zh)
Other versions
CN112015911B (en)
Inventor
樊星
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Squirrel Classroom Artificial Intelligence Technology Co Ltd
Original Assignee
Shanghai Squirrel Classroom Artificial Intelligence Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Squirrel Classroom Artificial Intelligence Technology Co Ltd filed Critical Shanghai Squirrel Classroom Artificial Intelligence Technology Co Ltd
Priority to CN202010857339.9A priority Critical patent/CN112015911B/en
Publication of CN112015911A publication Critical patent/CN112015911A/en
Application granted granted Critical
Publication of CN112015911B publication Critical patent/CN112015911B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34Browsing; Visualisation therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for retrieving massive knowledge graphs, which addresses the problem that conventional knowledge graph retrieval methods cannot adequately provide the user with knowledge graphs related to the one the user searched for. The method comprises: constructing a visual data matrix for each knowledge graph in a pre-acquired massive knowledge graph set; calculating, according to a preset association degree algorithm, the degree of association between the visual data matrix of a first knowledge graph and the visual data matrix of each second knowledge graph in the set; and storing each second knowledge graph whose visual data matrix has a degree of association with that of the first knowledge graph as a knowledge graph associated with the first knowledge graph. By calculating the degree of association and the similarity between knowledge graphs, the method can accurately determine how the knowledge graphs are related, so that knowledge graphs strongly associated with the one the user searched for can be pushed to the user, improving the user experience.

Description

Method for searching massive knowledge maps
Technical Field
The invention relates to the technical field of knowledge graphs, and in particular to a method for retrieving massive knowledge graphs.
Background
A knowledge graph, known in library and information science as knowledge domain visualization or knowledge domain mapping, is a family of graphs that display the development process and structural relationships of knowledge. With the rapid development of theories and methods from applied mathematics, graphics, information visualization, information science and related disciplines, the number of knowledge graphs has grown explosively, the associations among knowledge graphs have become stronger, and a knowledge graph is no longer an information island. How to let a user who searches for a knowledge graph also browse the knowledge graphs related to it, so that the user retrieves more useful information and has a better experience, has therefore become a focus of recent research. However, current methods for retrieving related knowledge graphs have low matching accuracy and provide a poor user experience.
Disclosure of Invention
The invention provides a method for retrieving massive knowledge graphs, which addresses the low matching accuracy and poor user experience of conventional knowledge graph retrieval methods. By calculating the degree of association and the similarity between knowledge graphs, the method can accurately determine how the knowledge graphs are related, so that knowledge graphs strongly associated with the one the user searched for can be pushed to the user, improving the user experience.
The invention provides a method for retrieving massive knowledge graphs, which comprises the following steps:
constructing a visual data matrix of each knowledge graph in a pre-acquired massive knowledge graph set;
calculating, according to a preset association degree algorithm, the degree of association between the visual data matrix of a first knowledge graph and the visual data matrix of each second knowledge graph in the massive knowledge graph set; the first knowledge graph is any knowledge graph in the massive knowledge graph set, and the second knowledge graphs are the knowledge graphs in the massive knowledge graph set other than the first knowledge graph;
storing each second knowledge graph whose visual data matrix has a degree of association with the visual data matrix of the first knowledge graph as a knowledge graph associated with the first knowledge graph;
receiving a retrieval request for a target knowledge graph;
retrieving the target knowledge graph and the knowledge graphs associated with the target knowledge graph, and providing the retrieval result to the user.
In one embodiment, after storing each second knowledge graph whose visual data matrix has a degree of association with the visual data matrix of the first knowledge graph as a knowledge graph associated with the first knowledge graph, and before receiving a retrieval request for a target knowledge graph, the method further comprises:
calculating, according to a preset similarity algorithm, the similarity between the visual data matrix of a third knowledge graph and the visual data matrix of the first knowledge graph; the third knowledge graph is a knowledge graph associated with the first knowledge graph;
screening out each third knowledge graph whose visual data matrix has a similarity to the visual data matrix of the first knowledge graph that reaches a preset similarity threshold, and obtaining and storing a recommended knowledge graph set corresponding to the first knowledge graph;
wherein retrieving the target knowledge graph and the knowledge graphs associated with the target knowledge graph and providing the retrieval result to the user comprises:
retrieving the target knowledge graph and the recommended knowledge graph set corresponding to the target knowledge graph, and providing the retrieval result to the user.
In one embodiment, after storing each second knowledge graph whose visual data matrix has a degree of association with the visual data matrix of the first knowledge graph as a knowledge graph associated with the first knowledge graph, and before receiving a retrieval request for a target knowledge graph, the method further comprises:
sorting the knowledge graphs associated with the first knowledge graph in descending order of the degree of association between their visual data matrices and the visual data matrix of the first knowledge graph, to obtain a sorting result;
calculating, according to a preset similarity algorithm, the similarity between the visual data matrix of the Nth knowledge graph in the sorting result and the visual data matrix of the first knowledge graph, wherein N is a positive integer with an initial value of 1;
judging whether the similarity between the visual data matrix of the Nth knowledge graph in the sorting result and the visual data matrix of the first knowledge graph reaches a preset similarity threshold;
if the similarity reaches the preset similarity threshold, putting the Nth knowledge graph in the sorting result into the recommended knowledge graph set corresponding to the first knowledge graph, and adding 1 to the count value of a preset counter, wherein the initial count value of the counter is 0;
judging whether the count value of the counter is equal to a preset graph extension number;
if the count value of the counter is not equal to the preset graph extension number, judging whether N is equal to M, wherein M is the number of knowledge graphs in the sorting result;
if N is not equal to M, setting N to N + 1 and returning to the step of calculating, according to the preset similarity algorithm, the similarity between the visual data matrix of the Nth knowledge graph in the sorting result and the visual data matrix of the first knowledge graph;
if the count value of the counter is equal to the preset graph extension number, or N is equal to M, storing the current recommended knowledge graph set corresponding to the first knowledge graph;
if the similarity between the visual data matrix of the Nth knowledge graph in the sorting result and the visual data matrix of the first knowledge graph does not reach the preset similarity threshold, executing the step of judging whether N is equal to M;
wherein retrieving the target knowledge graph and the knowledge graphs associated with the target knowledge graph and providing the retrieval result to the user comprises:
retrieving the target knowledge graph and the recommended knowledge graph set corresponding to the target knowledge graph, and providing the retrieval result to the user.
In one embodiment, the preset association algorithm formula is as follows:
Figure BDA0002646882650000041
wherein X is the degree of association between the visual data matrix of the first knowledge graph and the visual data matrix of the second knowledge graph; π is the ratio of a circle's circumference to its diameter; i and j denote the row and column indices of the visual data matrix of the first knowledge graph and the visual data matrix of the second knowledge graph; a_ij is the data information of the visual data in row i, column j of the visual data matrix of the first knowledge graph; b_ij is the data information of the visual data in row i, column j of the visual data matrix of the second knowledge graph; L is the association constraint coefficient of the two visual data matrices; e is the natural constant; α is the weak association weight of the two visual data matrices; β is the strong association weight of the two visual data matrices; and λ is a preset learning rate. The values of L, e, α, β and λ are all preset.
In one embodiment, L takes a value in the interval [0.1, 0.3], e takes the value 2.58, and λ takes a value in the interval [0, 1].
In one embodiment, the preset similarity is calculated by the following formula:
Figure BDA0002646882650000042
wherein simY is the similarity between the visual data matrix of the first knowledge graph and the visual data matrix of the third knowledge graph; θ is the preset probability that the visual data matrix of the first knowledge graph and the visual data matrix of the third knowledge graph contain data information of the same visual data; i and j denote the row and column indices of the two visual data matrices; S_m is the confidence of the mth data information in the visual data matrix of the first knowledge graph; Y_m is the confidence of the mth data information in the visual data matrix of the third knowledge graph; G_m is the evaluation score of the mth data information in the visual data matrix of the first knowledge graph; C_m is the evaluation score of the mth data information in the visual data matrix of the third knowledge graph; and ω is the useless-data influence factor of the two visual data matrices. The values of θ and ω are preset.
In one embodiment, θ is 15%, and ω takes a value in the interval [0.05, 0.1].
In one embodiment, before constructing the visual data matrix of each knowledge graph in the pre-acquired massive knowledge graph set, the method further comprises:
acquiring massive knowledge graphs;
preprocessing each acquired knowledge graph; the preprocessing comprises: deleting the duplicate data and garbled data of each knowledge graph;
combining the preprocessed knowledge graphs into a massive knowledge graph set;
wherein constructing the visual data matrix of each knowledge graph in the pre-acquired massive knowledge graph set comprises:
parsing each knowledge graph in the massive knowledge graph set to obtain the visual data of each knowledge graph;
taking the data information of the visual data of each knowledge graph as matrix elements, and constructing the visual data matrix of each knowledge graph, wherein the data information of the visual data comprises: the keywords in the visual data and the connection relationships among the keywords.
In the method for retrieving massive knowledge graphs provided by the invention, the degree of association between knowledge graphs is first calculated to determine how the knowledge graphs are related, so that when a user retrieves a knowledge graph, strongly associated knowledge graphs can also be recommended to the user. This expands the user's retrieval range, lets the user retrieve more useful information, and improves the user experience. Furthermore, after the degrees of association are calculated, the similarity between the strongly associated knowledge graphs is calculated, so that similarity is ensured in addition to association; the relationship between knowledge graphs is thus determined more accurately, which further improves the user experience.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a flowchart of a first embodiment of a method for massive knowledge graph retrieval according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method prior to step S101 in FIG. 1;
FIG. 3 is a flowchart of a second embodiment of a method for retrieving a mass knowledge graph according to an embodiment of the present invention;
FIG. 4 is a flowchart of a third embodiment of a method for retrieving a mass knowledge graph according to an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.
Fig. 1 is a flowchart of an embodiment of a method for retrieving a mass knowledge graph according to an embodiment of the present invention. As shown in fig. 1, the method comprises the following steps S101-S105:
S101: constructing a visual data matrix of each knowledge graph in a pre-acquired massive knowledge graph set;
In this embodiment, a visual data matrix is constructed for each of the massive knowledge graphs, so that the degree of association between visual data matrices can be calculated with a preset association degree algorithm; the degree of association between two matrices is taken as the degree of association between the corresponding knowledge graphs.
As an optional manner, as shown in fig. 2, before this step S101, steps S201 to S203 may be further included:
S201: acquiring massive knowledge graphs;
In this embodiment, the massive knowledge graphs may be obtained from target websites, such as Baidu Encyclopedia.
S202: preprocessing each acquired knowledge graph; the preprocessing comprises: deleting the duplicate data and garbled data of each knowledge graph;
In this embodiment, deleting the duplicate data and garbled data of each knowledge graph makes the calculated degree of association more accurate.
S203: combining the preprocessed knowledge graphs into a massive knowledge graph set.
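Steps S201 to S203 can be illustrated with a minimal Python sketch. The patent does not specify how a knowledge graph is stored or how garbled data is detected, so the `Triple` record type, the `is_garbled` heuristic and the sample data below are all assumptions made only for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical record type: (keyword, relation, keyword). Not from the patent.
Triple = Tuple[str, str, str]


def is_garbled(text: str) -> bool:
    """Assumed heuristic: blank text, a U+FFFD replacement character,
    or any non-printable character marks the text as garbled."""
    return (not text.strip()) or ("\ufffd" in text) or (not text.isprintable())


def preprocess(triples: Iterable[Triple]) -> List[Triple]:
    """Drop garbled triples and exact duplicates, keeping first-seen order (S202)."""
    seen = set()
    cleaned = []
    for triple in triples:
        if any(is_garbled(part) for part in triple):
            continue  # garbled data is deleted
        if triple in seen:
            continue  # duplicate data is deleted
        seen.add(triple)
        cleaned.append(triple)
    return cleaned


raw = [
    ("fraction", "prerequisite_of", "ratio"),
    ("fraction", "prerequisite_of", "ratio"),  # duplicate
    ("rat\ufffdo", "related_to", "percent"),   # garbled
    ("ratio", "related_to", "percent"),
]
cleaned = preprocess(raw)
```

Running `preprocess` on every acquired graph and collecting the results would give the massive knowledge graph set of S203.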
Step S101 includes: parsing each knowledge graph in the massive knowledge graph set to obtain the visual data of each knowledge graph; and taking the data information of the visual data of each knowledge graph as matrix elements to construct the visual data matrix of each knowledge graph, wherein the data information of the visual data comprises: the keywords in the visual data and the connection relationships among the keywords.
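One possible way to turn keywords and their connection relationships into matrix elements is an adjacency matrix over a keyword vocabulary. The sketch below assumes this encoding, which the patent does not state; it also assumes a vocabulary shared by all graphs so that every matrix has the same shape. The names `build_matrix`, `vocabulary` and the sample triples are hypothetical.

```python
from typing import List, Sequence, Tuple

Triple = Tuple[str, str, str]  # hypothetical (keyword, relation, keyword) record


def build_matrix(triples: Sequence[Triple], vocabulary: Sequence[str]) -> List[List[int]]:
    """Assumed encoding: cell (i, j) is 1 when keyword i is connected to keyword j."""
    index = {keyword: pos for pos, keyword in enumerate(vocabulary)}
    size = len(vocabulary)
    matrix = [[0] * size for _ in range(size)]
    for head, _relation, tail in triples:
        if head in index and tail in index:
            matrix[index[head]][index[tail]] = 1
    return matrix


vocabulary = ["fraction", "ratio", "percent"]
triples = [
    ("fraction", "prerequisite_of", "ratio"),
    ("ratio", "related_to", "percent"),
]
matrix = build_matrix(triples, vocabulary)
```

A shared vocabulary is chosen here because the formulas index both matrices with the same i and j, which suggests matrices of equal dimensions; other encodings would serve equally well.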
S102: calculating, according to a preset association degree algorithm, the degree of association between the visual data matrix of a first knowledge graph and the visual data matrix of each second knowledge graph in the massive knowledge graph set; the first knowledge graph is any knowledge graph in the massive knowledge graph set, and the second knowledge graphs are the knowledge graphs in the massive knowledge graph set other than the first knowledge graph;
In this embodiment, the preset association degree algorithm is given by the following formula:
Figure BDA0002646882650000071
wherein X is the degree of association between the visual data matrix of the first knowledge graph and the visual data matrix of the second knowledge graph; π is the ratio of a circle's circumference to its diameter; i and j denote the row and column indices of the visual data matrix of the first knowledge graph and the visual data matrix of the second knowledge graph; a_ij is the data information of the visual data in row i, column j of the visual data matrix of the first knowledge graph; b_ij is the data information of the visual data in row i, column j of the visual data matrix of the second knowledge graph; L is the association constraint coefficient of the two visual data matrices; e is the natural constant; α is the weak association weight of the two visual data matrices; β is the strong association weight of the two visual data matrices; and λ is a preset learning rate. The values of L, e, α, β and λ are all preset, wherein L takes a value in the range [0.1, 0.3], e is 2.58, and λ takes a value in the range [0, 1]; λ increases as the amount of data information in the visual data increases.
S103: storing each second knowledge graph whose visual data matrix has a degree of association with the visual data matrix of the first knowledge graph as a knowledge graph associated with the first knowledge graph;
In this embodiment, as an optional manner, step S103 includes: judging whether the degree of association between the visual data matrix of the first knowledge graph and the visual data matrix of each second knowledge graph is greater than or equal to a preset association degree threshold; and if the degree of association between the visual data matrix of the first knowledge graph and the visual data matrix of a second knowledge graph is greater than or equal to the preset association degree threshold, storing that second knowledge graph as a knowledge graph associated with the first knowledge graph.
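The threshold test of S102 and S103 can be sketched as follows. The association formula itself appears only as an image and is not reproduced in the text, so the scoring function is passed in as a parameter; `placeholder_association` is a stand-in (Jaccard overlap of nonzero cells) chosen purely for illustration and is not the patent's formula. The graph identifiers and the threshold value are likewise assumptions.

```python
from typing import Callable, Dict, List

Matrix = List[List[int]]


def placeholder_association(a: Matrix, b: Matrix) -> float:
    """Stand-in score: Jaccard overlap of nonzero cells. NOT the patent's formula."""
    both = either = 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            both += 1 if (x and y) else 0
            either += 1 if (x or y) else 0
    return both / either if either else 0.0


def associated_graphs(
    first_id: str,
    matrices: Dict[str, Matrix],
    score: Callable[[Matrix, Matrix], float],
    threshold: float,
) -> Dict[str, float]:
    """Score every other graph against the first one (S102) and keep those
    whose degree of association is at or above the threshold (S103)."""
    kept = {}
    for other_id, other in matrices.items():
        if other_id == first_id:
            continue
        degree = score(matrices[first_id], other)
        if degree >= threshold:
            kept[other_id] = degree
    return kept


matrices = {
    "g1": [[0, 1], [1, 0]],
    "g2": [[0, 1], [0, 0]],
    "g3": [[0, 0], [0, 0]],
}
assoc = associated_graphs("g1", matrices, placeholder_association, threshold=0.5)
```

Keeping the score alongside each stored graph identifier makes the later descending sort by degree of association straightforward.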
S104: receiving a retrieval request for a target knowledge graph;
S105: retrieving the target knowledge graph and the knowledge graphs associated with the target knowledge graph, and providing the retrieval result to the user.
In this method for retrieving massive knowledge graphs, the degree of association between knowledge graphs is calculated to determine how the knowledge graphs are related, so that a user who retrieves a knowledge graph can also browse the knowledge graphs strongly associated with it. This expands the user's retrieval range, lets the user retrieve more useful information, and improves the user experience.
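Because the associations are stored ahead of time, steps S104 and S105 reduce to a lookup. The following sketch assumes an in-memory dictionary store; `graphs`, `association_index` and the shape of the returned result are hypothetical and only illustrate the flow.

```python
from typing import Dict, List


def retrieve(
    target_id: str,
    graphs: Dict[str, dict],
    association_index: Dict[str, List[str]],
) -> dict:
    """Return the target graph together with its precomputed associated graphs."""
    if target_id not in graphs:
        return {"target": None, "related": []}
    related_ids = association_index.get(target_id, [])
    return {
        "target": graphs[target_id],
        "related": [graphs[g] for g in related_ids if g in graphs],
    }


graphs = {
    "g1": {"title": "fractions"},
    "g2": {"title": "ratios"},
    "g3": {"title": "geometry"},
}
association_index = {"g1": ["g2"]}  # output of the offline association step
result = retrieve("g1", graphs, association_index)
```

Doing the matrix comparisons offline keeps the work done per retrieval request small, which matters when the graph set is massive.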
Fig. 3 is a schematic flow chart of a second embodiment of the method for retrieving a massive knowledge map according to the present invention. Referring to fig. 3, the embodiment of the method for retrieving the massive knowledge map of the present invention includes the following steps:
S301: constructing a visual data matrix of each knowledge graph in a pre-acquired massive knowledge graph set;
S302: calculating, according to a preset association degree algorithm, the degree of association between the visual data matrix of a first knowledge graph and the visual data matrix of each second knowledge graph in the massive knowledge graph set; the first knowledge graph is any knowledge graph in the massive knowledge graph set, and the second knowledge graphs are the knowledge graphs in the massive knowledge graph set other than the first knowledge graph;
S303: storing each second knowledge graph whose visual data matrix has a degree of association with the visual data matrix of the first knowledge graph as a knowledge graph associated with the first knowledge graph;
S304: calculating, according to a preset similarity algorithm, the similarity between the visual data matrix of a third knowledge graph and the visual data matrix of the first knowledge graph; the third knowledge graph is a knowledge graph associated with the first knowledge graph;
In this embodiment, the preset similarity algorithm is given by the following formula:
Figure BDA0002646882650000091
wherein simY is the similarity between the visual data matrix of the first knowledge graph and the visual data matrix of the third knowledge graph; θ is the preset probability that the visual data matrix of the first knowledge graph and the visual data matrix of the third knowledge graph contain data information of the same visual data, and takes the value 15%; i and j denote the row and column indices of the visual data matrices of the first and third knowledge graphs; S_m is the confidence of the mth data information in the visual data matrix of the first knowledge graph, takes a value in [0, 1], and represents the probability that the mth data information appears in a given region of the visual data matrix, decreasing as the area of the region increases; Y_m is the confidence of the mth data information in the visual data matrix of the third knowledge graph, and takes its value in the same way as S_m; G_m is the evaluation score of the mth data information in the visual data matrix of the first knowledge graph, takes a value in [0, 1], and increases with the proportion of the mth data information among all the data information in the visual data matrix of the first knowledge graph; C_m is the evaluation score of the mth data information in the visual data matrix of the third knowledge graph, and takes its value in the same way as G_m; ω is the useless-data influence factor of the two visual data matrices and takes a value in [0.05, 0.1], closer to 0.1 the more useless data there is and closer to 0.05 the less useless data there is.
S305: screening out each third knowledge graph whose visual data matrix has a similarity to the visual data matrix of the first knowledge graph that reaches a preset similarity threshold, and obtaining and storing a recommended knowledge graph set corresponding to the first knowledge graph;
S306: receiving a retrieval request for a target knowledge graph;
S307: retrieving the target knowledge graph and the recommended knowledge graph set corresponding to the target knowledge graph, and providing the retrieval result to the user.
In this method for retrieving massive knowledge graphs, the degree of association between knowledge graphs is calculated to determine how the knowledge graphs are related, so that a user who retrieves a knowledge graph can also browse the knowledge graphs strongly associated with it. This expands the user's retrieval range, lets the user retrieve more useful information, and improves the user experience. Furthermore, after the degrees of association are calculated, the similarity between the associated knowledge graphs is calculated, so that similarity is ensured in addition to association; the relationship between knowledge graphs is thus determined more accurately, which further improves the user experience.
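The screening in S304 and S305 can be sketched as a filter over the associated graphs. The similarity formula is available only as an image, so the sketch takes the similarity as a caller-supplied function; `screen_by_similarity` and the example scores are hypothetical and do not come from the patent.

```python
from typing import Callable, Iterable, List


def screen_by_similarity(
    associated_ids: Iterable[str],
    similarity_to_first: Callable[[str], float],
    sim_threshold: float,
) -> List[str]:
    """Keep the associated graphs whose similarity to the first graph
    reaches the preset threshold (S305)."""
    return [g for g in associated_ids if similarity_to_first(g) >= sim_threshold]


# Made-up similarity values standing in for the output of the patent's formula.
scores = {"g2": 0.82, "g3": 0.41, "g4": 0.60}
recommended = screen_by_similarity(scores, scores.__getitem__, sim_threshold=0.6)
```

The resulting list plays the role of the recommended knowledge graph set stored for the first knowledge graph.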
Fig. 4 is a schematic flow chart of a third embodiment of the method for retrieving a massive knowledge map according to the present invention. Referring to fig. 4, the embodiment of the method for retrieving the massive knowledge map of the present invention includes the following steps:
S401: constructing a visual data matrix of each knowledge graph in a pre-acquired massive knowledge graph set;
S402: calculating, according to a preset association degree algorithm, the degree of association between the visual data matrix of a first knowledge graph and the visual data matrix of each second knowledge graph in the massive knowledge graph set; the first knowledge graph is any knowledge graph in the massive knowledge graph set, and the second knowledge graphs are the knowledge graphs in the massive knowledge graph set other than the first knowledge graph;
S403: storing each second knowledge graph whose visual data matrix has a degree of association with the visual data matrix of the first knowledge graph as a knowledge graph associated with the first knowledge graph;
S404: sorting the knowledge graphs associated with the first knowledge graph in descending order of the degree of association between their visual data matrices and the visual data matrix of the first knowledge graph, to obtain a sorting result;
S405: calculating, according to a preset similarity algorithm, the similarity between the visual data matrix of the Nth knowledge graph in the sorting result and the visual data matrix of the first knowledge graph, wherein N is a positive integer initialized to 1;
S406: judging whether the similarity between the visual data matrix of the Nth knowledge graph in the sorting result and the visual data matrix of the first knowledge graph reaches a preset similarity threshold; if so, executing step S407, otherwise executing step S409;
S407: putting the Nth knowledge graph in the sorting result into the recommended knowledge graph set corresponding to the first knowledge graph, and adding 1 to the count value of a preset counter, wherein the initial count value of the counter is 0;
S408: judging whether the count value of the counter is equal to a preset graph extension number; if so, executing step S411, otherwise executing step S409;
S409: judging whether N is equal to M, wherein M is the number of knowledge graphs in the sorting result; if so, executing step S411, otherwise executing step S410;
S410: setting N to N + 1, and executing step S405;
S411: storing the current recommended knowledge graph set corresponding to the first knowledge graph;
S412: receiving a retrieval request for a target knowledge graph;
S413: retrieving the target knowledge graph and the recommended knowledge graph set corresponding to the target knowledge graph, and providing the retrieval result to the user.
In this method for retrieving massive knowledge graphs, the degree of association between knowledge graphs is calculated to determine how the knowledge graphs are related, so that a user who retrieves a knowledge graph can also browse the knowledge graphs strongly associated with it. This expands the user's retrieval range, lets the user retrieve more useful information, and improves the user experience. Furthermore, the similarity of the associated knowledge graphs is calculated one by one in descending order of the degree of association, and the knowledge graphs whose similarity reaches the threshold are added to the recommended knowledge graph set until the size of the set reaches the preset graph extension number or all the associated knowledge graphs have been traversed. Similarity is thus ensured in addition to association, and the relationship between knowledge graphs is determined more accurately. Meanwhile, because a user's attention is limited, the extended knowledge graphs should not be too numerous; selecting only the second knowledge graphs most similar to the first knowledge graph therefore better meets the user's needs.
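The loop of steps S404 to S411 can be sketched in Python as below. As before, the similarity is supplied by the caller because the patent's formula is not reproduced in the text; the function name, parameter names and sample values are hypothetical, and the sketch assumes a positive graph extension number.

```python
from typing import Callable, Dict, List


def build_recommendations(
    association: Dict[str, float],
    similarity_to_first: Callable[[str], float],
    sim_threshold: float,
    extension_number: int,
) -> List[str]:
    """Walk the associated graphs in descending order of association and
    collect those similar enough, stopping at the extension number."""
    ranked = sorted(association, key=association.get, reverse=True)  # S404
    recommended: List[str] = []
    counter = 0                                            # counter starts at 0
    for graph_id in ranked:                                # N = 1..M (S405, S409, S410)
        if similarity_to_first(graph_id) >= sim_threshold:  # S406
            recommended.append(graph_id)                   # S407
            counter += 1
            if counter == extension_number:                # S408
                break
    return recommended                                     # S411


association = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6}
similarity = {"a": 0.2, "b": 0.9, "c": 0.95, "d": 0.99}
picked = build_recommendations(
    association, similarity.__getitem__, sim_threshold=0.5, extension_number=2
)
```

Here graph "d" is the most similar but is never examined, since the counter reaches the extension number after "b" and "c"; the order of the degree of association decides which candidates are considered first.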
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (8)

1. A method for searching massive knowledge maps is characterized by comprising the following steps:
constructing a visual data matrix of each knowledge map in a mass knowledge map set acquired in advance;
calculating the association degree of the visual data matrix of the first knowledge graph and the visual data matrix of each second knowledge graph in the massive knowledge graph set according to a preset association degree algorithm; the first knowledge graph is any knowledge graph in the massive knowledge graph set, and the second knowledge graph is other knowledge graphs in the massive knowledge graph set except the first knowledge graph;
storing a second knowledge graph corresponding to the visual data matrix with the relevance degree to the visual data matrix of the first knowledge graph as the knowledge graph with the relevance degree to the first knowledge graph;
receiving a retrieval request about a target knowledge-graph;
and retrieving the target knowledge graph and the knowledge graph with the association degree with the target knowledge graph, and providing a retrieval result for a user.
2. The method for retrieving the massive knowledge graph according to claim 1, wherein after storing a second knowledge graph corresponding to a visualized data matrix with a degree of association with a visualized data matrix of the first knowledge graph as a knowledge graph with a degree of association with the first knowledge graph, and before receiving a retrieval request regarding a target knowledge graph, the method further comprises:
calculating the similarity between the visual data matrix of the third knowledge graph and the visual data matrix of the first knowledge graph according to a preset similarity calculation method; the third knowledge graph is a knowledge graph with a degree of association with the first knowledge graph;
screening out a third knowledge graph corresponding to the visual data matrix with the similarity reaching a preset similarity threshold value with the visual data matrix of the first knowledge graph, and obtaining and storing a recommended knowledge graph set corresponding to the first knowledge graph;
the retrieving the target knowledge graph and the knowledge graph with the association degree with the target knowledge graph and providing the retrieval result to the user comprises the following steps:
and retrieving the target knowledge graph and the recommendation knowledge graph set corresponding to the target knowledge graph, and providing a retrieval result for a user.
3. The method for retrieving the massive knowledge graph according to claim 1, wherein after storing a second knowledge graph corresponding to a visualized data matrix with a degree of association with a visualized data matrix of the first knowledge graph as a knowledge graph with a degree of association with the first knowledge graph, and before receiving a retrieval request regarding a target knowledge graph, the method further comprises:
sorting the knowledge graphs with the relevance degrees with the first knowledge graph from high to low according to the relevance degrees of the knowledge graphs and the visual data matrix of the first knowledge graph to obtain a sorting result;
calculating the similarity between the visual data matrix of the Nth knowledge graph and the visual data matrix of the first knowledge graph in the sequencing result according to a preset similarity algorithm, wherein N is a positive integer, and the initial value of N is 1;
judging whether the similarity between the visual data matrix of the Nth knowledge graph and the visual data matrix of the first knowledge graph in the sequencing result reaches a preset similarity threshold value or not;
if the similarity between the visual data matrix of the Nth knowledge graph in the sequencing result and the visual data matrix of the first knowledge graph reaches a preset similarity threshold value, putting the knowledge graph corresponding to the visual data matrix of the Nth knowledge graph in the sequencing result into the recommended knowledge graph set corresponding to the first knowledge graph, and adding 1 to the count value of a preset counter; wherein the initial count value of the counter is 0;
judging whether the count value of the counter is equal to the preset map extension number or not;
if the count value of the counter is not equal to the preset map extension number, judging whether N is equal to M or not; the M is the number of the knowledge graphs in the sequencing result;
if N is not equal to M, setting N to N + 1 and returning to the step of calculating the similarity between the visual data matrix of the Nth knowledge graph in the sequencing result and the visual data matrix of the first knowledge graph according to the preset similarity algorithm;
if the counting value of the counter is equal to the preset map extension number or N is equal to M, storing a recommended knowledge map set corresponding to the current first knowledge map;
if the similarity between the visual data matrix of the Nth knowledge graph and the visual data matrix of the first knowledge graph in the sequencing result does not reach a preset similarity threshold value, executing the step of judging whether N is equal to M or not;
the retrieving the target knowledge graph and the knowledge graph with the association degree with the target knowledge graph and providing the retrieval result to the user comprises the following steps:
and retrieving the target knowledge graph and the recommendation knowledge graph set corresponding to the target knowledge graph, and providing a retrieval result for a user.
4. The method for retrieving the massive knowledge map according to any one of claims 1 to 3, wherein the preset association algorithm formula is as follows:
Figure FDA0002646882640000031
wherein X is the degree of association between the visual data matrix of the first knowledge graph and the visual data matrix of the second knowledge graph; π is the circle constant; i and j respectively denote the row number and the column number of the visual data matrix of the first knowledge graph and the visual data matrix of the second knowledge graph; a_ij is the data information of the visual data in the i-th row and j-th column of the visual data matrix of the first knowledge graph; b_ij is the data information of the visual data in the i-th row and j-th column of the visual data matrix of the second knowledge graph; L is the association constraint coefficient of the visual data matrix of the first knowledge graph and the visual data matrix of the second knowledge graph; e is the natural constant; α is the weak association weight of the visual data matrix of the first knowledge graph and the visual data matrix of the second knowledge graph; β is the strong association weight of the visual data matrix of the first knowledge graph and the visual data matrix of the second knowledge graph; and λ is the preset learning rate; the values of L, e, α, β and λ are all preset values.
5. The method for retrieving the massive knowledge map as claimed in claim 4, wherein the value interval of L is [0.1, 0.3], the value of e is 2.58, and the value interval of λ is [0, 1].
6. The method for massive knowledge graph retrieval according to claim 2, wherein the algorithm formula of the preset similarity is as follows:
Figure FDA0002646882640000032
wherein simY is the similarity between the visual data matrix of the first knowledge graph and the visual data matrix of the third knowledge graph; θ is the preset probability that the visual data matrix of the first knowledge graph and the visual data matrix of the third knowledge graph contain data information of the same visual data; i and j respectively denote the row number and the column number of the visual data matrix of the first knowledge graph and the visual data matrix of the third knowledge graph; S_m is the confidence of the m-th data information in the visual data matrix of the first knowledge graph; Y_m is the confidence of the m-th data information in the visual data matrix of the third knowledge graph; G_m is the evaluation score of the m-th data information in the visual data matrix of the first knowledge graph; C_m is the evaluation score of the m-th data information in the visual data matrix of the third knowledge graph; and ω is the influence factor of useless data in the visual data matrix of the first knowledge graph and the visual data matrix of the third knowledge graph; the values of θ and ω are preset values.
7. The method for retrieving the massive knowledge map according to claim 6, wherein the value of θ is 15%, and the value interval of ω is [0.05, 0.1].
8. The method for retrieving the massive knowledge graph according to claim 5, wherein before the constructing the visual data matrix of each knowledge graph in the pre-acquired massive knowledge graph set, the method further comprises:
acquiring a massive knowledge graph;
preprocessing each acquired knowledge graph; the preprocessing comprises: deleting duplicate data and garbled data from each knowledge graph;
combining the preprocessed knowledge maps into a mass knowledge map set;
the constructing of the visual data matrix of each knowledge graph in the pre-acquired massive knowledge graph set comprises:
analyzing each knowledge graph in the massive knowledge graph set to obtain the visual data of each knowledge graph;
taking the data information of the visual data of each knowledge graph as matrix elements, and constructing the visual data matrix of each knowledge graph, wherein the data information of the visual data comprises: the keywords in the visual data and the connection relations between the keywords.
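The preprocessing and matrix construction of claim 8 might look like the following sketch. It is an assumption-laden illustration, not the claimed implementation: a knowledge graph is modelled as a list of (keyword, keyword) connections, "garbled data" is detected by empty strings, the Unicode replacement character, or non-printable characters, and the matrix layout (a keyword-by-keyword adjacency matrix) is chosen for the example because the claims do not fix one. The names `is_garbled`, `preprocess`, and `build_matrix` are hypothetical.

```python
from typing import List, Tuple

Edge = Tuple[str, str]


def is_garbled(text: str) -> bool:
    """Heuristic for garbled data: empty, undecodable, or non-printable text."""
    return (not text) or "\ufffd" in text or not text.isprintable()


def preprocess(edges: List[Edge]) -> List[Edge]:
    """Delete duplicate connections and connections with garbled keywords."""
    seen = set()
    clean: List[Edge] = []
    for head, tail in edges:
        if is_garbled(head) or is_garbled(tail):
            continue
        if (head, tail) in seen:
            continue
        seen.add((head, tail))
        clean.append((head, tail))
    return clean


def build_matrix(edges: List[Edge]) -> Tuple[List[str], List[List[int]]]:
    """Build a keyword-by-keyword matrix; cell [i][j] is 1 if i connects to j."""
    keywords = sorted({keyword for edge in edges for keyword in edge})
    position = {keyword: i for i, keyword in enumerate(keywords)}
    matrix = [[0] * len(keywords) for _ in keywords]
    for head, tail in edges:
        matrix[position[head]][position[tail]] = 1
    return keywords, matrix


raw = [
    ("fraction", "decimal"),
    ("fraction", "decimal"),      # duplicate, dropped
    ("frac\ufffdion", "ratio"),   # garbled keyword, dropped
    ("decimal", "percent"),
]
keywords, matrix = build_matrix(preprocess(raw))
```

Comparing two graphs cell by cell additionally requires that their matrices share the same keyword ordering and size; how the patent aligns them is not stated in the claims.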
CN202010857339.9A 2020-08-24 2020-08-24 Method for searching massive knowledge maps Active CN112015911B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010857339.9A CN112015911B (en) 2020-08-24 2020-08-24 Method for searching massive knowledge maps

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010857339.9A CN112015911B (en) 2020-08-24 2020-08-24 Method for searching massive knowledge maps

Publications (2)

Publication Number Publication Date
CN112015911A true CN112015911A (en) 2020-12-01
CN112015911B CN112015911B (en) 2021-07-20

Family

ID=73505725

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010857339.9A Active CN112015911B (en) 2020-08-24 2020-08-24 Method for searching massive knowledge maps

Country Status (1)

Country Link
CN (1) CN112015911B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11829889B2 (en) 2021-06-28 2023-11-28 Institute Of Geology And Geophysics, Chinese Academy Of Sciences Processing method and device for data of well site test based on knowledge graph

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109857872A (en) * 2019-02-18 2019-06-07 浪潮软件集团有限公司 The information recommendation method and device of knowledge based map
US20200050604A1 (en) * 2018-08-07 2020-02-13 Accenture Global Solutions Limited Approaches for knowledge graph pruning based on sampling and information gain theory
CN111241241A (en) * 2020-01-08 2020-06-05 平安科技(深圳)有限公司 Case retrieval method, device and equipment based on knowledge graph and storage medium
CN111369318A (en) * 2020-02-28 2020-07-03 安徽农业大学 Commodity knowledge graph feature learning-based recommendation method and system
US20200226133A1 (en) * 2016-10-18 2020-07-16 Hithink Financial Services Inc. Knowledge map building system and method

Also Published As

Publication number Publication date
CN112015911B (en) 2021-07-20

Similar Documents

Publication Publication Date Title
CN108804641B (en) Text similarity calculation method, device, equipment and storage medium
KR102564144B1 (en) Method, apparatus, device and medium for determining text relevance
US8150859B2 (en) Semantic table of contents for search results
US7853599B2 (en) Feature selection for ranking
JP6299596B2 (en) Query similarity evaluation system, evaluation method, and program
US8527564B2 (en) Image object retrieval based on aggregation of visual annotations
CN112328891B (en) Method for training search model, method for searching target object and device thereof
US20180150466A1 (en) System and method for ranking search results
CN111797214A (en) FAQ database-based problem screening method and device, computer equipment and medium
CN112463976B (en) Knowledge graph construction method taking crowd sensing task as center
KR20180053731A (en) How to find K extreme values within a certain processing time
CN110765368A (en) Artificial intelligence system and method for semantic retrieval
CN110737756B (en) Method, apparatus, device and medium for determining answer to user input data
KR20150137006A (en) Annotation display assistance device and method of assisting annotation display
Wang et al. A distance matrix based algorithm for solving the traveling salesman problem
CN109635004B (en) Object description providing method, device and equipment of database
CN112015911B (en) Method for searching massive knowledge maps
US20140059062A1 (en) Incremental updating of query-to-resource mapping
CN111079035B (en) Domain searching and sorting method based on dynamic map link analysis
CN112015914B (en) Knowledge graph path searching method based on deep learning
CN108170665B (en) Keyword expansion method and device based on comprehensive similarity
CN112199461B (en) Document retrieval method, device, medium and equipment based on block index structure
CN114139530A (en) Synonym extraction method and device, electronic equipment and storage medium
JPH09259141A (en) Map data linkage system
CN111723286A (en) Data processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PP01 Preservation of patent right (effective date of registration: 20221020; granted publication date: 20210720)