CN114461813A - Data pushing method, system and storage medium based on knowledge graph - Google Patents
Data pushing method, system and storage medium based on knowledge graph
- Publication number
- CN114461813A
- Authority
- CN
- China
- Prior art keywords
- knowledge
- node
- candidate
- subtrees
- central
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/316—Indexing structures
- G06F16/322—Trees
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/335—Filtering based on additional data, e.g. user or group profiles
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Animal Behavior & Ethology (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a data pushing method, system and storage medium based on a knowledge graph. The method comprises the following steps: obtaining a question searched by a user, the question containing at least two knowledge keywords; extracting the knowledge keywords from the question and obtaining the corresponding minimum knowledge subtree in the knowledge graph according to the knowledge keywords; establishing candidate knowledge subtrees according to all edge information related to the knowledge keywords in the minimum knowledge subtree, scoring and ranking all candidate knowledge subtrees according to a scoring rule, and screening the candidate knowledge subtrees; and establishing a personalized model according to the candidate knowledge subtrees obtained in each of the user's historical retrievals, and pushing the knowledge node with the highest personalized model function value. Compared with the prior art, the invention offers advantages such as high pushing efficiency.
Description
Technical Field
The invention relates to the field of knowledge graphs, and in particular to a data pushing method, system and storage medium based on a knowledge graph.
Background
In the era of knowledge explosion, higher education has entered a stage characterized by a large volume of knowledge, strong interdisciplinarity and rapid knowledge renewal, which places higher demands on the cross-integration of professional knowledge. To grow into versatile, interdisciplinary talents, college students need the ability to build a comprehensive knowledge system. Faced with the explosive growth of educational knowledge resources in the big data era, college students find it difficult to accurately acquire cross-disciplinary knowledge, and this difficulty has become a challenge for interdisciplinary talent cultivation in colleges and universities. However, the barriers between traditional disciplines and majors still exist, and teaching practice remains fragmented, with each discipline developing independently and few cross-disciplinary comprehensive courses. Students lack a systematic understanding of the knowledge points of courses within or across majors, and the knowledge points lack systematic connections; the teaching contents of different courses differ greatly and lack continuity of knowledge; and theoretical courses and practical skills teaching cannot be linked quickly. These factors make teaching incoherent, so that, from a macroscopic perspective, the knowledge points learned by students remain scattered and isolated.
Because students' accumulated knowledge is limited, their knowledge acquisition lacks accuracy and professionalism, and implicit associations and long-path associations between knowledge points are difficult to discover. Existing knowledge pushing methods mainly extract features of the knowledge data, fuse the knowledge data based on those features, and push the fused result to the user. However, because all knowledge data must be updated and processed in real time, such methods cannot adjust dynamically in time; when a student searches, the amount of data to process is large, acquisition and pushing take a long time, and efficiency is low, so these methods cannot meet the need of a very large population of college students to acquire knowledge instantly.
Disclosure of Invention
The purpose of the present invention is to provide a data pushing method, system and storage medium based on a knowledge graph that overcome the above shortcomings of the prior art.
The purpose of the invention can be realized by the following technical scheme:
A data pushing method based on a knowledge graph includes the following steps:
S1, obtaining a question searched by a user, wherein the question contains at least two knowledge keywords, extracting the knowledge keywords from the question, and obtaining the corresponding minimum knowledge subtree in the knowledge graph according to the knowledge keywords;
S2, establishing candidate knowledge subtrees according to all node information and edge information related to the knowledge keywords in the minimum knowledge subtree, scoring and ranking all candidate knowledge subtrees according to a scoring rule, and screening the candidate knowledge subtrees;
S3, establishing a personalized model according to the candidate knowledge subtree obtained in each of the user's historical retrievals, and pushing the knowledge node with the highest personalized model function value.
Further, the scoring rule specifically includes:
scoring the candidate knowledge subtrees according to the following expression:
wherein score(target) represents the score, $t$ represents the root node of the candidate knowledge subtree, $\alpha$ represents compactness, $\beta$ represents precision, $\mathrm{num}(e_k)$ and $n$ each represent the number of nodes in the candidate knowledge subtree, $\mathrm{num}(r_k)$ represents the number of edges in the candidate knowledge subtree, and the remaining term represents the shortest distance from each node to the root node.
Further, the method for establishing the personalized model comprises the following steps:
A1, obtaining the distance from each node to the other nodes in the candidate knowledge subtree, summing the distances, and determining a central knowledge node according to the centrality;
A2, obtaining the central knowledge node of the candidate knowledge subtree of each retrieval according to the user's historical retrieval records, and establishing a central knowledge point set;
A3, calculating the correlation between the central knowledge point of the central knowledge point set and the central knowledge point of the next retrieval;
A4, establishing a personalized model $T(o_{i+1})$ according to the correlation and the centrality of the central knowledge node, with the following expression:
$$T(o_{i+1}) = \alpha\,\mathrm{sim}(o_i, o_{i+1}) + \beta\,\mathrm{core}(o_{i+1})$$
wherein $\alpha$ represents compactness, $\beta$ represents precision, $\mathrm{sim}(o_i, o_{i+1})$ represents the correlation, and $\mathrm{core}(o_{i+1})$ represents the centrality.
Further, the calculation expression of the centrality is as follows:
wherein $e_x$ represents the current node, $N(e_x)$ represents the adjacent nodes of $e_x$, $l(e_j, e_k) = 0$ denotes that there is no connecting edge between node $e_j$ and node $e_k$, and $l(e_j, e_k) = 1$ denotes that there is a connecting edge between node $e_j$ and node $e_k$.
Further, the correlation $\mathrm{sim}(o_k, o_{k+1})$ is calculated by the following expression:
where dist denotes the average path length, $o_k$ represents the central knowledge point of the $k$-th retrieval in the central knowledge point set, $p(e_i, e_{i+1})$ represents the relationship weight, $\mathrm{sp}(o_i, o_{i+1})$ represents the shortest path length from $o_i$ to $o_{i+1}$, $\mathrm{num}(r)$ represents the number of times the relation $r$ appears in the path, $e$ represents a knowledge point on the shortest path between the two central knowledge points, and $w$ represents the support.
Further, after the central knowledge node is obtained, the central knowledge node is fed back to the user.
Further, when the user has no history retrieval record, randomly pushing knowledge nodes from the minimum knowledge subtree.
Further, after step S2 is executed, the knowledge information corresponding to the top-k scored candidate knowledge subtrees that were screened is transmitted to the user side.
A data push system based on a knowledge graph, comprising a memory and a processor; the memory for storing a computer program; the processor is used for realizing the data pushing method based on the knowledge graph when the computer program is executed.
A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of a method of knowledge-graph based data push as described above.
Compared with the prior art, the invention has the following advantages:
1. The invention establishes the minimum knowledge subtree based on the knowledge graph, establishes the candidate knowledge subtrees according to the edge information of the subtree, namely the relations between knowledge points, and establishes the personalized model in combination with the user's historical retrieval records to push content to the user. Because the knowledge graph can be updated in real time, the user is ensured of obtaining up-to-date knowledge information.
2. In the invention, the personalized model combines the correlation and the centrality of the central knowledge nodes of the historical retrievals, so that the model better reflects the retrieval needs of real users and the pushed content is reasonable.
Drawings
FIG. 1 is a schematic flow chart of the present invention.
FIG. 2 is a diagram illustrating a complete process of retrieving and recommending in accordance with the present invention.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. The present embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation manner and a specific operation process are given, but the scope of the present invention is not limited to the following embodiments.
The embodiment provides a data pushing method based on a knowledge graph which, as shown in FIG. 1, specifically includes the following steps:
Step S1, obtaining the question searched by the user, wherein the question contains at least two knowledge keywords, extracting the knowledge keywords from the question, and obtaining the corresponding minimum knowledge subtree in the knowledge graph according to the knowledge keywords.
The knowledge graph is derived from teaching-related sources such as expert systems, domain databases, learning networks, course selection systems and documents, and consists of nodes and edges. Each node stores information about one knowledge point, for example "matching analysis method" or "conditional probability". Each edge stores the association between different knowledge points; for example, in "mixed factor is an evaluation of the matching analysis method", "evaluation" is the edge information. According to the knowledge keywords in the retrieval question, the part of the knowledge graph that contains the knowledge keywords is extracted as the minimum knowledge subtree.
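As an illustration of this step, the following minimal Python sketch stores a toy knowledge graph as (head, relation, tail) triples and keeps only the part that connects the query keywords. The description does not specify how the minimum knowledge subtree is extracted, so the shortest-connecting-path reading used here, the function names, and the toy triples (including the "contains" relation) are assumptions made for illustration only.

```python
from collections import deque

# Hypothetical toy graph: (head, relation, tail) triples, for illustration only.
TRIPLES = [
    ("conditional probability", "predecessor", "matching analysis method"),
    ("mixed factor", "evaluation", "matching analysis method"),
    ("conditional probability", "predecessor", "naive Bayes classifier"),
    ("naive Bayes classifier", "contains", "Bernoulli model"),
]


def build_adjacency(triples):
    """Undirected adjacency map: node -> list of (neighbor, relation)."""
    adj = {}
    for head, rel, tail in triples:
        adj.setdefault(head, []).append((tail, rel))
        adj.setdefault(tail, []).append((head, rel))
    return adj


def connecting_path(adj, source, target):
    """Breadth-first search; returns the nodes of one shortest path, or None."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt, _ in adj.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


def minimum_subtree(triples, keywords):
    """Assumed reading: keep the triples lying on shortest paths that join
    the first keyword to each of the other keywords."""
    adj = build_adjacency(triples)
    keep = set()
    for other in keywords[1:]:
        path = connecting_path(adj, keywords[0], other)
        if path:
            keep.update(path)
    return [t for t in triples if t[0] in keep and t[2] in keep]
```

Calling `minimum_subtree(TRIPLES, ["matching analysis method", "naive Bayes classifier"])` keeps the two triples that pass through "conditional probability" and drops the rest of the toy graph.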
Step S2, establishing candidate knowledge subtrees according to all the node information and edge information related to the knowledge keywords in the minimum knowledge subtree, scoring and ranking all candidate knowledge subtrees according to a scoring rule, and screening the candidate knowledge subtrees.
Establishing a candidate knowledge subtree is the process of associating knowledge nodes with edges. For example, when the knowledge point "matching analysis method" is retrieved, each candidate knowledge subtree that may be obtained is converted into complete knowledge point information that includes both node information and edge information, for example "conditional probability is a predecessor knowledge point of the matching analysis method".
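The association of nodes and edges described above can be sketched as a small data structure. The class name `CandidateSubtree`, its fields, and the sentence template are hypothetical choices for illustration and are not taken from the description.

```python
from dataclasses import dataclass, field


@dataclass
class CandidateSubtree:
    """Hypothetical container pairing node information with edge information."""
    root: str
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)  # (head, relation, tail) triples

    def statements(self):
        """Render each edge as a complete knowledge-point statement."""
        return [f"{h} is a {r} knowledge point of {t}" for h, r, t in self.edges]


def build_candidate(root, triples):
    """Collect the nodes and edges of one candidate subtree rooted at `root`."""
    cand = CandidateSubtree(root=root)
    cand.nodes.add(root)
    for head, rel, tail in triples:
        cand.nodes.update((head, tail))
        cand.edges.append((head, rel, tail))
    return cand
```

With the single triple `("conditional probability", "predecessor", "matching analysis method")`, `statements()` yields a sentence of the same form as the example in the text.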
After all candidate knowledge subtrees are obtained, the candidate knowledge subtrees are scored according to the following expression:
wherein score(target) represents the score, $t$ represents the root node of the candidate knowledge subtree, $\alpha$ represents compactness, $\beta$ represents precision, $\mathrm{num}(e_k)$ and $n$ each represent the number of nodes in the candidate knowledge subtree, $\mathrm{num}(r_k)$ represents the number of edges in the candidate knowledge subtree, and the remaining term represents the shortest distance from each node to the root node.
Finally, the candidate knowledge subtree with the highest score is retained.
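The quantities named in the scoring rule (node count, edge count, and the shortest distance from each node to the root) can be computed as in the sketch below. The scoring expression itself is not reproduced in this text, so the sketch deliberately does not combine these quantities into a score; `score_fn` is a caller-supplied placeholder, and all function names are assumptions.

```python
from collections import deque


def subtree_features(root, edges):
    """edges: list of (head, relation, tail) triples of one candidate subtree.
    Returns the node count, the edge count, and the hop distance of each
    reachable node from the root (breadth-first search, undirected)."""
    adj = {root: set()}
    for head, _, tail in edges:
        adj.setdefault(head, set()).add(tail)
        adj.setdefault(tail, set()).add(head)
    dist = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return {"num_nodes": len(adj), "num_edges": len(edges), "dist_to_root": dist}


def keep_best(candidates, score_fn, k=1):
    """Rank candidates by a caller-supplied score (highest first), keep the top k."""
    return sorted(candidates, key=score_fn, reverse=True)[:k]
```

Passing `k=1` corresponds to retaining only the highest-scoring candidate knowledge subtree; a larger `k` corresponds to keeping several top-ranked candidates.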
Step S3, establishing a personalized model according to the candidate knowledge subtrees obtained in each of the user's historical retrievals, and pushing the knowledge node with the highest personalized model function value.
The pushing method relies on the user's historical retrieval records and, following the idea of big data, pushes the content that best matches the user's interests. If the user has never performed a retrieval, that is, there is no historical retrieval content, a knowledge node is randomly selected from the minimum knowledge subtree obtained in step S1 and pushed as the content.
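The fallback for users without history can be sketched as follows. The function names and the `personalized_pick` callback are hypothetical; the callback stands in for whatever the personalized model selects when history exists.

```python
import random


def push_without_history(min_subtree_nodes, rng=None):
    """Cold start: pick one knowledge node at random from the minimum subtree."""
    if rng is None:
        rng = random.Random()
    # Sort first so that a seeded generator gives a reproducible choice.
    return rng.choice(sorted(min_subtree_nodes))


def choose_push(history, min_subtree_nodes, personalized_pick, rng=None):
    """Use the personalized model when history exists, else fall back to random."""
    if not history:
        return push_without_history(min_subtree_nodes, rng)
    return personalized_pick(history)
```

Accepting an optional `random.Random` instance is a design choice that keeps the random branch testable with a fixed seed.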
Each historical retrieval record corresponds to one screened candidate knowledge subtree, and the personalized model is established from these candidate knowledge subtrees as follows:
Step A1, obtaining the distance from each node to the other nodes in the candidate knowledge subtree, summing the distances, determining a central knowledge node according to the centrality, and feeding it back to the user.
The centrality is calculated by the following expression:
wherein $e_x$ represents the current node, $N(e_x)$ represents the adjacent nodes of $e_x$, $l(e_j, e_k) = 0$ denotes that there is no connecting edge between node $e_j$ and node $e_k$, and $l(e_j, e_k) = 1$ denotes that there is a connecting edge between node $e_j$ and node $e_k$.
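The centrality expression is not reproduced in this text. The sketch below therefore swaps in a standard measure, the local clustering coefficient, purely because it is built from the same ingredients the definitions name: the neighbors $N(e_x)$ of a node and an indicator $l(e_j, e_k)$ of whether two neighbors are connected. It is an assumption, not the patented formula, and it does not cover the summed distances mentioned in step A1.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of neighbor pairs of `node` that are themselves connected.
    adj: dict mapping each node to the set of its adjacent nodes (undirected)."""
    neighbors = adj.get(node, set())
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to check
    # l(e_j, e_k) = 1 when the two neighbors share an edge, else 0.
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj.get(a, set()))
    return links / (k * (k - 1) / 2)
```

A node whose neighbors are all mutually connected gets the value 1.0, while a node with a single neighbor gets 0.0.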
Step A2, according to the user's historical retrieval records, obtaining the central knowledge node of the candidate knowledge subtree of each retrieval, and establishing a central knowledge point set $O = \{o_1, o_2, \ldots, o_k\}$.
Step A3, calculating the correlation between the central knowledge point of the central knowledge point set and the central knowledge point of the next retrieval.
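The correlation is defined over the shortest path between two central knowledge points and the relations that appear on it. The sketch below only gathers those raw ingredients: the relations along one shortest path, how often each relation occurs, and the sum of manually assigned relation scores (0.5 for "predecessor" and "evaluation", 0 otherwise, following the example values given in this description). It does not reproduce the correlation expression, and the function names are assumptions.

```python
from collections import Counter, deque

# Manually defined relation scores; relations not listed score 0.
RELATION_SCORE = {"predecessor": 0.5, "evaluation": 0.5}


def shortest_relation_path(triples, source, target):
    """Return the relations along one shortest path (undirected), or None."""
    adj = {}
    for head, rel, tail in triples:
        adj.setdefault(head, []).append((tail, rel))
        adj.setdefault(tail, []).append((head, rel))
    parent = {source: None}  # node -> (previous node, relation used), or None
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            relations = []
            while parent[node] is not None:
                node, rel = parent[node]
                relations.append(rel)
            return relations[::-1]
        for nxt, rel in adj.get(node, []):
            if nxt not in parent:
                parent[nxt] = (node, rel)
                queue.append(nxt)
    return None


def path_summary(relations):
    """Path length, per-relation counts, and the summed relation scores."""
    return {
        "length": len(relations),
        "num_r": Counter(relations),
        "score_sum": sum(RELATION_SCORE.get(r, 0.0) for r in relations),
    }
```

How these quantities are combined with the support and the average path length is left to the correlation expression.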
The correlation $\mathrm{sim}(o_k, o_{k+1})$ is calculated by the following expression:
wherein $o_k$ represents the central knowledge point of the $k$-th retrieval in the central knowledge point set, $p(e_i, e_{i+1})$ represents the relationship weight, $\mathrm{sp}(o_i, o_{i+1})$ represents the shortest path length from $o_i$ to $o_{i+1}$, $\mathrm{num}(r)$ represents the number of times the relation $r$ appears in the path, $e$ represents a knowledge point on the shortest path between the two central knowledge points, $w$ represents the support, and dist represents the average path length, for which the length $\mathrm{sp.len}(o_i, o_{i+1})$ of the shortest path from $o_i$ to $o_{i+1}$ is used. The edge information scores used to determine the shortest path are defined manually; for example, the scores of "predecessor" and "evaluation" are both 0.5 and the scores of all other relations are 0. The average path length is specifically expressed as:
Step A4, establishing a personalized model $T(o_{i+1})$ according to the correlation and the centrality of the central knowledge node, with the following expression:
$$T(o_{i+1}) = \alpha\,\mathrm{sim}(o_i, o_{i+1}) + \beta\,\mathrm{core}(o_{i+1})$$
The expression shows that the personalized model reflects the correlation and the centrality of the central knowledge points of every historical retrieval, including the current one. The central knowledge point with the highest personalized model function value is selected and pushed to the user.
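The weighted sum in the personalized model and the final selection can be sketched as below. The weights `alpha` and `beta` default to 0.5 only for illustration, and the candidate names and their (sim, core) values in the usage note are invented numbers, not results from the description.

```python
def personalized_value(sim, core, alpha=0.5, beta=0.5):
    """T(o_{i+1}) = alpha * sim(o_i, o_{i+1}) + beta * core(o_{i+1})."""
    return alpha * sim + beta * core


def pick_push(candidates, alpha=0.5, beta=0.5):
    """candidates: dict mapping a knowledge point name to its (sim, core) pair.
    Returns the name with the highest personalized model function value."""
    return max(
        candidates,
        key=lambda name: personalized_value(*candidates[name], alpha, beta),
    )
```

For example, with the hypothetical input `{"Bernoulli model": (0.8, 0.6), "conditional probability": (0.4, 0.9)}` and equal weights, the first entry has the larger value and is the one selected for pushing.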
The usage flow of the user end is shown in fig. 2, which specifically includes the following steps:
First, the user asks a question, for example, what correlation exists between the "matching analysis method" and the "naive Bayes classifier".
Then the knowledge keywords "matching analysis method" and "naive Bayes classifier" are extracted and located in the knowledge graph, and candidate knowledge subtrees are established according to the edge information; in this example, three knowledge trees contain the knowledge keywords.
The scores of the candidate knowledge subtrees are calculated according to the method in step S2 above; the scores of the three candidate knowledge subtrees rank as a < b < c.
Assuming that the top-2 scored candidate knowledge subtrees are selected as subsequent push content, the user receives the knowledge information and knowledge node information corresponding to these two candidate knowledge subtrees at the same time.
The candidate knowledge subtree b is selected as the content for establishing the personalized model, a central knowledge point is determined according to the centrality, the personalized model is established in combination with the correlation, and finally the knowledge point with the highest function value in the personalized model is pushed, namely the "Bernoulli model".
The embodiment also provides a data pushing system based on the knowledge graph, which comprises a memory and a processor; a memory for storing a computer program; the processor executes the data pushing method based on the knowledge graph.
The present embodiment further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the data pushing method based on a knowledge graph described in the embodiments of the present invention. Any combination of one or more computer-readable media may be adopted. The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The foregoing detailed description of the preferred embodiments of the invention has been presented. It should be understood that numerous modifications and variations could be devised by those skilled in the art in light of the present teachings without departing from the inventive concepts. Therefore, the technical solutions available to those skilled in the art through logic analysis, reasoning and limited experiments based on the prior art according to the concept of the present invention should be within the scope of protection defined by the claims.
Claims (10)
1. A data pushing method based on a knowledge graph, characterized by comprising the following steps:
S1, obtaining a question searched by a user, wherein the question contains at least two knowledge keywords, extracting the knowledge keywords from the question, and obtaining the corresponding minimum knowledge subtree in the knowledge graph according to the knowledge keywords;
S2, establishing candidate knowledge subtrees according to all node information and edge information related to the knowledge keywords in the minimum knowledge subtree, scoring and ranking all candidate knowledge subtrees according to a scoring rule, and screening the candidate knowledge subtrees;
S3, establishing a personalized model according to the candidate knowledge subtree obtained in each of the user's historical retrievals, and pushing the knowledge node with the highest personalized model function value.
2. The data pushing method based on the knowledge-graph as claimed in claim 1, wherein the scoring rules specifically include:
scoring the candidate knowledge subtrees according to the following expression:
wherein score(target) represents the score, $t$ represents the root node of the candidate knowledge subtree, $\alpha$ represents compactness, $\beta$ represents precision, $\mathrm{num}(e_k)$ and $n$ each represent the number of nodes in the candidate knowledge subtree, $\mathrm{num}(r_k)$ represents the number of edges in the candidate knowledge subtree, and the remaining term represents the shortest distance from each node to the root node.
3. The data pushing method based on the knowledge graph according to claim 1, wherein the personalized model is established by the following method:
A1, obtaining the distance from each node to the other nodes in the candidate knowledge subtree, summing the distances, and determining a central knowledge node according to the centrality;
A2, obtaining the central knowledge node of the candidate knowledge subtree of each retrieval according to the user's historical retrieval records, and establishing a central knowledge point set;
A3, calculating the correlation between the central knowledge point of the central knowledge point set and the central knowledge point of the next retrieval;
A4, establishing a personalized model $T(o_{i+1})$ according to the correlation and the centrality of the central knowledge node, with the following expression:
$$T(o_{i+1}) = \alpha\,\mathrm{sim}(o_i, o_{i+1}) + \beta\,\mathrm{core}(o_{i+1})$$
wherein $\alpha$ represents compactness, $\beta$ represents precision, $\mathrm{sim}(o_i, o_{i+1})$ represents the correlation, and $\mathrm{core}(o_{i+1})$ represents the centrality.
4. The knowledge-graph-based data pushing method according to claim 3, wherein the centrality is calculated by the following expression:
wherein $e_x$ represents the current node, $N(e_x)$ represents the adjacent nodes of $e_x$, $l(e_j, e_k) = 0$ denotes that there is no connecting edge between node $e_j$ and node $e_k$, and $l(e_j, e_k) = 1$ denotes that there is a connecting edge between node $e_j$ and node $e_k$.
5. The knowledge-graph-based data pushing method according to claim 3, wherein the correlation $\mathrm{sim}(o_k, o_{k+1})$ is calculated by the following expression:
where dist denotes the average path length, $o_k$ represents the central knowledge point of the $k$-th retrieval in the central knowledge point set, $p(e_i, e_{i+1})$ represents the relationship weight, $\mathrm{sp}(o_i, o_{i+1})$ represents the shortest path length from $o_i$ to $o_{i+1}$, $\mathrm{num}(r)$ represents the number of times the relation $r$ appears in the path, $e$ represents a knowledge point on the shortest path between the two central knowledge points, and $w$ represents the support.
6. The data pushing method based on the knowledge-graph of claim 3, wherein after the central knowledge node is obtained, the central knowledge node is fed back to the user.
7. The knowledge-graph-based data pushing method according to claim 1, wherein when the user has no historical retrieval record, knowledge nodes are randomly pushed from the minimum knowledge subtree.
8. The method of claim 1, wherein after step S2 is executed, the knowledge information corresponding to the top-k scored candidate knowledge subtrees is transmitted to the user end.
9. A data push system based on knowledge graph is characterized by comprising a memory and a processor; the memory for storing a computer program; the processor, when executing the computer program, is configured to implement the following method:
S1, obtaining the question searched by the user, extracting the knowledge keywords from the question, and obtaining the corresponding minimum knowledge subtree in the knowledge graph according to the knowledge keywords;
S2, establishing candidate knowledge subtrees according to all the edge information related to the knowledge keywords in the minimum knowledge subtree, scoring and ranking all candidate knowledge subtrees according to a scoring rule, and screening the candidate knowledge subtrees;
S3, establishing a personalized model according to the candidate knowledge subtree obtained in each of the user's historical retrievals, and pushing the knowledge node with the highest personalized model function value.
10. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of a method for knowledge-graph based data push according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210049886.3A CN114461813A (en) | 2022-01-17 | 2022-01-17 | Data pushing method, system and storage medium based on knowledge graph |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210049886.3A CN114461813A (en) | 2022-01-17 | 2022-01-17 | Data pushing method, system and storage medium based on knowledge graph |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114461813A true CN114461813A (en) | 2022-05-10 |
Family
ID=81408886
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210049886.3A Pending CN114461813A (en) | 2022-01-17 | 2022-01-17 | Data pushing method, system and storage medium based on knowledge graph |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114461813A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117828539A (en) * | 2024-03-06 | 2024-04-05 | 昆明智合力兴信息系统集成有限公司 | Intelligent data fusion analysis system and method |
-
2022
- 2022-01-17 CN CN202210049886.3A patent/CN114461813A/en active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117828539A (en) * | 2024-03-06 | 2024-04-05 | 昆明智合力兴信息系统集成有限公司 | Intelligent data fusion analysis system and method |
CN117828539B (en) * | 2024-03-06 | 2024-05-24 | 昆明智合力兴信息系统集成有限公司 | Intelligent data fusion analysis system and method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11500818B2 (en) | Method and system for large scale data curation | |
CN108345690B (en) | Intelligent question and answer method and system | |
CN110727779A (en) | Question-answering method and system based on multi-model fusion | |
CN109299245B (en) | Method and device for recalling knowledge points | |
CN111353030A (en) | Knowledge question and answer retrieval method and device based on travel field knowledge graph | |
US20150006528A1 (en) | Hierarchical data structure of documents | |
CN108846138B (en) | Question classification model construction method, device and medium fusing answer information | |
CN111563192B (en) | Entity alignment method, device, electronic equipment and storage medium | |
JP2013519138A (en) | Join embedding for item association | |
CN110968684A (en) | Information processing method, device, equipment and storage medium | |
CN110765348B (en) | Hot word recommendation method and device, electronic equipment and storage medium | |
WO2023207096A1 (en) | Entity linking method and apparatus, device, and nonvolatile readable storage medium | |
Yahia et al. | A new approach for evaluation of data mining techniques | |
CN115563313A (en) | Knowledge graph-based document book semantic retrieval system | |
CN113742446A (en) | Knowledge graph question-answering method and system based on path sorting | |
CN113505190B (en) | Address information correction method, device, computer equipment and storage medium | |
CN114328800A (en) | Text processing method and device, electronic equipment and computer readable storage medium | |
CN114461813A (en) | Data pushing method, system and storage medium based on knowledge graph | |
CN111813916B (en) | Intelligent question-answering method, device, computer equipment and medium | |
CN111597400A (en) | Computer retrieval system and method based on way-finding algorithm | |
CN117112727A (en) | Large language model fine tuning instruction set construction method suitable for cloud computing service | |
CN104376000A (en) | Webpage attribute determination method and webpage attribute determination device | |
CN113392294B (en) | Sample labeling method and device | |
CN115329083A (en) | Document classification method and device, computer equipment and storage medium | |
CN114417010A (en) | Knowledge graph construction method and device for real-time workflow and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||