CN110609887A

CN110609887A - Scientific and technological resource big data query recommendation system and method based on knowledge graph

Info

Publication number: CN110609887A
Application number: CN201910882779.7A
Authority: CN
Inventors: 郭雨齐; 刘俊中; 沈洪超; 李大春
Original assignee: China Cosys Suzhou Network Technology Co Ltd
Current assignee: Beijing Saisi Litong Technology Co.,Ltd.
Priority date: 2019-09-18
Filing date: 2019-09-18
Publication date: 2019-12-24

Abstract

The embodiment of the invention provides a knowledge-graph-based query recommendation system for scientific and technological resource big data, which comprises the following modules: a scientific and technological resource database for storing scientific and technological resource data; a knowledge graph analysis module for analyzing the scientific and technological resource data and forming an analysis result; a knowledge graph storage module for storing the analysis result; a query module for querying a plurality of matching entries from the analysis result according to keywords provided by the user, the matching entries being sorted by their degree of match with the keywords; and a sorting adjustment module for adjusting the order of the sorted matching entries according to a set rule. The set rule comprises the following sub-rules: a user relevance sub-rule, a user historical query record matching sub-rule, a matching entry popularity sub-rule, and a position sub-rule of the matching entry in the knowledge graph. A corresponding method is also provided. The technical scheme of the invention is suitable for intelligent query recommendation.

Description

Scientific and technological resource big data query recommendation system and method based on knowledge graph

Technical Field

The invention relates to the technical field of knowledge graphs, and in particular to a knowledge-graph-based scientific and technological resource big data query recommendation system, a corresponding method, and a corresponding storage medium.

Background

With the explosive growth of information resources and scientific and technical literature, users need to spend more and more time retrieving the scientific and technological resources they require. Knowledge graph technology can greatly improve retrieval efficiency; however, the information retrieved in this way is limited in scope and is not integrated with other information, so it is difficult to mine the information of interest to users from massive data.

The knowledge graph is a huge semantic network graph essentially comprising nodes and edges, wherein the nodes represent entities or concepts, and the edges represent relationships among the entities or attributes of the entities. The knowledge graph is introduced into the recommendation system of the scientific and technological resources, and rich semantic information in the knowledge graph can be fully utilized, so that the use habits of users can be better fitted, the intellectualization and individuation of the recommendation system are improved, and the satisfaction degree of the users on the recommendation results is improved.

Although concepts such as user portraits and machine learning have been introduced into conventional knowledge-graph-based query, they require an additional database to be established for data support, which consumes additional system resources. In addition, the computational complexity of the graph query algorithm is high, so computation speed is seriously affected when the data volume of the knowledge graph is large.

Disclosure of Invention

The invention aims to provide a knowledge-graph-based scientific and technological resource big data query recommendation system and method, which at least solve the problem of insufficient intelligence in conventional knowledge-graph-based query recommendation of scientific and technological resource big data.

In order to achieve the above object, in a first aspect of the present invention, a knowledge-graph-based scientific and technological resource big data query recommendation system is provided, including the following modules:

the scientific and technological resource database is used for storing scientific and technological resource data;

the knowledge graph analysis module is used for analyzing the scientific and technological resource data and forming an analysis result;

the knowledge graph storage module is used for storing the analysis result;

the query module is used for querying a plurality of matched items from the analysis result according to the keywords provided by the user; the matching items are sorted according to the matching degree with the keywords;

the sorting adjustment module is used for sorting and adjusting the sorted matching items according to a set rule;

the set rule comprises the following sub-rules: a user relevance sub-rule, a user historical query record matching sub-rule, a matching entry popularity sub-rule, and a position sub-rule of the matching entry in the knowledge graph.

Optionally, the knowledge graph analysis module is configured to analyze the fields, the keywords, the related institutions and the copyright holders in the scientific and technological resource data through knowledge graph analysis software to obtain the analysis result.

Optionally, the user relevance sub-rule includes:

matching the user with a copyright holder in the matching entry;

matching the institution where the user is located with the related institution in the matching entry;

performing correlation calculation on the matching results to obtain a correlation calculation result; and

adjusting the order of the corresponding matching entries according to the correlation calculation result.

Optionally, the user historical query record matching sub-rule includes:

acquiring a second keyword in the matching entry, and matching the second keyword against the user's historical query records; and

adjusting the order of the corresponding matching entries according to the number of times the second keyword is matched.

Optionally, the matching entry popularity sub-rule includes:

acquiring the number of times each matching entry has been matched within a preset time period;

calculating the ratio of the number of times each matching entry has been matched to the average number of times all matching entries have been matched; and

adjusting the order of the corresponding matching entries according to the ratio.

Optionally, the position sub-rule of the matching entry in the knowledge-graph includes:

acquiring the connection relations of the matching entries in the knowledge graph, and calculating the centrality of each matching entry;

calculating the ratio of the centrality of each matching entry to the average of the centralities of all matching entries; and

adjusting the order of the corresponding matching entries according to the ratio.

Optionally, the system further includes a keyword recommendation module, configured to search, according to the keyword input by the user, an adjacent keyword to the keyword in the knowledge graph, and provide the adjacent keyword to the user for selection by the user.

In a second aspect of the invention, a server is further provided, on which the above knowledge-graph-based scientific and technological resource big data query recommendation system is deployed.

In a third aspect of the present invention, a scientific and technological resource big data query recommendation method based on a knowledge graph is further provided, where the method includes:

analyzing the stored scientific and technological resource data to form an analysis result; and storing the analysis result;

inquiring a plurality of matching items from the analysis result according to the keywords provided by the user; the matching items are sorted according to the matching degree with the keywords;

performing sorting adjustment on the sorted matching entries according to a set rule;

the set rule comprises the following sub-rules: a user relevance sub-rule, a user historical query record matching sub-rule, a matching entry popularity sub-rule, and a position sub-rule of the matching entry in the knowledge graph.

In a fourth aspect of the present invention, there is also provided a storage medium, on which computer program instructions are stored, and the computer program instructions, when executed by a processor, implement the steps of the above-mentioned knowledge-graph-based scientific and technological resource big data query recommendation method.

According to the technical scheme, the intelligence of the query result in the knowledge graph can be improved by adjusting the query result, the query result is more fit with the habit of the user by adjusting the query result by adopting the user related information, and meanwhile, complicated data processing steps such as user portrait are avoided.

Additional features and advantages of embodiments of the invention will be set forth in the detailed description which follows.

Drawings

The accompanying drawings, which are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the embodiments of the invention without limiting the embodiments of the invention. In the drawings:

FIG. 1 is a block diagram of a scientific and technological resource big data query recommendation system based on knowledge graph according to an embodiment of the present invention;

FIG. 2 is a flowchart of a scientific and technological resource big data query recommendation method based on a knowledge graph according to an embodiment of the present invention.

Detailed Description

The following detailed description of embodiments of the invention refers to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the present invention, are given by way of illustration and explanation only, not limitation.

In the embodiments of the present invention, unless otherwise specified, directional terms such as "upper, lower, top, and bottom" are generally used with respect to the orientation shown in the drawings, or to the positional relationship of the components with respect to each other in the vertical or gravitational direction.

Fig. 1 is a structural diagram of a scientific and technological resource big data query recommendation system based on a knowledge graph according to an embodiment of the present invention. As shown in fig. 1, an embodiment of the present invention provides a scientific and technological resource big data query recommendation system based on a knowledge graph, including the following modules:

the knowledge graph storage module is used for storing the analysis result;

Therefore, the intelligent recommendation system provided by the embodiment of the invention not only avoids the insufficient intelligence and personalization of conventional knowledge-graph-based query, but also combines multiple factors, namely user relevance, the user's historical query records, entry popularity and entry position, thereby improving the user experience.

In particular, although conventional knowledge-graph-based query has introduced concepts such as user portraits and machine learning, it needs an additional database for support and is difficult to implement. The present scheme can directly adjust the order of the query results according to established rules, and the calculation is simple. The specifics of the rules can also be modified as requirements change. The user relevance sub-rule, the user historical query record matching sub-rule, the matching entry popularity sub-rule and the position sub-rule of the matching entry in the knowledge graph together cover most of the information about a user, so that the recommendation is more intelligent.

In an embodiment provided by the present invention, the knowledge graph analysis module analyzes the fields, the keywords, the related institutions and the copyright holders in the scientific and technological resource data through knowledge graph analysis software. The scientific and technological resource database is used for storing scientific and technological resource data; the knowledge graph analysis module is used for analyzing the scientific and technological resource data and forming an analysis result; and the knowledge graph storage module is used for storing the analysis result. The scientific and technological resource database can include data interfaces to existing scientific and technological resources, such as interfaces to the Internet and the World Wide Web, through which existing scientific and technological resources can be acquired. The knowledge graph analysis module is mainly used for extracting, from the existing scientific and technological resources, the entities and attribute associations needed to construct a knowledge graph, so as to form a complete knowledge graph. The knowledge graph can be constructed with an open-source construction tool, such as a project or OpenKE, or with common software such as TDA or CiteSpace.
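As a rough illustration of the analysis step described above, the following Python sketch turns one resource record into entity-relation triples for the four analyzed attributes (fields, keywords, related institutions, copyright holders). The record layout, the relation names in `FIELD_TO_PREDICATE` and the function `record_to_triples` are assumptions made for this example only; the embodiment itself relies on knowledge graph analysis software and does not specify a data format.

```python
# Hypothetical mapping from record attributes to edge labels (names are assumptions).
FIELD_TO_PREDICATE = {
    "fields": "inField",
    "keywords": "hasKeyword",
    "institutions": "relatedInstitution",
    "copyright_holders": "copyrightHolder",
}


def record_to_triples(record):
    """Turn one resource record into (entity, relation, value) triples."""
    resource_id = record["id"]
    triples = []
    for field, predicate in FIELD_TO_PREDICATE.items():
        # Missing attributes simply contribute no triples.
        for value in record.get(field, []):
            triples.append((resource_id, predicate, value))
    return triples


# Illustrative record; all values are made up.
record = {
    "id": "res:1",
    "fields": ["law"],
    "keywords": ["patent law", "explanation"],
    "institutions": ["University X"],
    "copyright_holders": ["Alice"],
}
triples = record_to_triples(record)
```

Each triple corresponds to one edge of the knowledge graph, with the resource as one node and the attribute value as the other.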

The knowledge graph storage module adopts one of the following two common storage modes for knowledge graphs. One is traditional RDF storage, for which the standard query language is SPARQL. The other is a graph database, which overcomes the complex and slow queries of a traditional relational database when storing a knowledge graph. Currently available graph database software includes Neo4j, OrientDB, ArangoDB, AllegroGraph, and the like.
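To make the triple-oriented storage mode more concrete, the sketch below shows a very small in-memory triple store with wildcard pattern matching, which is the basic idea behind an RDF triple pattern. It is only a stand-in written for illustration: the class `TripleStore`, its methods and the sample data are hypothetical, and a real deployment would use an RDF store queried with SPARQL or one of the graph databases listed above.

```python
from collections import defaultdict


class TripleStore:
    """Tiny in-memory stand-in for the knowledge graph storage module."""

    def __init__(self):
        self._triples = set()
        self._by_subject = defaultdict(set)  # index to speed up subject lookups

    def add(self, subject, predicate, obj):
        triple = (subject, predicate, obj)
        self._triples.add(triple)
        self._by_subject[subject].add(triple)

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None acts as a wildcard."""
        if subject is not None:
            candidates = self._by_subject.get(subject, set())
        else:
            candidates = self._triples
        return sorted(
            t for t in candidates
            if (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


store = TripleStore()
store.add("res:1", "hasKeyword", "knowledge graph")
store.add("res:1", "copyrightHolder", "Alice")
store.add("res:2", "hasKeyword", "knowledge graph")

# All resources tagged with the keyword "knowledge graph".
hits = store.match(predicate="hasKeyword", obj="knowledge graph")
```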

In an embodiment of the present invention, the user relevance sub-rule includes: matching the user with a copyright holder in the matching entry; matching the institution where the user is located with the related institution in the matching entry; performing correlation calculation on the matching results to obtain a correlation calculation result; and adjusting the order of the corresponding matching entries according to the correlation calculation result. For example, if the user is the first copyright holder of a matching entry, the entry obtains a corresponding relevance score; if the user is the second copyright holder, the entry obtains another corresponding relevance score, and so on. At the same time, it is checked whether the institutions related to the matching entry include the user's institution; if so, a corresponding relevance score is obtained. Finally, the relevance scores of each matching entry are summed. If the sum is higher than a set first threshold, the entry is moved forward by one position in the ranking; if the sum is also higher than a set second threshold, the entry is moved forward by two positions, and so on. With this embodiment, when the user views the recommendation information, the user's own works appear near the front, which stimulates the user's interest in clicking.
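The scoring and threshold logic of the user relevance sub-rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `relevance_shift`, the entry layout, the score values and the threshold values are all hypothetical, and the choice to reuse the last score for any copyright holder beyond those listed is an assumption standing in for the "and so on" of the description.

```python
def relevance_shift(user, user_institution, entry,
                    holder_scores=(2.0, 1.0),
                    institution_score=1.0,
                    thresholds=(1.0, 2.0)):
    """Return how many positions a matching entry should move forward."""
    score = 0.0
    holders = entry.get("copyright_holders", [])
    if user in holders:
        rank = holders.index(user)  # 0 for the first copyright holder, 1 for the second
        # Assumption: holders beyond the listed scores reuse the last score.
        score += holder_scores[min(rank, len(holder_scores) - 1)]
    if user_institution in entry.get("institutions", []):
        score += institution_score
    # One position forward for every threshold the summed score exceeds.
    return sum(1 for t in thresholds if score > t)


entry = {"copyright_holders": ["Alice", "Bob"], "institutions": ["University X"]}

relevance_shift("Alice", "University X", entry)  # score 3.0 exceeds both thresholds -> 2
relevance_shift("Bob", "University X", entry)    # score 2.0 exceeds only the first -> 1
relevance_shift("Carol", "Institute Y", entry)   # score 0.0 -> 0
```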

In an embodiment provided by the present invention, the user historical query record matching sub-rule includes: acquiring a second keyword in the matching entry, and matching the second keyword against the user's historical query records; and adjusting the order of the corresponding matching entries according to the number of times the second keyword is matched.

For example, a user queries results about patent law, and the user's historical queries are found to contain keywords such as "explanation", which suggests that the user may be interested not only in patent law itself but also in explanations of patent law. The additional keywords in the matching results therefore need to be checked: if a matching entry contains both patent law and an explanation of patent law, the ranking of that matching entry is moved forward.

The above is a qualitative algorithm; in a specific implementation, a quantitative rule may be set to determine the specific number of positions by which a matching entry is moved forward. In another embodiment, a second keyword in the matching entry is acquired and matched against the user's historical query records; if the number of matches is greater than a set first threshold, the entry is moved forward by one position; if the number of matches is also greater than a set second threshold, the entry is moved forward by two positions, and so on. Through this embodiment, the user can pay close attention to currently popular scientific and technological resources, which helps the user track new scientific and technological trends. In this embodiment, a third keyword or further keywords can also be matched according to the user's settings, which likewise moves the entry forward.
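The quantitative variant of the history matching sub-rule might look like the sketch below. It assumes that the "second keywords" of an entry are its keywords other than the ones the user just queried, that historical query records are plain strings, and that a match is a case-insensitive substring hit; the function name `history_shift` and the threshold values are likewise hypothetical.

```python
def history_shift(entry_keywords, query_keywords, history, thresholds=(0, 2)):
    """Return how many positions to move an entry forward based on query history."""
    query_set = {k.lower() for k in query_keywords}
    # Second keywords: keywords of the entry that were not part of the current query.
    second_keywords = {k.lower() for k in entry_keywords} - query_set
    # Count (record, keyword) pairs where the keyword occurs in a past query.
    matches = sum(
        1
        for past_query in history
        for keyword in second_keywords
        if keyword in past_query.lower()
    )
    # One position forward for every threshold the match count exceeds.
    return sum(1 for t in thresholds if matches > t)


history = ["explanation of contract law", "trademark law explanation", "patent fees"]

# "explanation" appears in two past queries: above the first threshold only.
history_shift(["patent law", "explanation"], ["patent law"], history)  # -> 1
# An entry with no additional keywords is left where it is.
history_shift(["patent law"], ["patent law"], history)                 # -> 0
```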

In an embodiment provided by the present invention, the matching entry popularity sub-rule includes: acquiring the number of times each matching entry has been matched within a preset time period; calculating the ratio of the number of times each matching entry has been matched to the average number of times all matching entries have been matched; and adjusting the order of the corresponding matching entries according to the ratio. The preset time period may be one week, ten days, or one month. For example, the number of times each matching entry is matched in one week is obtained and denoted $C_n$, where $n = 1, 2, \ldots, N$ and $N$ is the total number of matching entries. The mean of $C_n$ is denoted $\bar{C} = \frac{1}{N}\sum_{n=1}^{N} C_n$, and the ratio $C_n / \bar{C}$ is calculated. If the ratio is greater than a set first threshold ratio, the entry is moved forward by one position; if the ratio is also greater than a set second threshold ratio, the entry is moved forward by two positions, and so on. Through this embodiment, the user can pay close attention to currently popular scientific and technological resources, which helps the user track new scientific and technological trends.
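The ratio-to-mean calculation above can be illustrated with a short Python sketch. The function name `popularity_shifts`, the dictionary input and the two threshold ratios are assumptions chosen for the example; the handling of an all-zero period (no entry is moved) is also an assumption, since the description does not cover that case.

```python
def popularity_shifts(match_counts, ratio_thresholds=(1.5, 3.0)):
    """Map each entry to the number of positions it should move forward.

    match_counts: entry id -> times the entry was matched in the preset period.
    """
    if not match_counts:
        return {}
    mean = sum(match_counts.values()) / len(match_counts)
    if mean == 0:
        # Assumption: if nothing was matched in the period, nothing is moved.
        return {entry: 0 for entry in match_counts}
    return {
        entry: sum(1 for t in ratio_thresholds if count / mean > t)
        for entry, count in match_counts.items()
    }


# Weekly match counts for three entries; the mean is 12.
weekly_counts = {"a": 30, "b": 6, "c": 0}
shifts = popularity_shifts(weekly_counts)
# Ratios are 2.5, 0.5 and 0.0, so only "a" exceeds the first threshold ratio.
```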

In one embodiment of the present invention, the position sub-rule of the matching entry in the knowledge graph includes: acquiring the connection relations of the matching entries in the knowledge graph, and calculating the centrality of each matching entry; calculating the ratio of the centrality of each matching entry to the average of the centralities of all matching entries; and adjusting the order of the corresponding matching entries according to the ratio. The centrality here refers to the number of connections between the node where the matching entry is located and its adjacent nodes. A larger number of connections indicates that the node, and thus the matching entry, is located near the center of the knowledge graph; conversely, a smaller number of connections indicates that it is located at the edge of the knowledge graph. The centrality can be determined from the number of edges of the node, that is, the number of entity relationships corresponding to the entity. As with the matching entry popularity sub-rule, the centrality of each matching entry is calculated, the average of all the centralities is calculated, and the order is adjusted according to the ratio of each centrality to that average.
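Since centrality is defined here as the number of edges at a node, the calculation reduces to a degree count followed by the same ratio-to-mean step as in the popularity sub-rule. The sketch below assumes an undirected edge list; the function names and the sample graph are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Count the number of edges at each node of an undirected edge list."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)


def centrality_ratios(edges, matching_entries):
    """Ratio of each matching entry's degree to the mean degree of all matching entries."""
    degree = degree_centrality(edges)
    values = {entry: degree.get(entry, 0) for entry in matching_entries}
    mean = sum(values.values()) / len(values) if values else 0
    if mean == 0:
        return {entry: 0.0 for entry in values}
    return {entry: value / mean for entry, value in values.items()}


edges = [
    ("patent law", "implementation rules"),
    ("patent law", "historical versions"),
    ("patent law", "revision history"),
    ("patent law", "trademark law"),
    ("trademark law", "implementation rules"),
]
ratios = centrality_ratios(edges, ["patent law", "trademark law", "revision history"])
# Degrees are 4, 2 and 1, so "patent law" lies well above the mean of 7/3.
```

The resulting ratios can then be compared against threshold ratios in the same way as the popularity ratios to decide how far each entry moves forward.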

In an embodiment provided by the present invention, the system further includes a keyword recommendation module, configured to search the knowledge graph for keywords adjacent to the keyword input by the user, and to provide the adjacent keywords to the user for selection. For example, when a user searches for patent law, the system may look up the keywords adjacent to patent law in the existing knowledge graph and offer other keywords such as implementation rules, historical versions, and revision histories, so that the user can broaden their knowledge.
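A minimal sketch of the adjacent-keyword lookup is given below, assuming the keyword relations are available as an undirected edge list. The function `adjacent_keywords` and the sample edges are illustrative only; with a graph database, the same lookup would be a one-hop neighbor query.

```python
def adjacent_keywords(edges, keyword):
    """Return the keywords directly connected to `keyword`, in alphabetical order."""
    neighbours = set()
    for a, b in edges:
        if a == keyword:
            neighbours.add(b)
        elif b == keyword:
            neighbours.add(a)
    return sorted(neighbours)


keyword_edges = [
    ("patent law", "implementation rules"),
    ("patent law", "historical versions"),
    ("patent law", "revision history"),
    ("trademark law", "implementation rules"),
]

suggestions = adjacent_keywords(keyword_edges, "patent law")
# -> ['historical versions', 'implementation rules', 'revision history']
```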

In an embodiment of the invention, a server is further provided, on which the knowledge-graph-based scientific and technological resource big data query recommendation system is deployed. In another embodiment provided by the present invention, there is also provided an apparatus comprising a memory and a processor; the memory is used to store program instructions, and the processor is used to call the program instructions stored in the memory to implement the knowledge-graph-based scientific and technological resource big data query recommendation system. The processor may include, but is not limited to, a general purpose processor, a special purpose processor, a conventional processor, a plurality of microprocessors, a controller, a microcontroller, an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) circuit, any other type of Integrated Circuit (IC), a state machine, and so forth. In a common scenario, the device is preferably a server.

FIG. 2 is a flowchart of a scientific and technological resource big data query recommendation method based on a knowledge graph according to an embodiment of the present invention. As shown in FIG. 2, an embodiment of the present invention further provides a scientific and technological resource big data query recommendation method based on a knowledge graph, and the method includes:

analyzing the stored scientific and technological resource data to form an analysis result, and storing the analysis result; querying a plurality of matching entries from the analysis result according to the keywords provided by the user, the matching entries being sorted by their degree of match with the keywords; and performing sorting adjustment on the sorted matching entries according to a set rule. The set rule comprises the following sub-rules: a user relevance sub-rule, a user historical query record matching sub-rule, a matching entry popularity sub-rule, and a position sub-rule of the matching entry in the knowledge graph.
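The final sorting adjustment step can be sketched as follows. The description does not state how the forward moves of several sub-rules are combined or in what order entries are moved, so this example makes two explicit assumptions: the moves of all sub-rules are summed per entry, and entries are processed from the top of the initial ranking downward. The names `apply_shifts` and `adjust_ranking`, and the stand-in sub-rules, are hypothetical.

```python
def apply_shifts(ranked, shifts):
    """Move each entry forward by shifts[entry] positions, processing top-down."""
    result = list(ranked)
    for entry in ranked:
        positions = shifts.get(entry, 0)
        if positions <= 0:
            continue
        index = result.index(entry)
        result.pop(index)
        # An entry cannot move past the top of the list.
        result.insert(max(0, index - positions), entry)
    return result


def adjust_ranking(ranked, sub_rules):
    """Sum the forward moves of all sub-rules per entry, then re-rank."""
    shifts = {entry: sum(rule(entry) for rule in sub_rules) for entry in ranked}
    return apply_shifts(ranked, shifts)


# Initial ranking by keyword match degree, best match first.
ranked = ["a", "b", "c", "d"]

# Stand-ins for two sub-rules, each returning a number of positions per entry.
relevance_rule = lambda entry: {"c": 1}.get(entry, 0)
popularity_rule = lambda entry: {"c": 1, "d": 1}.get(entry, 0)

adjusted = adjust_ranking(ranked, [relevance_rule, popularity_rule])
# "c" moves forward two positions and "d" one: ['c', 'a', 'd', 'b']
```

Keeping each sub-rule as a separate function that returns a number of positions makes it straightforward to change thresholds or add and remove sub-rules without touching the re-ranking step.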

For details of the above method embodiment, refer to the system described above; they are not repeated here.

Embodiments of the present invention also provide a storage medium, on which computer program instructions are stored, and when the computer program instructions are executed by a processor, the steps of the method for recommending scientific and technological resource big data query based on knowledge graph are implemented.

By the technical scheme, the intelligence of the query result in the knowledge graph can be improved, the query result is more fit with the habit of the user by adjusting the query result by adopting the relevant information of the user, and complicated data processing steps such as user portrait and the like are avoided.

While the embodiments of the present invention have been described in detail with reference to the accompanying drawings, the embodiments of the present invention are not limited to the details of the above embodiments, and various simple modifications can be made to the technical solution of the embodiments of the present invention within the technical idea of the embodiments of the present invention, and the simple modifications are within the scope of the embodiments of the present invention.

It should be noted that the various features described in the above embodiments may be combined in any suitable manner without departing from the scope of the invention. In order to avoid unnecessary repetition, the embodiments of the present invention will not be described separately for the various possible combinations.

Those skilled in the art will appreciate that all or part of the steps in the method for implementing the above embodiments may be implemented by a program, which is stored in a storage medium and includes several instructions to enable a single chip, a chip, or a processor (processor) to execute all or part of the steps in the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

In addition, any combination of the various embodiments of the present invention is also possible, and the same should be considered as disclosed in the embodiments of the present invention as long as it does not depart from the spirit of the embodiments of the present invention.

Claims

1. A scientific and technological resource big data query recommendation system based on knowledge graph is characterized by comprising the following modules:

the knowledge graph storage module is used for storing the analysis result;

2. The knowledge-graph-based scientific and technological resource big data query recommendation system of claim 1, wherein the knowledge graph analysis module is configured to analyze the fields, the keywords, the related institutions and the copyright holders in the scientific and technological resource data through knowledge graph analysis software to obtain the analysis result.

3. The knowledge-graph-based scientific resource big data query recommendation system of claim 2, wherein the user relevance sub-rule comprises:

matching the user with a copyright holder in the matching entry; and

performing correlation calculation on the matched result to obtain a correlation calculation result;

4. The knowledge-graph-based scientific and technological resource big data query recommendation system according to claim 2, wherein the user historical query record matching sub-rule comprises:

5. The knowledge-graph-based scientific and technological resource big data query recommendation system according to claim 2, wherein the matching entry popularity sub-rule comprises:

acquiring the matched times of each matched item in a preset time period;

6. The knowledge-graph-based scientific resource big data query recommendation system of claim 2 wherein the position sub-rule of the matching entry in the knowledge-graph comprises:

7. The knowledge-graph-based scientific and technological resource big data query recommendation system of claim 1, further comprising a keyword recommendation module for searching the knowledge graph for keywords adjacent to the keywords according to the keywords input by the user and providing the keywords to the user for selection.

8. A server, characterized in that the knowledge-graph based scientific and technological resource big data query recommendation system of any one of claims 1 to 7 is loaded on the server.

9. A scientific and technological resource big data query recommendation method based on knowledge graph is characterized by comprising the following steps:

10. A storage medium having stored thereon computer program instructions, wherein the computer program instructions, when executed by a processor, implement the steps of the method for knowledge-graph based scientific resource big data query recommendation of claim 9.