CN111563172B

CN111563172B - Academic hot spot trend prediction method and device based on dynamic knowledge graph construction

Info

Publication number: CN111563172B
Application number: CN202010376472.2A
Authority: CN
Inventors: 高军晖; 谭润东; 张心觉; 江荣峰; 龚建兵; 楼敬伟
Original assignee: Shanghai Biotecan Medical Diagnostics Co ltd; Shanghai Zhangjiang Medical Innovation Research Institute; Shanghai Biotecan Biology Medicine Technology Co ltd
Current assignee: Shanghai Biotecan Medical Diagnostics Co ltd; Shanghai Zhangjiang Medical Innovation Research Institute; Shanghai Biotecan Biology Medicine Technology Co ltd
Priority date: 2020-05-07
Filing date: 2020-05-07
Publication date: 2024-02-06
Anticipated expiration: 2040-05-07
Also published as: CN111563172A

Abstract

The invention discloses an academic hot spot trend prediction method and device based on dynamic knowledge graph construction. The method comprises the following steps: acquiring paper texts corresponding to a target vocabulary, and determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the paper texts; establishing a knowledge graph corresponding to the target vocabulary according to the paper texts and the high-frequency vocabulary dictionary; and determining network indexes of the nodes in the knowledge graph, and obtaining an academic hot spot trend ranking corresponding to the target vocabulary according to the network indexes. According to the embodiments of the invention, a knowledge graph can be established from the paper texts corresponding to the target vocabulary, the network index of each node can be determined from the relations among the nodes in the established knowledge graph, and the academic hot spot trend corresponding to the target vocabulary can be predicted from the network index of each node.

Description

Academic hot spot trend prediction method and device based on dynamic knowledge graph construction

Technical Field

The embodiment of the invention relates to the technical field of computers, in particular to an academic hot spot trend prediction method and device based on dynamic knowledge graph construction.

Background

How to better analyze and understand academic hot spot trends is a problem facing researchers. Accurately predicting academic hot spot trends helps researchers quickly understand and grasp the current state of research in a field, and thereby improves research efficiency.

In the prior art, academic hot spot trend prediction methods are based on the word frequency of professional vocabulary. That is, the academic hot spot trend of a field is determined from the frequency with which the professional vocabulary of that field appears in the literature of that field.

In the process of implementing the present invention, the inventors found the following drawbacks in the prior art: the existing academic hot spot trend prediction methods are based on the word frequency of professional vocabulary, do not consider the relations among the professional vocabulary, and are affected by the number of documents.

Disclosure of Invention

The invention provides an academic hot spot trend prediction method and device based on dynamic knowledge graph construction, which are used for predicting academic hot spot trend according to the relation among vocabularies in paper texts.

In a first aspect, an embodiment of the present invention provides an academic hotspot trend prediction method based on dynamic knowledge graph construction, including:

Acquiring paper texts corresponding to the target vocabulary, and determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the paper texts;

establishing a knowledge graph corresponding to the target vocabulary according to the paper text and the high-frequency vocabulary dictionary;

and determining network indexes of all nodes in the knowledge graph, and obtaining academic hot spot trend ranking corresponding to the target vocabulary according to the network indexes.

In a second aspect, an embodiment of the present invention further provides an academic hotspot trend prediction apparatus constructed based on a dynamic knowledge graph, including:

the paper text acquisition module is used for acquiring paper texts corresponding to the target vocabulary and determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the paper texts;

the knowledge graph establishing module is used for establishing a knowledge graph corresponding to the target vocabulary according to the paper text and the high-frequency vocabulary dictionary;

and the trend ranking determining module is used for determining network indexes of all nodes in the knowledge graph and obtaining academic hot spot trend ranking corresponding to the target vocabulary according to the network indexes.

In a third aspect, an embodiment of the present invention further provides a computer apparatus, including:

one or more processors;

A storage means for storing one or more programs;

and when the one or more programs are executed by the one or more processors, the one or more processors are enabled to realize the academic hotspot trend prediction method based on the dynamic knowledge graph construction provided by any embodiment of the invention.

In a fourth aspect, an embodiment of the present invention further provides a computer storage medium, where a computer program is stored, where the program when executed by a processor implements the academic hotspot trend prediction method based on dynamic knowledge graph construction provided by any embodiment of the present invention.

According to the embodiments of the invention, the high-frequency vocabulary dictionary corresponding to the target vocabulary is determined according to the paper text corresponding to the target vocabulary; the knowledge graph corresponding to the target vocabulary is then established according to the paper text and the high-frequency vocabulary dictionary; finally, the network index of each node in the knowledge graph is determined, and the academic hot spot trend ranking corresponding to the target vocabulary is obtained according to the network index. In this way, the knowledge graph can be established from the paper text corresponding to the target vocabulary, the network index of each node can be determined from the relations among the nodes in the established knowledge graph, and the academic hot spot trend corresponding to the target vocabulary can be predicted from the network index of each node.

Drawings

Fig. 1 is a flowchart of an academic hot spot trend prediction method based on dynamic knowledge graph construction according to a first embodiment of the present invention;

Fig. 2 is a flowchart of an academic hot spot trend prediction method based on dynamic knowledge graph construction according to a second embodiment of the present invention;

Fig. 3 is a flowchart of an academic hot spot trend prediction method based on dynamic knowledge graph construction according to a third embodiment of the present invention;

Fig. 4 is a schematic structural diagram of an academic hot spot trend prediction device based on dynamic knowledge graph construction according to a fourth embodiment of the present invention;

Fig. 5 is a schematic structural diagram of a computer device according to a fifth embodiment of the present invention.

Detailed Description

The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting thereof.

It should be further noted that, for convenience of description, only some, but not all of the matters related to the present invention are shown in the accompanying drawings. Before discussing exemplary embodiments in more detail, it should be mentioned that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart depicts operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently, or at the same time. Furthermore, the order of the operations may be rearranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figures. The processes may correspond to methods, functions, procedures, subroutines, and the like.

Example 1

Fig. 1 is a flowchart of an academic hot spot trend prediction method based on dynamic knowledge graph construction according to the first embodiment of the present invention. The method may be performed by an academic hot spot trend prediction device based on dynamic knowledge graph construction; the device may be implemented in software and/or hardware and may generally be integrated in a computer device. Accordingly, as shown in Fig. 1, the method includes the following operations:

Step 101, acquiring paper texts corresponding to the target vocabulary, and determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the paper texts.

Optionally, the target vocabulary is a professional vocabulary for which academic hot spot trend prediction is required. Illustratively, the target vocabulary is "intestinal flora".

Optionally, the target vocabulary may be a professional vocabulary input by the user. The user inputs a target vocabulary and requests the academic hot spot trend corresponding to the target vocabulary.

Optionally, the paper text may be paper abstract text or paper body text. The paper abstract text is the abstract of the paper. The paper body text is the body of the paper.

In a specific example, obtaining the paper text corresponding to the target vocabulary may include: in at least one paper retrieval platform, retrieving papers associated with the target vocabulary; and downloading the abstract text of the searched paper associated with the target vocabulary, and taking the downloaded abstract text as the paper abstract text corresponding to the target vocabulary.

In another specific example, obtaining the paper text corresponding to the target vocabulary may include: retrieving papers associated with the target vocabulary in a paper retrieval platform; and downloading the body text of the retrieved papers associated with the target vocabulary, and taking the downloaded body text as the paper body text corresponding to the target vocabulary.

Optionally, the paper retrieval platform may include the Wanfang Data Knowledge Service Platform and CNKI (China National Knowledge Infrastructure).

Optionally, determining the high-frequency vocabulary dictionary corresponding to the target vocabulary according to the paper text may include: according to the paper text, obtaining a high-frequency vocabulary corresponding to the target vocabulary; and determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the high-frequency vocabulary.

The high-frequency vocabulary is vocabulary that co-occurs with the target vocabulary a number of times greater than or equal to a preset number of times threshold. The preset number of times threshold may be set according to service requirements. Illustratively, the preset number of times threshold is 50. The high-frequency vocabulary dictionary is a list that stores the high-frequency vocabulary corresponding to the target vocabulary.

Optionally, obtaining the high-frequency vocabulary corresponding to the target vocabulary according to the paper text includes: analyzing the paper texts to obtain the words that appear together with the target vocabulary in each text; for each acquired word, counting the number of times that the word and the target vocabulary appear together; and determining the words whose count is greater than or equal to the preset number of times threshold as the high-frequency vocabulary corresponding to the target vocabulary.

Optionally, determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the high-frequency vocabulary includes: and adding the high-frequency vocabulary corresponding to the target vocabulary into a high-frequency vocabulary dictionary corresponding to the target vocabulary.
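The dictionary construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: each paper text is assumed to be already tokenized into a list of words (the source does not specify tokenization), a word is counted at most once per text (the source does not say whether co-occurrence is counted per text or per sentence), and the names `build_high_frequency_dictionary`, `texts`, and `threshold` are hypothetical.

```python
from collections import Counter


def build_high_frequency_dictionary(texts, target, threshold=50):
    """Return the words that co-occur with `target` in at least `threshold` texts.

    `texts` is a list of token lists, one per paper text (assumed pre-tokenized).
    """
    co_occurrence = Counter()
    for tokens in texts:
        unique_tokens = set(tokens)
        if target not in unique_tokens:
            continue  # only texts that contain the target vocabulary are counted
        # Assumption: each word contributes at most one count per text.
        co_occurrence.update(unique_tokens - {target})
    return {word for word, count in co_occurrence.items() if count >= threshold}


# Toy corpus with a lowered threshold so the example stays small.
texts = [
    ["intestinal_flora", "probiotics", "diarrhea"],
    ["intestinal_flora", "probiotics", "obesity"],
    ["probiotics", "yogurt"],
]
dictionary = build_high_frequency_dictionary(texts, "intestinal_flora", threshold=2)
```

In this toy corpus only "probiotics" appears together with the target in two texts, so it is the only entry kept. A set is used instead of a list because the later filtering step only needs membership tests.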

Step 102, establishing a knowledge graph corresponding to the target vocabulary according to the paper text and the high-frequency vocabulary dictionary.

Optionally, establishing a knowledge graph corresponding to the target vocabulary according to the paper text and the high-frequency vocabulary dictionary may include: classifying paper texts according to years; determining a target phrase set corresponding to the paper texts of each year according to the word relation in each sentence in the paper texts of each year; according to the high-frequency vocabulary dictionary, each target phrase in the target phrase set corresponding to the paper text of each year is filtered; and establishing a knowledge graph corresponding to each year according to the filtered target phrase set corresponding to the paper text of each year.

The paper texts are classified according to year to obtain the paper texts of each year. In one specific example, the total number of paper texts corresponding to the target vocabulary is 95. After classification by year, the number of paper texts is 28 for 2017, 35 for 2018, and 32 for 2019.
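Classification by year amounts to grouping records on their publication year. The sketch below assumes each paper is available as a `(year, text)` pair; that record shape and the function name `group_texts_by_year` are illustrative assumptions, and the text strings are placeholders that only reproduce the counts from the example.

```python
from collections import defaultdict


def group_texts_by_year(papers):
    """Group paper texts by publication year.

    `papers` is an iterable of (year, text) pairs (an assumed record shape).
    """
    texts_by_year = defaultdict(list)
    for year, text in papers:
        texts_by_year[year].append(text)
    return dict(texts_by_year)


# Placeholder records that reproduce the 28 / 35 / 32 split of the example.
papers = (
    [(2017, f"placeholder text {i}") for i in range(28)]
    + [(2018, f"placeholder text {i}") for i in range(35)]
    + [(2019, f"placeholder text {i}") for i in range(32)]
)
texts_by_year = group_texts_by_year(papers)
counts = {year: len(texts) for year, texts in texts_by_year.items()}
```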

Optionally, determining the target phrase set corresponding to the paper text of each year according to the word relations in each sentence in the paper text of each year may include: sequentially acquiring one year from the years as the current processing year; for each sentence in the paper text of the current processing year, extracting the subject, predicate and object of the sentence according to the word relations in the sentence, combining the subject, predicate and object in subject-predicate-object order to form a target phrase, and adding the target phrase to the target phrase set corresponding to the paper text of the current processing year; and returning to the operation of sequentially acquiring one year from the years as the current processing year until all years have been processed.

Optionally, natural language processing (NLP) technology is used, for each sentence in the paper text of the current processing year, to extract the subject, predicate and object of the sentence according to the word relations in the sentence, to combine them in subject-predicate-object order to form a target phrase, and to add the target phrase to the target phrase set corresponding to the paper text of the current processing year.

In one specific example, the sentence is "probiotics kill bacteria that cause diarrhea". Using NLP technology, the subject "probiotics", the predicate "kill" and the object "bacteria" are extracted according to the word relations in the sentence and combined in subject-predicate-object order to form the target phrase "probiotics kill bacteria", which is added to the target phrase set corresponding to the paper text of the current processing year.
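The extraction step can be illustrated on an already-parsed sentence. The source does not name a parser, so the sketch below takes a hand-written dependency parse as input; the `Token` structure, the label names `nsubj`, `dobj` and `ROOT`, and the function `extract_target_phrase` are assumptions made for illustration. In practice the parse would come from whatever NLP tool is used.

```python
from typing import List, NamedTuple, Optional, Tuple


class Token(NamedTuple):
    text: str
    dep: str   # dependency label (assumed scheme: "nsubj", "dobj", "ROOT", ...)
    head: int  # index of the governing token; the root points to itself


def extract_target_phrase(tokens: List[Token]) -> Optional[Tuple[str, str, str]]:
    """Return (subject, predicate, object) of the main clause, or None."""
    root = next((i for i, t in enumerate(tokens) if t.dep == "ROOT"), None)
    if root is None:
        return None
    # Only direct dependents of the main verb are considered, so the
    # relative clause "that cause diarrhea" is ignored.
    subject = next((t.text for t in tokens if t.dep == "nsubj" and t.head == root), None)
    obj = next((t.text for t in tokens if t.dep == "dobj" and t.head == root), None)
    if subject is None or obj is None:
        return None
    return (subject, tokens[root].text, obj)


# Hand-written parse of "probiotics kill bacteria that cause diarrhea".
sentence = [
    Token("probiotics", "nsubj", 1),
    Token("kill", "ROOT", 1),
    Token("bacteria", "dobj", 1),
    Token("that", "nsubj", 4),
    Token("cause", "relcl", 2),
    Token("diarrhea", "dobj", 4),
]
phrase = extract_target_phrase(sentence)
```

Sentences without both a subject and an object under the main verb yield `None` here; how such sentences should be handled is not stated in the source.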

Optionally, filtering each target phrase in the target phrase set corresponding to the paper text of each year according to the high-frequency vocabulary dictionary may include: judging whether a subject or object in the current processing target phrase belongs to a high-frequency vocabulary dictionary; if the subjects or objects in the current processing target phrase are determined to belong to the high-frequency vocabulary dictionary, reserving the current processing target phrase in the target phrase set; if it is determined that neither subject nor object in the current processing target phrase belongs to the high frequency vocabulary dictionary, the current processing target phrase is deleted from the target phrase set.

In this way, each target phrase in the target phrase set corresponding to the paper text of each year is filtered according to the high-frequency vocabulary dictionary, and only the target phrases whose subject or object belongs to the high-frequency vocabulary dictionary corresponding to the target vocabulary are kept.
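The filtering rule above is a simple membership test. A minimal sketch, assuming target phrases are represented as `(subject, predicate, object)` tuples and the dictionary as a set (both representations and the name `filter_target_phrases` are illustrative):

```python
def filter_target_phrases(phrases, dictionary):
    """Keep the phrases whose subject or object is in the high-frequency dictionary."""
    return [
        (subject, predicate, obj)
        for subject, predicate, obj in phrases
        if subject in dictionary or obj in dictionary
    ]


# Toy phrases; the second one has neither subject nor object in the dictionary.
phrases = [
    ("probiotics", "kill", "bacteria"),
    ("patients", "receive", "placebo"),
]
kept = filter_target_phrases(phrases, {"probiotics", "diarrhea"})
```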

Optionally, establishing a knowledge graph corresponding to each year according to the filtered target phrase set corresponding to the paper text of each year may include: sequentially acquiring one year from the years as the current processing year; for each target phrase in the filtered target phrase set corresponding to the paper text of the current processing year, taking the subject and the object in the target phrase as nodes and establishing a link between the subject and the object; aggregating all the established links to obtain the knowledge graph corresponding to the current processing year; and returning to the operation of sequentially acquiring one year from the years as the current processing year until all years have been processed. Knowledge graphs corresponding to the respective years are thus obtained, one knowledge graph per year.
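A per-year graph can be sketched with plain dictionaries. The source only says that a link is established between subject and object, so treating links as undirected and collapsing repeated links into one are assumptions, as are the names `build_knowledge_graph` and `build_graphs_by_year` and the toy phrases.

```python
from collections import defaultdict


def build_knowledge_graph(phrases):
    """Build an undirected graph {node: set(neighbors)} from (s, p, o) phrases."""
    graph = defaultdict(set)
    for subject, _predicate, obj in phrases:
        # Assumption: links are undirected and duplicates collapse into one link.
        graph[subject].add(obj)
        graph[obj].add(subject)
    return dict(graph)


def build_graphs_by_year(phrases_by_year):
    """One knowledge graph per year, keyed by year."""
    return {year: build_knowledge_graph(p) for year, p in phrases_by_year.items()}


# Toy filtered phrase sets for two years.
phrases_by_year = {
    2018: [("probiotics", "kill", "bacteria")],
    2019: [
        ("probiotics", "kill", "bacteria"),
        ("probiotics", "reduce", "inflammation"),
    ],
}
graphs = build_graphs_by_year(phrases_by_year)
```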

Step 103, determining network indexes of the nodes in the knowledge graph, and obtaining the academic hot spot trend ranking corresponding to the target vocabulary according to the network indexes.

Optionally, determining the network index of each node in the knowledge graph, and obtaining the academic hotspot trend ranking corresponding to the target vocabulary according to the network index may include: calculating network indexes of each node in the knowledge graph corresponding to each year by using a complex network tool; ordering the network indexes of the nodes according to the year aiming at each node, and calculating the network index change trend of the nodes according to the ordering result by using a linear regression technology; and sequencing the nodes from high to low according to the change trend of the network index to obtain the academic hotspot trend ranking corresponding to the target vocabulary.

The network index is an evaluation index for evaluating the importance of each node in the network. The importance ranking of the nodes in the network can be obtained from the magnitude of the network index. Optionally, the network index may be degree or node centrality.

The degree of a node refers to the number of links associated with the node. The higher the degree of a node, the higher the importance of the node.

Optionally, the node centrality may include betweenness centrality and degree centrality. Betweenness centrality is the ratio of the number of shortest paths passing through a node to the number of shortest paths in the entire network. The higher the betweenness centrality of a node, the higher the importance of the node. Degree centrality is the ratio of the degree of a node to the sum of the degrees of all nodes in the network. Degree centrality reflects the number of nodes directly connected to the node; if a node has a high degree centrality, the node is likely to be located at the center of the network and to have high authority. The higher the degree centrality of a node, the higher the importance of the node.
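The degree-based index can be computed directly from an adjacency structure. The sketch below follows the definition given above (degree divided by the sum of all degrees); the graph representation `{node: set(neighbors)}` and the function name are assumptions. Betweenness centrality is omitted to keep the example short.

```python
def degree_centrality(graph):
    """Degree of each node divided by the sum of the degrees of all nodes."""
    total = sum(len(neighbors) for neighbors in graph.values())
    if total == 0:
        return {node: 0.0 for node in graph}  # graph without links
    return {node: len(neighbors) / total for node, neighbors in graph.items()}


# Toy undirected graph: "probiotics" is linked to the two other nodes.
graph = {
    "probiotics": {"bacteria", "inflammation"},
    "bacteria": {"probiotics"},
    "inflammation": {"probiotics"},
}
centrality = degree_centrality(graph)
```

Here the degrees are 2, 1 and 1, so the values are 0.5, 0.25 and 0.25. Note that graph libraries may normalize differently; NetworkX's `degree_centrality`, for example, divides by n - 1 rather than by the degree sum, which changes the scale but not the ordering of nodes within one graph.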

A complex network tool is a software tool for analyzing complex networks. For example, the complex network tool NetworkX is used to analyze the knowledge graph network corresponding to each year and to calculate the network index of each node in the knowledge graph corresponding to each year.

Optionally, for each node, sorting the network indexes of the nodes according to the year, and calculating the network index change trend of the node according to the sorting result by using a linear regression technology, may include: acquiring network indexes of current processing nodes in each year from the calculated network indexes of each node in the knowledge graph corresponding to each year; ordering the network indexes of the current processing nodes according to the years, namely listing the network indexes of the current processing nodes according to the years; and determining a quantitative relation between the year and the network index of the current processing node according to the sequencing result by using a linear regression technology, and calculating the change trend of the network index of the current processing node according to the quantitative relation, namely calculating the predicted value of the network index of the current processing node of the next year according to the quantitative relation.

The network index change trend is the predicted value of the network index for the next year. In one specific example, the network indexes of the current processing node are ordered by year, that is, listed by year: 2017: 9; 2018: 10; 2019: 16. Using a linear regression technique, the quantitative relation between the year and the network index of the current processing node is determined from the ordering result, and the predicted value of the network index of the node in 2020 is calculated from the quantitative relation.
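The regression step for this example can be reproduced with ordinary least squares on a single variable. This is a sketch under the assumption that a straight-line fit of index against year is intended; the source says only "linear regression" and does not give the fitted value. The function name `predict_next_year` is hypothetical.

```python
def predict_next_year(index_by_year):
    """Fit index = a + b * year by least squares and predict the following year."""
    years = sorted(index_by_year)
    values = [index_by_year[year] for year in years]
    n = len(years)
    mean_year = sum(years) / n
    mean_value = sum(values) / n
    # Centering the years keeps the arithmetic well conditioned.
    sxx = sum((year - mean_year) ** 2 for year in years)
    sxy = sum((year - mean_year) * (value - mean_value)
              for year, value in zip(years, values))
    slope = sxy / sxx  # requires at least two distinct years
    next_year = years[-1] + 1
    return mean_value + slope * (next_year - mean_year)


prediction = predict_next_year({2017: 9, 2018: 10, 2019: 16})
```

For the series above the fitted slope is 3.5 per year and the mean index is 35/3, so under this fit the predicted index for 2020 is 35/3 + 3.5 × 2 = 56/3, approximately 18.67.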

The nodes are ordered from high to low according to the network index change trend, and the resulting node ranking is the academic hot spot trend ranking corresponding to the target vocabulary. Nodes are words extracted from the paper text. The network index change trend represents the importance trend of each node in the next year. In the academic hot spot trend ranking, the higher a node ranks, the higher its expected importance in the next year; that is, the top-ranked nodes are the academic hot spots corresponding to the target vocabulary. The academic hot spot trend corresponding to the target vocabulary is determined according to this ranking.
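Ranking then reduces to sorting nodes by their predicted next-year index. The sketch below is self-contained and therefore repeats a compact least-squares fit; it assumes every node has an index in at least two distinct years (the source does not say how nodes missing from some years are handled). All names and the toy series for "antibiotics" and "obesity" are illustrative.

```python
def predicted_index(index_by_year):
    """Least-squares line through (year, index) points, evaluated at the next year."""
    years = sorted(index_by_year)
    values = [index_by_year[year] for year in years]
    mean_year = sum(years) / len(years)
    mean_value = sum(values) / len(values)
    sxx = sum((year - mean_year) ** 2 for year in years)
    sxy = sum((year - mean_year) * (value - mean_value)
              for year, value in zip(years, values))
    return mean_value + (sxy / sxx) * (years[-1] + 1 - mean_year)


def rank_hotspots(index_by_node):
    """Order nodes from highest to lowest predicted next-year network index."""
    predictions = {node: predicted_index(series)
                   for node, series in index_by_node.items()}
    return sorted(predictions, key=predictions.get, reverse=True)


# Toy per-node yearly indexes; only the "probiotics" series comes from the text.
index_by_node = {
    "probiotics": {2017: 9, 2018: 10, 2019: 16},
    "antibiotics": {2017: 12, 2018: 11, 2019: 10},
    "obesity": {2017: 3, 2018: 6, 2019: 9},
}
ranking = rank_hotspots(index_by_node)
```

With these toy series the predicted 2020 values are about 18.67, 9 and 12, so a declining node such as "antibiotics" ranks last even though its 2017 index was the highest.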

Example 2

Fig. 2 is a flowchart of an academic hotspot trend prediction method based on dynamic knowledge graph construction, which is provided in the second embodiment of the present invention. This embodiment may be combined with each of the alternatives in one or more embodiments described above, and in this embodiment, creating a knowledge graph corresponding to the target vocabulary according to the paper text and the high-frequency vocabulary dictionary may include: classifying paper texts according to years; determining a target phrase set corresponding to the paper texts of each year according to the word relation in each sentence in the paper texts of each year; according to the high-frequency vocabulary dictionary, each target phrase in the target phrase set corresponding to the paper text of each year is filtered; and establishing a knowledge graph corresponding to each year according to the filtered target phrase set corresponding to the paper text of each year.

Accordingly, as shown in fig. 2, the method includes the following operations:

Step 201, acquiring paper texts corresponding to the target vocabulary, and determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the paper texts.

Step 202, classifying the paper texts according to year.

The paper texts are classified according to year to obtain the paper texts of each year.

In one specific example, the total number of paper texts corresponding to the target vocabulary is 95. After classification by year, the number of paper texts is 28 for 2017, 35 for 2018, and 32 for 2019.

Step 203, determining a target phrase set corresponding to the paper text of each year according to the word relations in each sentence in the paper text of each year.

Step 204, filtering each target phrase in the target phrase set corresponding to the paper text of each year according to the high-frequency vocabulary dictionary.

Step 205, establishing a knowledge graph corresponding to each year according to the filtered target phrase set corresponding to the paper text of each year.

Step 206, determining network indexes of the nodes in the knowledge graph, and obtaining the academic hot spot trend ranking corresponding to the target vocabulary according to the network indexes.

According to this embodiment of the invention, the paper texts are classified according to year; the target phrase set corresponding to the paper text of each year is then determined according to the word relations in each sentence in the paper text of each year; each target phrase in the target phrase set corresponding to the paper text of each year is filtered according to the high-frequency vocabulary dictionary; finally, the knowledge graph corresponding to each year is established according to the filtered target phrase set corresponding to the paper text of each year. In this way, a knowledge graph can be established for each year from the paper texts of that year corresponding to the target vocabulary.

Example 3

Fig. 3 is a flowchart of an academic hotspot trend prediction method based on dynamic knowledge graph construction according to a third embodiment of the present invention. The present embodiment may be combined with each of the alternatives in the one or more embodiments, and in this embodiment, determining a network index of each node in the knowledge graph, and obtaining an academic hotspot trend ranking corresponding to the target vocabulary according to the network index may include: calculating network indexes of each node in the knowledge graph corresponding to each year by using a complex network tool; ordering the network indexes of the nodes according to the year aiming at each node, and calculating the network index change trend of the nodes according to the ordering result by using a linear regression technology; and sequencing the nodes from high to low according to the change trend of the network index to obtain the academic hotspot trend ranking corresponding to the target vocabulary.

Accordingly, as shown in fig. 3, the method includes the following operations:

Step 301, acquiring paper texts corresponding to the target vocabulary, and determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the paper texts.

Step 302, classifying the paper texts according to year.

Step 303, determining a target phrase set corresponding to the paper text of each year according to the word relations in each sentence in the paper text of each year.

Step 304, filtering each target phrase in the target phrase set corresponding to the paper text of each year according to the high-frequency vocabulary dictionary.

Step 305, establishing a knowledge graph corresponding to each year according to the filtered target phrase set corresponding to the paper text of each year.

Step 306, calculating network indexes of each node in the knowledge graph corresponding to each year by using a complex network tool.

Optionally, the network index is an evaluation index for evaluating the importance of each node in the network. The importance ranking of the nodes in the network can be obtained from the magnitude of the network index. Optionally, the network index may be degree or node centrality.

Step 307, for each node, ordering the network indexes of the nodes according to the year, and calculating the network index change trend of the nodes according to the ordering result by using a linear regression technology.

Step 308, ordering the nodes from high to low according to the network index change trend to obtain the academic hot spot trend ranking corresponding to the target vocabulary.

The nodes are ordered from high to low according to the network index change trend, and the resulting node ranking is the academic hot spot trend ranking corresponding to the target vocabulary. Nodes are words extracted from the paper text. The network index change trend represents the importance trend of each node in the next year. In the academic hot spot trend ranking, the higher a node ranks, the higher its expected importance in the next year; that is, the top-ranked nodes are the academic hot spots corresponding to the target vocabulary. The academic hot spot trend corresponding to the target vocabulary is determined according to this ranking.

According to this embodiment of the invention, the network index of each node in the knowledge graph corresponding to each year is calculated using a complex network tool; for each node, the network indexes of the node are then ordered by year, and the network index change trend of the node is calculated from the ordering result using a linear regression technique; finally, the nodes are ordered from high to low according to the network index change trend to obtain the academic hot spot trend ranking corresponding to the target vocabulary. In this way, the network index change trend of each node can be determined from the relations among the nodes in the established knowledge graphs, and the academic hot spot trend corresponding to the target vocabulary can be predicted from the network index change trend of each node.

Example 4

Fig. 4 is a schematic structural diagram of an academic hot spot trend prediction device constructed based on a dynamic knowledge graph according to a fourth embodiment of the present invention. The embodiment can be suitable for predicting academic hot spot trend. The apparatus may be implemented in software and/or hardware, and the apparatus may be configured in a computer device. As shown in fig. 4, the apparatus may include: a paper text acquisition module 401, a knowledge graph creation module 402, and a trend ranking determination module 403.

The paper text obtaining module 401 is configured to obtain a paper text corresponding to a target vocabulary, and determine a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the paper text; a knowledge graph establishing module 402, configured to establish a knowledge graph corresponding to the target vocabulary according to the paper text and the high-frequency vocabulary dictionary; the trend ranking determining module 403 is configured to determine a network index of each node in the knowledge graph, and obtain an academic hotspot trend ranking corresponding to the target vocabulary according to the network index.
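The way the three modules hand data to one another can be sketched as a small composition. This is only an illustration of the data flow between modules 401, 402 and 403: the class name `TrendPredictionDevice`, its field names and the type signatures are hypothetical, and the lambdas are stubs standing in for real module implementations.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class TrendPredictionDevice:
    # Module 401: target vocabulary -> (paper texts, high-frequency dictionary)
    acquire: Callable[[str], Tuple[List[str], Set[str]]]
    # Module 402: (paper texts, dictionary) -> knowledge graphs
    build_graphs: Callable[[List[str], Set[str]], Dict[int, dict]]
    # Module 403: knowledge graphs -> hot spot trend ranking
    rank: Callable[[Dict[int, dict]], List[str]]

    def predict(self, target_vocabulary: str) -> List[str]:
        texts, dictionary = self.acquire(target_vocabulary)
        graphs = self.build_graphs(texts, dictionary)
        return self.rank(graphs)


# Stub callables; they return fixed toy values and perform no real analysis.
device = TrendPredictionDevice(
    acquire=lambda target: ([f"abstract mentioning {target}"], {"probiotics"}),
    build_graphs=lambda texts, dictionary: {2019: {word: set() for word in dictionary}},
    rank=lambda graphs: sorted({node for g in graphs.values() for node in g}),
)
ranking = device.predict("intestinal flora")
```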

Optionally, on the basis of the above technical solution, the knowledge graph establishing module 402 may include: a text classification unit, configured to classify the paper texts according to year; a set determining unit, configured to determine a target phrase set corresponding to the paper text of each year according to the word relations in each sentence in the paper text of each year; a phrase filtering unit, configured to filter each target phrase in the target phrase set corresponding to the paper text of each year according to the high-frequency vocabulary dictionary; and a graph establishing unit, configured to establish a knowledge graph corresponding to each year according to the filtered target phrase set corresponding to the paper text of each year.

Optionally, on the basis of the above technical solution, the set determining unit may include: a year acquisition subunit, configured to sequentially acquire one year from the years as the current processing year; a set determining subunit, configured to, for each sentence in the paper text of the current processing year, extract the subject, predicate and object of the sentence according to the word relations in the sentence, combine the subject, predicate and object in subject-predicate-object order to form a target phrase, and add the target phrase to the target phrase set corresponding to the paper text of the current processing year; and an operation return subunit, configured to return to the operation of sequentially acquiring one year from the years as the current processing year until all years have been processed.

Optionally, on the basis of the above technical solution, the phrase filtering unit may include: a phrase judging subunit, configured to judge whether the subject or the object in the current processing target phrase belongs to the high-frequency vocabulary dictionary; a phrase retaining subunit, configured to retain the current processing target phrase in the target phrase set if it is determined that the subject or the object in the current processing target phrase belongs to the high-frequency vocabulary dictionary; and a phrase deleting subunit, configured to delete the current processing target phrase from the target phrase set if it is determined that neither the subject nor the object in the current processing target phrase belongs to the high-frequency vocabulary dictionary.
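
The retain-or-delete rule of the phrase filtering unit can be sketched as follows. The representation of a target phrase as a `(subject, predicate, object)` tuple, the representation of the dictionary as a Python set, and the function name `filter_phrases` are assumptions made for illustration.

```python
def filter_phrases(phrases, high_freq_dict):
    """Keep a phrase when its subject or its object is in the dictionary."""
    kept = []
    for subject, predicate, obj in phrases:
        if subject in high_freq_dict or obj in high_freq_dict:
            kept.append((subject, predicate, obj))
        # Otherwise neither end is a high-frequency word and the phrase is dropped.
    return kept


# Hypothetical sample data.
high_freq_dict = {"crispr", "therapy"}
phrases = [
    ("crispr", "targets", "cancer"),    # subject matches: kept
    ("editing", "improves", "therapy"), # object matches: kept
    ("results", "support", "claims"),   # neither matches: deleted
]
filtered = filter_phrases(phrases, high_freq_dict)
```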

Optionally, based on the above technical solution, the graph establishing unit may include: a year acquisition subunit, configured to sequentially acquire one year from the years as the current processing year; a phrase processing subunit, configured to, for each target phrase in the filtered target phrase set corresponding to the paper texts of the current processing year, take the subject and the object in the target phrase as nodes and establish a connecting line between the subject and the object; a connecting line aggregation subunit, configured to aggregate all the established connecting lines to obtain the knowledge graph corresponding to the current processing year; and an operation return subunit, configured to return to the operation of sequentially acquiring one year from the years as the current processing year until all years have been processed.
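
The per-year graph establishment can be sketched as follows. The sketch stores each graph as an adjacency mapping from a node to the set of its neighbours and treats connecting lines as undirected; both choices, as well as the function names, are assumptions, since the embodiment does not fix a data structure or edge direction.

```python
from collections import defaultdict


def build_graph(phrases):
    """Build an undirected graph from (subject, predicate, object) phrases."""
    adjacency = defaultdict(set)
    for subject, _predicate, obj in phrases:
        # Subject and object become nodes joined by one connecting line.
        adjacency[subject].add(obj)
        adjacency[obj].add(subject)
    return dict(adjacency)


def build_graphs_by_year(phrases_by_year):
    """Build one knowledge graph per year, processing years in order."""
    return {year: build_graph(phrases_by_year[year])
            for year in sorted(phrases_by_year)}


# Hypothetical filtered phrase sets.
phrases_by_year = {
    2018: [("crispr", "improves", "therapy")],
    2019: [("crispr", "improves", "therapy"), ("crispr", "targets", "cancer")],
}
graphs = build_graphs_by_year(phrases_by_year)
```

Using sets for the neighbours means that a repeated subject-object pair yields a single connecting line; a weighted graph would be an alternative if repetition counts matter.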

Optionally, based on the above technical solution, the trend ranking determining module 403 may include: an index calculation unit, configured to calculate the network index of each node in the knowledge graph corresponding to each year by using a complex network tool; a change trend calculation unit, configured to, for each node, order the network indexes of the node by year and calculate the change trend of the node's network index from the ordered result by using a linear regression technique; and a node ordering unit, configured to order the nodes from high to low according to the change trend of the network index, to obtain the academic hot spot trend ranking corresponding to the target vocabulary.
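
The index calculation, trend calculation and node ordering can be sketched as follows. Several assumptions are made here for illustration: node degree stands in for the network index, a node absent from a year's graph is given an index of 0 for that year, the change trend is taken to be the least-squares slope of the index against the year, and all function names are hypothetical. The embodiment itself only requires a complex network tool and a linear regression technique.

```python
def degree_index(adjacency):
    """Network index of each node, here simply its degree (an assumption)."""
    return {node: len(neighbors) for node, neighbors in adjacency.items()}


def slope(xs, ys):
    """Least-squares slope of ys against xs; 0.0 if xs has no spread."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def rank_trends(graphs_by_year):
    """Return (node, trend) pairs ordered from highest to lowest trend."""
    years = sorted(graphs_by_year)
    indexes = {year: degree_index(graphs_by_year[year]) for year in years}
    nodes = set().union(*(idx.keys() for idx in indexes.values()))
    trends = {
        # A node missing from a year's graph counts as index 0 (assumption).
        node: slope(years, [indexes[year].get(node, 0) for year in years])
        for node in nodes
    }
    return sorted(trends.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical per-year graphs as adjacency mappings.
graphs_by_year = {
    2018: {"crispr": {"therapy"}, "therapy": {"crispr"}},
    2019: {"crispr": {"therapy", "cancer"}, "therapy": {"crispr"},
           "cancer": {"crispr"}},
    2020: {"crispr": {"therapy", "cancer", "delivery"}, "therapy": {"crispr"},
           "cancer": {"crispr"}, "delivery": {"crispr"}},
}
ranking = rank_trends(graphs_by_year)
```

In this sample the degree of "crispr" grows by one each year, so it ranks first, while "therapy" keeps a constant degree and ranks last. Other indexes, such as betweenness or closeness centrality, could be substituted for `degree_index` without changing the rest of the sketch.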

Optionally, based on the above technical solution, the paper text obtaining module 401 may include: the vocabulary acquisition unit is used for acquiring a high-frequency vocabulary corresponding to the target vocabulary according to the paper text; and a dictionary generating unit for determining a high-frequency vocabulary dictionary corresponding to the target vocabulary based on the high-frequency vocabulary.
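
One possible way to derive the high-frequency vocabulary dictionary from the paper texts is sketched below. The regular-expression tokenisation, the small stop-word list, the `top_n` cut-off and the function name are all assumptions made for illustration; the embodiment does not specify how word frequency is measured or how many words are kept.

```python
import re
from collections import Counter


def build_high_freq_dict(texts, top_n=3,
                         stopwords=frozenset({"the", "a", "of", "and", "in"})):
    """Return the `top_n` most frequent non-stop-words as a set.

    The tokenisation, stop-word list and cut-off are illustrative assumptions.
    """
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in stopwords:
                counts[word] += 1
    return {word for word, _count in counts.most_common(top_n)}


# Hypothetical paper texts retrieved for a target word.
texts = [
    "CRISPR editing of the genome",
    "CRISPR therapy and genome editing",
    "Delivery of CRISPR",
]
high_freq = build_high_freq_dict(texts)
```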

The academic hot spot trend prediction device based on dynamic knowledge graph construction can execute the academic hot spot trend prediction method based on dynamic knowledge graph construction provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of executing the academic hot spot trend prediction method based on dynamic knowledge graph construction.

Example five

Fig. 5 is a schematic structural diagram of a computer device according to a fifth embodiment of the present invention. Fig. 5 illustrates a block diagram of an exemplary computer device 12 suitable for use in implementing embodiments of the present invention. The computer device 12 shown in fig. 5 is merely an example and should not be construed as limiting the functionality and scope of use of embodiments of the present invention.

As shown in fig. 5, the computer device 12 is in the form of a general purpose computer device. Components of computer device 12 may include, but are not limited to: one or more processors 16, a memory 28, and a bus 18 that connects the various system components, including the memory 28 and the processor 16. The processor 16 includes, but is not limited to, an AI processor.

Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.

Computer device 12 typically includes a variety of computer system readable media. Such media can be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.

Memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. The computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from or write to non-removable, nonvolatile magnetic media (not shown in FIG. 5, commonly referred to as a "hard disk drive"). Although not shown in fig. 5, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In such cases, each drive may be coupled to bus 18 through one or more data medium interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of embodiments of the invention.

A program/utility 40 having a set (at least one) of program modules 42 may be stored in, for example, memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Program modules 42 generally perform the functions and/or methods of the embodiments described herein.

The computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), one or more devices that enable a user to interact with the computer device 12, and/or any devices (e.g., network card, modem, etc.) that enable the computer device 12 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 22. Moreover, computer device 12 may also communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, through network adapter 20. As shown, network adapter 20 communicates with other modules of computer device 12 via bus 18. It should be appreciated that although not shown in fig. 5, other hardware and/or software modules may be used in connection with computer device 12, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.

The processor 16 of the computer device 12 executes programs stored in the memory 28 to perform various functional applications and data processing, such as implementing the academic hotspot trend prediction method based on dynamic knowledge-graph construction provided by embodiments of the present invention. The method specifically comprises the following steps: acquiring paper texts corresponding to the target vocabulary, and determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the paper texts; establishing a knowledge graph corresponding to the target vocabulary according to the paper text and the high-frequency vocabulary dictionary; and determining network indexes of all nodes in the knowledge graph, and obtaining academic hot spot trend ranking corresponding to the target vocabulary according to the network indexes.

Example six

The sixth embodiment of the invention also provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the academic hotspot trend prediction method based on dynamic knowledge graph construction provided by the embodiment of the invention. The method specifically comprises the following steps: acquiring paper texts corresponding to the target vocabulary, and determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the paper texts; establishing a knowledge graph corresponding to the target vocabulary according to the paper text and the high-frequency vocabulary dictionary; and determining network indexes of all nodes in the knowledge graph, and obtaining academic hot spot trend ranking corresponding to the target vocabulary according to the network indexes.

The computer storage media of embodiments of the invention may take the form of any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

The computer program code for carrying out operations of the present invention may be written in one or more programming languages, including object oriented programming languages such as Java, Smalltalk, C++, Ruby and Go, conventional procedural programming languages such as the "C" programming language or similar programming languages, and computer languages for AI algorithms. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

Note that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the invention, which is set forth in the following claims.

Claims

1. The academic hot spot trend prediction method based on dynamic knowledge graph construction is characterized by comprising the following steps of:

acquiring paper texts corresponding to target words, and determining a high-frequency word dictionary corresponding to the target words according to the paper texts;

determining network indexes of all nodes in the knowledge graph, and obtaining academic hot spot trend ranking corresponding to the target vocabulary according to the network indexes;

establishing a knowledge graph corresponding to the target vocabulary according to the paper text and the high-frequency vocabulary dictionary, comprising: classifying the paper texts by year; determining a target phrase set corresponding to the paper texts of each year according to the word relation in each sentence in the paper texts of each year; filtering each target phrase in the target phrase set corresponding to the paper texts of each year according to the high-frequency vocabulary dictionary; establishing a knowledge graph corresponding to each year according to the filtered target phrase set corresponding to the paper texts of each year;

determining a target phrase set corresponding to the paper texts of each year according to the word relation in each sentence in the paper texts of each year, comprising: sequentially obtaining one year from the years as the current processing year; for each sentence in the paper texts of the current processing year, extracting the subject, predicate and object in the sentence according to the word relations in the sentence, combining the subject, the predicate and the object in subject-predicate-object order to form a target phrase, and adding the target phrase into the target phrase set corresponding to the paper texts of the current processing year; returning to execute the operation of sequentially obtaining one year from the years as the current processing year until the processing of all the years is completed;

According to the high-frequency vocabulary dictionary, filtering each target phrase in the target phrase set corresponding to the paper text of each year, including: judging whether a subject or object in the current processing target phrase belongs to the high-frequency vocabulary dictionary; if the subjects or objects in the current processing target phrase are determined to belong to the high-frequency vocabulary dictionary, reserving the current processing target phrase in the target phrase set; deleting the current processing target phrase in the target phrase set if determining that neither subject nor object in the current processing target phrase belongs to the high-frequency vocabulary dictionary;

establishing a knowledge graph corresponding to each year according to the filtered target phrase set corresponding to the paper texts of each year, comprising: sequentially obtaining one year from the years as the current processing year; for each target phrase in the filtered target phrase set corresponding to the paper texts of the current processing year, taking the subject and the object in the target phrase as nodes and establishing a connecting line between the subject and the object; aggregating all the established connecting lines to obtain a knowledge graph corresponding to the current processing year; returning to execute the operation of sequentially obtaining one year from the years as the current processing year until the processing of all the years is completed;

Determining network indexes of all nodes in the knowledge graph, and obtaining academic hot spot trend ranking corresponding to the target vocabulary according to the network indexes, wherein the method comprises the following steps: calculating network indexes of each node in the knowledge graph corresponding to each year by using a complex network tool; ordering the network indexes of the nodes according to the year aiming at each node, and calculating the network index change trend of the nodes according to the ordering result by using a linear regression technology; and sequencing all the nodes from high to low according to the change trend of the network index to obtain the academic hot spot trend ranking corresponding to the target vocabulary.

2. The method of claim 1, wherein determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the paper text comprises:

according to the paper text, obtaining a high-frequency vocabulary corresponding to the target vocabulary;

and determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the high-frequency vocabulary.

3. An academic hot spot trend prediction device based on dynamic knowledge graph construction, characterized by comprising:

the paper text acquisition module is used for acquiring paper texts corresponding to target words and determining a high-frequency word dictionary corresponding to the target words according to the paper texts;

the trend ranking determining module is used for determining network indexes of all nodes in the knowledge graph and obtaining academic hot spot trend ranking corresponding to the target vocabulary according to the network indexes;

the knowledge graph establishing module comprises: a text classification unit, configured to classify the paper texts by year; a set determining unit, configured to determine a target phrase set corresponding to the paper texts of each year according to the word relations in each sentence in the paper texts of each year; a phrase filtering unit, configured to filter each target phrase in the target phrase set corresponding to the paper texts of each year according to the high-frequency vocabulary dictionary; a graph establishing unit, configured to establish a knowledge graph corresponding to each year according to the filtered target phrase set corresponding to the paper texts of each year;

the set determining unit includes: a year acquisition subunit, configured to sequentially acquire one year from the years as the current processing year; a set determining subunit, configured to, for each sentence in the paper texts of the current processing year, extract the subject, predicate and object in the sentence according to the word relations in the sentence, combine the subject, the predicate and the object in subject-predicate-object order to form a target phrase, and add the target phrase into the target phrase set corresponding to the paper texts of the current processing year; an operation return subunit, configured to return to execute the operation of sequentially acquiring one year from the years as the current processing year until the processing of all the years is completed;

The phrase filtering unit includes: a phrase judging subunit for judging whether the subject or object in the current processing target phrase belongs to a high-frequency vocabulary dictionary; a phrase retaining subunit, configured to retain, in the target phrase set, the current processing target phrase if it is determined that the subject or object in the current processing target phrase belongs to the high-frequency vocabulary dictionary; a phrase deleting subunit, configured to delete, in the target phrase set, the current processing target phrase if it is determined that neither the subject nor the object in the current processing target phrase belongs to the high-frequency vocabulary dictionary;

the graph establishing unit includes: a year acquisition subunit, configured to sequentially acquire one year from the years as the current processing year; a phrase processing subunit, configured to, for each target phrase in the filtered target phrase set corresponding to the paper texts of the current processing year, take the subject and the object in the target phrase as nodes and establish a connecting line between the subject and the object; a connecting line aggregation subunit, configured to aggregate all the established connecting lines to obtain a knowledge graph corresponding to the current processing year; an operation return subunit, configured to return to execute the operation of sequentially acquiring one year from the years as the current processing year until the processing of all the years is completed;

The trend ranking determination module includes: an index calculation unit for calculating a network index of each node in the knowledge graph corresponding to each year using a complex network tool; the change trend calculation unit is used for sequencing the network indexes of the nodes according to the year aiming at each node, and calculating the change trend of the network indexes of the nodes according to the sequencing result by using a linear regression technology; and the node ordering unit is used for ordering the nodes from high to low according to the change trend of the network index to obtain academic hot spot trend ranking corresponding to the target vocabulary.

4. A computer device, the device comprising:

one or more processors;

storage means for storing one or more programs,

the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the academic hotspot trend prediction method based on dynamic knowledge graph construction as claimed in any one of claims 1-2.

5. A computer storage medium having stored thereon a computer program, which when executed by a processor implements the academic hotspot trend prediction method based on dynamic knowledge graph construction as claimed in any one of claims 1-2.