CN111563172A

CN111563172A - Academic hotspot trend prediction method and device based on dynamic knowledge graph construction

Info

Publication number: CN111563172A
Application number: CN202010376472.2A
Authority: CN
Inventors: 高军晖; 谭润东; 张心觉; 江荣峰; 龚建兵; 楼敬伟
Original assignee: Shanghai Biotecan Medical Diagnostics Co ltd; Shanghai Zhangjiang Medical Innovation Research Institute; Shanghai Biotecan Biology Medicine Technology Co ltd
Current assignee: Shanghai Biotecan Medical Diagnostics Co ltd; Shanghai Zhangjiang Medical Innovation Research Institute; Shanghai Biotecan Biology Medicine Technology Co ltd
Priority date: 2020-05-07
Filing date: 2020-05-07
Publication date: 2020-08-21
Anticipated expiration: 2040-05-07
Also published as: CN111563172B

Abstract

The invention discloses an academic hotspot trend prediction method and device based on dynamic knowledge graph construction. The method comprises the following steps: acquiring a thesis text corresponding to a target vocabulary, and determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the thesis text; establishing a knowledge graph corresponding to the target vocabulary according to the thesis text and the high-frequency vocabulary dictionary; and determining network indexes of all nodes in the knowledge graph, and obtaining academic hotspot trend ranking corresponding to the target vocabulary according to the network indexes. The embodiment of the invention can establish the knowledge graph according to the thesis text corresponding to the target vocabulary, can determine the network index of each node according to the relationship among the nodes in the established knowledge graph, and can predict the academic hotspot trend corresponding to the target vocabulary according to the network index of each node.

Description

Academic hotspot trend prediction method and device based on dynamic knowledge graph construction

Technical Field

The embodiment of the invention relates to the technical field of computers, in particular to an academic hotspot trend prediction method and device based on dynamic knowledge graph construction.

Background

How to better analyze and understand academic hotspot trends is a problem faced by researchers. Accurate prediction of academic hotspot trends can help researchers quickly and efficiently understand the current state of research in their field.

In the prior art, an academic hotspot trend prediction method constructed based on a dynamic knowledge graph is based on the word frequency of a professional vocabulary. The academic hotspot tendency of a certain field is determined by using the frequency of occurrence of professional vocabularies of the field in the literature of the field.

In the process of implementing the invention, the inventor finds that the prior art has the following defects: the existing academic hotspot trend prediction method constructed based on the dynamic knowledge graph is based on the word frequency of professional vocabularies, does not consider the relationship among the professional vocabularies, and can be influenced by the number of articles in the literature.

Disclosure of Invention

The invention provides an academic hotspot trend prediction method and device based on dynamic knowledge graph construction, which are used for predicting academic hotspot trends according to relations among vocabularies in a thesis text.

In a first aspect, an embodiment of the present invention provides an academic hotspot trend prediction method constructed based on a dynamic knowledge graph, including:

acquiring a thesis text corresponding to a target vocabulary, and determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the thesis text;

establishing a knowledge graph corresponding to the target vocabulary according to the thesis text and the high-frequency vocabulary dictionary;

and determining network indexes of all nodes in the knowledge graph, and obtaining academic hotspot trend ranking corresponding to the target vocabulary according to the network indexes.

In a second aspect, an embodiment of the present invention further provides an academic hotspot trend prediction apparatus constructed based on a dynamic knowledge graph, including:

a thesis text acquisition module, configured to acquire a thesis text corresponding to a target vocabulary and to determine the high-frequency vocabulary dictionary corresponding to the target vocabulary according to the thesis text;

a knowledge graph establishing module, configured to establish a knowledge graph corresponding to the target vocabulary according to the thesis text and the high-frequency vocabulary dictionary;

and a trend ranking determining module, configured to determine network indexes of all nodes in the knowledge graph and to obtain an academic hotspot trend ranking corresponding to the target vocabulary according to the network indexes.

In a third aspect, an embodiment of the present invention further provides a computer device, where the computer device includes:

one or more processors;

storage means for storing one or more programs;

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the academic hotspot trend prediction method based on dynamic knowledge graph construction provided by any embodiment of the invention.

In a fourth aspect, an embodiment of the present invention further provides a computer storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the academic hotspot trend prediction method based on dynamic knowledge graph construction, provided by any embodiment of the present invention.

The embodiment of the invention determines the high-frequency vocabulary dictionary corresponding to the target vocabulary according to the thesis text corresponding to the target vocabulary, establishes the knowledge graph corresponding to the target vocabulary according to the thesis text and the high-frequency vocabulary dictionary, and finally determines the network index of each node in the knowledge graph and obtains the academic hotspot trend ranking corresponding to the target vocabulary according to the network index. In this way, the knowledge graph is established from the thesis text corresponding to the target vocabulary, the network index of each node is determined from the relationships among the nodes in the established knowledge graph, and the academic hotspot trend corresponding to the target vocabulary is predicted from the network index of each node.

Drawings

Fig. 1 is a flowchart of an academic hotspot trend prediction method constructed based on a dynamic knowledge graph according to an embodiment of the present invention;

fig. 2 is a flowchart of an academic hotspot trend prediction method constructed based on a dynamic knowledge graph according to a second embodiment of the present invention;

fig. 3 is a flowchart of an academic hotspot trend prediction method constructed based on a dynamic knowledge graph according to a third embodiment of the present invention;

fig. 4 is a schematic structural diagram of an academic hotspot trend prediction device constructed based on a dynamic knowledge graph according to a fourth embodiment of the present invention;

fig. 5 is a schematic structural diagram of a computer device according to a fifth embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention.

It should be further noted that, for the convenience of description, only some but not all of the relevant aspects of the present invention are shown in the drawings. Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently or simultaneously. In addition, the order of the operations may be re-arranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.

Example one

Fig. 1 is a flowchart of an academic hotspot trend prediction method constructed based on a dynamic knowledge graph according to an embodiment of the present invention, where the embodiment is applicable to a case of predicting an academic hotspot trend, and the method may be executed by an academic hotspot trend prediction apparatus constructed based on a dynamic knowledge graph, and the apparatus may be implemented by software and/or hardware, and may be generally integrated in a computer device. Accordingly, as shown in fig. 1, the method comprises the following operations:

step 101, obtaining a thesis text corresponding to a target vocabulary, and determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the thesis text.

Optionally, the target vocabulary is a professional vocabulary for which academic hotspot trend prediction is needed. Illustratively, the target vocabulary is "intestinal flora".

Optionally, the target vocabulary may be a professional vocabulary input by the user. The user inputs a target vocabulary and requests the academic hotspot trend corresponding to the target vocabulary.

Optionally, the paper text may be a paper abstract text or a paper body text. The paper abstract text is the abstract text of the paper. The paper body text is the body text of the paper.

In one embodiment, obtaining a paper text corresponding to the target vocabulary may include: retrieving papers associated with the target vocabulary on at least one paper retrieval platform; and downloading the abstract text of each retrieved paper associated with the target vocabulary, and taking the downloaded abstract text as the paper text corresponding to the target vocabulary.

In another embodiment, obtaining a paper text corresponding to the target vocabulary may include: retrieving papers associated with the target vocabulary on at least one paper retrieval platform; and downloading the body text of each retrieved paper associated with the target vocabulary, and taking the downloaded body text as the paper text corresponding to the target vocabulary.

Optionally, the paper retrieval platform may include the Wanfang Data Knowledge Service Platform and CNKI (China National Knowledge Infrastructure).

Optionally, determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the thesis text may include: acquiring a high-frequency vocabulary corresponding to the target vocabulary according to the thesis text; and determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the high-frequency vocabulary.

A high-frequency vocabulary is a vocabulary whose number of co-occurrences with the target vocabulary is greater than or equal to a preset count threshold. The preset count threshold can be set according to service requirements. Illustratively, the preset count threshold is 50. The high-frequency vocabulary dictionary is a list that stores the high-frequency vocabularies corresponding to the target vocabulary.

Optionally, obtaining a high-frequency vocabulary corresponding to the target vocabulary according to the thesis text includes: analyzing the thesis text to obtain the vocabularies that appear together with the target vocabulary in each text; for each obtained vocabulary, counting the number of times the vocabulary co-occurs with the target vocabulary; and determining each vocabulary whose count is greater than or equal to the preset count threshold as a high-frequency vocabulary corresponding to the target vocabulary.

Optionally, determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the high-frequency vocabulary includes: and adding the high-frequency vocabulary corresponding to the target vocabulary into the high-frequency vocabulary dictionary corresponding to the target vocabulary.
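The dictionary construction described in step 101 can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name, the whitespace tokenization, the single-token target, and counting at most one co-occurrence per text are all choices made for the example and are not specified by the embodiment.

```python
from collections import Counter


def build_high_frequency_dictionary(texts, target, threshold=50):
    """Return the set of words that co-occur with `target` in at least
    `threshold` texts (hypothetical helper, not part of the embodiment)."""
    counts = Counter()
    for text in texts:
        # Naive whitespace tokenization; a real system would use a tokenizer
        # suited to the language of the papers (assumption).
        words = set(text.lower().split())
        if target in words:
            # Count each co-occurring word once per text (assumption).
            counts.update(words - {target})
    return {word for word, n in counts.items() if n >= threshold}


# Toy corpus with a lowered threshold so the effect is visible.
texts = [
    "probiotics regulate flora",
    "flora affect immunity",
    "probiotics improve flora balance",
    "diet changes metabolism",
]
high_freq = build_high_frequency_dictionary(texts, "flora", threshold=2)
print(high_freq)
```

Only "probiotics" co-occurs with the target in two texts, so it is the only word kept at this threshold.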

Step 102, establishing a knowledge graph corresponding to the target vocabulary according to the thesis text and the high-frequency vocabulary dictionary.

Optionally, establishing a knowledge graph corresponding to the target vocabulary according to the thesis text and the high-frequency vocabulary dictionary may include: classifying the thesis texts according to the year; determining a target phrase set corresponding to the thesis text of each year according to the word relation in each sentence in the thesis text of each year; filtering each target phrase in the target phrase set corresponding to the thesis text of each year according to the high-frequency vocabulary dictionary; and establishing a knowledge graph corresponding to each year according to the filtered target phrase set corresponding to the thesis text of each year.

The thesis texts are classified by year to obtain the thesis texts of each year. In one specific example, the total number of paper texts corresponding to the target vocabulary is 95: 28 from 2017, 35 from 2018, and 32 from 2019.

Optionally, determining the target phrase set corresponding to the thesis text of each year according to the word relationship in each sentence in the thesis text of each year may include: sequentially acquiring a year from the years as the current processing year; for each sentence in the thesis text of the current processing year, extracting the subject, the predicate, and the object of the sentence according to the word relationship in the sentence, combining them in subject-predicate-object order to form a target phrase, and adding the target phrase to the target phrase set corresponding to the thesis text of the current processing year; and returning to the operation of sequentially acquiring a year as the current processing year until all years have been processed.

Optionally, the extraction of the subject, the predicate, and the object from each sentence, and their combination into a target phrase, is performed using Natural Language Processing (NLP) technology.

In one specific example, the sentence is "probiotics kill bacteria that cause diarrhea". Using NLP technology, the subject "probiotics", the predicate "kill", and the object "bacteria" are extracted from the sentence according to the word relations in the sentence. The subject, the predicate, and the object are combined in subject-predicate-object order to form the target phrase "probiotics kill bacteria", which is added to the target phrase set corresponding to the paper text of the current processing year.
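As a rough illustration of the extraction step, the sketch below assumes that a dependency parser has already labeled each word of the sentence. The label names ("nsubj", "ROOT", "dobj", "obj") follow common dependency-parser conventions and, like the function name and the first-match rule, are assumptions made for the example; the embodiment does not name a particular NLP tool or label set.

```python
def extract_target_phrase(parsed_sentence):
    """Build a (subject, predicate, object) triple from a pre-parsed sentence.

    `parsed_sentence` is a list of (word, dependency_label) pairs, assumed to
    come from an external dependency parser (hypothetical input format).
    Only the first subject, root verb, and direct object are used, which is
    a simplification: a full implementation would follow head relations.
    """
    subject = predicate = obj = None
    for word, dep in parsed_sentence:
        if dep == "nsubj" and subject is None:
            subject = word
        elif dep == "ROOT" and predicate is None:
            predicate = word
        elif dep in ("dobj", "obj") and obj is None:
            obj = word
    if subject is None or predicate is None or obj is None:
        return None  # sentence has no complete subject-predicate-object triple
    return (subject, predicate, obj)


# Hand-labeled parse of "probiotics kill bacteria that cause diarrhea".
parsed = [
    ("probiotics", "nsubj"),
    ("kill", "ROOT"),
    ("bacteria", "dobj"),
    ("that", "nsubj"),
    ("cause", "relcl"),
    ("diarrhea", "dobj"),
]
phrase = extract_target_phrase(parsed)
print(phrase)
```

The words of the relative clause are skipped because a subject and an object have already been found, so the triple matches the example in the text.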

Optionally, filtering each target phrase in the target phrase set corresponding to the thesis text of each year according to the high-frequency vocabulary dictionary may include: judging whether the subject or the object in the current processing target phrase belongs to a high-frequency vocabulary dictionary or not; if the subject or the object in the current processing target phrase is determined to belong to the high-frequency vocabulary dictionary, keeping the current processing target phrase in the target phrase set; if it is determined that neither the subject nor the object in the current processing target phrase belongs to the high-frequency vocabulary dictionary, the current processing target phrase is deleted in the target phrase set.

After each target phrase in the target phrase set corresponding to the thesis text of each year is filtered according to the high-frequency vocabulary dictionary, only the target phrases whose subject or object belongs to the high-frequency vocabulary dictionary corresponding to the target vocabulary are kept.
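The filtering rule can be written in a few lines. The sketch below assumes target phrases are stored as (subject, predicate, object) tuples and the dictionary as a set; both representations and the function name are illustrative choices, not part of the embodiment.

```python
def filter_target_phrases(phrases, high_freq_dictionary):
    """Keep a phrase if its subject OR its object is in the dictionary;
    drop it only when neither is (hypothetical helper)."""
    return [
        (subject, predicate, obj)
        for subject, predicate, obj in phrases
        if subject in high_freq_dictionary or obj in high_freq_dictionary
    ]


phrases = [
    ("probiotics", "kill", "bacteria"),   # subject is in the dictionary
    ("diet", "alters", "flora"),          # object is in the dictionary
    ("patients", "received", "placebo"),  # neither is in the dictionary
]
kept = filter_target_phrases(phrases, {"probiotics", "flora"})
print(kept)
```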

Optionally, establishing a knowledge graph corresponding to each year according to the filtered target phrase set corresponding to the thesis text of each year may include: sequentially acquiring a year from the years as the current processing year; for each target phrase in the filtered target phrase set corresponding to the thesis text of the current processing year, taking the subject and the object in the target phrase as nodes and establishing a connection line between the subject and the object; merging all established connection lines to obtain the knowledge graph corresponding to the current processing year; and returning to the operation of sequentially acquiring a year as the current processing year until all years have been processed. A knowledge graph corresponding to each year is thereby obtained, with one knowledge graph per year.
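A per-year graph of this kind can be sketched with plain adjacency sets. The example assumes undirected connection lines and ignores the predicate when linking nodes; the function name and data layout are hypothetical and chosen only to keep the sketch dependency-free.

```python
from collections import defaultdict


def build_yearly_graphs(phrases_by_year):
    """Map each year to an adjacency dict {node: set of neighbouring nodes}.

    `phrases_by_year` maps a year to its filtered (subject, predicate, object)
    triples. Edges are treated as undirected (assumption).
    """
    graphs = {}
    for year in sorted(phrases_by_year):
        adjacency = defaultdict(set)
        for subject, _predicate, obj in phrases_by_year[year]:
            # One connection line between subject and object.
            adjacency[subject].add(obj)
            adjacency[obj].add(subject)
        graphs[year] = dict(adjacency)
    return graphs


phrases_by_year = {
    2017: [("probiotics", "regulate", "flora")],
    2018: [
        ("probiotics", "regulate", "flora"),
        ("flora", "affect", "immunity"),
    ],
}
graphs = build_yearly_graphs(phrases_by_year)
print(graphs[2018]["flora"])
```

Because each year gets its own adjacency dict, the same word can have different neighbours, and therefore different network indexes, from year to year.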

Step 103, determining network indexes of all nodes in the knowledge graph, and obtaining academic hotspot trend ranking corresponding to the target vocabulary according to the network indexes.

Optionally, determining a network index of each node in the knowledge graph, and obtaining an academic hotspot trend ranking corresponding to the target vocabulary according to the network index may include: calculating network indexes of each node in the knowledge graph corresponding to each year by using a complex network tool; for each node, the network indexes of the nodes are sorted according to the year, and the network index change trend of the nodes is calculated according to the sorting result by using a linear regression technology; and sequencing all the nodes according to the network index variation trend from high to low to obtain an academic hotspot trend ranking corresponding to the target vocabulary.

The network index is an evaluation index for evaluating the importance of each node in the network. The importance ranking of the nodes in the network can be obtained according to the size of the network index. Optionally, the network index may be degree or node centrality.

The degree of a node refers to the number of links associated with the node. The higher the degree of a node, the higher the importance of the node.

Optionally, the node centrality may include betweenness centrality and degree centrality. The betweenness centrality of a node is the ratio of the number of shortest paths in the network that pass through the node to the total number of shortest paths in the network. The higher the betweenness centrality of a node, the higher the importance of the node. The degree centrality is the ratio of the degree of a node in the network to the sum of the degrees of all nodes in the network. Degree centrality reflects the number of nodes directly connected to the node; a node with a higher degree centrality is likely to be located at the center of the network and to have higher authority. The higher the degree centrality of a node, the higher the importance of the node.
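The degree and degree centrality definitions above can be illustrated directly. The sketch below follows the text's definition (degree divided by the sum of all degrees) on an adjacency dict; the function names and graph representation are assumptions for the example. Note that NetworkX's `degree_centrality` normalizes by n - 1 (the maximum possible degree) instead, so its values would differ from these by a constant factor within one graph.

```python
def node_degrees(adjacency):
    """Degree of each node: the number of links attached to it."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


def degree_centrality(adjacency):
    """Degree of each node divided by the sum of all degrees,
    as defined in the text (not the NetworkX normalization)."""
    degrees = node_degrees(adjacency)
    total = sum(degrees.values())
    if total == 0:
        return {node: 0.0 for node in degrees}
    return {node: d / total for node, d in degrees.items()}


# Star-shaped toy graph: "flora" is linked to three other words.
adjacency = {
    "flora": {"probiotics", "immunity", "diet"},
    "probiotics": {"flora"},
    "immunity": {"flora"},
    "diet": {"flora"},
}
degrees = node_degrees(adjacency)
centrality = degree_centrality(adjacency)
print(degrees["flora"], centrality["flora"])
```

The central node holds three of the six degree counts, so it receives half of the total degree centrality.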

A complex network tool is a software tool for analyzing a complex network. Illustratively, a complex network tool, NetworkX, is used to analyze the knowledge graph network corresponding to each year respectively, and calculate the network index of each node in the knowledge graph corresponding to each year.

Optionally, for each node, the network indexes of the nodes are sorted according to the year, and a linear regression technique is used to calculate the network index variation trend of the node according to the sorting result, which may include: acquiring network indexes of current processing nodes in each year from the calculated network indexes of each node in the knowledge graph corresponding to each year; sorting the network indexes of the current processing node according to the year, namely listing the network indexes of the current processing node according to the year; and determining the quantitative relation between the year and the network index of the current processing node according to the sequencing result by using a linear regression technology, and calculating the network index change trend of the current processing node according to the quantitative relation, namely calculating the predicted value of the network index of the current processing node in the next year according to the quantitative relation.

The network index variation trend is the predicted value of the network index for the next year. In one specific example, the network indexes of the current processing node are listed by year: 2017, 9; 2018, 10; 2019, 16. Using a linear regression technique, the quantitative relation between the year and the network index of the current processing node is determined from this list, and the predicted value of the node's network index in 2020 is calculated from that relation.
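The example values can be worked through with an ordinary least-squares line. The embodiment only says "linear regression", so fitting a straight line by least squares and extrapolating one year ahead is an assumption here, as is the function name.

```python
def predict_next_year(index_by_year):
    """Fit index = slope * year + intercept by ordinary least squares and
    return (next_year, slope, predicted_index) (hypothetical helper)."""
    years = sorted(index_by_year)
    if len(years) < 2:
        raise ValueError("at least two years are needed to fit a line")
    values = [index_by_year[year] for year in years]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    next_year = years[-1] + 1
    # Evaluate the fitted line relative to the mean year to limit rounding error.
    predicted = mean_y + slope * (next_year - mean_x)
    return next_year, slope, predicted


next_year, slope, predicted = predict_next_year({2017: 9, 2018: 10, 2019: 16})
print(next_year, slope, round(predicted, 2))
```

For these three points the fitted slope is 3.5 per year and the mean index is 35/3, so the value extrapolated to 2020 is 35/3 + 3.5 × 2, or about 18.67.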

All nodes are sorted from high to low according to the network index variation trend, and the resulting node ranking is the academic hotspot trend ranking corresponding to the target vocabulary. The nodes are words extracted from the paper text. The network index variation trend represents the expected importance of each node in the next year. In the academic hotspot trend ranking, a higher-ranked node is expected to be more important in the next year, that is, the node is an academic hotspot corresponding to the target vocabulary. The academic hotspot trend corresponding to the target vocabulary can therefore be determined from the academic hotspot trend ranking.
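The final ranking step amounts to a descending sort on the predicted values. In the sketch below the predicted values are made-up numbers used only to show the ordering, and the function name is hypothetical.

```python
def rank_hotspots(predicted_index):
    """Return node names ordered from highest to lowest predicted index."""
    return sorted(predicted_index, key=lambda node: predicted_index[node], reverse=True)


# Illustrative predicted next-year indexes (not data from the embodiment).
predicted_index = {"probiotics": 18.7, "immunity": 22.0, "diet": 5.5}
ranking = rank_hotspots(predicted_index)
print(ranking)
```

Python's sort is stable, so nodes with equal predicted values keep their original relative order; a production system might add an explicit tie-breaking rule.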

Example two

Fig. 2 is a flowchart of an academic hotspot trend prediction method constructed based on a dynamic knowledge graph according to a second embodiment of the present invention. In this embodiment, establishing a knowledge graph corresponding to the target vocabulary according to the thesis text and the high-frequency vocabulary dictionary may include: classifying the thesis texts according to the year; determining a target phrase set corresponding to the thesis text of each year according to the word relation in each sentence in the thesis text of each year; filtering each target phrase in the target phrase set corresponding to the thesis text of each year according to the high-frequency vocabulary dictionary; and establishing a knowledge graph corresponding to each year according to the filtered target phrase set corresponding to the thesis text of each year.

Accordingly, as shown in fig. 2, the method includes the following operations:

step 201, obtaining a thesis text corresponding to the target vocabulary, and determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the thesis text.

Step 202, classifying the paper texts according to the year.

The paper texts are classified according to the year to obtain the paper texts of each year.

In one specific example, the total number of paper texts corresponding to the target vocabulary is 95: 28 from 2017, 35 from 2018, and 32 from 2019.

Step 203, determining a target phrase set corresponding to the thesis text of each year according to the word relation in each sentence in the thesis text of each year.

Step 204, filtering each target phrase in the target phrase set corresponding to the thesis text of each year according to the high-frequency vocabulary dictionary.

Step 205, establishing a knowledge graph corresponding to each year according to the filtered target phrase set corresponding to the thesis text of each year.

Step 206, determining network indexes of all nodes in the knowledge graph, and obtaining academic hotspot trend ranking corresponding to the target vocabulary according to the network indexes.

The embodiment of the invention classifies the thesis texts by year, determines the target phrase set corresponding to the thesis texts of each year according to the word relation in each sentence in the thesis texts of each year, filters each target phrase in the target phrase set corresponding to the thesis texts of each year according to the high-frequency vocabulary dictionary, and finally establishes the knowledge graph corresponding to each year according to the filtered target phrase set. A knowledge graph can thus be established for each year from the thesis texts of that year corresponding to the target vocabulary.

Example three

Fig. 3 is a flowchart of an academic hotspot trend prediction method constructed based on a dynamic knowledge graph according to a third embodiment of the present invention. In this embodiment, determining a network index of each node in the knowledge graph, and obtaining an academic hotspot trend ranking corresponding to the target vocabulary according to the network index may include: calculating network indexes of each node in the knowledge graph corresponding to each year by using a complex network tool; for each node, the network indexes of the nodes are sorted according to the year, and the network index change trend of the nodes is calculated according to the sorting result by using a linear regression technology; and sequencing all the nodes according to the network index variation trend from high to low to obtain an academic hotspot trend ranking corresponding to the target vocabulary.

Accordingly, as shown in fig. 3, the method includes the following operations:

Step 301, obtaining a thesis text corresponding to the target vocabulary, and determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the thesis text.

Step 302, classifying the paper texts according to the year.

Step 303, determining a target phrase set corresponding to the thesis text of each year according to the word relation in each sentence in the thesis text of each year.

Step 304, filtering each target phrase in the target phrase set corresponding to the thesis text of each year according to the high-frequency vocabulary dictionary.

Step 305, establishing a knowledge graph corresponding to each year according to the filtered target phrase set corresponding to the thesis texts of each year.

Step 306, calculating the network indexes of all nodes in the knowledge graph corresponding to each year by using a complex network tool.

Optionally, the network index is an evaluation index for evaluating importance of each node in the network. The importance ranking of the nodes in the network can be obtained according to the size of the network index. Optionally, the network index may be degree or node centrality.

Step 307, for each node, sorting the network indexes of the node according to the year, and calculating the network index variation trend of the node according to the sorting result by using a linear regression technology.

Step 308, sorting the nodes from high to low according to the network index variation trend to obtain an academic hotspot trend ranking corresponding to the target vocabulary.

In this embodiment, the network indexes of all nodes in the knowledge graph corresponding to each year are calculated using a complex network tool. For each node, the network indexes are then sorted by year, and the network index variation trend of the node is calculated from the sorting result using a linear regression technique. Finally, all nodes are sorted from high to low by network index variation trend to obtain the academic hotspot trend ranking corresponding to the target vocabulary. The network index variation trend of each node can thus be determined from the relationships among the nodes in the established knowledge graph, and the academic hotspot trend corresponding to the target vocabulary can be predicted from the network index variation trend of each node.

Example four

Fig. 4 is a schematic structural diagram of an academic hotspot trend prediction device constructed based on a dynamic knowledge graph according to a fourth embodiment of the present invention. The embodiment can be applied to the case of predicting academic hotspot trends. The apparatus can be implemented in software and/or hardware, and the apparatus can be configured in a computer device. As shown in fig. 4, the apparatus may include: a paper text acquisition module 401, a knowledge graph establishment module 402, and a trend ranking determination module 403.

The thesis text acquisition module 401 is configured to acquire a thesis text corresponding to a target vocabulary and to determine the high-frequency vocabulary dictionary corresponding to the target vocabulary according to the thesis text; the knowledge graph establishing module 402 is configured to establish a knowledge graph corresponding to the target vocabulary according to the thesis text and the high-frequency vocabulary dictionary; and the trend ranking determining module 403 is configured to determine network indexes of each node in the knowledge graph and to obtain an academic hotspot trend ranking corresponding to the target vocabulary according to the network indexes.

Optionally, on the basis of the foregoing technical solution, the knowledge-graph establishing module 402 may include: a text classification unit for classifying the thesis text by year; the set determining unit is used for determining a target phrase set corresponding to the thesis texts of each year according to the word relation in each sentence in the thesis texts of each year; the phrase filtering unit is used for filtering each target phrase in the target phrase set corresponding to the thesis text of each year according to the high-frequency vocabulary dictionary; and the map establishing unit is used for establishing a knowledge map corresponding to each year according to the filtered target phrase set corresponding to the thesis text of each year.

Optionally, on the basis of the foregoing technical solution, the set determining unit may include: a year acquiring subunit configured to acquire one year among the years in sequence as a current processing year; the set determining subunit is used for extracting subjects, predicates and objects in the sentences according to word relations in the sentences aiming at each sentence in the thesis text of the current processing year, combining the subjects, the predicates and the objects according to the order of the subjects and the predicates to form a target phrase, and adding the target phrase into a target phrase set corresponding to the thesis text of the current processing year; and an operation returning subunit for returning to execute the operation of acquiring one year in each year in turn as the current processing year until the processing of all the years is completed.

Optionally, on the basis of the foregoing technical solution, the phrase filtering unit may include: a phrase judging subunit, configured to judge whether the subject or the object in the currently processed target phrase belongs to the high-frequency vocabulary dictionary; a phrase retaining subunit, configured to retain the currently processed target phrase in the target phrase set if it is determined that the subject or the object in the phrase belongs to the high-frequency vocabulary dictionary; and a phrase deleting subunit, configured to delete the currently processed target phrase from the target phrase set if it is determined that neither the subject nor the object in the phrase belongs to the high-frequency vocabulary dictionary.
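The retain-or-delete rule described above amounts to a simple membership test. The following minimal sketch assumes that target phrases are `(subject, predicate, object)` tuples and that the high-frequency vocabulary dictionary is a Python set of words; both representations and the function name are illustrative assumptions.

```python
def filter_phrases(phrases, dictionary):
    """Keep a phrase if its subject or its object is in the dictionary.

    A phrase is dropped only when neither its subject nor its object
    belongs to the high-frequency vocabulary dictionary.
    """
    return [
        (subject, predicate, obj)
        for subject, predicate, obj in phrases
        if subject in dictionary or obj in dictionary
    ]


dictionary = {"gene", "tumor"}
phrases = [
    ("gene", "regulates", "protein"),   # subject in dictionary: kept
    ("drug", "shrinks", "tumor"),       # object in dictionary: kept
    ("author", "thanks", "reviewer"),   # neither in dictionary: deleted
]
filtered = filter_phrases(phrases, dictionary)
```

The predicate plays no part in the test, which matches the description: only the subject and the object are checked against the dictionary.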

Optionally, on the basis of the foregoing technical solution, the graph establishing unit may include: a year acquiring subunit, configured to sequentially acquire one of the years as the current processing year; a phrase processing subunit, configured to, for each target phrase in the filtered target phrase set corresponding to the thesis text of the current processing year, take the subject and the object in the target phrase as nodes and establish a connecting line between the subject and the object; a connecting line converging subunit, configured to converge all the established connecting lines to obtain a knowledge graph corresponding to the current processing year; and an operation returning subunit, configured to return to the operation of sequentially acquiring one of the years as the current processing year until all years have been processed.
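One way to picture the per-year graph construction is an adjacency structure in which each subject and object is a node and each target phrase contributes one connecting line. The sketch below assumes an undirected graph stored as a dictionary of neighbour sets; the source does not state whether the connecting lines are directed, and the function names are hypothetical.

```python
from collections import defaultdict


def build_graph(phrases):
    """Build an undirected graph from (subject, predicate, object) phrases.

    Subjects and objects become nodes; each phrase adds one connecting
    line between them. Undirected edges are an assumption.
    """
    adjacency = defaultdict(set)
    for subject, _predicate, obj in phrases:
        adjacency[subject].add(obj)
        adjacency[obj].add(subject)
    return dict(adjacency)


def build_graphs_by_year(filtered_by_year):
    """Process the years in order and build one knowledge graph per year."""
    return {
        year: build_graph(filtered_by_year[year])
        for year in sorted(filtered_by_year)
    }


filtered_by_year = {
    2018: [("gene", "regulates", "protein")],
    2019: [("gene", "regulates", "protein"), ("drug", "targets", "gene")],
}
graphs = build_graphs_by_year(filtered_by_year)
```

Because neighbours are kept in sets, a phrase that repeats within a year does not create a duplicate connecting line in this sketch.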

Optionally, on the basis of the foregoing technical solution, the trend ranking determining module 403 may include: an index calculation unit, configured to calculate the network index of each node in the knowledge graph corresponding to each year using a complex network tool; a change trend calculation unit, configured to, for each node, order the network indexes of the node by year and calculate the network index change trend of the node from the ordered result using linear regression; and a node sorting unit, configured to sort all the nodes from high to low according to the network index change trend to obtain an academic hotspot trend ranking corresponding to the target vocabulary.
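The three steps of the trend ranking (per-year network index, linear regression over the years, and sorting by trend) can be sketched with the standard library alone. Degree centrality is used here purely as a stand-in network index, since the source does not say which index or which complex network tool is used; treating a node that is absent in a given year as having index 0.0 is likewise an assumption, as are all function names.

```python
def degree_centrality(adjacency):
    """Degree centrality: neighbour count divided by (node count - 1)."""
    n = len(adjacency)
    if n < 2:
        return {node: 0.0 for node in adjacency}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


def slope(xs, ys):
    """Least-squares slope of ys against xs (simple linear regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0  # a single year gives no trend
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def rank_trends(graphs_by_year):
    """Rank nodes from high to low by the slope of their yearly index."""
    years = sorted(graphs_by_year)
    index_by_year = {y: degree_centrality(graphs_by_year[y]) for y in years}
    nodes = set().union(*(g.keys() for g in graphs_by_year.values()))
    trends = {
        # Assumption: a node missing from a year's graph has index 0.0.
        node: slope(years, [index_by_year[y].get(node, 0.0) for y in years])
        for node in nodes
    }
    return sorted(trends.items(), key=lambda item: item[1], reverse=True)


graphs_by_year = {
    2018: {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}},
    2019: {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}},
}
ranking = rank_trends(graphs_by_year)
```

In this toy input, node `a` gains connections between the two years and node `b` loses them, so `a` ranks first and `b` last. A positive slope is read as a rising topic and a negative slope as a declining one.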

Optionally, on the basis of the foregoing technical solution, the thesis text acquisition module 401 may include: a vocabulary acquisition unit, configured to acquire high-frequency vocabularies corresponding to the target vocabulary according to the thesis text; and a dictionary generating unit, configured to determine a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the high-frequency vocabularies.
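A minimal sketch of deriving a high-frequency vocabulary dictionary from the thesis text is given below. The source does not define what counts as high-frequency, so the tokenisation by a simple regular expression, the optional stop-word set, and the `top_n` cut-off are all assumptions chosen for illustration.

```python
import re
from collections import Counter


def build_high_frequency_dictionary(texts, top_n=3, stopwords=frozenset()):
    """Return the `top_n` most frequent words across `texts` as a set.

    The tokenisation rule and the top-N cut-off are hypothetical; the
    source does not specify how high-frequency words are selected.
    """
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in stopwords:
                counts[word] += 1
    return {word for word, _count in counts.most_common(top_n)}


texts = ["Gene editing edits gene", "Gene therapy uses editing"]
dictionary = build_high_frequency_dictionary(texts, top_n=2)
```

Returning a set makes the later membership test on subjects and objects a constant-time lookup.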

The academic hotspot trend prediction device constructed based on the dynamic knowledge graph can execute the academic hotspot trend prediction method constructed based on the dynamic knowledge graph provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects for executing the academic hotspot trend prediction method constructed based on the dynamic knowledge graph.

EXAMPLE five

FIG. 5 is a schematic structural diagram of a computer device according to a fifth embodiment of the present invention. FIG. 5 illustrates a block diagram of an exemplary computer device 12 suitable for implementing embodiments of the present invention. The computer device 12 shown in FIG. 5 is only an example and should not impose any limitation on the functionality or scope of use of embodiments of the present invention.

As shown in FIG. 5, computer device 12 is embodied in the form of a general purpose computer device. The components of computer device 12 may include, but are not limited to: one or more processors 16, a memory 28, and a bus 18 that connects the various system components (including the memory 28 and the processors 16). The processor 16 includes, but is not limited to, an AI processor.

Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.

Computer device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.

The memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. Computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 5, and commonly referred to as a "hard drive"). Although not shown in FIG. 5, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.

A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.

Computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with computer device 12, and/or with any devices (e.g., network card, modem, etc.) that enable computer device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, computer device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via network adapter 20. As shown, network adapter 20 communicates with the other modules of computer device 12 via bus 18. It should be appreciated that although not shown in FIG. 5, other hardware and/or software modules may be used in conjunction with computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.

The processor 16 of the computer device 12 executes programs stored in the memory 28 to perform various functional applications and data processing, such as implementing the academic hotspot trend prediction method based on dynamic knowledge graph construction provided by the embodiments of the invention. The method specifically comprises the following steps: acquiring a thesis text corresponding to a target vocabulary, and determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the thesis text; establishing a knowledge graph corresponding to the target vocabulary according to the thesis text and the high-frequency vocabulary dictionary; and determining network indexes of all nodes in the knowledge graph, and obtaining an academic hotspot trend ranking corresponding to the target vocabulary according to the network indexes.

EXAMPLE six

The sixth embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the academic hotspot trend prediction method based on dynamic knowledge graph construction provided by the embodiments of the present invention. The method specifically comprises the following steps: acquiring a thesis text corresponding to a target vocabulary, and determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the thesis text; establishing a knowledge graph corresponding to the target vocabulary according to the thesis text and the high-frequency vocabulary dictionary; and determining network indexes of all nodes in the knowledge graph, and obtaining an academic hotspot trend ranking corresponding to the target vocabulary according to the network indexes.

Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, Smalltalk, C++, Ruby and Go, conventional procedural programming languages such as the "C" programming language or similar programming languages, and computer languages for AI algorithms. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims

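1. An academic hotspot trend prediction method based on dynamic knowledge graph construction, characterized by comprising the following steps:

acquiring a thesis text corresponding to a target vocabulary, and determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the thesis text;

establishing a knowledge graph corresponding to the target vocabulary according to the thesis text and the high-frequency vocabulary dictionary;

and determining network indexes of all nodes in the knowledge graph, and obtaining an academic hotspot trend ranking corresponding to the target vocabulary according to the network indexes.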

2. The method of claim 1, wherein establishing a knowledge graph corresponding to the target vocabulary according to the thesis text and the high-frequency vocabulary dictionary comprises:

classifying said thesis text by year;

determining a target phrase set corresponding to the thesis text of each year according to the word relation in each sentence in the thesis text of each year;

filtering each target phrase in a target phrase set corresponding to the thesis text of each year according to the high-frequency vocabulary dictionary;

and establishing a knowledge graph corresponding to each year according to the filtered target phrase set corresponding to the thesis text of each year.

3. The method of claim 2, wherein determining a target phrase set corresponding to the thesis text of each year according to the word relation in each sentence in the thesis text of each year comprises:

sequentially acquiring a year from each year as a current processing year;

for each sentence in the thesis text of the current processing year, extracting a subject, a predicate and an object in the sentence according to the word relation in the sentence, combining the subject, the predicate and the object in subject, predicate, object order to form a target phrase, and adding the target phrase into the target phrase set corresponding to the thesis text of the current processing year;

and returning to execute the operation of sequentially acquiring one year from each year as the current processing year until the processing of all the years is finished.

4. The method of claim 3, wherein filtering each target phrase in the target phrase set corresponding to the thesis text of each year according to the high-frequency vocabulary dictionary comprises:

judging whether the subject or the object in the current processing target phrase belongs to the high-frequency vocabulary dictionary or not;

if the subject or the object in the current processing target phrase is determined to belong to the high-frequency vocabulary dictionary, keeping the current processing target phrase in the target phrase set;

deleting the current processing target phrase from the target phrase set if it is determined that neither the subject nor the object in the current processing target phrase belongs to the high-frequency vocabulary dictionary.

5. The method of claim 3, wherein establishing a knowledge graph corresponding to each year according to the filtered target phrase set corresponding to the thesis text of each year comprises:

sequentially acquiring a year from each year as a current processing year;

aiming at each target phrase in a target phrase set corresponding to the filtered thesis text of the current processing year, taking a subject and an object in the target phrase as nodes, and establishing a connection line between the subject and the object;

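converging all established connecting lines to obtain a knowledge graph corresponding to the current processing year;

and returning to execute the operation of sequentially acquiring one year from each year as the current processing year until the processing of all the years is finished.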

6. The method of claim 2, wherein determining network indexes of all nodes in the knowledge graph and obtaining an academic hotspot trend ranking corresponding to the target vocabulary according to the network indexes comprises:

calculating network indexes of each node in the knowledge graph corresponding to each year by using a complex network tool;

for each node, the network indexes of the nodes are sorted according to the year, and the network index change trend of the nodes is calculated according to the sorting result by using a linear regression technology;

and sequencing all the nodes from high to low according to the network index variation trend to obtain an academic hotspot trend ranking corresponding to the target vocabulary.

7. The method of claim 1, wherein determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the thesis text comprises:

acquiring a high-frequency vocabulary corresponding to the target vocabulary according to the thesis text;

and determining a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the high-frequency vocabulary.

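8. An academic hotspot trend prediction device based on dynamic knowledge graph construction, characterized by comprising:

a thesis text acquisition module, configured to acquire a thesis text corresponding to a target vocabulary and determine a high-frequency vocabulary dictionary corresponding to the target vocabulary according to the thesis text;

a knowledge graph establishing module, configured to establish a knowledge graph corresponding to the target vocabulary according to the thesis text and the high-frequency vocabulary dictionary;

and a trend ranking determining module, configured to determine network indexes of all nodes in the knowledge graph and obtain an academic hotspot trend ranking corresponding to the target vocabulary according to the network indexes.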

9. A computer device, the device comprising:

one or more processors;

a storage device for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the academic hotspot trend prediction method based on dynamic knowledge graph construction according to any one of claims 1 to 7.

10. A computer storage medium having stored thereon a computer program, wherein the program, when executed by a processor, implements the academic hotspot trend prediction method based on dynamic knowledge graph construction according to any one of claims 1 to 7.