CN110717043A - Academic team construction method based on network representation learning training - Google Patents


Info

Publication number
CN110717043A
CN110717043A
Authority
CN
China
Prior art keywords
node
academic
network
data
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910930765.8A
Other languages
Chinese (zh)
Inventor
李微
陈瑞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Three Helix Big Data Technology (kunshan) Co Ltd
Original Assignee
Three Helix Big Data Technology (kunshan) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Three Helix Big Data Technology (kunshan) Co Ltd filed Critical Three Helix Big Data Technology (kunshan) Co Ltd
Priority to CN201910930765.8A priority Critical patent/CN110717043A/en
Publication of CN110717043A publication Critical patent/CN110717043A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Abstract

The invention discloses an academic team construction method based on network representation learning training, which comprises the following steps. Step one: reading scholar and scientific research data from the database. Step two: training an author topic model to obtain the author topic probability distribution. Step three: constructing an initial academic network. Step four: training a network representation learning model to obtain scholar vectors. Step five: clustering the scholar vectors with a machine learning clustering method. Step six: outputting clusters that meet a preset threshold as academic teams. The method builds teams efficiently, the constructed teams have high topic similarity, and communities of different granularities can be obtained by changing the number of clusters as needed.

Description

Academic team construction method based on network representation learning training
[ technical field ]
The invention belongs to the technical field of social network analysis, and particularly relates to an academic team construction method based on network representation learning training.
[ background of the invention ]
With the development of scientific research, wide cooperation among researchers has formed complex academic networks: the scale of academic teams has grown, and the relations among team members are complex. Deeply understanding and mining the composition of academic teams helps enterprises quickly grasp information about university research groups in industry-university-research cooperation, and also helps scientific research management departments identify talented researchers and research teams, promoting the development of disciplines.
The task of dividing academic teams can be completed with community discovery techniques. Most existing methods are based on network topology information, mainly including clustering-based methods, modularity-based methods, spectral clustering, stochastic block models, and the like. Prior-art patent application No. 201810851399.2 also discloses a team construction method based on an academic network that can divide communities. However, in an academic network the scholars corresponding to nodes carry a large amount of text information, such as their research directions and paper data; division methods based purely on network topology ignore this text information, so topic cohesion of the resulting scholar communities is hard to guarantee. Moreover, existing community discovery methods cannot control the scale of the divided communities, and modularity-optimization methods easily produce very large, ineffectively divided communities.
Therefore, there is a need to provide a new academic team construction method based on network representation learning training to solve the above technical problems.
[ summary of the invention ]
The invention mainly aims to provide an academic team construction method based on network representation learning training which builds teams efficiently, produces teams with high topic similarity, and can divide communities of different granularities by changing the number of clusters as required.
The invention realizes the purpose through the following technical scheme: an academic team construction method based on network representation learning training comprises the following steps.
Step one: reading scholar and scientific research data from the database;
step two: training an author topic model to obtain the author topic probability distribution;
step three: constructing an initial academic network;
step four: training a network representation learning model to obtain scholar vectors;
step five: clustering the scholar vectors with a machine learning clustering method;
step six: outputting clusters that meet a preset threshold as academic teams.
Compared with the prior art, the academic team construction method based on network representation learning training has the following beneficial effects: in the community discovery process, not only is the physical topology of the academic network considered, but the author topic probability distribution obtained by author topic model training also blends the scholars' text data into the process, so the resulting academic teams have higher topic cohesion; in addition, when the scholar vectors are clustered with a machine learning clustering method, the number of clusters can be changed, so the number and scale of the academic teams can be flexibly controlled.
[ description of the drawings ]
FIG. 1 is a schematic flow chart of an embodiment of the present invention;
FIG. 2 is a schematic diagram of an algorithmic process according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating an AT model according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating an author topic and topic probability distribution generated by an AT model according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating an author topic probability distribution in an embodiment of the present invention;
fig. 6 is a schematic diagram of an academic network according to an embodiment of the present invention.
[ detailed description ]
Embodiment:
referring to fig. 1, the present embodiment is a method for constructing an academic team based on network representation learning training, which includes the following steps:
step one: reading scholar and scientific research data from the database;
step two: training an author topic model (Author-Topic model, hereinafter the AT model) to obtain the scholars' topic probability distributions;
step three: constructing an initial academic network;
step four: training a network representation learning model to obtain scholar vectors;
step five: clustering the scholar vectors with a machine learning clustering method;
step six: outputting clusters that meet a preset threshold as academic teams.
Specifically, the specific method of the above steps is as follows.
Step one: read the scholar and scientific research data from the database.
Read the relevant data from a scholar database, including:
scholar information, including ID, name, school, college;
paper data, including ID, title, author, abstract, issuing institution;
project data, including ID, title, participants;
patent data, including ID, title, inventor, applicant organization.
Wherein:
the scholar information corresponds to the nodes in the academic network;
the authors of papers, participants of projects, and inventors of patents are used to extract collaboration data, i.e. the edges in the academic network;
the abstracts of papers are used for AT model training to represent the scholars' research topics, so that text information can be integrated into the training of the scholar vectors.
The ID in the related data is the primary key of the database and serves as the unique identifier of a data item.
The scholar database may be an existing database such as CNKI (China National Knowledge Infrastructure), Wanfang Data, the National Intellectual Property Administration, Weipu (VIP), or Baidu Scholar, or a database preset by the system.
Specifically:
1) An example of a scholar information query is as follows:
{ 'id': 74347, 'name': 'Du XX', 'school': 'China Agricultural University', 'ins': 'College of Information and Electrical Engineering' }
2) For each scholar, query his or her scientific research data, including paper data, project data, and patent data; examples are as follows:
(Example research-data records; shown as images in the original publication.)
3) Save the scientific research data as documents for later operations. Specifically, each document may be stored as a txt file; for example, a file 74347.txt in the current directory holds the paper data of the scholar with ID 74347, hereinafter called a "document". The patent and project data are handled similarly.
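The per-scholar document files described above can be written with a short Python sketch. This is a minimal illustration: the function name `save_documents`, the input shape (a mapping from scholar ID to abstract strings), and the output directory are assumptions for the example, not part of the original disclosure.

```python
import os

def save_documents(records, out_dir):
    """Write each scholar's concatenated abstracts to an <ID>.txt document.

    `records` maps scholar ID -> list of abstract strings (assumed shape).
    Returns the sorted list of files created, for inspection."""
    for scholar_id, abstracts in records.items():
        path = os.path.join(out_dir, f"{scholar_id}.txt")
        with open(path, "w", encoding="utf-8") as f:
            f.write("\n".join(abstracts))
    return sorted(os.listdir(out_dir))
```

Calling `save_documents({74347: [...]}, some_dir)` would then produce the `74347.txt` document used in the later AT-model steps.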
Examples of the role of each kind of data:
(1) The scholar information corresponds to the nodes in the academic network:
(Example node record; shown as an image in the original publication.)
(2) The authors of papers, participants of projects, and inventors of patents are used to extract the collaboration data, i.e. the edges in the academic network:
(Example edge records; shown as an image in the original publication.)
(3) The abstract of each paper is used for AT model training to characterize the scholar's research topics, represented as a probability distribution:
{ '74347-Du XX': [(5, 0.8992357556293058), (11, 0.099451539260517502)] }
wherein:
'74347' is the scholar ID;
'Du XX' is the name;
(5, 0.8992357556293058) indicates that the probability on topic 5 is 0.8992;
(11, 0.099451539260517502) indicates that the probability on topic 11 is 0.0995.
The probability on the remaining topics is zero (or negligible) and is ignored; the scholar's probabilities over all topics sum to 1. This distribution is later used to calculate the topic similarity between scholars.
Step two: train the author topic model to obtain the scholars' topic probability distributions.
Using the AT model (Author-Topic Model) and the abstracts of all the papers from step one, calculate each scholar's topic probability distribution. The probabilistic structure of the AT model is shown in FIG. 3: x denotes an author, z denotes a topic, θ denotes the author-topic probability distribution generated from the Dirichlet prior α, φ denotes the topic-word probability distribution generated from the Dirichlet prior β, A is the total number of authors, T is the total number of topics, w_d is the word set of document d, and a_d is the author set of document d. The solving method of the AT model is prior art in the field and is not detailed in this embodiment; the AT model function is called directly during programming, as follows:
model=gensim.models.atmodel.AuthorTopicModel(corpus, num_topics=theme_num,author2doc=author2doc,id2word=dictionary)
During calculation the paper data exist in the following form:
C = {(w_1, a_1), (w_2, a_2), ......, (w_M, a_M)},
where M is the total number of documents.
Before calculating with the AT model, the paper-data documents saved in step one need to be processed as follows:
1) create the author-to-document mapping table, e.g.
{ 'Zhang San': [1], 'Li Si': [2, 3, 4], 'Wang Wu': [5] };
2) create the word-to-ID mapping table, e.g.
{ 0: 'computer', 1: 'data mining', ... };
3) encode each document as a bag of words, in the form [[(0, 1), ..., (6, 1)], [(9, 2), ...]], where each pair is (word ID, word count).
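The three preprocessing structures above — author-to-document map, word-to-ID dictionary, and bag-of-words corpus — can be built with a minimal pure-Python sketch. The function name `preprocess` and the input shape (author name mapped to tokenised abstracts) are illustrative assumptions; in practice gensim's corpus utilities would typically be used instead.

```python
from collections import Counter

def preprocess(docs_by_author):
    """Build author2doc, the word->ID dictionary, and the bag-of-words corpus.

    `docs_by_author` maps author name -> list of tokenised documents (assumed input)."""
    author2doc, dictionary, corpus = {}, {}, []
    doc_idx = 0
    for author, docs in docs_by_author.items():
        author2doc[author] = []
        for tokens in docs:
            author2doc[author].append(doc_idx)
            for w in tokens:                      # assign IDs in first-seen order
                dictionary.setdefault(w, len(dictionary))
            counts = Counter(dictionary[w] for w in tokens)
            corpus.append(sorted(counts.items()))  # [(word_id, count), ...]
            doc_idx += 1
    return author2doc, dictionary, corpus
```

The resulting `author2doc`, `dictionary`, and `corpus` match the argument names of the `AuthorTopicModel` call shown above.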
The training process is schematically illustrated in fig. 4-5.
Step three: an initial academic network is constructed.
Construct the node and edge data required by the academic network as follows:
1) Establish the nodes in the academic network, where the data of each node comprises the scholar ID, name, school, college, and the author topic probability obtained in step two; for example:
(Example node record; shown as an image in the original publication.)
2) Establish the edges in the academic network: extract the collaboration data from the authors of papers, participants of projects, and inventors of patents to obtain the edges; for example:
(Example edge records; shown as an image in the original publication.)
3) Construct the initial academic network from the node and edge data. The model is shown schematically in fig. 6; it is essentially an undirected weighted graph G = (V, E, W), where V denotes the node set, i.e. the set of all scholar nodes (see the node data above), E denotes the edge set, i.e. the set of all scholar relations (see source and target in the edge data above), and W denotes the set of edge weights, i.e. the weight field of each edge.
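The graph assembly in step three can be sketched as follows. This is a hedged illustration: the function name and data shapes are assumptions, and the weight rule (number of joint papers/projects/patents per pair) follows the description of W above.

```python
from collections import defaultdict

def build_network(nodes, collaborations):
    """Assemble the undirected weighted graph G = (V, E, W).

    `nodes` maps scholar ID -> attribute dict (name, school, topic distribution);
    `collaborations` lists (source, target) pairs, one per joint paper/project/patent.
    The weight of an edge is the number of collaborations between the pair."""
    V = dict(nodes)
    W = defaultdict(int)
    for u, v in collaborations:
        key = (min(u, v), max(u, v))   # undirected: store each pair once
        W[key] += 1
    E = set(W)
    return V, E, dict(W)
```

Repeated collaborations between the same pair simply accumulate in the edge weight, matching the "sum of cooperation times" rule.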
Step four: train the scholar vectors based on a network representation learning method.
By improving the calculation of the transition probability in the node2vec random walk, the text information of the nodes can be integrated into the feature-sequence extraction, yielding an improved node2vec algorithm; the implementation is described below.
(1) Calculate the topic similarity between the nodes of the academic network obtained in step three.
The topic similarity between node i and node j is calculated with the cosine similarity:
sim(P_i, P_j) = Σ_{t=1}^{T} p_it · p_jt / ( sqrt(Σ_{t=1}^{T} p_it^2) · sqrt(Σ_{t=1}^{T} p_jt^2) ),
where, in the graph G = (V, E, W), P_i = (p_i1, p_i2, ......, p_iT) is the topic probability distribution of node i and P_j = (p_j1, p_j2, ......, p_jT) is the topic probability distribution of node j.
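The cosine similarity above can be computed directly; a small sketch, with plain Python lists standing in for the topic distributions:

```python
import math

def topic_similarity(p_i, p_j):
    """Cosine similarity between two topic probability distributions."""
    dot = sum(a * b for a, b in zip(p_i, p_j))
    norm = math.sqrt(sum(a * a for a in p_i)) * math.sqrt(sum(b * b for b in p_j))
    return dot / norm if norm else 0.0   # guard against all-zero vectors
```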
(2) Generate the neighbor sequences of the nodes using topic-similarity-optimized random walks.
First, simulate random walks of fixed length L. The random walk transition probability from node v_{i-1} to the next node v_i is:
P(v_i = j | v_{i-1} = i) = π_ij / Z, if (i, j) ∈ E; 0, otherwise,
where π_ij is the unnormalized transition probability from node i to node j and Z is the normalizing constant, the sum of the transition probabilities over all candidate nodes. π_ij is calculated as:
π_ij = α_pq(i, j) · w_ij · sim(P_i, P_j),
where p and q are the two parameters controlling the random walk, w_ij is the weight of the edge between node i and node j, and α_pq(i, j) is defined, as in node2vec, by:
α_pq(i, j) = 1/p, if d_ij = 0; 1, if d_ij = 1; 1/q, if d_ij = 2,
where d_ij denotes the shortest-path distance between node i and node j.
This completes the calculation of the random walk transition probabilities. The set of probability values is passed to an Alias Method sampler to select the neighboring nodes, each node being selected with probability P(v_i = x | v_{i-1} = v). Walks of the set length L are performed, yielding a number of walk paths; these paths are the random walk sequences.
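The biased walk described above can be sketched as follows. This is a hedged illustration: plain weighted sampling (`random.choices`) stands in for the alias-table sampler, and the function names and the `adj`/`sim` data shapes are assumptions for the example.

```python
import random

def alpha_pq(d, p, q):
    """node2vec search bias, from the shortest-path distance d (0, 1 or 2)
    between the walk's previous node and the candidate node."""
    if d == 0:
        return 1.0 / p
    if d == 1:
        return 1.0
    return 1.0 / q

def walk(adj, sim, start, length, p=1.0, q=1.0, rng=random):
    """Topic-similarity-biased random walk of the given length.

    `adj[u]` maps neighbour -> edge weight; `sim(u, v)` is the topic similarity."""
    path = [start]
    prev = None
    for _ in range(length - 1):
        cur = path[-1]
        nbrs = list(adj[cur])
        if not nbrs:
            break
        weights = []
        for nxt in nbrs:
            if prev is None:
                d = 1                      # first step: no previous node
            elif nxt == prev:
                d = 0                      # returning to the previous node
            elif nxt in adj[prev]:
                d = 1                      # neighbour of the previous node
            else:
                d = 2
            weights.append(alpha_pq(d, p, q) * adj[cur][nxt] * sim(cur, nxt))
        prev, cur = cur, rng.choices(nbrs, weights=weights)[0]
        path.append(cur)
    return path
```

Running `walk` once per node, several times over, yields the random walk sequences fed to training in the next step.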
(3) Train on the random walk sequences with stochastic gradient descent to finally obtain the scholar vectors. Stochastic gradient descent is a conventional method in the art and is not detailed in this embodiment.
The algorithm pseudo-code is shown below.
(Pseudo-code shown as images in the original publication.)
Step five: cluster the scholar vectors based on a machine learning clustering method.
(1) Calculate node centrality.
Node centrality is measured with the PageRank algorithm; the larger the PageRank value, the higher the node's centrality:
P = βMP + (1 - β)e/n,
where β is the jump parameter, usually 0.8 or 0.9, M is the transition matrix of the network, e is an n-dimensional unit vector, and n is the number of nodes. βMP represents jumping to the next node with probability β during the random walk, and (1 - β)e/n represents a random jump with probability (1 - β). Experiments show that with continued iteration the PageRank values of all nodes converge to stable values. This centrality formula is also prior art and is not detailed in this embodiment.
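The power iteration for P = βMP + (1 - β)e/n can be sketched on a dictionary-based weighted graph. This is a minimal illustration (the function name and the `adj` shape are assumptions); a library routine such as NetworkX's `pagerank` would normally be used.

```python
def pagerank(adj, beta=0.85, iters=100):
    """Power iteration for PageRank on a weighted graph.

    `adj[u]` maps neighbour -> edge weight; each node spreads its score
    to its neighbours in proportion to the edge weights."""
    nodes = sorted(adj)
    n = len(nodes)
    pr = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        nxt = {u: (1.0 - beta) / n for u in nodes}   # random-jump term
        for u in nodes:
            total = sum(adj[u].values())
            if not total:                            # dangling node: no outflow
                continue
            for v, w in adj[u].items():
                nxt[v] += beta * pr[u] * w / total   # walk term beta*M*P
        pr = nxt
    return pr
```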
(2) Calculate node dispersion.
Node dispersion is measured with the minimum distance δ_i (i = 1, 2, ..., n): for each node, compute the distances to all nodes with higher centrality and take the minimum as δ_i:
δ_i = min_{j: PR_j > PR_i} d_ij,
where d_ij is the shortest-path distance between nodes i and j. If two nodes have the same centrality, they are ordered by node ID. In addition, since the node with the highest centrality is necessarily a cluster center, its minimum distance is defined as max(δ_i).
(3) Calculate the node F-statistic index CV(i).
(The CV formula is given as an image in the original publication; it combines the centrality and dispersion computed above.)
The top K nodes with the largest CV values are taken as the cluster centers.
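The center-selection procedure can be sketched as follows. Note the loud assumption: the patent's exact CV formula survives only as an image, so this sketch uses the product of centrality and dispersion (the common density-peaks-style choice) purely as a stand-in; the tie-breaking by node ID follows the description above.

```python
def select_centers(pr, dist, k):
    """Pick K cluster centers from centrality `pr` and distance function `dist`.

    ASSUMPTION: CV(i) = PR_i * delta_i is used here as a stand-in for the
    patent's image-only formula. Requires at least two nodes."""
    order = sorted(pr, key=lambda u: (-pr[u], u))    # by centrality, ties by ID
    delta = {}
    for rank, u in enumerate(order):
        higher = order[:rank]                        # nodes of higher centrality
        delta[u] = min(dist(u, v) for v in higher) if higher else None
    # the most central node is necessarily a center: give it max(delta)
    delta[order[0]] = max(v for v in delta.values() if v is not None)
    cv = {u: pr[u] * delta[u] for u in delta}
    return sorted(cv, key=lambda u: (-cv[u], u))[:k]
```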
Note that both the centrality and the dispersion are calculated on the graph G = (V, E, W).
(4) Clustering.
For each node, calculate the distance d_ic from the node to each cluster center, assign the node to the cluster center with the smallest d_ic, update each cluster center to the mean of its cluster, and repeat until the cluster-center means are stable. Finally, some nodes gather around each cluster center; these groups, usually called "clusters", are the academic teams we want.
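The assign-and-update loop just described can be sketched as a small K-means-style routine on the scholar vectors (function name and data shapes are assumptions for the example):

```python
def cluster(vectors, centers, iters=20):
    """Assign each node to the nearest center, recompute centers as cluster
    means, and repeat until the assignment is stable.

    `vectors` maps node -> list of floats; `centers` is a list of center vectors."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))   # squared Euclidean

    assign = {}
    for _ in range(iters):
        new_assign = {u: min(range(len(centers)),
                             key=lambda c: dist2(v, centers[c]))
                      for u, v in vectors.items()}
        if new_assign == assign:          # stable: center means no longer move
            break
        assign = new_assign
        for c in range(len(centers)):
            members = [vectors[u] for u in assign if assign[u] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    clusters = {}
    for u, c in assign.items():
        clusters.setdefault(c, []).append(u)
    return clusters
```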
The algorithm pseudo code is as follows.
(Pseudo-code shown as images in the original publication.)
Step six: clusters meeting the preset threshold are output as an academic team.
The scale of an academic team is generally more than 3 people, so the size threshold is set to 3, and the clusters from step five whose node count is greater than or equal to the threshold 3 are output as academic teams.
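The final filtering step is a one-liner; a sketch with an assumed `clusters` shape (cluster label mapped to member list):

```python
def output_teams(clusters, threshold=3):
    """Keep only the clusters whose size meets the team-size threshold."""
    return {c: members for c, members in clusters.items() if len(members) >= threshold}
```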
Thus, the construction of an academic team is completed.
For example:
TABLE 1 results of division of the Dongda college of computers
(Table 1 is shown as an image in the original publication.)
To verify the advantages of the technical scheme in constructing academic teams with high community topic similarity, several existing methods are used below as experimental baselines for result analysis.
By adding topic similarity and optimizing the selection of cluster centers, optimized versions of the node2vec and K-means algorithms are obtained; the community discovery algorithm based on network representation learning proposed herein is denoted NK. This embodiment compares NK with the conventional community discovery algorithms LPA, CNM, and Louvain in three respects: community quality, community topic, and community distribution.
Data preparation
The publicly available academic network data sets, such as CA-HepPh and DBLP, do not contain paper text data and cannot be used, so the experiments are conducted on self-built academic network data sets. To obtain relatively comprehensive results, the experiments use four data sets of different scope: a college data set, a school data set, a subject data set, and a national data set; details are given in Table 2. These data come from real society, are real network data sets, and their ground-truth community division is unknown.
TABLE 2 academic network data set
(Table 2 is shown as an image in the original publication.)
Evaluation method
Community topic similarity mainly measures the cohesion of the research directions of the scholars in a community. The data sets do not label scholars' research directions, so an F value cannot be computed, and manual judgment is subject to personal cognitive bias. Therefore the cosine similarity formula is used to measure the topic similarity between scholars, and the community topic similarity is obtained as the average of the similarities between all pairs of scholars in the community.
Table 3 gives the community topic similarity of each algorithm. Community topic similarity is high in the college and subject networks and lower in the school and national networks, mainly because in the former the research fields of the scholar groups are closer. In every network, NK clearly achieves better community topic similarity than the other methods.
TABLE 3 Community topic similarity contrast
(Table 3 is shown as an image in the original publication.)
What has been described above are merely some embodiments of the present invention. It will be apparent to those skilled in the art that various changes and modifications can be made without departing from the inventive concept, and all such changes and modifications fall within the protection scope of the invention.

Claims (8)

1. An academic team construction method based on network representation learning training, characterized in that it comprises the following steps:
step one: reading scholar and scientific research data from the database;
step two: training an author topic model to obtain the author topic probability distribution;
step three: constructing an initial academic network;
step four: training a network representation learning model to obtain scholar vectors;
step five: clustering the scholar vectors with a machine learning clustering method;
step six: outputting clusters that meet a preset threshold as academic teams.
2. The academic team construction method based on network representation learning training of claim 1, wherein step one comprises reading relevant data from a scholar database, the relevant data comprising:
scholar information, including ID, name, school, college;
paper data, including ID, title, author, abstract, issuing institution;
project data, including ID, title, participants;
patent data, including ID, title, inventor, applicant organization;
the ID being a primary key in the database.
3. The academic team construction method based on network representation learning training of claim 2, wherein step two comprises:
2-1) reading the abstract documents in the paper data;
2-2) establishing the author-to-document mapping table;
2-3) encoding the documents as bag-of-words models;
2-4) training the author topic model to obtain the author topic probability distribution t.
4. The academic team construction method based on network representation learning training of claim 3, wherein step three comprises:
3-1) establishing the nodes V in the academic network, where the data of each node comprises the scholar ID, name, school, college, and the author topic probability obtained in step two;
3-2) establishing the edges E in the academic network: extracting collaboration data from the authors of papers, participants of projects, and inventors of patents to obtain the edges;
3-3) constructing the initial academic network G = (V, E, W) from the node and edge data, where W denotes the edge weights, a weight being the total number of paper, project, and patent collaborations between the two scholars.
5. The academic team construction method based on network representation learning training of claim 4, wherein step four comprises:
4-1) calculating the topic similarity among the nodes of the academic network obtained in step three;
4-2) generating the neighbor sequences of the nodes using topic-similarity-optimized random walks, obtaining a plurality of random walk sequences;
4-3) training on the random walk sequences with stochastic gradient descent to obtain the scholar vectors.
6. The academic team construction method based on network representation learning training of claim 5, wherein the topic similarity sim(P_i, P_j) between node i and node j is calculated as:
sim(P_i, P_j) = Σ_{t=1}^{T} p_it · p_jt / ( sqrt(Σ_{t=1}^{T} p_it^2) · sqrt(Σ_{t=1}^{T} p_jt^2) ),
wherein:
P_i = (p_i1, p_i2, ......, p_iT) is the topic probability distribution of node i,
P_j = (p_j1, p_j2, ......, p_jT) is the topic probability distribution of node j.
7. The academic team construction method based on network representation learning training of claim 5, wherein obtaining the random walk sequences in step 4-2 comprises:
4-2-1) calculating the random walk transition probability P(v_i = j | v_{i-1} = i) from node v_{i-1} to the next node v_i, obtaining a set of random walk transition probability values, where:
P(v_i = j | v_{i-1} = i) = π_ij / Z, if (i, j) ∈ E; 0, otherwise,
where π_ij is the unnormalized transition probability from node i to node j and Z is the normalizing constant, the sum of the transition probabilities over all candidate nodes, and π_ij is calculated as:
π_ij = α_pq(i, j) · w_ij · sim(P_i, P_j),
where p and q are the two parameters controlling the random walk, w_ij is the weight of the edge between node i and node j, and α_pq(i, j) is:
α_pq(i, j) = 1/p, if d_ij = 0; 1, if d_ij = 1; 1/q, if d_ij = 2,
where d_ij denotes the shortest-path distance between node i and node j;
4-2-2) selecting neighbor nodes in the sampler using the set of random walk transition probability values, and walking with the set walk length L to obtain a plurality of walk paths as the random walk sequences.
8. The academic team construction method based on network representation learning training of claim 5, wherein step five comprises:
5-1) calculating node centrality:
P = βMP + (1 - β)e/n,
where β is the jump parameter, M is the transition matrix of the network, e is an n-dimensional unit vector, and n is the number of nodes; βMP represents jumping to the next node with probability β during the random walk, and (1 - β)e/n represents a random jump with probability (1 - β);
5-2) calculating node dispersion: the dispersion of a node is measured by the minimum distance δ_i (i = 1, 2, ..., n), i.e. the minimum of the shortest-path distances from the node to all nodes of higher centrality:
δ_i = min_{j: PR_j > PR_i} d_ij,
where d_ij is the shortest-path distance between nodes i and j;
5-3) calculating the node F-statistic index CV(i) (the formula is given as an image in the original publication), and taking the top K nodes with the largest CV values as the cluster centers;
5-4) clustering: calculating the distance d_ic from each node to each cluster center, assigning the node to the cluster center with the smallest d_ic, updating the cluster-center means, and repeating until the cluster-center means are stable; finally, the nodes gathered around each cluster center, called "clusters", are the academic teams finally output.
CN201910930765.8A 2019-09-29 2019-09-29 Academic team construction method based on network representation learning training Pending CN110717043A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910930765.8A CN110717043A (en) 2019-09-29 2019-09-29 Academic team construction method based on network representation learning training

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910930765.8A CN110717043A (en) 2019-09-29 2019-09-29 Academic team construction method based on network representation learning training

Publications (1)

Publication Number Publication Date
CN110717043A true CN110717043A (en) 2020-01-21

Family

ID=69212052

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910930765.8A Pending CN110717043A (en) 2019-09-29 2019-09-29 Academic team construction method based on network representation learning training

Country Status (1)

Country Link
CN (1) CN110717043A (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103793501A (en) * 2014-01-20 2014-05-14 惠州学院 Theme community discovery method based on social network
CN106778894A (en) * 2016-12-29 2017-05-31 大连理工大学 A kind of method of author's cooperative relationship prediction in academic Heterogeneous Information network
CN109902203A (en) * 2019-01-25 2019-06-18 北京邮电大学 The network representation learning method and device of random walk based on side
WO2019153551A1 (en) * 2018-02-12 2019-08-15 平安科技(深圳)有限公司 Article classification method and apparatus, computer device and storage medium


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108595713A (en) * 2018-05-14 2018-09-28 中国科学院计算机网络信息中心 The method and apparatus for determining object set
CN108595713B (en) * 2018-05-14 2020-09-29 中国科学院计算机网络信息中心 Method and device for determining object set
CN115630141A (en) * 2022-11-11 2023-01-20 杭州电子科技大学 Scientific and technological expert retrieval method based on community query and high-dimensional vector retrieval
CN115630141B (en) * 2022-11-11 2023-04-25 杭州电子科技大学 Scientific and technological expert retrieval method based on community query and high-dimensional vector retrieval

Similar Documents

Publication Publication Date Title
Zhou et al. Discovering temporal communities from social network documents
CN106991127B (en) Knowledge subject short text hierarchical classification method based on topological feature expansion
Gui et al. A community discovery algorithm based on boundary nodes and label propagation
Yang et al. Identifying influential spreaders in complex networks based on network embedding and node local centrality
CN107391542A (en) A kind of open source software community expert recommendation method based on document knowledge collection of illustrative plates
CN110674318A (en) Data recommendation method based on citation network community discovery
WO2021128158A1 (en) Method for disambiguating between authors with same name on basis of network representation and semantic representation
Xu et al. Finding overlapping community from social networks based on community forest model
CN110717043A (en) Academic team construction method based on network representation learning training
Sun et al. Overlapping community detection based on information dynamics
Shekhawat et al. A classification technique using associative classification
Cheng et al. Dynamic embedding on textual networks via a gaussian process
CN111339258B (en) University computer basic exercise recommendation method based on knowledge graph
Shao et al. Research on a new automatic generation algorithm of concept map based on text clustering and association rules mining
CN105205075B (en) From the name entity sets extended method of extension and recommended method is inquired based on collaboration
Wang et al. An improved clustering method for detection system of public security events based on genetic algorithm and semisupervised learning
Ma et al. Composing knowledge graph embeddings via word embeddings
Agarwal et al. WGSDMM+ GA: A genetic algorithm-based service clustering methodology assimilating dirichlet multinomial mixture model with word embedding
Han et al. A semantic community detection algorithm based on quantizing progress
Liu et al. An improved k-means clustering algorithm based on semantic model
Wu Data association rules mining method based on improved apriori algorithm
Qiao et al. Improving stochastic block models by incorporating power-law degree characteristic
Chen et al. Community detection based on deepwalk model in large-scale networks
Kubota et al. Assignment strategies for ground truths in the crowdsourcing of labeling tasks
Liu et al. Overlapping community detection method based on network representation learning and density peaks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200121