CN106227835A - Team research direction mining method based on bipartite network graph hierarchical clustering - Google Patents
Team research direction mining method based on bipartite network graph hierarchical clustering
- Publication number
- CN106227835A (application CN201610595145.XA)
- Authority
- CN
- China
- Prior art keywords
- author
- group
- keyword
- team
- research
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2216/00—Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
- G06F2216/03—Data mining
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a team research direction mining method based on bipartite network graph hierarchical clustering, comprising the following steps. Step 1: build an author research interest representation model based on the author-keyword bipartite network. Step 2: perform graph clustering on the author research interest representation model: authors whose degrees of attention to each keyword differ little are assigned to the same author group, yielding a set of author groups. Step 3: perform global-level clustering to obtain the research interest of each author group: every group in the author group set that contains only one author is merged into another author group with similar research interests, so that each author group contains at least 2 authors; the research interest of each author group, i.e. the team research direction, is then computed and updated. The invention can effectively mine the academic research directions of a team and facilitates analysing and evaluating how those directions develop.
Description
Technical field
The present invention relates to a team research direction mining method based on bipartite network graph hierarchical clustering.
Background art
In today's increasingly globalized world, teamwork is a very common phenomenon. As disciplines cross and merge and as science, technology and society interpenetrate, higher demands are placed on R&D management and the organization of scientific research. Through resource sharing, cooperation and division of labour, scientific and technological innovation teams greatly improve the efficiency of innovation. As an effective form of organizing research and development, such teams are gradually becoming an important mode of scientific research and innovation activity [1]. To sustain frontier research and to cultivate individuals and teams with an innovative spirit, the National Natural Science Foundation of China set up the "Science Fund for Creative Research Groups" on a trial basis in 2000, to support outstanding domestic creative research groups in carrying out basic research and applied basic research on an important academic research direction [2].
The research direction of a team is an important indicator for evaluating the development of the whole team. The "Trial Measures for the National Natural Science Foundation of China Science Fund for Creative Research Groups" [3] state explicitly that a creative research group must be a whole formed through long-term cooperation, must have relatively concentrated research directions, and must remain active at the frontier of its research field. A scientific paper is generally a scientific record or summary of research into a particular problem in a scientific field: papers that study related academic problems are highly similar, whereas papers that address different problems differ greatly. In the past, this property of papers was used to analyse how the research directions of an academic team are related. That approach, however, does not consider the relationships among team members, and academic relationship networks [4] have a complex structure. For a team's academic direction, one must consider on the one hand the research interests of the team members, and on the other hand the relationships among the research directions within the team.
It is therefore necessary to design a team research direction mining method that combines the research interests of team members with the characteristics of their relationship network.
References
[1] Wang Xinxin. Construction and development strategies of scientific and technological innovation teams [J]. Science & Technology and Economy, 2014, 27(3): 66-69.
[2] Feng Changgen. Creative research groups of the National Natural Science Foundation of China [J]. Science & Technology Review, 2010, 28(7): 125.
[3] Trial Measures for the National Natural Science Foundation of China Science Fund for Creative Research Groups, 2001, 2.
[4] Fang Huang, Jing Liu, Xinmin Liu, et al. Academic Relation Classification Rules Extraction with Correlation Feature Weight Selection [C]. The 3rd Global Congress on Intelligent Systems (GCIS 2012), Nov. 6-8, 2012: 160-165.
[5] Tian Y, Hankins R A, Patel J M. Efficient aggregation for graph summarization [C]. ACM SIGMOD International Conference on Management of Data. 2008: 567-580.
[6] Chen Kehan, Han Panpan, Wu Jian. Recommendation algorithm for heterogeneous social networks based on user clustering [J]. Chinese Journal of Computers, 2013, 38(2): 349-359.
Summary of the invention
The technical problem solved by the invention is to remedy the deficiencies of the prior art by providing a team research direction mining method based on bipartite network graph hierarchical clustering. Building on graph clustering methods, it mines the research directions of a team and thereby facilitates analysing and evaluating how those directions develop.
The technical solution provided by the invention is as follows:
A team research direction mining method based on bipartite network graph hierarchical clustering, comprising the following steps:
Step 1: build an author research interest representation model based on the author-keyword bipartite network;
Step 2: perform graph clustering on the author research interest representation model:
Authors whose degrees of attention to each keyword differ little are assigned to the same author group, yielding a set of author groups;
Step 3: perform global-level clustering to obtain the research interest of each author group:
Every group in the author group set that contains only one author is merged into another author group with similar research interests, so that each author group contains at least 2 authors; the research interest of each author group, i.e. the team research direction, is then computed and updated.
Step 1 is specifically as follows:
Author information and keyword information are extracted from the team's collection of scientific papers to obtain preprocessed data, in which the author set is denoted $V_A = \{A_1, A_2, \ldots, A_N\}$ and the keyword set is denoted $V_K = K = \{k_1, k_2, \ldots, k_M\}$. The keyword lists in the papers of each author $A_n$ are compared with the keyword set $K$, so that the research interest representation of author $A_n$ is $A_n = \{(k_1, w_{n1}), (k_2, w_{n2}), \ldots, (k_M, w_{nM})\}$;
Based on the authors' research interest representations, an $N \times M$ author interest matrix $m$ is constructed, in which the research interest vector of each author $A_n$ in the author set is defined as $v_n = (w_{n1}, w_{n2}, \ldots, w_{nM})$;
The author research interest representation model is expressed as $G = G(V, E)$;
Here $V$ is the set of author nodes and keyword nodes, i.e. $V = V_A \cup V_K$, where $V_A = \{A_1, A_2, \ldots, A_n, \ldots, A_N\}$ is the author set and $V_K = K = \{k_1, k_2, \ldots, k_j, \ldots, k_M\}$ is the keyword set; $N$ and $M$ are respectively the total number of authors in the team and the total number of keywords in the scientific papers of all authors in the team. $E$ is the set of edges between author nodes and keyword nodes, i.e. $E = \{e(A_n, k_j) \mid A_n \in V_A, k_j \in K, w_{nj} > 0\}$. If the keyword lists of author $A_n$'s papers contain keyword $k_j$ of the keyword set, then the weight $w_{nj} > 0$ and an edge $e(A_n, k_j)$ exists between author $A_n$ and keyword $k_j$; otherwise $w_{nj} = 0$ and no edge exists between them.
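Step 1 can be illustrated with a minimal Python sketch. The input format (a list of papers with `authors` and `keywords` fields) and the function names are hypothetical, and the weight $w_{nj}$ is assumed here to be the number of the author's papers that list keyword $k_j$; the text only requires $w_{nj} > 0$ when the keyword occurs.

```python
from collections import Counter

def build_interest_matrix(papers):
    """Return (authors, keywords, matrix) where matrix[n][j] plays the role of w_nj.

    ASSUMPTION: w_nj counts the papers of author n that list keyword j.
    """
    counts = {}  # author -> Counter of keyword occurrences
    for paper in papers:
        for author in paper["authors"]:
            # set(...) so a keyword is counted once per paper
            counts.setdefault(author, Counter()).update(set(paper["keywords"]))
    authors = sorted(counts)
    keywords = sorted({k for c in counts.values() for k in c})
    matrix = [[counts[a][k] for k in keywords] for a in authors]
    return authors, keywords, matrix

def edge_set(authors, keywords, matrix):
    """Edge set E = {(A_n, k_j) | w_nj > 0} of the bipartite graph G(V, E)."""
    return [(a, k) for a, row in zip(authors, matrix)
            for k, w in zip(keywords, row) if w > 0]

# Hypothetical toy data, not taken from the patent's experiment.
papers = [
    {"authors": ["A1", "A2"], "keywords": ["clustering", "graph"]},
    {"authors": ["A1"], "keywords": ["graph"]},
    {"authors": ["A3"], "keywords": ["ocr"]},
]
authors, keywords, m = build_interest_matrix(papers)
E = edge_set(authors, keywords, m)
```

Each row of `m` is one research interest vector $v_n$, and `E` contains an edge only where the corresponding weight is positive.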
In step 2, graph clustering is performed on the author research interest representation model $G = G(V, E)$, comprising the following steps:
2.1) Initialize the author group set $Groups = \{G_0\}$, where $G_0$ is an author group containing all authors in the team;
2.2) For each author group $G_i \in Groups$ and each keyword $k_j \in K$, define the attention set of $G_i$ on $k_j$ as the set of authors $A$ in $G_i$ that pay attention to $k_j$.
2.3) Compute by formula (2) the attention degree $focus_{ij}$ of author group $G_i$ for each keyword $k_j$ ($k_j \in K$), from the number of authors in $G_i$ that pay attention to $k_j$ and the total number $|G_i|$ of authors in $G_i$. If $focus_{ij} \ge \alpha$, author group $G_i$ is said to pay "strong attention" to keyword $k_j$; otherwise it is said to pay "weak attention" to $k_j$. Here $\alpha > 0$ is the attention intensity threshold;
Within an author group, the more concentrated the authors' attention to the keywords, the higher the cohesion of the group. Fuzziness is defined to describe how much the authors within a group differ in their attention to a keyword.
2.4) Compute by formula (3) the fuzziness $fuzzy_{ij}$ of each author group $G_i$ on each keyword $k_j$ ($k_j \in K$). In formula (3), when author group $G_i$ pays "strong attention" to $k_j$, $fuzzy_{ij}$ equals the number of authors in $G_i$ that do not pay attention to $k_j$; when $G_i$ pays "weak attention" to $k_j$, $fuzzy_{ij}$ equals the number of authors in $G_i$ that pay attention to $k_j$.
2.5) From $fuzzy_{ij}$, compute the fuzziness $fuzzy_i$ of each author group $G_i$ over the keyword set $K$, where $|K|$ is the total number of keywords in $K$, i.e. $M$;
2.6) Compute the overall fuzziness $Fuzzy$ of the current $Groups$;
2.7) Find the maximum of $fuzzy_{ij}$ and take the corresponding keyword $k_j$ as the locking keyword $k_{j'}$;
Find the maximum of $fuzzy_i$ and take the corresponding author group $G_i$ as the group to be split, $G_{i'}$;
Split the group to be split into two new author groups $G_{i1}$ and $G_{i2}$, and update the author group set $Groups$:
$G_{i1} = \{A_n \in G_{i'} \mid w_{nj'} > 0\}$
$G_{i2} = G_{i'} - G_{i1}$;
2.8) Repeat steps 2.2) to 2.7) until the number of author groups in $Groups$ is $k$;
This clustering result is denoted $Groups = \{G_1, G_2, \ldots, G_k\}$, where $k$ is the number of classes in the clustering result. The groups form a partition of the author set:
(1) every $G_i \in Groups$ is non-empty and the groups together cover $V_A$;
(2) for any $G_i, G_j \in Groups$ with $i \neq j$, $G_i \cap G_j = \varnothing$;
2.9) Compare the overall fuzziness $Fuzzy$ of the clustering result $Groups$ obtained in step 2.6) at each stage, and take the $Groups$ with the smallest $Fuzzy$ as the final clustering result, denoted $summaryGroups$;
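The formulas referenced in steps 2.2) to 2.6) are not present in the text. The block below is a hedged reconstruction: the first three lines follow directly from the verbal definitions in steps 2.2) to 2.4), with the symbol $F_{ij}$ for the attention set being an assumed name. The last two lines (a mean over keywords and a sum over groups) are only assumptions consistent with the mention of $|K|$, not confirmed forms.

```latex
\begin{align*}
F_{ij} &= \{\, A \in G_i \mid w_{Aj} > 0 \,\} \\
focus_{ij} &= \frac{|F_{ij}|}{|G_i|} \\
fuzzy_{ij} &=
  \begin{cases}
    |G_i| - |F_{ij}|, & focus_{ij} \ge \alpha \quad \text{(strong attention)} \\
    |F_{ij}|,         & focus_{ij} < \alpha \quad \text{(weak attention)}
  \end{cases} \\
fuzzy_i &= \frac{1}{|K|} \sum_{k_j \in K} fuzzy_{ij} \qquad \text{(assumed form)} \\
Fuzzy &= \sum_{G_i \in Groups} fuzzy_i \qquad \text{(assumed form)}
\end{align*}
```

For example, in a group of 4 authors of whom 3 pay attention to $k_j$, with $\alpha = 0.5$, $focus_{ij} = 0.75 \ge \alpha$, so the group pays strong attention to $k_j$ and $fuzzy_{ij} = 4 - 3 = 1$.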
The algorithm corresponding to step 2 proceeds as follows:
The algorithm starts from a single group containing all authors and, at each iteration, splits one of the existing groups until $k$ groups have been obtained. It does not choose the group to split at random but according to the groups' attention to the keywords. By the definition of fuzziness, when an author group pays "weak attention" to a keyword, we want to separate out of it the set of authors that do pay attention to that keyword; when it pays "strong attention", we want to separate out the authors that do not. Either operation leaves groups whose attention to that keyword is more concentrated. The group to be split is chosen on this basis and then split. The clustering result of every stage is saved during the iterations, and finally the best clustering result is selected.
The pseudo-code of the algorithm is described as follows:
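The pseudo-code itself is not present in the text. The following is a minimal Python sketch of steps 2.1) to 2.9) under stated assumptions, not the patent's own listing: `interests` is a hypothetical mapping from each author to the set of keywords with $w_{nj} > 0$; $fuzzy_i$ is assumed to be the mean of $fuzzy_{ij}$ over keywords and $Fuzzy$ the sum of $fuzzy_i$ over groups; the locking keyword is assumed to be chosen within the group being split; and $\alpha$ is assumed to lie in $(0, 1]$.

```python
def fuzzy_ij(group, kw, interests, alpha):
    """Fuzziness of one group on one keyword, as described in step 2.4)."""
    n_focus = sum(1 for a in group if kw in interests[a])
    strong = n_focus / len(group) >= alpha      # attention degree vs. threshold
    return len(group) - n_focus if strong else n_focus

def group_fuzzy(group, keywords, interests, alpha):
    # ASSUMPTION: fuzzy_i is the mean of fuzzy_ij over all |K| keywords.
    return sum(fuzzy_ij(group, kw, interests, alpha) for kw in keywords) / len(keywords)

def total_fuzzy(groups, keywords, interests, alpha):
    # ASSUMPTION: Fuzzy is the sum of fuzzy_i over all groups.
    return sum(group_fuzzy(g, keywords, interests, alpha) for g in groups)

def graph_cluster(interests, k, alpha=0.5):
    """Split groups until k groups exist; return the stage with the smallest Fuzzy."""
    keywords = sorted({kw for kws in interests.values() for kw in kws})
    groups = [sorted(interests)]                                    # step 2.1
    stages = [(total_fuzzy(groups, keywords, interests, alpha), groups)]
    while len(groups) < k:
        # Step 2.7: the group with the largest fuzzy_i is the one to split ...
        target = max(groups, key=lambda g: group_fuzzy(g, keywords, interests, alpha))
        # ... and (assumption) its keyword with the largest fuzzy_ij is the locking keyword.
        lock = max(keywords, key=lambda kw: fuzzy_ij(target, kw, interests, alpha))
        if fuzzy_ij(target, lock, interests, alpha) == 0:
            break                                                   # no group can be split further
        g1 = [a for a in target if lock in interests[a]]
        g2 = [a for a in target if lock not in interests[a]]
        groups = [g for g in groups if g is not target] + [g1, g2]
        stages.append((total_fuzzy(groups, keywords, interests, alpha), groups))
    return min(stages, key=lambda s: s[0])[1]                       # step 2.9

# Hypothetical toy data: two pairs of authors with disjoint interests.
interests = {
    "A1": {"graph", "cluster"},
    "A2": {"graph", "cluster"},
    "A3": {"ocr"},
    "A4": {"ocr", "font"},
}
result = graph_cluster(interests, k=2)
```

On this toy input the single initial group is split on the keyword `cluster`, which separates the two pairs of authors. Ties are broken by list order, a choice the text does not specify.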
Because the author-keyword bipartite network data are sparse, the result of graph clustering on the author-keyword bipartite network may place individual authors in classes of their own. These discrete author nodes may well represent new research directions that the academic team is opening up. The discrete author nodes therefore need to be processed so that the academic research directions of the whole team are easier to grasp.
Step 3 specifically comprises the following steps:
3.1) Divide the author groups in the clustering result $summaryGroups$ obtained in step 2 into non-discrete author groups and discrete author groups, where a discrete author group is an author group that contains only one author; take the non-discrete author groups as the initial clusters;
3.2) Compute the class research interest vector $GMI_i$ (Group Major Interests) of each non-discrete author group $G_i$ over the keyword set $K$ as the centre of each initial cluster;
$GMI_i = (GW_{i1}, GW_{i2}, \ldots, GW_{ij}, \ldots, GW_{iM})$ (6)
where $GW_{ij}$ ($j = 1, 2, \ldots, M$) quantifies the attention of $G_i$ to keyword $k_j$;
3.3) Traverse each author $A_n$ in the discrete author groups and compute the Euclidean distance between its research interest vector $v_n = (w_{n1}, w_{n2}, \ldots, w_{nj}, \ldots, w_{nM})$ and the centre of each initial cluster;
3.4) Compare the Euclidean distances between $A_n$ and the centres of the initial clusters, select the non-discrete author group with the smallest distance, and assign $A_n$ to it, i.e. merge the discrete author group containing only $A_n$ with that non-discrete author group to form a new author group;
3.5) Iterate steps 3.1) to 3.4) until the resulting author groups no longer change;
3.6) Compute and update the class research interest vector of each author group.
The pseudo-code of the algorithm corresponding to step 3 is described as follows:
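The pseudo-code for step 3 is likewise not present in the text. The following minimal Python sketch shows one way to carry out steps 3.1) to 3.6); it is not the patent's own listing. The function names and the toy data are hypothetical, and $GW_{ij}$ is assumed to be the mean of the members' weights $w_{nj}$ (a k-means style centroid), since its defining formula is missing.

```python
import math

def center(group, vectors):
    # ASSUMPTION: GW_ij is the mean of the members' weights w_nj.
    m = len(next(iter(vectors.values())))
    return [sum(vectors[a][j] for a in group) / len(group) for j in range(m)]

def merge_singletons(groups, vectors):
    """Merge every single-author group into the nearest multi-author group.

    Returns the merged groups and their updated centre vectors (GMI_i).
    """
    clusters = [list(g) for g in groups if len(g) > 1]   # initial clusters (step 3.1)
    singles = [g[0] for g in groups if len(g) == 1]      # discrete authors
    if not clusters:
        return [list(g) for g in groups], []
    for author in singles:
        centers = [center(c, vectors) for c in clusters]  # step 3.2, refreshed after each merge
        nearest = min(range(len(clusters)),
                      key=lambda i: math.dist(vectors[author], centers[i]))  # steps 3.3 and 3.4
        clusters[nearest].append(author)
    return clusters, [center(c, vectors) for c in clusters]  # step 3.6

# Hypothetical interest vectors over two keywords.
vectors = {"A1": [2, 0], "A2": [1, 0], "A3": [0, 1], "A4": [0, 2], "A5": [0, 3]}
groups = [["A1", "A2"], ["A3", "A4"], ["A5"]]
clusters, centers = merge_singletons(groups, vectors)
```

After one pass no single-author group remains, so repeating the procedure as in step 3.5) leaves the groups unchanged.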
Beneficial effects:
From the mined information about a team's members and their scientific papers (paper titles, participating author lists, keyword lists and so on), the invention preprocesses the resulting data set and uses the author and keyword information of the team's papers to characterize and quantify the authors' research interests. Exploiting the particular topology of bipartite networks and the inherent bipartite relationship between authors and keywords in a team, it builds an author research interest representation model based on the author-keyword bipartite network. Graph clustering is then performed on the author-keyword bipartite network to mine the main-body features of the team, giving a preliminary understanding of the network. Finally, on top of the main-body features of the author-keyword bipartite network, global-level clustering of the network yields the overall academic research directions of the team members, laying a foundation for future analysis of the team's academic directions.
The invention combines a graph summarization algorithm, a hierarchical clustering algorithm and the k-means algorithm, and proposes a graph clustering algorithm based on the author-keyword bipartite network and a network global-level clustering algorithm. According to how much the authors in a team differ in their attention to different keywords, the authors in the team are clustered into $k$ author groups; within an author group, the authors' attention to keywords is relatively concentrated. The invention can effectively mine the academic research directions of a team and facilitates analysing and evaluating how those directions develop.
Brief description of the drawings
Fig. 1 is the flow of the team research direction mining method;
Fig. 2 is the author-keyword bipartite network;
Fig. 3 is the result of graph clustering on the author-keyword bipartite network when k=5;
Fig. 4 is the result of graph clustering on the author-keyword bipartite network when k=7;
Fig. 5 shows the 4 author groups produced by network global-level clustering;
Fig. 6 shows the 5 author groups produced by network global-level clustering.
Detailed description of the embodiments
The invention is described in more detail below with reference to the drawings and specific embodiments.
The invention provides a team research direction mining method based on bipartite network graph hierarchical clustering, comprising the following steps:
Step 1: build an author research interest representation model $G = G(V, E)$ based on the author-keyword bipartite network;
Here $V$ is the set of author nodes and keyword nodes, i.e. $V = V_A \cup V_K$, where $V_A = \{A_1, A_2, \ldots, A_n, \ldots, A_N\}$ is the author set and $V_K = K = \{k_1, k_2, \ldots, k_j, \ldots, k_M\}$ is the keyword set; $N$ and $M$ are respectively the total number of authors in the team and the total number of keywords in the scientific papers of all authors in the team. $E$ is the set of edges between author nodes and keyword nodes, i.e. $E = \{e(A_n, k_j) \mid A_n \in V_A, k_j \in K, w_{nj} > 0\}$. If the keyword lists of author $A_n$'s papers contain keyword $k_j$ of the keyword set, then the weight $w_{nj} > 0$ and an edge $e(A_n, k_j)$ exists between author $A_n$ and keyword $k_j$; otherwise $w_{nj} = 0$ and no edge exists between them.
Step 2: perform graph clustering on the author research interest representation model: authors whose degrees of attention to each keyword differ little are assigned to the same author group, yielding a set of author groups;
2.1) Initialize the author group set $Groups = \{G_0\}$, where $G_0$ is an author group containing all authors in the team;
2.2) For each author group $G_i \in Groups$ and each keyword $k_j \in K$, define the attention set of $G_i$ on $k_j$ as the set of authors $A$ in $G_i$ that pay attention to $k_j$.
2.3) Compute by formula (2) the attention degree $focus_{ij}$ of author group $G_i$ for each keyword $k_j$ ($k_j \in K$), from the number of authors in $G_i$ that pay attention to $k_j$ and the total number $|G_i|$ of authors in $G_i$. If $focus_{ij} \ge \alpha$, author group $G_i$ is said to pay "strong attention" to keyword $k_j$; otherwise it is said to pay "weak attention" to $k_j$. Here $\alpha > 0$ is the attention intensity threshold;
2.4) Compute by formula (3) the fuzziness $fuzzy_{ij}$ of each author group $G_i$ on each keyword $k_j$ ($k_j \in K$);
2.5) From $fuzzy_{ij}$, compute the fuzziness $fuzzy_i$ of each author group $G_i$ over the keyword set $K$, where $|K|$ is the total number of keywords in $K$, i.e. $M$;
2.6) Compute the overall fuzziness $Fuzzy$ of the current $Groups$;
2.7) Find the maximum of $fuzzy_{ij}$ and take the corresponding keyword $k_j$ as the locking keyword $k_{j'}$;
Find the maximum of $fuzzy_i$ and take the corresponding author group $G_i$ as the group to be split, $G_{i'}$;
Split the group to be split into two new author groups $G_{i1}$ and $G_{i2}$, and update the author group set $Groups$:
$G_{i1} = \{A_n \in G_{i'} \mid w_{nj'} > 0\}$
$G_{i2} = G_{i'} - G_{i1}$;
2.8) Repeat steps 2.2) to 2.7) until the number of author groups in $Groups$ is $k$;
2.9) Compare the overall fuzziness $Fuzzy$ of the clustering result $Groups$ obtained in step 2.6) at each stage, and take the $Groups$ with the smallest $Fuzzy$ as the final clustering result, denoted $summaryGroups$.
Step 3: perform global-level clustering to obtain the research interest of each author group:
Every group in the author group set that contains only one author is merged into another author group with similar research interests, so that each author group contains at least 2 authors; the research interest of each author group, i.e. the team research direction, is then computed and updated;
3.1) Divide the author groups in the clustering result $summaryGroups$ obtained in step 2 into non-discrete author groups and discrete author groups, where a discrete author group is an author group that contains only one author; take the non-discrete author groups as the initial clusters;
3.2) Compute the class research interest vector $GMI_i$ of each non-discrete author group $G_i$ over the keyword set $K$ as the centre of each initial cluster;
$GMI_i = (GW_{i1}, GW_{i2}, \ldots, GW_{ij}, \ldots, GW_{iM})$ (6)
where $GW_{ij}$ ($j = 1, 2, \ldots, M$) quantifies the attention of $G_i$ to keyword $k_j$;
3.3) Traverse each author $A_n$ in the discrete author groups and compute the Euclidean distance between its research interest vector $v_n = (w_{n1}, w_{n2}, \ldots, w_{nj}, \ldots, w_{nM})$ and the centre of each initial cluster;
3.4) Compare the Euclidean distances between $A_n$ and the centres of the initial clusters, select the non-discrete author group with the smallest distance, and assign $A_n$ to it, i.e. merge the discrete author group containing only $A_n$ with that non-discrete author group to form a new author group;
3.5) Iterate steps 3.1) to 3.4) until the resulting author groups no longer change;
3.6) Compute and update the class research interest vector of each author group, i.e. the team research direction vector.
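The expressions for the Euclidean distance in step 3.3) and for $GW_{ij}$ in step 3.2) are not present in the text. The distance follows directly from the definitions of $v_n$ and $GMI_i$. The form of $GW_{ij}$ shown here is only an assumption (a centroid in the k-means sense, which the description says the method draws on), not a confirmed formula.

```latex
\begin{align*}
d(A_n, G_i) &= \sqrt{\sum_{j=1}^{M} \left( w_{nj} - GW_{ij} \right)^2} \\
GW_{ij} &= \frac{1}{|G_i|} \sum_{A_n \in G_i} w_{nj} \qquad \text{(assumed form)}
\end{align*}
```

Under this assumption, a group with member vectors $(0,1)$ and $(0,2)$ has centre $(0, 1.5)$, and a discrete author with vector $(0,3)$ lies at distance $1.5$ from it.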
The main flow of the invention is shown in Fig. 1:
Fig. 1 shows the flow of the team research direction mining method. Starting from the team's scientific paper data, an author research interest representation model based on the author-keyword bipartite network is built; graph clustering is then performed on the model to obtain the main-body features of the team's research interests; finally the mined main-body features are analysed and network global-level clustering is performed to obtain the overall research directions of the team.
Experimental analysis
1 Data source
This part takes a computer science and technology research team as the object of study, verifies the method experimentally on the team's paper data set, and presents the experimental results both visually and in tables. The author research interest representation model and the author-group research interest representation model in this part are both visualized with the Gephi software.
2 Author-keyword bipartite network
By analysing the composition of the team and its scientific paper data set, the research interest model of the team is found to contain 23 team members and 547 paper keywords; the initial author-keyword bipartite network built from them is shown in Fig. 2. Fig. 2 shows only the names of the authors in the team; the research interest fields of each author are scattered around that author, and because there are so many nodes the keyword attributes are not displayed.
3 Main-body features of the team's research interests
1. Graph clustering result on the author-keyword bipartite network when k=5
Fig. 3 shows the result of graph clustering on the author-keyword bipartite network when k=5: it consists of 552 nodes (the author groups after clustering and the keywords) and 714 edges. The groups differ in size: group 1 has relatively many members, while group 4 contains only one team member. This information can also be presented in a table; Table 1 lists the attention of the 5 author groups to part of the keyword set.
Table 1 Partial example of the main-body features of the team's research interests (k=5)
2. Graph clustering result on the author-keyword bipartite network when k=7
Fig. 4 shows the result of graph clustering on the author-keyword bipartite network when k=7: 554 nodes and 721 edges. Compared with Fig. 3, the larger number of clusters produces more small groups, such as group 4, group 5 and group 7 in Fig. 4. Similarly, Table 2 lists part of the research fields that the 7 author groups pay attention to:
Table 2 Partial example of the main-body features of the team's research interests (k=7)
After the original team author research interest representation model based on the bipartite network is mined with the graph clustering algorithm based on the author-keyword bipartite network, the information in the original model is simplified and attention is focused on the main structure of the model, which helps in grasping the team's main research directions.
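The node counts reported for the clustered networks are consistent with one node per author group plus one node per keyword: 5 + 547 = 552 for k=5 and 7 + 547 = 554 for k=7. The following minimal Python sketch builds such a summarized group-keyword graph on hypothetical toy data. The edge rule used here (a group is linked to a keyword if at least one member is) is an assumption; the text does not state how group-keyword edges are drawn.

```python
def summary_graph(groups, interests):
    """Collapse each author group into one node; return (node count, edge count)."""
    keywords = {kw for kws in interests.values() for kw in kws}
    # ASSUMPTION: group i is linked to a keyword if any of its members is.
    edges = {(i, kw) for i, g in enumerate(groups) for a in g for kw in interests[a]}
    return len(groups) + len(keywords), len(edges)

# Hypothetical toy data: 4 authors, 4 keywords, 2 groups.
interests = {
    "A1": {"graph", "cluster"},
    "A2": {"graph", "cluster"},
    "A3": {"ocr"},
    "A4": {"ocr", "font"},
}
nodes, n_edges = summary_graph([["A1", "A2"], ["A3", "A4"]], interests)
```

With 2 groups and 4 keywords the summarized graph has 6 nodes. Compared with the original author-keyword graph, edges that members of one group share with the same keyword collapse into a single edge.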
4 Analysis of the network global-level clustering results
1. Processing the main-body features of the team's research interests for k=5
First, the network global-level clustering method is applied to the main-body features of the team's research interests for k=5. The resulting overall research interests of the team, with 551 nodes and 693 edges, are shown in Fig. 5; the number of clusters is now 4. Compared with Fig. 3, the research interests of the team members are clearer and more concentrated.
Table 3 lists the attention of the 4 author groups to part of the keyword set. Compared with Table 1, the attention of group 3 and group 4 to some keywords (for example lens distortion and Chinese character components) has changed.
Table 3 Partial example of the team's overall research directions
2. Processing the main-body features of the team's research interests for k=7
Next, network global-level clustering is applied to the main-body features of the team's research interests for k=7. Because the team main-body feature information in Fig. 4 contains two discrete author nodes, the number of author groups after global clustering is 5. The global clustering result is shown in Fig. 6 and contains 552 nodes and 679 edges.
Table 4 lists the 5 author groups and part of the research fields they pay attention to. Compared with Table 2, the classification of the discrete author nodes has changed the research interest fields that each group pays attention to after network global-level clustering.
Table 4 Partial example of the team's overall research directions
The invention first builds an author research interest representation model based on the bipartite network. Then, in view of the characteristics of this model, it introduces a graph clustering algorithm based on the author-keyword bipartite network to mine the main-body features of the network and thereby obtain the main research directions of the team. Finally, borrowing the basic idea of the k-means algorithm, it mines the network at the global level on the basis of the network's main-body features to obtain the overall research interest fields of the team. The invention can effectively mine the academic research directions of a team and facilitates analysing and evaluating how those directions develop.
Claims (4)
1. A team research direction mining method based on bipartite network graph hierarchical clustering, characterized by comprising the following steps:
Step 1: build an author research interest representation model based on the author-keyword bipartite network;
Step 2: perform graph clustering on the author research interest representation model:
Authors whose degrees of attention to each keyword differ little are assigned to the same author group, yielding a set of author groups;
Step 3: perform global-level clustering to obtain the research interest of each author group:
Every group in the author group set that contains only one author is merged into another author group with similar research interests, so that each author group contains at least 2 authors; the research interest of each author group, i.e. the team research direction, is then computed and updated.
2. The team research direction mining method based on bipartite network graph hierarchical clustering according to claim 1, characterized in that, in step 1, the author research interest representation model is expressed as $G = G(V, E)$;
Here $V$ is the set of author nodes and keyword nodes, i.e. $V = V_A \cup V_K$, where $V_A = \{A_1, A_2, \ldots, A_n, \ldots, A_N\}$ is the author set and $V_K = K = \{k_1, k_2, \ldots, k_j, \ldots, k_M\}$ is the keyword set; $N$ and $M$ are respectively the total number of authors in the team and the total number of keywords in the scientific papers of all authors in the team. $E$ is the set of edges between author nodes and keyword nodes, i.e. $E = \{e(A_n, k_j) \mid A_n \in V_A, k_j \in K, w_{nj} > 0\}$. If the keyword lists of author $A_n$'s papers contain keyword $k_j$ of the keyword set, then the weight $w_{nj} > 0$ and an edge $e(A_n, k_j)$ exists between author $A_n$ and keyword $k_j$; otherwise $w_{nj} = 0$ and no edge exists between them.
3. The team research direction mining method based on bipartite network graph hierarchical clustering according to claim 1, characterized in that step 2 comprises the following steps:
2.1) Initialize the author group set $Groups = \{G_0\}$, where $G_0$ is an author group containing all authors in the team;
2.2) For each author group $G_i \in Groups$ and each keyword $k_j \in K$, define the attention set of $G_i$ on $k_j$ as the set of authors $A$ in $G_i$ that pay attention to $k_j$;
2.3) Compute by formula (2) the attention degree $focus_{ij}$ of author group $G_i$ for each keyword $k_j$ ($k_j \in K$), from the number of authors in $G_i$ that pay attention to $k_j$ and the total number $|G_i|$ of authors in $G_i$. If $focus_{ij} \ge \alpha$, author group $G_i$ is said to pay "strong attention" to keyword $k_j$; otherwise it is said to pay "weak attention" to $k_j$. Here $\alpha > 0$ is the attention intensity threshold;
2.4) Compute by formula (3) the fuzziness $fuzzy_{ij}$ of each author group $G_i$ on each keyword $k_j$ ($k_j \in K$);
2.5) From $fuzzy_{ij}$, compute the fuzziness $fuzzy_i$ of each author group $G_i$ over the keyword set $K$, where $|K|$ is the total number of keywords in $K$, i.e. $M$;
2.6) Compute the overall fuzziness $Fuzzy$ of the current $Groups$;
2.7) Find the maximum of $fuzzy_{ij}$ and take the corresponding keyword $k_j$ as the locking keyword $k_{j'}$;
Find the maximum of $fuzzy_i$ and take the corresponding author group $G_i$ as the group to be split, $G_{i'}$;
Split the group to be split into two new author groups $G_{i1}$ and $G_{i2}$, and update the author group set $Groups$:
$G_{i1} = \{A_n \in G_{i'} \mid w_{nj'} > 0\}$
$G_{i2} = G_{i'} - G_{i1}$;
2.8) Repeat steps 2.2) to 2.7) until the number of author groups in $Groups$ is $k$;
2.9) Compare the overall fuzziness $Fuzzy$ of the clustering result $Groups$ obtained in step 2.6) at each stage, and take the $Groups$ with the smallest $Fuzzy$ as the final clustering result, denoted $summaryGroups$.
Team's research direction method for digging based on two subnetwork figure hierarchical clusterings the most according to claim 3, its feature
Being, described step 3 comprises the following steps:
3.1) divide the author groups in the clustering result summaryGroups obtained in step 2 into discrete author groups and
non-discrete author groups; a discrete author group is an author group that contains only one author; take the non-discrete author groups as
the initial clusters;
3.2) calculate the class research interest vector GMI_i of each non-discrete author group G_i over the keyword set K, and use it as the center of
the corresponding initial cluster;
GMI_i = (GW_i1, GW_i2, …, GW_ij, …, GW_iM) (6)
where GW_ij (j = 1, 2, …, M) represents the attention of G_i to keyword k_j, quantified as:
3.3) traverse each author A_n in the discrete author groups and calculate the Euclidean distance between A_n and the center of each initial cluster; the calculation
is as follows:
let the research interest vector of author A_n be v_n = (w_n1, w_n2, …, w_nj, …, w_nM)
3.4) compare the Euclidean distances between A_n and the centers of the initial clusters, select the non-discrete author group with the smallest
Euclidean distance, and assign A_n to it, i.e. merge the discrete author group that contains only author A_n with this non-discrete author
group to form a new author group;
3.5) iterate steps 3.1) to 3.4) until the resulting author groups no longer change;
3.6) calculate and update the class research interest vector of each author group.
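The merging procedure of steps 3.1) to 3.6) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the names `center`, `euclidean` and `merge_discrete` are hypothetical, the sample vectors are invented, and the group center is assumed to be the component-wise mean of the members' research interest vectors, since the quantitative definition of GW_ij is not reproduced in this text.

```python
import math
from typing import Dict, List

Vector = List[float]


def center(group: List[str], vectors: Dict[str, Vector]) -> Vector:
    """Assumed group center: component-wise mean of the members' interest vectors."""
    m = len(vectors[group[0]])
    return [sum(vectors[a][j] for a in group) / len(group) for j in range(m)]


def euclidean(u: Vector, v: Vector) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def merge_discrete(groups: List[List[str]],
                   vectors: Dict[str, Vector]) -> List[List[str]]:
    """Fold every single-author group into the nearest multi-author group."""
    clusters = [list(g) for g in groups if len(g) > 1]  # step 3.1: initial clusters
    singles = [g[0] for g in groups if len(g) == 1]     # step 3.1: discrete groups
    if not clusters:
        return [list(g) for g in groups]                # nothing to merge into
    centers = [center(c, vectors) for c in clusters]    # step 3.2: initial centers
    for author in singles:                              # step 3.3: distances
        dists = [euclidean(vectors[author], c) for c in centers]
        nearest = dists.index(min(dists))               # step 3.4: closest cluster
        clusters[nearest].append(author)
    return clusters


# Hypothetical research interest vectors over M = 2 keywords.
vectors = {
    "A1": [1.0, 0.0], "A2": [0.8, 0.2],
    "A3": [0.0, 1.0], "A4": [0.1, 0.9],
    "A5": [0.9, 0.0],
}
groups = [["A1", "A2"], ["A3", "A4"], ["A5"]]
merged = merge_discrete(groups, vectors)
updated_centers = [center(c, vectors) for c in merged]  # step 3.6
```

Because distances are measured against the centers of the initial clusters, one pass leaves no single-author group behind, so repeating the pass as in step 3.5) would not change the result in this sketch.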
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610595145.XA CN106227835B (en) | 2016-07-25 | 2016-07-25 | Team's research direction method for digging based on two subnetwork figure hierarchical clusterings |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106227835A (en) | 2016-12-14 |
CN106227835B (en) | 2018-01-19 |
Family
ID=57533613
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610595145.XA Expired - Fee Related CN106227835B (en) | Team's research direction method for digging based on two subnetwork figure hierarchical clusterings | 2016-07-25 | 2016-07-25 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106227835B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102254028A (en) * | 2011-07-22 | 2011-11-23 | 青岛理工大学 | Personalized commodity recommending method and system which integrate attributes and structural similarity |
CN102609546A (en) * | 2011-12-08 | 2012-07-25 | 清华大学 | Method and system for excavating information of academic journal paper authors |
CN103020302A (en) * | 2012-12-31 | 2013-04-03 | 中国科学院自动化研究所 | Academic core author excavation and related information extraction method and system based on complex network |
CN103559262A (en) * | 2013-11-04 | 2014-02-05 | 北京邮电大学 | Community-based author and academic paper recommending system and recommending method |
Non-Patent Citations (1)
Title |
---|
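Liu Feifan et al.: "Research on potential collaborators based on 2-mode networks and the G-N community clustering algorithm" (基于2-模网络和G-N 社群聚类算法的潜在合作者研究), 《情报理论与实践》 * |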
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107256231A (en) * | 2017-05-04 | 2017-10-17 | 腾讯科技(深圳)有限公司 | A kind of Team Member's identification equipment, method and system |
CN107256231B (en) * | 2017-05-04 | 2022-04-22 | 腾讯科技(深圳)有限公司 | Team member identification device, method and system |
WO2019079971A1 (en) * | 2017-10-24 | 2019-05-02 | 深圳市云中飞网络科技有限公司 | Method for group communication, and apparatus, computer storage medium, and computer device |
CN108491409A (en) * | 2018-01-29 | 2018-09-04 | 浙江工业大学 | A kind of city medical system clustering method based on hospital's related network structure feature |
CN108491409B (en) * | 2018-01-29 | 2022-06-17 | 浙江工业大学 | Urban medical system clustering method based on hospital associated network structural features |
CN109376236A (en) * | 2018-07-27 | 2019-02-22 | 中山大学 | A kind of academic paper author's weight analysis method based on clustering |
CN109376236B (en) * | 2018-07-27 | 2021-10-26 | 中山大学 | Academic paper author weight analysis method based on cluster analysis |
CN109741791A (en) * | 2018-12-29 | 2019-05-10 | 人和未来生物科技(长沙)有限公司 | A kind of author's subject bearing data method for digging and system towards PubMed paper library |
CN109829634A (en) * | 2019-01-18 | 2019-05-31 | 北京工业大学 | A kind of adaptive patent Research Team, colleges and universities recognition methods |
CN109829634B (en) * | 2019-01-18 | 2021-02-26 | 北京工业大学 | Self-adaptive college patent and scientific research team identification method |
CN110941662A (en) * | 2019-06-24 | 2020-03-31 | 上海市研发公共服务平台管理中心 | Graphical method, system, storage medium and terminal for scientific research cooperative relationship |
Also Published As
Publication number | Publication date |
---|---|
CN106227835B (en) | 2018-01-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| C06 | Publication | |
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
| CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20180119; Termination date: 20210725 |