CN108510205B - Author skill evaluation method based on hypergraph - Google Patents
- Publication number
- CN108510205B (application CN201810316651.XA)
- Authority
- CN
- China
- Prior art keywords
- author
- skill
- field
- distance
- time
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06398—Performance of employee with respect to a job function
Abstract
The invention belongs to the technical field of scholar skill evaluation and relates to a hypergraph-based scholar skill evaluation method that evaluates, at fine granularity, a scholar's level in a specific skill within a specific field and reflects how that skill changes over time. The method considers the number of papers, the quality of the papers, the differences between fields, and change over time. Using the hypergraph concept allows the method to fuse scholars, fields, and skills, and thus to provide a fine-grained evaluation scheme. The distances between scholar, field, and skill are computed by extending traditional evaluation parameters such as paper citation count and H-index, which ensures reliability, while normalization improves computational efficiency and reduces error. Finally, a time factor is added so that the method can analyze how a scholar's fields and skills change over time, providing more material for research.
Description
Technical Field
The invention belongs to the technical field of author skill evaluation, and relates to an author skill evaluation method based on a hypergraph.
Background
With the continuous development of science and technology, more and more authors are engaged in scientific research, and the growing number of researchers has in turn promoted research on researchers themselves. Evaluating a researcher's level, areas of expertise, and publication patterns supports the building of research teams, project investment, the comprehensive evaluation of authors' academic levels, the comparison of different authors, the study of academic collaboration mechanisms, and the discovery of latent patterns in scientific research, and thereby benefits the progress and development of academia and of human society.
Currently, author level is mostly evaluated with indexes such as the H-index, the number of citations of an author's papers, and the number of papers published. These indexes generally evaluate an author as a whole: they cannot reveal the author's proficiency in a particular skill or how the author's academic level changes over a given period, which limits the scope and depth of research to some extent.
A hypergraph is a generalized graph in which a hyperedge may contain any number of vertices. This property of the hyperedge lets a hypergraph fuse multiple attributes of an author, which makes it well suited to handling the three elements required in author skill research: author, field, and skill.
Disclosure of Invention
To address the shortcomings of existing research, the invention provides an author skill evaluation method: a hypergraph-based author evaluation algorithm that analyzes the author's contribution to each published paper and incorporates a time factor. The algorithm evaluates the author at fine granularity. It considers the number and quality of the papers as well as the differences between fields, and it yields the author's proficiency in a specific skill within a specific field. Because a time factor is added, it also yields the change of the author's skills over time.
The technical scheme of the invention is as follows:
A hypergraph-based author skill evaluation method comprises the following steps:
step 1): combining the author, the skill the author contributed to the paper, and the field of the paper into a hyperedge; counting all papers the author participated in and merging the skill types to obtain statistical data on the author, the skills, and the fields of the papers; the hypergraph has good compatibility and can integrate the three factors of author, field, and skill; the information of the papers published by the author is collected, with the author, the skill, and the field serving as the three vertices of the hyperedge;
the hyperedge connects a specific skill of an author in a specific field; the author's proficiency in that skill in that field is reflected by the weight of the hyperedge, and using the hypergraph effectively reduces the scale of the author network;
the skill names are numerous and inconsistent, which would affect the calculations in the following steps, so the skills in the data set are merged to obtain a uniform, standardized data set;
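Step 1) can be illustrated with a minimal sketch, assuming each paper record carries a field and a list of (author, skill) contributions. The record layout, the sample names, and the function `build_hyperedges` are hypothetical and are not taken from the patent; the sketch only shows how (author, skill, field) triples could be collected as hyperedges.

```python
from collections import Counter

# Hypothetical paper records: the field of the paper plus (author, skill) contributions.
papers = [
    {"field": "Biology", "contributions": [("A. Li", "data analysis"), ("B. Wang", "writing")]},
    {"field": "Biology", "contributions": [("A. Li", "data analysis")]},
    {"field": "Medicine", "contributions": [("A. Li", "writing")]},
]

def build_hyperedges(papers):
    """Count how many papers support each (author, skill, field) hyperedge."""
    edges = Counter()
    for paper in papers:
        # A set avoids double counting if a paper lists the same contribution twice.
        for author, skill in set(paper["contributions"]):
            edges[(author, skill, paper["field"])] += 1
    return edges

hyperedges = build_hyperedges(papers)
```

Each key of `hyperedges` is one hyperedge with its three vertices; the count is only supporting evidence here, not the hyperedge weight computed in step 3).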
step 2): combining the three vertices of the hyperedge pairwise and calculating the distance between each pair of vertices in the hyperedge; the distances are calculated by the following formulas:
distance of author j from field f:
where n is the number of authors in the field, n_f is the total number of papers in the field, c_i is the citation count of paper i, and h_j is the H-index of author j;
the distance between author j and field f is normalized to facilitate subsequent data processing; the normalization formula is as follows:
where avg(dis(field)) is the mean of the distances between all authors and the field, i.e. the distances are summed and then averaged;
distance of author j from skill s:
where n is the number of papers in which the author uses the skill, c_i is the citation count of paper i, h_j is the H-index of author j, and n_is is the number of authors who use skill s in paper i;
the distance of author j from skill s is normalized using the following formula:
where avg(dis) is the mean of the distances between all authors and skills, i.e. the distances are summed and then averaged;
distance of field f from skill s:
where n_f is the total number of papers in the field, n_s is the total number of papers containing the skill, and n_fs is the number of domains that contain the skill;
the distance of field f from skill s is normalized using the following formula:
where avg(field, skill) is the mean of the distances between all fields and skills, i.e. the distances are summed and then averaged;
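Two ingredients of step 2) can be sketched without the distance formulas themselves, which are not reproduced in this text. The first function computes the standard H-index from citation counts. The second shows one plausible reading of the normalization, dividing each raw distance by the mean of all distances; that reading is an assumption, and the names `h_index` and `normalize_by_mean` are hypothetical.

```python
def h_index(citations):
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

def normalize_by_mean(distances):
    """Divide each raw distance by the mean distance (assumed normalization)."""
    mean = sum(distances.values()) / len(distances)
    if mean == 0:
        raise ValueError("mean distance is zero; nothing to normalize against")
    return {key: value / mean for key, value in distances.items()}

# Example: hypothetical raw author-field distances for three authors.
raw = {"author_a": 2.0, "author_b": 4.0, "author_c": 6.0}
normalized = normalize_by_mean(raw)
```

Under this assumed normalization the distances average to 1, so the three kinds of distance end up on comparable scales before they are combined.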
step 3): calculating the weight of the hyperedge using a hyperedge weight calculation method, wherein the weight is the author's proficiency in a specific skill in a specific field; according to hypergraph theory, the hyperedge weight is calculated with a variant of the Gaussian kernel function, which combines the three distances from step 2) into a skill level parameter for the author in a specific field;
the hyperedge weight is calculated using the following formula:
where d(x, y) is the distance between any two of the author, field, and skill vertices, σ is the mean of that distance, and the resulting weight is the level value of skill s of the author in field f;
step 4): applying the above process to each year in turn, so as to obtain how the author's specific skill in a specific field changes over time; an author's research career contains various turning points, such as changing research institution or changing research direction, and if the state of each of the author's skills at different times is known, the change of the author's skills over time can be studied and latent patterns of scientific research discovered; to this end, the data set is divided by year into sub-data sets, each of which stores the data of that year and of all earlier years; step 2) and step 3) are repeated for each sub-data set, and the skill values of each author in each year are extracted from the results, which gives the change of the author's skills over time.
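The yearly split in step 4) can be sketched as follows, assuming each record has a `year` key. The record layout and the function name `cumulative_yearly_subsets` are hypothetical; the cumulative behavior (each subset holds that year and all earlier years) follows the description above.

```python
def cumulative_yearly_subsets(records, first_year, last_year):
    """Map each year to the records published in that year or earlier."""
    return {
        year: [r for r in records if r["year"] <= year]
        for year in range(first_year, last_year + 1)
    }

# Hypothetical records; only the publication year matters for the split.
records = [
    {"paper": "p1", "year": 2006},
    {"paper": "p2", "year": 2007},
    {"paper": "p3", "year": 2007},
]
subsets = cumulative_yearly_subsets(records, 2006, 2008)
```

Steps 2) and 3) would then be run once per entry of `subsets`, so the value for a given year reflects everything the author had published up to that year.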
The beneficial effects of the invention are as follows: the author skill evaluation is a hypergraph-based method that takes into account the number of papers, the quality of the papers, the differences between fields, and change over time. Using the hypergraph concept allows the method to fuse authors, fields, and skills, and thus to provide a fine-grained evaluation scheme. The distances between author, field, and skill are computed by extending traditional evaluation parameters such as paper citation count and H-index, which ensures reliability, while normalization improves computational efficiency and reduces error.
The invention also adds a time factor, so that the change of an author's fields and skills over time can be analyzed, providing more material for follow-up research.
Drawings
Fig. 1 is a flow chart of the data preprocessing performed on the PLOS ONE dataset according to the experimental requirements of the present invention.
Fig. 2 is author skill radar chart a, a final result of the present invention.
Fig. 3 is author skill radar chart b, a final result of the present invention.
Fig. 4 is schematic diagram a of an author's annual maximum skill level as it changes over time.
Fig. 5 is schematic diagram b of an author's annual maximum skill level as it changes over time.
FIG. 6 is an exemplary graph of author skill over time.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in further detail below.
The embodiment of the invention provides an author skill evaluation method based on a hypergraph, which comprises the following steps:
Step 1: selecting the PLOS ONE dataset as the experimental dataset of the method and preprocessing it; the processing is shown in Fig. 1.
To capture the contribution of each author to a paper, i.e. the skill the author used in that paper, the present invention uses the PLOS ONE dataset. The raw data of the dataset are as follows:
TABLE 1 PLOS ONE dataset
As can be seen from Table 1, the number of distinct skills is very large, probably because skills lack standard naming rules, so that similar skills are described with different expressions. Using the original skill names of the dataset directly would lead to inaccurate and redundant results.
Therefore, the invention compiles statistics on the skills and finds that 10342 skills appear fewer than 10 times, while the occurrence counts of 21 skills jump sharply. The skills that appear fewer than 10 times are discarded, and the skill names are then classified on the basis of the 21 sharply more frequent skills, finally giving 16 skill classes, each represented by its most frequent skill name.
Because the number of authors is large, there is a high probability that different authors share the same name, which may interfere with the experimental results. To mitigate this, the invention disambiguates same-name authors following existing name disambiguation algorithms, using the fact that the dataset contains each author's institution and taking the author's collaborations and research institution as reference. The disambiguation rules used by the invention are: if two authors with the same name have collaborated with the same author, they are likely to be the same person; if two authors with the same name belong to the same research institution, they are likely to be the same person. Because name disambiguation is an open research problem without a better solution at present, the invention does not discuss it further.
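The filtering and merging described above can be sketched as follows. The threshold of 10 occurrences comes from the text; the mapping `class_of` from a raw skill name to its class is a hypothetical input, since the text does not say how names are assigned to classes, and `standardize_skills` is likewise a hypothetical name.

```python
from collections import Counter

def standardize_skills(skill_mentions, class_of, min_count=10):
    """Drop rare skill names and map the rest to their class representative."""
    counts = Counter(skill_mentions)
    # Discard skill names that appear fewer than min_count times.
    kept = {name: c for name, c in counts.items() if c >= min_count and name in class_of}
    # Group the surviving names by their (assumed, externally supplied) class.
    members = {}
    for name, count in kept.items():
        members.setdefault(class_of[name], []).append((count, name))
    # Each class is represented by its most frequent skill name.
    representative = {cls: max(items)[1] for cls, items in members.items()}
    return {name: representative[class_of[name]] for name in kept}

# Hypothetical raw mentions and class assignment.
mentions = (["analyzed data"] * 12 + ["data analysis"] * 15
            + ["wrote paper"] * 11 + ["rare skill"] * 3)
class_of = {"analyzed data": "analysis", "data analysis": "analysis",
            "wrote paper": "writing", "rare skill": "other"}
mapping = standardize_skills(mentions, class_of)
```

The returned mapping rewrites every surviving raw name to a single standardized name, so later counts are made per skill class rather than per spelling.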
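The two disambiguation rules can be written as a small predicate. This is a sketch under the assumption that each author record carries a name, a list of coauthors, and an institution; the field names and the function `likely_same_person` are hypothetical, and a real disambiguation algorithm would need more evidence than this.

```python
def likely_same_person(record_a, record_b):
    """Apply the two heuristics: a shared coauthor or a shared institution."""
    if record_a["name"] != record_b["name"]:
        return False
    shared_coauthor = bool(set(record_a["coauthors"]) & set(record_b["coauthors"]))
    shared_institution = record_a["institution"] == record_b["institution"]
    return shared_coauthor or shared_institution

# Hypothetical author records taken from three different papers.
a = {"name": "Wei Zhang", "coauthors": ["L. Chen", "M. Liu"], "institution": "Univ. X"}
b = {"name": "Wei Zhang", "coauthors": ["M. Liu"], "institution": "Univ. Y"}
c = {"name": "Wei Zhang", "coauthors": ["P. Sun"], "institution": "Univ. Z"}
```

Here `a` and `b` share a coauthor and would be treated as one person, while `c` shares neither a coauthor nor an institution with `a` and would be kept separate.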
Step 2: combining the three vertices of the hyperedge pairwise and calculating the distance between each pair.
Using the standardized skill names obtained in step 1 and the name-disambiguated data, the distances are computed according to the distance formulas and the results are then normalized, giving the distances between authors and fields, between authors and skills, and between skills and fields.
Step 3: calculating the weight of the hyperedge using the hyperedge weight calculation method, wherein the weight is the author's proficiency in a specific skill in a specific field.
According to hypergraph theory, the hyperedge weight is calculated with a variant of the Gaussian kernel function; the calculation formula is as follows:
where d(j, s) is the distance between the author and the skill, d(j, f) is the distance between the author and the field, d(f, s) is the distance between the field and the skill, σ_js is the mean of all author-to-skill distances, σ_jf is the mean of all author-to-field distances, and σ_fs is the mean of all field-to-skill distances.
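Because the weight formula itself is not reproduced in this text, the following is only a sketch of one common Gaussian-kernel form used for hyperedge weights: a sum of one Gaussian term per vertex pair, each scaled by the mean distance for that pair. Whether the patent sums or multiplies the terms, and the exact exponent, are assumptions; the function name `hyperedge_weight` is hypothetical.

```python
import math

def hyperedge_weight(d_js, d_jf, d_fs, sigma_js, sigma_jf, sigma_fs):
    """Assumed form: sum of exp(-(d / sigma)^2) over the three vertex pairs."""
    pairs = [(d_js, sigma_js), (d_jf, sigma_jf), (d_fs, sigma_fs)]
    return sum(math.exp(-((d / sigma) ** 2)) for d, sigma in pairs)

# Smaller distances give a larger weight, i.e. a higher skill level.
close = hyperedge_weight(0.2, 0.3, 0.1, 1.0, 1.0, 1.0)
far = hyperedge_weight(2.0, 3.0, 1.5, 1.0, 1.0, 1.0)
```

In this assumed form the weight lies between 0 and 3 and decreases as any of the three distances grows, which matches the reading of the weight as proficiency.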
To show an author's skill distribution concretely and concisely, the present invention uses radar charts. Figs. 2 and 3 give example radar charts of author skill distributions, in which each circle corresponds to one field.
Step 4: applying the above process to each year in turn gives the change over time of an author's specific skill in a specific field.
According to the invention, the PLOS ONE data are divided by year into sub-data sets, giving 12 sub-data sets from 2006 to 2017.
Steps 2 and 3 are applied to the sub-data set of each year, and the skill values of each author in the different years are extracted from the results, which gives the change over time of the author's different skills in different fields.
To show the change of author skill over time, the invention plots author skill against year as a line graph. Fig. 6 gives an example line graph of an author's skill over time. Because the number of skills is large and each author has several skill levels that change over time, patterns are hard to find; therefore the author's highest skill level in each year is extracted and plotted against time, giving the scatter diagrams shown in Figs. 4 and 5.
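Extracting an author's highest skill level per year can be sketched as follows, assuming the per-year results are stored as a dictionary from (author, skill, field) to hyperedge weight. The data layout, the sample values, and the function `yearly_peak_skill` are hypothetical.

```python
def yearly_peak_skill(weights_by_year, author):
    """For each year, return the author's highest-weighted (skill, field, weight)."""
    peaks = {}
    for year, weights in sorted(weights_by_year.items()):
        own = [(w, skill, field)
               for (a, skill, field), w in weights.items() if a == author]
        if own:
            w, skill, field = max(own)
            peaks[year] = (skill, field, w)
    return peaks

# Hypothetical hyperedge weights for two years.
weights_by_year = {
    2006: {("A. Li", "writing", "Biology"): 1.2,
           ("A. Li", "data analysis", "Biology"): 1.8},
    2007: {("A. Li", "writing", "Biology"): 2.1,
           ("A. Li", "data analysis", "Biology"): 1.9,
           ("B. Wang", "writing", "Biology"): 2.5},
}
peaks = yearly_peak_skill(weights_by_year, "A. Li")
```

Plotting the weight in `peaks` against the year would give one point per year for the author, which is the kind of series the scatter diagrams summarize.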
Claims (1)
1. A hypergraph-based author skill evaluation method, characterized by comprising the following steps:
step 1): combining the author, the skill the author contributed to the paper, and the field of the paper into a hyperedge; counting all papers the author participated in and merging the skill types to obtain statistical data on the author, the skills, and the fields of the papers; the information of the papers published by the author is collected, with the author, the skill, and the field serving as the three vertices of the hyperedge;
step 2): combining the three vertices of the hyperedge pairwise and calculating the distance between each pair of vertices in the hyperedge; the distances are calculated by the following formulas:
distance of author j from field f:
where n is the number of authors in the field, n_f is the total number of papers in the field, c_i is the citation count of paper i, and h_j is the H-index of author j;
the distance between author j and field f is normalized to facilitate subsequent data processing; the normalization formula is as follows:
where avg(dis(field)) is the mean of the distances between all authors and the field, i.e. the distances are summed and then averaged;
distance of author j from skill s:
where n is the number of papers in which the author uses the skill, c_i is the citation count of paper i, h_j is the H-index of author j, and n_is is the number of authors who use skill s in paper i;
the distance of author j from skill s is normalized using the following formula:
where avg(dis) is the mean of the distances between all authors and skills, i.e. the distances are summed and then averaged;
distance of field f from skill s:
where n_f is the total number of papers in the field, n_s is the total number of papers containing the skill, and n_fs is the number of domains that contain the skill;
the distance of field f from skill s is normalized using the following formula:
where avg(field, skill) is the mean of the distances between all fields and skills, i.e. the distances are summed and then averaged;
step 3): calculating the weight of the hyperedge using a hyperedge weight calculation method, wherein the weight is the author's proficiency in a specific skill in a specific field; according to hypergraph theory, the hyperedge weight is calculated with a variant of the Gaussian kernel function, which combines the three distances from step 2) into the skill level parameter of the author in a specific field;
the hyperedge weight is calculated using the following formula:
where d(x, y) is the distance between any two of the author, field, and skill vertices, σ is the mean of that distance, and the resulting weight is the level value of skill s of the author in field f;
step 4): applying the above process to each year in turn, so as to obtain how the author's specific skill in a specific field changes over time; an author's research career contains various turning points, such as changing research institution or changing research direction, and if the state of each of the author's skills at different times is known, the change of the author's skills over time can be studied and latent patterns of scientific research discovered; to this end, the data set is divided by year into sub-data sets, each of which stores the data of that year and of all earlier years; step 2) and step 3) are repeated for each sub-data set, and the skill values of each author in each year are extracted from the results, which gives the change of the author's skills over time.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810316651.XA CN108510205B (en) | 2018-04-08 | 2018-04-08 | Author skill evaluation method based on hypergraph |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108510205A CN108510205A (en) | 2018-09-07 |
CN108510205B true CN108510205B (en) | 2021-07-16 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20210716 |