CN117314266A - Novel intelligent scientific and technological talent evaluation method based on hypergraph attention mechanism - Google Patents
- Publication number
- CN117314266A CN202311623569.9A
- Authority
- CN
- China
- Prior art keywords
- talent
- data
- hypergraph
- scientific
- talents
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06398—Performance of employee with respect to a job function
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/288—Entity relationship models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/10—Pre-processing; Data cleansing
- G06F18/15—Statistical pre-processing, e.g. techniques for normalisation or restoring missing data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
Abstract
The invention relates to the technical field of scientific and technological talent evaluation and discloses a novel intelligent scientific and technological talent evaluation method based on a hypergraph attention mechanism, comprising the following steps: S1, constructing a scientific and technological talent knowledge hypergraph; S2, introducing an attention mechanism to design a scientific and technological talent classification evaluation network model; S3, constructing an intelligent scientific and technological talent recommendation model. The method can automatically fill gaps in talent information data and supports dynamic evaluation and accurate recommendation of talents across multiple dimensions, multiple time-space scales and multiple perspectives, with results that can be displayed and compared. It thereby improves the scientific rigor, professionalism and objectivity of talent evaluation and gives employers strong support for the intelligent management and application of talent data.
Description
Technical Field
The invention relates to the technical field of scientific and technological talent evaluation, in particular to a novel intelligent scientific and technological talent evaluation method based on hypergraph attention mechanisms.
Background
Talent evaluation is a key part of the basic talent system and of deepening talent-system innovation, and it matters greatly for cultivating high-level talents, producing high-quality research results and creating a good environment for innovation. However, building a scientific and effective talent evaluation system requires solving three major problems:
(1) Talent data are fragmented, and a dynamic, diversified scientific and technological talent database is lacking
Current scientific and technological talent information is fragmented and lacks a diversified, dynamic database, so talent evaluation lacks scientific rigor, objectivity, authenticity and systematicness. A comprehensive, multi-temporal-spatial, multi-dimensional dynamic database covering a wide span of categories and a complex talent hierarchy therefore needs to be established through multi-source data analysis and core talent evaluation indexes.
(2) Evaluation data are incomplete, and evaluation indexes are difficult to quantify
Traditional talent evaluation suffers from problems such as a single form and poor targeting. It cannot reflect the multi-objective, dynamic, traceable, job-fit and presentable aspects of talent evaluation, so the evaluation results do not match the actual situation of the person evaluated.
(3) Evaluation results are one-dimensional, making accurate recommendation of scientific and technological talents difficult
Current evaluation results receive only simple statistical treatment; the evaluation data are fairly uniform, and the results mainly present basic indexes such as evaluation scores, grades and participation rates. Without multi-dimensional interactive analysis of the results, personalized and accurate recommendations cannot be provided to employers.
Disclosure of Invention
The invention aims to provide a novel intelligent scientific and technological talent evaluation method based on a hypergraph attention mechanism that solves the problems set out in the Background.
To this end, the invention provides a novel intelligent scientific and technological talent evaluation method based on a hypergraph attention mechanism, comprising the following steps:
S1: constructing a scientific and technological talent knowledge hypergraph: extracting data from multi-source talent big data to form a talent database, preprocessing the data in the talent database, and then designing the structure and model of the knowledge graph;
S2: introducing an attention mechanism to design a scientific and technological talent classification evaluation network model: based on the knowledge hypergraph constructed in S1, introducing an attention mechanism for learning and identification, and establishing a scientific and technological talent classification evaluation network model;
S3: constructing an intelligent scientific and technological talent recommendation model: on the basis of the classification evaluation network model of S2, constructing the recommendation model with a knowledge-graph-based collaborative filtering recommendation algorithm.
Preferably, step S1 comprises the steps of:
S101: acquiring scientific and technological talent big data;
S102: preprocessing the scientific and technological talent big data;
S103: determining the basic hypergraph structure based on the entities, relations and attributes in the scientific and technological talent data.
Preferably, step S101 includes the steps of:
S1011: determining the data input sources: the data input sources comprise talent networks, academic institutions, research centers and professional social networks, and talent information is collected on the innovation value, capability and contribution of scientific and technological talents;
S1012: determining the acquisition technologies: streaming data is acquired with Kafka, unstructured web data is collected with parallel crawlers, and structured data in traditional databases is extracted with Sqoop;
S1013: determining the big data storage technology: the big data is stored in the HDFS distributed file system and the HBase column-oriented database.
Preferably, step S102 includes the steps of:
S1021: data cleaning: filling in or deleting missing values to ensure data integrity. For numerical features, missing values are filled with the mean, median or mode; for time series or ordered data, with the previous or the next observation; where the data are correlated, missing values are estimated by interpolation. An entire row containing missing values is deleted when few values are missing or the row is unimportant to the analysis task; an entire feature is deleted if most of its values are missing; and a run of consecutive missing data points is deleted as a segment;
S1022: transforming the multi-modal data by creating new features or converting existing ones: new features are created from domain knowledge to capture latent patterns in the data, the most relevant features are selected, feature values are scaled to similar ranges, and finally categorical variables are converted into numbers or one-hot encodings.
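The filling strategies named in step S1021 can be illustrated with a minimal sketch. It uses only the Python standard library; the function names and the sample column `papers_per_year` are hypothetical and chosen for illustration, not taken from the patent.

```python
from statistics import mean, median, mode


def fill_numeric(values, strategy="mean"):
    """Replace None entries in a numeric column with the mean, median or mode."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values; drop the feature instead")
    fill = {"mean": mean, "median": median, "mode": mode}[strategy](observed)
    return [fill if v is None else v for v in values]


def forward_fill(values):
    """Replace None entries in an ordered series with the previous observation."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)  # leading gaps stay None: nothing precedes them
    return filled


def linear_interpolate(values):
    """Estimate interior None entries linearly between the nearest known values."""
    result = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = values[left] + step * (i - left)
    return result


# Hypothetical column: papers published per year, with gaps.
papers_per_year = [4, None, 6, None, None, 12]
print(fill_numeric(papers_per_year, "median"))
print(forward_fill(papers_per_year))
print(linear_interpolate(papers_per_year))
```

Which strategy fits depends on the column: the median is robust to outliers, forward filling suits ordered records, and interpolation assumes the values change smoothly between observations.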
Preferably, step S103 includes the steps of:
S1031: analyzing the multiple attributes of scientific and technological talents in scientific research, including their academic level, their reputation and visibility in academia, their work experience and professional history in research projects, the research papers they have published in academic journals and conferences, their patent applications and technical innovations, the impact of their research results on society and industry, and their ability to collaborate in multidisciplinary and cross-domain teams;
S1032: based on the extracted multi-attribute talent information, extracting and integrating entities, relations and attributes using NER, NLTK, Stanford NLP and GATE: identifying the entities in the database, establishing the relations among them and extracting important attribute information, so as to construct the scientific and technological talent information;
S1033: following knowledge graph construction principles, abstracting the multidimensional attributes of the scientific and technological talent information into nodes of the knowledge graph, connecting the nodes according to the evaluation mechanisms of different periods, and associating the evaluation results of different periods as hyperedges. The scientific and technological talent hypergraph $G$ is defined as consisting of a node set $V$, an edge set $E$ and a hyperedge set $\mathcal{E}$, so that $G$ is expressed as:
$$G = (V, E, \mathcal{E});$$
where each edge $e \in E$ is assigned a weight $w(e)$ according to the attributes of the evaluation mechanisms of different periods, and each hyperedge $\varepsilon \in \mathcal{E}$ is assigned a weight $w(\varepsilon)$ according to the evaluation results of different periods; $w(e)$ and $w(\varepsilon)$ represent the importance of the connection relationship in the whole hypergraph. After the weights are introduced, the hypergraph $G$ is expressed as:
$$G = (V, E, \mathcal{E}, W_E, W_{\mathcal{E}});$$
$W_E$ and $W_{\mathcal{E}}$ denote the diagonal weight matrices of the edges and hyperedges respectively,
$$W_E = \operatorname{diag}\big(w(e_1), w(e_2), \ldots, w(e_{|E|})\big);$$
$$W_{\mathcal{E}} = \operatorname{diag}\big(w(\varepsilon_1), w(\varepsilon_2), \ldots, w(\varepsilon_{|\mathcal{E}|})\big);$$
where $|\cdot|$ denotes the number of elements in a set;
the structure of the hypergraph is described by an incidence matrix $H$ with entries:
$$h(v, \varepsilon) = \begin{cases} 1, & v \in \varepsilon \\ 0, & v \notin \varepsilon \end{cases}.$$
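As a small illustration of the incidence matrix and the diagonal hyperedge weight matrix described in step S1033, the sketch below builds both for a toy hypergraph. The node names, the two evaluation-period hyperedges and their weights are invented for the example, and plain nested lists stand in for matrices.

```python
def incidence_matrix(nodes, hyperedges):
    """Rows follow `nodes`, columns follow `hyperedges`; entry is 1 if the node is a member."""
    return [
        [1 if node in members else 0 for members in hyperedges.values()]
        for node in nodes
    ]


def diagonal(weights):
    """Square matrix with `weights` on the diagonal and zeros elsewhere."""
    n = len(weights)
    return [[weights[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


# Hypothetical toy data: two talents, one paper, one award.
nodes = ["talent_a", "talent_b", "paper_1", "award_1"]
# Each hyperedge groups the nodes tied together by one evaluation period.
hyperedges = {
    "review_2021": {"talent_a", "paper_1"},
    "review_2023": {"talent_a", "talent_b", "award_1"},
}
hyperedge_weights = [0.4, 0.6]  # assumed importance of each evaluation period

H = incidence_matrix(nodes, hyperedges)
W = diagonal(hyperedge_weights)

for node, row in zip(nodes, H):
    print(node, row)
print(W)
```

Unlike an ordinary edge, a hyperedge column may contain more than two ones, which is how a single evaluation result can link a talent to several achievements at once.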
preferably, step S2 includes:
S201: introducing graph embedding: using a graph convolutional neural network and one-hot encoding, each node in the hypergraph and its associated information are converted into vectors, and the resulting vectors are arranged in a matrix as the network input;
S202: feeding the input vector matrix into a trainable embedding layer that maps each element of the input into a high-dimensional vector space, and performing embedding learning on the nodes and hyperedges with CTransR, for which the scoring function is defined as:
$$f_r(h, t) = \lVert h_{r,c} + r_c - t_{r,c} \rVert_1 + \alpha \lVert r_c - r \rVert_2;$$
where $h$ and $t$ are the embedded entities learned under the corresponding hyperedge; $h_{r,c}$ is the embedding vector of the triplet head entity with respect to the specific relation $r$ under the specific entity pair $(h, t)$; $t_{r,c}$ is the corresponding embedding vector of the triplet tail entity; $r_c$ is the hyperedge relation embedding vector learned for the entity pair $(h, t)$; $\lVert\cdot\rVert_1$ and $\lVert\cdot\rVert_2$ denote the L1 norm and the L2 norm respectively; the term $\lVert r_c - r \rVert_2$ constrains the similarity between the cluster relation vector $r_c$ and the original relation vector $r$; and $\alpha$ adjusts the influence of this constraint on the scoring function;
S203: applying a 1D convolution kernel to the output of the embedding layer to extract further feature information;
S204: applying the attention mechanism to the output of step S203, with feature weights computed from cosine similarity. In layer $l$, the weighted embedding vector of the $i$-th node is computed from the hyperedge embedding vector $e_i^{(l)}$ obtained from the embedding layer:
$$h_i^{(l)} = W^{(l)} e_i^{(l)};$$
$W^{(l)}$ denotes the hyperedge weight of layer $l$ in the attention graph neural network, and $e_i^{(l)}$ denotes the hyperedge embedding vector of the $i$-th node in layer $l$;
from the weighted embedding vector of the $i$-th node, the cosine similarity to the $j$-th node in its first-order neighbor set $N_i$ is computed:
$$s_{ij}^{(l)} = \frac{h_i^{(l)} \cdot h_j^{(l)}}{\lVert h_i^{(l)} \rVert \, \lVert h_j^{(l)} \rVert};$$
$h_j^{(l)}$ is the weighted embedding vector of the $j$-th node in layer $l$;
the attention coefficient of each first-order neighbor node is then obtained:
$$\alpha_{ij}^{(l)} = \frac{\exp\big(s_{ij}^{(l)}\big)}{\sum_{k \in N_i} \exp\big(s_{ik}^{(l)}\big)};$$
$s_{ik}^{(l)}$ is the cosine similarity between the $i$-th node and its $k$-th neighbor node in layer $l$;
the embedding of node $i$ in layer $l+1$ is obtained as:
$$h_i^{(l+1)} = \sigma\Big(\sum_{j \in N_i} \alpha_{ij}^{(l)} h_j^{(l)}\Big);$$
where $\sigma$ is the sigmoid function and $h_j^{(l)}$ is the embedding of node $j$ in layer $l$;
S205: reducing the dimensionality of the attention layer output by max pooling and extracting representative feature vectors;
S206: feeding the output of the pooling layer into a fully connected layer and mapping the extracted feature vectors to the target space with a LeakyReLU activation function to obtain the classifier output;
S207: measuring the error between the prediction and the sample label with the cross-entropy loss function in order to optimize and train the model. For a training sample $x$ with true probability distribution $p(x)$ and predicted probability distribution $q(x)$, the loss function is:
$$L = -\sum_{i=1}^{C} p(x_i) \log q(x_i);$$
where $C$ is the number of sample classes;
S208: updating the weight parameters of the network with an optimizer so as to minimize the loss function, and training the network on the training set until the best performance is reached;
S209: determining the root-cause hyperparameters that affect model classification by fishbone analysis: where the model classifies talents with insufficient accuracy, the output attributes are judged by fishbone analysis to locate the problematic parameter settings, yielding a hyperparameter-setting decision that improves the overall classification performance of the model. The threshold for backbone-class attributes is set to 0.6 and the threshold for branch-class attributes to 0.8; data exceeding its threshold is counted among the root causes,
[formula missing in the source text];
where the symbols of the formula denote, respectively, the fishbone analysis, the root-cause attribute, the branch attribute and the attribute value. A root-cause attribute whose value exceeds the threshold 0.6 and one whose value is below 0.6 are assigned two distinct indicator values. This judgment identifies the root-cause hyperparameters that influence the classification capability of the model; they serve as the adjustment target of the network model, and the relevant network parameters are changed accordingly to reach the best classification evaluation.
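The attention step S204 (cosine similarity between a node and its first-order neighbors, attention coefficients, then a sigmoid over the weighted neighbor sum) can be sketched as follows. The sketch assumes that the attention coefficients are a softmax over the cosine similarities, which the surviving text does not spell out; the function names and the toy embeddings are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def softmax(scores):
    """Numerically stable softmax (assumed normalisation for the attention coefficients)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attend(node, neighbours, embeddings):
    """Return (attention coefficients, next-layer embedding) for `node`."""
    sims = [cosine(embeddings[node], embeddings[j]) for j in neighbours]
    alphas = softmax(sims)
    dim = len(embeddings[node])
    mixed = [
        sum(a * embeddings[j][d] for a, j in zip(alphas, neighbours))
        for d in range(dim)
    ]
    return alphas, [sigmoid(x) for x in mixed]


# Toy weighted embeddings: "b" points the same way as "a", "c" is orthogonal.
embeddings = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0]}
alphas, next_a = attend("a", ["b", "c"], embeddings)
print(alphas, next_a)
```

Because "b" is more similar to "a" than "c" is, it receives the larger coefficient and dominates the updated embedding of "a".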
Preferably, step S3 includes:
S301: obtaining the feature vectors of the different scientific and technological talents from the talent features in the knowledge hypergraph and constructing a scientific and technological talent matrix $T$; at the same time, constructing a post demand matrix $P$. $T$ is constructed as:
[formula missing in the source text];
wherein,
[formula missing in the source text];
$P$ is constructed as:
[formula missing in the source text];
wherein,
[formula missing in the source text];
S302: after the talent matrix $T$ and the employer demand matrix $P$ are obtained, computing the similarity between them with the Jaccard similarity coefficient:
$$J(T, P) = \frac{|T \cap P|}{|T \cup P|};$$
S303: using the K-nearest-neighbor algorithm, selecting the K scientific and technological talents most similar to the post requirements to form the nearest-neighbor set U;
S304: using the Top-N method, recommending the first N scientific and technological talents in the nearest-neighbor set U to the target post, thereby achieving intelligent recommendation of scientific and technological talents.
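Steps S302 to S304 can be sketched in a few lines. The sketch assumes that each talent and each post is reduced to a set of feature labels, which is one simple reading of the talent and demand matrices; the names `recommend`, `talents` and `post` and all sample data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity coefficient of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(talents, post_requirements, k, n):
    """Rank talents by similarity to the post, keep the K nearest, return the top N names."""
    scored = sorted(
        ((jaccard(features, post_requirements), name) for name, features in talents.items()),
        key=lambda pair: (-pair[0], pair[1]),  # highest similarity first, name breaks ties
    )
    nearest = scored[:k]  # nearest-neighbour set U
    return [name for _, name in nearest[:n]]


# Hypothetical talent features and post requirements.
talents = {
    "talent_a": {"hypergraph", "nlp", "patent"},
    "talent_b": {"nlp", "patent"},
    "talent_c": {"chemistry"},
}
post = {"nlp", "patent"}

print(recommend(talents, post, k=2, n=1))
```

Jaccard similarity penalises features a talent has beyond the post requirements, so in this toy data the exact match ranks above the talent with an additional feature.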
The novel intelligent talent evaluation method based on a hypergraph attention mechanism therefore has the following beneficial effects:
(1) Through multi-source data analysis and core talent evaluation indexes, the invention establishes a comprehensive, multi-temporal-spatial, multi-dimensional dynamic database that covers a wide span of categories and a complex talent hierarchy;
(2) The invention uses hypergraph technology to extract the hidden multi-element, multi-dimensional relationships of scientific and technological talents and uses an attention mechanism to select key indexes, constructing an intelligent classification evaluation model whose evaluation data are more comprehensive and whose evaluation indexes are quantified;
(3) Building on the hypergraph-attention-based intelligent evaluation, the invention measures the similarity between talents and posts with the Jaccard similarity coefficient and, by means of a collaborative filtering recommendation algorithm, obtains the talent information that best matches the needs of institutions and employers, thereby achieving accurate recommendation.
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Drawings
FIG. 1 is a flow chart of a method for intelligent evaluation of talents based on hypergraph attention mechanisms.
Detailed Description
The technical scheme of the invention is further described below through the attached drawings and the embodiments.
Unless defined otherwise, technical or scientific terms used herein should be given the ordinary meaning as understood by one of ordinary skill in the art to which this invention belongs. The terms "first," "second," and the like, as used herein, do not denote any order, quantity, or importance, but rather are used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that elements or items preceding the word are included in the element or item listed after the word and equivalents thereof, but does not exclude other elements or items. The terms "disposed," "mounted," "connected," and "connected" are to be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communication between two elements. "upper", "lower", "left", "right", etc. are used merely to indicate relative positional relationships, which may also be changed when the absolute position of the object to be described is changed.
Examples
As shown in fig. 1, the novel intelligent talent assessment method based on hypergraph attention mechanism comprises the following steps:
S1: constructing a scientific and technological talent knowledge hypergraph. First, to address the scattered, fragmented, incomplete and unsystematic state of current scientific and technological talent information, crawlers, Kafka and Sqoop are used to extract the multi-source talent data generated by platforms such as talent networks, academic institutions and research centers, forming a talent database. Second, the data in the talent database are cleaned and preprocessed with Hive. The structure and model of the knowledge graph are then designed on the basis of the preprocessed data.
The step S1 comprises the following steps:
S101: acquiring scientific and technological talent big data
S1011: determining the data input sources: the data input sources comprise talent networks, academic institutions, research centers and professional social networks, and talent information is collected on aspects such as the innovation value, capability and contribution of scientific and technological talents.
S1012: determining the acquisition technologies: streaming data is acquired with Kafka; unstructured web data is collected with large-scale parallel crawlers; structured data in traditional databases is extracted with Sqoop.
S1013: determining the big data storage technology: the big data is stored in the HDFS distributed file system and the HBase column-oriented database.
S102: preprocessing the scientific and technological talent big data
S1021: data cleaning: filling in or deleting missing values to ensure data integrity. For numerical features, missing values are filled with the mean, median or mode; for time series or ordered data, with the previous or the next observation; where the data are correlated, missing values are estimated by interpolation (such as linear, polynomial or spline interpolation). An entire row containing missing values is deleted when few values are missing or the row is unimportant to the analysis task; an entire feature is deleted if most of its values are missing; and a run of consecutive missing data points is deleted as a segment.
S1022: transforming the multi-modal data by creating new features or converting existing ones to improve analysis performance. First, new features are created from domain knowledge to capture latent patterns in the data; second, the most relevant features are selected to reduce dimensionality and improve model efficiency; third, feature values are scaled to similar ranges so that no single feature dominates the model; finally, categorical variables are converted into numbers or one-hot encodings so that the model can process them.
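The last two operations of step S1022, feature scaling and one-hot encoding, can be sketched with the standard library alone. Min-max scaling is used here as one common way to bring values into a similar range; the patent does not name a specific scaling method, and the function names and sample values are hypothetical.

```python
def min_max_scale(values):
    """Scale numbers linearly into [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """Return the sorted category list and one 0/1 vector per input label."""
    categories = sorted(set(labels))
    index = {category: i for i, category in enumerate(categories)}
    vectors = [
        [1 if index[label] == i else 0 for i in range(len(categories))]
        for label in labels
    ]
    return categories, vectors


# Hypothetical features: citation counts and highest degree.
citations = [10, 20, 30]
degrees = ["phd", "msc", "phd"]

print(min_max_scale(citations))
print(one_hot(degrees))
```

Scaling keeps a large-valued feature such as citation count from outweighing small-valued ones, and one-hot vectors avoid implying an order among categories.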
S103: determining hypergraph basic structure based on entity, relationship and attribute in scientific and technological talent data
S1031: analyzing the multiple attributes of scientific and technological talents in scientific research (personal characteristics, academic background, work experience, professional field and so on). These include the academic level of the talents, their reputation and visibility in academia, their work experience and professional history in research projects, the research papers they have published in academic journals and conferences, their patent applications and technical innovations, the impact of their research results on society and industry, and their ability to collaborate in multidisciplinary and cross-domain teams.
S1032: based on the extracted multi-attribute talent information, entities, relations and attributes are extracted and integrated using technologies such as NER, NLTK, Stanford NLP and GATE: the entities in the database (such as person names, institution names and academic fields) are accurately identified, the relations among entities (such as collaboration and advisor-student relations) are established, and important attribute information (such as research results, academic background and project experience) is extracted, so as to construct high-quality, highly structured scientific and technological talent information.
S1033: following knowledge graph construction principles, the multidimensional attributes of the scientific and technological talent information are abstracted into nodes of the knowledge graph, the nodes are connected according to the evaluation mechanisms of different periods, and the evaluation results of different periods are associated as hyperedges. The scientific and technological talent hypergraph $G$ is defined as consisting of a node set $V$, an edge set $E$ and a hyperedge set $\mathcal{E}$, so that $G$ can be expressed as:
$$G = (V, E, \mathcal{E}) \tag{1}$$
where each edge $e \in E$ is assigned a weight $w(e)$ according to the attributes of the evaluation mechanisms of different periods, and each hyperedge $\varepsilon \in \mathcal{E}$ is assigned a weight $w(\varepsilon)$ according to the evaluation results of different periods. $w(e)$ and $w(\varepsilon)$ indicate the importance of the connection relationship in the whole hypergraph. After the weights are introduced, the hypergraph $G$ can be expressed as:
$$G = (V, E, \mathcal{E}, W_E, W_{\mathcal{E}}) \tag{2}$$
$W_E$ and $W_{\mathcal{E}}$ denote the diagonal weight matrices of the edges and hyperedges respectively, namely:
$$W_E = \operatorname{diag}\big(w(e_1), w(e_2), \ldots, w(e_{|E|})\big) \tag{3}$$
$$W_{\mathcal{E}} = \operatorname{diag}\big(w(\varepsilon_1), w(\varepsilon_2), \ldots, w(\varepsilon_{|\mathcal{E}|})\big) \tag{4}$$
where $|\cdot|$ denotes the number of elements in a set.
The structure of the hypergraph can be described by an incidence matrix $H$ with entries:
$$h(v, \varepsilon) = \begin{cases} 1, & v \in \varepsilon \\ 0, & v \notin \varepsilon \end{cases} \tag{5}$$
S2: introducing an attention mechanism to design a scientific and technological talent classification evaluation network model. Scientific and technological talents have rich and diverse characteristics and capabilities, whereas traditional talent evaluation is rather subjective: once the indexes are fixed they are hard to adjust, and index weights are mostly assigned from experience, so the results obtained with such an index system often disagree with the real situation of the person evaluated. To evaluate scientific and technological talents comprehensively and accurately, an attention mechanism is introduced on top of the knowledge hypergraph constructed in S1 to automatically learn and identify the relative importance of different features in talent evaluation, giving key features higher weights, and a dynamic multi-objective classification evaluation model is finally established.
The step S2 comprises the following steps:
s201: the graph embedding technology is introduced, a graph convolutional neural network (Graph Convolutional Network, GCN) and one-hot coding are used, each node and associated information in the hypergraph are converted into vector forms, and the obtained vectors are expressed in a matrix form and are used as network inputs.
S202: Feed the input vector matrix into a trainable embedding layer that maps each element of the input data into a high-dimensional vector space to enhance its expressive power; use CTransR to learn embeddings of the nodes and hyperedges, and define a scoring function to obtain reasonable embedding vectors:
$$f_r(h, t) = \lVert \mathbf{h}_{r,c} + \mathbf{r}_c - \mathbf{t}_{r,c} \rVert_1 + \alpha \lVert \mathbf{r}_c - \mathbf{r} \rVert_2^2$$ (6)
where $\mathbf{h}_{r,c}$ and $\mathbf{t}_{r,c}$ are the entity embeddings learned under the corresponding hyperedge. Entity pairs clustered into the same group are assumed to express hyperedge relations $\mathbf{r}_c$ with similar characteristics, while the relations expressed by different groups may differ considerably; a hyperedge embedding $\mathbf{r}_c$ is therefore learned separately for each entity-pair cluster $c$. $\mathbf{h}_{r,c}$ is the head-entity embedding vector of the triple for the entity pair $(h, t)$ under the relation $r$, $\mathbf{t}_{r,c}$ is the corresponding tail-entity embedding vector, and $\mathbf{r}_c$ is the hyperedge relation embedding vector learned for the entity pair $(h, t)$. $\lVert \cdot \rVert_1$ and $\lVert \cdot \rVert_2$ denote the L1 norm and the L2 norm respectively. The term $\lVert \mathbf{r}_c - \mathbf{r} \rVert_2^2$ constrains the similarity between the cluster relation vector $\mathbf{r}_c$ and the relation vector $\mathbf{r}$, which ensures that the same relation expressed by different clusters still remains similar to some degree. $\alpha$ adjusts the influence of the constraint on the scoring function.
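A minimal sketch of a CTransR-style scoring function follows, under stated assumptions: the translation error is measured with the L1 norm, the cluster constraint with the squared L2 norm, the relation-specific projection of the entities is taken as already applied, and the default `alpha` is arbitrary. The function name `ctransr_score` and the example vectors are hypothetical; a lower score means a more plausible triple.

```python
def l1_norm(vec):
    """Sum of absolute values."""
    return sum(abs(x) for x in vec)


def squared_l2_norm(vec):
    """Sum of squares."""
    return sum(x * x for x in vec)


def ctransr_score(head, tail, cluster_relation, relation, alpha=0.1):
    """Translation error of (head + cluster_relation - tail) plus a penalty
    that keeps the cluster-specific relation close to the shared relation."""
    translation_error = [h + rc - t for h, rc, t in zip(head, cluster_relation, tail)]
    cluster_drift = [rc - r for rc, r in zip(cluster_relation, relation)]
    return l1_norm(translation_error) + alpha * squared_l2_norm(cluster_drift)


# Hypothetical 2-dimensional embeddings.
good = ctransr_score([1.0, 0.0], [1.5, 1.0], [0.5, 1.0], [0.5, 0.5])
bad = ctransr_score([1.0, 0.0], [0.0, 0.0], [0.5, 1.0], [0.5, 0.5])
```

In the first call the tail equals head plus relation, so only the cluster penalty contributes; in the second the tail is far from the translated head and the score is much larger.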
S203: Apply a 1D convolution kernel to the output of the embedding layer to further extract feature information;
S204: Apply the attention mechanism to the output of S203, computing the feature weights with cosine similarity to improve evaluation accuracy. In layer $l$, the weighted embedding vector of the $i$-th node is computed from the hyperedge embedding vector $\mathbf{e}_i^{l}$ obtained from the embedding layer:
$$\mathbf{z}_i^{l} = \mathbf{W}^{l} \mathbf{e}_i^{l}$$ (7)
where $\mathbf{W}^{l}$ is the hyperedge weight of layer $l$ in the attention graph neural network and $\mathbf{e}_i^{l}$ is the hyperedge embedding vector of the $i$-th node in layer $l$;
from the weighted embedding vector of the $i$-th node, the cosine similarity to the $j$-th node in its first-order neighbor set $N_i$ is computed:
$$s_{ij}^{l} = \frac{\mathbf{z}_i^{l} \cdot \mathbf{z}_j^{l}}{\lVert \mathbf{z}_i^{l} \rVert \, \lVert \mathbf{z}_j^{l} \rVert}$$ (8)
where $\mathbf{z}_i^{l}$ is the weighted embedding vector of the $i$-th node in layer $l$;
the attention coefficients of the first-order neighbor nodes are then obtained:
$$\alpha_{ij}^{l} = \frac{\exp\big(s_{ij}^{l}\big)}{\sum_{k \in N_i} \exp\big(s_{ik}^{l}\big)}$$ (9)
where $s_{ij}^{l}$ is the cosine similarity between the $i$-th node and its $j$-th neighbor node in layer $l$;
this gives the embedding of node $i$ in layer $l+1$, namely:
$$\mathbf{h}_i^{l+1} = \sigma\Big(\sum_{j \in N_i} \alpha_{ij}^{l} \mathbf{z}_j^{l}\Big)$$ (10)
where $\sigma$ is the sigmoid function and $\mathbf{h}_i^{l+1}$ is the embedding of node $i$ in layer $l+1$.
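The attention step can be sketched as follows for a single node. This is an illustration rather than the patented implementation: the softmax normalization of the cosine similarities is an assumption, the input vectors are taken to be already weighted, and the names `cosine`, `softmax` and `attention_update` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def softmax(scores):
    """Normalize scores into coefficients that sum to 1 (shifted by the max for stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attention_update(node_vec, neighbor_vecs):
    """Return the attention coefficients over the neighbors and the node's next-layer embedding."""
    sims = [cosine(node_vec, nb) for nb in neighbor_vecs]
    coeffs = softmax(sims)
    mixed = [sum(c * nb[k] for c, nb in zip(coeffs, neighbor_vecs))
             for k in range(len(node_vec))]
    return coeffs, [sigmoid(m) for m in mixed]


# Hypothetical weighted embeddings: one neighbor aligned with the node, one orthogonal to it.
coeffs, updated = attention_update([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

The aligned neighbor receives the larger coefficient, so its features dominate the aggregated embedding.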
S205: Reduce the dimensionality of the attention layer output with max pooling and extract the most representative feature vectors;
S206: Feed the output of the pooling layer into a fully connected layer and map the extracted feature vectors to the target space with a LeakyReLU activation function to obtain the classifier output;
S207: Use the cross-entropy loss function to measure the error between the prediction and the sample label for model optimization and training. For a training sample $x$ with true probability distribution $p(x)$ and predicted probability distribution $q(x)$, the loss function is:
$$L = -\sum_{c=1}^{C} p_c(x) \log q_c(x)$$ (11)
where $C$ is the number of sample classes;
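For reference, the cross-entropy between a true distribution and a predicted distribution over the classes can be computed as in the sketch below. The clipping constant `eps` is an added safeguard against `log(0)` and is not part of the source; the example distributions are made up.

```python
import math


def cross_entropy(true_dist, pred_dist, eps=1e-12):
    """Negative sum over classes of p * log(q), with q clipped away from zero."""
    return -sum(p * math.log(max(q, eps)) for p, q in zip(true_dist, pred_dist))


# One-hot label for the second of three classes and a hypothetical prediction.
loss = cross_entropy([0.0, 1.0, 0.0], [0.2, 0.7, 0.1])
```

With a one-hot label the loss reduces to the negative log of the probability assigned to the correct class, so it falls toward zero as that probability approaches 1.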
S208: Update the weight parameters of the network with an optimizer to minimize the loss function, and keep training the network on the training set until the best performance is reached.
S209: Use fishbone analysis to determine the root-cause hyperparameters that affect model classification. When the talent classification accuracy of the model is insufficient, fishbone analysis is applied to the output attributes to locate the parameter-setting problem and derive a hyperparameter-setting decision, which improves the overall classification performance of the model. The threshold for backbone-class attributes is set to 0.6 and the threshold for branch-class attributes to 0.8; data that exceed the threshold are included as root causes, namely:
$$F(v) = \begin{cases} R, & v > 0.6 \\ B, & v < 0.6 \end{cases}$$ (12)
where $F$ denotes the fishbone analysis, $R$ the root cause attribute, $B$ the branch attribute and $v$ the attribute value. If a value in the root cause attribute exceeds the threshold 0.6, the attribute is labeled $R$, that is, it is determined to be a root attribute; if the value is smaller than 0.6, the attribute is labeled $B$, that is, a non-root attribute. This judgment identifies the root-cause hyperparameters that affect the classification ability of the model. They are taken as the main adjustment target of the network model, and the relevant parameters of the network are changed accordingly to achieve the best classification evaluation.
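The threshold rule can be illustrated with a short sketch. The hyperparameter names and attribute values below are invented for the example, the function name `split_root_causes` is hypothetical, and a value exactly equal to 0.6 is treated as non-root, a case the text leaves open.

```python
def split_root_causes(attribute_values, threshold=0.6):
    """Split attributes into root causes (value above the threshold) and non-root attributes."""
    roots, non_roots = [], []
    for name, value in attribute_values.items():
        (roots if value > threshold else non_roots).append(name)
    return roots, non_roots


# Hypothetical attribute values attached to three hyperparameters.
scores = {"learning_rate": 0.82, "batch_size": 0.35, "dropout": 0.61}
roots, non_roots = split_root_causes(scores)
```

Only the hyperparameters returned in `roots` would then be tuned in the next training round.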
S3: Construct an intelligent recommendation model for scientific and technological talents. Different employers have different requirements for scientific and technological talents, but traditional talent evaluation standards are uniform, so employer requirements and talents are mismatched and talent is wasted. To bring the value of scientific and technological talents into full play, a knowledge-graph-based collaborative filtering recommendation algorithm is used to build an intelligent talent recommendation model on top of the evaluation model. First, talent features are extracted from the scientific and technological talent knowledge hypergraph; then post requirement features are extracted and the similarity between each talent and each post requirement is computed; finally, the talents that match the requirements are recommended to the employer.
Step S3 comprises the following steps:
S301: Obtain the feature vectors of the different scientific and technological talents from the talent features in the talent knowledge hypergraph and construct a talent matrix $T$; at the same time, construct a post demand matrix $D$. $T$ is constructed as follows:
(13)
where
(14)
$D$ is constructed as follows:
(15)
where
(16)
S302: After the talent matrix $T$ and the employer demand matrix $D$ are obtained, the Jaccard similarity coefficient is used to compute the similarity between the talent matrix and the demand matrix:
$$J(T, D) = \frac{|T \cap D|}{|T \cup D|}$$ (17)
S303: Use the K-nearest-neighbor algorithm to select the K scientific and technological talents with the highest similarity to the post requirements, forming the nearest-neighbor set U.
S304: Use the Top-N method to recommend the top N scientific and technological talents from the nearest-neighbor set U to the target post, realizing intelligent recommendation of scientific and technological talents.
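Steps S302 to S304 can be sketched end to end as follows, assuming each talent and each post is represented as a set of feature tags. The tags, talent identifiers and the function name `recommend` are hypothetical, and ties are broken alphabetically only to keep the output deterministic.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(talents, post_requirements, k, n):
    """Keep the k talents most similar to the post, then return the top n of them."""
    scored = [(jaccard(features, post_requirements), name)
              for name, features in talents.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    nearest = scored[:k]  # nearest-neighbor set U
    return [name for _, name in nearest[:n]]


# Hypothetical feature tags for four talents and one post.
talents = {
    "A": {"nlp", "patents", "phd"},
    "B": {"nlp", "teaching"},
    "C": {"robotics"},
    "D": {"nlp", "patents"},
}
post = {"nlp", "patents"}
top = recommend(talents, post, k=3, n=2)
```

Here talent D matches the post exactly and A shares two of three tags, so they are the two recommendations.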
Therefore, the novel intelligent evaluation method for scientific and technological talents based on a hypergraph attention mechanism can automatically fill gaps in talent information data and realize all-round dynamic evaluation and accurate recommendation of talents from multidimensional, multi-spatiotemporal, multi-angle, evidence-based and comparative perspectives. It further improves the scientific rigor, professionalism and objectivity of talent evaluation and provides strong support for the intelligent management and application of employers' talent data.
Finally, it should be noted that the above embodiments only illustrate the technical solution of the present invention and do not limit it. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art will understand that the technical solution of the invention may still be modified or equivalently replaced, and that such modifications or equivalent replacements do not cause the modified technical solution to depart from the spirit and scope of the technical solution of the invention.
Claims (7)
1. A novel intelligent evaluation method for scientific and technological talents based on a hypergraph attention mechanism, characterized in that the method comprises the following steps:
S1, constructing a scientific and technological talent knowledge hypergraph: extracting data from multi-source talent big data to form a talent database, preprocessing the data of the talent database, and designing the structure and model of the knowledge graph after preprocessing;
S2, introducing an attention mechanism to design a scientific and technological talent classification evaluation network model: based on the scientific and technological talent knowledge hypergraph constructed in S1, introducing an attention mechanism for learning and identification, and establishing the scientific and technological talent classification evaluation network model;
S3, constructing a scientific and technological talent intelligent recommendation model: on the basis of the classification evaluation network model of S2, constructing the intelligent recommendation model with a knowledge-graph-based collaborative filtering recommendation algorithm.
2. The novel intelligent evaluation method for scientific and technological talents based on a hypergraph attention mechanism, characterized in that step S1 comprises the following steps:
S101, acquiring scientific and technological talent big data;
S102, preprocessing the scientific and technological talent big data;
S103, determining the basic structure of the hypergraph based on the entities, relations and attributes in the scientific and technological talent data.
3. The novel intelligent evaluation method for scientific and technological talents based on a hypergraph attention mechanism according to claim 2, characterized in that step S101 comprises the following steps:
S1011, determining the data input sources: the data input sources comprise talent networks, academic institutions, research centers and professional social networks, and talent information is obtained with respect to the innovation value, capability and contribution of scientific and technological talents;
S1012, determining the acquisition technology: Kafka is used to acquire streaming data, parallel crawlers are used to acquire unstructured network data, and Sqoop is used to extract structured data from traditional databases;
S1013, determining the big data storage technology: the big data are stored in the HDFS distributed file system and the HBase column-oriented database.
4. The novel intelligent evaluation method for scientific and technological talents based on a hypergraph attention mechanism according to claim 3, characterized in that step S102 comprises the following steps:
S1021, data cleaning: filling or deleting missing values to ensure data integrity; for numerical features, filling missing values with the mean, median or mode; for time series data or ordered data, filling missing values with the previous or the next observation; where the data are correlated, estimating missing values by interpolation; where missing values are few or unimportant for the analysis task, deleting the entire rows that contain them; where the data of a feature consist of missing values, deleting the entire feature; and where several consecutive data points are missing, deleting the continuous data segment;
S1022, multimodal data conversion: creating new features or transforming existing features; creating new features from domain knowledge to capture latent patterns in the data, selecting the most relevant features, scaling feature values into similar ranges, and finally converting categorical variables into numbers or one-hot codes.
5. The novel intelligent evaluation method for scientific and technological talents based on a hypergraph attention mechanism, characterized in that step S103 comprises the following steps:
S1031, analyzing the multiple attributes of scientific and technological talents in scientific research, the multiple attributes comprising the academic level of the talents, their reputation and visibility in academia, their work experience and professional history in research projects, their research papers published in academic journals and conferences, their patent applications and technical innovations, the degree of influence of their research results on society and industry, and their ability to collaborate in multidisciplinary and cross-domain teams;
S1032, according to the extracted multi-attribute talent information, extracting and integrating entities, relations and attributes with NER, NLTK, Stanford NLP and GATE, identifying the entities in the database, establishing the relations among the entities and extracting the important attribute information, so as to construct the scientific and technological talent information;
S1033, following knowledge graph construction principles, abstracting the multidimensional attributes of the scientific and technological talent information into nodes of the knowledge graph, connecting the nodes according to the evaluation mechanisms of different periods, associating the evaluation results of different periods as hyperedges, and defining the scientific and technological talent hypergraph as $\mathcal{G}$, composed of a node set $V$, an edge set $E$ and a hyperedge set $\mathcal{E}$, so that the hypergraph $\mathcal{G}$ is expressed as:
$$\mathcal{G} = (V, E, \mathcal{E})$$;
where each edge $e \in E$ is assigned a weight $w(e)$ according to the attribute bias of the evaluation mechanisms of different periods, each hyperedge $\varepsilon \in \mathcal{E}$ is assigned a weight $w(\varepsilon)$ according to the bias of the evaluation results of different periods, and $w(e)$ and $w(\varepsilon)$ indicate the importance of a connection within the whole hypergraph; after the weights are introduced, the hypergraph $\mathcal{G}$ is expressed as:
$$\mathcal{G} = (V, E, \mathcal{E}, \mathbf{W}_E, \mathbf{W}_{\mathcal{E}})$$;
$\mathbf{W}_E$ and $\mathbf{W}_{\mathcal{E}}$ are the diagonal weight matrices of the edges and hyperedges respectively,
$$\mathbf{W}_E = \operatorname{diag}\big(w(e_1), \ldots, w(e_{|E|})\big)$$;
$$\mathbf{W}_{\mathcal{E}} = \operatorname{diag}\big(w(\varepsilon_1), \ldots, w(\varepsilon_{|\mathcal{E}|})\big)$$;
where $|\cdot|$ denotes the number of elements in a set;
the structure of the hypergraph is described by an incidence matrix $\mathbf{H}$ with entries:
$$h(v, \varepsilon) = \begin{cases} 1, & v \in \varepsilon \\ 0, & v \notin \varepsilon \end{cases}$$.
6. The novel intelligent evaluation method for scientific and technological talents based on a hypergraph attention mechanism, characterized in that step S2 comprises the following steps:
S201, introducing graph embedding: using a graph convolutional neural network and one-hot encoding to convert each node of the hypergraph and its associated information into vector form, and expressing the resulting vectors as a matrix that serves as the network input;
S202, feeding the input vector matrix into a trainable embedding layer, mapping each element of the input data into a high-dimensional vector space, learning embeddings of the nodes and hyperedges with CTransR, and defining a scoring function:
$$f_r(h, t) = \lVert \mathbf{h}_{r,c} + \mathbf{r}_c - \mathbf{t}_{r,c} \rVert_1 + \alpha \lVert \mathbf{r}_c - \mathbf{r} \rVert_2^2$$;
where $\mathbf{h}_{r,c}$ and $\mathbf{t}_{r,c}$ are the entity embeddings learned under the corresponding hyperedge, $\mathbf{h}_{r,c}$ is the head-entity embedding vector of the triple for the entity pair $(h, t)$ under the relation $r$, $\mathbf{t}_{r,c}$ is the corresponding tail-entity embedding vector, $\mathbf{r}_c$ is the hyperedge relation embedding vector learned for the entity pair $(h, t)$, $\lVert \cdot \rVert_1$ and $\lVert \cdot \rVert_2$ denote the L1 norm and the L2 norm respectively, $\lVert \mathbf{r}_c - \mathbf{r} \rVert_2^2$ constrains the similarity between the cluster relation vector $\mathbf{r}_c$ and the relation vector $\mathbf{r}$, and $\alpha$ adjusts the influence of the constraint on the scoring function;
S203, applying a 1D convolution kernel to the output of the embedding layer to further extract feature information;
S204, applying the attention mechanism to the output of step S203 and computing the feature weights with cosine similarity; in layer $l$, computing the weighted embedding vector of the $i$-th node from the hyperedge embedding vector $\mathbf{e}_i^{l}$ obtained from the embedding layer:
$$\mathbf{z}_i^{l} = \mathbf{W}^{l} \mathbf{e}_i^{l}$$;
where $\mathbf{W}^{l}$ is the hyperedge weight of layer $l$ in the attention graph neural network and $\mathbf{e}_i^{l}$ is the hyperedge embedding vector of the $i$-th node in layer $l$;
from the weighted embedding vector of the $i$-th node, computing the cosine similarity to the $j$-th node in its first-order neighbor set $N_i$:
$$s_{ij}^{l} = \frac{\mathbf{z}_i^{l} \cdot \mathbf{z}_j^{l}}{\lVert \mathbf{z}_i^{l} \rVert \, \lVert \mathbf{z}_j^{l} \rVert}$$;
where $\mathbf{z}_i^{l}$ is the weighted embedding vector of the $i$-th node in layer $l$;
further obtaining the attention coefficients of the first-order neighbor nodes:
$$\alpha_{ij}^{l} = \frac{\exp\big(s_{ij}^{l}\big)}{\sum_{k \in N_i} \exp\big(s_{ik}^{l}\big)}$$;
where $s_{ij}^{l}$ is the cosine similarity between the $i$-th node and its $j$-th neighbor node in layer $l$;
obtaining the embedding of node $i$ in layer $l+1$:
$$\mathbf{h}_i^{l+1} = \sigma\Big(\sum_{j \in N_i} \alpha_{ij}^{l} \mathbf{z}_j^{l}\Big)$$;
where $\sigma$ is the sigmoid function and $\mathbf{h}_i^{l+1}$ is the embedding of node $i$ in layer $l+1$;
S205, reducing the dimensionality of the attention layer output with max pooling and extracting representative feature vectors;
S206, feeding the output of the pooling layer into a fully connected layer and mapping the extracted feature vectors to the target space with a LeakyReLU activation function to obtain the classifier output;
S207, measuring the error between the prediction and the sample label with the cross-entropy loss function, and optimizing and training the model; for a training sample $x$ with true probability distribution $p(x)$ and predicted probability distribution $q(x)$, the loss function is:
$$L = -\sum_{c=1}^{C} p_c(x) \log q_c(x)$$;
where $C$ is the number of sample classes;
S208, updating the weight parameters of the network with an optimizer to minimize the loss function, and continuing to train the network on the training set until the best performance is reached;
S209, determining the root-cause hyperparameters that affect model classification with fishbone analysis: when the talent classification accuracy of the model is insufficient, applying fishbone analysis to the output attributes, locating the parameter-setting problem, deriving a hyperparameter-setting decision, and improving the overall classification performance of the model; setting the threshold for backbone-class attributes to 0.6 and the threshold for branch-class attributes to 0.8, and including data that exceed the threshold as root causes,
$$F(v) = \begin{cases} R, & v > 0.6 \\ B, & v < 0.6 \end{cases}$$;
where $F$ denotes the fishbone analysis, $R$ the root cause attribute, $B$ the branch attribute and $v$ the attribute value; if a value in the root cause attribute exceeds the threshold 0.6, the attribute is labeled $R$; if the value in the root cause attribute is smaller than 0.6, the attribute is labeled $B$; this judgment determines the root-cause hyperparameters that affect the classification ability of the model, which are taken as the adjustment target of the network model, and the relevant parameters of the network are changed accordingly to achieve the best classification evaluation.
7. The novel intelligent evaluation method for scientific and technological talents based on a hypergraph attention mechanism, characterized in that step S3 comprises the following steps:
S301, obtaining the feature vectors of the different scientific and technological talents from the talent features in the talent knowledge hypergraph and constructing a talent matrix $T$, and at the same time constructing a post demand matrix $D$, where $T$ is constructed as follows:
;
where
;
$D$ is constructed as follows:
;
where
;
S302, after the talent matrix $T$ and the employer demand matrix $D$ are obtained, computing the similarity between the talent matrix and the demand matrix with the Jaccard similarity coefficient:
$$J(T, D) = \frac{|T \cap D|}{|T \cup D|}$$;
S303, using the K-nearest-neighbor algorithm to select the K scientific and technological talents with the highest similarity to the post requirements, forming the nearest-neighbor set U;
S304, using the Top-N method to recommend the top N scientific and technological talents from the nearest-neighbor set U to the target post, realizing intelligent recommendation of scientific and technological talents.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311623569.9A CN117314266B (en) | 2023-11-30 | 2023-11-30 | Novel intelligent scientific and technological talent evaluation method based on hypergraph attention mechanism |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311623569.9A CN117314266B (en) | 2023-11-30 | 2023-11-30 | Novel intelligent scientific and technological talent evaluation method based on hypergraph attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117314266A (en) | 2023-12-29 |
CN117314266B (en) | 2024-02-06 |
Family
ID=89260819
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311623569.9A Active CN117314266B (en) | 2023-11-30 | 2023-11-30 | Novel intelligent scientific and technological talent evaluation method based on hypergraph attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117314266B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117938669A (en) * | 2024-03-25 | 2024-04-26 | 贵州大学 | Network function chain self-adaptive arrangement method for 6G general intelligent service |
CN118210916A (en) * | 2024-05-20 | 2024-06-18 | 华东交通大学 | Scientific literature recommendation method based on hypergraph attention and enhanced contrast learning |
CN118485346A (en) * | 2024-05-30 | 2024-08-13 | 湖南工商大学 | Talent evaluation method based on big data technology and related device |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3640864A1 (en) * | 2018-10-18 | 2020-04-22 | Fujitsu Limited | A computer-implemented method and apparatus for inferring a property of a biomedical entity |
CN112905891A (en) * | 2021-03-05 | 2021-06-04 | 中国科学院计算机网络信息中心 | Scientific research knowledge map talent recommendation method and device based on graph neural network |
CN114817568A (en) * | 2022-04-29 | 2022-07-29 | 武汉科技大学 | Knowledge hypergraph link prediction method combining attention mechanism and convolutional neural network |
CN115269816A (en) * | 2022-09-01 | 2022-11-01 | 迪吉凡特(宁波)数字技术有限公司 | Core personnel mining method and device based on information processing method and storage medium |
CN116340646A (en) * | 2023-01-18 | 2023-06-27 | 云南师范大学 | Recommendation method for optimizing multi-element user representation based on hypergraph motif |
CN116702900A (en) * | 2023-06-21 | 2023-09-05 | 电子科技大学 | Knowledge hypergraph completion method based on graph structure transformation |
CN117056392A (en) * | 2022-05-07 | 2023-11-14 | 六棱镜(杭州)科技有限公司 | Big data retrieval service system and method based on dynamic hypergraph technology |
Non-Patent Citations (3)
Title |
---|
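Xu Bingbing; Cen Keting; Huang Junjie; Shen Huawei; Cheng Xueqi: "A Survey on Graph Convolutional Neural Networks" (图卷积神经网络综述), Chinese Journal of Computers (计算机学报), no. 05, pages 755 - 780 * |
Lin Jingjing et al.: "A Survey of Hypergraph Neural Networks" (超图神经网络综述), Journal of Computer Research and Development (《计算机研究与发展》), pages 1 - 26 * |
Shen Zhenguo: "Construction of a Talent Knowledge Graph Based on Resume Text Data" (基于简历文本数据的人才知识图谱构建), China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库 信息科技辑》), no. 3, pages 138 - 2993 * |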
Also Published As
Publication number | Publication date |
---|---|
CN117314266B (en) | 2024-02-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN117314266B (en) | Novel intelligent scientific and technological talent evaluation method based on hypergraph attention mechanism | |
CN109492157B (en) | News recommendation method and theme characterization method based on RNN and attention mechanism | |
CN108009674A (en) | Air PM2.5 concentration prediction methods based on CNN and LSTM fused neural networks | |
CN115934990B (en) | Remote sensing image recommendation method based on content understanding | |
Piao et al. | Housing price prediction based on CNN | |
Dariane et al. | Forecasting streamflow by combination of a genetic input selection algorithm and wavelet transforms using ANFIS models | |
Wei et al. | Forecasting the daily natural gas consumption with an accurate white-box model | |
CN108960488B (en) | Saturated load spatial distribution accurate prediction method based on deep learning and multi-source information fusion | |
Huang et al. | Research on urban modern architectural art based on artificial intelligence and GIS image recognition system | |
CN114741519A (en) | Paper correlation analysis method based on graph convolution neural network and knowledge base | |
Hatim et al. | Addressing challenges and demands of intelligent seasonal rainfall forecasting using artificial intelligence approach | |
CN107748940A (en) | A kind of energy conservation potential Quantitative prediction methods | |
CN111062511B (en) | Aquaculture disease prediction method and system based on decision tree and neural network | |
CN114757433B (en) | Method for rapidly identifying relative risk of drinking water source antibiotic resistance | |
CN115099450A (en) | Family carbon emission monitoring and accounting platform based on fusion model | |
CN115859450A (en) | Building modeling data processing method and system based on BIM technology | |
CN116720743A (en) | Carbon emission measuring and calculating method based on data clustering and machine learning | |
CN115018357A (en) | Farmer portrait construction method and system for production performance improvement | |
Sun | Real estate evaluation model based on genetic algorithm optimized neural network | |
Foroughi et al. | Capturing experts’ knowledge in heritage planning enhanced by AI: A case study of windcatchers in Yazd, Iran | |
CN117077005B (en) | Optimization method and system for urban micro-update potential | |
CN113129188A (en) | Provincial education teaching evaluation system based on artificial intelligence big data | |
Nan et al. | Heuristic bivariate forecasting model of multi-attribute fuzzy time series based on fuzzy clustering | |
CN115660221B (en) | Oil and gas reservoir economic recoverable reserve assessment method and system based on hybrid neural network | |
CN116662860A (en) | User portrait and classification method based on energy big data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||