CN103631859A - Intelligent review expert recommending method for science and technology projects - Google Patents


Info

Publication number
CN103631859A
CN103631859A (application CN201310509358.2A)
Authority
CN
China
Prior art keywords
word
science
feature
node
expert
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310509358.2A
Other languages
Chinese (zh)
Other versions
CN103631859B (en)
Inventor
徐小良
吴仁克
林建海
陈秋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN201310509358.2A priority Critical patent/CN103631859B/en
Publication of CN103631859A publication Critical patent/CN103631859A/en
Application granted granted Critical
Publication of CN103631859B publication Critical patent/CN103631859B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/328 Management therefor

Abstract

The invention provides an intelligent review expert recommendation method for science and technology projects. The method includes the following steps: (1) the main texts of the science and technology projects to be reviewed and of the expert information are segmented into substring sequences, the substring sequences are segmented into words with the Chinese Academy of Sciences ICTCLAS tool, and stop-word filtering is applied to the segmentation result to obtain a word set; (2) a term network of the project information is built and feature words are extracted on the basis of statistical characteristics and aggregation characteristics, while the expert information, being comparatively concise, uses the word set obtained in step (1) directly as its feature words; (3) a knowledge representation model is built on the basis of the fields and weights of the feature words, and the corresponding information index is built; (4) when experts are recommended for a group of projects, feature merging operations between fields and between projects are applied to the knowledge representation model; (5) the similarity between the experts and the science and technology project or project group to be reviewed is computed on the basis of semantics, a truncation threshold is applied, and the final list of recommended experts is generated. The method greatly alleviates the heavy workload of manual recommendation and the lack of scientific grounding in review decisions.

Description

Intelligent recommendation method of review experts for science and technology projects
Technical field
The invention belongs to the field of expert recommendation technology, and in particular relates to a network-service-based intelligent recommendation method of review experts for science and technology projects; it is an intelligent method for assisting funding decisions on science and technology projects.
Background art
With the rapid spread of science and technology project management systems across Chinese government departments, project review has developed from the earlier centralized conference model into the current online model, removing the regional restriction on review experts. Review experts appraise a project application according to their domain knowledge and the funding criteria of the funding agency, and the agency decides whether to fund the project according to the experts' appraisals.
At present, review experts for science and technology projects are mostly recommended according to the project manager's subjective judgment, and a project under review often needs several experts. Manual recommendation therefore inevitably suffers from low efficiency, heavy workload and a lack of scientific grounding, and the selected experts are not necessarily the most suitable ones. Research on the intelligent recommendation of review experts is thus crucial: it can effectively alleviate the mismatch between experts and the contents of the projects they review and greatly improve the public-service capability of project review.
Current intelligent recommendation techniques, such as collaborative filtering and content-based recommendation, are mostly applied to film or product recommendation websites and have rarely been studied or applied to review-expert databases for science and technology projects. Owing to the restrictions of this specific domain, recommending experts for science and technology projects differs from general recommendation: first, a project management system covers all trades and professions, so the domain knowledge involved is very complex; second, the recommendation concerns the funding of projects, so the requirements on the objectivity, fairness and accuracy of the recommendation are very high. In this respect China still lacks systematic methodological guidance and mature technical support. Because the information texts are semi-structured, the expert information can be matched against the information of the projects under review; the invention makes full use of structural features and word semantics to compute the similarity between project information and expert information. A high similarity indicates that the expert is familiar with the project, and a list of recommended experts is produced for the review. The invention also provides a decision support system (DSS) for recommending review experts for science and technology projects: review experts are assigned to projects whose domain knowledge matches theirs, which helps the decision-making user reach scientific decisions, improves decision-making level and quality, and makes the review more scientific and objective.
Summary of the invention
In view of the deficiencies of the prior art, the invention provides an intelligent recommendation method of review experts for science and technology projects.
The recommendation process for review experts of science and technology projects according to the invention comprises the following steps:
Step 1. The general terms and habitual words appearing in project and expert information are taken as a professional stop-word dictionary; punctuation marks and non-Chinese characters are taken as the cutting-mark library.
Step 2. The project information and the expert information are segmented: according to the cutting marks, fields of the project information such as the project name, main research contents and technical indicators are cut into substring sequences; likewise, fields of the review expert's information such as undertaken projects and achievements, awards, inventions, published papers and research directions are extracted and cut into substring sequences, one substring sequence corresponding to one field; the substring sequences are then segmented into words with the Chinese Academy of Sciences ICTCLAS tool.
Step 3. Feature word extraction for science and technology projects: stop-word filtering is applied to the segmentation result with a general stop-word dictionary and the professional stop-word dictionary; the general stop-word dictionary adopts the Harbin Institute of Technology stop-word list, and the segmentation result with stop words removed forms a word set.
The construction of the professional stop-word dictionary is a continuously improving self-learning process: word frequencies are accumulated during segmentation, and a word whose probability of occurrence in the texts exceeds a certain threshold is added to the stop-word dictionary.
Because the amount of project information is large, the semantic similarity between the words of the word set is computed, a term network is built from the semantic relations and co-occurrence relations of the words, and the aggregation feature value of each word in the network is computed; combined with the statistical features of the words, the key degree of each word is then computed and the project feature words are extracted. The feature words of a project thus combine the statistical feature information and the semantic feature information of the text, which makes the extraction more accurate.
The semantic similarity is computed as follows:
In the HowNet semantic dictionary, suppose word W1 has n concepts S11, S12, ..., S1n and word W2 has m concepts S21, S22, ..., S2m. The similarity SimSEM(W1, W2) of W1 and W2 equals the maximum similarity over all concept pairs:
SimSEM(W1, W2) = max_{i=1..n, j=1..m} Sim(S1i, S2j)
Notional words and function words have different description languages, so the similarity between their corresponding sememes or relational sememe descriptions has to be computed. A notional-word concept is described by its first basic sememe, its other basic sememes, its relational sememe description and its relational symbol description, whose similarities are denoted Sim1(p1, p2), Sim2(p1, p2), Sim3(p1, p2) and Sim4(p1, p2). The similarity of two such feature structures finally reduces to the similarity of basic sememes or of concrete words:
Sim(S1, S2) = Σ_{i=1..4} βi·Simi(S1, S2)
where the βi (1 ≤ i ≤ 4) are adjustable parameters satisfying β1 + β2 + β3 + β4 = 1 and β1 ≥ β2 ≥ β3 ≥ β4.
Let CW = {C1, C2, ..., Cm} be the word set obtained after the above processing. Its semantic similarity adjacency matrix Sm is defined by Sm[i][j] = Sim(Ci, Cj), where Sim(Ci, Cj) is the semantic similarity of words Ci and Cj, Sim(Ci, Ci) = 1 and Sim(Ci, Cj) = Sim(Cj, Ci).
The semantic similarity computation over the word set CW = {C1, C2, ..., Cm} therefore yields m·(m+1)/2 similarity values.
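As a minimal illustration of this step, the following Python sketch builds the symmetric similarity matrix Sm for a word set; the callable concept_sim stands in for the HowNet-based sememe similarity described above and is an assumption, not the patented code.
```python
# Minimal sketch, assuming concept_sim(s1, s2) implements the HowNet-based
# sememe similarity Sim(S1i, S2j); it is a stand-in, not the patented code.
from itertools import combinations
from typing import Callable, Dict, List, Sequence

def word_similarity(concepts1: Sequence[str], concepts2: Sequence[str],
                    concept_sim: Callable[[str, str], float]) -> float:
    # SimSEM(W1, W2) = maximum over all concept pairs (S1i, S2j)
    return max(concept_sim(s1, s2) for s1 in concepts1 for s2 in concepts2)

def similarity_matrix(words: List[str],
                      concepts: Dict[str, Sequence[str]],
                      concept_sim: Callable[[str, str], float]) -> List[List[float]]:
    m = len(words)
    S = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]  # Sim(Ci, Ci) = 1
    for i, j in combinations(range(m), 2):              # m*(m-1)/2 distinct word pairs
        s = word_similarity(concepts[words[i]], concepts[words[j]], concept_sim)
        S[i][j] = S[j][i] = s                           # Sim(Ci, Cj) = Sim(Cj, Ci)
    return S
```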
The co-occurrence relation of the words is computed as follows:
The word co-occurrence model is one of the important models in statistical natural language processing. According to this model, if two words frequently co-occur in the same window unit of a document (for example a sentence or a paragraph), the two words are related in meaning and to a certain extent express the semantic information of the text. A moving window of length 3 is used to compute the word co-occurrence degree over the word sequence, as shown in Fig. 1:
First, the word sequence is scanned, spaces and empty tokens are removed and identical words are merged, yielding the word set CW = {C1, C2, ..., Cm}, where m ≤ n and n is the length of the word sequence.
The word co-occurrence degree matrix Cm corresponding to the word set CW is defined by Cm[i][j] = Coo(Ci, Cj); initially Coo(Ci, Cj) = 0 (1 ≤ i, j ≤ m).
The co-occurrence degree is computed over the word sequence with the moving window, whose words are T(i-1) T(i) T(i+1) (1 < i < n):
1) If i = n-1, go to 4); if T(i-1) is a space or empty, the window slides to the next word, i++; otherwise go to 2).
2) If T(i) is Chinese, Coo(T(i-1), T(i))++ and go to 3); if T(i) is empty, go to 3); otherwise go to 1).
3) If T(i+1) is Chinese, Coo(T(i-1), T(i+1))++, i++, and go to 1); otherwise go to 1).
4) If T(n-2) is Chinese, go to 5); otherwise go to 7).
5) If T(n-1) is Chinese, Coo(T(n-2), T(n-1))++ and go to 6); if T(n-1) is a space, go to 6); otherwise end.
6) If T(n) is Chinese, Coo(T(n-2), T(n))++ and end; otherwise end.
7) If T(n-1) is Chinese and T(n) is also Chinese, Coo(T(n-1), T(n))++ and end; otherwise end.
After these steps the co-occurrence degree matrix Cm is obtained, and each element of Cm is normalized by dividing it by the maximum element of the matrix, i.e. by max{Coo(Ci, Cj) | 1 ≤ i, j ≤ m}.
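A simplified Python sketch of the window-3 co-occurrence counting follows; the Chinese/space tests are reduced to simple predicates and all Chinese pairs inside the window are counted, so it approximates rather than reproduces the step list above.
```python
# Simplified sketch of the moving-window (length 3) co-occurrence counting;
# counts are normalised by the matrix maximum, as described in the text.
import re
from collections import defaultdict

CJK = re.compile(r'[\u4e00-\u9fff]')

def is_chinese(tok: str) -> bool:
    return bool(tok) and bool(CJK.search(tok))

def cooccurrence(tokens):
    coo = defaultdict(float)
    n = len(tokens)
    for i in range(1, n - 1):                   # window T(i-1) T(i) T(i+1)
        a, b, c = tokens[i - 1], tokens[i], tokens[i + 1]
        if not is_chinese(a):
            continue
        if is_chinese(b):
            coo[(a, b)] += 1                    # adjacent pair
        if is_chinese(c):
            coo[(a, c)] += 1                    # pair bridging the middle token
    if coo:
        mx = max(coo.values())                  # normalise by max{Coo(Ci, Cj)}
        coo = {pair: v / mx for pair, v in coo.items()}
    return coo
```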
The term network is built as follows:
To build the weighted term network, its weight matrix Wm is obtained first. It is defined as a weighted combination of the co-occurrence degree matrix and the semantic similarity matrix, Wm = α·Cm + β·Sm, where α is 0.3 and β is 0.7; this strengthens the semantic relations between words and weakens their co-occurrence relations.
Wm serves as the adjacency matrix of the term network. The corresponding network graph is defined as G = {V, E}, where G is an undirected weighted graph, V denotes the vertex set of G, E denotes the edge set of G, and v_i denotes the i-th vertex (word) in V.
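A minimal sketch of forming the weight matrix follows; the combination Wm = α·Cm + β·Sm is a reconstruction inferred from the stated parameters, since the exact formula appears only as an image in the original publication.
```python
# Sketch of the weighted term-network adjacency matrix; Wm = alpha*Cm + beta*Sm
# is inferred from the surrounding text, not copied from the patent drawings.
import numpy as np

def term_network_weights(Cm: np.ndarray, Sm: np.ndarray,
                         alpha: float = 0.3, beta: float = 0.7) -> np.ndarray:
    W = alpha * Cm + beta * Sm      # emphasise semantic relations over co-occurrence
    np.fill_diagonal(W, 0.0)        # the undirected graph G = {V, E} has no self-loops
    return W
```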
The word aggregation feature value is computed as follows:
The important characteristics of a term network are its degree distribution, average shortest path, clustering degree and clustering coefficient. The degree of a node reflects how the node is connected to other nodes; the clustering degree and clustering coefficient of a node reflect how densely the nodes in its neighbourhood are interconnected, and together with the degree they reflect the importance of the node within its local range. The invention computes the aggregation feature value of a node from its weighted degree, its clustering coefficient and its betweenness, so that important words receive higher weights while words related to many important words also score highly.
In the semantic similarity network graph, the unordered pair (v_i, v_j) denotes the edge between nodes v_i and v_j. The weighted degree of node v_i is defined as
WD_i = ( Σ_{j=1..n} w_ij ) / n
where w_ij is the weight of the edge between v_i and v_j and n is the total number of nodes.
The unweighted degree of node v_i is D_i = |{(v_i, v_j) : (v_i, v_j) ∈ E, v_i, v_j ∈ V}|. The clustering degree T_i of node v_i is the number of edges actually existing between its neighbours: T_i = |{(v_j, v_k) : (v_i, v_j) ∈ E, (v_i, v_k) ∈ E, (v_j, v_k) ∈ E}|. The clustering coefficient C_i of node v_i is defined as
C_i = T_i / C(D_i, 2) = 2·T_i / ( D_i·(D_i - 1) )
In the semantic similarity network graph, the betweenness B_i of node v_i measures the probability that a shortest path between two other nodes w and x passes through v_i. The communication between two non-adjacent nodes depends on the nodes lying on the shortest paths connecting them, and such nodes potentially control the information flow between the pair; B_i reflects the connecting role of node v_i in its local environment and is defined as
B_i = Σ_{w,x ∈ G} d(w, x; v_i) / d(w, x)
where d(w, x) is the number of shortest paths between two nodes w and x, and d(w, x; v_i) is the number of those shortest paths that pass through v_i.
The aggregation feature value Z_i of node v_i is a weighted combination of its weighted degree, clustering coefficient and betweenness:
Z_i = a·WD_i + b·C_i / Σ_{j=1..n} C_j + c·B_i
where a + b + c = 1.
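A sketch of the aggregation score using networkx for the clustering coefficient and betweenness follows; the mixing weights a, b, c are illustrative values (only a + b + c = 1 is given above), and edge weights are ignored in the shortest-path counts of this sketch.
```python
# Sketch of Z_i = a*WD_i + b*C_i/sum(C_j) + c*B_i over the weighted term network.
import networkx as nx
import numpy as np

def aggregation_scores(W: np.ndarray, a: float = 0.4, b: float = 0.3,
                       c: float = 0.3) -> np.ndarray:
    n = W.shape[0]
    G = nx.from_numpy_array(W)                  # undirected weighted graph
    WD = W.sum(axis=1) / n                      # weighted degree WD_i
    clus = nx.clustering(G)                     # clustering coefficient C_i
    C = np.array([clus[i] for i in range(n)])
    btw = nx.betweenness_centrality(G)          # node betweenness B_i
    B = np.array([btw[i] for i in range(n)])
    C_norm = C / C.sum() if C.sum() > 0 else C
    return a * WD + b * C_norm + c * B
```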
The statistical feature of a word is computed as follows:
A nonlinear function is used to normalize the word frequency. The word frequency weight TF_i of word W_i in the text is defined as
TF_i = f(W_i) / Σ_{j=1..n} f(p_j)
where TF_i denotes the word frequency weight of W_i, p_j denotes a word of the text, and f is the word-frequency counting function.
In Chinese text the words able to mark the characteristics of a text are generally notional words such as nouns, verbs and adjectives, whereas function words such as interjections, prepositions and conjunctions contribute almost nothing to determining the text category and would introduce considerable noise if extracted as feature words. The part-of-speech weight pos_i of word W_i in the text is therefore defined by a table that assigns higher values to notional parts of speech and lower values to function words (the table is given only as an image in the original publication).
Longer words convey more concrete information, whereas the meaning expressed by shorter words is usually more abstract. In particular, the feature words of a document are mostly combinations of specialized academic terms; the longer they are, the more definite their meaning and the better they reflect the topic of the text. Increasing the weight of long words therefore helps to reflect more accurately the importance of a word in the document. The word-length weight len_i of word W_i in the text is defined as a function increasing with the length of the word (the definition is given only as an image in the original publication).
For each word in the word sequence, its statistical feature is
stats_i = A·TF_i + B·pos_i + C·len_i
where A + B + C = 1.
The key degree of word W_i is computed as follows:
For each node of the weighted term network, its key degree Imp_i is defined as
Imp_i = β·stats_i + (1 - β)·Z_i
where 0 < β < 1.
The computed key degrees are sorted in descending order, a threshold γ (0 < γ < 1) is set, and the first q values are taken; the corresponding words are used as the feature words of the science and technology project, since they fully reflect its topic and are the important words.
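A sketch of combining the statistical weight with the aggregation score and selecting the feature words follows; the part-of-speech and word-length tables and the reading of γ as a top fraction are illustrative assumptions, since the original definitions appear only as images.
```python
# Sketch of stats_i = A*TF_i + B*pos_i + C*len_i and Imp_i = beta*stats_i + (1-beta)*Z_i;
# pos_weight, len_weight and the top-fraction use of gamma are assumptions.
import numpy as np

def pos_weight(tag: str) -> float:
    return {"n": 1.0, "v": 0.8, "a": 0.6}.get(tag, 0.2)     # assumed table

def len_weight(word: str) -> float:
    return min(len(word), 4) / 4.0                          # assumed scaling

def key_degree(tf, tags, words, Z, A=0.5, B=0.3, C=0.2, beta=0.6):
    stats = (A * np.asarray(tf)
             + B * np.array([pos_weight(t) for t in tags])
             + C * np.array([len_weight(w) for w in words]))
    return beta * stats + (1.0 - beta) * np.asarray(Z)      # Imp_i

def select_feature_words(words, imp, gamma=0.3):
    order = np.argsort(-np.asarray(imp))                    # descending key degree
    q = max(1, int(gamma * len(words)))                     # keep the first q words
    return [words[i] for i in order[:q]]
```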
Step 4. Review-expert feature word extraction: the amount of expert information is small compared with project information, so the project technique, which builds a network and extracts feature words from statistical and semantic features, is not suitable for expert information. Stop-word filtering is performed directly with the general stop-word dictionary and the professional stop-word dictionary, and the feature word set of each expert is extracted. The general stop-word dictionary is again the Harbin Institute of Technology stop-word list, and the professional stop-word dictionary has to be maintained continuously.
Step 5. Per-field knowledge representation models of science and technology projects and review experts are built. By extending the vector space model and the matter-element knowledge set model, a text representation model PRO = (id, F, WF, T, V) is established according to the different field information of a project, where id denotes the identification field in the project library, F denotes the set of field categories of the project, WF is the set of field weights, T is the set of feature words, and V denotes the feature words of each field with their weights, V_i = {v_i1, f(v_i1), v_i2, f(v_i2), ..., v_in, f(v_in)}, where v_ij denotes the j-th feature word of the i-th field and f(v_ij) denotes the frequency of v_ij. Similarly, a knowledge representation model TM = (id, F, WF, T, V) is established according to the different field information of an expert, where id denotes the identification field in the expert library, F denotes the set of field categories of the expert, WF is the set of field weights, T is the set of feature words, and V denotes the feature words of each field with their weights, f(v_ij) being the frequency of occurrence of feature word v_ij in the corresponding field. (The matrix-style representations of the project and expert knowledge are given only as images in the original publication.)
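A data-structure sketch of the per-field model (id, F, WF, T, V) follows; the class and attribute names are illustrative, not taken from the patent.
```python
# Sketch of the knowledge representation model used for both projects (PRO)
# and experts (TM); names are illustrative.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class FieldVector:
    weight: float                                           # WF: weight of this field
    words: Dict[str, float] = field(default_factory=dict)   # v_ij -> f(v_ij)

@dataclass
class KnowledgeModel:
    id: str                                                  # identification field in the library
    fields: Dict[str, FieldVector] = field(default_factory=dict)  # F -> (WF, V)

    def feature_words(self) -> List[str]:                    # T: all feature words
        return [w for fv in self.fields.values() for w in fv.words]
```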
Step 5. Construction of the review-expert information index library: after the expert knowledge representation models have been built, the information indexes are stored: first, the content information of a review expert is read from the expert database; a word semantic network is built on the segmentation result and the feature words of the expert are extracted; an index is built for the expert with Apache Lucene according to the knowledge representation model; the index is added to the index library of the corresponding category, and this is repeated until all review experts have been indexed.
Step 6. According to the number of projects, the recommendation mode is divided into expert recommendation for a single project under review and expert recommendation for a group of (several) projects under review. For group recommendation, the feature merging operations between fields and between projects are applied to the project knowledge representation models of step 5; for a single project, only the merging between fields is applied. At the same time, the merging between fields is applied to the expert knowledge representation models of step 5. Indexes are then built on the merged feature information with Apache Lucene according to the knowledge representation model. The project index is built at recommendation time.
In a project application management system, projects under review frequently need group recommendation. The feature merging operations defined below preserve the different field weights set in the knowledge representation models of step 5, so that the fields keep their different contributions to the similarity computation and hence to the recommendation.
The feature merging of projects under review and of review experts, denoted by the operator ⊕, is carried out as follows:
(1) Merging the features between the fields of one project under review or one review expert
Suppose the field feature word sets W'1 and W'2 are to be merged. The merge rule W'1 ⊕ W'2 is defined as
W'1 ⊕ W'2 = { ∀i, j: { word_1i, ( f(word_1i) + f(word_2j) ) / 2 } | word_1i = word_2j }
where word_1i and word_2j are feature words.
This definition is improved and extended by adding the field weights; the features between fields of a review expert or of a science and technology project are merged according to the rule
W'1 ⊕ W'2 = { ∀i, j: { word_1i, ( w1·f(word_1i) + w2·f(word_2j) ) / ( w1² + w2² ) } | word_1i = word_2j }
(2) Merging the features between the projects of a group of projects under review
This merging operation is applied only to the feature vectors of the projects under review, not to the expert feature vectors; an expert feature vector only undergoes the merging between fields. Let V(d1) and V(d2) be the vector models of two projects after their fields have been merged. For any t_1i ∈ V(d1) and t_2j ∈ V(d2) with t_1i identical to t_2j, the merge V(d1) ⊕ V(d2) is defined as
V(d1) ⊕ V(d2) = { ⟨ t_k, w_k(p) = ( w_i(d1) + w_j(d2) ) / 2 ⟩ }
where k = 1, ..., n, t_k is a feature term and w_k(p) is the weight of t_k.
The knowledge representation model of a group of science and technology projects is produced as follows (a sketch of both merge operations is given after this list):
a) The fields of each project are merged, giving the vector model V(d) of each project;
b) The merge strategy above is applied to the set of all project vector models, yielding the vector-space knowledge representation model of the group:
V(p) = { ⟨t_1, w_1(p)⟩, ⟨t_2, w_2(p)⟩, ..., ⟨t_n, w_n(p)⟩ }
where k = 1, ..., n, t_k is a feature word term of the project group and w_k(p) is the weight of t_k.
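The following Python sketch illustrates the two merge operations; how words present on only one side are treated is not spelled out in the patent, so carrying them over unchanged is an assumption.
```python
# Sketch of the field-to-field merge (weighted by the field weights w1, w2)
# and of the project-to-project merge inside a group; unmatched words are
# simply carried over, which is an assumption.
from typing import Dict

def merge_fields(v1: Dict[str, float], w1: float,
                 v2: Dict[str, float], w2: float) -> Dict[str, float]:
    out = dict(v1)
    norm = w1 * w1 + w2 * w2
    for word, f2 in v2.items():
        if word in out:
            out[word] = (w1 * out[word] + w2 * f2) / norm   # weighted merge rule
        else:
            out[word] = f2                                  # assumption: keep unmatched words
    return out

def merge_projects(v1: Dict[str, float], v2: Dict[str, float]) -> Dict[str, float]:
    out = dict(v1)
    for word, f2 in v2.items():
        out[word] = (out.get(word, f2) + f2) / 2.0          # average matched weights
    return out
```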
Step 7. After the merging between fields of step 6 has been applied to the knowledge representation models of the review experts and of the science and technology projects, suppose the information vector of a review expert is P = {s_1, f(s_1), s_2, f(s_2), ..., s_n, f(s_n)} and the information vector of a project (group) is Q = {t_1, f(t_1), t_2, f(t_2), ..., t_n, f(t_n)}. The semantic similarity between the vector of the project (group) under review and the vector of the review expert is computed with a maximum matching algorithm.
Step 8. A similarity truncation threshold is set, a recommendation index is generated according to the magnitude of the similarity, and the final list of recommended review experts is produced.
The beneficial effects of the invention are as follows:
Review experts for science and technology projects can be recommended more conveniently, intelligently and accurately; the workload of assigning review experts to projects in a project application management system can be greatly reduced, lowering management costs; a high domain match between review experts and the projects under review can be guaranteed, so that the review is objective, fair and scientific; and automatic, efficient and impartial decision support is provided, avoiding improper review problems such as personal-connection networks and the 'Matthew effect'.
Description of the drawings
Fig. 1 shows the moving window used for computing the word co-occurrence degree in the invention.
Fig. 2 is a schematic diagram of the bipartite-graph-based maximum matching algorithm of the invention.
Fig. 3 is a flow chart of the intelligent recommendation method of review experts for science and technology projects of the invention.
Fig. 4 is a flow chart of the extraction of feature words from project information and review-expert information in the invention.
Fig. 5 is a flow chart of the construction of the review-expert knowledge index library in the invention.
Embodiment
The invention is further described below with reference to the accompanying drawings. It should be emphasized that the following description is only exemplary and is not intended to limit the scope or application of the invention. The specific embodiments of the invention are described in further detail below; all other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative work belong to the protection scope of the invention.
As shown in Fig. 3, the main idea of the recommendation method of the invention is: (1) for the expert information and the information of the projects under review in the project application management system, the main texts are cut into substring sequences and segmented with the Chinese Academy of Sciences ICTCLAS tool, and stop-word filtering is applied to the segmentation result to obtain the word sets; (2) project information includes main research contents, technical indicators and other fields and its amount is large, so the invention builds a term network from the semantic relations and co-occurrence relations of the words, computes the node aggregation feature values of the network, computes the key degree of each word together with its statistical features, and thereby extracts the feature words of each project; (3) expert information is more concise than project information and its amount is small, so the word set obtained after filtering each expert's information is used directly as its feature words; (4) field weights are set according to the different importance of the fields of project and expert information, the knowledge representation models of projects and experts are built from the feature words obtained in (2) and (3), and the expert index library is built; (5) for group recommendation the merging operations between fields and between projects are applied to the project knowledge representation models, while for single-project recommendation only the merging between fields is applied; the expert knowledge representation models undergo the merging between fields at the same time; (6) taking into account the fuzzy semantic matching of words, the similarity between the expert information and the information of the projects under review is computed, a truncation threshold is set, and the final list of recommended experts is generated.
Step 1. The general terms and habitual words appearing in project and expert information are taken as a professional stop-word dictionary; punctuation marks and non-Chinese characters are taken as the cutting-mark library.
Step 2. The project information and the expert information are segmented: according to the cutting marks, fields of the project information such as the project name, main research contents and technical indicators are cut into substring sequences; likewise, fields of the review expert's information such as undertaken projects and achievements, awards, inventions, published papers and research directions are extracted and cut into substring sequences, one substring sequence corresponding to one field; the substring sequences are then segmented into words with the Chinese Academy of Sciences ICTCLAS tool.
Step 3. Feature word extraction for science and technology projects: stop-word filtering is applied to the segmentation result with a general stop-word dictionary and the professional stop-word dictionary; the general stop-word dictionary adopts the Harbin Institute of Technology stop-word list, and the segmentation result with stop words removed forms a word set (see Fig. 4).
The construction of the professional stop-word dictionary is a continuously improving self-learning process: word frequencies are accumulated during segmentation, and a word whose probability of occurrence in the texts exceeds a certain threshold is added to the stop-word dictionary.
Because the amount of project information is large, the semantic similarity between the words of the word set is computed, a term network is built from the semantic relations and co-occurrence relations of the words, and the aggregation feature value of each word in the network is computed; combined with the statistical features of the words, the key degree of each word is then computed and the project feature words are extracted. The feature words of a project thus combine the statistical feature information and the semantic feature information of the text, which makes the extraction more accurate.
The semantic similarity is computed as follows:
In the HowNet semantic dictionary, suppose word W1 has n concepts S11, S12, ..., S1n and word W2 has m concepts S21, S22, ..., S2m. The similarity SimSEM(W1, W2) of W1 and W2 equals the maximum similarity over all concept pairs:
SimSEM(W1, W2) = max_{i=1..n, j=1..m} Sim(S1i, S2j)
Notional words and function words have different description languages, so the similarity between their corresponding sememes or relational sememe descriptions has to be computed. A notional-word concept is described by its first basic sememe, its other basic sememes, its relational sememe description and its relational symbol description, whose similarities are denoted Sim1(p1, p2), Sim2(p1, p2), Sim3(p1, p2) and Sim4(p1, p2). The similarity of two such feature structures finally reduces to the similarity of basic sememes or of concrete words:
Sim(S1, S2) = Σ_{i=1..4} βi·Simi(S1, S2)
where the βi (1 ≤ i ≤ 4) are adjustable parameters satisfying β1 + β2 + β3 + β4 = 1 and β1 ≥ β2 ≥ β3 ≥ β4.
Let CW = {C1, C2, ..., Cm} be the word set obtained after the above processing. Its semantic similarity adjacency matrix Sm is defined by Sm[i][j] = Sim(Ci, Cj), where Sim(Ci, Cj) is the semantic similarity of words Ci and Cj, Sim(Ci, Ci) = 1 and Sim(Ci, Cj) = Sim(Cj, Ci).
The semantic similarity computation over the word set CW = {C1, C2, ..., Cm} therefore yields m·(m+1)/2 similarity values.
The co-occurrence relation of the words is computed as follows:
The word co-occurrence model is one of the important models in statistical natural language processing. According to this model, if two words frequently co-occur in the same window unit of a document (for example a sentence or a paragraph), the two words are related in meaning and to a certain extent express the semantic information of the text. A moving window of length 3 is used to compute the word co-occurrence degree over the word sequence, as shown in Fig. 1:
First, the word sequence is scanned, spaces and empty tokens are removed and identical words are merged, yielding the word set CW = {C1, C2, ..., Cm}, where m ≤ n and n is the length of the word sequence.
The word co-occurrence degree matrix Cm corresponding to the word set CW is defined by Cm[i][j] = Coo(Ci, Cj); initially Coo(Ci, Cj) = 0 (1 ≤ i, j ≤ m).
The co-occurrence degree is computed over the word sequence with the moving window, whose words are T(i-1) T(i) T(i+1) (1 < i < n):
1) If i = n-1, go to 4); if T(i-1) is a space or empty, the window slides to the next word, i++; otherwise go to 2).
2) If T(i) is Chinese, Coo(T(i-1), T(i))++ and go to 3); if T(i) is empty, go to 3); otherwise go to 1).
3) If T(i+1) is Chinese, Coo(T(i-1), T(i+1))++, i++, and go to 1); otherwise go to 1).
4) If T(n-2) is Chinese, go to 5); otherwise go to 7).
5) If T(n-1) is Chinese, Coo(T(n-2), T(n-1))++ and go to 6); if T(n-1) is a space, go to 6); otherwise end.
6) If T(n) is Chinese, Coo(T(n-2), T(n))++ and end; otherwise end.
7) If T(n-1) is Chinese and T(n) is also Chinese, Coo(T(n-1), T(n))++ and end; otherwise end.
After these steps the co-occurrence degree matrix Cm is obtained, and each element of Cm is normalized by dividing it by the maximum element of the matrix, i.e. by max{Coo(Ci, Cj) | 1 ≤ i, j ≤ m}.
The term network is built as follows:
To build the weighted term network, its weight matrix Wm is obtained first. It is defined as a weighted combination of the co-occurrence degree matrix and the semantic similarity matrix, Wm = α·Cm + β·Sm, where α is 0.3 and β is 0.7; this strengthens the semantic relations between words and weakens their co-occurrence relations.
Wm serves as the adjacency matrix of the term network. The corresponding network graph is defined as G = {V, E}, where G is an undirected weighted graph, V denotes the vertex set of G, E denotes the edge set of G, and v_i denotes the i-th vertex (word) in V.
The word aggregation feature value is computed as follows:
The important characteristics of a term network are its degree distribution, average shortest path, clustering degree and clustering coefficient. The degree of a node reflects how the node is connected to other nodes; the clustering degree and clustering coefficient of a node reflect how densely the nodes in its neighbourhood are interconnected, and together with the degree they reflect the importance of the node within its local range. The invention computes the aggregation feature value of a node from its weighted degree, its clustering coefficient and its betweenness, so that important words receive higher weights while words related to many important words also score highly.
In the semantic similarity network graph, the unordered pair (v_i, v_j) denotes the edge between nodes v_i and v_j. The weighted degree of node v_i is defined as
WD_i = ( Σ_{j=1..n} w_ij ) / n
where w_ij is the weight of the edge between v_i and v_j and n is the total number of nodes.
The unweighted degree of node v_i is D_i = |{(v_i, v_j) : (v_i, v_j) ∈ E, v_i, v_j ∈ V}|. The clustering degree T_i of node v_i is the number of edges actually existing between its neighbours: T_i = |{(v_j, v_k) : (v_i, v_j) ∈ E, (v_i, v_k) ∈ E, (v_j, v_k) ∈ E}|. The clustering coefficient C_i of node v_i is defined as
C_i = T_i / C(D_i, 2) = 2·T_i / ( D_i·(D_i - 1) )
In the semantic similarity network graph, the betweenness B_i of node v_i measures the probability that a shortest path between two other nodes w and x passes through v_i. The communication between two non-adjacent nodes depends on the nodes lying on the shortest paths connecting them, and such nodes potentially control the information flow between the pair; B_i reflects the connecting role of node v_i in its local environment and is defined as
B_i = Σ_{w,x ∈ G} d(w, x; v_i) / d(w, x)
where d(w, x) is the number of shortest paths between two nodes w and x, and d(w, x; v_i) is the number of those shortest paths that pass through v_i.
The aggregation feature value Z_i of node v_i is a weighted combination of its weighted degree, clustering coefficient and betweenness:
Z_i = a·WD_i + b·C_i / Σ_{j=1..n} C_j + c·B_i
where a + b + c = 1.
The statistical feature of a word is computed as follows:
A nonlinear function is used to normalize the word frequency. The word frequency weight TF_i of word W_i in the text is defined as
TF_i = f(W_i) / Σ_{j=1..n} f(p_j)
where TF_i denotes the word frequency weight of W_i, p_j denotes a word of the text, and f is the word-frequency counting function.
In Chinese text the words able to mark the characteristics of a text are generally notional words such as nouns, verbs and adjectives, whereas function words such as interjections, prepositions and conjunctions contribute almost nothing to determining the text category and would introduce considerable noise if extracted as feature words. The part-of-speech weight pos_i of word W_i in the text is therefore defined by a table that assigns higher values to notional parts of speech and lower values to function words (the table is given only as an image in the original publication).
Longer words convey more concrete information, whereas the meaning expressed by shorter words is usually more abstract. In particular, the feature words of a document are mostly combinations of specialized academic terms; the longer they are, the more definite their meaning and the better they reflect the topic of the text. Increasing the weight of long words therefore helps to reflect more accurately the importance of a word in the document. The word-length weight len_i of word W_i in the text is defined as a function increasing with the length of the word (the definition is given only as an image in the original publication).
For each word in the word sequence, its statistical feature is
stats_i = A·TF_i + B·pos_i + C·len_i
where A + B + C = 1.
The key degree of word W_i is computed as follows:
For each node of the weighted term network, its key degree Imp_i is defined as
Imp_i = β·stats_i + (1 - β)·Z_i
where 0 < β < 1.
The computed key degrees are sorted in descending order, a threshold γ (0 < γ < 1) is set, and the first q values are taken; the corresponding words are used as the feature words of the science and technology project, since they fully reflect its topic and are the important words.
Step 4. Review-expert feature word extraction: the amount of expert information is small compared with project information, so the project technique, which builds a network and extracts feature words from statistical and semantic features, is not suitable for expert information. Stop-word filtering is performed directly with the general stop-word dictionary and the professional stop-word dictionary, and the feature word set of each expert is extracted. The general stop-word dictionary is again the Harbin Institute of Technology stop-word list, and the professional stop-word dictionary has to be maintained continuously.
Step 5. Per-field knowledge representation models of science and technology projects and review experts are built. By extending the vector space model and the matter-element knowledge set model, a text representation model PRO = (id, F, WF, T, V) is established according to the different field information of a project, where id denotes the identification field in the project library, F denotes the set of field categories of the project, WF is the set of field weights, T is the set of feature words, and V denotes the feature words of each field with their weights, V_i = {v_i1, f(v_i1), v_i2, f(v_i2), ..., v_in, f(v_in)}, where v_ij denotes the j-th feature word of the i-th field and f(v_ij) denotes the frequency of v_ij. Similarly, a knowledge representation model TM = (id, F, WF, T, V) is established according to the different field information of an expert, where id denotes the identification field in the expert library, F denotes the set of field categories of the expert, WF is the set of field weights, T is the set of feature words, and V denotes the feature words of each field with their weights, f(v_ij) being the frequency of occurrence of feature word v_ij in the corresponding field. (The matrix-style representations of the project and expert knowledge are given only as images in the original publication.)
Step 5. Construction of the review-expert information index library: after the expert knowledge representation models have been built, the information indexes are stored: first, the content information of a review expert is read from the expert database; a word semantic network is built on the segmentation result and the feature words of the expert are extracted; an index is built for the expert with Apache Lucene according to the knowledge representation model; the index is added to the index library of the corresponding category, and this is repeated until all review experts have been indexed (see Fig. 5).
Step 6. According to the number of projects, the recommendation mode is divided into expert recommendation for a single project under review and expert recommendation for a group of (several) projects under review. For group recommendation, the feature merging operations between fields and between projects are applied to the project knowledge representation models of step 5; for a single project, only the merging between fields is applied. At the same time, the merging between fields is applied to the expert knowledge representation models of step 5. Indexes are then built on the merged feature information with Apache Lucene according to the knowledge representation model. The project index is built at recommendation time.
In a project application management system, projects under review frequently need group recommendation. The feature merging operations defined below preserve the different field weights set in the knowledge representation models of step 5, so that the fields keep their different contributions to the similarity computation and hence to the recommendation.
The feature merging of projects under review and of review experts, denoted by the operator ⊕, is carried out as follows:
(1) Merging the features between the fields of one project under review or one review expert
Suppose the field feature word sets W'1 and W'2 are to be merged. The merge rule W'1 ⊕ W'2 is defined as
W'1 ⊕ W'2 = { ∀i, j: { word_1i, ( f(word_1i) + f(word_2j) ) / 2 } | word_1i = word_2j }
where word_1i and word_2j are feature words.
This definition is improved and extended by adding the field weights; the features between fields of a review expert or of a science and technology project are merged according to the rule
W'1 ⊕ W'2 = { ∀i, j: { word_1i, ( w1·f(word_1i) + w2·f(word_2j) ) / ( w1² + w2² ) } | word_1i = word_2j }
(2) Merging the features between the projects of a group of projects under review
This merging operation is applied only to the feature vectors of the projects under review, not to the expert feature vectors; an expert feature vector only undergoes the merging between fields. Let V(d1) and V(d2) be the vector models of two projects after their fields have been merged. For any t_1i ∈ V(d1) and t_2j ∈ V(d2) with t_1i identical to t_2j, the merge V(d1) ⊕ V(d2) is defined as
V(d1) ⊕ V(d2) = { ⟨ t_k, w_k(p) = ( w_i(d1) + w_j(d2) ) / 2 ⟩ }
where k = 1, ..., n, t_k is a feature term and w_k(p) is the weight of t_k.
The knowledge representation model of a group of science and technology projects is produced as follows:
a) The fields of each project are merged, giving the vector model V(d) of each project;
b) The merge strategy above is applied to the set of all project vector models, yielding the vector-space knowledge representation model of the group:
V(p) = { ⟨t_1, w_1(p)⟩, ⟨t_2, w_2(p)⟩, ..., ⟨t_n, w_n(p)⟩ }
where k = 1, ..., n, t_k is a feature word term of the project group and w_k(p) is the weight of t_k.
Step 7. After the merging between fields of step 6 has been applied to the knowledge representation models of the review experts and of the science and technology projects, suppose the information vector of a review expert is P = {s_1, f(s_1), s_2, f(s_2), ..., s_n, f(s_n)} and the information vector of a project (group) is Q = {t_1, f(t_1), t_2, f(t_2), ..., t_n, f(t_n)}. The semantic similarity between the vector of the project (group) under review and the vector of the review expert is computed with a maximum matching algorithm.
The semantic similarity between the vector of a project (group) under review and the vector of a review expert is computed with a bipartite-graph maximum matching algorithm as follows:
Computing the semantic similarity with the maximum matching algorithm means obtaining the similarity of the two texts through a maximum matching on a bipartite graph. As shown in Fig. 2, each feature word of the project (group) vector is taken as a vertex of part X and each feature word of the expert vector as a vertex of part Y, so that the computation is equivalent to finding a maximum-weight matching of a complete bipartite graph; the thick lines in Fig. 2 are the edges of maximum semantic similarity between a feature word of part X and a feature word of part Y.
The semantic similarity itself is obtained from the HowNet-based similarity computation. The invention computes the semantic similarity between a project (group) under review and a review expert through the HowNet semantic dictionary and the maximum matching algorithm:
SimSEM(P, Q) = ( Σ_{k=1..p} f(s_i)·f(t_j)·SimSEM(s_i, t_j) ) / min(m, n)
where s_i and t_j are the two word nodes of an edge of maximum semantic similarity SimSEM(s_i, t_j) (a thick line in Fig. 2), m and n are respectively the number of feature words in the project vector representation and in the expert vector representation, and p is the number of edges of maximum semantic similarity (thick lines in Fig. 2).
The semantic similarity between a project (group) under review and the expert information defined above involves factors such as language, word semantics and word structure and represents the degree of match between the two; a large similarity indicates a high degree of match, i.e. the review expert is suitable for reviewing the project (group).
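A sketch of the bipartite maximum-matching similarity follows; scipy's linear_sum_assignment is used as the maximum-weight matching solver, which is a substitution for whatever matching routine the patented implementation uses, and word_sim stands in for the HowNet-based SimSEM between feature words.
```python
# Sketch of SimSEM(P, Q): the feature words of the expert vector P and of the
# project (group) vector Q form the two parts of a bipartite graph, a
# maximum-weight matching is computed, and the matched similarities are
# combined as in the formula above.
import numpy as np
from scipy.optimize import linear_sum_assignment

def expert_project_similarity(p_words, p_freq, q_words, q_freq, word_sim):
    m, n = len(p_words), len(q_words)
    S = np.array([[word_sim(a, b) for b in q_words] for a in p_words])
    rows, cols = linear_sum_assignment(S, maximize=True)    # maximum-weight matching
    score = sum(p_freq[i] * q_freq[j] * S[i, j] for i, j in zip(rows, cols))
    return score / min(m, n)
```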
Step 8. A similarity truncation threshold is set, a recommendation index is generated according to the magnitude of the similarity, and the final list of recommended review experts is produced.
The above is only the preferred embodiment of the invention. It should be understood that, in the field of intelligent machine recommendation of review experts for science and technology projects, improvements and variations can also be made without departing from the technical principle of the invention, and such improvements and variations shall also be regarded as falling within the protection scope of the invention.

Claims (3)

1. An intelligent recommendation method of review experts for science and technology projects, characterized in that the method comprises the following steps:
Step 1. The general terms and habitual words appearing in project and expert information are taken as a professional stop-word dictionary; punctuation marks and non-Chinese characters are taken as the cutting-mark library;
Step 2. The project information and the expert information are segmented: according to the cutting marks, the project name, main research contents and technical indicators of the project information are cut into substring sequences; likewise, the expert's undertaken projects and achievements, awards, inventions, published papers and research directions are extracted and cut into substring sequences, one substring sequence corresponding to one field; the substring sequences are segmented into words with the Chinese Academy of Sciences ICTCLAS tool;
Step 3. Feature word extraction for science and technology projects: stop-word filtering is applied to the segmentation result with a general stop-word dictionary and the professional stop-word dictionary; said general stop-word dictionary adopts the Harbin Institute of Technology stop-word list, and the segmentation result with stop words removed forms a word set;
The construction of the professional stop-word dictionary is a continuously improving self-learning process: word frequencies are accumulated during segmentation, and a word whose probability of occurrence in the texts exceeds a certain threshold is added to the stop-word dictionary;
Because the amount of project information is large, the semantic similarity between the words of the word set is computed, a term network is built from the semantic relations and co-occurrence relations of the words, and the aggregation feature value of each word in the network is computed; combined with the statistical features of the words, the key degree of each word is then computed and the project feature words are extracted; the feature words of a project thus combine the statistical feature information and the semantic feature information of the text, which makes the extraction more accurate;
Step 4. Review-expert feature word extraction: stop-word filtering is performed with the general stop-word dictionary and the professional stop-word dictionary, and the feature word set of each expert is extracted;
Step 5. Per-field knowledge representation models of science and technology projects and review experts are built: by extending the vector space model and the matter-element knowledge set model, a text representation model PRO = (id, F, WF, T, V) is established according to the different field information of a project, where id denotes the identification field in the project library, F denotes the set of field categories of the project, WF is the set of field weights, T is the set of feature words, and V denotes the feature words of each field with their weights, V_i = {v_i1, f(v_i1), v_i2, f(v_i2), ..., v_in, f(v_in)}, where v_ij denotes the j-th feature word of the i-th field and f(v_ij) denotes the frequency of v_ij; similarly, a knowledge representation model TM = (id, F, WF, T, V) is established according to the different field information of an expert, where id denotes the identification field in the expert library, F denotes the set of field categories of the expert, WF is the set of field weights, T is the set of feature words, and V denotes the feature words of each field with their weights, f(v_ij) being the frequency of occurrence of feature word v_ij in the corresponding field (the matrix-style representations of the project and expert knowledge are given only as images in the original publication);
Step 5. evaluation expert information index storehouse builds: after evaluating Expert Knowledge Expression model construction and completing, information index is put in storage: the content item information that first reads an evaluation expert from experts database; Based on word segmentation result, set up phrase semantic network and extract the Feature Words that evaluation expert comprises; According to Knowledge Representation Model and utilize Apache Lucene to set up index to it; The index establishing is added in corresponding index database by affiliated classification, until all evaluation expert's index warehouse-in;
Step 6: according to the number of project, the way of recommendation is divided into single pending trial project recommendation expert and grouping pending trial project recommendation expert; Grouping recommends expert to represent that to the pending trial project knowledge of step 5 model does the feature union operation between corresponding interfield and project, and single pending trial expert recommends only to do corresponding interfield feature union operation; Meanwhile, the evaluation expert's of step 5 Knowledge Representation Model is carried out to the merging of interfield feature; According to Knowledge Representation Model and utilize the characteristic information after Apache Lucene is combined to set up index; Wherein, science and technology item index construct carries out when carrying out project recommendation;
In science and technology item declaration management system, pending trial project needs grouping to recommend often, above-mentioned feature union operation not only guarantee not can removal process 5 in Knowledge Representation Model different field weight is set similarity is calculated and produced the contribution difference of recommending;
Step 7. merges through the interfield feature of the evaluation expert of step 6 and the Knowledge Representation Model of science and technology item, if suppose, evaluation expert's information vector is expressed as P={s 1, f (s 1), s 2, f (s 2) ..., s n, f (s n), science and technology item information vector is expressed as Q={t 1, f (t 1), t 2, f (t 2) ..., t n, f (t n), the semantic similarity based on maximum matching algorithm calculating pending trial science and technology item vector with evaluation expert;
Step 8. A similarity truncation threshold is set, a recommendation index is generated according to the similarity values, and the final list of recommended review experts is produced.
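As a rough illustration of steps 7 and 8 only, the following sketch computes a greedy maximum-matching similarity between an expert vector and a project vector and applies the truncation threshold; the greedy matching and the frequency weighting are assumptions, since the claim does not spell out the maximum matching algorithm:

```java
import java.util.Map;

public class SimilarityRanker {
    // Greedy sketch of a maximum-matching semantic similarity between an expert vector P
    // and a project vector Q, both given as feature word -> frequency maps.
    // wordSim is assumed to return the HowNet-style word similarity SimSEM(w1, w2) in [0, 1].
    public static double similarity(Map<String, Double> p, Map<String, Double> q,
                                    java.util.function.BiFunction<String, String, Double> wordSim) {
        double num = 0.0, den = 0.0;
        for (Map.Entry<String, Double> ep : p.entrySet()) {
            double best = 0.0;
            for (String tq : q.keySet()) {              // take the best-matching project word
                best = Math.max(best, wordSim.apply(ep.getKey(), tq));
            }
            num += ep.getValue() * best;                // weight by the expert word's frequency
            den += ep.getValue();
        }
        return den == 0.0 ? 0.0 : num / den;
    }

    // Threshold truncation: an expert enters the recommendation list only if similarity >= gamma.
    public static boolean recommend(double sim, double gamma) {
        return sim >= gamma;
    }
}
```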
2. The intelligent review expert recommendation method for science and technology projects according to claim 1, characterized in that the semantic similarity computation described in step 3 is as follows:
In the HowNet semantic dictionary, for two words W_1 and W_2, suppose W_1 has n concepts S_11, S_12, ..., S_1n and W_2 has m concepts S_21, S_22, ..., S_2m; the similarity SimSEM(W_1, W_2) of words W_1 and W_2 equals the maximum of the similarities of their concepts:
SimSEM(W_1, W_2) = max_{i=1..n, j=1..m} Sim(S_1i, S_2j)
Content words and function words have different description languages, so the similarity must be computed between their corresponding sememes or relational sememes; a content-word concept is described by its first basic sememe, its other basic sememes, its relational sememe descriptions and its relational symbol descriptions, whose similarities are denoted Sim1(p_1, p_2), Sim2(p_1, p_2), Sim3(p_1, p_2) and Sim4(p_1, p_2) respectively; the similarity of two feature structures ultimately reduces to the similarity of basic sememes or of concrete words;
Sim(S_1, S_2) = Σ_{i=1}^{4} β_i · Sim_i(S_1, S_2)
β_i (1 ≤ i ≤ 4) are adjustable parameters with β_1 + β_2 + β_3 + β_4 = 1 and β_1 ≥ β_2 ≥ β_3 ≥ β_4;
If CW = {C1, C2, ..., Cm} is the word set obtained after processing, its corresponding semantic similarity adjacency matrix S_m is defined as:
S_m = ( Sim(C_i, C_j) )_{m×m}, 1 ≤ i, j ≤ m
wherein Sim(C_i, C_j) is the semantic similarity between word C_i and word C_j, Sim(C_i, C_i) = 1, and Sim(C_i, C_j) = Sim(C_j, C_i);
For the word set CW = {C1, C2, ..., Cm}, the semantic similarity computation yields m·(m+1)/2 pairwise similarity values between words;
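A compact sketch of building the semantic similarity adjacency matrix S_m; the HowNet-based word similarity function is assumed to be supplied:

```java
public class SimilarityMatrix {
    // Build the m x m semantic similarity adjacency matrix S_m for the word set CW.
    // wordSim(w1, w2) is assumed to implement SimSEM over the HowNet sememe hierarchy.
    public static double[][] build(String[] cw,
                                   java.util.function.BiFunction<String, String, Double> wordSim) {
        int m = cw.length;
        double[][] s = new double[m][m];
        for (int i = 0; i < m; i++) {
            s[i][i] = 1.0;                        // Sim(Ci, Ci) = 1
            for (int j = i + 1; j < m; j++) {     // only the distinct pairs need to be evaluated
                double v = wordSim.apply(cw[i], cw[j]);
                s[i][j] = v;
                s[j][i] = v;                      // symmetry: Sim(Ci, Cj) = Sim(Cj, Ci)
            }
        }
        return s;
    }
}
```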
The co-occurrence relation of words is computed as follows:
The word co-occurrence model is one of the important models in statistics-based natural language processing; according to this model, if two words frequently co-occur within the same window unit of a document, the two words are related in meaning and express, to some extent, the semantic information of the text; a sliding window is used to compute the word co-occurrence degree over the term sequence:
First, word extraction is performed on the term sequence: spaces and null values are removed and identical words are merged, yielding the word set CW = {C1, C2, ..., Cm}, where m ≤ n;
The word co-occurrence degree matrix C_m corresponding to the word set CW is defined as:
C_m = ( Coo(C_i, C_j) )_{m×m}, 1 ≤ i, j ≤ m
Initially, every element of C_m is Coo(C_i, C_j) = 0 (1 ≤ i, j ≤ m);
The word co-occurrence degree over the term sequence is computed with a sliding window; the words in the sliding window are T_{i-1} T_i T_{i+1} (1 < i < n):
1) If i = n-1, go to 4); if T_{i-1} is a space or null, slide the window to the next word (i++); otherwise go to 2);
2) If T_i is Chinese, Coo(T_{i-1}, T_i)++ and go to 3); if T_i is null, go to 3); otherwise go to 1);
3) If T_{i+1} is Chinese, Coo(T_{i-1}, T_{i+1})++, i++, and go to 1); otherwise go to 1);
4) If T_{n-2} is Chinese, go to 5); otherwise go to 7);
5) If T_{n-1} is Chinese, Coo(T_{n-2}, T_{n-1})++ and go to 6); if T_{n-1} is a space, go to 6); otherwise end;
6) If T_n is Chinese, Coo(T_{n-2}, T_n)++ and end; otherwise end;
7) If T_{n-1} is Chinese and T_n is also Chinese, Coo(T_{n-1}, T_n)++ and end; otherwise end;
Through the above steps, the word co-occurrence degree matrix C_m is obtained, and each element of C_m is normalized, i.e., each element is divided by the maximum value over all elements of the matrix, max{Coo(C_i, C_j) | 1 ≤ i, j ≤ m};
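A simplified sketch of the sliding-window co-occurrence count followed by the maximum-value normalisation; it treats the window simply as the pairs (T_{i-1}, T_i) and (T_{i-1}, T_{i+1}) over Chinese words and omits the end-of-sequence cases 4)-7):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CooccurrenceMatrix {
    private static boolean isChinese(String w) {
        return w != null && !w.isEmpty() && w.codePoints().allMatch(
                cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN);
    }

    // Simplified sliding-window count over the term sequence, aligned with the word set CW,
    // then normalised by the maximum element of the matrix.
    public static double[][] build(List<String> terms, List<String> cw) {
        Map<String, Integer> idx = new HashMap<>();
        for (int k = 0; k < cw.size(); k++) idx.put(cw.get(k), k);
        double[][] coo = new double[cw.size()][cw.size()];
        for (int i = 1; i + 1 < terms.size(); i++) {
            String a = terms.get(i - 1);
            if (!isChinese(a) || !idx.containsKey(a)) continue;   // skip spaces, nulls, unknown anchors
            bump(coo, idx, a, terms.get(i));                      // Coo(T_{i-1}, T_i)++
            bump(coo, idx, a, terms.get(i + 1));                  // Coo(T_{i-1}, T_{i+1})++
        }
        double max = 0.0;
        for (double[] row : coo) for (double v : row) max = Math.max(max, v);
        if (max > 0) for (double[] row : coo) for (int j = 0; j < row.length; j++) row[j] /= max;
        return coo;
    }

    private static void bump(double[][] coo, Map<String, Integer> idx, String a, String b) {
        if (isChinese(b) && idx.containsKey(b)) coo[idx.get(a)][idx.get(b)] += 1.0;
    }
}
```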
The term network is constructed as follows:
When building the weighted term network, the weight matrix of the term network is obtained first; the weight matrix W_m is defined as:
(formula image FDA0000401292360000051: each element w_ij of W_m combines the co-occurrence degree Coo(C_i, C_j), weighted by α, with the semantic similarity Sim(C_i, C_j), weighted by β)
wherein α is 0.3 and β is 0.7, which strengthens the semantic relation between words and weakens the co-occurrence relation between words;
W_m serves as the adjacency matrix of the input term network; its corresponding network graph is defined as G = {V, E}, where G is an undirected weighted graph, V denotes the vertex set of G, E denotes the edge set of G, and v_i denotes the i-th vertex (word) in V;
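Assuming the element-wise combination indicated above (α weighting the co-occurrence degree, β weighting the semantic similarity), the weight matrix W_m of the term graph could be formed as follows; `coo` and `sim` are the two m×m matrices built earlier:

```java
public class TermNetwork {
    // Combine the normalised co-occurrence matrix and the semantic similarity matrix into
    // the weight matrix Wm of the undirected weighted term graph G = {V, E}.
    public static double[][] weightMatrix(double[][] coo, double[][] sim,
                                          double alpha, double beta) {   // e.g. alpha = 0.3, beta = 0.7
        int m = sim.length;
        double[][] w = new double[m][m];
        for (int i = 0; i < m; i++)
            for (int j = 0; j < m; j++)
                w[i][j] = alpha * coo[i][j] + beta * sim[i][j];
        return w;
    }
}
```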
The computation of the word aggregation characteristic value is as follows:
The key characteristics of a term network include its degree distribution, average shortest path, clustering degree and clustering coefficient; the degree of a node reflects how that node is associated with other nodes; the clustering degree and clustering coefficient of a node reflect how densely the nodes in its local neighbourhood are interconnected; the degree and clustering coefficient of a node reflect the importance of that node in its local range; measuring the aggregation characteristic value of a node by its weighted degree, clustering coefficient and node betweenness allows important words to receive higher weights while ensuring that words related to many important words also score highly;
In the semantic similarity network graph, the unordered pair (v_i, v_j) denotes the edge between nodes v_i and v_j; the weighted degree of node v_i is defined as:
WD_i = ( Σ_{j=1}^{n} w_ij ) / n
wherein w_ij is the weight of the edge between nodes v_i and v_j, and n is the total number of nodes;
In the semantic similarity network graph, the unordered pair (v_i, v_j) denotes the edge between nodes v_i and v_j; the unweighted degree D_i of node v_i is D_i = |{(v_i, v_j): (v_i, v_j) ∈ E, v_i, v_j ∈ V}|; the clustering degree T_i of node v_i is the number of edges actually existing between its neighbour nodes: T_i = |{(v_j, v_k): (v_i, v_j) ∈ E, (v_i, v_k) ∈ E, (v_j, v_k) ∈ E, v_j, v_k ∈ V}|; the clustering coefficient C_i of node v_i is defined as:
C_i = T_i / C(D_i, 2) = 2·T_i / ( D_i·(D_i − 1) )
In the semantic similarity network graph, the node betweenness Betweenness of node v_i is the probability that a shortest path between two nodes w and x passes through node v_i; the interaction between two non-adjacent nodes depends on the nodes on the shortest paths connecting them, and these nodes potentially control the information flow between the two nodes; B_i reflects the connecting role of node v_i in its local environment; the node betweenness Betweenness is defined as:
B_i = Σ_{w, x ∈ G} d(w, x; v_i) / d(w, x)
d(w, x) denotes the number of shortest paths between any two nodes w and x, and d(w, x; v_i) denotes the number of shortest paths between w and x that pass through v_i;
The aggregation characteristic value of node v_i is measured comprehensively by weighting its weighted degree, clustering coefficient and betweenness Betweenness; the aggregation characteristic value Z_i of node v_i is defined as:
Z_i = a·WD_i + b·C_i / ( Σ_{j=1}^{n} C_j ) + c·B_i
Wherein, a+b+c=1;
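A sketch of the aggregation characteristic value Z_i; the betweenness values are assumed to be pre-computed (for example with Brandes' algorithm), and an edge is assumed to exist wherever the weight exceeds a small threshold eps, since the weight matrix itself is dense:

```java
public class AggregationFeature {
    // Aggregation characteristic Z_i = a*WD_i + b*C_i/sum(C_j) + c*B_i for every node.
    // w is the weight matrix of the term graph; an edge (i, j) is assumed where w[i][j] > eps;
    // betweenness[] is assumed to be supplied by a separate shortest-path computation.
    public static double[] compute(double[][] w, double[] betweenness,
                                   double a, double b, double c, double eps) {
        int n = w.length;
        double[] wd = new double[n], cc = new double[n], z = new double[n];
        double ccSum = 0.0;
        for (int i = 0; i < n; i++) {
            // weighted degree WD_i = sum_j w_ij / n
            for (int j = 0; j < n; j++) if (j != i) wd[i] += w[i][j];
            wd[i] /= n;
            // clustering coefficient C_i = 2*T_i / (D_i*(D_i-1))
            java.util.List<Integer> nb = new java.util.ArrayList<>();
            for (int j = 0; j < n; j++) if (j != i && w[i][j] > eps) nb.add(j);
            int d = nb.size(), t = 0;
            for (int x = 0; x < d; x++)
                for (int y = x + 1; y < d; y++)
                    if (w[nb.get(x)][nb.get(y)] > eps) t++;
            cc[i] = d > 1 ? 2.0 * t / (d * (d - 1.0)) : 0.0;
            ccSum += cc[i];
        }
        for (int i = 0; i < n; i++)
            z[i] = a * wd[i] + b * (ccSum > 0 ? cc[i] / ccSum : 0.0) + c * betweenness[i];
        return z;
    }
}
```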
The computation of the statistical characteristic value of a word is as follows:
A nonlinear function is adopted to normalise the word frequency; the word-frequency weight TF_i of word W_i in the text is defined as:
TF_i = f(W_i) / Σ_{j=1}^{n} f(p_j)
wherein TF_i denotes the word-frequency weight of word W_i, p_j denotes a word in the text, and f is the word-frequency counting function;
The part-of-speech weight pos_i of word W_i in the text is defined as:
(formula image FDA0000401292360000071: definition of the part-of-speech weight pos_i)
The longer a word is, the more concrete the information it reflects; conversely, shorter words usually express more abstract meanings; in particular, the feature words of a document are mostly combinations of specialized academic vocabulary: the longer such a word is, the clearer its meaning and the better it reflects the text topic; increasing the weight of long words helps segment the vocabulary and thus reflects more accurately the importance of a word in the document;
The word-length weight len_i of word W_i in the text is defined as:
(formula image FDA0000401292360000072: definition of the word-length weight len_i)
For each word in the term sequence, its statistical characteristic value is
stats_i = A·TF_i + B·pos_i + C·len_i
Wherein, A+B+C=1;
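A sketch of the statistical characteristic value stats_i; because the part-of-speech and word-length weights are defined only in the formula images above, they are abstracted here as supplied functions:

```java
import java.util.Map;
import java.util.function.ToDoubleFunction;

public class StatisticalFeature {
    // stats_i = A*TF_i + B*pos_i + C*len_i with A + B + C = 1.
    // posWeight and lenWeight stand in for the part-of-speech and word-length weights,
    // whose exact definitions appear only as formula images in the claim.
    public static double stats(String word, Map<String, Integer> freq, int totalFreq,
                               ToDoubleFunction<String> posWeight,
                               ToDoubleFunction<String> lenWeight,
                               double A, double B, double C) {
        double tf = totalFreq > 0 ? freq.getOrDefault(word, 0) / (double) totalFreq : 0.0;
        return A * tf + B * posWeight.applyAsDouble(word) + C * lenWeight.applyAsDouble(word);
    }
}
```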
The computation of the key degree of word W_i is as follows:
For each node in the weighted term network, its key degree value Imp_i is defined as:
Imp_i = β·stats_i + (1 − β)·Z_i
Wherein, 0 < β < 1;
Through this computation, the key degree values are obtained and sorted in descending order; a threshold γ (0 < γ < 1) is set and the top q values are taken out; the corresponding words serve as the feature words of the science and technology project; these words fully reflect the topic and are the important words.
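A sketch of the key degree computation and the selection of the top q feature words:

```java
import java.util.*;
import java.util.stream.Collectors;

public class KeyDegree {
    // Imp_i = beta*stats_i + (1-beta)*Z_i; sort descending and keep the top q words as feature words.
    public static List<String> selectFeatureWords(List<String> words, double[] stats, double[] z,
                                                  double beta, int q) {
        Map<String, Double> imp = new HashMap<>();
        for (int i = 0; i < words.size(); i++)
            imp.put(words.get(i), beta * stats[i] + (1 - beta) * z[i]);
        return imp.entrySet().stream()
                  .sorted((a, b) -> Double.compare(b.getValue(), a.getValue()))  // descending key degree
                  .limit(q)
                  .map(Map.Entry::getKey)
                  .collect(Collectors.toList());
    }
}
```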
3. The intelligent review expert recommendation method for science and technology projects according to claim 1, characterized in that the feature merging described in step 6 is performed through the logical ⊕ (merge) operation as follows:
(1) Inter-field feature merging for a pending project or a review expert
Suppose the field feature word sets W′_1 and W′_2 are to be merged; the merge rule W′_1 ⊕ W′_2 is defined as:
W′_1 ⊕ W′_2 = { ∀ i, j, { word_1i, ( f(word_1i) + f(word_2j) ) / 2 } | word_1i = word_2j }
wherein word_1i and word_2j are feature words;
Adding the field weights improves and extends the above definition; the inter-field features of review experts and science and technology projects are merged with the rule:
W′_1 ⊕ W′_2 = { ∀ i, j, { word_1i, ( w_1·f(word_1i) + w_2·f(word_2j) ) / ( w_1^2 + w_2^2 ) } | word_1i = word_2j }
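A sketch of the weighted inter-field merge rule as literally written above, i.e. only words present in both field sets are kept and their frequencies are combined with the field weights w_1 and w_2:

```java
import java.util.HashMap;
import java.util.Map;

public class FieldMerge {
    // Merge two field feature-word sets: a word is retained when it occurs in both sets
    // (word_1i = word_2j), and its frequencies are combined with the field weights.
    public static Map<String, Double> merge(Map<String, Double> f1, double w1,
                                            Map<String, Double> f2, double w2) {
        Map<String, Double> merged = new HashMap<>();
        for (Map.Entry<String, Double> e : f1.entrySet()) {
            Double other = f2.get(e.getKey());
            if (other != null) {
                merged.put(e.getKey(),
                           (w1 * e.getValue() + w2 * other) / (w1 * w1 + w2 * w2));
            }
        }
        return merged;
    }
}
```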
(2) Inter-project feature merging for a group of pending projects
This merging operation applies only to the feature vectors of the pending science and technology projects, not to the review experts' feature vectors; an expert's feature vector only needs the inter-field feature merge; let V(d_1) and V(d_2) be the vector models of two science and technology projects after inter-field feature merging; for any t_1i ∈ V(d_1) and t_2j ∈ V(d_2), if t_1i and t_2j are identical they are merged;
V(d_1) ⊕ V(d_2) is defined as:
V(d_1) ⊕ V(d_2) = { <t_k, w_k(p) = ( w_i(d_1) + w_j(d_2) ) / 2 > }
wherein k = 1, ..., n, t_k is a feature entry, and w_k(p) is the weight of t_k;
The basic process of producing the knowledge representation model is as follows:
a). Merge the inter-field features of each science and technology project to obtain the vector model V(d) of each project;
b). Apply the merging strategy ⊕ to the set of all science and technology project vector models; by the above method, the vector-space-based knowledge representation model of the science and technology projects is established:
V(p) = { <t_1, w_1(p)>, <t_2, w_2(p)>, ..., <t_n, w_n(p)> }
wherein k = 1, ..., n, t_k is a feature word entry of the project group, and w_k(p) is the weight of t_k.
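A sketch of the inter-project merge and the group model construction; it keeps every feature entry and averages the weights of entries that appear in both projects, which is one plausible reading of the rule above rather than the claimed implementation:

```java
import java.util.HashMap;
import java.util.Map;

public class ProjectGroupMerge {
    // Merge the vector models of two projects after their own inter-field merges:
    // identical feature entries t_k receive the averaged weight (w_i(d1) + w_j(d2)) / 2,
    // entries unique to one project keep their original weight.
    public static Map<String, Double> merge(Map<String, Double> v1, Map<String, Double> v2) {
        Map<String, Double> vp = new HashMap<>(v1);
        for (Map.Entry<String, Double> e : v2.entrySet())
            vp.merge(e.getKey(), e.getValue(), (a, b) -> (a + b) / 2.0);
        return vp;
    }

    // Folding this pairwise merge over all project vector models yields the group model V(p).
    public static Map<String, Double> groupModel(Iterable<Map<String, Double>> projects) {
        Map<String, Double> vp = new HashMap<>();
        for (Map<String, Double> v : projects) vp = merge(vp, v);
        return vp;
    }
}
```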
CN201310509358.2A 2013-10-24 2013-10-24 Intelligent review expert recommending method for science and technology projects Active CN103631859B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310509358.2A CN103631859B (en) 2013-10-24 2013-10-24 Intelligent review expert recommending method for science and technology projects

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310509358.2A CN103631859B (en) 2013-10-24 2013-10-24 Intelligent review expert recommending method for science and technology projects

Publications (2)

Publication Number Publication Date
CN103631859A true CN103631859A (en) 2014-03-12
CN103631859B CN103631859B (en) 2017-01-11

Family

ID=50212901

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310509358.2A Active CN103631859B (en) 2013-10-24 2013-10-24 Intelligent review expert recommending method for science and technology projects

Country Status (1)

Country Link
CN (1) CN103631859B (en)

Cited By (43)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104361102A (en) * 2014-11-24 2015-02-18 清华大学 Expert recommendation method and system based on group matching
CN105786960A (en) * 2015-01-14 2016-07-20 通用电气公司 Method, System, And User Interface For Expert Search Based On Case Resolution Logs
CN105912581A (en) * 2016-03-31 2016-08-31 比美特医护在线(北京)科技有限公司 Information processing method and device
CN103823896B (en) * 2014-03-13 2017-02-15 蚌埠医学院 Subject characteristic value algorithm and subject characteristic value algorithm-based project evaluation expert recommendation algorithm
CN107194672A (en) * 2016-11-09 2017-09-22 北京理工大学 It is a kind of to merge academic speciality and the evaluation distribution method of community network
CN107229738A (en) * 2017-06-18 2017-10-03 杭州电子科技大学 A kind of scientific paper search ordering method based on document scores model and the degree of correlation
CN107609006A (en) * 2017-07-24 2018-01-19 华中师范大学 A kind of chess game optimization method based on local chronicle research
CN107656920A (en) * 2017-09-14 2018-02-02 杭州电子科技大学 A kind of skilled personnel based on patent recommend method
CN107784087A (en) * 2017-10-09 2018-03-09 东软集团股份有限公司 A kind of hot word determines method, apparatus and equipment
CN107807978A (en) * 2017-10-26 2018-03-16 北京航空航天大学 A kind of code review person based on collaborative filtering recommends method
CN108229684A (en) * 2018-01-26 2018-06-29 中国科学技术信息研究所 Build the method, apparatus and terminal device of expertise vector model
CN108399491A (en) * 2018-02-02 2018-08-14 浙江工业大学 A kind of employee's diversity ranking method based on network
CN108427667A (en) * 2017-02-15 2018-08-21 北京国双科技有限公司 A kind of segmentation method and device of legal documents
CN108549730A (en) * 2018-06-01 2018-09-18 云南电网有限责任公司电力科学研究院 A kind of search method and device of expert info
CN108804633A (en) * 2018-06-01 2018-11-13 腾讯科技(深圳)有限公司 The content recommendation method of Behavior-based control Semantic knowledge network
CN108846056A (en) * 2018-06-01 2018-11-20 云南电网有限责任公司电力科学研究院 A kind of scientific and technological achievement evaluation expert recommended method and device
CN108873706A (en) * 2018-07-30 2018-11-23 中国石油化工股份有限公司 Evaluation of trap intelligent expert recommended method based on deep neural network
CN108920556A (en) * 2018-06-20 2018-11-30 华东师范大学 Recommendation expert method based on subject knowledge map
CN109308315A (en) * 2018-10-19 2019-02-05 南京理工大学 A kind of collaborative recommendation method based on specialist field similarity and incidence relation
CN109857872A (en) * 2019-02-18 2019-06-07 浪潮软件集团有限公司 The information recommendation method and device of knowledge based map
CN109992642A (en) * 2019-03-29 2019-07-09 华南理工大学 A kind of automatic method of selecting of single task expert and system based on scientific and technological entry
CN110046225A (en) * 2019-04-16 2019-07-23 广东省科技基础条件平台中心 A kind of science and technology item material integrity evaluating decision model training method
CN110443574A (en) * 2019-07-25 2019-11-12 昆明理工大学 Entry convolutional neural networks evaluation expert's recommended method
CN110442618A (en) * 2019-07-25 2019-11-12 昆明理工大学 Merge convolutional neural networks evaluation expert's recommended method of expert info incidence relation
CN111143690A (en) * 2019-12-31 2020-05-12 中国电子科技集团公司信息科学研究院 Expert recommendation method and system based on associated expert database
CN111598526A (en) * 2020-04-21 2020-08-28 奇计(江苏)科技服务有限公司 Intelligent comparison and review method for describing scientific and technological innovation content
CN111666420A (en) * 2020-05-29 2020-09-15 华东师范大学 Method for intensively extracting experts based on subject knowledge graph
CN111782797A (en) * 2020-07-13 2020-10-16 贵州省科技信息中心 Automatic matching method for scientific and technological project review experts and storage medium
CN111951141A (en) * 2020-07-09 2020-11-17 广东港鑫科技有限公司 Double-random supervision method and system based on big data intelligent analysis and terminal equipment
CN112100370A (en) * 2020-08-10 2020-12-18 淮阴工学院 Picture examination expert combined recommendation method based on text convolution and similarity algorithm
CN112182327A (en) * 2019-07-05 2021-01-05 北京猎户星空科技有限公司 Data processing method, device, equipment and medium
CN112287679A (en) * 2020-10-16 2021-01-29 国网江西省电力有限公司电力科学研究院 Structured extraction method and system for text information in scientific and technological project review
CN112381381A (en) * 2020-11-12 2021-02-19 深圳供电局有限公司 Expert's device is recommended to intelligence
CN112417870A (en) * 2020-12-10 2021-02-26 北京中电普华信息技术有限公司 Expert information screening method and system
CN112948527A (en) * 2021-02-23 2021-06-11 云南大学 Improved TextRank keyword extraction method and device
CN113516094A (en) * 2021-07-28 2021-10-19 中国科学院计算技术研究所 System and method for matching document with review experts
CN113554210A (en) * 2021-05-17 2021-10-26 南京工程学院 Comment scoring and declaration prediction system and method for fund project declaration
CN113569575A (en) * 2021-08-10 2021-10-29 云南电网有限责任公司电力科学研究院 Evaluation expert recommendation method based on pictograph-semantic dual-feature space mapping
CN113643008A (en) * 2021-10-15 2021-11-12 中国铁道科学研究院集团有限公司科学技术信息研究所 Acceptance expert matching method, device, equipment and readable storage medium
CN114186002A (en) * 2021-12-14 2022-03-15 智博天宫(苏州)人工智能产业研究院有限公司 Scientific and technological achievement data processing and analyzing method and system
CN115033772A (en) * 2022-06-20 2022-09-09 浙江大学 Creative excitation method and device based on semantic network
CN115577696A (en) * 2022-11-15 2023-01-06 四川省公路规划勘察设计研究院有限公司 Project similarity evaluation and analysis method based on WBS tree
CN117034273A (en) * 2023-08-28 2023-11-10 山东省计算中心(国家超级计算济南中心) Android malicious software detection method and system based on graph rolling network

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101075942A (en) * 2007-06-22 2007-11-21 清华大学 Method and system for processing social network expert information based on expert value progation algorithm
CN102495860A (en) * 2011-11-22 2012-06-13 北京大学 Expert recommendation method based on language model
CN102855241A (en) * 2011-06-28 2013-01-02 上海迈辉信息技术有限公司 Multi-index expert suggestion system and realization method thereof
CN102880657A (en) * 2012-08-31 2013-01-16 电子科技大学 Expert recommending method based on searcher

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101075942A (en) * 2007-06-22 2007-11-21 清华大学 Method and system for processing social network expert information based on expert value progation algorithm
CN102855241A (en) * 2011-06-28 2013-01-02 上海迈辉信息技术有限公司 Multi-index expert suggestion system and realization method thereof
CN102495860A (en) * 2011-11-22 2012-06-13 北京大学 Expert recommendation method based on language model
CN102880657A (en) * 2012-08-31 2013-01-16 电子科技大学 Expert recommending method based on searcher

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
胡斌 (Hu Bin): "Research and Implementation of a Review Expert Recommendation System for Science and Technology Projects", China Master's Theses Full-text Database (Information Science and Technology), no. 7, 15 July 2013 (2013-07-15) *

Cited By (63)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103823896B (en) * 2014-03-13 2017-02-15 蚌埠医学院 Subject characteristic value algorithm and subject characteristic value algorithm-based project evaluation expert recommendation algorithm
CN104361102B (en) * 2014-11-24 2018-05-11 清华大学 A kind of expert recommendation method and system based on group matches
CN104361102A (en) * 2014-11-24 2015-02-18 清华大学 Expert recommendation method and system based on group matching
CN105786960A (en) * 2015-01-14 2016-07-20 通用电气公司 Method, System, And User Interface For Expert Search Based On Case Resolution Logs
CN105912581A (en) * 2016-03-31 2016-08-31 比美特医护在线(北京)科技有限公司 Information processing method and device
CN107194672A (en) * 2016-11-09 2017-09-22 北京理工大学 It is a kind of to merge academic speciality and the evaluation distribution method of community network
CN108427667A (en) * 2017-02-15 2018-08-21 北京国双科技有限公司 A kind of segmentation method and device of legal documents
CN108427667B (en) * 2017-02-15 2021-08-10 北京国双科技有限公司 Legal document segmentation method and device
CN107229738A (en) * 2017-06-18 2017-10-03 杭州电子科技大学 A kind of scientific paper search ordering method based on document scores model and the degree of correlation
CN107229738B (en) * 2017-06-18 2020-04-03 杭州电子科技大学 Academic paper search ordering method based on document scoring model and relevancy
CN107609006A (en) * 2017-07-24 2018-01-19 华中师范大学 A kind of chess game optimization method based on local chronicle research
CN107609006B (en) * 2017-07-24 2021-01-29 华中师范大学 Search optimization method based on local log research
CN107656920B (en) * 2017-09-14 2020-12-18 杭州电子科技大学 Scientific and technological talent recommendation method based on patents
CN107656920A (en) * 2017-09-14 2018-02-02 杭州电子科技大学 A kind of skilled personnel based on patent recommend method
CN107784087B (en) * 2017-10-09 2020-11-06 东软集团股份有限公司 Hot word determination method, device and equipment
CN107784087A (en) * 2017-10-09 2018-03-09 东软集团股份有限公司 A kind of hot word determines method, apparatus and equipment
CN107807978A (en) * 2017-10-26 2018-03-16 北京航空航天大学 A kind of code review person based on collaborative filtering recommends method
CN107807978B (en) * 2017-10-26 2021-07-06 北京航空航天大学 Code reviewer recommendation method based on collaborative filtering
CN108229684A (en) * 2018-01-26 2018-06-29 中国科学技术信息研究所 Build the method, apparatus and terminal device of expertise vector model
CN108229684B (en) * 2018-01-26 2022-04-15 中国科学技术信息研究所 Method and device for constructing expert knowledge vector model and terminal equipment
CN108399491A (en) * 2018-02-02 2018-08-14 浙江工业大学 A kind of employee's diversity ranking method based on network
CN108804633B (en) * 2018-06-01 2021-10-08 腾讯科技(深圳)有限公司 Content recommendation method based on behavior semantic knowledge network
CN108846056B (en) * 2018-06-01 2021-04-23 云南电网有限责任公司电力科学研究院 Scientific and technological achievement review expert recommendation method and device
CN108846056A (en) * 2018-06-01 2018-11-20 云南电网有限责任公司电力科学研究院 A kind of scientific and technological achievement evaluation expert recommended method and device
CN108804633A (en) * 2018-06-01 2018-11-13 腾讯科技(深圳)有限公司 The content recommendation method of Behavior-based control Semantic knowledge network
CN108549730A (en) * 2018-06-01 2018-09-18 云南电网有限责任公司电力科学研究院 A kind of search method and device of expert info
CN108920556A (en) * 2018-06-20 2018-11-30 华东师范大学 Recommendation expert method based on subject knowledge map
CN108920556B (en) * 2018-06-20 2021-11-19 华东师范大学 Expert recommending method based on discipline knowledge graph
CN108873706B (en) * 2018-07-30 2022-04-15 中国石油化工股份有限公司 Trap evaluation intelligent expert recommendation method based on deep neural network
CN108873706A (en) * 2018-07-30 2018-11-23 中国石油化工股份有限公司 Evaluation of trap intelligent expert recommended method based on deep neural network
CN109308315A (en) * 2018-10-19 2019-02-05 南京理工大学 A kind of collaborative recommendation method based on specialist field similarity and incidence relation
CN109308315B (en) * 2018-10-19 2022-09-16 南京理工大学 Collaborative recommendation method based on similarity and incidence relation of expert fields
CN109857872A (en) * 2019-02-18 2019-06-07 浪潮软件集团有限公司 The information recommendation method and device of knowledge based map
CN109992642B (en) * 2019-03-29 2022-11-18 华南理工大学 Single task expert automatic selection method and system based on scientific and technological entries
CN109992642A (en) * 2019-03-29 2019-07-09 华南理工大学 A kind of automatic method of selecting of single task expert and system based on scientific and technological entry
CN110046225A (en) * 2019-04-16 2019-07-23 广东省科技基础条件平台中心 A kind of science and technology item material integrity evaluating decision model training method
CN112182327A (en) * 2019-07-05 2021-01-05 北京猎户星空科技有限公司 Data processing method, device, equipment and medium
CN110442618B (en) * 2019-07-25 2023-04-18 昆明理工大学 Convolutional neural network review expert recommendation method fusing expert information association relation
CN110442618A (en) * 2019-07-25 2019-11-12 昆明理工大学 Merge convolutional neural networks evaluation expert's recommended method of expert info incidence relation
CN110443574A (en) * 2019-07-25 2019-11-12 昆明理工大学 Entry convolutional neural networks evaluation expert's recommended method
CN111143690A (en) * 2019-12-31 2020-05-12 中国电子科技集团公司信息科学研究院 Expert recommendation method and system based on associated expert database
CN111598526A (en) * 2020-04-21 2020-08-28 奇计(江苏)科技服务有限公司 Intelligent comparison and review method for describing scientific and technological innovation content
CN111666420A (en) * 2020-05-29 2020-09-15 华东师范大学 Method for intensively extracting experts based on subject knowledge graph
CN111951141A (en) * 2020-07-09 2020-11-17 广东港鑫科技有限公司 Double-random supervision method and system based on big data intelligent analysis and terminal equipment
CN111782797A (en) * 2020-07-13 2020-10-16 贵州省科技信息中心 Automatic matching method for scientific and technological project review experts and storage medium
CN112100370B (en) * 2020-08-10 2023-07-25 淮阴工学院 Picture-trial expert combination recommendation method based on text volume and similarity algorithm
CN112100370A (en) * 2020-08-10 2020-12-18 淮阴工学院 Picture examination expert combined recommendation method based on text convolution and similarity algorithm
CN112287679A (en) * 2020-10-16 2021-01-29 国网江西省电力有限公司电力科学研究院 Structured extraction method and system for text information in scientific and technological project review
CN112381381A (en) * 2020-11-12 2021-02-19 深圳供电局有限公司 Expert's device is recommended to intelligence
CN112381381B (en) * 2020-11-12 2023-11-17 深圳供电局有限公司 Expert's device is recommended to intelligence
CN112417870A (en) * 2020-12-10 2021-02-26 北京中电普华信息技术有限公司 Expert information screening method and system
CN112948527A (en) * 2021-02-23 2021-06-11 云南大学 Improved TextRank keyword extraction method and device
CN112948527B (en) * 2021-02-23 2023-06-16 云南大学 Improved TextRank keyword extraction method and device
CN113554210A (en) * 2021-05-17 2021-10-26 南京工程学院 Comment scoring and declaration prediction system and method for fund project declaration
CN113516094A (en) * 2021-07-28 2021-10-19 中国科学院计算技术研究所 System and method for matching document with review experts
CN113516094B (en) * 2021-07-28 2024-03-08 中国科学院计算技术研究所 System and method for matching and evaluating expert for document
CN113569575A (en) * 2021-08-10 2021-10-29 云南电网有限责任公司电力科学研究院 Evaluation expert recommendation method based on pictograph-semantic dual-feature space mapping
CN113569575B (en) * 2021-08-10 2024-02-09 云南电网有限责任公司电力科学研究院 Evaluation expert recommendation method based on pictographic-semantic dual-feature space mapping
CN113643008A (en) * 2021-10-15 2021-11-12 中国铁道科学研究院集团有限公司科学技术信息研究所 Acceptance expert matching method, device, equipment and readable storage medium
CN114186002A (en) * 2021-12-14 2022-03-15 智博天宫(苏州)人工智能产业研究院有限公司 Scientific and technological achievement data processing and analyzing method and system
CN115033772A (en) * 2022-06-20 2022-09-09 浙江大学 Creative excitation method and device based on semantic network
CN115577696A (en) * 2022-11-15 2023-01-06 四川省公路规划勘察设计研究院有限公司 Project similarity evaluation and analysis method based on WBS tree
CN117034273A (en) * 2023-08-28 2023-11-10 山东省计算中心(国家超级计算济南中心) Android malicious software detection method and system based on graph rolling network

Also Published As

Publication number Publication date
CN103631859B (en) 2017-01-11

Similar Documents

Publication Publication Date Title
CN103631859B (en) Intelligent review expert recommending method for science and technology projects
Saad et al. Twitter sentiment analysis based on ordinal regression
CN108182279B (en) Object classification method, device and computer equipment based on text feature
CN108363790A (en) For the method, apparatus, equipment and storage medium to being assessed
CN106599029A (en) Chinese short text clustering method
CN109582764A (en) Interaction attention sentiment analysis method based on interdependent syntax
CN108388651A (en) A kind of file classification method based on the kernel of graph and convolutional neural networks
CN108563703A (en) A kind of determination method of charge, device and computer equipment, storage medium
CN107832457A (en) Power transmission and transforming equipment defect dictionary method for building up and system based on TextRank algorithm
CN103942340A (en) Microblog user interest recognizing method based on text mining
CN106997341B (en) A kind of innovation scheme matching process, device, server and system
CN107122455A (en) A kind of network user&#39;s enhancing method for expressing based on microblogging
CN106250438A (en) Based on random walk model zero quotes article recommends method and system
CN103559199B (en) Method for abstracting web page information and device
EP3392783A1 (en) Similar word aggregation method and apparatus
CN106886576B (en) It is a kind of based on the short text keyword extracting method presorted and system
CN108038205A (en) For the viewpoint analysis prototype system of Chinese microblogging
CN103631858A (en) Science and technology project similarity calculation method
CN109918648B (en) Rumor depth detection method based on dynamic sliding window feature score
Gao et al. Text classification research based on improved Word2vec and CNN
CN105608075A (en) Related knowledge point acquisition method and system
CN107015965A (en) A kind of Chinese text sentiment analysis device and method
CN105869058A (en) Method for user portrait extraction based on multilayer latent variable model
CN108536781A (en) A kind of method for digging and system of social networks mood focus
Chamekh et al. Sentiment analysis based on deep learning in e-commerce

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20140312

Assignee: Hangzhou eddy current technology Co., Ltd

Assignor: Hangzhou Electronic Science and Technology Univ

Contract record no.: X2020330000008

Denomination of invention: Intelligent review expert recommending method for science and technology projects

Granted publication date: 20170111

License type: Common License

Record date: 20200117

EE01 Entry into force of recordation of patent licensing contract