CN106897370A - Drawing-review expert recommendation method based on Pearson similarity and FP-Growth - Google Patents

Info

Publication number
CN106897370A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710034169.2A
Other languages
Chinese (zh)
Other versions
CN106897370B (en)
Inventor
冯万利
朱全银
于柿民
庄军
严云洋
李翔
周泓
瞿学新
唐海波
潘舒新
邵武杰
杨茂灿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huaiyin Institute of Technology
Original Assignee
Huaiyin Institute of Technology
Application filed by Huaiyin Institute of Technology
Priority to CN201710034169.2A
Publication of CN106897370A
Application granted
Publication of CN106897370B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26: Visual data mining; Browsing structured data
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2458: Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465: Query processing support for facilitating data mining operations in structured databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a drawing-review expert recommendation method based on Pearson similarity and FP-Growth. The project attributes in the comprehensive project record set are first preprocessed. The Pearson similarity method is then used to find the ten projects closest in scale to the pending project, and the experts who reviewed them are extracted. The extracted experts are combined according to the branch project types of the pending comprehensive project and the research directions of the drawing-review experts. The FP-Growth method is then applied to the historical project review expert sets to obtain the frequent itemsets of drawing-review expert combinations. The support of each candidate expert combination is computed from these frequent itemsets, and the combination with the highest support, that is, the highest compatibility, is the expert set that reviews the pending project. The method effectively recommends the drawing-review expert combination with the highest compatibility, which improves the efficiency of collaborative review and increases the value of the historical project review expert data.

Description

Drawing-review expert recommendation method based on Pearson similarity and FP-Growth
Technical field
The invention belongs to the field of recommendation algorithms and association rule mining, and in particular relates to a drawing-review expert recommendation method based on Pearson similarity and FP-Growth. It is mainly used to compute the support, i.e. the compatibility, of project review expert combinations, thereby improving the efficiency of collaborative review and increasing the value of the historical project review expert data.
Background art
In the field of project review, an expert recommendation algorithm is important for selecting review experts efficiently. Traditionally the review expert group is selected manually, which can no longer meet the needs of the field. In recent years, researchers have proposed personalized recommendation schemes for different recommender systems, such as content-based recommendation, collaborative filtering, association rules, utility-based recommendation, and hybrid recommendation.
The existing research of Feng Wanli, Zhu Quanyin et al. includes: Wanli Feng. Research of theme statement extraction for Chinese literature based on lexical chain. International Journal of Multimedia and Ubiquitous Engineering, Vol.11, No.6 (2016), pp.379-388; Wanli Feng, Ying Li, Shangbing Gao, Yunyang Yan, Jianxun Xue. A novel flame edge detection algorithm via a novel active contour model. International Journal of Hybrid Information Technology, Vol.9, No.9 (2016), pp.275-282; Liu Jinling, Feng Wanli. Pattern matching method based on feature dependency relations [J]. Microelectronics & Computer, 2011, 28(12): 167-170; Liu Jinling, Feng Wanli, Zhang Yahong. Text clustering with initialized cluster centers and a reconstructed scaling function [J]. Application Research of Computers, 2011, 28(11): 4115-4117; Liu Jinling, Feng Wanli, Zhang Yahong. Chinese short-message text clustering method based on rescaling [J]. Computer Engineering and Applications, 2012, 48(21): 146-150; Zhu Quanyin, Pan Lu, Liu Wenru, et al. Classification and extraction algorithm for Web science and technology news [J]. Journal of Huaiyin Institute of Technology, 2015, 24(5): 18-24; Li Xiang, Zhu Quanyin. Collaborative filtering recommendation based on joint clustering and a shared rating matrix [J]. Computer Science and Exploration, 2014, 8(6): 751-759; Quanyin Zhu, Sunqun Cao. A Novel Classifier-independent Feature Selection Algorithm for Imbalanced Datasets. 2009, p: 77-82; Quanyin Zhu, Yunyang Yan, Jin Ding, Jin Qian. The Case Study for Price Extracting of Mobile Phone Sell Online. 2011, p: 282-285; Quanyin Zhu, Suqun Cao, Pei Zhou, Yunyang Yan, Hong Zhou. Integrated Price Forecast based on Dichotomy Backfilling and Disturbance Factor Algorithm. International Review on Computers and Software, 2011, Vol.6 (6): 1089-1093.
Patents applied for, published, and granted by Zhu Quanyin, Feng Wanli et al.: Feng Wanli, Shao Heshuai, Zhuang Jun. An intelligent wireless network terminal device for monitoring refrigerated truck status: CN203616634U [P]. 2014; Zhu Quanyin, Hu Rongjing, He Suqun, Zhou Pei, et al. A commodity price forecasting method based on linear interpolation and an adaptive sliding window. Chinese patent: ZL 2011 1 0423015.5, 2015.07.01; Zhu Quanyin, Cao Suqun, Yan Yunyang, Hu Rongjing, et al. A commodity price forecasting method based on dichotomy data backfilling and disturbance factors. Chinese patent: ZL 2011 1 0422274.6, 2013.01.02; Li Xiang, Zhu Quanyin, Hu Ronglin, Zhou Hong. An intelligent cold-chain logistics stowage recommendation method based on spectral clustering. Chinese patent publication No. CN105654267A, 2016.06.08.
Pearson product-moment correlation coefficient:
The Pearson product-moment correlation coefficient measures the linear correlation between two variables X and Y; its value lies between -1 and 1. It is widely used in the natural sciences to measure the degree of correlation between two variables.
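As an illustration of the coefficient described above, the following is a minimal Python sketch of its standard definition. The function name `pearson` and the choice to return 0.0 for a constant vector (where the coefficient is undefined) are assumptions made for this example and are not taken from the patent.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Numerator: sum of products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Assumption: a constant vector is treated as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)

r_same_trend = pearson([1, 2, 3], [2, 4, 6])
r_opposite_trend = pearson([1, 2, 3], [3, 2, 1])
```

Vectors that rise together give a value near 1, vectors that move in opposite directions give a value near -1.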
Association rule algorithm:
Association-rule-based recommendation is common in e-commerce systems and has proven effective. Its practical meaning is that users who have bought certain items are more inclined to buy certain other items. The primary goal of such a recommender system is to mine association rules, that is, sets of items bought together by many users; the items within such a set can be recommended to one another. Recommender systems based on association rules generally have a high conversion rate, because once a user has bought some items of a frequent set, the user is more likely to buy the remaining items in that set. However, mining the association rules of item sets is computationally expensive, and user data is also sparse, which reduces recommendation accuracy.
FP-Growth algorithm:
FP-Growth is an association analysis algorithm proposed by Jiawei Han et al. in 2000. It adopts a divide-and-conquer strategy: the database that provides the frequent itemsets is compressed into a frequent pattern tree (FP-tree) that still retains the itemset association information. An FP-tree is a special prefix tree consisting of a frequent-item header table and an item prefix tree, and the algorithm uses this structure to speed up the whole mining process. Compared with Apriori, another frequent-itemset algorithm for mining association rules, FP-Growth mines the database by divide and conquer and generates no candidate itemsets. It stores the important information of the database in the FP-tree, so the database needs to be scanned only twice; the key information is then held in memory as the FP-tree, which avoids the large overhead of scanning the database repeatedly.
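The two database scans and the header-table plus prefix-tree structure described above can be sketched as follows. This is a minimal illustration of FP-tree construction only (the recursive mining stage is omitted); the names `FPNode` and `build_fp_tree` and the sample transactions are hypothetical.

```python
from collections import Counter

class FPNode:
    """One node of the item prefix tree."""
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_fp_tree(transactions, min_count):
    # Scan 1: count how many transactions contain each item.
    freq = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in freq.items() if c >= min_count}
    root = FPNode(None)
    header = {i: [] for i in frequent}  # header table: item -> its tree nodes
    # Scan 2: insert each transaction, frequent items only,
    # ordered by descending frequency (ties broken by item name).
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += 1
            node = child
    return root, header

transactions = [["E1", "E2", "E3"], ["E1", "E2"], ["E1", "E4"], ["E2", "E3"]]
root, header = build_fp_tree(transactions, min_count=2)
```

Transactions that share a prefix share tree nodes, which is where the compression comes from; the header table links all nodes of one item so that its total count can be recovered without rescanning the database.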
Summary of the invention
Object of the invention: a review expert group selected manually in the traditional way has the following problems: the selected experts may never have reviewed a project of similar scale, which wastes a great deal of time; and the compatibility among the selected panel members may be low, which makes the review inefficient. To address these problems, the invention comprehensively analyzes the historical project review expert sets and the historical comprehensive project record set, and uses a drawing-review expert recommendation method based on Pearson similarity and FP-Growth to recommend the review expert group with the highest compatibility for the pending project.
Technical solution: the invention proposes a drawing-review expert recommendation method based on Pearson similarity and FP-Growth, comprising the following steps:
Step 1: Normalize the project attributes of the pending project and of the comprehensive project record set. The pending project and each comprehensive project are represented by a comprehensive project type, the branch project types of that comprehensive project type, and project attributes. Specifically:
Step 1.1: Define the comprehensive project types, branch project types, and project attributes;
Step 1.2: Record the maximum and minimum value of each project attribute in the comprehensive project record set;
Step 1.3: Normalize the data of the comprehensive project record set and the attributes of the pending project, using the formula:
A_norm = (A - A_min) / (A_max - A_min)
where A_max and A_min are the maximum and minimum values of the given project attribute, A is the value before normalization, and A_norm is the value after normalization.
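The min-max normalization of Step 1 can be sketched as below. The helper names `min_max_fit` and `min_max_apply`, the sample values, and the handling of an attribute whose maximum equals its minimum (mapped to 0.0) are assumptions for illustration; the patent does not specify that edge case.

```python
def min_max_fit(records):
    """Return the (min, max) pair of each attribute column of the history."""
    columns = list(zip(*records))
    return [(min(col), max(col)) for col in columns]

def min_max_apply(record, bounds):
    """Apply A_norm = (A - A_min) / (A_max - A_min) attribute by attribute."""
    normalized = []
    for value, (lo, hi) in zip(record, bounds):
        # Assumption: a constant attribute carries no information, use 0.0.
        normalized.append(0.0 if hi == lo else (value - lo) / (hi - lo))
    return normalized

# Hypothetical history: [floor area, number of floors] per project.
history = [[100.0, 2], [300.0, 10], [200.0, 6]]
bounds = min_max_fit(history)
history_norm = [min_max_apply(r, bounds) for r in history]
pending_norm = min_max_apply([400.0, 2], bounds)
```

Because the bounds come from the historical record set only, a pending project whose attribute lies outside the historical range is mapped outside [0, 1], as with the floor area of 400.0 here.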
Step 2: Apply the Pearson similarity method to the normalized data set to find the ten projects closest in scale to the pending project, and extract the experts who reviewed those ten projects. Each review expert is represented by the branch project type the expert researches and by the expert's project review records. Specifically:
Step 2.1: Define the drawing-review expert data set and the project review record set. Each drawing-review expert record consists of an expert number and the branch project type the expert researches; each project review record consists of a project number and a drawing-review expert number;
Step 2.2: Group the experts in the project review record set by project number to obtain the review expert set of each project;
Step 2.3: Compute the similarity between the pending project and each project in the comprehensive project record set, using the formula:
sim_i = Σ_j (X_j - X̄)(Y_ij - Ȳ_i) / ( sqrt(Σ_j (X_j - X̄)^2) · sqrt(Σ_j (Y_ij - Ȳ_i)^2) )
where sim_i is the similarity between the pending project and the i-th project, X_j and Y_ij are the j-th project attribute values of the pending project and of the i-th project, and X̄ and Ȳ_i are the means of the project attribute values of the pending project and of the i-th project;
Step 2.4: Sort the similarities, and extract the project numbers of the top ten projects together with their review expert sets to obtain the preselected drawing-review expert set.
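Steps 2.3 and 2.4 can be sketched as a ranking followed by a set union. The function name `preselect_experts`, the dictionary layout of the inputs, and the sample data are assumptions for this example; `k` defaults to the ten projects named in the text.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return 0.0 if vx == 0 or vy == 0 else cov / math.sqrt(vx * vy)

def preselect_experts(pending, projects, reviewers, k=10):
    """projects: {project_id: normalized attribute vector};
    reviewers: {project_id: set of expert ids that reviewed it}."""
    ranked = sorted(projects,
                    key=lambda pid: pearson(pending, projects[pid]),
                    reverse=True)
    top = ranked[:k]
    experts = set()
    for pid in top:
        experts |= reviewers.get(pid, set())
    return top, experts

pending = [0.1, 0.5, 0.9]
projects = {"p1": [0.2, 0.6, 1.0], "p2": [0.9, 0.5, 0.1], "p3": [0.1, 0.9, 0.5]}
reviewers = {"p1": {"Ma1", "Ma2"}, "p2": {"Ma9"}, "p3": {"Ma2", "Ma3"}}
top, experts = preselect_experts(pending, projects, reviewers, k=2)
```

The union removes duplicates, so an expert who reviewed several of the most similar projects appears once in the preselected set.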
Step 3: Combine the extracted experts according to the branch project types of the pending comprehensive project and the research directions of the drawing-review experts to obtain all candidate expert combinations. Specifically:
Step 3.1: Remove from the preselected drawing-review expert set the experts who currently have a review task;
Step 3.2: From the expert set obtained in Step 3.1, select the drawing-review experts whose research branch project type matches a branch project type of the pending project, and group the experts by branch project type;
Step 3.3: If a branch project type of the pending project has no expert in the set obtained in Step 3.2, search the full drawing-review expert data set for experts who review that branch project type and have no current task, and add them;
Step 3.4: Select at least one expert for each branch project type from the expert set obtained in Step 3.3; the resulting selections are all the candidate expert combinations.
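Steps 3.1 to 3.4 can be sketched as building one pool of idle experts per branch project type and taking the Cartesian product of the pools. This sketch assumes exactly one expert per branch project type and one specialty per expert, which is the simplest reading of "at least one"; the function name `candidate_panels` and the sample data are hypothetical.

```python
from itertools import product

def candidate_panels(required_types, preselected, specialty, busy):
    """required_types: branch project types of the pending project;
    preselected: expert ids from the similar projects;
    specialty: {expert_id: branch project type} for ALL known experts;
    busy: expert ids that currently have a review task."""
    pools = []
    for btype in required_types:
        # Steps 3.1-3.2: idle preselected experts of this branch type.
        pool = sorted(e for e in preselected
                      if e not in busy and specialty.get(e) == btype)
        if not pool:
            # Step 3.3: fall back to every idle expert of this branch type.
            pool = sorted(e for e, t in specialty.items()
                          if t == btype and e not in busy)
        if not pool:
            raise ValueError(f"no idle expert for branch type {btype}")
        pools.append(pool)
    # Step 3.4 (assumption: one expert per branch type).
    return [frozenset(combo) for combo in product(*pools)]

specialty = {"Ma1": "B1", "Ma2": "B1", "Ma3": "B2", "Ma4": "B2", "Ma5": "B2"}
panels = candidate_panels(["B1", "B2"], {"Ma1", "Ma2", "Ma3"}, specialty, {"Ma3"})
```

In this example the only preselected B2 expert is busy, so the B2 pool falls back to Ma4 and Ma5, giving 2 × 2 = 4 candidate combinations.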
Step 4: Process the historical project review expert sets with the FP-Growth method to obtain the frequent itemsets of drawing-review expert combinations;
Step 5: Using the frequent itemsets, compute the support of each candidate expert combination with the adaptive expert-combination compatibility method. The combination with the highest support, that is, the highest compatibility, is the expert set that reviews the pending project. Specifically:
Step 5.1: Take one candidate expert combination with n experts as an example. Extracting 1 expert from it can be done in C(n,1) ways, extracting 2 experts in C(n,2) ways, and so on up to extracting all n experts in C(n,n) ways. All extraction results together form the set Subset, which contains C(n,1) + C(n,2) + ... + C(n,n) = 2^n - 1 sets. Initialize the compatibility SValue of the candidate expert combination to 0;
Step 5.2: Traverse Subset. If an extracted expert combination in Subset appears among the frequent itemsets of drawing-review expert combinations, add to the compatibility of the candidate expert combination from Step 5.1 the product of that combination's frequency in the frequent itemsets and the number of experts in it, that is:
SValue = SValue + f * k
where SValue is the compatibility of the candidate expert combination, f is the frequency of the extracted expert combination in the frequent itemsets, and k is the number of experts in the extracted combination. When the traversal ends, SValue is the final compatibility of the candidate expert combination from Step 5.1;
Step 5.3: Compute the compatibility of every candidate expert combination by the method of Steps 5.1 and 5.2; the candidate with the highest final compatibility is the expert set that reviews the pending project.
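The compatibility computation of Steps 5.1 to 5.3 can be sketched as follows. Representing the frequent itemsets as a dictionary keyed by `frozenset`, so that membership is a lookup instead of a scan, is a choice made for this example; the function name `compatibility` and the sample frequencies are hypothetical.

```python
from itertools import combinations

def compatibility(panel, frequent_itemsets):
    """Sum f * k over every non-empty subset of the panel that is a
    frequent itemset, where f is its frequency and k its size."""
    members = sorted(panel)
    score = 0
    for k in range(1, len(members) + 1):
        for subset in combinations(members, k):
            f = frequent_itemsets.get(frozenset(subset))
            if f is not None:
                score += f * k
    return score

frequent = {
    frozenset({"A"}): 5,
    frozenset({"B"}): 4,
    frozenset({"A", "B"}): 3,
    frozenset({"A", "B", "C"}): 2,
    frozenset({"D"}): 7,
}
score_abc = compatibility({"A", "B", "C"}, frequent)
best = max([{"A", "B", "C"}, {"A", "D"}], key=lambda p: compatibility(p, frequent))
```

For the panel {A, B, C} the matching subsets contribute 5·1 + 4·1 + 3·2 + 2·3 = 21, while {A, D} scores 5·1 + 7·1 = 12, so the first panel is selected. Weighting by k rewards larger groups of experts that have worked together before.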
By adopting the above technical solution, the invention has the following beneficial effects: the method uses the comprehensive project record set and the historical project review expert sets to effectively recommend the drawing-review expert combination with the highest compatibility, which improves review efficiency. Specifically, the invention mines the experts' historical project review records to discover the combination relationships and compatibility among experts. It uses the Pearson similarity algorithm to obtain the review expert sets of historical projects similar to the pending project, keeps the experts in those sets who have no current review task, and combines them according to the branch project types of the pending comprehensive project and the experts' review directions, so that the experts in every combination have reviewed projects similar to the pending one. In addition, the invention proposes an expert-combination compatibility algorithm for computing the compatibility of each expert combination; the combination with the highest compatibility is the expert group finally recommended for the pending project.
Brief description of the drawings
Fig. 1 is the overall flow chart of the drawing-review expert recommendation method;
Fig. 2 is the flow chart of preprocessing the project and review expert data and of the association rule method;
Fig. 3 is the flow chart of project data normalization and similarity computation;
Fig. 4 is the flow chart of the expert combination method;
Fig. 5 is the flow chart of selecting the expert combination with the highest compatibility among all candidate expert combinations;
Fig. 6 is the flow chart of the adaptive expert-combination compatibility method.
Detailed description of the embodiments
The invention is further explained below with reference to specific embodiments. It should be understood that these embodiments only illustrate the invention and do not limit its scope; after reading the invention, modifications of various equivalent forms made by those skilled in the art fall within the scope defined by the claims appended to this application.
Step 1: Normalize the project attributes of the pending project and of the comprehensive project record set. The pending project and each comprehensive project are represented by a comprehensive project type, the branch project types of that comprehensive project type, and project attributes, as shown in Fig. 2:
Step 1.1: Define G_1, G_2, G_3, G_4, G_5 as the comprehensive project types urban water supply and drainage, building decoration, residential building, building construction survey, and single-item design engineering, respectively. Define B_1, B_2, B_3, B_4, B_5, B_6, B_7 as the branch project types geotechnical engineering investigation, architecture, HVAC, electrical, structure, water supply and drainage, and road, respectively, satisfying:
G_1 = {B_1, B_2, B_3, B_4, B_5, B_6, B_7}, G_2 = {B_1, B_2, B_3, B_4, B_5, B_6}, G_3 = {B_1, B_2, B_3, B_4, B_5, B_6}, G_4 = {B_1, B_2, B_4, B_5, B_6}, G_5 = {B_1, B_2, B_4, B_5, B_6}
Step 1.2: Define ProjectInfo = {pr_1, pr_2, ..., pr_A} as the data set of all comprehensive projects, where pr_i = {id_i, G_B, Ar_i, Fl_i, Hi_i, Ac_i, Co_i, Am_i} is the data set of a single comprehensive project, A = Card(ProjectInfo), the function Card() returns the number of elements of a set, i ∈ [1, A], and B ∈ [1, 5]. id_i is the project number, and G_B, Ar_i, Fl_i, Hi_i, Ac_i, Co_i, Am_i are respectively the comprehensive project type, floor area, number of floors, building height, accounts receivable, formulation content, and consumable quantity of the project numbered id_i;
Step 1.3: Define HP as the pending project, with comprehensive project type HPType and project data set HPInfo = {HPType, HAr, HFl, HHi, HAc, HCo, HAm}, where HAr, HFl, HHi, HAc, HCo, HAm are respectively the floor area, number of floors, building height, accounts receivable, formulation content, and consumable quantity of HP;
Step 1.4: Define Ar_min, Fl_min, Hi_min, Ac_min, Co_min, Am_min as the minimum values, and Ar_max, Fl_max, Hi_max, Ac_max, Co_max, Am_max as the maximum values, of Ar, Fl, Hi, Ac, Co, Am in ProjectInfo from Step 1.2. Define a loop variable P, initialized to 1, for traversing ProjectInfo from Step 1.2;
Step 1.5: If P ≤ A, go to Step 1.6; otherwise go to Step 1.8;
Step 1.6: Ar_P = (Ar_P - Ar_min)/(Ar_max - Ar_min), Fl_P = (Fl_P - Fl_min)/(Fl_max - Fl_min), Hi_P = (Hi_P - Hi_min)/(Hi_max - Hi_min), Ac_P = (Ac_P - Ac_min)/(Ac_max - Ac_min), Co_P = (Co_P - Co_min)/(Co_max - Co_min), Am_P = (Am_P - Am_min)/(Am_max - Am_min); that is, normalize the data in the comprehensive project record set;
Step 1.7: Set P = P + 1 and go to Step 1.5;
Step 1.8: HAr = (HAr - Ar_min)/(Ar_max - Ar_min), HFl = (HFl - Fl_min)/(Fl_max - Fl_min), HHi = (HHi - Hi_min)/(Hi_max - Hi_min), HAc = (HAc - Ac_min)/(Ac_max - Ac_min), HCo = (HCo - Co_min)/(Co_max - Co_min), HAm = (HAm - Am_min)/(Am_max - Am_min); that is, normalize the data of the pending project.
Step 2: Apply the Pearson similarity method to the normalized data set to find the ten projects closest in scale to the pending project, and extract the experts who reviewed those ten projects. Each review expert is represented by the branch project type the expert researches and by the expert's project review records, as shown in Fig. 3:
Step 2.1: Define ExpertInfo = {expertInfo_1, expertInfo_2, ..., expertInfo_E} as the data set of all drawing-review experts, where expertInfo_F = {Ma_F, B_g} is the data of a single drawing-review expert, and define ExpertAll = {Ma_1, Ma_2, ..., Ma_E} as the set of all drawing-review expert numbers. Here E = Card(ExpertInfo), Ma_F is a drawing-review expert number, F ∈ [1, E], g ∈ [1, 7], and B_g is the branch project type researched by the expert numbered Ma_F;
Step 2.2: Define CenSorOpinions as the drawing-review experts' project review record set, CenSorOpinions = {{id_1, Ma_C1}, {id_1, Ma_C2}, ..., {id_A, Ma_D1}, {id_A, Ma_D2}}, where C1, C2, D1, D2 ∈ [1, E] and N = Card(CenSorOpinions);
Step 2.3: In the CenSorOpinions data set from Step 2.2, merge the Ma entries of the records that share the same id_i (a row-to-column conversion) to obtain the project review expert sets:
ExpertJoin = {expertJoin_1, expertJoin_2, ..., expertJoin_A}, where expertJoin_b = {Ma_H, ..., Ma_I} is the review expert set of project pr_b numbered id_b, H, I ∈ [1, E], b ∈ [1, A];
Step 2.4: Define a loop variable R for traversing all comprehensive projects in ProjectInfo from Step 1.2, and let X = {HAr, HFl, HHi, HAc, HCo, HAm}. sim_R is the similarity between comprehensive project pr_R in ProjectInfo from Step 1.2 and the pending project HP, and Sim is the similarity set, where R ∈ [1, A], R is initialized to 1, id_R is the project number of the single comprehensive project pr_R, and Sim is initialized to the empty set ∅;
Step 2.5: If R ≤ A, go to Step 2.6; otherwise go to Step 2.9;
Step 2.6: Let Y = {Ar_R, Fl_R, Hi_R, Ac_R, Co_R, Am_R} be the normalized attribute values of pr_R;
Step 2.7: sim_R = Σ_r1 (X_r1 - X̄)(Y_r1 - Ȳ) / ( sqrt(Σ_r1 (X_r1 - X̄)^2) · sqrt(Σ_r1 (Y_r1 - Ȳ)^2) ), with r1 running from 1 to 6, where X_r1 and Y_r1 are the r1-th data items of X and Y, and X̄ and Ȳ are the mean values of the elements of X and Y; then Sim = Sim ∪ {{id_R, sim_R}};
Step 2.8: Set R = R + 1 and go to Step 2.5;
Step 2.9: After obtaining Sim = {{id_1, sim_1}, {id_2, sim_2}, ..., {id_A, sim_A}}, sort it to get the ordered similarity set Simi = {{id_j1, a_j1}, {id_j2, a_j2}, ..., {id_jA, a_jA}}, where a_j1 ≥ a_j2 ≥ ... ≥ a_jA, {id_jt, a_jt} ∈ Sim, and jt, j1, j2, jA ∈ [1, A]. Take the first ten entries: SimProject = {{id_j1, a_j1}, {id_j2, a_j2}, ..., {id_j10, a_j10}};
Step 2.10: Define Forecast as the preselected drawing-review expert set, initialized to the empty set ∅. Define a loop variable V, initialized to 1, for traversing SimProject from Step 2.9, and define the variable T as the number of a preselected drawing-review expert;
Step 2.11: If V ≤ 10, go to Step 2.12; otherwise go to Step 2.14;
Step 2.12: Let expertJoin_jV be the review expert set of the project numbered id_jV; for each T ∈ expertJoin_jV, update the preselected drawing-review expert set: Forecast = Forecast ∪ {T};
Step 2.13: Set V = V + 1 and go to Step 2.11;
Step 2.14: Obtain the preselected drawing-review expert set Forecast = {Ma_m1, Ma_m2, ..., Ma_mn}, where Ma_mi is the i-th data item of Forecast, Ma_mi ∈ ExpertAll, mi ∈ [1, E].
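The row-to-column conversion of Steps 2.2 and 2.3, which turns the flat review records into one expert set per project, can be sketched as a grouping step. The function name `group_reviewers` and the sample records are assumptions for this example.

```python
from collections import defaultdict

def group_reviewers(records):
    """records: iterable of (project_id, expert_id) pairs, i.e. the
    CenSorOpinions entries; returns {project_id: set of expert ids},
    i.e. the ExpertJoin sets."""
    by_project = defaultdict(set)
    for project_id, expert_id in records:
        by_project[project_id].add(expert_id)
    return dict(by_project)

cen_sor_opinions = [
    ("id1", "Ma3"), ("id1", "Ma7"),
    ("id2", "Ma3"), ("id2", "Ma5"), ("id2", "Ma7"),
]
expert_join = group_reviewers(cen_sor_opinions)
```

Each resulting set is one transaction for the frequent-itemset mining of Step 4 and one lookup target for the preselection of Steps 2.10 to 2.14.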
Step 3: Combine the extracted experts according to the branch project types of the pending comprehensive project and the research directions of the drawing-review experts to obtain all candidate expert combinations, as shown in Fig. 4:
Step 3.1: Define Work = {Ma_u1, Ma_u2, ..., Ma_un} as the set of drawing-review experts who currently have a review task, and update the preselected drawing-review expert set: Forecast = Forecast - Work, where Ma_ui is the i-th data item of Work, Ma_ui ∈ ExpertAll, ui ∈ [1, E];
Step 3.2: Define the comprehensive project type G_N2 = {E_1, E_2, ..., E_Z}, and define GN2 = {E1, E2, ..., EZ} as the drawing-review expert sets that are to take part in reviewing comprehensive project type G_N2, where EJ is the set of drawing-review experts to take part in reviewing branch project type E_J, and EJ is initialized to the empty set ∅. G_N2 = HPType, that is, G_N2 is the comprehensive project type HPType of the pending project HP from Step 1.3, Z = Card(G_N2), Z ∈ [5, 7], J ∈ [1, Z];
Step 3.3: Define loop variables Num1 and Num2 for traversing GN2 from Step 3.2 and Forecast from Step 3.1 respectively, both initialized to 1. Let Num3 = Card(Forecast), let E_Num1 be the Num1-th branch project type of G_N2 from Step 3.2, and let Ma_Num2 be the Num2-th drawing-review expert number in Forecast from Step 3.1;
Step 3.4: If Num1 ≤ Z, go to Step 3.5; otherwise go to Step 3.17;
Step 3.5: If Num2 ≤ Num3, go to Step 3.6; otherwise go to Step 3.10;
Step 3.6: Let B_Num4 be the branch project type researched by the expert numbered Ma_Num2, {Ma_Num2: B_Num4} ∈ ExpertInfo, where Num4 ∈ [1, 7];
Step 3.7: If B_Num4 == E_Num1, that is, the branch project type researched by the expert numbered Ma_Num2 equals the Num1-th branch project type of G_N2, go to Step 3.8; otherwise go to Step 3.9;
Step 3.8: Update the Num1-th data item of GN2 from Step 3.2: ENum1 = ENum1 ∪ {Ma_Num2};
Step 3.9: Set Num2 = Num2 + 1 and go to Step 3.5;
Step 3.10: If ENum1 = ∅, go to Step 3.11; otherwise go to Step 3.16;
Step 3.11: Define a loop variable c, initialized to 1, for traversing ExpertInfo from Step 2.1. The c-th data item of ExpertInfo is expertInfo_c = {Ma_c, ty}, where ty is the branch project type reviewed by the expert numbered Ma_c;
Step 3.12: If c ≤ E, go to Step 3.13; otherwise go to Step 3.16;
Step 3.13: If Ma_c ∉ Work and ty == E_Num1, go to Step 3.14; otherwise go to Step 3.15;
Step 3.14: ENum1 = ENum1 ∪ {Ma_c};
Step 3.15: Set c = c + 1 and go to Step 3.12;
Step 3.16: Set Num1 = Num1 + 1, reset Num2 to 1 so that Forecast is traversed again for the next branch project type, and go to Step 3.4;
Step 3.17: Obtain GN2 = {E1, E2, ..., EZ}, where EJ = {Ma_J1, Ma_J2, ..., Ma_JNu}, Nu = Card(EJ), J ∈ [1, Z];
Step 3.18: Define ExportCom as the set of all candidate drawing-review expert combinations for reviewing HP, and Com as one such candidate combination;
Step 3.19: Define Com_N3 = {Q_1, Q_2, ..., Q_N5} and ExportCom = {Com_1, Com_2, ..., Com_N6}. S_N3 is the support of Com_N3 and SC = {S_1, S_2, ..., S_N6} is the support set, where Q_N7 is the N7-th drawing-review expert number in Com_N3, Q_N7 is any element of EN7, EN7 is the N7-th data item of GN2 from Step 3.17, 1 ≤ N7 ≤ Z, N5 = Z, 1 ≤ N3 ≤ N6, and N3 is initialized to 1. Define End as the drawing-review expert set that finally reviews project HP from Step 1.3, initialized to the empty set ∅.
Step 4: Process the historical project review expert sets with the FP-Growth method to obtain the frequent itemsets of drawing-review expert combinations. Specifically, apply the association rule method FP-Growth to the project review expert sets ExpertJoin from Step 2.3 to obtain all frequent itemsets of drawing-review expert combinations, Relationt = {{relationt_1: fr_1}, {relationt_2: fr_2}, ..., {relationt_M: fr_M}}, where relationt_X1 = {r_1, r_2, ..., r_j}, r_j ∈ ExpertAll, 1 ≤ j ≤ E, M = Card(Relationt), X1 ∈ [1, M], H1 ∈ [1, E], and fr_X1 is the frequency of relationt_X1.
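To make the shape of Relationt concrete, the following sketch produces the same kind of output (expert sets mapped to frequencies) with a brute-force enumeration. This is explicitly not FP-Growth: it is a naive stand-in that is only practical for small review panels, since it enumerates every subset of every panel. The function name `frequent_itemsets`, the `min_count` threshold, and the sample data are assumptions for this example.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_count):
    """Count every non-empty subset of every transaction and keep the
    subsets that occur in at least min_count transactions."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, len(items) + 1):
            for subset in combinations(items, k):
                counts[frozenset(subset)] += 1
    return {s: c for s, c in counts.items() if c >= min_count}

# Hypothetical ExpertJoin sets, one per historical project.
expert_join = [{"Ma1", "Ma2", "Ma3"}, {"Ma1", "Ma2"}, {"Ma2", "Ma3"}]
relationt = frequent_itemsets(expert_join, min_count=2)
```

A real implementation would use FP-Growth as the text describes; for the same threshold it should yield the same itemsets and frequencies without enumerating infrequent subsets.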
Step 5: Select the expert combination with the highest compatibility among all candidate expert combinations, through Steps 5.1 to 5.8, as shown in Fig. 5:
Step 5.1: N3 from Step 3.19 is used to traverse all candidate expert combinations ExportCom from Step 3.19, and N6 from Step 3.19 is the number of combinations in ExportCom;
Step 5.2: If N3 ≤ N6, go to Step 5.3; otherwise go to Step 5.7;
Step 5.3: Assign Com_N3 from Step 3.19 to ExpertHandle of sub-procedure X1 (Steps 5.4.1 to 5.4.14), and assign Relationt to Rel of Step 5.4;
Step 5.4: Execute sub-procedure X1, i.e. Steps 5.4.1 to 5.4.14;
Step 5.5: Assign the result SValue of sub-procedure X1 (Steps 5.4.1 to 5.4.14) to S_N3, where S_N3 is the N3-th element of SC from Step 3.19;
Step 5.6: Set N3 = N3 + 1 and go to Step 5.2;
Step 5.7: Let S_N4 be the maximum value in SC, so that the support of Com_N4 is S_N4, where N4 ∈ [1, N6];
Step 5.8: Obtain the drawing-review expert set that finally reviews project HP, End = {K_1, K_2, ..., K_Z}, i.e. End = Com_N4, and update Work = Work ∪ End, where K_q ∈ Com_N4, 1 ≤ q ≤ Z;
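Steps 5.7 and 5.8 amount to taking the candidate with the maximum support and marking its experts as busy. A minimal sketch, assuming the supports have already been computed into a list parallel to the candidates; the function name `select_panel` and the sample values are hypothetical, and ties are resolved in favor of the earliest candidate, which the patent does not specify.

```python
def select_panel(candidates, supports, work):
    """candidates: list of expert sets (ExportCom);
    supports: list of their supports (SC), same order;
    work: set of experts that already have a review task."""
    best_index = max(range(len(candidates)), key=lambda i: supports[i])
    end = candidates[best_index]   # End = Com_N4
    new_work = work | set(end)     # Work = Work ∪ End
    return end, new_work

export_com = [frozenset({"Ma1", "Ma4"}), frozenset({"Ma2", "Ma4"})]
sc = [12, 21]
end, work = select_panel(export_com, sc, {"Ma3"})
```

Updating Work ensures that the chosen experts are excluded by Step 3.1 when the next pending project is processed.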
Step 5.4:Close frequent item set and the every kind of alternative expert group of self adaptation compatible degree method calculating is combined by every kind of expert The support of intersection, final support maximum is that compatible degree highest expert's combination of sets is the expert for participating in unexamined project Collection, it is specific as Fig. 6 shows:
Step 5.4.1:Definition figure examines expert combination of sets ExpertHandle={ Ma1,Ma2,...,MaNu, SValue is Frequent item set Rel={ { rel close in the support of ExpertHandle, all Tu Shen expert groups1:f1},{rel2:f2},..., {relM1:fM1, wherein, Nu=Card (ExpertHandle), M1=Card (Rel), it is 0 that SValue assigns initial value;
Step 5.4.2:Define Subset={ Sub1,Sub2,...,SubNu, Sub1={ Su11,Su12,...,Su1n1, Su1n1={ dkh, Sub2={ Su21,Su22,...,Su2n2, Su2n2={ dki,dkj, SubNu={ SuNu1, SuNu1= {dk1,dk2,...,dkNu, wherein, dkh,dki,dkj,dk1,dk2,...,dkNu∈ ExpertHandle, I.e. Subset is all of combined result after the expert and combination extracted from ExpertHandle, Sub1Be from 1 n1=Nu combined result collection of expert's composition of arbitrary extracting, Sub in ExpertHandle2It is from ExpertHandle 2 expert's compositions of arbitrary extractingIndividual combined result collection, SubNuIt is the Nu expert group of extraction from ExpertHandle Into only one combined result collection;
Step 5.4.3: Define the loop variable index1 for traversing Subset, where index1 is initialized to 1;
Step 5.4.4: When index1 ≤ Nu, perform step 5.4.5; otherwise perform step 5.4.14;
Step 5.4.5: Define the loop variable index2 for traversing Sub_index1, where Su_index1,index2 is the index2-th set taken from Sub_index1, and index2 is initialized to 1;
Step 5.4.6: When index2 ≤ Card(Sub_index1), perform step 5.4.7; otherwise perform step 5.4.13;
Step 5.4.7: Define the loop variable index3 for traversing Rel, and let {rel_index3: f_index3} be the index3-th set of Rel, where index3 is initialized to 1;
Step 5.4.8: When index3 ≤ M1, perform step 5.4.9; otherwise perform step 5.4.12;
Step 5.4.9: When Su_index1,index2 = rel_index3, perform step 5.4.10; otherwise perform step 5.4.11;
Step 5.4.10: SValue = SValue + f_index3 * index1, i.e. SValue is increased by the product of the frequency of the matched expert combination and the number of experts in that combination;
Step 5.4.11: index3 = index3 + 1, go to step 5.4.8;
Step 5.4.12: index2 = index2 + 1, go to step 5.4.6;
Step 5.4.13: index1 = index1 + 1, go to step 5.4.4;
Step 5.4.14: Obtain SValue.
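The loop in steps 5.4.1 to 5.4.14 can be sketched in Python as follows. This is only an illustrative sketch: the function name `combination_support`, the expert IDs, and the frequencies are hypothetical, and the linear scan over Rel (index3) is replaced by a dictionary lookup, which assumes each frequent itemset appears in Rel exactly once.

```python
from itertools import combinations


def combination_support(expert_handle, rel):
    """Return SValue: the sum of f * k over every non-empty subset of
    expert_handle that appears in rel (frozenset of experts -> frequency f)."""
    experts = sorted(expert_handle)
    s_value = 0                                   # step 5.4.1: SValue starts at 0
    for k in range(1, len(experts) + 1):          # index1: size of the subsets in Sub_k
        for subset in combinations(experts, k):   # index2: each set in Sub_k
            f = rel.get(frozenset(subset))        # stands in for the index3 scan of Rel
            if f is not None:
                s_value += f * k                  # step 5.4.10
    return s_value


# Hypothetical frequent itemsets of expert combinations.
rel = {
    frozenset({"E1"}): 5,
    frozenset({"E2"}): 4,
    frozenset({"E1", "E2"}): 3,
    frozenset({"E2", "E3"}): 2,
}
s = combination_support({"E1", "E2", "E3"}, rel)
print(s)  # 5*1 + 4*1 + 3*2 + 2*2 = 19
```

Larger frequent combinations contribute more because their frequency is weighted by the number of experts they contain.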
Here, the Pearson similarity method analyzes the data set obtained by preprocessing the item attributes; the FP-Growth method processes the review expert sets of historical projects to obtain the frequent itemsets of drawing review expert combinations; and the expert-combination compatibility method computes, from the frequent itemsets, the support of each expert combination, i.e. its compatibility.
The FP-Growth method was used to mine association rules from 65536 historical project review expert records, yielding the frequent itemsets of drawing review expert combinations. Data compression and preprocessing were applied to 20061 integrated project records, and the Pearson similarity method was used to extract the review experts of the ten projects closest in scale to the pending project, so that the extracted experts have already reviewed projects similar to the pending one. In practical application, the similarity between the combinations produced by the method and manually recommended expert combinations reached 82.13%, and the adoption rate reached 97.25%.

Claims (5)

1. A drawing review expert recommendation method based on Pearson similarity and FP-Growth, characterized by comprising the following steps:
Step 1: Normalize the item attributes of the pending project and of the integrated project record set; the pending project and the integrated projects are represented by integrated project type, the branch item types of that integrated project type, and item attributes;
Step 2: Apply the Pearson similarity method to the normalized data set to find the ten projects closest in scale to the pending project, and extract the review experts of those ten projects; each review expert is represented by the branch item types studied and by project review records;
Step 3: Combine the extracted experts according to the branch item types of the pending integrated project and the research directions of the drawing review experts, obtaining all alternative combination expert sets;
Step 4: Process the review expert sets of historical projects with the FP-Growth method to obtain the frequent itemsets of drawing review expert combinations;
Step 5: Using the frequent itemsets, compute the support of each alternative expert combination set with the adaptive expert-combination compatibility method; the combination set with the largest final support, i.e. the highest compatibility, is the expert set that reviews the pending project.
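To make the output of step 4 concrete, the sketch below counts how often each expert combination reviewed a historical project together. It is not FP-Growth: it enumerates subsets by brute force, which yields the same frequent itemsets on small inputs but scales exponentially with team size. The function name, the `min_count` threshold, and the sample history are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_expert_combinations(history, min_count=2):
    """Map each expert combination (frozenset) to the number of historical
    projects it reviewed together, keeping those seen at least min_count times."""
    counts = Counter()
    for reviewers in history:
        members = sorted(set(reviewers))
        for k in range(1, len(members) + 1):
            for combo in combinations(members, k):
                counts[frozenset(combo)] += 1
    return {combo: f for combo, f in counts.items() if f >= min_count}


# Hypothetical review history: one list of expert IDs per historical project.
history = [["E1", "E2"], ["E1", "E2", "E3"], ["E2", "E3"], ["E1"]]
frequent = frequent_expert_combinations(history, min_count=2)
```

The resulting dictionary has the shape {combination: frequency} that the compatibility computation in step 5 consumes.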
2. The drawing review expert recommendation method based on Pearson similarity and FP-Growth according to claim 1, characterized in that step 1 specifically comprises:
Step 1.1: Define the integrated project types, branch item types, and item attributes;
Step 1.2: Record the maximum and minimum value of each item attribute over the integrated project record set;
Step 1.3: Normalize the item attribute data of the integrated project record set and of the pending project, using the formula:
A_norm = (A - A_min) / (A_max - A_min)
where A_max and A_min are the maximum and minimum value of the given item attribute, A is the value before normalization, and A_norm is the value after normalization.
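A minimal Python sketch of this min-max normalization is shown below. The function names and sample values are hypothetical, and mapping a constant attribute (A_max = A_min) to 0.0 is an assumption, since the claim does not cover that case.

```python
def min_max_normalize(records):
    """Normalize each attribute column of records to [0, 1].
    Returns the normalized rows plus the per-attribute minima and maxima."""
    columns = list(zip(*records))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    rows = [normalize_row(row, lows, highs) for row in records]
    return rows, lows, highs


def normalize_row(row, lows, highs):
    """Apply A_norm = (A - A_min) / (A_max - A_min) attribute by attribute."""
    return [
        (a - lo) / (hi - lo) if hi != lo else 0.0  # assumption: constant column -> 0.0
        for a, lo, hi in zip(row, lows, highs)
    ]


# Hypothetical item attributes (e.g. floor area, storeys) of three recorded projects.
records = [[100, 2], [300, 4], [200, 10]]
normalized, lows, highs = min_max_normalize(records)
# The pending project is scaled with the same minima and maxima.
pending = normalize_row([150, 6], lows, highs)
```

Reusing the recorded minima and maxima for the pending project keeps its attributes on the same scale as the record set.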
3. The drawing review expert recommendation method based on Pearson similarity and FP-Growth according to claim 1, characterized in that step 2 specifically comprises:
Step 2.1: Define the drawing review expert data set and the reviewed project record set; the drawing review expert data is represented by expert number and the branch item types the expert studies, and the reviewed project record set is represented by project number and drawing review expert number;
Step 2.2: Group the experts in the reviewed project record set by project number to obtain the review expert set of each project;
Step 2.3: Calculate the similarity between the pending project and each project in the integrated project record set, using the formula:
$$\mathrm{sim}_i = \frac{\sum_{j=1}^{N}(X_j-\bar{X})(Y_{ij}-\bar{Y})}{\sqrt{\left(\sum_{j=1}^{N}(X_j-\bar{X})^2\right)\left(\sum_{j=1}^{N}(Y_{ij}-\bar{Y})^2\right)}}$$
where sim_i is the similarity between the pending project and the i-th project, X_j and Y_ij are the elements of the item attribute data sets of the pending project and of the i-th project respectively, and $\bar{X}$ and $\bar{Y}$ are the means of the item attribute data of the pending project and of the i-th project respectively;
Step 2.4: Sort the projects by similarity and extract the project numbers and corresponding review expert sets of the top ten projects, which gives the pre-selected drawing review expert set.
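Steps 2.3 and 2.4 can be sketched as follows, assuming the standard Pearson correlation coefficient over normalized attribute vectors. The function names, project IDs, and attribute values are hypothetical, and returning 0.0 for a zero denominator is an assumption.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length attribute vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = sqrt(sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y))
    return num / den if den else 0.0  # assumption: constant vector -> similarity 0


def most_similar(pending, projects, top=10):
    """Return the IDs of the `top` projects most similar to the pending one."""
    ranked = sorted(projects, key=lambda pid: pearson(pending, projects[pid]), reverse=True)
    return ranked[:top]


# Hypothetical normalized attributes of the pending project and three recorded ones.
pending = [0.1, 0.5, 0.9]
projects = {
    "P1": [0.2, 0.6, 1.0],
    "P2": [0.9, 0.5, 0.1],
    "P3": [0.1, 0.9, 0.5],
}
closest = most_similar(pending, projects, top=2)
```

The review expert sets of the returned project IDs would then be merged to form the pre-selected expert set.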
4. The drawing review expert recommendation method based on Pearson similarity and FP-Growth according to claim 1, characterized in that step 3 specifically comprises:
Step 3.1: Remove from the pre-selected drawing review expert set the experts who already have a review task;
Step 3.2: From the experts remaining after step 3.1, select the drawing review experts whose research branch item types match the branch item types of the pending project, with each expert represented by branch item type;
Step 3.3: If, in the expert set obtained in step 3.2, some branch type of the pending project has no expert, then for that branch type find, in the full drawing review expert data set, experts who review that branch item type and have no current task, and add them;
Step 3.4: From the expert set obtained in step 3.3, draw at least one expert for each branch item type, which gives all alternative combination expert sets.
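One possible reading of steps 3.1 to 3.4 is sketched below. It assumes each expert has exactly one branch item type and interprets "at least one expert for each branch item type" as every non-empty subset per branch, combined across branches; the function name, expert IDs, and branch names are hypothetical.

```python
from itertools import combinations, product


def candidate_sets(preselected, all_experts, busy, branch_types):
    """preselected / all_experts map expert ID -> branch item type.
    Returns every candidate expert set with at least one expert per branch type."""
    by_branch = {b: [] for b in branch_types}
    for expert, branch in preselected.items():          # steps 3.1 and 3.2
        if expert not in busy and branch in by_branch:
            by_branch[branch].append(expert)
    for branch, members in by_branch.items():           # step 3.3: fill empty branches
        if not members:
            members.extend(e for e, b in all_experts.items()
                           if b == branch and e not in busy)
    per_branch = [                                       # step 3.4: non-empty subsets
        [frozenset(c) for k in range(1, len(m) + 1) for c in combinations(sorted(m), k)]
        for m in by_branch.values()
    ]
    return [frozenset().union(*choice) for choice in product(*per_branch)]


# Hypothetical data: E3 is busy, and no pre-selected expert covers "plumbing".
all_experts = {"E1": "structure", "E2": "structure", "E3": "electrical",
               "E4": "plumbing", "E6": "electrical"}
preselected = {"E1": "structure", "E2": "structure", "E3": "electrical"}
candidates = candidate_sets(preselected, all_experts, busy={"E3"},
                            branch_types=["structure", "electrical", "plumbing"])
```

If some branch type still has no available expert after step 3.3, this sketch returns an empty list.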
5. The drawing review expert recommendation method based on Pearson similarity and FP-Growth according to claim 1, characterized in that step 5 specifically comprises:
Step 5.1: Taking one alternative combination expert set as an example, suppose it contains n experts. Drawing 1 expert from the set can be done in C(n, 1) ways, drawing 2 experts in C(n, 2) ways, and so on, up to drawing all n experts, which can be done in C(n, n) ways. All drawn results together form the set Subset, which contains C(n, 1) + C(n, 2) + ... + C(n, n) sets. Initialize the compatibility SValue of the alternative combination expert set to 0;
Step 5.2: Traverse Subset. If a drawn expert combination in Subset appears in the frequent itemsets of drawing review expert combinations, add to the compatibility of the alternative combination expert set from step 5.1 the product of that combination's frequency in the frequent itemsets and the number of experts in the combination, i.e.:
SValue = SValue + f * k
where SValue is the compatibility of the alternative combination expert set, f is the frequency of the drawn expert combination in the frequent itemsets, and k is the number of experts in the drawn combination. When the traversal ends, the final compatibility of the alternative combination expert set from step 5.1 is obtained;
Step 5.3: Compute the compatibility of every alternative combination expert set by the method of steps 5.1 and 5.2; the set with the highest final compatibility is the expert set that reviews the pending project.
CN201710034169.2A 2017-01-18 2017-01-18 Picture examination expert recommendation method based on Pearson similarity and FP-Growth Active CN106897370B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710034169.2A CN106897370B (en) 2017-01-18 2017-01-18 Picture examination expert recommendation method based on Pearson similarity and FP-Growth

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710034169.2A CN106897370B (en) 2017-01-18 2017-01-18 Picture examination expert recommendation method based on Pearson similarity and FP-Growth

Publications (2)

Publication Number Publication Date
CN106897370A true CN106897370A (en) 2017-06-27
CN106897370B CN106897370B (en) 2020-08-11

Family

ID=59197902

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710034169.2A Active CN106897370B (en) 2017-01-18 2017-01-18 Picture examination expert recommendation method based on Pearson similarity and FP-Growth

Country Status (1)

Country Link
CN (1) CN106897370B (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110184926A1 (en) * 2010-01-26 2011-07-28 National Taiwan University Of Science & Technology Expert list recommendation methods and systems
CN106066873A (en) * 2016-05-30 2016-11-02 哈尔滨工程大学 A kind of travel information based on body recommends method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
XIANG LIU ET AL.: "A robust model for paper reviewer assignment", 《RECSYS '14: PROCEEDINGS OF THE 8TH ACM CONFERENCE ON RECOMMENDER SYSTEMS》 *
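YU FENG (余峰): "Research on recommendation methods for project review experts", 《China Master's Theses Full-text Database, Information Science and Technology》 *
FU YANFANG (傅妍芳) ET AL.: "Research on a KMP optimization method for solving the expert assignment problem", 《Journal of Xi'an Technological University》 *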

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107807978A (en) * 2017-10-26 2018-03-16 北京航空航天大学 A kind of code review person based on collaborative filtering recommends method
CN108984863A (en) * 2018-06-27 2018-12-11 淮阴工学院 A kind of layout design efficiency evaluation method based on direction distance with super-efficiency model
CN109062961A (en) * 2018-06-27 2018-12-21 淮阴工学院 A kind of expert's combination recommended method of knowledge based map
CN108984863B (en) * 2018-06-27 2023-07-25 淮阴工学院 Drawing design efficiency evaluation method based on direction distance and super efficiency model
CN110162638B (en) * 2019-04-12 2023-06-20 淮阴工学院 Expert combination recommendation method based on graph vectors
CN110162638A (en) * 2019-04-12 2019-08-23 淮阴工学院 A kind of expert's combination proposed algorithm based on figure vector
CN110442038B (en) * 2019-07-25 2022-05-17 南京邮电大学 Thermal power generating unit operation optimization target value determination method based on FP-Growth algorithm
CN110442038A (en) * 2019-07-25 2019-11-12 南京邮电大学 Method is determined based on the thermal power unit operation optimization target values of FP-Growth algorithm
CN112100394A (en) * 2020-08-10 2020-12-18 淮阴工学院 Knowledge graph construction method for recommending medical experts
CN112100370A (en) * 2020-08-10 2020-12-18 淮阴工学院 Picture examination expert combined recommendation method based on text convolution and similarity algorithm
CN112100394B (en) * 2020-08-10 2023-07-21 淮阴工学院 Knowledge graph construction method for recommending medical expert
CN112100370B (en) * 2020-08-10 2023-07-25 淮阴工学院 Picture-trial expert combination recommendation method based on text volume and similarity algorithm
CN112199939A (en) * 2020-11-12 2021-01-08 深圳供电局有限公司 Intelligent recommendation method for evaluation experts and storage medium
CN112199939B (en) * 2020-11-12 2024-02-20 深圳供电局有限公司 Intelligent recommendation method and storage medium for review experts

Also Published As

Publication number Publication date
CN106897370B (en) 2020-08-11

Similar Documents

Publication Publication Date Title
CN106897370A (en) A kind of figure based on Pearson came similarity and FP Growth examines expert recommendation method
CN103778227B (en) The method screening useful image from retrieval image
CN104199832B (en) Banking network based on comentropy transaction community discovery method extremely
CN102033883B (en) A kind of method, Apparatus and system improving data transmission speed of website
CN103164540B (en) A kind of patent hotspot finds and trend analysis
CN102750286B (en) A kind of Novel decision tree classifier method processing missing data
CN103235812B (en) Method and system for identifying multiple query intents
CN109062961A (en) A kind of expert's combination recommended method of knowledge based map
CN106407349A (en) Product recommendation method and device
CN106156090A (en) A kind of designing for manufacturing knowledge personalized push method of knowledge based collection of illustrative plates (Man-tree)
CN104463601A (en) Method for detecting users who score maliciously in online social media system
CN106886872A (en) Method is recommended in a kind of logistics based on cluster and cosine similarity
CN110162638A (en) A kind of expert's combination proposed algorithm based on figure vector
CN105023178A (en) Main body-based electronic commercere commendation method
CN114399251A (en) Cold-chain logistics recommendation method and device based on semantic network and cluster preference
Molnar et al. China’s outward direct investment and its impact on the domestic economy
CN101697174A (en) Automatic simplifying and evaluating method of part model facing to steady-state thermal analysis
CN102722578A (en) Unsupervised cluster characteristic selection method based on Laplace regularization
CN108536825A (en) A method of whether identification source of houses data repeat
Kara et al. An integrated methodology to estimate the external environmental costs of products
CN100545840C (en) A kind of punching part sample researching method
CN101859328B (en) Exploitation method of remote sensing image association rule based on artificial immune network
CN116578569B (en) Satellite space-time track data association analysis method
CN110489665B (en) Microblog personalized recommendation method based on scene modeling and convolutional neural network
Prasad et al. Frequent pattern mining and current state of the art

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20170627

Assignee: Suqian Jiutian Information Technology Co.,Ltd.

Assignor: HUAIYIN INSTITUTE OF TECHNOLOGY

Contract record no.: X2021980010528

Denomination of invention: A drawing review expert recommendation method based on Pearson similarity and FP growth

Granted publication date: 20200811

License type: Common License

Record date: 20211011