CN108595595A

CN108595595A - Method for acquiring user knowledge requirements based on interactive differential evolution

Info

Publication number: CN108595595A
Application number: CN201810355129.2A
Authority: CN
Inventors: 郝佳; 杨念; 王国新; 阎艳; 徐灵艳; 丁少兵; 王宏伟; 王婧
Original assignee: Beijing Institute of Technology BIT
Current assignee: Beijing Institute of Technology BIT
Priority date: 2018-04-19
Filing date: 2018-04-19
Publication date: 2018-09-28
Anticipated expiration: 2038-04-19
Also published as: CN108595595B

Abstract

The present invention discloses a method for acquiring user knowledge requirements based on interactive differential evolution. Knowledge documents are converted to vectors to build a text-vector knowledge base. The existing documents are divided into N classes using the K-means method, and one document is drawn at random from each class, yielding N initial-population documents that are pushed to the user. Differential evolution operators are then applied to the initial-population documents; in the selection operation, the fitness value of each document is determined by the user's click behavior. The resulting offspring are pushed to the user again, the fitness function is again determined from the user's behavior, and the evolutionary operators are applied again. Once user satisfaction is reached, the text data the user has clicked can be used to train the user's knowledge requirement model and fit the user's knowledge requirements. The invention can solve the user cold-start and item cold-start problems simultaneously in the absence of a user-knowledge document-rating data set.

Description

A method for acquiring user knowledge requirements based on interactive differential evolution

Technical field

The invention belongs to the field of knowledge services, and specifically relates to a method for acquiring user knowledge requirements based on interactive differential evolution.

Background art

When an enterprise holds a large volume of knowledge and good knowledge resources, how to use a push system to deliver knowledge proactively to designers, and thereby improve production efficiency, is a current focus of knowledge-intensive enterprises. A knowledge push system pushes knowledge to product designers according to the user's tasks and behavior information. How to push suitable knowledge documents to new users, and how to push new knowledge documents (i.e. new items) to the people who need them, that is, the user/item (knowledge document) cold-start problem, is therefore a key issue in the development of knowledge push systems. Most existing cold-start solutions require known user rating data and can solve only one of the user and item cold-start problems. For the cold start of a knowledge push system in the product design field, where user-knowledge document-rating data are missing, existing cold-start solutions are not applicable.

Summary of the invention

In view of this, the present invention provides a method for acquiring user knowledge requirements based on interactive differential evolution, which can solve the user cold-start and item cold-start problems simultaneously when user-knowledge document-rating data are missing.

The technical solution of the present invention is as follows:

A method for acquiring user knowledge requirements based on interactive differential evolution, comprising the following steps:

Step 1: Convert the knowledge documents to vectors and build a vector database.

Step 2: Cluster the vectors in the vector database into N classes and randomly draw one vector from each class to obtain N population vectors, denoted {x_i(g)}, where x_i(g) is the i-th vector of generation g in the population, i = 1, 2, ..., N; g = 0, 1, 2, ....

Step 3: Push the knowledge documents corresponding to the population vectors to the user, with the click-through rate as the user satisfaction index. If the user's click-through rate on the pushed knowledge documents is not less than 80%, execute Step 6; if it is less than 80%, execute Steps 4 and 5.

Step 4: Apply mutation and then crossover to each of the N population vectors of generation g to obtain N new vectors {u_i(g+1)}.

Using a greedy algorithm, the next-generation population individuals are selected from the obtained {u_i(g+1)} and {x_i(g)} according to their fitness values, where f(u_i(g+1)) denotes the fitness value of individual u_i(g+1).

Step 5: Using the cosine similarity algorithm, compute in turn the similarity between each of the N population vectors of generation g+1 and the vectors in the vector database, select the knowledge documents corresponding to the N most similar vectors, increment the interaction count by 1, and execute Step 3.

Step 6: Output the current N population vectors as the user knowledge requirement model.
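Step 5 above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the text does not say how the "N most similar vectors" are chosen when several population vectors share a nearest neighbor, so the sketch assumes one document is picked per population vector and a document already chosen is skipped. The names `cosine_similarity` and `nearest_documents`, and the dictionary layout of the database, are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def nearest_documents(population, database):
    """Map each population vector to its most similar, not yet chosen document.

    database is assumed to be a dict {document_id: vector} holding at least
    as many documents as there are population vectors.
    """
    chosen = []
    for vector in population:
        remaining = [doc_id for doc_id in database if doc_id not in chosen]
        best = max(remaining, key=lambda d: cosine_similarity(vector, database[d]))
        chosen.append(best)
    return chosen

database = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
print(nearest_documents([[2.0, 0.1], [0.1, 3.0]], database))
```

Evolved vectors generally do not coincide with any stored document vector, which is why a nearest-neighbor lookup of this kind is needed before anything can be shown to the user.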

Further, the knowledge documents include PDF documents, Word documents and TXT documents.

Further, in Step 1, the knowledge documents are converted to vectors using the doc2vec (document-to-vector conversion) technique.

Further, in Step 2, the vectors are clustered into N classes using the K-means method.

Beneficial effects:

The present invention combines an evolutionary algorithm with user interaction for the first time to acquire user knowledge requirements. The fitness function is determined from the user's click behavior and the evolutionary operators are then applied; once user satisfaction is reached, the text data the user has clicked can be used to train the user's knowledge requirement model and fit the user's knowledge requirements. The invention thereby solves the user cold-start and item cold-start problems simultaneously when user-knowledge document-rating data are missing.

Brief description of the drawings

Fig. 1 is a flow chart of the method of the present invention.

Detailed description of the embodiments

The present invention is described in detail below with reference to the accompanying drawing and embodiments.

As shown in Fig. 1, the present invention provides a method for acquiring user knowledge requirements based on interactive differential evolution, comprising the following steps:

Step 1: Preprocess the knowledge documents using natural language processing techniques. The knowledge documents include PDF documents, Word documents and TXT documents, and the preprocessing includes word segmentation and stop-word removal. The preprocessed knowledge documents are then converted to vectors using the doc2vec (document-to-vector conversion) technique, and a knowledge document vector database is built. This database is the underlying data on which processing is performed, while the knowledge documents are the data presented to the user.
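The preprocessing part of Step 1 can be sketched as follows. This is a minimal illustration that assumes English text and a tiny hand-written stop-word list; `STOP_WORDS` and `preprocess` are hypothetical names. A real system would use a proper word segmenter (in particular for Chinese text) and would then pass the resulting tokens to a doc2vec implementation, which is not shown here.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for illustration).
STOP_WORDS = {"a", "an", "the", "of", "and", "is", "to", "in"}

def preprocess(text):
    """Split text into lowercase word tokens and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]

print(preprocess("The design of a gear box is documented in the manual."))
```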

Step 2: Determine the number N of items the user browses at a time, and set the initial value of the interaction count, actNum = 1. The number of browsed items N is the number of knowledge documents shown in the interface presented to the user. An interaction is defined as follows: each time the algorithm generates a recommendation list, the knowledge documents are pushed to the user; the user makes a series of clicks on the recommended knowledge documents; and the differential evolution algorithm processes the knowledge documents the user clicked and generates a new recommendation list. One such cycle of algorithm push and user click is one interaction. The vectors in the vector database are clustered into N classes using the K-means method, and one vector is drawn at random from each class to obtain N population vectors. Each vector is an individual that takes part in the subsequent evolutionary computation. The population is denoted {x_i(g)}, where x_i(g) is the i-th vector of generation g in the population, i = 1, 2, ..., N; g = 0, 1, 2, ....
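The population initialization in Step 2 can be sketched as follows. The sketch uses a plain Lloyd-style K-means written in pure Python so that it is self-contained; the function names, the fixed iteration count and the seed handling are assumptions for illustration. The population is returned as indices into the vector database so that each individual can be traced back to its knowledge document.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_labels(vectors, n_clusters, rng, n_iter=20):
    """Plain Lloyd's K-means; returns one cluster label per vector."""
    centroids = [list(v) for v in rng.sample(vectors, n_clusters)]
    labels = [0] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: each vector joins its nearest centroid.
        labels = [
            min(range(n_clusters), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(n_clusters):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def initial_population(vectors, n_clusters, seed=0):
    """Draw one vector index at random from every non-empty cluster."""
    rng = random.Random(seed)
    labels = kmeans_labels(vectors, n_clusters, rng)
    population = []
    for c in range(n_clusters):
        members = [i for i, label in enumerate(labels) if label == c]
        if members:  # an empty cluster contributes no individual
            population.append(rng.choice(members))
    return population

vectors = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0], [20.0, 0.0], [20.0, 1.0]]
print(initial_population(vectors, n_clusters=3, seed=1))
```

Drawing one individual per cluster spreads the first recommendation list across the topics present in the knowledge base, which matters because nothing is yet known about a new user.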

Step 3: Push the knowledge documents corresponding to the population vectors to the user, with the click-through rate (CTR) as the user satisfaction index. If the user's click-through rate on the pushed knowledge documents is not less than 80%, execute Step 6; if it is less than 80%, execute Steps 4 and 5.
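The stopping test in Step 3 can be sketched as follows. The text does not define the click-through rate formally; the sketch assumes it is the fraction of pushed documents that the user clicked at least once. The function names are hypothetical.

```python
def click_through_rate(clicked_ids, pushed_ids):
    """Fraction of the pushed documents that were clicked (assumed definition)."""
    pushed = set(pushed_ids)
    if not pushed:
        raise ValueError("no documents were pushed")
    return len(pushed & set(clicked_ids)) / len(pushed)

def is_satisfied(clicked_ids, pushed_ids, threshold=0.8):
    """True when the click-through rate is not less than the threshold (80% in the text)."""
    return click_through_rate(clicked_ids, pushed_ids) >= threshold

pushed = ["d1", "d2", "d3", "d4", "d5"]
print(is_satisfied(["d1", "d2", "d3", "d4"], pushed))  # 4 of 5 clicked
print(is_satisfied(["d1", "d2", "d3"], pushed))        # 3 of 5 clicked
```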

Step 4: Perform the differential evolution operator computation. The specific steps are as follows.

A mutation operation is applied in turn to each of the N population vectors of generation g to obtain the intermediates {v_i(g+1)}:

v_i(g+1) = x_r1(g) + F·(x_r2(g) - x_r3(g)), i ≠ r1 ≠ r2 ≠ r3

where x_r1(g) is the vector to be mutated in generation g, r1 = 1, 2, ..., N; x_r2(g) and x_r3(g) are two different vectors in the generation-g population, r2 = 1, 2, ..., N, r3 = 1, 2, ..., N, r1 ≠ r2 ≠ r3; and F is the scaling factor, a constant.
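The mutation formula above can be sketched as follows. The sketch reads the condition i ≠ r1 ≠ r2 ≠ r3 as "all four indices are mutually distinct", which requires a population of at least four vectors; the function name `mutate` and the use of a seeded random generator are assumptions for illustration.

```python
import random

def mutate(population, i, F, rng):
    """Return v_i = x_r1 + F * (x_r2 - x_r3) with r1, r2, r3 distinct and different from i."""
    candidates = [k for k in range(len(population)) if k != i]
    r1, r2, r3 = rng.sample(candidates, 3)  # needs len(population) >= 4
    x1, x2, x3 = population[r1], population[r2], population[r3]
    return [a + F * (b - c) for a, b, c in zip(x1, x2, x3)]

population = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(mutate(population, i=0, F=0.5, rng=random.Random(0)))
```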

A crossover operation is then performed between each vector in the population {x_i(g)} and the corresponding intermediate {v_i(g+1)} obtained by mutation, yielding N new vectors {u_i(g+1)}. Each element u_j,i(g+1) of u_i(g+1) is obtained according to the following formula, j = 1, 2, ..., D, where D is the total number of vector elements:

u_j,i(g+1) = v_j,i(g+1), if rand(0,1) ≤ CR
u_j,i(g+1) = x_j,i(g), otherwise

where CR is the crossover probability, a preset value; v_j,i(g+1) is the j-th element of the intermediate v_i(g+1); and x_j,i(g) is the j-th element of x_i(g). If the random number rand(0,1) is less than or equal to the crossover probability CR, the j-th element of the i-th vector of generation g+1 takes the j-th element of the intermediate; otherwise it takes the j-th element of the i-th vector of generation g.
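The element-wise crossover rule described above can be sketched as follows; the function name `crossover` and the seeded generator are assumptions for illustration.

```python
import random

def crossover(x, v, CR, rng):
    """Take element j from the intermediate v when rand(0,1) <= CR, else from x."""
    return [vj if rng.random() <= CR else xj for xj, vj in zip(x, v)]

x = [1.0, 2.0, 3.0]
v = [10.0, 20.0, 30.0]
print(crossover(x, v, CR=0.5, rng=random.Random(0)))
```

Classical differential evolution usually also forces at least one randomly chosen element to come from the intermediate, so that the trial vector never equals its parent. The text does not mention that refinement, so the sketch omits it.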

A greedy selection is then made between {u_i(g+1)} and {x_i(g)} to obtain the next-generation population individuals, where f(u_i(g+1)) denotes the fitness value of individual u_i(g+1).

The fitness function f(i) links the iterative process of the evolutionary algorithm to the user's behavior information. The user clicks on documents in the presented recommendation list; a document the user clicks has fitness value 1, and a document that is not clicked has fitness value 0.
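The click-based fitness and the greedy selection can be sketched as follows. Several points are assumptions, because the selection formula itself is not reproduced in the text: a higher fitness is treated as better, ties are resolved in favor of the new vector, and both fitness lists are simply passed in, since the text does not specify how the fitness of a new vector is obtained before its document has been shown to the user. The function names are hypothetical.

```python
def fitness(doc_id, clicked_ids):
    """1 if the user clicked the document, 0 otherwise (as stated in the text)."""
    return 1 if doc_id in clicked_ids else 0

def greedy_select(parents, trials, parent_fitness, trial_fitness):
    """Keep, position by position, whichever of parent and trial has the higher fitness.

    Assumption: ties go to the trial vector.
    """
    return [
        u if fu >= fx else x
        for x, u, fx, fu in zip(parents, trials, parent_fitness, trial_fitness)
    ]

clicked = {"d1"}
parent_fitness = [fitness("d1", clicked), fitness("d2", clicked)]  # [1, 0]
print(greedy_select([[0.0], [1.0]], [[2.0], [3.0]], parent_fitness, [0, 0]))
```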

Step 6: Output the current N population vectors as the user knowledge requirement model, and output the current interaction count.

Steps 1 through 6 above constitute the detailed process of acquiring user knowledge requirements, from which it can be determined which knowledge documents the user is interested in.

In the present invention, the population size N, the scaling factor F and the crossover probability CR can be tested on a real data set; the parameter combination (N, F, CR) that reaches user satisfaction in the fewest interactions is taken as the optimal setting.
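One simple way to carry out such a test is an exhaustive grid search, sketched below. The text does not prescribe a search procedure, so the grid search is an assumption, and `count_interactions` is a hypothetical callable that would run the method on a real data set and return the number of interactions needed to reach user satisfaction. A toy stand-in is used here only so that the example runs.

```python
import itertools

def best_parameters(n_values, f_values, cr_values, count_interactions):
    """Return the (N, F, CR) combination with the fewest interactions."""
    grid = itertools.product(n_values, f_values, cr_values)
    return min(grid, key=lambda params: count_interactions(*params))

def toy_count(N, F, CR):
    # Toy stand-in for a real experiment (not measured data).
    return abs(N - 10) + 10 * abs(F - 0.5) + 10 * abs(CR - 0.9)

print(best_parameters([5, 10], [0.5, 0.8], [0.3, 0.9], toy_count))
```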

When a new user enters the system, this method can quickly acquire the user's knowledge requirements and train the user's knowledge requirement model, so that suitable knowledge is pushed to the new user in subsequent recommendation, solving the new-user cold-start problem. When a new item (i.e. a new knowledge document) enters the system, the trained user knowledge requirement models can be used to judge which users need it, and the new knowledge is pushed to suitable users, solving the new-item cold-start problem.
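The new-item case can be sketched as follows. The text does not say how a trained requirement model scores a new document; the sketch assumes each user's model is the list of population vectors output in Step 6 and that a document is relevant to a user when its cosine similarity to at least one of those vectors reaches a cutoff. The cutoff `min_similarity` and all function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def users_needing_document(doc_vector, user_models, min_similarity=0.7):
    """Return the users whose model contains a vector close enough to the document.

    user_models is assumed to be a dict {user_id: [population vectors]};
    min_similarity is an illustrative cutoff, not a value from the text.
    """
    return [
        user
        for user, model in user_models.items()
        if model and max(cosine_similarity(doc_vector, v) for v in model) >= min_similarity
    ]

models = {"u1": [[1.0, 0.0]], "u2": [[0.0, 1.0]]}
print(users_needing_document([1.0, 0.1], models))
```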

In conclusion the above is merely preferred embodiments of the present invention, being not intended to limit the scope of the present invention. All within the spirits and principles of the present invention, any modification, equivalent replacement, improvement and so on should be included in the present invention's Within protection domain.

Claims

1. A method for acquiring user knowledge requirements based on interactive differential evolution, characterized by comprising the following steps:

Step 2: Cluster the vectors in the vector database into N classes and randomly draw one vector from each class to obtain N population vectors, denoted {x_i(g)}, where x_i(g) is the i-th vector of generation g in the population, i = 1, 2, ..., N; g = 0, 1, 2, ...;

Step 3: Push the knowledge documents corresponding to the population vectors to the user, with the click-through rate as the user satisfaction index; if the user's click-through rate on the pushed knowledge documents is not less than 80%, execute Step 6; if it is less than 80%, execute Steps 4 and 5;

Step 4: Apply mutation and then crossover to each of the N population vectors of generation g to obtain N new vectors {u_i(g+1)};

where f(u_i(g+1)) denotes the fitness value of individual u_i(g+1);

Step 5: Using the cosine similarity algorithm, compute in turn the similarity between each of the N population vectors of generation g+1 and the vectors in the vector database, select the knowledge documents corresponding to the N most similar vectors, increment the interaction count by 1, and execute Step 3;

2. The method for acquiring user knowledge requirements based on interactive differential evolution according to claim 1, characterized in that the knowledge documents include PDF documents, Word documents and TXT documents.

3. The method for acquiring user knowledge requirements based on interactive differential evolution according to claim 1, characterized in that, in Step 1, the knowledge documents are converted to vectors using the doc2vec technique.

4. The method for acquiring user knowledge requirements based on interactive differential evolution according to claim 1, characterized in that, in Step 2, the vectors are clustered into N classes using the K-means method.