CN111581368A

CN111581368A - User portrait construction method for intelligent expert recommendation based on a convolutional neural network

Info

Publication number: CN111581368A
Application number: CN201910121716.XA
Authority: CN
Inventors: 曹聪; 张路; 刘燕兵; 曹亚男; 谭建龙; 郭莉
Original assignee: Institute of Information Engineering of CAS
Current assignee: Institute of Information Engineering of CAS
Priority date: 2019-02-19
Filing date: 2019-02-19
Publication date: 2020-08-25

Abstract

The invention discloses a convolutional neural network-based user portrait construction method for intelligent expert recommendation. The method comprises the following steps: 1) acquiring an expert information data set using selected personal information of experts; 2) processing each piece of expert information in the data set into a sentence formed by a word sequence; 3) representing the expert information processed in step 2) as text using word vectors; 4) training a convolutional neural network using the word vectors corresponding to the expert information; 5) generating word vectors for an expert whose portrait is to be constructed from that expert's text information, and classifying the word vectors with the trained convolutional neural network to generate the expert's user portrait. The invention constructs user portraits with high accuracy.

Description

User portrait construction method for intelligent expert recommendation based on a convolutional neural network

Technical Field

The invention belongs to the field of text information processing and recognition, and particularly relates to a convolutional neural network-based user portrait construction method for intelligent expert recommendation.

Background

User portraits (personas) arise from the practical needs of enterprise development and operation, and are tagged user models abstracted from users' personal information. In practice, product development and operation often do not go smoothly for reasons such as unclear positioning of target users and unclear user requirements. User portrait techniques emerged to solve this problem. They abstract user information, delineate user attributes, extract highly refined features from user information, and "tag" users based on these features. User portraits offer many advantages in positioning target users, reducing disagreement among developers, improving product development efficiency, and so on. User portrait techniques have therefore attracted extensive attention in industry and academia.

As the name suggests, the target population of user portraits for intelligent expert recommendation is experts. As scientific research in China receives increasing attention, there are countless research projects and countless experts in related fields applying for them. How to automatically measure the research capability of experts and intelligently recommend experts to carry out projects has therefore become an urgent problem. In intelligent expert recommendation, classifying the academic portraits of a large and varied population of experts is an indispensable step.

Existing user portrait techniques fall into two main categories: qualitative description and classification-based methods. The former usually obtains qualitative characteristics of users through questionnaires, telephone interviews, and the like. For example, the technician first identifies the target group for which a user portrait is to be created and lists relevant questions that can delineate the portrait of that group, such as age, gender, and hobbies, and then collects and records the group's answers by questionnaire, interview, and so on. Finally, based on these answers, invalid and redundant information is eliminated, highly refined features are extracted, the users are tagged, and the user portrait of the target group is established.

The latter usually applies classic machine learning classification algorithms, such as k-nearest neighbors, naive Bayes, and support vector machines, to the user portrait problem. First, a large amount of user data, such as text and images, is collected. Such data often contains rich semantic information, so the user portrait problem is naturally transformed into a text analysis or image analysis problem. Then a machine learning classification algorithm is implemented; these algorithms have achieved satisfactory results in many areas such as text mining and image analysis. Using one algorithm or a combination of several, a model is built and trained on the collected user data. Finally, the trained model extracts features from the user data and classifies the samples, generating "tags" with which the users are labeled, and the user portrait is thereby sketched.

In the big data era, data keeps growing in scale and data structures are increasingly complex. Among existing user portrait techniques, qualitative description can draw the user portrait of a target group, but it suffers from low efficiency, low portrait accuracy, and a lack of persuasiveness, and is increasingly unable to meet practical requirements. Classification-based methods such as k-nearest neighbors, naive Bayes, and support vector machines have a solid theoretical basis and a certain persuasiveness, but they still suffer from high cost and low accuracy because features must be extracted manually. In addition, owing to external constraints such as privacy and security, data on experts in Chinese universities and research institutions is lacking, which hinders the construction of expert academic portraits.

Disclosure of Invention

The application provides a convolutional neural network-based user portrait construction method for intelligent expert recommendation. The basic idea is as follows: expert information (in text form) is obtained through open data acquisition techniques and a large-scale expert data set is constructed automatically; a convolutional neural network, a technique that has performed excellently in recent years in many fields such as text and images, then automatically extracts highly refined features from the expert information by way of text classification, outputs the sample categories, generates user tags, and sketches the user portraits of experts.

The present application takes information about an expert (in text form) as input and, through a series of data processing operations, outputs the category of the expert, i.e. the "tag", to outline the expert's user portrait. It thereby overcomes defects such as low efficiency and low portrait accuracy, and yields better results than existing methods. The algorithm flow chart of the present invention is shown in fig. 1.

A convolutional neural network-based user portrait construction method for intelligent expert recommendation comprises the following steps:

1) acquiring an expert information data set by using the selected expert personal information;

2) processing each piece of expert information in the expert information data set into a sentence constructed by a word sequence;

3) performing text representation on the expert information processed in the step 2) by using word vectors;

4) training a convolutional neural network by using a word vector corresponding to the expert information;

5) generating word vectors for an expert whose portrait is to be constructed from that expert's text information, and classifying the word vectors with the trained convolutional neural network to generate the expert's user portrait.

Further, the convolutional neural network comprises a convolutional layer, a pooling layer and a fully connected layer connected in sequence; the convolutional layer performs a convolution operation on the input word vectors and extracts features for classification from them; the pooling layer samples the features obtained by the convolutional layer; the fully connected layer obtains the category of the corresponding expert from the sampled values, i.e. constructs the user portrait of the expert; cross entropy is used as the loss function of the convolutional neural network.

Further, the pooling layer pools the features obtained by the convolutional layer with a plurality of different window sizes, and the pooling results are concatenated.

Further, the pooling layer samples each window by max pooling.

Further, the operation formula of the convolutional layer is $c_i = f(w_{in} \cdot x_{i:i+h-1} + b_{in})$, wherein the function $f$ represents an activation function, $c_i \in \mathbb{R}^{n-h+1}$ represents the convolution result, $w_{in} \in \mathbb{R}^{hk}$ is the convolution kernel of the convolutional layer, $x_{i:i+h-1} \in \mathbb{R}^{k}$ represents rows $i$ to $i+h-1$ of the word vectors, $b_{in} \in \mathbb{R}^{n-h+1}$ is a bias parameter, $\mathbb{R}^{n-h+1}$ denotes the $(n-h+1)$-dimensional real space, $\mathbb{R}^{hk}$ denotes the $hk$-dimensional real space, $\mathbb{R}^{k}$ denotes the $k$-dimensional real space, $n$ is the text length of the processed expert information, $h$ is the convolution kernel size, and $k$ is the word vector dimension.

Further, the operation formula of the fully connected layer is $y = w_f x$, where $x$ is the output of the pooling layer, $w_f$ is the weight of the edges between the pooling layer and the output layer, and $y$ is the output of the fully connected layer; during training of the convolutional neural network, the fully connected layer adopts a dropout strategy, in which a number of nodes are randomly selected with probability $p$ in each iteration and excluded from the computation; after the fully connected layer output $y$ is obtained, the softmax function is applied to it, the category with the largest softmax value is taken as the category of the corresponding expert, and the user portrait of the expert is thereby constructed.

Further, in step 2), removing punctuation marks and invisible characters, Chinese word segmentation, stop word removal and low-frequency word removal are sequentially carried out on each piece of expert information, and the text information of the expert is processed into a sentence constructed by a word sequence.

Further, if the text information of an expert is longer than the set maximum text length, it is truncated so that its length equals the maximum text length; if it is shorter than the set maximum text length, it is padded so that its length equals the maximum text length.

Further, the convolutional neural network is trained by adopting an Adam gradient descent method.

Further, the selected personal information of experts is used to obtain, and incrementally update, the encyclopedia entries of experts from universities and scientific research institutions, thereby generating the expert information data set.

The convolutional neural network-based user portrait construction method for intelligent expert recommendation needs expert data (in text form) from which to extract highly refined features, so that experts can be tagged according to the extracted features and their academic portraits sketched. However, no open data set on experts in Chinese universities and research institutes is currently available. To solve this problem, the present application constructs a large-scale data set on its own.

Real data sets often contain a large amount of "noisy" data, which interferes with the construction of expert user portraits. The present application therefore preprocesses the raw data to remove the noise, making it easier to extract refined, non-redundant features.

Because the expert information is stored as text, it must be converted to numerical form before it can be used as input for training the convolutional neural network. Therefore, after the raw expert data set has been preprocessed, the expert information is represented with word vectors to improve the results. Once the word vectors corresponding to the expert information are obtained, a convolutional neural network is constructed and a model is trained on the data set to sketch the user portraits of experts; after training, the model is scored on the test set to check its effect.

Compared with the prior user portrait technical scheme, the application proposal has the following technical advantages:

1. the user portrait is established with a convolutional neural network, in which the convolution operation shares parameters and the pooling operation reduces the number of parameters, helping to avoid overfitting; this overcomes defects such as low efficiency and low portrait accuracy, and no manual feature extraction is needed; the requirements on the data are loose, since only text is needed, so the method is highly general;

2. the data preprocessing operations, the method for constructing word vectors, and the convolutional neural network structure adopted in the present application are simple, easy to implement, and easy to use;

3. the present application overcomes the low efficiency, low portrait accuracy, and lack of persuasiveness of existing user portrait techniques, delineates the user portrait quantitatively, and has high accuracy and a solid theoretical foundation.

Drawings

FIG. 1 is a flow chart of the method of the present invention;

FIG. 2 is a schematic diagram of data preprocessing;

FIG. 3 is a schematic diagram of a word vector space;

FIG. 4 is a diagram of a convolutional neural network architecture;

FIG. 5 is a schematic diagram of a convolution operation;

FIG. 6 is a schematic diagram of nonlinear classification;

FIG. 7 is a schematic view of pooling;

FIG. 8 is a schematic view of a fully connected layer;

FIG. 9 is a schematic drawing of dropout;

FIG. 10 is a schematic diagram of gradient descent.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and examples.

As shown in fig. 1, the expert classification prefiltering algorithm mainly includes five key processes: data acquisition, data preprocessing, word vector space construction, model training and prediction. In the following, a specific embodiment of this algorithm will be described by way of elaborating the above five key processes, respectively.

Process one: data acquisition

As mentioned above, the present application delineates user portraits for experts. Information about each expert therefore needs to be gathered so that features can be extracted to "tag" the expert and delineate the user portrait. Note that the data for each expert should ultimately be represented as a piece of text, such as a textual introduction of the expert; beyond this, the present application places no additional requirements on the expert information, which gives it a certain generality.

However, owing to external constraints such as security and privacy, no expert data set has been published in China or abroad, which has hindered research on expert user portraits to some extent. The present application therefore uses part of the existing personal information of experts as keywords to query Baidu, generates a data set from the query results, and incrementally updates the Baidu Baike entry information of experts from domestic universities and scientific research institutions. In the end, a large-scale data set containing information on up to four million domestic experts is constructed. This is, to date, the first reliable data set (corpus) of information on domestic experts. It can be used for research on expert user portraits and in other research fields, and is original, large-scale, highly accurate, automatically built, and incrementally updated.

Process two: data preprocessing

Real data often contains a lot of redundant information, missing values and noise, and may contain outliers caused by human error. In addition, because the data set used in the present application is text, it is unstructured and has no separators between words, both of which make feature extraction harder. Data preprocessing is therefore an essential step in the expert classification prefiltering algorithm proposed in the present application.

Common data preprocessing operations include numerical normalization, data structuring, redundancy removal, and the like. In the present application, the raw data set undergoes preprocessing operations such as removal of punctuation and invisible characters, Chinese word segmentation, and stop word removal. In the end, the text information of each expert is processed into a sentence formed by a sequence of words, as shown in fig. 2.
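The preprocessing steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the names `clean` and `preprocess`, the tiny `STOP_WORDS` set and the `min_freq` threshold are hypothetical, and `segment` stands in for a Chinese word segmenter (whitespace splitting is used here only so the sketch runs without third-party packages).

```python
import re
from collections import Counter

# Hypothetical stop word set; the source merges several public stop word lists.
STOP_WORDS = {"the", "of", "is"}

def clean(text):
    """Replace punctuation and invisible characters with single spaces."""
    # \w keeps letters, digits, underscore and CJK characters; the rest is dropped.
    return re.sub(r"[^\w]+", " ", text).strip()

def preprocess(texts, segment=str.split, min_freq=2):
    """Turn raw expert descriptions into word sequences.

    `segment` is an assumption: any callable mapping a string to a list
    of words, standing in for a Chinese word segmenter.
    """
    tokenized = [[w for w in segment(clean(t)) if w not in STOP_WORDS]
                 for t in texts]
    freq = Counter(w for words in tokenized for w in words)
    # Remove words that occur fewer than `min_freq` times in the whole data set.
    return [[w for w in words if freq[w] >= min_freq] for words in tokenized]
```

Making the segmenter a parameter keeps the cleaning, stop word and low-frequency steps independent of whichever segmentation tool is actually used.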

Process three: constructing a word vector space

After data preprocessing, the data set (text information) needs to be represented in numerical form as input to the convolutional neural network. There are many ways to represent text as numbers, such as word frequency counts, TF-IDF, and word vectors. Because the data set is large and rich in semantic information, the present application represents the text information of experts with word vectors.

Word vectors, also known as word embeddings, represent the words in a corpus or vocabulary as vectors, i.e. $x_i \in \mathbb{R}^k$, where $x_i$ is the word vector of the $i$-th word in the corpus or vocabulary and $\mathbb{R}^k$ denotes the $k$-dimensional real space, as shown in fig. 3.

In this way, the words in the corpus or vocabulary are mapped to points in a vector space and can be used as input for training the convolutional neural network model. In practice, there are many techniques for obtaining word vectors, such as Skip-gram, CBOW, and randomly initializing the word vectors and adjusting them continuously during training. Given the specific nature of the application (classifying experts), the present application obtains the word vectors of the expert information by random initialization followed by continuous adjustment, and uses them as input to the convolutional neural network.
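The random initialization of word vectors can be illustrated with a small lookup table. This is a sketch under assumptions: the function names, the uniform range of [-0.1, 0.1] and the fixed seed are hypothetical choices, and the subsequent adjustment of the vectors during training is not shown.

```python
import random

def build_embeddings(vocab, k, seed=0):
    """Give every word a random k-dimensional vector (to be tuned in training)."""
    rng = random.Random(seed)
    return {w: [rng.uniform(-0.1, 0.1) for _ in range(k)] for w in vocab}

def embed(words, table):
    """Map a word sequence to an n x k matrix stored as a list of rows."""
    return [table[w] for w in words]
```

For a sentence of n words, `embed` returns the n rows that the convolutional layer later slides over.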

In addition, text information usually varies in length, whereas a convolutional neural network requires input of a fixed length (number of words). The present application therefore sets a maximum text length for the expert information: after data preprocessing, any text longer than the maximum length is automatically truncated (the excess is dropped), and any text shorter than the maximum length is padded (with 0, or with <UNK>, meaning unknown), so that the text information of every expert has the same length. Finally, the trained word vectors are obtained as a useful by-product at the end of model training.
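The truncation and padding rule can be sketched in a few lines. The function name `fix_length` is hypothetical, and padding on the right is an assumption, since the text does not say on which side the padding is added.

```python
PAD = "<UNK>"  # padding token named in the text

def fix_length(words, max_len, pad=PAD):
    """Truncate or right-pad a word sequence to exactly `max_len` items."""
    if len(words) >= max_len:
        return words[:max_len]                      # drop the excess words
    return words + [pad] * (max_len - len(words))   # fill the remainder
```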

Process four: model training

The convolutional neural network is one of the most representative network structures in deep learning. Through local connectivity, weight sharing, pooling and similar techniques, it overcomes drawbacks of traditional neural networks such as an excessive number of parameters, and has achieved excellent results in many fields such as visual processing and natural language processing.

Generally, a convolutional neural network comprises convolutional layers, activation function layers, pooling layers, fully connected layers, and the like. The convolutional neural network structure adopted in the present application is shown in fig. 4 and is described in detail below.

Structure one: convolutional layer

After the word vectors are obtained as input, the convolutional layer convolves them with convolution kernels (usually several), extracting features from the word vectors for classification and for constructing the expert user portrait. The operation formula of the convolutional layer is

$$c_i = f(w_{in} \cdot x_{i:i+h-1} + b_{in}) \qquad (1)$$

where, in formula (1), $c_i \in \mathbb{R}^{n-h+1}$ represents the convolution result, $w_{in} \in \mathbb{R}^{hk}$ is the convolution kernel, $x_{i:i+h-1} \in \mathbb{R}^{k}$ represents rows $i$ to $i+h-1$ of the word vectors, i.e. the $i$-th to $(i+h-1)$-th words in the corpus or vocabulary, and $b_{in} \in \mathbb{R}^{n-h+1}$ is a bias parameter. Here, $n$ denotes the text length of each expert's information after preprocessing, $h$ denotes the convolution kernel size, and $k$ is the word vector dimension. A schematic diagram of the convolution operation is shown in fig. 5.

The function $f$ in formula (1) represents an activation function. The expression $w_{in} \cdot x_{i:i+h-1} + b_{in}$ is a linear operation and can only handle the linearly separable case. In practical applications the classes are often not linearly separable, and a nonlinear operation is needed for classification, as shown in fig. 6. Activation functions were introduced to address this limitation of neural networks.

The activation function has desirable properties such as nonlinearity, differentiability, and monotonicity, and is an essential part of a neural network. Commonly used activation functions include the Sigmoid, tanh, and ReLU functions, among other nonlinear functions. In the present application, the ReLU function better avoids adverse factors such as vanishing gradients and gives better experimental results. The present application therefore uses ReLU as the nonlinear function, expressed as follows:

$$\mathrm{ReLU}(x) = \max(0, x) \qquad (2)$$
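Formulas (1) and (2) can be combined into a short sketch of one convolution kernel sliding over a sentence matrix. This is an illustration under assumptions rather than the applicant's code: the function names are hypothetical, the sentence is stored as a list of n rows of k floats, and the kernel is stored as a flat list of h*k weights.

```python
def relu(v):
    """ReLU(x) = max(0, x), as in formula (2)."""
    return max(0.0, v)

def conv1d(x, w, b, h):
    """Apply one kernel of height h to an n x k sentence matrix.

    x: n rows of k floats (one row per word)
    w: h*k kernel weights, flattened row by row
    b: n-h+1 bias values, one per window position as in the text
    Returns the n-h+1 features c_i = ReLU(w . x[i:i+h] + b_i).
    """
    n = len(x)
    features = []
    for i in range(n - h + 1):
        window = [v for row in x[i:i + h] for v in row]  # flatten h rows
        z = sum(wj * vj for wj, vj in zip(w, window)) + b[i]
        features.append(relu(z))
    return features
```

With n = 3 words, k = 2 and h = 2, the kernel visits two windows and therefore yields two features.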

Structure two: pooling layer

After the convolutional layer has convolved the word vectors and extracted features for classification, the next step is to classify with these features, realizing the classification prefiltering of experts. However, the convolution output still contains too many features and related parameters, which makes the computation too heavy and can even cause overfitting. The present application therefore places a pooling layer after the convolutional layer to mitigate these problems.

Put simply, a pooling layer samples the features obtained by the convolutional layer: within each pooling window of a specified size, a single value is selected from, or computed over, all the values in the window and used as the sampled value of the whole window in further calculation.

Two pooling schemes are commonly used: max pooling and average pooling. Max pooling selects the maximum value within a region of a given size as the sampled value of the whole window, and generally gives better experimental results than average pooling, as shown in fig. 7. The present application therefore uses max pooling as the pooling scheme, with the following formula:
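The max pooling formula itself is not reproduced in this text. As a hedged reconstruction rather than the applicant's exact notation, max pooling over a window of $m$ consecutive convolution outputs can be written as follows, where the window size $m$ and the pooled value $\hat{c}_j$ are symbols introduced here for illustration:

```latex
\hat{c}_j = \max_{j \le i \le j+m-1} c_i
```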

In addition, unlike a common convolutional neural network architecture, in order to capture the influence of context within different window ranges on the current word, the present application applies max pooling with several different window sizes to the features obtained by the convolutional layer and concatenates the pooling results, which improves the convolutional neural network. Too large a window makes the computation too heavy, while too small a window cannot extract features effectively, so the window sizes used in the present application are 3, 4, and 5.
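Pooling with several window sizes and concatenating the results might look like the sketch below. The function names are hypothetical, and sliding the window one position at a time (stride 1) is an assumption, since the text gives only the window sizes.

```python
def max_pool(c, size):
    """Max over each run of `size` consecutive values (stride 1 is assumed)."""
    return [max(c[i:i + size]) for i in range(len(c) - size + 1)]

def multi_window_pool(c, sizes=(3, 4, 5)):
    """Pool the feature list with each window size and concatenate the results."""
    pooled = []
    for s in sizes:
        pooled.extend(max_pool(c, s))
    return pooled
```

The concatenated list is what the fully connected layer receives as its input vector.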

Structure three: fully connected layer

As the name implies, every node of a fully connected layer is connected to every node of the previous layer, as shown in fig. 8. In the present application, the previous layer is the pooling layer and the fully connected layer is the output layer, with the following formula:

$$y = w_{out} \cdot x + b_{out} \qquad (4)$$

In formula (4), $x \in \mathbb{R}^{p}$ is the output of the pooling layer, $w_{out} \in \mathbb{R}^{lp}$ is the weight of the edges between the pooling layer and the output layer, $b_{out} \in \mathbb{R}^{l}$ is the bias between the pooling layer and the output layer, and $y \in \mathbb{R}^{l}$ is the result of the output layer, i.e. the academic portrait of the expert. Here, $p$ is the dimension of the pooling layer output and $l$ is the number of expert academic portrait categories.

In addition, to avoid the drawbacks of the fully connected layer, namely too many weight parameters, heavy computation, and a tendency to overfit, the present application adopts a dropout strategy.

With dropout, in each training iteration some nodes are randomly selected with probability p and excluded from the computation, as shown in fig. 9, where the second node of the input layer temporarily does not participate in the operation.
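A minimal sketch of dropout on a list of node values is shown below. The function name is hypothetical, and the rescaling of surviving nodes by 1/(1-p) (often called inverted dropout) is an assumption that the text does not state; it keeps the expected activation the same between training and prediction.

```python
import random

def dropout(x, p, rng=random):
    """Zero each node with probability p during training.

    Survivors are scaled by 1/(1-p); this rescaling is an assumption,
    not something the source specifies.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return [0.0 if rng.random() < p else v / (1.0 - p) for v in x]
```

At prediction time the function is simply not applied, so every node participates.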

After the fully connected layer output $y$ is obtained, the corresponding category can be obtained with the softmax function, which completes the construction of the expert's user portrait. The softmax function is as follows:

$$\mathrm{softmax}(y_i) = \frac{e^{y_i}}{\sum_{j=1}^{l} e^{y_j}}$$

In the formula, $l$ represents the number of categories and $y_i$ represents the $i$-th value of the output layer. The result of the formula is a probability value. The softmax value is calculated for every value of the output layer, and the category with the largest value is selected as the category of the expert.
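The output stage, a fully connected layer followed by softmax and selection of the largest value, can be sketched as below. The function names are hypothetical and the weights are stored as l rows of p values, which is an assumption about layout only.

```python
import math

def dense(x, w, b):
    """Fully connected layer y = w x + b, with w given as l rows of p weights."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(w, b)]

def softmax(y):
    """Turn output values into probabilities that sum to 1."""
    m = max(y)  # subtracting the max avoids overflow and leaves the result unchanged
    exps = [math.exp(v - m) for v in y]
    total = sum(exps)
    return [e / total for e in exps]

def predict(x, w, b):
    """Return (index of the most probable category, all probabilities)."""
    probs = softmax(dense(x, w, b))
    return probs.index(max(probs)), probs
```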

Structure four: loss function and training method

After the model is determined, the next and final step is to determine the loss function and the training method.

The loss function measures the gap between the model's predicted value and the true value. It is a non-negative real-valued function, usually written $L(y, f(x))$. The smaller the loss, the more robust the model, so during training the training method adjusts the parameters to reduce the value of the loss function. Commonly used loss functions include the mean absolute error loss, the mean squared error loss, and the cross-entropy loss. In convolutional neural networks the cross-entropy loss gives better experimental results than the other loss functions, and it reflects well the difference between the expected output and the current actual output. The present application therefore uses the commonly used cross entropy as the loss function, with the following formula.
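The cross-entropy formula itself does not appear in this text. A standard form for $l$ categories, given here as an assumption rather than the applicant's exact notation, is shown below; $t_{n,i}$ (1 if sample $n$ belongs to category $i$, otherwise 0) and $p_{n,i}$ (the softmax output for sample $n$ and category $i$) are symbols introduced for illustration.

```latex
L = -\frac{1}{N} \sum_{n=1}^{N} \sum_{i=1}^{l} t_{n,i} \, \log p_{n,i}
```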

Here, N represents the number of samples. After the loss function is determined, the next step is to determine the training method. In the neural network, the adjustment optimization of the parameters is completed by gradient descent.

The gradient descent method is a first-order optimization algorithm, also commonly called the steepest descent method. To find a local minimum of a function with gradient descent, one iteratively moves from the current point by a specified step length in the direction opposite to the gradient (or approximate gradient) of the function at that point, as shown in the formula

$$x_2 = x_1 - \gamma \nabla f(x_1)$$

where the function $f(x)$ is defined and differentiable at the point $x_1$, and $\gamma$ is the step size. When $\gamma > 0$ is sufficiently small, $f(x_1) \ge f(x_2)$. A schematic diagram of gradient descent is shown in fig. 10.
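The iteration can be tried on a one-dimensional function. The sketch below is illustrative only: the function name, the default step size of 0.1 and the fixed number of steps are hypothetical choices.

```python
def gradient_descent(grad, x, gamma=0.1, steps=100):
    """Repeat x <- x - gamma * grad(x) for a scalar x."""
    for _ in range(steps):
        x = x - gamma * grad(x)
    return x

# Example: f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), 0.0)
```

With this step size the distance to the minimum shrinks by a factor of 0.8 per step, so `x_min` ends up very close to 3.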

However, because the model is complex, computing the gradient over all training samples is too expensive, so academia and industry usually adopt improved gradient descent methods to find the optimum or a local optimum of the model. Commonly used improved methods include stochastic gradient descent, batch gradient descent, and the Adam gradient descent method. Because the Adam method computes an adaptive learning rate for each parameter, the present application adopts it as the model optimization scheme.
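How Adam adapts the step for a single parameter can be sketched as follows. This follows the commonly published form of the Adam update; the function name and the default hyperparameters (0.001, 0.9, 0.999, 1e-8) are the usual textbook defaults and are assumptions here, not values taken from the source.

```python
import math

def adam_step(theta, g, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; returns (theta, m, v).

    g is the current gradient, m and v are the running moment estimates,
    and t is the 1-based step counter.
    """
    m = beta1 * m + (1 - beta1) * g        # running mean of gradients
    v = beta2 * v + (1 - beta2) * g * g    # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)           # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v
```

Dividing by the square root of the second-moment estimate is what gives each parameter its own effective learning rate.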

Process five: prediction

Finally, after model training is finished, the trained convolutional neural network model is used to classify the text information of the experts whose portraits are to be constructed, and the results are compared with those of other user portrait techniques to check the quality of the expert user portraits.

The present application places no special requirements on the text information of an expert whose portrait is to be constructed; only a piece of text describing the expert is needed.

After the expert information, i.e. a text description of the expert, is input, the present application applies the same data preprocessing as in process two, maps the result into the word vector space described in process three (a by-product of the present application), and classifies the expert information with the convolutional neural network model trained in process four, thereby constructing the user portrait of the expert.

To verify the performance of the convolutional neural network used in the present application on the expert user portrait problem, this section compares it with other user portrait techniques on the same expert data set.

The hardware environment of the experiment in this section is an 88-core server with 2.8 GHz CPUs and 506.3 GB of memory; the operating system is 64-bit Linux.

In particular, owing to external constraints such as security and privacy, no data set about experts has been made public at home or abroad. Therefore, the data set of the experiment in this section is the encyclopedia information of experts in colleges, universities, and scientific research institutes in China, crawled from an encyclopedia. The final data set for this experiment is shown in Table 1. The data set is divided into a training set and a test set in a 9:1 ratio.

TABLE 1 Number of samples per category

Category	Number of samples
A1	1701
A2	1337
A3	1940
A4	2061
A5	1488
A6	4374
A7	1055
A8	518
A9	9490
A10	14428
A11	791
A12	1933
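The 9:1 division of the data set into a training set and a test set could be carried out as in the following sketch. The application does not state whether the split is random or stratified by category, so a seeded random shuffle is assumed here; the function name, the seed, and the placeholder samples are hypothetical.

```python
import random

def split_dataset(samples, train_parts=9, test_parts=1, seed=0):
    """Shuffle the samples reproducibly and split them train_parts:test_parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:cut], shuffled[cut:]

# Placeholder samples: (text description, category label) pairs.
samples = [("expert description %d" % i, "A%d" % (i % 12 + 1)) for i in range(100)]
train_set, test_set = split_dataset(samples)
```

A stratified variant would apply the same function to the samples of each category separately, which keeps the small categories represented in the test set.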

The meanings of the category attributes in Table 1 are shown in Table 2.

TABLE 2 Classification system

The expert classification system of Table 2 was constructed for the present application. It draws on the labels used in related domestic industries, takes into account the academic titles that are influential in China, and arranges them into a hierarchy by influence, so it has a certain reliability. Each sample is the information of one expert, i.e. the text description of the expert mentioned above.

In addition, the stop word removal operation requires a reference stop word list containing a large number of stop words. The present application merges several well-known public stop word lists available online, such as the Baidu stop word list, the Harbin Institute of Technology stop word list, and the Sichuan University stop word list.
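Merging several stop word lists and applying the result to a segmented sentence can be sketched as follows. The list entries below are placeholders chosen for illustration, not the actual contents of the public lists named above, and the function names are hypothetical.

```python
def merge_stop_words(*word_lists):
    """Union several stop word lists, ignoring blank entries and stray whitespace."""
    merged = set()
    for words in word_lists:
        merged.update(w.strip() for w in words if w.strip())
    return merged

def remove_stop_words(tokens, stop_words):
    """Keep only the tokens that are not stop words, preserving their order."""
    return [t for t in tokens if t not in stop_words]

list_a = ["的", "了", "和"]        # placeholder entries
list_b = ["的", "是", " 在 ", ""]  # overlapping entry, padded entry, blank entry
stop_words = merge_stop_words(list_a, list_b)
filtered = remove_stop_words(["他", "是", "清华大学", "的", "教授"], stop_words)
```

Using a set makes duplicates across lists harmless and keeps each membership test constant time on average.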

Specifically, based on the characteristics of the data set and on conventional settings for convolutional neural network hyper-parameters, the model hyper-parameters used in the present application are shown in Table 3.

TABLE 3 model hyper-parameter table

In the experiment, word vectors are randomly initialized from the uniform distribution U(-1, 1), multiple convolution kernels are used, the ReLU function is selected as the activation function, cross entropy is selected as the loss function of the model, the Adam gradient descent method is adopted as the training method of the model, and the initial learning rate is set to 0.0001. The results of the experiment are shown in Table 4.

TABLE 4 results of the experiment

The experimental analysis in this section is as follows:

As can be seen from Table 2, there are 12 expert categories in this section, and each sample belongs to exactly one category. Therefore, if a document about an expert were classified at random, the probability of a correct result would be about 1/12. As can be seen from Table 4, the accuracy of the convolutional neural network is much higher than that of random selection, and its final accuracy is also much higher than that of the other user portrait technical schemes. The experimental results are analyzed in detail as follows:

1) The application independently constructs a large-scale data set of up to four million experts in colleges, universities, and scientific research institutions in China. It is the first domestic data set of expert information and has a certain originality. In addition, the data set features high accuracy, automatic acquisition, and incremental updating;

2) The application represents text information as word vectors, which capture the similarity and correlation between words well and therefore represent the implicit relations in the text information well. In addition, the byproduct of the application, namely the word vectors, can also be used in other related fields;

3) The convolutional neural network achieves parameter sharing through the convolution operation and reduces the number of parameters through the pooling operation, which helps avoid model overfitting. It overcomes shortcomings such as low efficiency and low user portrait accuracy, requires no manual feature extraction, and is clearly superior to the other schemes in the final experimental results.
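The random-selection baseline of about 1/12 used in the analysis above can be checked against the category counts of Table 1. The sketch below assumes that "random" means choosing each of the 12 categories with equal probability; the variable names are illustrative.

```python
from fractions import Fraction

# Category counts copied from Table 1.
class_counts = {
    "A1": 1701, "A2": 1337, "A3": 1940, "A4": 2061,
    "A5": 1488, "A6": 4374, "A7": 1055, "A8": 518,
    "A9": 9490, "A10": 14428, "A11": 791, "A12": 1933,
}
total = sum(class_counts.values())
n_classes = len(class_counts)

# A sample of category c appears with probability count_c / total and is
# guessed correctly with probability 1 / n_classes, whatever its category.
uniform_accuracy = sum(
    Fraction(count, total) * Fraction(1, n_classes)
    for count in class_counts.values()
)
```

Because the guess is independent of the true category, the expected accuracy stays at 1/12 even though the categories are far from balanced.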

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims

1. A convolutional neural network-based user portrait method oriented to intelligent expert recommendation, comprising the following steps:

2. The method of claim 1, wherein the convolutional neural network comprises a convolutional layer, a pooling layer, and a fully-connected layer connected in sequence; the convolutional layer is used for performing a convolution operation on the input word vectors and extracting features for classification from the word vectors; the pooling layer is used for sampling the features obtained by the convolutional layer; the fully-connected layer is used for obtaining the category of the corresponding expert from the sampled values, namely constructing a user portrait of the expert; and cross entropy is used as the loss function of the convolutional neural network.

3. The method of claim 2, wherein the pooling layer pools the features obtained from the convolutional layer using a number of different window sizes, respectively, and concatenates the pooled results.

4. The method of claim 3, wherein the pooling layer samples within each window using a max pooling method.

5. The method of claim 2, wherein the formula of the convolutional layer is $c_i = f(w_{in} \cdot x_{i:i+h-1} + b_{in})$; wherein the function $f$ represents an activation function, $c_i \in \mathbb{R}^{n-h+1}$ represents the convolution result, $w_{in} \in \mathbb{R}^{hk}$ is the convolution kernel of the convolutional layer, $x_{i:i+h-1} \in \mathbb{R}^{k}$ represents rows $i$ to $i+h-1$ of the word vectors, $b_{in} \in \mathbb{R}^{n-h+1}$ is a bias parameter, $\mathbb{R}^{n-h+1}$ represents the $(n-h+1)$-dimensional real number space, $\mathbb{R}^{hk}$ represents the $hk$-dimensional real number space, $\mathbb{R}^{k}$ represents the $k$-dimensional real number space, $n$ is the text length of the processed expert information, $h$ is the convolution kernel size, and $k$ is the word vector dimension.

6. The method of claim 2, wherein the operation formula of the fully-connected layer is $y = w_f x$; where $x$ is the output of the pooling layer, $w_f$ is the weight of the edges between the pooling layer and the output layer, and $y$ is the output of the fully-connected layer; during training of the convolutional neural network, the fully-connected layer adopts a dropout strategy, in which a number of nodes are randomly selected with probability $p$ in each iteration and excluded from the computation; after the fully-connected layer output $y$ is obtained, a softmax function is applied, and the category with the maximum softmax value is selected as the category of the corresponding expert, completing the construction of the user portrait of the expert.

7. The method as claimed in claim 1, wherein in step 2), the processes of removing punctuation marks and invisible characters, Chinese word segmentation, stop word removal and low-frequency word removal are sequentially performed on each piece of expert information, and the text information of the expert is processed into a sentence constructed by a word sequence.

8. The method of claim 7, wherein if the text information of the expert is longer than the set maximum text length, the text information of the expert is truncated so that its text length is equal to the set maximum text length; and if the text information of the expert is shorter than the set maximum text length, the text information of the expert is padded so that its text length is equal to the set maximum text length.

9. The method of claim 1, wherein the convolutional neural network is trained using an Adam gradient descent method.

10. The method of claim 1, wherein the selected expert personal information is used to obtain, and incrementally update, the encyclopedia information of experts in colleges, universities, and scientific research institutions from an encyclopedia, so as to generate the expert information data set.