CN111581368A - Intelligent expert recommendation-oriented user image drawing method based on convolutional neural network - Google Patents

Info

Publication number
CN111581368A
CN111581368A CN201910121716.XA CN201910121716A CN111581368A CN 111581368 A CN111581368 A CN 111581368A CN 201910121716 A CN201910121716 A CN 201910121716A CN 111581368 A CN111581368 A CN 111581368A
Authority
CN
China
Prior art keywords
expert
information
layer
neural network
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910121716.XA
Other languages
Chinese (zh)
Inventor
曹聪
张路
刘燕兵
曹亚男
谭建龙
郭莉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS
Priority to CN201910121716.XA
Publication of CN111581368A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a convolutional-neural-network-based user portrait construction method for intelligent expert recommendation. The method comprises the following steps: 1) acquiring an expert information data set by using selected personal information of experts; 2) processing each piece of expert information in the data set into a sentence consisting of a word sequence; 3) representing the expert information processed in step 2) as text by using word vectors; 4) training a convolutional neural network with the word vectors corresponding to the expert information; 5) generating word vectors for an expert whose portrait is to be constructed from that expert's text information, and classifying those word vectors with the trained convolutional neural network to generate the expert's user portrait. The invention constructs user portraits with high accuracy.

Description

Convolutional-neural-network-based user portrait construction method for intelligent expert recommendation
Technical Field
The invention belongs to the field of text information processing and recognition, and particularly relates to a convolutional-neural-network-based user portrait construction method for intelligent expert recommendation.
Background
User portraits (personas) arise from the practical needs of enterprise development and operation; they are labeled user models abstracted from users' personal information. In practice, product development and operation often do not go smoothly for reasons such as unclear positioning of target users and unclear user requirements. User portrait techniques emerged to solve this problem. They abstract user information, delineate user attributes, extract highly refined features from the information, and "tag" users according to those features. User portraits help in locating target users, reducing disagreement among developers, and improving product development efficiency. The techniques have therefore attracted wide attention in both industry and academia.
As the name suggests, the target population of user portraits for intelligent expert recommendation is experts. As China pays increasing attention to scientific research, the number of research projects, and of experts in related fields applying for them, has grown very large. How to automatically measure an expert's research capability and intelligently recommend experts for projects has therefore become an urgent problem. In intelligent expert recommendation, classifying the academic portraits of a large and varied body of experts is an indispensable step.
Existing user portrait schemes fall into two main categories: qualitative description and classification-based methods. The former usually obtains qualitative characteristics of users through questionnaires, telephone interviews, and similar means. For example, a practitioner first identifies the target group for which a user portrait is to be created and lists questions that can delineate the group's portrait, such as age, gender, and hobbies, and then collects and records the group's answers through questionnaires, interviews, and the like. Finally, invalid and redundant information is eliminated from the answers, highly refined features are extracted, users are labeled, and the user portrait of the target group is established.
The latter usually applies classic machine-learning classification algorithms, such as k-nearest neighbors, naive Bayes, and support vector machines, to user portraits. First, a large amount of text, image, and other data about users is collected. Because such data usually carries rich semantic information, the user portrait problem naturally becomes a text analysis or image analysis problem. Next, a classification algorithm is implemented; the algorithms named above have achieved satisfactory results in text mining, image analysis, and other areas. Using one algorithm or a combination of several, a model is built and trained on the collected user data. Finally, the trained model extracts features from the user data and classifies the samples, which yields the "tags" used to label users and outline their portraits.
In the big data era, data keeps growing in scale and its structure is increasingly complex. Among existing user portrait schemes, qualitative description can outline the portrait of a target group, but it suffers from low efficiency, low portrait accuracy, and a lack of persuasiveness, so it is increasingly unable to meet practical requirements. Classification-based methods such as k-nearest neighbors, naive Bayes, and support vector machines have a solid theoretical basis and are reasonably persuasive, but because they require manual feature extraction they still suffer from high cost and low accuracy. In addition, owing to external constraints such as privacy and security, data on experts at Chinese universities and research institutions is scarce, which hinders the construction of expert academic portraits.
Disclosure of Invention
The application provides a convolutional-neural-network-based user portrait construction method for intelligent expert recommendation. The basic idea is as follows: expert information (in text form) is obtained through open data acquisition techniques, and a large-scale expert data set is constructed automatically; a convolutional neural network, a technique that has performed excellently on text, images, and many other kinds of data in recent years, then automatically extracts highly refined features from the expert information by way of text classification, outputs the sample categories, and generates user tags that outline the experts' user portraits.
The proposed scheme takes information about an expert (in text form) as input and, through a series of data processing operations, outputs the expert's category, that is, a "tag", to outline the expert's user portrait. It thereby overcomes shortcomings such as low efficiency and low portrait accuracy, and its results are superior to those of prior methods. The algorithm flow chart of the invention is shown in Fig. 1.
A convolutional-neural-network-based user portrait construction method for intelligent expert recommendation comprises the following steps:
1) acquiring an expert information data set by using selected personal information of experts;
2) processing each piece of expert information in the expert information data set into a sentence consisting of a word sequence;
3) representing the expert information processed in step 2) as text by using word vectors;
4) training a convolutional neural network with the word vectors corresponding to the expert information;
5) generating word vectors for an expert whose portrait is to be constructed from that expert's text information, and classifying those word vectors with the trained convolutional neural network to generate the expert's user portrait.
Further, the convolutional neural network comprises a convolutional layer, a pooling layer, and a fully connected layer connected in sequence; the convolutional layer performs a convolution operation on the input word vectors and extracts from them the features used for classification; the pooling layer samples the features obtained by the convolutional layer; the fully connected layer obtains the category of the corresponding expert from the sampled values, that is, constructs the expert's user portrait; cross entropy is used as the loss function of the convolutional neural network.
Further, the pooling layer pools the features obtained by the convolutional layer with a plurality of different window sizes and concatenates the pooling results.
Further, the pooling layer samples each window by max pooling.
Further, the operation formula of the convolutional layer is $c_i = f(w_{in} \cdot x_{i:i+h-1} + b_{in})$, where the function $f$ is the activation function, $c_i \in \mathbb{R}^{n-h+1}$ is the convolution result, $w_{in} \in \mathbb{R}^{hk}$ is the convolution kernel of the convolutional layer, $x_{i:i+h-1} \in \mathbb{R}^{k}$ denotes rows $i$ to $i+h-1$ of the word vectors, and $b_{in} \in \mathbb{R}^{n-h+1}$ is the bias parameter; $\mathbb{R}^{n-h+1}$, $\mathbb{R}^{hk}$, and $\mathbb{R}^{k}$ denote the real spaces of dimension $n-h+1$, $hk$, and $k$ respectively; $n$ is the text length of the processed expert information, $h$ is the convolution kernel size, and $k$ is the word vector dimension.
Further, the operation formula of the fully connected layer is $y = w_f \cdot x$, where $x$ is the output of the pooling layer, $w_f$ is the weight of the edges between the pooling layer and the output layer, and $y$ is the output of the fully connected layer. During training of the convolutional neural network, the fully connected layer adopts a dropout strategy: in each iteration, some nodes are randomly selected with probability $p$ and excluded from the computation. After the fully connected layer output $y$ is obtained, the softmax function is applied to it, and the category with the largest softmax value is selected as the category of the corresponding expert, which completes the expert's user portrait.
Further, in step 2), each piece of expert information undergoes, in sequence, removal of punctuation marks and invisible characters, Chinese word segmentation, stop-word removal, and low-frequency-word removal, so that the expert's text information is processed into a sentence consisting of a word sequence.
Further, if an expert's text information is longer than the set maximum text length, it is truncated so that its length equals the set maximum text length; if it is shorter than the set maximum text length, it is padded so that its length equals the set maximum text length.
Further, the convolutional neural network is trained with the Adam gradient descent method.
Further, the selected personal information of experts is used to obtain, and incrementally update, the Baidu Baike entries of experts at the various universities and research institutions, thereby generating the expert information data set.
The proposed convolutional-neural-network-based user portrait method for intelligent expert recommendation needs expert data (in text form) from which to extract highly refined features; the experts are then labeled according to the extracted features, and their academic portraits are outlined. However, no open-source data set on experts at Chinese universities and research institutes is currently available. To solve this problem, the application constructs a large-scale data set on its own.
Real data sets often contain a large amount of "noisy" data, which interferes with the construction of expert user portraits. The proposed scheme therefore preprocesses the raw data to remove the noise, so that refined, non-redundant features can be extracted more easily.
Because the expert information is stored as text, it must be converted to numerical form before it can serve as training input for the convolutional neural network. Therefore, after the raw expert data set has been preprocessed, the expert information is represented with word vectors. Once the word vectors corresponding to the expert information are obtained, the method builds a convolutional neural network, trains the model on the data set, and outlines the experts' user portraits; after training, the model is scored on a test set to check its effect.
Compared with prior user portrait schemes, the proposed scheme has the following technical advantages:
1. The user portrait is built with a convolutional neural network: the convolution operation shares parameters and the pooling operation reduces the number of parameters, which helps avoid overfitting and overcomes shortcomings such as low efficiency and low portrait accuracy; no manual feature extraction is needed; the requirements on data are loose, since only text is needed, so the method is highly general;
2. The data preprocessing operations, the word vector construction method, and the convolutional neural network structure adopted by the scheme are simple, easy to implement, and easy to use;
3. The scheme overcomes the low efficiency, low portrait accuracy, and lack of persuasiveness of prior user portrait schemes; it delineates the user portrait in a quantitative way, with high accuracy and a solid theoretical foundation.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of data preprocessing;
FIG. 3 is a schematic diagram of a word vector space;
FIG. 4 is a diagram of a convolutional neural network architecture;
FIG. 5 is a schematic diagram of a convolution operation;
FIG. 6 is a schematic diagram of nonlinear classification;
FIG. 7 is a schematic view of pooling;
FIG. 8 is a schematic view of a fully connected layer;
FIG. 9 is a schematic drawing of dropout;
FIG. 10 is a schematic diagram of gradient descent.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and examples.
As shown in Fig. 1, the expert classification pre-filtering algorithm comprises five key processes: data acquisition, data preprocessing, word vector space construction, model training, and prediction. Each process is described in detail below as a specific embodiment of the algorithm.
Process 1: Data acquisition
As mentioned above, the application aims to outline user portraits of experts. Information about each expert must therefore be gathered so that features can be extracted to "tag" the expert and outline the portrait. Note that each expert's data should ultimately be a piece of text, such as a textual introduction of the expert; beyond this, the application places no further requirements on the expert information, which gives it a degree of generality.
However, owing to external constraints such as security and privacy, no expert data set has been published in China or abroad, which has hindered research on expert user portraits to some extent. The proposed scheme therefore uses part of the experts' existing personal information as keywords to query Baidu, generates a data set from the query results, and incrementally updates it with the Baidu Baike entries of experts at domestic universities and research institutions. In the end, a large-scale data set containing information on up to four million domestic experts is constructed. To date, this is the first reliable data set (corpus) of information on domestic experts. It can be used for research on expert user portraits and in other research fields, and it is original, large in scale, highly accurate, automatically built, and incrementally updatable.
Process 2: Data preprocessing
Real data often contains a great deal of redundant information, missing values, and noise, and may contain outliers caused by human error. In addition, because the data set used here is text, it is unstructured and has no separators between words, both of which make feature extraction harder. Data preprocessing is therefore an essential part of the proposed expert classification pre-filtering algorithm.
Common data preprocessing operations include numerical normalization, data structuring, and redundancy removal. For the purposes of this application, the raw data set undergoes preprocessing operations such as removal of punctuation and invisible characters, Chinese word segmentation, and stop-word removal. In the end, each expert's text information is processed into a sentence consisting of a word sequence, as shown in Fig. 2.
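The preprocessing sequence described above can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the function name `preprocess`, the stop-word list, and the `min_count` threshold are hypothetical, and whitespace splitting of English text stands in for Chinese word segmentation, which in practice would need a dedicated segmenter.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would load a much larger one.
STOP_WORDS = {"the", "of", "a", "is", "at", "and"}

def preprocess(texts, min_count=2):
    """Turn raw expert descriptions into word sequences."""
    tokenized = []
    for text in texts:
        # Replace punctuation with spaces; split() below also discards
        # invisible whitespace characters such as tabs and newlines.
        cleaned = re.sub(r"[^\w\s]", " ", text)
        # Assumption: whitespace splitting stands in for Chinese word segmentation.
        words = [w.lower() for w in cleaned.split()]
        # Remove stop words.
        tokenized.append([w for w in words if w not in STOP_WORDS])
    # Remove low-frequency words, counted over the whole data set.
    counts = Counter(w for words in tokenized for w in words)
    return [[w for w in words if counts[w] >= min_count] for words in tokenized]

docs = [
    "Professor Li is a professor of computer science.",
    "Professor Wang studies computer vision\tand graphics!",
]
print(preprocess(docs))
```

On this toy input only the words that occur at least twice across the data set survive, which is why a frequency threshold suited to the real corpus size would have to be chosen.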
Process 3: Constructing the word vector space
After preprocessing, the data set (text) must be represented numerically to serve as input to the convolutional neural network. Text can be turned into numbers in many ways, such as word-frequency statistics, TF-IDF, and word vectors. Because the data set is large and rich in semantic information, the application represents the experts' text information with word vectors.
Word vectors, also known as word embeddings, represent each word in a corpus or vocabulary as a vector, i.e. $x_i \in \mathbb{R}^k$, where $x_i$ is the word vector of the $i$-th word in the corpus or vocabulary and $\mathbb{R}^k$ is the $k$-dimensional real space, as shown in Fig. 3.
In this way, the words of the original corpus or vocabulary are mapped to points in a vector space and can be used as training input for the convolutional neural network model. In practice there are many ways to obtain word vectors, such as Skip-gram, CBOW, and random initialization followed by continual adjustment during training. Given the specific nature of the application (classifying experts), the word vectors of the expert information are obtained by random initialization and continual adjustment, and are then fed to the convolutional neural network.
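A randomly initialized word vector table of the kind described above might look like the following sketch. The names `build_embeddings` and `embed`, the dimension `k=4`, the uniform initialization range, and the use of index 0 for `<UNK>` are all assumptions made for illustration; the adjustment of the vectors during training is not shown.

```python
import random

def build_embeddings(sentences, k=4, seed=0):
    """Assign each vocabulary word a randomly initialized k-dimensional vector."""
    rng = random.Random(seed)
    vocab = {"<UNK>": 0}  # assumption: index 0 is reserved for unknown/padding words
    for words in sentences:
        for w in words:
            vocab.setdefault(w, len(vocab))
    # One row per word; small uniform values are an assumed initialization.
    table = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in vocab]
    return vocab, table

def embed(words, vocab, table):
    """Look up the vector x_i in R^k for each word; unseen words map to <UNK>."""
    return [table[vocab.get(w, 0)] for w in words]

sentences = [["professor", "computer"], ["professor", "vision"]]
vocab, table = build_embeddings(sentences)
matrix = embed(sentences[0], vocab, table)
print(len(vocab), len(matrix), len(matrix[0]))
```

Each sentence becomes a matrix with one row of length k per word, which is the shape the convolutional layer consumes.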
It is also worth noting that text varies in length, whereas a convolutional neural network requires input of a fixed length (number of words). The application therefore sets a maximum text length for expert information: after preprocessing, any text longer than the maximum is truncated (the excess is dropped), and any text shorter than the maximum is padded (with 0 or with <UNK>, meaning unknown), so that every expert's text has the same length. Finally, the trained word vectors are obtained as a useful byproduct when model training ends.
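The truncation and padding rule can be illustrated with a short sketch. The function name `pad_or_truncate` is hypothetical; the `<UNK>` padding token follows the text above.

```python
def pad_or_truncate(words, max_len, pad_token="<UNK>"):
    """Force a word sequence to exactly max_len entries."""
    if len(words) >= max_len:
        # Drop everything beyond the maximum text length.
        return words[:max_len]
    # Fill the remainder with the padding token.
    return words + [pad_token] * (max_len - len(words))

print(pad_or_truncate(["a", "b", "c"], 2))
print(pad_or_truncate(["a"], 3))
```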
Process 4: Training the model
The convolutional neural network is one of the most representative network structures in deep learning. Through local connectivity, weight sharing, and pooling, it overcomes drawbacks of traditional neural networks such as their large number of parameters, and it has achieved excellent results in visual processing, natural language processing, and other fields.
In general, a convolutional neural network includes convolutional layers, activation function layers, pooling layers, fully connected layers, and so on. The structure adopted in this application is shown in Fig. 4 and is described in detail below.
Structure 1: Convolutional layer
With the word vectors as input, the convolutional layer convolves them with convolution kernels (usually several), extracting from the word vectors the features that are used for classification and hence for constructing the expert user portrait. The operation formula of the convolutional layer is
$$c_i = f(w_{in} \cdot x_{i:i+h-1} + b_{in}) \tag{1}$$
where, in formula (1), $c_i \in \mathbb{R}^{n-h+1}$ is the convolution result, $w_{in} \in \mathbb{R}^{hk}$ is the convolution kernel, $x_{i:i+h-1} \in \mathbb{R}^{k}$ denotes rows $i$ to $i+h-1$ of the word vectors, i.e. the $i$-th to $(i+h-1)$-th words in the corpus or vocabulary, and $b_{in} \in \mathbb{R}^{n-h+1}$ is the bias parameter. Here, $n$ is the text length of each expert's information after preprocessing, $h$ is the convolution kernel size, and $k$ is the word vector dimension. A schematic diagram of the convolution operation is shown in Fig. 5.
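As a rough illustration of formula (1), the sketch below slides one kernel over the rows of a word-vector matrix. The name `conv1d_text` is hypothetical, the bias is simplified to a single scalar (an assumption; the text describes a bias parameter of dimension n-h+1), and the identity function is used as f so the arithmetic is easy to follow.

```python
def conv1d_text(x, w, b, f):
    """Apply c_i = f(w . x[i:i+h] + b) at every position i.

    x: n rows of k floats (one word vector per row)
    w: h rows of k floats (one convolution kernel)
    b: scalar bias (simplifying assumption)
    f: activation function
    """
    n, h = len(x), len(w)
    out = []
    for i in range(n - h + 1):
        s = b
        for r in range(h):
            # Dot product of kernel row r with word vector i + r.
            s += sum(wv * xv for wv, xv in zip(w[r], x[i + r]))
        out.append(f(s))
    return out

x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]  # n = 4 words, k = 2
w = [[1.0, 0.0], [0.0, 1.0]]                          # h = 2
c = conv1d_text(x, w, 0.0, lambda v: v)
print(c)  # [2.0, 1.0, 1.0], i.e. n - h + 1 = 3 values
```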
Further, the function $f$ in formula (1) is the activation function. It is easy to see that $w_{in} \cdot x_{i:i+h-1} + b_{in}$ is a linear operation and can only handle the linearly separable case. In practical applications the classes are usually not linearly separable, and a nonlinear operation is needed to classify them, as shown in Fig. 6. Activation functions were introduced to address this limitation of neural networks.
Activation functions offer properties such as nonlinearity, differentiability, and monotonicity, and are an essential part of a neural network. Commonly used activation functions include the Sigmoid, tanh, and ReLU functions, among other nonlinear functions. In this application, the ReLU function better avoids adverse effects such as vanishing gradients and gives better experimental results. The application therefore uses the ReLU function as the nonlinear function, expressed as
$$\mathrm{ReLU}(x) = \max(0, x) \tag{2}$$
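The remark about vanishing gradients can be made concrete with a small comparison of derivatives. This is only an illustrative sketch; the helper names are hypothetical and the input value 10.0 is arbitrary.

```python
import math

def relu(x):
    return max(0.0, x)

def relu_grad(x):
    # Derivative of ReLU: 1 for positive inputs, 0 otherwise.
    return 1.0 if x > 0 else 0.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

print([relu(v) for v in (-2.0, 0.0, 3.5)])
# For a large positive input the sigmoid gradient is close to zero,
# while the ReLU gradient stays at 1.
print(relu_grad(10.0), sigmoid_grad(10.0))
```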
Structure 2: Pooling layer
After the convolutional layer has convolved the word vectors and extracted the features for classification, the next step is to classify with those features and thus achieve classification pre-filtering of experts. However, the convolution output still has too many features and associated parameters, which makes the computation too heavy and can even cause overfitting. The application therefore places a pooling layer after the convolutional layer to mitigate these effects.
Put simply, a pooling layer samples the features obtained by the convolutional layer: within each pooling window of a specified size, a single value is selected from, or computed over, the values in the window and used as the sampled value of the whole window in subsequent computation.
There are two commonly used pooling methods: max pooling and average pooling. Max pooling selects the maximum value within a region of a given size as the sampled value of the whole window, and its experimental results are generally better than those of average pooling, as shown in Fig. 7. The application therefore uses max pooling as the pooling scheme, with the formula
$$\hat{c} = \max\{c\} \tag{3}$$
where $c$ denotes the feature values inside a pooling window and $\hat{c}$ is the sampled value of that window.
In addition, unlike a common convolutional neural network architecture, and in order to capture the influence on the current word of context within windows of different sizes, the method max-pools the features obtained by the convolutional layer with several different window sizes and concatenates the pooling results; this is its improvement to the convolutional neural network. Too large a window makes the computation too heavy, and too small a window cannot extract features effectively, so the window sizes used in this application are 3, 4, and 5.
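Max pooling with window sizes 3, 4, and 5 followed by concatenation might be sketched as below. The text does not state the stride or how a trailing partial window is handled, so this sketch assumes non-overlapping windows (stride equal to the window size) and keeps the final partial window; the function name `multi_window_max_pool` is hypothetical.

```python
def multi_window_max_pool(features, window_sizes=(3, 4, 5)):
    """Max-pool a feature sequence with several window sizes and concatenate."""
    pooled = []
    for size in window_sizes:
        # Assumption: non-overlapping windows; a shorter final window is kept.
        for start in range(0, len(features), size):
            pooled.append(max(features[start:start + size]))
    return pooled

features = [1.0, 5.0, 2.0, 4.0, 3.0, 9.0, 0.0, 7.0, 6.0, 8.0, 2.0, 1.0]
print(multi_window_max_pool(features))
# [5.0, 9.0, 7.0, 8.0, 5.0, 9.0, 8.0, 5.0, 9.0, 2.0]
```

The first four values come from window size 3, the next three from size 4, and the last three from size 5.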
Structure 3: Fully connected layer
As the name implies, every node of a fully connected layer is connected to every node of the previous layer, as shown in Fig. 8. In this application the previous layer is the pooling layer and the fully connected layer is the output layer, with the formula
$$y = w_{out} \cdot \hat{c} + b_{out} \tag{4}$$
In formula (4), $\hat{c} \in \mathbb{R}^{p}$ is the output of the pooling layer, $w_{out} \in \mathbb{R}^{lp}$ is the weight of the edges between the pooling layer and the output layer, $b_{out} \in \mathbb{R}^{l}$ is the bias between the pooling layer and the output layer, and $y \in \mathbb{R}^{l}$ is the result of the output layer, i.e., the expert's academic portrait. Here, $p$ is the dimension of the pooling layer's output and $l$ is the number of categories of expert academic portraits.
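The fully connected output layer amounts to a matrix-vector product plus a bias. The sketch below assumes the weights are stored as l rows of p values; the function name and the toy numbers are illustrative only.

```python
def fully_connected(x, w, b):
    """Compute y = w . x + b, with w given as l rows of p weights."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

pooled = [1.0, 2.0, 3.0]                    # p = 3 pooled features
w_out = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]  # l = 2 categories
b_out = [0.5, -1.0]
y = fully_connected(pooled, w_out, b_out)
print(y)  # [1.5, 4.0]
```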
In addition, because the fully connected layer has so many weight parameters that computation is difficult and overfitting is likely, the application adopts a dropout strategy.
Dropout means that, in each training iteration, some nodes are randomly selected with probability $p$ and excluded from the computation; in Fig. 9, for example, the second node of the input layer temporarily does not take part.
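A minimal sketch of dropout as described above: each node is zeroed independently with probability p during a training iteration. The function name is hypothetical. Many implementations also rescale the surviving values by 1/(1-p); the text does not mention this, so it is left out here.

```python
import random

def dropout(values, p, rng):
    """Zero each node independently with probability p (training time only)."""
    return [0.0 if rng.random() < p else v for v in values]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
hidden = [0.5, 1.2, -0.7, 2.0]
print(dropout(hidden, p=0.5, rng=rng))
```

At prediction time the function would simply not be applied, so all nodes take part.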
After the fully connected layer output $y$ is obtained, the softmax function gives the corresponding category, which completes the expert's user portrait. The softmax function is
$$\mathrm{softmax}(y_i) = \frac{e^{y_i}}{\sum_{j=1}^{l} e^{y_j}} \tag{5}$$
where $l$ is the number of categories and $y_i$ is the $i$-th value of the output layer. It is easy to see that the result of the formula is a probability. The softmax value is computed for every value of the output layer, and the category with the largest value is selected as the expert's category.
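The softmax step and the selection of the most probable category can be sketched as follows. Subtracting the maximum before exponentiating is a common numerical-stability measure that is not mentioned in the text; it does not change the result. The function names are hypothetical.

```python
import math

def softmax(y):
    """Convert output-layer values into probabilities that sum to 1."""
    m = max(y)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - m) for v in y]
    total = sum(exps)
    return [e / total for e in exps]

def predict_category(y):
    """Return the index of the category with the largest softmax value."""
    probs = softmax(y)
    return probs.index(max(probs))

print(softmax([0.0, 0.0]))                # [0.5, 0.5]
print(predict_category([1.0, 2.0, 3.0]))  # 2
```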
Structure 4: Loss function and training method
With the model determined, the final step is to choose the loss function and the training method.
The loss function measures how far the model's predictions deviate from the true values. It is a non-negative real-valued function, usually written $L(y, f(x))$. The smaller the loss, the better the robustness of the model, so during training the training method adjusts the parameters to reduce the value of the loss function. Commonly used loss functions include the mean absolute error, mean squared error, and cross-entropy loss functions. In convolutional neural networks the cross-entropy loss gives better experimental results than the others and reflects well the difference between the expected output and the current actual output. The application therefore uses the common cross entropy as the loss function, with the formula
$$L = -\frac{1}{N} \sum_{j=1}^{N} \sum_{i=1}^{l} t_{j,i} \log \hat{y}_{j,i} \tag{6}$$
Here, $N$ is the number of samples, $t_{j,i}$ is 1 if sample $j$ belongs to category $i$ and 0 otherwise, and $\hat{y}_{j,i}$ is the softmax probability of category $i$ for sample $j$. After the loss function is determined, the next step is to determine the training method. In a neural network, the parameters are adjusted and optimized by gradient descent.
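The cross-entropy loss described above can be illustrated with a short sketch that averages over N samples. It assumes each sample has exactly one true category, given as an integer index, and that the predicted probabilities come from a softmax; the function name is hypothetical.

```python
import math

def cross_entropy(probs, labels):
    """Average negative log-probability assigned to the true category.

    probs:  N lists of category probabilities (e.g. softmax outputs)
    labels: N integer indices of the true categories (assumed encoding)
    """
    n_samples = len(probs)
    return -sum(math.log(p[t]) for p, t in zip(probs, labels)) / n_samples

probs = [[0.5, 0.5], [0.25, 0.75]]
labels = [0, 1]
loss = cross_entropy(probs, labels)
print(loss)
```

The loss is zero only when every true category receives probability 1, and it grows as the predictions move away from the true categories.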
The gradient descent method is a first-order optimization algorithm, also commonly called the steepest descent method. To find a local minimum of a function with gradient descent, the search moves iteratively from the current point by a specified step length in the direction opposite to the gradient (or approximate gradient) of the function at that point, as shown in the formula
$$x_2 = x_1 - \gamma \nabla f(x_1)$$
where the function f(x) is defined and differentiable at the point x_1, and γ is the step size. When γ > 0 is sufficiently small, f(x_1) ≥ f(x_2). A schematic diagram of gradient descent is shown in Fig. 10.
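The update rule can be tried on a one-dimensional example. The function f(x) = (x - 3)^2, the starting point and the step size below are hypothetical choices made only to show that each step does not increase f when γ is small enough.

```python
def gradient_descent(grad, x0, gamma, steps):
    """Repeat x <- x - gamma * grad(x) and record every point visited."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x = x - gamma * grad(x)
        trajectory.append(x)
    return trajectory

def f(x):
    return (x - 3.0) ** 2

def grad_f(x):
    return 2.0 * (x - 3.0)

traj = gradient_descent(grad_f, x0=0.0, gamma=0.1, steps=50)
```

With this step size the distance to the minimum at x = 3 shrinks by a factor of 0.8 per iteration.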
However, because the model is complex, computing the gradient over all training samples is too expensive, so academia and industry usually adopt an improved gradient descent method to search for the optimum or a local optimum of the model. Commonly used improved methods include stochastic gradient descent, batch gradient descent, and Adam. Because Adam computes an adaptive learning rate for each parameter, the present application adopts Adam as the model optimization scheme.
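For readers unfamiliar with Adam, the following sketch shows its commonly published update rule on a toy quadratic. It is not taken from the source: the decay rates 0.9 and 0.999 and the constant 1e-8 are the usual defaults, the objective is hypothetical, and only the default learning rate of 0.0001 corresponds to the value stated in the experiment section.

```python
import numpy as np

def adam_minimize(grad, x0, lr=1e-4, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=1000):
    """Minimize a function with Adam, given a function returning its gradient."""
    x = np.asarray(x0, dtype=float).copy()
    m = np.zeros_like(x)   # running mean of gradients
    v = np.zeros_like(x)   # running mean of squared gradients
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)   # bias correction
        v_hat = v / (1 - beta2 ** t)
        # Each parameter gets its own effective step size.
        x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
    return x

def grad(x):
    return 2.0 * x   # gradient of the toy objective sum(x ** 2)

# A larger learning rate than 0.0001 is used so the toy run converges quickly.
x_final = adam_minimize(grad, x0=[1.0, -2.0], lr=0.01, steps=2000)
```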
Process five: prediction
Finally, after model training is finished, the convolutional neural network model is used to classify the text information of the experts whose portraits are to be constructed on the data set, and the result is compared with other user portrait technical schemes to check the quality of the expert user portraits.
The present application places no special requirements on the text information of the expert whose portrait is to be constructed; a piece of text describing the expert is sufficient.
After the expert information (a text description of the expert) is input, the present application applies the same data preprocessing as in process 2, maps the result into the word vector space described in process 3 (a by-product of the present application), and classifies the expert information with the convolutional neural network model trained in process 4, thereby constructing the user portrait of the expert.
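The prediction flow just described can be sketched as follows. This is only an outline under stated assumptions: the text is taken to be already segmented into tokens, the helper names (`pad_or_truncate`, `embed`, `predict`), the padding token and the toy classifier are hypothetical, and the fixed maximum text length follows the truncation and filling rule stated in the claims.

```python
import numpy as np

PAD = "<pad>"  # hypothetical padding token

def pad_or_truncate(tokens, max_len):
    """Cut or fill a token list so that it has exactly max_len entries."""
    return tokens[:max_len] + [PAD] * max(0, max_len - len(tokens))

def embed(tokens, vectors, dim):
    """Look up each token; unknown and padding tokens map to zeros (assumption)."""
    zero = np.zeros(dim)
    return np.stack([vectors.get(t, zero) for t in tokens])

def predict(tokens, vectors, dim, max_len, classify):
    """Preprocess, embed, classify, and return the index of the best category."""
    matrix = embed(pad_or_truncate(tokens, max_len), vectors, dim)
    return int(np.argmax(classify(matrix)))

# Toy stand-ins for the trained word vectors and the trained network.
vectors = {"neural": np.array([1.0, 0.0]), "network": np.array([0.0, 1.0])}

def classify(matrix):
    return matrix.sum(axis=0)

category = predict(["neural", "neural", "network"], vectors,
                   dim=2, max_len=5, classify=classify)
```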
To verify the performance of the convolutional neural network used in the present application on the expert user portrait problem, this section compares it with other user portrait technical schemes on the same expert data set.
The hardware environment of the experiment in this section is a server with 88 cores, a 2.8 GHz CPU and 506.3 GB of memory, running a 64-bit Linux operating system.
Because of external constraints such as security and privacy, no data set about experts has been made public at home or abroad. The data set for the experiment in this section is therefore the encyclopedia information of experts in colleges, universities and scientific research institutions in China, crawled from an online encyclopedia. The resulting data set is shown in Table 1. It is divided into a training set and a test set in a 9:1 ratio.
Table 1. Number of samples per category
Category Number of samples (persons)
A1 1701
A2 1337
A3 1940
A4 2061
A5 1488
A6 4374
A7 1055
A8 518
A9 9490
A10 14428
A11 791
A12 1933
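The figures in Table 1 can be checked with a few lines of arithmetic. The sketch below totals the samples, derives the sizes of a 9:1 split, and computes two reference accuracies. The integer rounding of the split and the majority-category baseline are assumptions added for illustration; the source discusses only the random-guess baseline.

```python
class_counts = {
    "A1": 1701, "A2": 1337, "A3": 1940, "A4": 2061,
    "A5": 1488, "A6": 4374, "A7": 1055, "A8": 518,
    "A9": 9490, "A10": 14428, "A11": 791, "A12": 1933,
}

total = sum(class_counts.values())

# 9:1 split; rounding down the training share is an assumption.
train_size = total * 9 // 10
test_size = total - train_size

# Expected accuracy when one of the 12 categories is guessed uniformly.
uniform_baseline = 1 / len(class_counts)

# Accuracy of always predicting the largest category (not in the source).
majority_baseline = max(class_counts.values()) / total
```

Because the categories are far from balanced, the majority-category figure is a stricter reference point than the uniform one.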
The meanings of the categories in Table 1 are shown in Table 2.
Table 2. Expert classification system
Figure BDA0001972125060000091
Figure BDA0001972125060000101
The expert classification system in Table 2 was constructed for the present application. It draws on the labels used in the relevant domestic industries, takes into account influential academic titles in China, and builds a hierarchy according to influence, so it has a certain degree of reliability. Each sample is the information of one expert, i.e., the text description of the expert described above.
In addition, stop word removal requires a stop word list containing a large number of stop words for reference. The present application merges several well-known public stop word lists available online, such as the Baidu stop word list, the Harbin Institute of Technology stop word list, and the Sichuan University stop word list.
Based on the characteristics of the data set and the conventional settings of convolutional neural network hyper-parameters, the model hyper-parameters used in the present application are shown in Table 3.
Table 3. Model hyper-parameters
Figure BDA0001972125060000102
In the experiment, word vectors are randomly initialized from the uniform distribution U(-1, 1), multiple kinds of convolution kernels are used, ReLU is selected as the activation function, cross entropy is selected as the loss function of the model, Adam is adopted as the training method, and the initial learning rate is set to 0.0001. The experimental results are shown in Table 4.
Table 4. Experimental results
Figure BDA0001972125060000103
Figure BDA0001972125060000111
The experimental analysis in this section is as follows:
As can be seen from Table 2, there are 12 expert categories in this section, and each sample belongs to exactly one category. Therefore, if a document about an expert is classified at random, the probability of a correct result is about 1/12. As can be seen from Table 4, the accuracy of the convolutional neural network is far higher than that of random selection, and its final accuracy is also much higher than that of the other user portrait technical schemes. The experimental results are analyzed in detail as follows:
1) the present application independently constructs a large-scale data set of more than forty thousand experts in colleges, universities and scientific research institutions in China (Table 1 lists 41,116 samples). It is the first data set of expert information in the country and therefore has a certain originality. In addition, the data set offers high accuracy, automatic acquisition, and incremental updating;
2) the present application represents text information with word vectors, which capture the similarity and correlation between words well, so that the implicit relations in the text are well represented. In addition, the by-product of the present application, namely the word vectors, can also be used in other related fields;
3) the convolutional neural network shares parameters through the convolution operation and reduces the number of parameters through the pooling operation, which helps avoid overfitting. It overcomes drawbacks such as low efficiency and low user portrait accuracy, requires no manual feature extraction, and is clearly superior to the other schemes in the final experimental results.
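To make the convolution and pooling operations mentioned in item 3) concrete, the sketch below runs one forward pass over a single text. It follows the convolution formula c_i = f(w · x_{i:i+h-1} + b) and the fully connected formula y = w_f · x stated in the claims, with ReLU as f and max pooling per window size. The text length, vector dimension, kernel sizes, random weights, the single kernel per size and the scalar bias are all hypothetical simplifications, since Table 3 is not reproduced in the text.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def conv_max_pool(x, w, b):
    """Slide kernel w (h x k) over x (n x k), apply ReLU, keep the maximum."""
    h = w.shape[0]
    n = x.shape[0]
    c = np.array([relu(np.sum(w * x[i:i + h]) + b) for i in range(n - h + 1)])
    return c.max()

def forward(x, kernels, w_f):
    """Concatenate the pooled value of every kernel, then apply y = w_f . x."""
    pooled = np.array([conv_max_pool(x, w, b) for w, b in kernels])
    return w_f @ pooled

rng = np.random.default_rng(0)
n, k, num_classes = 10, 4, 12            # n and k are hypothetical
x = rng.uniform(-1, 1, size=(n, k))      # word vectors drawn from U(-1, 1)
kernels = [(rng.normal(size=(h, k)), 0.0) for h in (2, 3, 4)]
w_f = rng.normal(size=(num_classes, len(kernels)))
y = forward(x, kernels, w_f)             # one score per category
```

The same kernel weights are reused at every window position, which is the parameter sharing referred to above, and max pooling reduces each feature map to a single value regardless of the text length.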
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A convolutional neural network-based user portrait method oriented to intelligent expert recommendation, comprising the following steps:
1) acquiring an expert information data set by using the selected expert personal information;
2) processing each piece of expert information in the expert information data set into a sentence composed of a word sequence;
3) performing text representation on the expert information processed in step 2) by using word vectors;
4) training a convolutional neural network by using the word vectors corresponding to the expert information;
5) generating word vectors for the expert whose portrait is to be constructed according to the text information of that expert, and classifying the word vectors by using the trained convolutional neural network to generate the user portrait of that expert.
2. The method of claim 1, wherein the convolutional neural network comprises a convolutional layer, a pooling layer, and a fully connected layer connected in sequence; the convolutional layer is used for performing a convolution operation on the input word vectors and extracting features for classification from the word vectors; the pooling layer is used for sampling the features obtained by the convolutional layer; the fully connected layer is used for obtaining the category of the corresponding expert according to the sampled values, i.e., constructing the user portrait of the expert; and cross entropy is used as the loss function of the convolutional neural network.
3. The method of claim 2, wherein the pooling layer pools the features obtained from the convolutional layer using a number of different window sizes, respectively, and concatenates the pooled results.
4. The method of claim 3, wherein the pooling layer is sampled using a maximum pooling method at each window.
5. The method of claim 2, wherein the formula of the convolutional layer is $c_i = f(w_{in} \cdot x_{i:i+h-1} + b_{in})$; wherein the function f represents an activation function, $c_i \in \mathbb{R}^{n-h+1}$ represents the convolution result, $w_{in} \in \mathbb{R}^{hk}$ is the convolution kernel of the convolutional layer, $x_{i:i+h-1} \in \mathbb{R}^{k}$ represents rows i to i+h-1 of the word vectors, $b_{in} \in \mathbb{R}^{n-h+1}$ is a bias parameter, $\mathbb{R}^{n-h+1}$ represents the (n-h+1)-dimensional real space, $\mathbb{R}^{hk}$ represents the hk-dimensional real space, $\mathbb{R}^{k}$ represents the k-dimensional real space, n is the text length of the processed expert information, h is the convolution kernel size, and k is the word vector dimension.
6. The method of claim 2, wherein the operation formula of the fully connected layer is $y = w_f \cdot x$; where x is the output of the pooling layer, $w_f$ is the weight of the edges between the pooling layer and the output layer, and y is the output of the fully connected layer; in the training process of the convolutional neural network, the fully connected layer adopts a dropout strategy in which, in each iteration, a number of nodes are randomly selected with probability p and do not participate in the computation; after the fully connected layer output y is obtained, the softmax function is applied, and the category with the largest softmax value is selected as the category of the corresponding expert, thereby completing the construction of the user portrait of the expert.
7. The method of claim 1, wherein in step 2), removal of punctuation marks and invisible characters, Chinese word segmentation, stop word removal, and low-frequency word removal are performed in sequence on each piece of expert information, so that the text information of the expert is processed into a sentence composed of a word sequence.
8. The method of claim 7, wherein if the text information of the expert is greater than the set maximum text length, the text information of the expert is truncated so that the text length thereof is equal to the set maximum text length; if the text information of the expert is smaller than the set maximum text length, filling the text information of the expert to enable the text length of the expert to be equal to the set maximum text length.
9. The method of claim 1, wherein the convolutional neural network is trained using an Adam gradient descent method.
10. The method of claim 1, wherein the selected expert personal information is used to obtain, and incrementally update, the encyclopedia information of experts in colleges, universities and scientific research institutions from the encyclopedia, so as to generate the expert information data set.
CN201910121716.XA 2019-02-19 2019-02-19 Intelligent expert recommendation-oriented user image drawing method based on convolutional neural network Pending CN111581368A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910121716.XA CN111581368A (en) 2019-02-19 2019-02-19 Intelligent expert recommendation-oriented user image drawing method based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910121716.XA CN111581368A (en) 2019-02-19 2019-02-19 Intelligent expert recommendation-oriented user image drawing method based on convolutional neural network

Publications (1)

Publication Number Publication Date
CN111581368A true CN111581368A (en) 2020-08-25

Family

ID=72118727

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910121716.XA Pending CN111581368A (en) 2019-02-19 2019-02-19 Intelligent expert recommendation-oriented user image drawing method based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN111581368A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111967949A (en) * 2020-09-22 2020-11-20 武汉博晟安全技术股份有限公司 Leaky-Conv & Cross-based safety course recommendation engine sorting algorithm
CN112434965A (en) * 2020-12-04 2021-03-02 广东电力信息科技有限公司 Expert label generation method, device and terminal based on word frequency
CN113468203A (en) * 2021-04-29 2021-10-01 华东师范大学 Financial user image drawing method based on recurrent neural network and attention mechanism
CN113535820A (en) * 2021-07-20 2021-10-22 贵州电网有限责任公司 Electrical operating personnel attribute presumption method based on convolutional neural network
WO2022120975A1 (en) * 2020-12-10 2022-06-16 中国科学院深圳先进技术研究院 Document searching method and apparatus, and electronic device
CN115033699A (en) * 2022-07-07 2022-09-09 建信基金管理有限责任公司 Fund user classification method and device
CN115470414A (en) * 2022-11-03 2022-12-13 安徽商信政通信息技术股份有限公司 United celebrity recommendation method and recommendation system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017113232A1 (en) * 2015-12-30 2017-07-06 中国科学院深圳先进技术研究院 Product classification method and apparatus based on deep learning
CN108399230A (en) * 2018-02-13 2018-08-14 上海大学 A kind of Chinese financial and economic news file classification method based on convolutional neural networks
CN109102341A (en) * 2018-08-27 2018-12-28 寿带鸟信息科技(苏州)有限公司 A kind of old man's portrait method for the service of supporting parents

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
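Yuan Sha et al.: "A survey of scholar profiling techniques in the open Internet", vol. 55, no. 9, pages 1903 - 1919 *
Chen Min: "Introduction to Cognitive Computing", 30 April 2017, Huazhong University of Science and Technology Press, pages: 299 - 303 *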

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200825
