CN112650861A

CN112650861A - Personality prediction method, system and device based on task layering

Info

Publication number: CN112650861A
Application number: CN202011598147.7A
Authority: CN
Inventors: 权小军; 余玉洁
Original assignee: Sun Yat Sen University
Current assignee: Sun Yat Sen University
Priority date: 2020-12-29
Filing date: 2020-12-29
Publication date: 2021-04-13

Abstract

The invention discloses a personality prediction method, system and device based on task layering. The method comprises the following steps: encoding user posts in parallel based on a preset BERT pre-trained language model to obtain feature vectors; fusing the user posts through a graph convolution network to obtain graph convolution output vectors; predicting external feature information of the user; completing the prediction tasks hierarchically to obtain personality information; and returning the external feature vectors and the personality information to the graph convolution network and predicting again until a preset number of returns is reached, to obtain a personality prediction result. The system comprises a parallel coding module, a graph convolution network fusion module, an external data migration pre-training module, a layered self-attention personality prediction module and a message returning module. The device comprises a memory and a processor for executing the personality prediction method based on task layering. The invention yields a more accurate personality prediction result and can be widely applied in the field of text processing.

Description

Personality prediction method, system and device based on task layering

Technical Field

The invention relates to the field of text processing, in particular to a personality prediction method, system and device based on task layering.

Background

In deep learning research, natural language processing is a subfield in which text content is encoded and mined for information. Personality prediction is an interdisciplinary frontier problem that combines natural language processing technology with psychological knowledge; it attempts to predict the personality traits and personality labels of users from the texts (such as posts) that they publish on social media. In the prior art, early text fusion concatenates a user's posts end to end into one long text for feature extraction. This introduces unnecessary or even erroneous ordering information between posts, and the correlation between posts is greatly weakened by long-text processing.

Disclosure of Invention

In order to solve the above technical problems, an object of the present invention is to provide a personality prediction method, system and device based on task layering, which can independently and concurrently encode a plurality of texts, thereby improving the utilization rate of effective information and making the prediction result more accurate.

The first technical scheme adopted by the invention is as follows: a personality prediction method based on task layering comprises the following steps:

acquiring user posts and carrying out parallel coding on the user posts based on a preset BERT pre-training language model to obtain a feature vector;

fusing user posts through a graph convolution network, constructing a topological graph based on the characteristic vector and obtaining a graph convolution output vector;

predicting external feature information of a user according to the topological graph based on a pre-trained external feature model to obtain an external feature vector;

completing a personality prediction task of four dimensions in a layering mode according to the external feature vector and the graph convolution output vector to obtain personality information;

and returning the external characteristic information and the personality information to the graph convolution network and predicting again until the preset returning times are reached to obtain a personality prediction result.

Further, the step of obtaining the user posts and performing parallel coding on the user posts based on a preset BERT pre-training language model to obtain the feature vectors specifically includes:

setting the sample processing number, the maximum value of the number of posts belonging to the user and the vector of the number of effective posts by taking the user as a sample unit to obtain preset information;

constructing a two-dimensional vector corresponding to a user according to preset information;

and parallelly transmitting the two-dimensional vectors corresponding to the user into a preset BERT pre-training language model for coding to obtain the feature vectors of the posts.

Further, the step of fusing the user posts through a graph convolution network, constructing a topological graph based on the feature vectors, and obtaining a graph convolution output vector specifically includes:

and taking the user posts as graph nodes, constructing a corresponding topological graph for each user sample based on the feature vector to which the user posts belong, and obtaining a convolved output vector.

Further, let the Pmax × d node feature matrix formed by the feature vectors Embed_ij be X, and let the adjacency matrix between the nodes be A; the convolved output vector is calculated as follows:

$$H^{(l+1)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)}\right)$$

wherein $\tilde{A} = A + I$, $I$ is the identity matrix, $\tilde{D}$ is the degree matrix of $\tilde{A}$, $H^{(0)} = X$, $W^{(l)}$ is a parameter matrix, $\sigma$ is an activation function, the number of convolution layers $l$ is 2, and the adjacency matrix $A$ is a randomly initialized symmetric matrix. The operation $\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}$ performs a Laplacian normalization of the adjacency matrix, and $H^{(l+1)}$ represents the output vector after node feature convolution.

Further, the method for constructing the pre-trained external feature model specifically comprises the following steps:

acquiring an external data set containing user post data and tags of user age, gender and the like, and training an external feature model according to the user post data, the corresponding tags of the user age and the gender to obtain model parameters;

and fixing the model parameters to obtain a first layer of external feature model with the function of predicting the age and the gender of the user.

Further, the step of completing the personality prediction task of four dimensions in a layering manner according to the external feature vector and the graph convolution output vector to obtain personality information specifically comprises:

based on a second-layer encoder, performing fusion processing on the graph convolution output vector, the age vector and the gender vector to complete an attention direction prediction task and a judgment mode prediction task to obtain a second-layer feature vector;

based on a third-layer encoder, performing fusion processing on the graph convolution output vector, the age vector, the gender vector and the second-layer feature vector to complete a life style prediction task and a cognitive style prediction task to obtain a third-layer feature vector;

and obtaining a personality information prediction result of the user according to the second layer feature vector and the third layer feature vector.

Further, the second-layer feature vector comprises an attention direction feature vector and a judgment mode feature vector, and the third-layer feature vector comprises a life mode feature vector and a cognitive mode feature vector.

Further, the step of returning the external feature vector and the personality information to the graph convolution network and predicting again until reaching the preset returning times to obtain the personality prediction result specifically includes:

respectively calculating the return retention weights of the age vector, the gender vector, the attention direction characteristic vector, the judgment mode characteristic vector, the life mode characteristic vector and the cognition mode characteristic vector;

calculating according to the return reservation weight of the feature vector to obtain a reservation message vector;

the reserved message vector and the graph convolution vector are transmitted back to the graph convolution network and the graph convolution network is updated, and the updated graph convolution network is obtained;

and predicting the personality information again based on the updated graph convolution network until the preset number of return times is reached to obtain a personality prediction result.

The second technical scheme adopted by the invention is as follows: a personality prediction system based on task hierarchy, comprising:

the parallel coding module is used for acquiring user posts and carrying out parallel coding on the user posts based on a preset BERT pre-training language model to obtain feature vectors;

the graph convolution network fusion module is used for fusing the user posts through a graph convolution network, constructing a topological graph based on the characteristic vector and obtaining a graph convolution output vector;

the external data migration pre-training module is used for predicting external feature information of the user according to the topological graph based on a pre-trained external feature model to obtain an external feature vector;

the hierarchical self-attention personality prediction module is used for completing the personality prediction tasks of four dimensions in a hierarchical mode according to the external feature vectors and the graph convolution output vectors to obtain personality information;

and the message returning module is used for returning the external characteristic vector and the personality information to the graph convolution network and predicting again until the preset returning times are reached to obtain a personality prediction result.

The third technical scheme adopted by the invention is as follows: a personality prediction device based on task hierarchy, comprising:

at least one processor;

at least one memory for storing at least one program;

when the at least one program is executed by the at least one processor, the at least one processor is caused to implement the personality prediction method based on task layering as described above.

The method, system and device have the following advantages. In the text encoding stage, the invention encodes multiple texts independently and in parallel; it combines the feature vectors of different posts, mines the correlations between posts that bear on the user's character traits and personality information, abstracts deeper text semantics, and improves the utilization of effective information. By dividing the prediction task of each personality dimension into levels, tasks of different prediction difficulty are placed on different levels, so that simpler tasks with better prediction results complete training sooner and pass their results to the next layer as additional reliable information, thereby yielding a more accurate prediction result.

Drawings

FIG. 1 is a flowchart illustrating the steps of task hierarchy based personality prediction in accordance with the present invention;

FIG. 2 is a block diagram of a personality prediction system based on task hierarchy according to the present invention;

FIG. 3 is a schematic diagram of a model architecture of an embodiment of the present invention;

FIG. 4 is a flow chart of data processing according to an embodiment of the present invention.

Detailed Description

The invention is described in further detail below with reference to the figures and the specific embodiments. The step numbers in the following embodiments are provided only for convenience of illustration, the order between the steps is not limited at all, and the execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.

Referring to fig. 1 and 3, the present invention provides a personality prediction method based on task hierarchy, which includes the following steps:

s1, obtaining user posts and carrying out parallel coding on the user posts based on a preset BERT pre-training language model to obtain feature vectors;

s2, fusing the user posts through a graph convolution network, constructing a topological graph based on the feature vectors and obtaining graph convolution output vectors;

s3, predicting external feature information of the user according to the topological graph based on the pre-trained external feature model to obtain an external feature vector;

s4, completing four-dimensional personality prediction tasks in a layering mode according to the external feature vectors and the graph convolution output vectors to obtain personality information;

and S5, returning the external characteristic information and the personality information to the graph convolution network and predicting again until the preset returning times are reached to obtain a personality prediction result.

As a preferred embodiment of the method, the step of obtaining the user posts and performing parallel coding on the user posts based on a preset BERT pre-training language model to obtain the feature vectors specifically includes:

Specifically, a dimension reduction method is adopted. The number of samples the model processes at a time is set to batch_size, the maximum number of posts per user is Pmax, and a vector post_mask is set to record the number of effective posts of each sample. If the number of posts in a sample is less than Pmax, the remainder is filled with the "[pad]" value. Let the j-th post of the i-th sample be post_ij.

The original two-dimensional vector of size [batch_size, Pmax] is reduced to a one-dimensional vector of size [batch_size * Pmax], which is passed in parallel into the large-scale pre-trained language model BERT for encoding, giving a d-dimensional feature vector Embed_ij for each post:

Embed_ij = BERT(post_ij), d = 768

Each feature vector reflects the semantic information of the corresponding post in the high-dimensional space. For a padded post_ij identified by post_mask, the weight of its feature vector Embed_ij is set to 0 in subsequent operations. After encoding, the vector of size [batch_size * Pmax] is expanded back into a two-dimensional vector of size [batch_size, Pmax], which satisfies the requirement of encoding individual posts in parallel with the user as the unit.
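The flatten, encode and expand sequence described above can be sketched in NumPy as follows. This is only an illustrative sketch: `fake_encode` is a hypothetical stand-in for the BERT encoder (it returns random vectors), and the function and argument names are assumptions that do not appear in the original text.

```python
import numpy as np

D = 768  # feature dimension d stated in the text


def fake_encode(posts):
    """Hypothetical stand-in for BERT: one d-dimensional vector per input string."""
    rng = np.random.default_rng(0)
    return rng.standard_normal((len(posts), D))


def encode_users(batch, p_max, encode=fake_encode, pad="[pad]"):
    """batch: list of users, each a list of post strings (at most p_max are used)."""
    batch_size = len(batch)
    # post_mask records the number of effective (non-padded) posts per sample
    post_mask = np.array([min(len(posts), p_max) for posts in batch])
    # Pad or truncate every user to exactly p_max posts: [batch_size, p_max]
    padded = [posts[:p_max] + [pad] * (p_max - len(posts[:p_max])) for posts in batch]
    # Flatten to [batch_size * p_max] so every post is encoded independently in one call
    flat = [post for posts in padded for post in posts]
    embed = encode(flat)                           # [batch_size * p_max, d]
    embed = embed.reshape(batch_size, p_max, -1)   # expand back to [batch_size, p_max, d]
    # Zero the padded positions so they carry no weight in later operations
    valid = np.arange(p_max)[None, :] < post_mask[:, None]
    return embed * valid[:, :, None], post_mask
```

For example, `encode_users([["a", "b"], ["c"]], p_max=3)` returns an array of shape (2, 3, 768) in which the rows of the padded posts are all zero.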

Further, as a preferred embodiment of the method, the step of fusing the user posts through a graph convolution network, constructing a topological graph based on the feature vectors, and obtaining a graph convolution output vector specifically includes:

Further, as a preferred embodiment of the method, let the Pmax × d node feature matrix formed by the feature vectors Embed_ij be X, and let the adjacency matrix between the nodes be A; the convolved output vector is calculated as follows:

$$H^{(l+1)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)}\right)$$

wherein $\tilde{A} = A + I$, $I$ is the identity matrix, $\tilde{D}$ is the degree matrix of $\tilde{A}$, $H^{(0)} = X$, $W^{(l)}$ is a parameter matrix, $\sigma$ is an activation function, the number of convolution layers $l$ is 2, and the adjacency matrix $A$ is a randomly initialized symmetric matrix.

The operation $\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}$ performs a Laplacian normalization of the adjacency matrix. After the convolution calculation, the feature information learned in the adjacency matrix A represents the correlation between the nodes. In the convolved output $H^{(l+1)}$ of the node feature matrix, each node has fused the feature vectors of its adjacent related nodes. Specifically, the personality prediction task takes the user as the object and each post as a node, and uses graph convolution to find the topological relations among the nodes so as to depict the personality information of the user as a whole.
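A minimal NumPy sketch of the two-layer graph convolution for a single user is given below. It assumes the usual symmetric normalization with self-loops and a ReLU activation; the activation function and all function names are assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np


def normalize_adjacency(A):
    """Return D~^{-1/2} (A + I) D~^{-1/2} for a non-negative symmetric adjacency A."""
    A_tilde = A + np.eye(A.shape[0])        # add self-loops
    deg = A_tilde.sum(axis=1)               # diagonal of the degree matrix D~
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_forward(X, A, weights):
    """Apply one graph convolution per weight matrix, starting from H^(0) = X."""
    A_hat = normalize_adjacency(A)
    H = X
    for W in weights:
        H = np.maximum(A_hat @ H @ W, 0.0)  # ReLU is an assumed activation
    return H


rng = np.random.default_rng(0)
p_max, d = 4, 8                              # small sizes for illustration only
X = rng.standard_normal((p_max, d))          # one row per post (node)
A = rng.random((p_max, p_max))
A = (A + A.T) / 2                            # randomly initialized symmetric matrix
weights = [rng.standard_normal((d, d)) * 0.1 for _ in range(2)]  # two layers
H2 = gcn_forward(X, A, weights)              # shape [p_max, d]
```

Each row of `H2` mixes the features of a post with those of the posts it is connected to, weighted by the normalized adjacency.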

Further, as a preferred embodiment of the method, the method for constructing the pre-trained external feature model specifically includes the following steps:

Specifically, as the first layer of the hierarchical model, the model is migrated to the task of personality prediction, and by predicting age and gender first and then combining these characteristic information, a more accurate prediction about personality characteristics is obtained.

In addition, to ensure that the age and gender prediction capability is not degraded by the personality prediction data set during the later personality prediction process, a resampling mechanism is introduced: the external data set is sampled again and age and gender prediction is retrained, so that the prediction accuracy of the model stays high and the external features remain effective; training of personality prediction then continues.

As a preferred embodiment of the method, the step of completing the personality prediction task of four dimensions hierarchically according to the external feature vector and the graph convolution output vector to obtain the personality information specifically includes:

Specifically, the second and third layers of the model are based on the four-dimensional personality prediction task of the MBTI index.

The two prediction tasks corresponding to the second layer are the IE task (attention direction) and the TF task (judgment mode). The corresponding encoders IE_Encoder and TF_Encoder take the graph convolution output vector and the age vector and gender vector of the previous layer as input and fuse the lower-layer information, giving the feature vectors of the two tasks of the current layer. The respective classifiers (IE_Classifier, TF_Classifier) are then trained to obtain the IE and TF prediction results.

The two predictive tasks corresponding to the third layer are the JP task (lifestyle) and the NS task (cognitive style). The principle of this layer is substantially the same as that of the second layer.
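The data flow between the layers can be illustrated with the short sketch below. It only shows which vectors feed which task; the encoders here are hypothetical stand-ins (a random linear map followed by tanh) rather than the encoders of the embodiment, and the dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_encoder(in_dim, out_dim):
    """Hypothetical stand-in encoder: one random linear map followed by tanh."""
    W = rng.standard_normal((in_dim, out_dim)) * 0.1
    return lambda x: np.tanh(x @ W)


def hierarchical_forward(h_gcn, h_age, h_gender, encoders):
    # Second layer: IE and TF see the graph convolution output plus age and gender
    low = np.concatenate([h_gcn, h_age, h_gender])
    h_ie = encoders["IE"](low)
    h_tf = encoders["TF"](low)
    # Third layer: JP and NS additionally see the second-layer feature vectors
    mid = np.concatenate([low, h_ie, h_tf])
    h_jp = encoders["JP"](mid)
    h_ns = encoders["NS"](mid)
    return {"IE": h_ie, "TF": h_tf, "JP": h_jp, "NS": h_ns}


d = 4  # illustrative vector size
encoders = {
    "IE": make_encoder(3 * d, d), "TF": make_encoder(3 * d, d),
    "JP": make_encoder(5 * d, d), "NS": make_encoder(5 * d, d),
}
features = hierarchical_forward(rng.standard_normal(d), rng.standard_normal(d),
                                rng.standard_normal(d), encoders)
```

The third-layer encoders receive a longer input than the second-layer ones because the IE and TF feature vectors are appended to it.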

In fusing the vectors output by different features and modules, the module uses a self-attention mechanism for encoding. The graph convolution output vector and the lower-level feature vectors (age, gender, IE, TF, etc.) are concatenated in parallel to obtain a vector $H_0$, which is linearly transformed to obtain a query vector Q, a key vector K and a value vector V:

$Q = W^{Q} H_0$

$K = W^{K} H_0$

$V = W^{V} H_0$

The final encoding vector is obtained through scaling and a nonlinear transformation:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$

wherein $d_k$ is the feature vector dimension, used to scale the dot product.
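The self-attention encoding described above can be sketched in NumPy as follows, assuming that the nonlinear transformation is a softmax over the scaled dot products. The sketch stores one vector per row, so the projections are applied on the right of `H0`; the function names and the random weights are illustrative assumptions.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(H0, W_q, W_k, W_v):
    """H0: [n_vectors, d], one row per spliced vector (GCN output, age, gender, ...)."""
    Q, K, V = H0 @ W_q, H0 @ W_k, H0 @ W_v
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)            # scaled dot products
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ V, weights


rng = np.random.default_rng(0)
n_vectors, d, d_k = 3, 8, 8                    # illustrative sizes
H0 = rng.standard_normal((n_vectors, d))
W_q, W_k, W_v = (rng.standard_normal((d, d_k)) * 0.1 for _ in range(3))
encoded, attn = self_attention(H0, W_q, W_k, W_v)
```

The matrix `attn` holds the learned connection weights between the input vectors, which is what allows the relations between modules and levels to be inspected.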

In the module, through a self-attention mechanism, the model dynamically learns the weights of different connections among vectors, and can more flexibly process the relation of feature vectors of different modules and different levels. And the design of the hierarchical model separates tasks with different difficulties into different levels for calculation, so that the data flow is more efficient, and the model has better interpretability.

Further as a preferred embodiment of the method, the second-layer feature vector includes an attention direction feature vector and a judgment mode feature vector, and the third-layer feature vector includes a life mode feature vector and a cognitive mode feature vector.

Further, as a preferred embodiment of the method, the step of returning the external feature vector and the personality information to the graph convolution network and predicting again until the preset number of return times is reached to obtain the personality prediction result specifically includes:

Specifically, there are 6 feature vectors from the different prediction tasks: the age vector h_age, the gender vector h_gender, the IE feature vector h_IE, the TF feature vector h_TF, the JP feature vector h_JP and the NS feature vector h_NS. The module designs a message return mechanism: under the guidance of the graph convolution vector h_GCN, the feature vectors obtained in one round interact with it and the data are updated again.

calculating the return reserve weight of each feature vector:

where o ∈ {age, gender, IE, TF, JP, NS}.

the final reservation message vector is:

Next, the graph convolution vector h_GCN is combined with the message vector h^o for the next update:

h′_GCN = f_mp([h_GCN : h^o])

wherein [:] denotes a splicing operation, and f_mp comprises a fully connected network layer and a ReLU activation function.

The updated vector h′_GCN is then passed into the hierarchical model module again for training, giving a new round of feature vectors for each prediction task. The message return operation controls error propagation and lets different tasks share and fuse information, so that information originally at the higher layers of the model can be used by the lower layers, with different positive feedback effects under the control of the loss function. In the actual operation of the model, a fixed number of returns can be set.
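The update step f_mp can be sketched as a splice followed by one fully connected layer and ReLU. The sketch takes the reserved message vector as a given input and does not attempt to reproduce how the retention weights are computed; `h_o_fn` is a hypothetical callback standing in for rerunning the hierarchical model and building the message vector.

```python
import numpy as np


def f_mp(h_gcn, h_o, W, b):
    """h'_GCN = ReLU(W [h_GCN : h^o] + b): splice, fully connected layer, ReLU."""
    spliced = np.concatenate([h_gcn, h_o])
    return np.maximum(W @ spliced + b, 0.0)


def message_return(h_gcn, h_o_fn, W, b, n_returns):
    """Repeat the update a fixed number of times."""
    for _ in range(n_returns):
        h_o = h_o_fn(h_gcn)           # hypothetical: produces the reserved message vector
        h_gcn = f_mp(h_gcn, h_o, W, b)
    return h_gcn
```

With `W` equal to two identity blocks side by side and `b` equal to zero, the update simply adds the two vectors and clips negative entries to zero, which is a convenient sanity check.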

The specific data processing and model training and prediction processes in this patent are described below:

A series of hyperparameters is predefined, such as the age/gender task accuracy threshold P and the number of message returns N. The data set currently being processed is defined as D, which may be either the original personality prediction data set or the external age and gender data set. The flow is shown in fig. 4:

1) Model initialization: the data set D is the external age and gender data set, the initial message return count n is 0, and the identifier Flag is False.

2) Data are sampled from D and processed into a length and format acceptable to the model.

3) The graph convolution text fusion module obtains a text fusion vector using the fusion method of parallel coding by the BERT pre-trained language model and graph convolution.

4) The self-attention encoders of the age task and the gender task are trained on the fusion vector, and the predictions are compared with the data labels to obtain the accuracy p of the two tasks.

5) Whether the identifier Flag is true is determined. If true, the feature vectors are passed into the next-layer model. Otherwise, whether the accuracy p reaches the preset threshold P is determined.

6) If p does not reach the preset threshold P, the parameters of the first-layer age and gender encoders are updated by gradient. If p has reached or exceeded the threshold, the first layer has been trained sufficiently; the data set D is replaced with the personality prediction data set and the identifier Flag is set to true.

7) When the identifier Flag is true, the feature vectors are passed to the higher-level model, and IE/TF prediction (attention direction/judgment mode) and JP/NS prediction (life mode/cognitive mode) are each performed once to obtain the feature vectors of the respective dimensions.

8) Whether the current return count n is less than N is determined. If n is less than N, n is incremented by 1, the message return calculation is carried out, the feature vectors are weighted, and iteration proceeds in combination with the graph convolution vector. If n is greater than or equal to N, message return ends and the personality prediction parameters of the second and third layers of the model are updated according to the current prediction result.

9) Whether to perform a resampling operation is determined. If resampling is performed, the sampling data set is changed to the external age and gender data set, and the age and gender prediction capability of the model is retrained to consolidate the parameters.

10) The objective loss function used is focal loss. The model parameters are updated continuously according to the above steps until the prediction result converges.
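The branching in the steps above can be traced with the small Python sketch below. It models only the control flow, not the training itself. Two points are assumptions not stated in the text: the resampling decision is driven by a hypothetical `resample_every` parameter, and resampling resets Flag to False so that the first layer is checked against the threshold again.

```python
def run_schedule(accuracies, P, N, resample_every=None):
    """Trace which branch is taken at each iteration.

    accuracies: age/gender accuracy p observed at each iteration (illustrative input).
    P: accuracy threshold; N: number of message returns.
    """
    flag = False
    dataset = "external"              # step 1: start on the external age/gender data
    trace = []
    for step, p in enumerate(accuracies, start=1):
        if not flag:
            if p < P:                 # step 6: first layer not yet good enough
                trace.append("update_first_layer")
            else:                     # step 6: switch to the personality data set
                flag, dataset = True, "personality"
                trace.append("switch_to_personality")
            continue
        # Step 7: IE/TF and JP/NS prediction would run here.
        n = 0
        while n < N:                  # step 8: N rounds of message return
            n += 1
        trace.append(f"personality_update(returns={n})")
        # Step 9: assumed resampling rule, labeled hypothetical above
        if resample_every and step % resample_every == 0:
            flag, dataset = False, "external"
            trace.append("resample_external")
    return trace
```

Calling `run_schedule([0.5, 0.9, 0.9, 0.9], P=0.8, N=2)` yields one first-layer update, the switch to the personality data set, and then two personality updates with two message returns each.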

As shown in fig. 2, a personality prediction system based on task hierarchy includes:

and the message returning module is used for returning the external characteristic information and the personality information to the graph convolution network and predicting again until the preset returning times are reached to obtain a personality prediction result.

The contents in the above method embodiments are all applicable to the present system embodiment, the functions specifically implemented by the present system embodiment are the same as those in the above method embodiment, and the beneficial effects achieved by the present system embodiment are also the same as those achieved by the above method embodiment.

A personality prediction device based on task hierarchy:

at least one processor;

at least one memory for storing at least one program;

The contents in the above method embodiments are all applicable to the present apparatus embodiment, the functions specifically implemented by the present apparatus embodiment are the same as those in the above method embodiments, and the advantageous effects achieved by the present apparatus embodiment are also the same as those achieved by the above method embodiments.

While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A personality prediction method based on task layering is characterized by comprising the following steps:

and returning the external feature vectors and the personality information to the graph convolution network and predicting again until the preset returning times are reached to obtain a personality prediction result.

2. The task hierarchy-based personality prediction method of claim 1, wherein the step of obtaining user posts and encoding the user posts in parallel based on a pre-defined BERT pre-training language model to obtain feature vectors comprises:

3. The task hierarchy-based personality prediction method of claim 2, wherein the step of fusing the user posts through a graph convolution network, constructing a topological graph based on the feature vectors and obtaining a graph convolution output vector comprises:

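4. The personality prediction method based on task hierarchy of claim 3, wherein the Pmax × d node feature matrix formed by the feature vectors Embed_ij is X, the adjacency matrix between the nodes is A, and the convolved output vector is calculated as follows:

$$H^{(l+1)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)}\right)$$

wherein $\tilde{A} = A + I$, $I$ is the identity matrix, $\tilde{D}$ is the degree matrix of $\tilde{A}$, $H^{(0)} = X$, $W^{(l)}$ is a parameter matrix, $\sigma$ is an activation function, the number of convolution layers $l$ is 2, and the adjacency matrix $A$ is a randomly initialized symmetric matrix; the operation $\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}$ performs a Laplacian normalization of the adjacency matrix, and $H^{(l+1)}$ represents the output vector after node feature convolution.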

5. The personality prediction method based on task layering as claimed in claim 4, wherein the construction method of the pre-trained external feature model specifically comprises the following steps:

6. The personality prediction method based on task layering according to claim 5, wherein the step of completing the personality prediction task of four dimensions hierarchically according to the external feature vector and the graph convolution output vector to obtain personality information specifically comprises:

7. The personality prediction method based on task layering of claim 6, wherein the second-layer feature vectors comprise attention direction feature vectors and judgment mode feature vectors, and the third-layer feature vectors comprise life mode feature vectors and cognitive mode feature vectors.

8. The personality prediction method based on task hierarchy of claim 7, wherein the step of returning the extrinsic feature vectors and the personality information to the graph convolution network and re-predicting until reaching a preset number of return times to obtain the personality prediction result specifically comprises:

9. A personality prediction system based on task hierarchy, comprising:

10. A personality prediction device based on task hierarchy, comprising:

at least one processor;

at least one memory for storing at least one program;

when the at least one program is executed by the at least one processor, the at least one processor is caused to implement the task hierarchy-based personality prediction method as claimed in any one of claims 1-7.