CN111477329B

CN111477329B - Method for evaluating psychological state based on image-text combination

Info

Publication number: CN111477329B
Application number: CN202010241782.3A
Authority: CN
Inventors: 王冲冲; 杨菲; 贺同路; 李嘉懿; 郭学栋; 任永亮
Original assignee: Beijing Intelligent Workshop Technology Co ltd
Current assignee: Beijing Intelligent Workshop Technology Co ltd
Priority date: 2020-03-31
Filing date: 2020-03-31
Publication date: 2021-04-13
Anticipated expiration: 2040-03-31
Also published as: CN111477329A

Abstract

A method for evaluating a psychological state based on image-text combination, in which the psychological state of a user is evaluated by analyzing the psychology-related texts and image data the user has posted historically within a unit time, so as to obtain the user's real psychological state. The evaluation result and corresponding countermeasure suggestions are fed back promptly, so that the user can readily understand their current psychological state and can self-adjust or seek medical advice in time to achieve psychological health.

Description

Method for evaluating psychological state based on image-text combination

Technical Field

The invention relates to the technical field of computers, in particular to a method for evaluating a psychological state on line.

Background

With socio-economic development and rising living standards, people's demand for health keeps growing; the concept of health has moved beyond the era of 'no disease' and entered an era of physical and mental health and high-quality life. Modern life is relatively peaceful, but mental and psychological pressure is high, and mental health is increasingly becoming a major health topic for modern people. How to evaluate an individual's physical and mental health rapidly, accurately and comprehensively therefore urgently needs to be studied.

In today's information age, the internet is increasingly becoming an indispensable part of people's lives. Various psychological test websites can be found through every major search engine, but only a few of them offer standardized tests compiled by professional organizations; most of the rest are entertainment-oriented or popular-science psychological quizzes. Users often need to spend a long time filling in a psychological measurement scale, and their psychological health is then evaluated from the answers, so that randomly chosen or deliberately wrong answers cannot be handled effectively.

Disclosure of Invention

In view of the above-mentioned drawbacks of the prior art, the present invention provides a method for evaluating a psychological state based on a combination of graphics and text, comprising the following steps:

S101, data acquisition, carried out from the following three aspects: acquiring individual professional psychological test data; acquiring the individual's historical psychological text and picture data; and collecting the history information of a new individual;

S102, constructing a psychological test database: constructing a psychological test database from the collected individual professional psychological test data and the psychological text and picture data historically posted by the individual;

S103, data analysis, comprising text analysis and picture analysis performed separately;

S104, text analysis: analyzing the acquired text data;

S105, calculating the semantic similarity of the psychological text;

S106, psychological text classification: labeling and classifying the psychological text data corresponding to the individual, and storing the results;

S107, constructing a text factor feature set from the result of the psychological text classification and the result of the semantic similarity of the psychological text;

S108, image analysis: analyzing the pictures historically posted by the user to obtain picture elements;

S109, image classification: classifying the pictures posted by the user, training a picture classification model, and predicting the categories of the pictures historically posted by the user;

S110, calculating image weight factors: analyzing the pictures historically posted by the user according to the picture classification result obtained in S109, and calculating the image weight factors from the analysis result;

S111, constructing an image factor feature set from the image weight factors and the pictures of the corresponding categories;

S112, psychological state analysis: constructing a psychological analysis model from the text factor feature set obtained in S107 and the image factor feature set obtained in S111, so as to analyze and predict the psychological state of the user;

S113, result analysis: analyzing the prediction result of the user's psychological state, and giving a corresponding conclusion or countermeasure suggestion according to the current psychological state.

Preferably, the acquisition of the professional psychological test information of the individual includes, but is not limited to, acquiring relevant psychological test data with high credibility from professional institutions.

Preferably, the individual historical psychological text and picture data includes but is not limited to data obtained from historical releases of the user.

Preferably, the history information of the new individual includes, but is not limited to, data obtained from psychologically related texts and pictures issued by the new individual in a unit time.

Preferably, S104 further includes: preprocessing the acquired data before text analysis, including but not limited to encoding the data under a unified encoding specification, removing text not related to psychology, filtering special characters, and removing stop words.

Preferably, S105 further comprises: constructing a semantic vector model to convert the psychological text into semantic vectors; and calculating the semantic similarity between the psychological text and the questions in the corresponding psychological test scale, and storing the result.

Preferably, the image classification in S109 further includes the steps of:

S301, data preprocessing: cleaning the individual's picture data within the unit time, removing abnormal pictures, unifying the picture format and size, and labeling the pictures; then dividing the labeled standard data;

S302, training a classification model: the pictures are converted into corresponding matrix representations before the classification model is trained;

S303, model evaluation: testing the classification model on the test set and evaluating the image classification model; judging whether the expected evaluation standard is met; if so, executing S305; if not, returning to S302 to continue optimizing the model;

S304, checking whether expectations are met: when the expected evaluation standard is not met, adjusting and optimizing the model and returning to S302; when the model reaches the expected evaluation standard, the training, evaluation and optimization of the image classification model are complete, and the final result is the optimal result of the image classification model;

S305, outputting a result: the final result in S304, being the optimal result of the image classification model, is output as the final result.

Preferably, the image weight factor calculated in S110 is specifically:

S1101, counting the number of pictures in each category for each sample individual according to the labeling result of S301;

S1102, calculating the ratio T_n of the pictures in each category to the total number of the sample individual's pictures, where n = 1, 2, …, N and N is the number of categories;

S1103, recording the psychological test score as I: the psychological test result is divided into several grades, and each grade is expressed as a number;

the preset psychological formula is constructed as follows:

$$I = T_1\alpha + T_2\beta + \dots + T_N\gamma$$

where α, β, …, γ are the psychological weight values of the picture categories, and I is the psychological test score;

S1104, substituting the psychological test score I of each sample individual and the corresponding ratios T_n into the preset psychological formula to obtain a formula group;

S1105, solving the formula group to obtain the picture psychological weight values α, β, …, γ, where α + β + … + γ = 1 and each of α, β, …, γ lies between 0 and 1.
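The formula group of S1104 and S1105 is a linear system with one equation per sample individual. A minimal sketch of one way to solve it is given below, assuming an ordinary least-squares fit with NumPy; the patent does not name a solver, and the ratios and the weights used to generate the scores are made-up illustration data. Plain least squares does not by itself enforce the constraints that the weights sum to 1 and lie between 0 and 1, so a constrained solver may be needed on real data.

```python
import numpy as np

# Hypothetical data: each row holds one sample individual's category
# ratios T_1..T_N (each row sums to 1).
ratios = np.array([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.2, 0.7],
    [0.4, 0.4, 0.2],
])

# Made-up weights, used only to generate consistent test scores I.
true_weights = np.array([0.5, 0.3, 0.2])
scores = ratios @ true_weights

# Solve the formula group I = T_1*alpha + T_2*beta + ... + T_N*gamma
# in the least-squares sense (unconstrained; an assumption of this sketch).
weights, _, _, _ = np.linalg.lstsq(ratios, scores, rcond=None)
print(weights)
```

With more sample individuals than categories the system is overdetermined, which is why a least-squares fit is used here instead of an exact solve.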

Preferably, the relevant psychological test data includes, but is not limited to, a professional psychological test chart of an individual and test results thereof, and corresponding test data source time, personal information, evaluation results, and countermeasure suggestion data.

Preferably, the data published by the user history includes, but is not limited to, social media history data published by the user.

According to the invention, the individual's historical psychological texts are analyzed with a neural network, the relevant feature factors are extracted according to the psychological test scale, these are combined with the distribution of the pictures the individual has posted historically, and the psychological state of the user is finally judged and evaluated by a neural network evaluation model. The evaluation result is therefore more scientific, accurate and rapid, and the user does not need to fill in a psychological measurement scale online.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:

FIG. 1 is a flowchart of a method for evaluating a psychological state based on a combination of graphics and text according to an embodiment of the present invention;

FIG. 2 is a flowchart of a text classification model according to an embodiment of the present invention;

FIG. 3 is a flowchart of an image classification model according to an embodiment of the present invention;

FIG. 4 is a flow chart of a neural network evaluation model according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of a convolutional neural network mapping according to an embodiment of the present invention.

Detailed Description

The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.

The invention provides a method for evaluating psychological states based on image-text combination, which comprises the following steps:

s101, data acquisition, including but not limited to data acquisition from at least the following three aspects:

Acquiring individual professional psychological test data: this includes, but is not limited to, acquiring relevant psychological test data of high reliability from professional institutions. According to a preferred embodiment of the present invention, the professional institutions include, but are not limited to, professional psychotherapy institutions, professional psychotherapy websites, professional psychological research institutions, and the like; the relevant psychological test data includes, but is not limited to, an individual's professional psychological test scale and its test results, together with corresponding data such as the source time of the test data, personal information, assessment results, and countermeasure suggestions.

Acquiring the individual's historical psychological text and picture data: this data includes, but is not limited to, data obtained from the user's historical posts. According to a preferred embodiment of the present invention, the user's historical post data includes, but is not limited to, social media history data obtained from the user's social media posts; the social media history data includes, but is not limited to, historical data from social media such as WeChat Moments, QQ and Weibo, and the data obtained is the psychology-related texts, pictures and the like historically posted by the user.

Collecting the history information of the new individual: this includes, but is not limited to, data such as the psychology-related texts and pictures posted by the new individual within a unit time. According to a preferred embodiment of the present invention, such data includes, but is not limited to, the historical data posted by the new individual on social media such as WeChat Moments, QQ and Weibo. The new individual is a new user.

S102, constructing a psychological test database: constructing a psychological test database from the collected individual professional psychological test data and the psychological text and picture data historically posted by the individual.

S103, data analysis: the method comprises the following steps of text analysis and picture analysis.

S104, text analysis: according to a preferred embodiment of the invention, the collected data is preprocessed before text analysis; the preprocessing is data cleaning, including but not limited to encoding the data under a unified encoding specification (e.g., UTF-8), removing text not related to psychology, filtering special characters, removing stop words, etc.

S105, calculating the semantic similarity of the psychological text: constructing a semantic vector model to convert the psychological text into semantic vectors; then calculating the semantic similarity between the psychological text and the questions in the corresponding psychological test scale, and storing the result. According to a preferred embodiment of the invention, the semantic vector model is constructed using a deep learning or machine learning method, and the semantic similarity between the psychological text and the corresponding question in the psychological test scale is calculated by cosine distance or Jaccard similarity.

S106, psychological text classification: labeling and classifying the psychological text data corresponding to the individual, and storing the results in preparation for the next data processing step; according to a preferred embodiment of the invention, a classification model is constructed in a supervised or unsupervised manner to classify the psychological text.

S107, constructing a text factor characteristic set according to the result of the psychological text classification and the result of the semantic similarity of the psychological text.

S108, image analysis, namely analyzing pictures historically issued by a user to obtain picture elements; according to a preferred embodiment of the present invention, the picture elements include, but are not limited to, color (e.g., color, black and white, etc.), content (e.g., people, landscape, architecture, etc.), release time (e.g., the release time can be divided into morning, day, night);

S109, image classification: classifying the pictures posted by the user, training a picture classification model, and predicting the categories of the pictures historically posted by the user. According to a preferred embodiment of the present invention, pictures whose picture elements are identical are placed in one category; for example, when the picture elements are color, content and time, pictures for which all three elements are identical form one category. According to a preferred embodiment of the present invention, if the first picture element comprises m1 sub-categories, the second picture element comprises m2 sub-categories, and the n-th picture element comprises mn sub-categories, the pictures may be divided into N categories, where N = m1 × m2 × … × mn.

S113, result analysis: analyzing the prediction result of the user's psychological state, and giving a corresponding conclusion, countermeasure suggestion or the like according to the current psychological state.

Embodiments of the invention are described in detail below in conjunction with FIGS. 1-5:

S105, calculating the semantic similarity of the psychological text: constructing a semantic vector model to convert the psychological text into semantic vectors; then calculating the semantic similarity between the psychological text and the questions in the corresponding psychological test scale, and storing the result. According to a preferred embodiment of the invention, the semantic vector model is constructed using a deep learning or machine learning method, and the semantic similarity between the psychological text and the corresponding question in the psychological test scale is calculated by cosine distance.

According to an embodiment of the present invention, the cosine of the angle between two vectors in the vector space is used as a measure of the difference between two individuals. Compared with a distance metric, cosine similarity emphasizes the difference in direction between the two vectors rather than their distance or length. The cosine similarity between vectors X and Y in the vector space can be expressed as:

$$\cos\theta = \frac{X \cdot Y}{\lVert X \rVert\,\lVert Y \rVert} = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}}$$

According to a preferred embodiment of the invention, the semantic similarity between the psychological text and the questions in the corresponding psychological test scale is calculated using Jaccard similarity. The Jaccard similarity coefficient is used to compare the similarity and difference between finite sample sets; the larger the coefficient, the higher the similarity of the samples.

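For two given n-dimensional vectors A and B, the Jaccard similarity coefficient can be expressed as:

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$

where A ∩ B and A ∪ B denote the intersection and union of A and B taken as sets of elements.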
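The two similarity measures named above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, cosine similarity is computed on numeric vectors, and the Jaccard coefficient is computed on the sets of elements (for instance, word tokens) of the two inputs, which is the standard set form and not necessarily the exact variant the patent uses.

```python
import math

def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # convention for zero vectors (an assumption of this sketch)
    return dot / (norm_x * norm_y)

def jaccard_similarity(a, b):
    """|A ∩ B| / |A ∪ B| on the sets of elements of a and b."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0  # two empty sets are treated as identical
    return len(set_a & set_b) / len(union)

# Parallel vectors have cosine similarity 1; orthogonal vectors have 0.
print(cosine_similarity([1, 2], [2, 4]))
print(jaccard_similarity(["a", "b", "c"], ["b", "c", "d"]))
```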

according to an embodiment of the present invention, the calculation of the semantic similarity of the psychological text specifically includes the following steps:

1) converting the text data into corresponding semantic vectors;

according to a preferred embodiment of the present invention, the vectorization of text data is performed using a method of deep learning or machine learning:

according to a preferred embodiment of the invention, a semantic vector model is constructed using BERT (Bidirectional Encoder Representations from Transformers) to convert text into a semantic vector representation; according to a preferred embodiment of the invention, text is converted into a vector representation using Sent2Vec (sentence vectors), a variant of Word2Vec; according to a preferred embodiment of the present invention, other deep learning or machine learning methods, such as TF-IDF (term frequency-inverse document frequency) or LDA (a topic model), may also be used to vectorize the text data, converting each text into a unique vector representation;

according to a preferred embodiment of the invention, the vectorization of the psychological text is carried out in two parts: vectorizing individual historical psychological text and vectorizing corresponding psychological test questions.

2) Calculating semantic similarity of the psychological text:

After the individual's historical psychological texts and the corresponding psychological test questions have been vectorized, the semantic similarity S_i between the individual's historical psychological texts within the unit time and the question text of the corresponding psychological test scale is calculated, where i = 1, 2, …, n and n is a positive integer representing the number of corresponding psychological test scales. The obtained text similarities are then averaged over the corresponding psychometric questions, whose number is a positive integer.

The averaging is applied to the psychological text vectors corresponding to each question: for example, if psychological test question 1 corresponds to 3 psychological texts, the 3 psychological text vectors are averaged, i.e., their mean is taken.
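The averaging step can be illustrated with a short sketch. It assumes NumPy, cosine similarity as the similarity measure, and tiny made-up 2-dimensional vectors in place of real semantic vectors; the function names are hypothetical. The text vectors matched to each question are averaged first, each averaged vector is compared with its question vector, and the per-question similarities are then averaged.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def mean_similarity(question_vectors, texts_per_question):
    """Return per-question similarities and their overall mean."""
    sims = []
    for q_vec, text_vecs in zip(question_vectors, texts_per_question):
        mean_text_vec = np.mean(text_vecs, axis=0)  # average the matched texts
        sims.append(cosine(q_vec, mean_text_vec))
    return sims, float(np.mean(sims))

# Made-up vectors: question 1 has 3 matching texts, question 2 has 1.
questions = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
texts = [
    [np.array([1.0, 0.0]), np.array([2.0, 0.0]), np.array([3.0, 0.0])],
    [np.array([1.0, 1.0])],
]
per_question, overall = mean_similarity(questions, texts)
print(per_question, overall)
```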

According to a preferred embodiment of the invention, the psychological texts are classified using bisecting K-means clustering, an unsupervised learning method:

S10601, initializing all data as a single cluster;

S10602, dividing the selected cluster into two clusters using the K-means algorithm (initially there is only one cluster); then calculating the SSE (sum of squared errors) of the resulting clusters, i.e., the sum of the squared differences between the elements of each cluster and the cluster center, as follows:

$$SSE = \sum_{i} \sum_{p \in C_i} \lVert p - m_i \rVert^2$$

Description of parameters: C_i denotes the i-th current cluster, p denotes the position (x, y) of a point, m_i denotes the position of the center of cluster C_i, and SSE is the sum of squared distances from every point to the center of its own cluster under the current partition.

Usage: during the iterations of the clustering algorithm, the current classification effect is evaluated by calculating the SSE for the center points obtained so far; if the SSE has been greatly reduced after a certain iteration, the clustering process is essentially complete and further iterations are unnecessary.

S10603, judging whether there are currently k clusters, where k is the preset number of categories into which the psychological texts are to be divided; if so, the classification is finished; if not, executing S10604;

S10604, selecting the cluster with the largest SSE (sum of squared errors), and returning to S10602.
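Steps S10601 to S10604 can be sketched as below. This is an illustrative implementation under stated assumptions, not the patent's own code: it uses NumPy, a plain two-center K-means with random initialization as the splitting step, and made-up 2-dimensional points in place of text vectors. The function names, the seed and the iteration cap are hypothetical choices.

```python
import numpy as np

def sse(points):
    """Sum of squared distances from each point to the cluster center."""
    center = points.mean(axis=0)
    return float(((points - center) ** 2).sum())

def two_means(points, rng, n_iter=20):
    """Split one cluster into two with plain K-means (k = 2)."""
    centers = points[rng.choice(len(points), size=2, replace=False)]
    labels = np.arange(len(points)) % 2  # fallback split if K-means degenerates
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if new_labels.min() == new_labels.max():
            break  # every point fell on one side; keep the previous split
        labels = new_labels
        centers = np.array([points[labels == j].mean(axis=0) for j in (0, 1)])
    return points[labels == 0], points[labels == 1]

def bisecting_kmeans(points, k, seed=0):
    rng = np.random.default_rng(seed)
    clusters = [np.asarray(points, dtype=float)]        # S10601: one cluster
    while len(clusters) < k:                            # S10603: k clusters yet?
        # S10604: pick the cluster with the largest SSE
        worst = max(range(len(clusters)), key=lambda i: sse(clusters[i]))
        target = clusters.pop(worst)
        if len(target) < 2:
            raise ValueError("cannot split a cluster with fewer than two points")
        clusters.extend(two_means(target, rng))         # S10602: split in two
    return clusters

# Made-up data: three loose groups of points.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10],
          [20, 0], [20, 1], [21, 0]]
clusters = bisecting_kmeans(points, k=3)
print([len(c) for c in clusters])
```

Splitting only the cluster with the largest SSE keeps each step a cheap two-way K-means while steering the splits toward the least compact cluster.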

The smaller the SSE, the better the clustering effect. If the silhouette coefficient is used to represent the clustering effect instead, a larger silhouette coefficient indicates a greater distance between clusters, and the greater the distance, the better the clustering effect. According to an embodiment of the invention, the silhouette coefficient combines the cohesion and the separation of the clusters and is used to evaluate the clustering effect; its value S lies between -1 and 1, and the larger the value, the better the clustering effect.

The silhouette coefficient of a sample X_i is calculated as follows:

$$S(i) = \frac{b - a}{\max(a, b)}$$

where a is the average distance from X_i to the other samples in the same cluster, called the cohesion; b is the average distance from X_i to all samples in the nearest cluster, called the separation. The nearest cluster is defined as follows:

$$C_j = \arg\min_{C_k} \frac{1}{n} \sum_{p \in C_k} \lVert p - X_i \rVert$$

where p is a sample in a cluster C_k other than the cluster containing X_i, and n is the number of samples in C_k; that is, the average distance from X_i to all samples of a cluster is used as the measure of the distance from the point to that cluster, and the cluster closest to X_i is selected as the nearest cluster. The average silhouette coefficient is obtained by averaging the silhouette coefficients of all samples. Its value range is [-1, 1]; the closer together the samples within a cluster and the farther apart the samples of different clusters, the larger the average silhouette coefficient and the better the clustering effect.
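The silhouette calculation described above can be sketched as follows, assuming NumPy, Euclidean distance and at least two clusters. The function name and the toy data are hypothetical; giving a sample that is alone in its cluster a score of 0 is a common convention and an assumption of this sketch.

```python
import numpy as np

def silhouette_scores(points, labels):
    """Per-sample silhouette coefficient (b - a) / max(a, b)."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between all samples.
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    scores = np.zeros(len(points))
    for i in range(len(points)):
        same = labels == labels[i]
        same[i] = False  # exclude the sample itself
        if not same.any():
            continue  # singleton cluster: score stays 0
        a = dists[i, same].mean()  # cohesion: mean distance within own cluster
        # separation: smallest mean distance to any other cluster
        b = min(dists[i, labels == other].mean()
                for other in np.unique(labels) if other != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores

# Made-up data: two tight, well-separated clusters.
scores = silhouette_scores([[0, 0], [0, 1], [10, 0], [10, 1]], [0, 0, 1, 1])
print(scores.mean())
```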

According to a preferred embodiment of the present invention, the psychological text is classified using a deep learning text classification model; the text data is vectorized and the text vectors are then classified.

According to a preferred embodiment of the present invention, the text classification method further comprises:

S201, data preprocessing: the purpose of text preprocessing is to extract the main content from the text corpus in a standardized way and to remove information irrelevant to the emotion classification of the text. Preprocessing of Chinese text mainly comprises encoding standardization, filtering illegal characters, word segmentation, removing stop words, and the like;

1) Encoding specification: Chinese text usually raises encoding issues; common Chinese encodings include GB2312, GBK, UTF-8 and the like. To avoid garbled text, according to a preferred embodiment of the invention the text is encoded uniformly, preferably in UTF-8.

2) Filtering illegal characters: text often contains other special characters, such as emoticons, non-Chinese characters and special symbols; these unnecessary special characters are collectively referred to as illegal characters in the present invention. Because illegal characters interfere with the analysis of Chinese text, they need to be filtered out so as not to affect the accuracy of subsequent model training;

3) Word segmentation: segmenting Chinese text into words is an important step in text analysis, and the quality of the segmentation directly affects the accuracy of the model (for example, "she does not look good" can be divided into "she", "not", "look good"). According to a preferred embodiment of the present invention, the word segmentation method includes, but is not limited to, word segmentation toolkits such as Jieba, GloVe and NLTK (Natural Language Toolkit);

4) Removing stop words: certain words or phrases, known as stop words, are usually filtered out before Chinese text data is processed. Stop words are generally compiled manually, not generated automatically, according to the text analysis task and the data set, and the resulting stop words form a stop word list. Stop words are words (such as yes, no, and the like) that occur very frequently but have no specific influence on the substantive meaning of the text; removing them does not affect the accuracy of the model;
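The four preprocessing steps can be sketched as one small pipeline. This is a simplified illustration using only the Python standard library: the stop word list is hypothetical, the input is assumed to arrive as GBK-encoded bytes, "illegal characters" are approximated as everything outside the basic CJK ideograph range, and word segmentation is replaced by a character-level split as a stand-in for a real segmenter such as Jieba.

```python
import re

# Hypothetical stop word list; a real one is compiled for the task at hand.
STOP_WORDS = {"的", "了", "是"}

def preprocess(raw, encoding="gbk"):
    """Return a list of cleaned tokens from raw encoded bytes."""
    # 1) Encoding specification: decode to Unicode text so that all sources
    #    share one representation (stored as UTF-8 when written out).
    text = raw.decode(encoding, errors="ignore")
    # 2) Filter illegal characters: keep only basic CJK ideographs.
    text = re.sub(r"[^\u4e00-\u9fff]", "", text)
    # 3) Word segmentation: character-level split, a stand-in for Jieba etc.
    tokens = list(text)
    # 4) Remove stop words.
    return [t for t in tokens if t not in STOP_WORDS]

tokens = preprocess("我今天很累了!! :)".encode("gbk"))
print(tokens)
```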

S202, text labeling and data splitting: the data set first needs to be labeled. Taking a depression test as an example, the labeling is done as follows:

the psychological texts are labeled with reference to the individual's depression test results; according to professional psychological analysis, a depression test generally covers three types of features: physiological, psychological and behavioral; the individual's historical texts can be labeled according to these three types of features so as to screen out the psychological texts corresponding to the individual; the specific labels are as follows:

The labeled data set is then divided into a training set and a test set according to a preset ratio. According to a preferred embodiment of the invention, the division ratio is 8:2 or 7:3; the training set is used to train the psychological classification model, and the test set is used to evaluate the predictive power of the model.
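A minimal sketch of the 8:2 split follows, using only the standard library. The function name, the fixed seed and the shuffling before the cut are assumptions of this sketch; the patent only specifies the ratio.

```python
import random

def split_dataset(samples, train_ratio=0.8, seed=42):
    """Shuffle the labeled samples and cut them into train and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Ten placeholder samples split 8:2.
train_set, test_set = split_dataset(range(10))
print(len(train_set), len(test_set))
```

Passing `train_ratio=0.7` gives the 7:3 split mentioned as the alternative.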

S203, training a classification model: before the classification model is trained, the text must be vectorized, because a computer cannot process Chinese text directly and it has to be converted into numbers. For example, the three types of features of a depression test, physiological, psychological and behavioral, can be converted into the labels 1, 2 and 3 respectively. Model training is in essence a series of operations on numerical values or matrices: after the samples are converted into corresponding feature vectors, the data is fed into the model in batches during training, and the text classification model is trained against the assigned labels;

S204, model evaluation: testing the classification model on the test set and evaluating the text classification model; judging whether the expected evaluation standard is met; if so, going to S205; if not, returning to S203 to continue optimizing the model. According to a preferred embodiment of the present invention, the evaluation metrics for the text model include, but are not limited to, accuracy, precision, recall and the F1 value.

According to one embodiment of the present invention, the accuracy value, precision value, recall value, F1 value are calculated according to the following manner:

	Actually positive	Actually negative
Judged as positive	TP	FP
Judged as negative	FN	TN

Based on the data labeling result of S202, according to a preferred embodiment of the invention, 10000 labeled psychological texts are selected as a test set to evaluate the text classification model;

according to a preferred embodiment of the present invention, the numbers of physiological, psychological and behavioral texts among the 10000 selected psychological texts are 3333, 3333 and 3334 respectively;

according to a preferred embodiment of the invention, the numbers of texts predicted as physiological, psychological and behavioral are 3400, 3400 and 3200 respectively, of which the numbers of correctly predicted physiological, psychological and behavioral texts are 3300, 3300 and 3300; for each category, that category can be taken as the positive class and all other categories as the negative class;

Accuracy:

Accuracy = number of correctly predicted texts / total number of predictions (i.e., the size of the test set)

Accuracy = (3300 + 3300 + 3300) / 10000 = 99%

Precision:

Precision of each category: Precision = number of texts correctly predicted as the category / number of all texts predicted as the category

Recall:

Recall of each category: Recall = number of texts correctly predicted as the category / number of texts actually belonging to the category in the test set

F1 value:

F1 value of each category: F1 = 2 × Precision × Recall / (Precision + Recall)

Wherein, each parameter is defined as follows:

True Positive (TP): the number of positive-class samples predicted as positive

True Negative (TN): the number of negative-class samples predicted as negative

False Positive (FP): the number of negative-class samples predicted as positive

False Negative (FN): the number of positive-class samples predicted as negative
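As a worked instance of these formulas, the sketch below applies them to the counts given above for the physiological category (3300 correct out of 3400 predicted, with 3333 actually in the class) and to the overall accuracy. The function name and its parameter names are hypothetical.

```python
def precision_recall_f1(tp, predicted_as_class, actual_in_class):
    """Per-category precision, recall and F1 from raw counts."""
    precision = tp / predicted_as_class   # TP / (TP + FP)
    recall = tp / actual_in_class         # TP / (TP + FN)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Overall accuracy on the 10000-text test set.
accuracy = (3300 + 3300 + 3300) / 10000

# Physiological category: 3400 predicted, 3300 of them correct, 3333 actual.
precision, recall, f1 = precision_recall_f1(
    tp=3300, predicted_as_class=3400, actual_in_class=3333)
print(accuracy, precision, recall, f1)
```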

S205 whether it is expected: when the expected evaluation standard is not met, optimally adjusting the model; returning to execute S203; according to a preferred embodiment of the present invention, the optimization adjustment includes, but is not limited to, adjusting the learning rate, randomly disconnecting a certain proportion of neurons, and adjusting the optimization function. According to a preferred embodiment of the invention, the initial value of the learning rate is 0.001. According to a preferred embodiment of the present invention, the tuning optimization function includes, but is not limited to, Adam optimization algorithm, SGD random gradient descent.

When the model reaches the expected evaluation standard, the training, evaluation and optimization of the text classification model are complete; the finally optimized text classification model is the required text classification model, and its classification result is the current optimal classification result.

S206, outputting a result: the classification result that meets the expected evaluation standard in S205 is the current optimal classification result, and this text classification result is output.

According to a preferred embodiment of the invention, the texts of the same type of each individual are vectorized, and the resulting vectors are added and averaged, so that a set of semantic representation vectors for the three types (physiological, psychological and behavioral) is obtained; the result sequence set of the text semantic similarity calculation in S106 is taken as the label; and the two data sets are combined individual by individual, in order, to construct the text factor feature set.
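The add-and-average operation can be sketched as follows, assuming each text has already been converted to a fixed-length vector. The function name `text_factor_features`, the use of a zero vector for a category with no texts, and the toy vectors are assumptions made for illustration only.

```python
import numpy as np

CATEGORIES = ("physiological", "psychological", "behavioral")


def text_factor_features(vectors_by_category, dim):
    """Average the text vectors of each category and concatenate the three means.

    A category without any text contributes a zero vector (an assumption of
    this sketch, not something stated in the embodiment).
    """
    means = []
    for category in CATEGORIES:
        vectors = vectors_by_category.get(category, [])
        if vectors:
            means.append(np.mean(np.asarray(vectors, dtype=float), axis=0))
        else:
            means.append(np.zeros(dim))
    return np.concatenate(means)


# Hypothetical individual with 2-dimensional text vectors.
individual = {
    "physiological": [[1.0, 3.0], [3.0, 5.0]],
    "psychological": [[0.0, 2.0]],
    "behavioral": [],
}
features = text_factor_features(individual, dim=2)
```

The resulting vector would then be paired with the individual's semantic similarity sequence from S106 to form one entry of the text factor feature set.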

S109, image classification: classifying the pictures published by the user;

According to a preferred embodiment of the invention, a picture classification model is trained and used to predict the categories of the pictures historically published by the user;

According to a preferred embodiment of the present invention, pictures whose picture elements are consistent are classified into one category; for example, when the picture elements are color, content and time, pictures for which all three elements are identical are classified into one category. According to a preferred embodiment of the present invention, if the first picture element comprises m1 sub-categories, the second picture element comprises m2 sub-categories, …, and the n-th picture element comprises mn sub-categories, the pictures may be divided into N categories, where N = m1 × m2 × … × mn.

According to a preferred embodiment of the present invention, colors can be classified into m1 levels according to grade; content can be classified into m2 major categories, such as houses, rivers, people, etc.; time can be divided into m3 time periods, such as morning, noon and evening; the pictures can then finally be classified into N categories according to color, content and time, where N = m1 × m2 × m3.
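The category count N = m1 × m2 × m3 can be illustrated by enumerating every combination of element sub-categories. In the sketch below the function name `category_table` and the concrete level names are hypothetical; only the three elements (color, content, time) come from the description.

```python
from itertools import product


def category_table(*element_levels):
    """Map every combination of element sub-categories to a category index."""
    return {combo: index
            for index, combo in enumerate(product(*element_levels))}


colors = ("dark", "neutral", "bright")      # m1 = 3 (hypothetical levels)
contents = ("house", "river", "people")     # m2 = 3
periods = ("morning", "noon", "evening")    # m3 = 3
table = category_table(colors, contents, periods)
```

With three sub-categories per element the table holds 3 × 3 × 3 = 27 categories, and two pictures share a category only when all three of their elements match.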

According to a preferred embodiment of the invention, the image classification further comprises the steps of:

S301, data preprocessing: cleaning the picture data of an individual within the unit time, removing abnormal pictures, formatting the pictures, unifying the picture sizes, and labeling the pictures;

dividing the labeled standard data into a training set and a test set; according to a specific embodiment of the present invention, the division ratio of the training set to the test set is 7:3 or 8:2; according to a specific embodiment of the invention, the division ratio can be adjusted according to specific requirements; the training set is used to train the image classification model, and the test set is used to evaluate the effectiveness of the image classification model.
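A 7:3 division of the labeled data might be sketched as below. The function name `split_dataset`, the shuffling step and the fixed seed are assumptions of this sketch; the description only specifies the ratio.

```python
import random


def split_dataset(samples, train_ratio=0.7, seed=0):
    """Shuffle the samples and split them into a training set and a test set."""
    items = list(samples)
    random.Random(seed).shuffle(items)      # reproducible shuffle
    cut = int(round(len(items) * train_ratio))
    return items[:cut], items[cut:]


# Ten hypothetical labeled pictures, identified here only by index.
train_set, test_set = split_dataset(range(10), train_ratio=0.7)
```

Passing `train_ratio=0.8` gives the 8:2 division mentioned above.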

S302, training a classification model: before the classification model is trained, the pictures must first be processed; a computer cannot directly recognize images, so the images need to be converted. According to an embodiment of the invention, each picture is converted into a corresponding matrix representation. According to a preferred embodiment of the present invention, the picture classification model may be composed of a CNN (convolutional neural network), pooling, and an FC (fully connected) network; the picture data is loaded into the corresponding image classification network, and the picture classification model is trained according to the annotated labels.

According to an embodiment of the invention, during operation the convolution kernel of the convolutional neural network sweeps over the input features at regular intervals, performing element-wise multiplication and summation on the input features within the receptive field and adding a bias.

The convolutional neural network works as follows: the black dotted frame in the left graph is a convolution kernel of size 5 × 5, and each point in the convolution kernel has a corresponding weight coefficient; each convolution kernel, i.e. each receptive field, performs element-wise multiplication and summation on the input features (picture features) and adds the bias value, and the final value is the mapping result, i.e. the feature map shown in fig. 5; the convolution kernel then scans the input features in sequence according to the set step size;

The summation part in the equation is equivalent to computing a cross-correlation. b is the bias, and Z^l and Z^(l+1) denote the input and output of the (l+1)-th convolutional layer, also called feature maps. L_(l+1) is the size of Z^(l+1), the feature map being assumed to have equal length and width. Z(i, j) corresponds to a pixel of the feature map, K is the number of channels of the feature map, and f, s₀ and p are the convolutional layer parameters, corresponding to the convolution kernel size, the convolution stride, and the number of padding layers.
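The operation described by these symbols can be sketched in NumPy as a cross-correlation with stride s₀ and zero padding p, followed by the addition of the bias b. The sketch assumes the common output-size relation L_(l+1) = (L_l + 2p - f) / s₀ + 1 (with integer division) and a single output channel; the function name `conv2d` is hypothetical.

```python
import numpy as np


def conv2d(z, w, b=0.0, stride=1, padding=0):
    """Cross-correlate a (K, L, L) input with a (K, f, f) kernel and add bias b."""
    _, size, _ = z.shape
    f = w.shape[-1]
    # Zero-pad only the two spatial axes, not the channel axis.
    z = np.pad(z, ((0, 0), (padding, padding), (padding, padding)))
    out_size = (size + 2 * padding - f) // stride + 1
    out = np.empty((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            window = z[:, i * stride:i * stride + f, j * stride:j * stride + f]
            # Element-wise multiply, sum over channels and window, add bias.
            out[i, j] = np.sum(window * w) + b
    return out


# Toy input: one channel, 5 x 5 picture of ones, 3 x 3 kernel of ones.
z = np.ones((1, 5, 5))
w = np.ones((1, 3, 3))
plain = conv2d(z, w, b=1.0)
strided = conv2d(z, w, b=1.0, stride=2, padding=1)
```

Each entry of `plain` sums nine ones and adds the bias, whereas the corner entries of `strided` overlap the zero padding and are therefore smaller.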

According to one embodiment of the invention, pooling may be in the form of:

In the formula, the step size s₀ and the pixel (i, j) have the same meaning as in the convolutional layer, and p is a pre-specified parameter. When p = 1, pooling takes the average over the pooling area and is called average pooling; when p → ∞, pooling takes the maximum within the area and is called max pooling.
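The two limiting cases can be checked with a small sketch of Lp pooling. This version normalizes by the window size, i.e. it computes (mean of a^p)^(1/p), so that p = 1 yields exactly the window average; that normalization, the square non-negative feature map and the function name `lp_pool` are assumptions of the sketch rather than details of the embodiment.

```python
import numpy as np


def lp_pool(a, f=2, stride=2, p=1.0):
    """Lp pooling over f x f windows of a square, non-negative 2-D feature map."""
    size = (a.shape[0] - f) // stride + 1
    out = np.empty((size, size))
    for i in range(size):
        for j in range(size):
            window = a[i * stride:i * stride + f, j * stride:j * stride + f]
            if np.isinf(p):
                out[i, j] = window.max()    # limit p -> infinity: max pooling
            else:
                out[i, j] = np.mean(window ** p) ** (1.0 / p)
    return out


feature_map = np.arange(16.0).reshape(4, 4)
average_pooled = lp_pool(feature_map, p=1.0)
max_pooled = lp_pool(feature_map, p=np.inf)
```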

According to one embodiment of the present invention, the fully connected layer in the convolutional neural network is equivalent to the hidden layer in a conventional feedforward neural network; the fully connected layers are located at the end of the hidden layers of the convolutional neural network and transmit signals only to other fully connected layers. According to one embodiment of the invention, the feature map loses its spatial topology in the fully connected layer, is flattened into a vector, and is passed through the activation function to output the classification label.

According to one embodiment of the present invention, each neuron node in the neural network accepts the output values of the neurons in the previous layer as its input values and passes its output to the next layer, while the neuron nodes in the input layer pass the input attribute values directly to the next layer (hidden layer or output layer). In a multi-layer neural network, the functional relationship between the output of an upper-layer node and the input of a lower-layer node is called the activation function. According to one embodiment of the present invention, the activation functions used include, but are not limited to, the following types:

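Sigmoid function analytic expression:

sigmoid(x) = 1 / (1 + e^(-x))

tanh function analytic expression:

tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))

ReLU function analytic expression:

ReLU(x) = max(0, x)

Softmax function analytic expression:

softmax(x_i) = e^(x_i) / Σ_j e^(x_j)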
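For reference, the four activation functions named above can be written in NumPy as follows, using their standard definitions. Subtracting the maximum inside `softmax` is a common numerical-stability measure added in this sketch and does not change the result.

```python
import numpy as np


def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x)), mapping inputs into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))


def tanh(x):
    """Hyperbolic tangent, mapping inputs into (-1, 1)."""
    return np.tanh(np.asarray(x, dtype=float))


def relu(x):
    """Rectified linear unit max(0, x), applied element-wise."""
    return np.maximum(0.0, np.asarray(x, dtype=float))


def softmax(x):
    """Normalized exponentials over a 1-D vector; the outputs sum to 1."""
    x = np.asarray(x, dtype=float)
    shifted = np.exp(x - np.max(x))     # shift for numerical stability
    return shifted / shifted.sum()


probabilities = softmax([1.0, 2.0, 3.0])
```

In a classifier such as the one described, softmax would typically be applied to the output of the last fully connected layer so that each picture category receives a probability.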

S303, model evaluation: testing the effect of the classification model with the test set and evaluating the image classification model; judging whether the expected evaluation standard is met; if so, executing S305; if not, returning to S302 to continue model optimization. According to a preferred embodiment of the present invention, the methods for evaluating the picture model include, but are not limited to, the accuracy value, the precision value, the recall value, the F1 value, and other evaluation methods.

The accuracy is as follows:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

The precision is as follows:

Precision = TP / (TP + FP)

The recall is as follows:

Recall = TP / (TP + FN)

F1 value:

F1 = 2 * Precision * Recall / (Precision + Recall)

Wherein, the parameters are defined as follows:

True Positive (TP): the number of positive-class samples predicted as positive

True Negative (TN): the number of negative-class samples predicted as negative

False Positive (FP): the number of negative-class samples predicted as positive

False Negative (FN): the number of positive-class samples predicted as negative

S304, whether expectations are met: when the expected evaluation standard is not met, the model is optimized and adjusted, and the process returns to S302. According to a preferred embodiment of the present invention, the optimization adjustment includes, but is not limited to, adjusting the learning rate, randomly disconnecting a certain proportion of neurons (dropout), and changing the optimization function. According to a preferred embodiment of the present invention, the learning rate is 0.001. According to a preferred embodiment of the present invention, the candidate optimization functions include, but are not limited to, the Adam optimization algorithm and SGD (stochastic gradient descent).

When the model reaches the expected evaluation standard, the training, evaluation and optimization of the image classification model are complete, and the final result is the optimal result of the image classification model.

S110, calculating an image weight factor: analyzing the pictures historically published by the user according to the picture classification result obtained in S109, and calculating the image weight factor according to the analysis result.

According to an embodiment of the present invention, the image weight factor is calculated as follows:

S1103, recording the psychological test score as I; to simplify the calculation process and improve the calculation speed, the psychological test scores are divided into a number of grades according to the psychological test results and expressed as numbers;

The preset psychological formula is constructed as follows:

I＝T1*α+T2*β+…+Tn*γ

S1105, solving the formula group to obtain the picture psychological weight values α, β, …, γ, where α + β + … + γ = 1, that is, α, β, …, γ each lie between 0 and 1.
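One possible way to solve such a formula group is a least-squares fit over several individuals, with the sum-to-one condition added as a heavily weighted extra equation. This is only a sketch under stated assumptions: the description does not say which solver is used, the function name `solve_picture_weights` and the toy data are hypothetical, and the sketch does not force each weight into the range 0 to 1.

```python
import numpy as np


def solve_picture_weights(proportions, scores, constraint_weight=1e3):
    """Least-squares solution of I = T @ w with a soft sum-to-one constraint.

    proportions: one row per individual, one column per picture category (T1..Tn).
    scores:      one psychological test score I per individual.
    """
    t = np.asarray(proportions, dtype=float)
    i = np.asarray(scores, dtype=float)
    n = t.shape[1]
    # Extra row encodes constraint_weight * (w1 + ... + wn) = constraint_weight.
    a = np.vstack([t, constraint_weight * np.ones((1, n))])
    y = np.append(i, constraint_weight)
    w, *_ = np.linalg.lstsq(a, y, rcond=None)
    return w


# Hypothetical data: four individuals, three picture categories.
true_weights = np.array([0.2, 0.3, 0.5])
T = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6],
              [0.7, 0.1, 0.2]])
I = T @ true_weights
weights = solve_picture_weights(T, I)
```

With noisy real scores the fit would only approximate the data, and a constrained solver could be substituted if non-negative weights must be guaranteed.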

S111, constructing an image factor feature set: constructing the image factor feature set from the image weight factors and the pictures of the corresponding categories. According to a preferred embodiment of the present invention, a convolutional neural network is used to obtain the feature vector of each image, and, in combination with the picture psychological weight values obtained in S1105, each image feature vector may be multiplied by its weight factor, so that the multiple pictures of each individual form the image factor feature set of that individual.
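The weighting step can be sketched as a row-wise scaling of the feature vectors by the weight of each picture's category. The function name `image_factor_features` and the toy values are hypothetical, and the feature vectors are assumed to have been extracted by the convolutional neural network beforehand.

```python
import numpy as np


def image_factor_features(feature_vectors, categories, weights):
    """Scale each picture's feature vector by the weight of its category."""
    feats = np.asarray(feature_vectors, dtype=float)
    scale = np.array([weights[c] for c in categories], dtype=float)
    return feats * scale[:, None]       # broadcast one weight per row


# Two hypothetical pictures with 2-dimensional features, in categories 0 and 2.
picture_features = [[1.0, 2.0], [3.0, 4.0]]
picture_categories = [0, 2]
category_weights = [0.2, 0.3, 0.5]
factor_set = image_factor_features(picture_features, picture_categories,
                                   category_weights)
```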

S112, psychological state analysis: analyzing the psychological state of the user and predicting the psychological state of the user;

The analyzing of the user's psychological state specifically includes:

S401, constructing a factor data set: constructing a factor data set from the text factor data set constructed in S107 and the image factor data set constructed in S111. According to a preferred embodiment of the present invention, the text factor vector set and the image factor data set of the corresponding individual are concatenated as a feature set, that is, the text factor vector and the image factor vector of each individual are regarded as the individual features of that individual; the semantic similarity sequence and the image weight sequence of the corresponding individual are used as a factor label set; and the results of the professional psychological test are divided into levels according to score, for example 10 levels, which are used as the classification labels 1, 2, 3, …, 10;

According to a specific embodiment of the invention, the constructed factor data set is divided into a training set and a test set according to a preset ratio; the training set is used to train the psychological assessment model, and the test set is used to evaluate the quality of the model;
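Building one entry of the factor data set might look like the sketch below, which concatenates the text and image factor vectors and converts a test score into one of 10 levels. The 0 to 100 score range, the equal-width levels and the function name `build_sample` are assumptions made for illustration; the description does not specify how the levels are delimited.

```python
import numpy as np


def build_sample(text_factor, image_factor, score, max_score=100.0, levels=10):
    """Concatenate an individual's factor vectors and bin the score into a level."""
    features = np.concatenate([np.ravel(text_factor), np.ravel(image_factor)])
    # Equal-width bins; the top score is clamped into the highest level.
    level = min(levels, int(score * levels / max_score) + 1)
    return features, level


features, level = build_sample([1.0, 2.0], [3.0], score=35.0)
_, top_level = build_sample([1.0, 2.0], [3.0], score=100.0)
_, bottom_level = build_sample([1.0, 2.0], [3.0], score=0.0)
```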

S402, training an evaluation model: embedding the factor labels in the training set into the feature factor set by means of embedding, to serve as the feature set and classification labels, and loading the feature set and the classification labels into the psychological assessment model for training.

S403, evaluating the model: evaluating the obtained psychological evaluation model; judging whether the expected evaluation standard is met; if so, executing S405; if not, executing S404. According to a preferred embodiment of the present invention, the methods for evaluating the psychological evaluation model include, but are not limited to, the accuracy value, the precision value, the recall value, the F1 value, and the like.

S404, not meeting expectations: when the expected evaluation standard is not met, the model is optimized and adjusted, and the process returns to S402. According to a preferred embodiment of the present invention, the optimization adjustment includes, but is not limited to, adjusting the learning rate, randomly disconnecting a certain proportion of neurons (dropout), and changing the optimization function. According to a preferred embodiment of the present invention, the learning rate is 0.001. According to a preferred embodiment of the present invention, the candidate optimization functions include, but are not limited to, the Adam optimization algorithm and SGD (stochastic gradient descent).

When the model reaches the expected evaluation standard, the training, evaluation and optimization of the evaluation model are complete; the current optimal prediction result is the optimal evaluation model result.

S405, outputting a result: the current optimal prediction result obtained through S404 is the final output result.

According to another preferred embodiment of the present invention, the classification result of the individual's historical psychological pictures obtained in S305 is associated with the picture psychological weight values α, β, …, γ obtained in S1105; the proportion Tn of the pictures of each category to the total number of the individual's pictures is calculated, where n = 1, 2, …, N and N is the number of categories; the calculation results are then substituted into the preset psychological formula to obtain the psychological test score I = T1*α + T2*β + … + Tn*γ; next, the classification result obtained in S405 may also be treated as a score, added to the psychological test score obtained in the previous step and averaged, and the final result is the desired optimal result.

According to another preferred embodiment of the present invention, it is not necessary to construct the image factor feature set in S111 and merge it with the text factor feature set in S401; instead, the psychological test score is calculated directly by substitution into the preset psychological formula as described above, and this score is then added to the result of S405 and averaged; the final result is the desired optimal result.
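The substitution into the preset psychological formula and the subsequent averaging can be sketched as follows. The function names, the category labels and the toy values are hypothetical, and the sketch assumes that the model output from S405 and the formula score I have already been brought onto a common scale, which the description does not detail.

```python
from collections import Counter


def psychological_test_score(picture_categories, weights):
    """Compute I = T1*w1 + ... + Tn*wn from the category of each picture."""
    total = len(picture_categories)
    if total == 0:
        raise ValueError("at least one picture is required")
    counts = Counter(picture_categories)
    # counts[c] / total is the proportion Tn of pictures in category c.
    return sum(counts[c] / total * w for c, w in weights.items())


def combined_result(model_result, formula_score):
    """Add the S405 result to the formula score and average the two."""
    return (model_result + formula_score) / 2


# Four hypothetical pictures over three categories.
score = psychological_test_score(["a", "a", "b", "c"],
                                 {"a": 0.2, "b": 0.3, "c": 0.5})
final = combined_result(4.0, 6.0)
```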

By implementing the present invention, the psychological state of the user is evaluated by analyzing the user's historical psychological texts and image data within a unit time, without manually setting questions or making face-to-face contact with the user; this approach therefore places no pressure on the user, allows the user's real psychological state to be captured to a great extent, and makes the evaluation faster and more accurate; furthermore, the evaluation result and the corresponding countermeasure suggestions can be fed back in time, so that the user can intuitively understand his or her current psychological state and can conveniently self-adjust or seek medical advice in time to reach a state of psychological health.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims

1. A method for evaluating a psychological state based on image-text combination, comprising the following steps:

S101, data acquisition:

data acquisition is performed in three respects: acquiring professional psychological test data of individuals; acquiring historical psychological text and picture data of individuals; and collecting the historical information of a new individual;

S104, analyzing the acquired text data;

S105, calculating the semantic similarity of the psychological text;

vectorizing texts of the same type of the individual, and then carrying out addition and averaging operation on the texts to obtain a semantic expression vector set of three types of physiology, psychology and behavior; taking a result sequence set of text semantic similarity calculation as a label; combining the two data sets according to individuals in sequence to construct a text factor characteristic set;

acquiring a feature vector of an image by using a convolutional neural network and multiplying an image weight factor by the image feature vector, wherein a plurality of pictures of each individual can form an image factor feature set of the individual;

the analyzing of the user's psychological state specifically includes:

S401, constructing a factor data set: constructing a factor data set from the text factor data set constructed in S107 and the image factor data set constructed in S111; concatenating the text factor vector set and the image factor data set of the corresponding individual as a feature set, regarding the text factor vector and the image factor vector of each individual as the individual features of that individual, using the semantic similarity sequence and the image weight sequence of the corresponding individual as a factor label set, and dividing the results of the professional psychological test into grades according to score to serve as classification labels;

dividing the constructed factor data set into a training set and a test set according to a preset ratio; the training set is used to train the psychological assessment model, and the test set is used to evaluate the quality of the model;

S402, training an evaluation model: embedding the factor labels in the training set into the feature factor set by means of embedding, to serve as the feature set and classification labels, and loading the feature set and the classification labels into the psychological assessment model for training;

S403, evaluating the model: evaluating the psychological evaluation model;

S404, not meeting expectations: when the expected evaluation standard is not met, optimizing and adjusting the model, and returning to execute S402;

when the model reaches the expected evaluation standard, the training, evaluation and optimization of the evaluation model are complete; the current optimal prediction result is the optimal evaluation model result;

S405, outputting a result.

2. The method for evaluating a psychological state based on image-text combination according to claim 1, characterized in that:

the acquiring of professional psychological test data of individuals includes, but is not limited to, acquiring relevant psychological test data of high credibility from professional institutions.

3. The method for evaluating a psychological state based on image-text combination according to claim 2, characterized in that:

the historical psychological text and picture data of individuals includes, but is not limited to, data obtained from what the user has historically published.

4. The method for evaluating a psychological state based on image-text combination according to claim 3, characterized in that:

the historical information of the new individual includes, but is not limited to, information obtained from psychology-related text and picture data published by the new individual within a unit time.

5. The method for evaluating a psychological state based on image-text combination according to claim 4, characterized in that:

S104 further includes: preprocessing the acquired data before text analysis, including but not limited to encoding the data according to the Unicode encoding specification, removing text not related to psychology, filtering special characters, and removing stop words.

6. The method for evaluating a psychological state based on image-text combination according to claim 5, characterized in that:

S105 further includes: constructing a semantic vector model to convert the psychological text into semantic vectors; and calculating the semantic similarity between the psychological text and the questions in the corresponding psychological test table, and storing the result.

7. The method for evaluating a psychological state based on image-text combination according to claim 6, characterized in that:

the image classification in S109 further includes the steps of:

S302, training a classification model: converting the pictures into corresponding matrix representations before training the classification model;

S303, model evaluation: testing the effect of the classification model with the test set, and evaluating the image classification model;

S305, outputting the result.

8. The method for evaluating a psychological state based on image-text combination according to claim 7, characterized in that:

in S110, the image weight factor is specifically calculated as follows:

the preset psychological formula is constructed as follows:

I＝T1*α+T2*β+…+Tn*γ

wherein α, β, … and γ are the picture psychological weight values, and I is the psychological test score;

S1105, solving the formula group to obtain the picture psychological weight values α, β, …, γ, where α + β + … + γ = 1, that is, α, β, …, γ each lie between 0 and 1.

9. The method for evaluating a psychological state based on image-text combination according to claim 8, characterized in that:

the relevant psychological test data includes, but is not limited to, professional psychological test tables of individuals and their test results, together with the corresponding test data source time, personal information, assessment results, and countermeasure suggestion data.

10. The method for evaluating a psychological state based on image-text combination according to claim 9, characterized in that:

the data historically published by the user includes, but is not limited to, social media history data published by the user.