CN114969554A - User emotion adjusting method and device, electronic equipment and storage medium

User emotion adjusting method and device, electronic equipment and storage medium

Info

Publication number
CN114969554A
Authority
CN
China
Prior art keywords
emotion, negative, sample, media resource, text
Legal status
Granted
Application number
CN202210894354.XA
Other languages
Chinese (zh)
Other versions
CN114969554B (en)
Inventor
李勇
展丽霞
肖强
孔昭阳
李伟生
郑加强
吴敏
Current Assignee
Hangzhou Netease Cloud Music Technology Co Ltd
Original Assignee
Hangzhou Netease Cloud Music Technology Co Ltd
Application filed by Hangzhou Netease Cloud Music Technology Co Ltd filed Critical Hangzhou Netease Cloud Music Technology Co Ltd
Priority to CN202210894354.XA priority Critical patent/CN114969554B/en
Publication of CN114969554A publication Critical patent/CN114969554A/en
Application granted granted Critical
Publication of CN114969554B publication Critical patent/CN114969554B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 Details of database functions independent of the retrieved data types
    • G06F 16/95 Retrieval from the web
    • G06F 16/953 Querying, e.g. by the use of web search engines
    • G06F 16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 Movements or behaviour, e.g. gesture recognition

Abstract

The application relates to the field of internet technologies, and in particular to a method and a device for adjusting user emotion, an electronic device, and a storage medium. The method comprises the following steps: identifying a target user with a negative emotion based on behavior information of any user, the behavior information including one or both of operation information for media resources and expression information for text; then determining a negative emotion level of the target user based on one or both of the behavior features and the first portrait features of any target user, the behavior features including one or both of operation features for media resources with a negative emotion and expression features for text with a negative emotion; and finally, selecting from a media resource library media resources whose positive emotion level matches the negative emotion level of the target user and recommending the selected media resources to the target user, so that the target user's operation behavior on media resources is correctly guided and the target user's negative emotion is relieved.

Description

User emotion adjusting method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of internet technologies, and in particular, to a method and an apparatus for adjusting user emotion, an electronic device, and a storage medium.
Background
With the popularization of application software, more and more people engage in entertainment, social interaction and the like through application software on terminals. Entertainment or social behavior on a terminal can influence the emotional support people obtain from their social environment; in particular, people with negative emotions (e.g., depression patients) may get rid of negative emotions in real life through entertainment or social behavior in application software.
For example, listening to music can activate and improve a person's emotional state; therefore, for people with negative emotions, listening to suitable music helps to gradually eliminate those emotions.
Therefore, helping users with negative emotions adjust their emotions by guiding their entertainment, social and other behaviors is very important, and how to correctly guide those behaviors is a problem to be solved.
Disclosure of Invention
The embodiments of the disclosure provide a user emotion adjusting method and device, an electronic device, and a storage medium, which are used to correctly guide the behavior of a negative user so as to adjust the negative user's emotion.
In a first aspect, an embodiment of the present disclosure provides a method for adjusting a user emotion, including:
identifying a target user with a negative emotion based on behavioral information of any user; the behavior information includes at least one of: operation information for the media resource, expression information for the text;
determining a negative emotion level of any target user based on the behavior features and/or the first portrait features of the target user; the behavior features include at least one of: operation features for media resources with a negative emotion, expression features for text with a negative emotion;
and selecting the media resources with the positive emotion ratings matched with the negative emotion ratings of the target users from a media resource library, and recommending the selected media resources to the target users.
In a second aspect, an embodiment of the present disclosure further provides a device for adjusting a user emotion, including:
a negative user identification module, configured to identify a target user with a negative emotion based on the behavior information of any user; the behavior information includes at least one of: operation information for media resources, expression information for text;
a user level determining module, configured to determine the negative emotion level of any target user based on the behavior features and/or the first portrait features of the target user; the behavior features include at least one of: operation features for media resources with a negative emotion, expression features for text with a negative emotion;
and the first recommending module is used for selecting the media resources with the positive emotion grades matched with the negative emotion grades of the target user from the media resource library and recommending the selected media resources to the target user.
In a third aspect, the disclosed embodiments also provide an electronic device, which includes a processor and a memory, where the memory stores a computer program that is executable on the processor, and when the computer program is executed by the processor, the processor is caused to execute the steps of any one of the methods for adjusting user emotion according to the first aspect.
In a fourth aspect, the disclosed embodiments also provide a computer-readable storage medium, which stores a computer program, and when the computer program runs on an electronic device, the computer program is configured to enable the electronic device to execute the steps of any one of the methods for adjusting user emotion in the first aspect.
In a fifth aspect, embodiments of the present disclosure provide a computer program product comprising a computer program, the computer program being stored in a computer readable storage medium; when the processor of the electronic device reads the computer program from the computer-readable storage medium, the processor executes the computer program, so that the electronic device performs the steps of any of the above-mentioned user emotion adjusting methods.
The method for adjusting the emotion of the user provided by the embodiment of the disclosure at least has the following beneficial effects:
according to the scheme provided by the embodiment of the disclosure, firstly, a target user with a negative emotion is identified based on behavior information of any user, wherein the behavior information comprises one or two of operation information aiming at a media resource and expression information aiming at a text; then, determining a negative emotion grade of the target user based on one or two of the behavior characteristic and the first image characteristic of any target user, wherein the behavior characteristic comprises one or two of an operation characteristic aiming at a media resource with negative emotion and an expression characteristic aiming at a text with negative emotion; and finally, selecting the media resources with the positive emotion grades matched with the negative emotion grades of the target users from the media resource library, and recommending the media resources to the target users. In this way, the target user is correctly guided to operate on the media resource, and the negative emotion of the target user is relieved by identifying the target user with the negative emotion and determining the negative emotion grade of the target user so as to recommend the media resource with the positive emotion grade matched with the negative emotion grade of the target user to the target user.
Additional features and advantages of the disclosure will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the disclosure. The objectives and other advantages of the disclosure may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present disclosure; other drawings can be obtained by those skilled in the art without creative effort.
Fig. 1 is a schematic view of an application scenario of a method for adjusting user emotion according to an embodiment of the present disclosure;
fig. 2 is a flowchart of a method for adjusting user emotion according to an embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of a classification model of negative emotions of a user according to an embodiment of the present disclosure;
fig. 4 is a schematic flowchart illustrating a process of determining an emotion confidence of a sample song based on the PageRank algorithm according to an embodiment of the present disclosure;
fig. 5 is a schematic diagram of a construction process of a media resource emotion classification model according to an embodiment of the present disclosure;
fig. 6 is a schematic diagram of a training process of a media resource emotion classification model according to an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of a text emotion classification model according to an embodiment of the present disclosure;
FIG. 8 is a schematic flow chart illustrating a recommendation of a media resource to a target user according to an embodiment of the present disclosure;
fig. 9 is a schematic diagram of a dynamic boot process for a target user according to an embodiment of the present disclosure;
FIG. 10 is a schematic diagram of another dynamic boot process for a target user according to an embodiment of the present disclosure;
fig. 11 is a schematic diagram of implementation logic of a method for adjusting user emotion according to an embodiment of the present disclosure;
fig. 12 is a schematic diagram of a user emotion adjusting device provided in an embodiment of the present disclosure;
fig. 13 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions and advantages of the present disclosure clearer, the present disclosure is described in further detail below with reference to the accompanying drawings. It is apparent that the described embodiments are only a part of the embodiments of the present disclosure, not all of them. All other embodiments that can be derived by a person skilled in the art from the embodiments disclosed herein without creative effort fall within the protection scope of the present disclosure. The data referred to in the present disclosure may be data authorized by the user or sufficiently authorized by each party, and the embodiments of the present disclosure may be combined with each other.
The following explains a part of terms related to the embodiments of the present disclosure.
Confidence (Confidence): also called reliability, confidence level, or confidence coefficient. When a population parameter is estimated from a sample, the conclusion is always uncertain because of the randomness of the sample. A probabilistic statement is therefore used, namely interval estimation in mathematical statistics: the probability that the estimated value falls within a certain allowable error range of the population parameter, this probability being called the confidence. The emotion confidence of a sample media resource in the embodiments of the present disclosure may be understood as the probability that the sample media resource has the corresponding negative emotion level or positive emotion level.
Web page ranking (PageRank): an algorithm that estimates the importance of web pages, taking the number and quality of hyperlinks between pages as the main factors. It is used by the Google search engine to analyze the relevance and importance of web pages, and in search engine optimization it is often used as one of the factors for evaluating the effect of page optimization.
BPM: beats per minute. Tempo is generally marked at the beginning of a musical composition with text or numbers and is usually measured in beats per minute, i.e., the number of times a given note (e.g., a quarter note) occurs in one minute; a higher BPM value indicates a faster tempo.
UGC: User Generated Content, i.e., users' original content. The term originated in the internet field, where users show or provide their original content to other users through an internet platform. For example, the text referred to in the embodiments of the present disclosure is user-generated content, including comments, title information of media resource collections, attribute tags, and the like.
Bandit: an algorithm for making real-time decisions in uncertain scenarios. Through several trials, it characterizes the probability that a new user is interested in each Topic. If the user is interested in a Topic, explicit or implicit feedback indicates that a reward was obtained; if a Topic the user is not interested in is given, the recommendation system receives negative feedback. Repeating the "select-observe-update-select" cycle many times increasingly approximates the Topics the user is really interested in.
BERT: Bidirectional Encoder Representations from Transformers, a pre-trained language representation model. The underlying Transformer is divided into an encoder and a decoder, and BERT mainly adopts the attention mechanism to encode text information by effectively using context. A model pre-trained with BERT can serve downstream tasks well after simple fine-tuning.
It is noted that the terms "first," "second," and the like in the description and in the claims of the present disclosure are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein.
Furthermore, the terms "comprises," "comprising," and any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The design idea of the embodiments of the present disclosure is described below.
As described above, helping users with negative emotions adjust their emotions by guiding their entertainment or social behaviors is very important, and how to correctly guide those behaviors is a problem to be solved.
A negative user's entertainment or social activities include operation behaviors on media resources, where media resources include music, videos, broadcasts and the like. By recommending media resources with positive emotions to a negative user, the user can mitigate negative emotions by means of the recommended media resources. Taking music as an example: when enjoying music, the user experiences the aesthetics of musical art through hearing; in the listening process, through sound perception, emotional feeling, image association and rational comprehension, the user's emotion is improved and the mood is relieved.
in general, the negative emotions of a negative user can be divided into a plurality of levels, and the degree of negative is different for different levels. For example: taking a depressed mood as an example, the depression mood is generally divided into three grades, namely mild depression, moderate depression and severe depression. Thus, when recommending media assets to a negative user, media assets that match their negative mood rating may be selected.
In view of this, the present disclosure provides a method and an apparatus for adjusting a user emotion, an electronic device, and a storage medium, which may identify a user with a negative emotion, determine a negative emotion level of a target user, select a media resource with a positive emotion level matching with the negative emotion level, and recommend the media resource to the target user, so as to correctly guide the target user to change from the negative emotion to the positive emotion, thereby implementing emotion adjustment of the target user.
An application scenario of the embodiments of the present disclosure is exemplarily described below with reference to the drawings.
Reference is made to fig. 1, which is a schematic view of an application scenario of a method for adjusting a user emotion according to an embodiment of the present disclosure. The application scenario includes a plurality of terminal devices 100 and a server 200, and the plurality of terminal devices 100 and the server 200 may be connected through a wired or wireless communication network, respectively.
The terminal device 100 is an electronic device used by a user, and the electronic device includes, but is not limited to, an electronic device such as a desktop computer, a mobile phone, a computer, an intelligent appliance, an intelligent voice interaction device, and a vehicle-mounted terminal. Various applications, such as media resource applications (e.g., audio applications, video applications, etc.), information applications, shopping applications, social applications, etc., may be installed on the terminal device. The server 200 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, middleware service, a domain name service, a security service, a CDN, a big data and artificial intelligence platform, and the like.
In an optional implementation, a media resource application is installed in the terminal device 100, and a user may use the terminal device 100 to log in to the media resource application. Operation information for media resources in the application includes, but is not limited to, clicking a media resource, playing a media resource, collecting a media resource, downloading a media resource, and the like; in addition, text expression may be performed, such as making a comment or liking a comment. The server 200 may be a background server of the media resource application and may obtain various behavior information of the user in the application, including but not limited to operation information for media resources, expression information for text, and the like.
Server 200 may identify a target user with a negative emotion based on any user's operation information for media resources and/or expression information for text; then determine the negative emotion level of any target user based on the behavior features and/or first portrait features of the target user, the behavior features including operation features for media resources with a negative emotion and/or expression features for text with a negative emotion; and finally select media resources whose positive emotion level matches the negative emotion level from the media resource library and recommend them to the target user.
The above application scenarios are merely illustrative for facilitating an understanding of the spirit and principles of the present disclosure, and embodiments of the present disclosure are not limited in any way in this respect. Rather, embodiments of the present disclosure may be applied to any scenario where applicable.
In the embodiment of the disclosure, the acquisition, transmission, use and the like of data all meet the requirements of relevant national laws and regulations.
The method for adjusting user emotion according to the embodiments of the present application is described below with reference to the accompanying drawings and the detailed description.
Referring to fig. 2, an embodiment of the present disclosure provides a user emotion adjusting method, which may be applied to a server, such as the server 200 shown in fig. 1, and may include the following steps S201 to S203:
step S201, identifying a target user with a negative emotion based on the behavior information of any user; the behavior information includes at least one of: operation information for media resources, expression information for text.
The media resources include but are not limited to songs, broadcasts, videos, and the like, and the operation information for the media resources includes but is not limited to: click information, play information, effective play (for example, the play time reaches the set time), complete play information, collection information, addition list information, download information, sharing information, purchase information, and the like. Various operation information aiming at the media resources can not only reflect the interest and preference of the user, but also express the emotion of the user, such as happy, angry, sadness, angry, dysphoria, depression, worry and the like.
Expression information for text includes, but is not limited to: information on published text (e.g., comments), approval (like) information for published text, information on creating title information and/or attribute tags for a media resource collection, and the like. A user can build a media resource collection according to his or her understanding of and preference for media resources, and can in addition create title information and/or attribute tags for the collection, i.e., one or both of title information and attribute tags; the title information and attribute tags may reflect the type of the media resources, emotion information, and the like. Taking songs as an example, the media resource collection may be a song list; for example, the title information of one song list is "you love this popular Chinese song list" and its attribute labels are "Chinese, popular, healing"; the title information of another song list is "how wind can be avoided in the harbor" and its attribute labels are "Chinese, night, hurt".
Various expression information for the text may also reflect the emotion expressed by the user. Accordingly, a target user having a negative emotion may be identified based on one or both of operation information for a media asset, expression information for text.
The media resource library comprises media resources with negative emotions, media resources with positive emotions and media resources without emotions, and various texts expressed by each user comprise texts with negative emotions, texts with positive emotions and texts without emotions. In the following embodiments of the present disclosure, the determination manner of the emotion information of the media resource and the determination manner of the emotion information of the text will be described in detail.
Optionally, when the target user with a negative emotion is identified in the step S201, the following steps a1-a2 may be performed:
a1, based on the behavior information of any user, determining whether the user meets at least one of the following conditions: the operation information for the media resource having the negative emotion satisfies a first preset condition, and the expression information for the text having the negative emotion satisfies a second preset condition.
a2, if yes, determining the user as a target user with a negative emotion.
The first preset condition and the second preset condition may be set as needed. For example, the first preset condition may be: the number of operations on media resources with a negative emotion reaches a first set number; and the second preset condition may be: the number of expressions on text with a negative emotion reaches a second set number. The first set number and the second set number may be set as needed and are not limited here.
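As a minimal sketch of steps a1-a2 (the threshold values and field names are assumptions, since the patent leaves them open):

    # Sketch of steps a1-a2; counters and thresholds are hypothetical.
    FIRST_SET_NUMBER = 10    # example threshold for the first preset condition
    SECOND_SET_NUMBER = 10   # example threshold for the second preset condition

    def is_target_user(behavior_info: dict) -> bool:
        """A user is a target user if at least one preset condition holds."""
        ops_on_negative_media = behavior_info.get("negative_media_operations", 0)
        exprs_on_negative_text = behavior_info.get("negative_text_expressions", 0)
        return (ops_on_negative_media >= FIRST_SET_NUMBER
                or exprs_on_negative_text >= SECOND_SET_NUMBER)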
The negative emotion of any target user has a corresponding negative emotion level. Different levels indicate different degrees of negative emotion, a higher level indicating a greater degree, and the specific number of levels can be set as needed. For example, depression can be classified into mild depression, moderate depression and severe depression.
Thus, after identifying the target users with negative emotions, the following step S202 may be continued to determine the negative emotion level of any of the target users.
Step S202, determining a negative emotion level of the target user based on the behavior features and/or the first portrait features of any target user; the behavior features include at least one of: operation features for media resources with a negative emotion, expression features for text with a negative emotion.
Wherein the first portrait features of the target user include, but are not limited to: age, sex, region, education level, etc.
In step S202, the operational characteristics for media assets with negative emotions may include at least one of the following A1-A7:
A1: the number and/or proportion of media resources with a negative emotion operated on by the target user between a set historical time point and the current time point.
Here A1 represents the real-time operation features of the target user, which characterize real-time changes in the target user's interest so that such changes can be captured quickly and reacted to accordingly. The set historical time point may be chosen as needed; optionally, the period between the set historical time point and the current time point may be the last several hours.
The number and/or proportion of operated media resources with a negative emotion can include one or both of the number and the proportion of effectively played media resources with a negative emotion; the proportion of effective plays with a negative emotion can be: the ratio of the number of effectively played media resources with a negative emotion to the total number of effectively played media resources.
A2: the number of times the target user performed specified operations on media resources with a negative emotion in each of at least one historical time period.
A3: the frequency with which the target user performed specified operations on media resources with a negative emotion in each of at least one historical time period.
A4: the number of times the target user performed specified operations on media resources having a corresponding negative emotion level in each of at least one historical time period.
A5: the frequency with which the target user performed specified operations on media resources having a corresponding negative emotion level in each of at least one historical time period.
In the above A2-A5, at least one historical time period may be selected as needed; optionally, it may include the previous 1 day, the previous 7 days, the previous 28 days, and so on, which is not limited here. The specified operations include one or more of: clicking, playing, effective playing, complete playing, collecting, downloading, sharing and purchasing.
The frequency is explained by taking as an example the frequency with which the target user played media resources with a negative emotion in the previous 1 day: it is the ratio of the number of times the target user played media resources with a negative emotion in the previous day to the total number of plays in that day. Other operation frequencies are similar and are not described again here; a code sketch of these count and frequency features is given after the explanation of A2-A7 below.
A6: whether the target user has the following historical operating states: the number of times of executing the specified operation for the media resource with the negative emotion within the first set time length reaches the first set number of times.
The first set time period can be set according to needs, such as one day. The first set number of times may be set as needed, for example, 10 times, but is not limited thereto. Here, the number of times the specified operation is performed with respect to the media asset having the negative emotion may be: a number of times, such as a number of plays, that any of the above operations are performed with respect to a media asset having a negative emotion.
A7: the time length of the time point which reaches the historical operation state last time from the current time point.
The above A2-A7 may reflect whether the target user has historically had a negative emotional tendency and how severe it was. A2-A7 are explained below in three aspects.
(1) The above A2 and A3 represent, respectively, the count expression and the frequency expression of the target user for media resources with negative emotions.
Taking media resources as songs and the negative emotion as depression as an example, a media resource with a negative emotion may be understood as a depressed song. The specified operations for a depressed song include one or more of the operations above, and different operations represent different degrees of preference: clicking represents willingness to listen to the song; complete playing represents liking it somewhat; adding it to a list represents further liking, since it may be replayed in the future; purchasing represents willingness to pay for it, a deeper liking. Thus, the various operations on depressed songs can reflect the target user's depressed expression. To express the target user's depressed mood more accurately, the time factor also needs to be considered, i.e., the count and frequency with which the target user performs these operations. For example: the number of times the target user clicked/effectively played/completely played/added to a list/collected/downloaded/shared/purchased depressed songs in the previous 1/7/28 days, and the frequency with which the target user did so in the previous 1/7/28 days.
(2) The above A4 and A5 represent, respectively, the count expression and the frequency expression of the target user for media resources graded by negative emotion level.
Again taking media resources as songs and the negative emotion as depression as an example, assume depression includes three levels: mild, moderate and severe. To describe the target user's depressive tendency at a finer granularity, the count and frequency with which the target user performs the above operations on songs of different depression levels can be recorded, finally obtaining the count and frequency of the target user's operations on songs of a given depression level in a given time period. For example: the number of times the target user clicked/played/effectively played/completely played/added to a list/collected/downloaded/shared/purchased mildly/moderately/severely depressed songs in the previous 1/7/28 days, and the corresponding frequency in the previous 1/7/28 days.
(3) The above A6-A7 represent the periodic operation behavior of the target user on media resources with negative emotions.
Taking media resources as songs and the negative emotion as depression as an example, depression tends to recur periodically, i.e., after a depressed patient is treated there is still a possibility of later episodes. To characterize this recurrence, the target user's operation behavior on depressed songs may be recorded, including whether there has been frequent behavior on depressed songs (e.g., the number of times depressed songs were played within a first set time length exceeds a first set number of times), the number of days from the last frequent behavior on depressed songs to the current day, and so on.
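As promised above, here is a minimal sketch of how the count and frequency features A2-A5 might be derived from an operation log; the log schema and window choices are assumptions, not fixed by the patent:

    # Sketch of features A2-A5; each log entry is assumed to be a tuple
    # (timestamp: datetime, operation: str, is_negative: bool, negative_level).
    from collections import Counter

    def operation_features(ops, now, windows=(1, 7, 28)):
        feats = {}
        for days in windows:
            in_window = [o for o in ops if (now - o[0]).days < days]
            total = len(in_window) or 1            # avoid division by zero
            neg = [o for o in in_window if o[2]]
            feats[f"neg_count_{days}d"] = len(neg)            # A2: count
            feats[f"neg_freq_{days}d"] = len(neg) / total     # A3: frequency
            per_level = Counter(o[3] for o in neg)
            for level, cnt in per_level.items():              # A4/A5: per level
                feats[f"neg_count_{days}d_{level}"] = cnt
                feats[f"neg_freq_{days}d_{level}"] = cnt / total
        return feats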
In step S202, the expression features for text with negative emotions may include at least one of the following B1-B6:
B1: the number of times the target user performed specified expression behaviors on text with a negative emotion in each of at least one historical time period;
B2: the frequency with which the target user performed specified expression behaviors on text with a negative emotion in each of at least one historical time period;
B3: the number of times the target user performed specified expression behaviors on text having a corresponding negative emotion level in each of at least one historical time period; wherein the negative emotion of text corresponds to at least one negative emotion level;
B4: the frequency with which the target user performed specified expression behaviors on text having a corresponding negative emotion level in each of at least one historical time period;
B5: whether the target user has the following historical behavior state: the number of times specified expression behaviors were performed on text with a negative emotion within a second set time length reached a second set number of times;
wherein the second set time period can be set according to needs, such as one day. The second set number of times may be set as needed, for example, 10 times, which is not limited. Here, the number of times the specified expressive behavior is performed for text with negative emotions may be: the number of times any of the above-described expression actions are performed for text having a negative emotion, for example, the number of times the text is published.
B6: the time elapsed from the time point when the historical behavior state was last reached to the current time point.
In the above B1-B4, at least one historical time period may be selected as needed; optionally, it may include the previous 1 day, the previous 7 days, the previous 28 days, and so on, which is not limited here. The specified expression behaviors include one or more of: publishing text (e.g., comments), liking text, creating title information for a media resource collection, and creating attribute tags for a media resource collection.
Similar to A2-A7 above, B1-B6 may also reflect whether the target user has historically had a negative emotional tendency and how severe it was. B1-B6 are explained below in three aspects.
1) The above B1 and B2 represent, respectively, the count expression and the frequency expression of the target user for text with negative emotions.
Taking the negative emotion of text as depression as an example, text with a negative emotion can be understood as depressed text. The specified expression behaviors of the target user for depressed text include one or more of: publishing comments, liking comments, and creating song lists. The count and frequency with which the target user performs these different expression behaviors are recorded, such as: the number of comments/likes/song lists created by the target user for depressed text in the previous 1/7/28 days, and the frequency of commenting/liking/creating song lists for depressed text in the previous 1/7/28 days.
2) The above B3 and B4 represent, respectively, the count expression and the frequency expression of the target user for text graded by negative emotion level.
Again taking media resources as songs and the negative emotion as depression as an example, assume depression includes three levels: mild, moderate and severe. The count and frequency of the target user's various expression behaviors for text of different depression levels can be recorded, finally obtaining the count and frequency of the target user's expression behaviors for text of a given depression level in a given time period. For example: the number of times the user commented/liked/created a song list for mildly/moderately/severely depressed text in the previous 1/7/28 days, and the frequency of commenting/liking/creating song lists for mildly/moderately/severely depressed text in the previous 1/7/28 days.
3) The above B5 and B6 represent the periodic expression behavior of the target user for text with negative emotions.
Specifically, the expression behavior of the target user on depressed text is recorded, including whether the target user has had frequent expression behavior on depressed text (for example, the number of publications for depressed text within a second set time length exceeded a second set number of times), the number of days from the last frequent expression behavior on depressed text to the current day, and so on.
In step S202, the negative emotion level of the target user can be determined by analyzing one or both of the behavior features and the first portrait features of any target user, where the behavior features include some or all of A1-A7 and B1-B6.
Optionally, the behavioral characteristics of the target user and the first portrait characteristics may be analyzed simultaneously in order to more accurately determine the negative emotion level of the target user. The following embodiments of the present disclosure will further describe the specific implementation of step S202.
Step S203, selecting the media resources with the positive emotion level matched with the negative emotion level of the target user from the media resource library, and recommending the selected media resources to the target user.
The media resource library comprises media resources with various negative emotion grades and media resources with various positive emotion grades, and the media resources with the positive emotion grades matched with the negative emotion grades of the target user are selected, so that the media resources can be recommended to the target user in a targeted mode, the operation behaviors of the target user on the media resources are correctly guided, and the negative emotions of the target user are better relieved.
The following embodiment describes a specific implementation of the step S202.
In some embodiments, determining the negative emotion level of the target user based on the behavior features and/or the first portrait features of any target user in step S202 may include the following steps:
inputting the behavior features and/or the first portrait features of the target user into a user negative emotion classification model to obtain the negative emotion level of the target user; the user negative emotion classification model includes a first-order feature crossing module, a second-order feature crossing module, and a classification module connected to each of the two crossing modules.
Specifically, the user negative emotion classification model further includes an embedding module, which converts the input features (the behavior features and/or first portrait features) into dense embedded vectors. Each input feature has a subscript; for each input feature, the embedding module obtains the corresponding vector based on the feature's subscript and then multiplies that vector by the feature's value to obtain the feature's embedded vector. Each feature corresponds to a vector, and the correspondence between feature subscripts and vectors is stored in advance, so the embedding module looks up the corresponding vector according to the feature's subscript, e.g., by table lookup.
The first-order feature crossing module performs first-order crossing on the feature vectors output by the embedding module; the second-order feature crossing module performs feature crossing of an order greater than that of the first-order crossing on the feature vectors output by the embedding module; the classification module performs the classification operation based on the output feature vectors of the first-order and second-order feature crossing modules to obtain the negative emotion level of the target user.
Illustratively, as shown in fig. 3, the user negative emotion classification model includes an embedding layer, an FM (Factorization Machines) layer, a full connection layer and a softmax layer, wherein the embedding layer is the embedding module, the FM layer may serve as the first-order feature crossing module, the full connection layer may serve as the second-order feature crossing module, and the softmax layer may serve as the classification module.
The input layer comprises the features input into the user negative emotion classification model; assume that the behavior features and the first portrait features are both included, where the behavior features further comprise one or both of operation features for media resources with a negative emotion and expression features for text with a negative emotion. The operation features for media resources with a negative emotion include one or more of: historical operation count features (one or more of A2, A4, A6 and A7), historical operation frequency features (one or both of A3 and A5), and real-time operation features (A1); the expression features for text with a negative emotion include one or more of: historical expression count features (one or more of B1, B3, B5 and B6) and historical expression frequency features (one or both of B2 and B4). Each feature of the input layer is input into the embedding layer, which outputs an embedded vector for each feature.
The embedded vector of each feature is input into the FM layer, which captures the importance of features through feature crossing and comprises two parts. One part is the first-order crossing, as shown in formula (1):

$$y_w = \sum_{i=1}^{n} w_i x_i \tag{1}$$

where $y_w$ is the first-order cross feature after first-order crossing of the features, $n$ is the number of features output by the embedding layer, $w_i$ is the weight of the i-th feature, and $x_i$ is the value of the feature.
The other part is the second-order crossing, as shown in formula (2):

$$y_v' = \sum_{i=1}^{n}\sum_{j=i+1}^{n} w_{i,j}\, x_i x_j \tag{2}$$

where $w_{i,j}$ is the weight of the crossing of the i-th and j-th features. In practice, to reduce computational complexity, formula (2) can be simplified as shown in formula (3):

$$y_v = \frac{1}{2}\sum_{f=1}^{L}\left[\left(\sum_{i=1}^{n} v_{i,f}\right)^{2} - \sum_{i=1}^{n} v_{i,f}^{2}\right] \tag{3}$$

where $y_v$ is the second-order cross feature after second-order crossing of the features, $L$ is the length of each feature's vector after the embedding layer, and $v_{i,f}$ is the value in column $f$ of the i-th feature's vector after the embedding layer.
The embedded vector of each feature is also input into the fully connected layer, which captures high-order crossings of the features. Internally it consists of three fully connected layers, the last of which has dimensionality 1; the output of the fully connected layer is denoted $y_d$.
The softmax layer adds the FM layer outputs $y_w$ and $y_v$ and the fully connected layer output $y_d$, the result being as shown in formula (4):

$$y = y_w + y_v + y_d \tag{4}$$
The probability of each class (i.e., each negative emotion level) is then calculated by applying the softmax operation to $y$, as shown in formula (5):

$$f(X_i) = \frac{e^{X_i}}{\sum_{c=1}^{C} e^{X_c}} \tag{5}$$

where $X_i$ is the predicted value of the i-th class in $y$, $C$ is the number of classes, and $f(X_i)$ is the probability of the i-th class; the class with the highest probability is taken as the output classification result.
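To make the structure concrete, here is a minimal PyTorch sketch of the architecture of fig. 3 (embedding, FM-style first-/second-order crossing, a three-layer fully connected block, softmax). All layer sizes, and the way the scalar FM terms are combined with per-class logits, are illustrative assumptions, since formulas (4)-(5) do not fix them; this is not the patent's exact implementation.

    import torch
    import torch.nn as nn

    class UserNegativeEmotionClassifier(nn.Module):
        # Sketch only: sizes and the per-class combination are assumptions.
        def __init__(self, n_features: int, emb_dim: int = 8, n_classes: int = 4):
            super().__init__()
            self.w = nn.Parameter(torch.zeros(n_features))                   # w_i, formula (1)
            self.v = nn.Parameter(torch.randn(n_features, emb_dim) * 0.01)   # embedding table
            self.mlp = nn.Sequential(                                        # three fully connected layers
                nn.Linear(n_features * emb_dim, 128), nn.ReLU(), nn.Dropout(0.5),
                nn.Linear(128, 64), nn.ReLU(), nn.Dropout(0.5),
                nn.Linear(64, n_classes),                                    # per-class logits (assumption)
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, n_features) feature values
            y_w = (self.w * x).sum(dim=1)                                    # formula (1)
            emb = x.unsqueeze(2) * self.v                                    # embedded vectors (value * vector)
            y_v = 0.5 * (emb.sum(1).pow(2) - emb.pow(2).sum(1)).sum(1)       # formula (3)
            y_d = self.mlp(emb.flatten(1))                                   # high-order crossing
            logits = y_d + (y_w + y_v).unsqueeze(1)                          # formula (4), broadcast
            return torch.softmax(logits, dim=1)                              # formula (5)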
The following embodiment illustrates the training process of the user negative emotion classification model.
First, a training sample of a user negative emotion classification model is introduced.
The user negative emotion classification model is trained based on a first sample data set, and each sample data in the first sample data set comprises: the behavior features and/or first portrait features of a sample user, and the sample user's corresponding negative emotion level label.
The behavior features and/or first portrait features of a sample user are obtained in the same manner as those of a target user, and are not described again here.
Secondly, the acquisition mode of the negative emotion level label corresponding to the sample user is introduced.
In some embodiments, the process of obtaining the corresponding negative emotion level label for a sample user includes the following steps a1-a2:
a1, for each sample user of the plurality of sample users, determining a negative emotion score for the sample user based on the sample user's specified operation features for media resources having corresponding negative emotion levels and/or specified expression features for text having corresponding negative emotion levels within a set time period.
The set time period may be set as required, and specifically may be a recent period, such as the last week or the last 28 days, which is not limited here. The specified operation features include at least one of: the frequency with which a sample user performs a certain operation behavior on media resources having a certain negative emotion level, the operation behavior comprising at least one of: clicking, effective playing, complete playing, collecting, adding to a list, downloading, sharing, purchasing. The specified expression features include at least one of: the frequency with which a sample user performs a certain expression behavior on text having a certain negative emotion level, the expression behavior comprising at least one of: publishing text (e.g., comments), liking published text, creating title information for a media resource collection, and creating attribute tags for a media resource collection.
In step a1, assume that the specified operation features include the frequency with which the sample user performs each operation behavior on media resources of each negative emotion level, and the specified expression features include the frequency with which the sample user performs each expression behavior on text of each negative emotion level. The negative emotion score D of the sample user can then be calculated by formula (6):

$$D = \sum_{j=1}^{m2}\sum_{i=1}^{m1} V1_i \cdot W1_j \cdot SCount_{ij} + \sum_{j=1}^{m4}\sum_{i=1}^{m3} V2_i \cdot W2_j \cdot TCount_{ij} \tag{6}$$

The negative emotion score consists of two parts: one part is the sample user's operation score for media resources with the corresponding negative emotion levels, and the other part is the sample user's expression score for text with the corresponding negative emotion levels. In the formula, $V1_i$ is the weight of the i-th operation behavior performed for media resources with a corresponding negative emotion level, $m1$ is the number of categories of operation behaviors, $W1_j$ is the weight of the j-th negative emotion level, $m2$ is the number of negative emotion levels of media resources, and $SCount_{ij}$ is the frequency with which the sample user performed the i-th operation behavior on media resources of the j-th negative emotion level. Here $V1_i$ and $W1_j$ can be set as needed; for example, when the i-th operation behavior is clicking, $V1_i = 0.8$; when it is complete playing, $V1_i = 1.0$; when it is adding to a list, $V1_i = 1.2$; this is not limiting. Assuming the negative emotion levels include mild, moderate and severe, then for $W1_j$: when the j-th negative emotion level is mild, $W1_j = 1.0$; when moderate, $W1_j = 1.2$; when severe, $W1_j = 1.4$; this is not limiting.

Similarly, $V2_i$ is the weight of the i-th expression behavior performed for text with a corresponding negative emotion level, $m3$ is the number of categories of expression behaviors, $W2_j$ is the weight of the j-th negative emotion level, $m4$ is the number of negative emotion levels of text, and $TCount_{ij}$ is the frequency with which the sample user performed the i-th expression behavior on text of the j-th negative emotion level. $V2_i$ and $W2_j$ can be set as needed, in the same way as $V1_i$ and $W1_j$ above, and are not described again here.
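As an illustration of formula (6), a minimal sketch using the example weights from the text; the dictionary keys, the expression-behavior weights, and the frequency inputs are hypothetical:

    # V1_i and W1_j/W2_j follow the examples above; EXPR_WEIGHTS (V2_i) is assumed.
    OP_WEIGHTS = {"click": 0.8, "complete_play": 1.0, "add_to_list": 1.2}   # V1_i
    LEVEL_WEIGHTS = {"mild": 1.0, "moderate": 1.2, "severe": 1.4}           # W1_j, W2_j
    EXPR_WEIGHTS = {"comment": 0.8, "like": 0.6, "create_song_list": 1.0}   # V2_i (assumed)

    def negative_emotion_score(s_count, t_count):
        """s_count[(op, level)]: frequency of operation `op` on media resources
        of negative emotion level `level`; t_count likewise for text."""
        op_score = sum(OP_WEIGHTS[op] * LEVEL_WEIGHTS[lvl] * f
                       for (op, lvl), f in s_count.items())
        expr_score = sum(EXPR_WEIGHTS[e] * LEVEL_WEIGHTS[lvl] * f
                         for (e, lvl), f in t_count.items())
        return op_score + expr_score   # D in formula (6)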
a2, determining the negative emotion level label corresponding to each sample user's negative emotion score based on a preset correspondence between negative emotion score ranges and negative emotion levels.
The correspondence between negative emotion score ranges and negative emotion levels can be obtained as follows:
After the negative emotion scores of the sample users are obtained, manual interviews are conducted with a portion of the sample users; the purpose of the interviews is to determine the negative emotion level labels of this portion of sample users. An average negative emotion score is then calculated for each negative emotion level based on these labels. For example: the negative emotion levels include mild, moderate and severe; if 100 manually interviewed sample users labeled with a moderate negative emotion have a total negative emotion score of 800, the average negative emotion score for a moderate negative emotion is 8. The average negative emotion score of each level is calculated in turn, and the negative emotion score range corresponding to each level is divided based on these averages; for example, negative emotion scores in [1, 5] correspond to a mild negative emotion, scores in (5, 9] to a moderate negative emotion, and scores in (9, 10] to a severe negative emotion.
Further, for any remaining sample user other than the interviewed portion, the corresponding negative emotion level label may be determined based on that sample user's negative emotion score. The final set of sample users thus includes two parts: the first part are the manually interviewed sample users, whose negative emotion level labels are determined from the interview results; the labels of the second part of sample users are determined from their negative emotion scores. The labels of the first part are relatively accurate and can be given a higher weight in subsequent model training, while the labels of the second part are less accurate and can be given a smaller weight, thereby ensuring the prediction accuracy of the trained user negative emotion classification model; a sketch of both steps is given below.
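A minimal sketch of step a2 and the two-part sample weighting, assuming the example score ranges above; the concrete training weights are illustrative, not fixed by the patent:

    def score_to_level(d: float) -> str:
        # Example ranges from the text: [1,5] mild, (5,9] moderate, (9,10] severe.
        if d <= 5:
            return "mild"
        if d <= 9:
            return "moderate"
        return "severe"

    def sample_weight(from_interview: bool) -> float:
        # Interviewed labels are more reliable and get a larger training weight;
        # 1.0 vs 0.5 is an assumed choice.
        return 1.0 if from_interview else 0.5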
After the first sample data set is obtained, in order to train the user negative emotion classification model, the negative emotion level labels of the sample users may be encoded, e.g., onehot encoded; for example, a certain negative emotion level may be denoted [0,1,0,0].
In each sample data, the behavior characteristics of the sample user comprise operation characteristics for media resources with negative emotion and expression characteristics for texts with negative emotion, wherein the operation characteristics comprise historical operation frequency characteristics, historical operation count characteristics, and the like, and the expression characteristics comprise historical expression frequency characteristics, historical expression count characteristics, and the like. To further facilitate model training and achieve higher calculation accuracy, the corresponding characteristics in the behavior characteristics may be bucketed for each sample user. For example, if the historical operation frequency characteristic is the number of media resources with negative emotion completely played by the sample user within the last seven days, the historical operation frequency characteristic of each sample user is divided into 5 buckets: [0,1], [2,2], [3,5], [6,10], and [11,+∞). The historical operation frequency characteristic in each bucket is then encoded, for example with one-hot encoding.
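As an illustration, a minimal Python sketch of the bucketing and one-hot encoding just described; the bucket boundaries follow the example in the text, while the function names and everything else are assumptions:

```python
import numpy as np

# Buckets for "number of negative-emotion media resources fully played in
# the last seven days": [0,1], [2,2], [3,5], [6,10], [11, +inf)
BUCKET_UPPER_BOUNDS = [1, 2, 5, 10]   # anything above 10 falls in the last bucket

def bucketize(count: int) -> int:
    """Map a raw frequency count to its bucket index (0..4)."""
    for idx, upper in enumerate(BUCKET_UPPER_BOUNDS):
        if count <= upper:
            return idx
    return len(BUCKET_UPPER_BOUNDS)   # the [11, +inf) bucket

def one_hot(bucket_idx: int, num_buckets: int = 5) -> np.ndarray:
    vec = np.zeros(num_buckets, dtype=np.float32)
    vec[bucket_idx] = 1.0
    return vec

# e.g. a user who fully played 4 negative-emotion resources last week:
print(one_hot(bucketize(4)))   # -> [0. 0. 1. 0. 0.]
```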
Next, performing multiple rounds of iterative training on the user negative emotion classification model based on the obtained first sample data set until a preset convergence condition is met. Specifically, in a round of iterative training process, the following operations are performed: inputting sample data into a user negative emotion classification model to obtain a prediction classification result corresponding to the sample data; obtaining a corresponding loss value according to the prediction classification result and the negative emotion grade label corresponding to the sample data; and carrying out parameter adjustment on the user negative emotion classification model according to the loss value. Wherein, cross entropy loss can be adopted when calculating the loss value, and the loss function formula is shown as the following formula (7):
$$ L = -\sum_{i=1}^{K} y_i \log f(X_i) \qquad (7) $$

wherein y_i denotes the true label of the i-th category (i.e., the negative emotion rating label), f(X_i) denotes the predicted probability of the i-th category, and K is the total number of categories; for example, if there are 4 kinds of negative emotion rating labels, then K is 4.
Taking the structure of the user negative emotion classification model shown in fig. 3 as an example, in the process of training the model, dropout processing may be performed inside the fully connected layers to avoid overfitting. Dropout is a regularization technique applied during training: at each weight update it randomly zeroes a fraction of the layer's activations, reducing the effective complexity of the model. It should be noted that dropout is not executed in the model prediction process, so the prediction output is deterministic.
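For illustration, a minimal PyTorch sketch of such a classifier, assuming an architecture of fully connected layers with dropout (layer sizes are illustrative, not specified by the patent), trained with the cross-entropy loss of formula (7):

```python
import torch
import torch.nn as nn

class UserNegativeEmotionClassifier(nn.Module):
    def __init__(self, input_dim: int, num_classes: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Dropout(p=0.5),            # active only in train() mode
            nn.Linear(128, num_classes),  # logits; softmax is applied in the loss
        )

    def forward(self, x):
        return self.net(x)

model = UserNegativeEmotionClassifier(input_dim=64)
criterion = nn.CrossEntropyLoss()    # softmax + formula (7) in one operation
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

features = torch.randn(32, 64)       # a batch of bucketed/encoded behavior features
labels = torch.randint(0, 4, (32,))  # negative emotion rating labels

model.train()                        # dropout enabled during training
loss = criterion(model(features), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

model.eval()                         # dropout disabled for prediction
```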
After the training process of the user negative emotion classification model is introduced, a manner of determining emotion information of a media resource is described below.
In an embodiment of the disclosure, the mood information of the media resource includes a negative mood, a positive mood, and no mood, the negative mood corresponding to at least one level of negative mood, and the positive mood corresponding to at least one level of positive mood. For example: negative emotions correspond to mild negative emotions, moderate negative emotions, severe negative emotions; the positive emotions correspond to mild positive emotions, moderate positive emotions and severe positive emotions; this may be specifically set as needed, and is not limited thereto.
The media resource with negative emotion in the above embodiments of the present disclosure may be specifically determined through the following steps b1-b2:
b1, inputting the attribute characteristics of any media resource into the media resource emotion classification model, and obtaining first emotion information of the media resource, wherein the first emotion information comprises one of the following: negative mood level, positive mood level, no mood.
b2, identifying the media resource with the first emotion information having the negative emotion rating as the media resource with the negative emotion.
The media resource emotion classification model may be a Neural network model, such as Deep Neural Networks (DNNs), and Neural network layers inside the DNNs may be classified into three types, i.e., an input layer, a hidden layer, and an output layer, where the first layer is an input layer, the last layer is an output layer, and the middle layers are hidden layers. The attribute characteristics of the media assets include at least one of: audio characteristics of audio contained by the media asset; semantic features of text corresponding to the audio.
The following is an exemplary description of the attribute characteristics of a media asset.
Taking the example that the media resource is a song, the audio contained in the media resource is the melody of the song, the text corresponding to the audio is the lyrics, and the melody and the lyrics of the song completely define a song and contain important information of the song, so that the emotion information of the song (i.e. the first emotion information) can be determined based on one or two of the semantic characteristics of the lyrics and the audio characteristics of the melody.
First, audio features of the melody
The audio features of the melody may include the BPM (beats per minute), which can be understood as the rhythm. Typically, a song takes a quarter note as one beat; assuming one beat of the song lasts 1s, there are 60 beats in one minute, so the BPM of the song is 60. The BPM thus represents the rhythm of a song, and different BPM values express different emotions: fast-paced music tends to express happy, excited moods, while slow-paced music tends to express quiet, relaxed, or sad moods. The BPM therefore carries emotional information about the song.
The disclosed embodiment may identify the BPM of a song and output the BPM value using an existing BPM identification tool, for example the Python library Librosa, which is not limited herein.
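For example, a minimal sketch of BPM extraction with Librosa (the file path is a placeholder):

```python
import librosa

y, sr = librosa.load("song.mp3")   # decode the audio to a waveform (path assumed)
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
print(f"Estimated BPM: {float(tempo):.1f}")
```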
Second, semantic features of the lyrics
The extraction process of the semantic features of the lyrics can comprise two steps of lyrics text preprocessing and text vectorization.
The first step: preprocessing the lyric text. Besides the text information itself, a lyric text usually includes a large number of punctuation marks, line feed characters, and irregular characters (e.g., non-Chinese/English/Japanese/Korean characters), and the text information usually contains repeated sentences. The lyric text therefore needs to be cleaned: punctuation, line feeds, irregular characters, and the like are removed, the cleaned lyric text is then segmented into words, and finally a data index in the format of song to word token list (i.e., a word segmentation list) is generated for subsequent text vectorization.
And secondly, vectorizing the text. And aiming at the lyric text containing the participles obtained in the first step, inputting the lyric text into a language representation model, outputting a text vector corresponding to the lyric text, and taking the text vector as the semantic feature of the lyric. The language representation model can be, for example, a pre-trained language model BERT, and a deep bidirectional Transformer component is adopted to construct the whole model, so that a deep bidirectional language representation capable of fusing left and right context information can be generated.
The BERT model takes each word token (i.e., each segmented word) in the lyric text as input and outputs an embedding vector corresponding to each word token; the embedding vectors of the word tokens are taken as the semantic features of the lyrics. The embedding vector has a clear physical meaning: the higher the similarity between the embedding vectors of two words, the more similar the meanings of those words, so an embedding vector can serve as the feature of a word. The dimension of the embedding vector may be set as needed, for example the embedding vector of each word token may be set to 100 dimensions, which is not limited herein.
Through the first step and the second step, semantic features of the lyrics of the song can be extracted.
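As an illustration, a minimal sketch of lyric vectorization with a pre-trained BERT model via the HuggingFace transformers library; the checkpoint name and the mean-pooling step are assumptions, since the text only specifies per-token embedding vectors:

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertModel.from_pretrained("bert-base-chinese").eval()

lyrics = "夜空中最亮的星 能否听清"   # cleaned lyric text after preprocessing
inputs = tokenizer(lyrics, return_tensors="pt", truncation=True)
with torch.no_grad():
    outputs = model(**inputs)

token_embeddings = outputs.last_hidden_state   # one embedding per word token
lyric_vector = token_embeddings.mean(dim=1)    # mean pooling into a lyric-level
print(lyric_vector.shape)                      # semantic feature vector
```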
In practical application, when the emotion information of a song needs to be determined, after the semantic features of the lyrics corresponding to the song and the audio features of the melody are extracted based on the above mode, the semantic features of the lyrics corresponding to the song and the audio features of the melody are input into a media resource emotion classification model, and the first emotion information of the song, namely one of a passive emotion level, a positive emotion level and no emotion, is output.
It should be noted that, the foregoing embodiment is described by taking the example that the media resource is a song, and actually, the media resource may also be other resources including audio, such as video, broadcast, and the like, which is not limited thereto.
Next, a training process of the media resource emotion classification model in the above embodiment of the present disclosure is described.
Firstly, obtaining a training sample of an initial media resource emotion classification model, namely a second sample data set.
In an embodiment of the present disclosure, the second sample data set of the initial media resource emotion classification model includes: one part of sample data which is labeled and the other part of sample data which is not labeled, wherein the labeled sample data comprises: sample media resources, attribute characteristics of the sample media resources, and first emotion information tags corresponding to the sample media resources, wherein unlabeled sample data includes: sample media assets, attribute features of sample media assets. For the manner of obtaining the attribute characteristics of the sample media resources, reference is made to the above embodiments, which are not described herein again.
In the labeled sample data, the manner of obtaining the first emotion information label of the sample media resource is described below.
In some embodiments, the obtaining process of the first emotion information tag possessed by the sample media asset may include the following steps c1-c2:
c1, determining the emotion confidence of each sample media resource contained in the sample media resource sets based on the attribute label and/or title information of each sample media resource set and the preset emotion classification label; wherein the sentiment classification tags include at least one negative sentiment tag and at least one positive sentiment tag, the sentiment confidence including a negative sentiment confidence or a positive sentiment confidence;
and c2, determining a first emotion information label corresponding to the sample media resource based on the emotion confidence of any sample media resource.
As can be seen from the foregoing, a user may create a media resource set, together with its title information, attribute tags, and the like: specifically, the user may assemble the media resource set according to his understanding of and preference for media resources, create personalized title information for it, and select one or more tags from a plurality of candidate tags as its attribute tags. A media resource set represents a collection of multiple media resources. If the title information or attribute tags of a media resource set contain a keyword for a negative emotion, the media resources in that set have a high probability of carrying a negative emotion; conversely, if its title information or attribute tags contain keywords for positive emotions, the media resources in the set are likely to carry positive emotions. Moreover, if a media resource belongs to multiple media resource sets whose title information or attribute tags each contain a keyword for a negative emotion, the confidence that the media resource has a negative emotion is higher.
Therefore, a plurality of sample media resource sets can be selected from a large number of media resource sets, and then a plurality of sample media resources can be obtained, and based on attribute tags and/or title information of each of the plurality of sample media resource sets, emotional confidence of each of the plurality of sample media resources can be determined.
Specifically, based on the attribute tags and/or title information of the sample resource sets, and in combination with preset emotion classification tags, emotion confidence of the sample media resources is calculated by using a pageRank algorithm. The emotion classification tags include a negative emotion tag and a positive emotion tag, which may be specifically set according to attribute tags and/or title information of a large media resource set, for example, the negative emotion tag includes: depression, autism, etc.; positive emotion labels include: cure, relax, etc., without limitation.
The following describes in detail a specific embodiment of the above step c1.
In the embodiment of the disclosure, a plurality of sample resource sets, respective title information and/or attribute tags, and a plurality of sample media resources included in the plurality of sample resource sets are used as input, and an emotion confidence of each sample media resource is calculated through a pageRank algorithm by combining preset emotion classification tags, wherein the emotion confidence specifically includes a negative emotion confidence or a positive emotion confidence.
In some embodiments, determining in step c1 the emotion confidence of each of the plurality of sample media resources contained in the plurality of sample media resource sets, based on the attribute tags and/or title information of each sample media resource set and the preset emotion classification labels, may include the following steps c11-c14:
c11, determining an initial first weight of the emotion classification label corresponding to each of the plurality of media resource sets based on the attribute label and/or the title information of each of the plurality of sample media resource sets, and obtaining an initial first weight matrix.
The emotion classification labels comprise negative emotion labels and positive emotion labels, for each sample media resource set, if the attribute labels or the title information of the sample media resource set comprise the negative emotion labels or the positive emotion labels, the initial first weights of the negative emotion labels or the positive emotion labels are set to be set values, for example, 1, and an initial first weight matrix is constructed based on the initial first weights of the sample media resource sets.
c12, determining an initial second weight of the emotion classification label corresponding to each of the plurality of sample media resources contained in the plurality of sample media resource sets based on the initial first weight matrix, and obtaining an initial second weight matrix.
Specifically, for each sample media resource, the one or more sample media resource sets to which the sample media resource belongs are determined, the initial first weights of the emotion classification labels corresponding to those sample media resource sets are summed to obtain the initial second weight of the emotion classification label corresponding to the sample media resource, and an initial second weight matrix is constructed based on the initial second weights of the sample media resources.
c13, iteratively executing the following steps until the second weight of the emotion classification label corresponding to each of the plurality of sample media resources is converged: updating the initial first weight matrix based on the initial second weight matrix to obtain an updated first weight matrix, and updating the initial second weight matrix based on the updated first weight matrix to obtain an updated second weight matrix.
For each sample media resource set, the initial second weights of the emotion classification labels corresponding to the sample media resources contained in that set are summed to obtain the updated first weight of the emotion classification label corresponding to the set, and an updated first weight matrix is constructed based on the updated first weights of the sample media resource sets.
c14, aiming at any sample media resource, normalizing the second weight of the emotion classification label corresponding to the sample media resource, and obtaining the emotion confidence corresponding to the sample media resource.
The pageRank algorithm was originally used to calculate the importance of internet web pages: pageRank is a function defined on a web page set that assigns each web page a positive real number representing its importance; the higher the pageRank value, the more important the web page, and the more likely it is to be ranked near the top of internet search results. The pageRank algorithm models the relationships of web pages as a graph, in which nodes represent web pages, edges represent references between web pages, and each node carries a pageRank value. The computation of pageRank is performed on this directed graph of the internet and is usually an iterative process: an initial distribution is assumed, and the pageRank values of all web pages are iterated continuously until convergence.
In the embodiment of the present disclosure, taking a sample media resource as a sample song and a sample media resource set as a sample song list as an example, the relationship between sample song lists and sample songs is analogous to the relationship between web pages; the difference is that the graph formed by web page references is a homogeneous graph, while the song list-song relationship forms a heterogeneous graph, but the pageRank iterative algorithm may still be used. Further, the title information and/or attribute tags of a sample song list are related to the song list, and can therefore be considered related to its sample songs as well.
As shown in fig. 4, an implementation procedure for determining respective emotion confidences of a plurality of sample media assets based on the pageRank algorithm in the embodiment of the present disclosure includes the following steps S401 to S405:
step S401, obtaining title information, attribute labels, contained sample songs and emotion classification labels of each sample song list.
Step S402, initializing a first weight matrix of the sample song list-emotion classification label.
And step S403, initializing a second weight matrix of the sample song-emotion classification labels.
Similar to the initial distribution among web pages in pageRank, the initial second weight of emotion classification label k corresponding to sample song i is denoted P_ik, given by formula (8):

$$ P_{ik} = \sum_{j=1}^{m} \mathbb{1}\left(\mathrm{song}_i \in \mathrm{playlist}_j\right) W_{jk} \qquad (8) $$

wherein m is the total number of sample song lists, song_i denotes song i, playlist_j denotes song list j, k indexes the emotion classification labels, and W_jk is the first weight of emotion classification label k corresponding to song list j. The sample songs and the emotion classification labels thus form an initialized graph relation; then, analogously to the pageRank algorithm, the weight relation between the sample songs and the emotion classification labels is iterated until it converges and stabilizes, yielding the second weight matrix of the emotion classification labels corresponding to the sample songs.
And step S404, updating a first weight matrix of the sample song list-emotion classification label.
The above steps S403-S404 are iteratively performed until convergence, i.e. the first weight matrix and the second weight matrix do not change.
And step S405, obtaining the emotion confidence of the sample song.
An implementation flow for determining the emotional confidence levels of a plurality of sample songs based on the pageRank algorithm is described below with reference to a specific example.
Illustratively, the emotion classification labels include two depressed emotion labels and two cured emotion labels; the depressed emotion labels include: depression and loneliness, and the cured emotion labels include: cure and relaxation. The title information, attribute tags, and initial first weights of the emotion classification labels corresponding to the respective sample song lists are shown in table 1 below.
TABLE 1 [table image not reproduced]
Based on the initial first weights of the emotion classification labels corresponding to the sample song lists in table 1 and the sample songs included in each sample song list, the initial second weight of the emotion classification label corresponding to each sample song may be calculated: that is, the one or more sample song lists to which each sample song belongs are determined, and the initial first weights of the emotion classification labels corresponding to those song lists are summed to obtain the initial second weight of the emotion classification label corresponding to the sample song, as shown in table 2.
TABLE 2 [table image not reproduced]
Based on the content in table 2, the initial second weight matrix of the emotion classification labels corresponding to the sample songs is obtained. Further, the initial first weight matrix is updated based on the initial second weight matrix to obtain an updated first weight matrix: specifically, for each sample song list, the second weights of the emotion classification labels corresponding to the sample songs included in that song list are summed to obtain its updated first weight, as shown in table 3.
TABLE 3 [table image not reproduced]
Further, the initial second weight matrix is continuously updated based on the updated first weight matrix, that is, the updating step between the first weight matrix and the second weight matrix is iteratively executed until the first weight matrix and the second weight matrix are converged, so as to obtain the second weight of the emotion classification label corresponding to each sample song.
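A numpy sketch of one possible reading of this iteration (steps c11-c13): weights propagate back and forth between the song list-label matrix (first weights) and the song-label matrix (second weights). The rescaling step is an added assumption to give the iteration a well-defined fixed point; the patent does not spell it out:

```python
import numpy as np

# membership[j, i] == 1 iff sample song i belongs to sample song list j
membership = np.array([[1, 1, 0],
                       [0, 1, 1]], dtype=float)
# initial first weights W: song lists x emotion labels (1 if the song list's
# title/attribute tags contain the label, else 0)
W = np.array([[1.0, 0.0],    # song list 0 tagged "depressed"
              [0.0, 1.0]])   # song list 1 tagged "cured"

for _ in range(100):
    P = membership.T @ W               # second weights: sum over a song's lists
    W_new = membership @ P             # first weights: sum over a list's songs
    W_new /= max(W_new.max(), 1e-12)   # rescaling (assumed) so the iteration settles
    if np.allclose(W_new, W, atol=1e-9):
        break
    W = W_new

print(np.round(P, 3))   # converged second weights per sample song and label
```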
In order to facilitate the subsequent calculation of the emotion confidence of the sample songs, the second weight of each emotion classification label is normalized for each sample song so that the normalized second weight lies in [0,1]; specifically, the normalization may be performed according to the following formula (9):
$$ \mathrm{normValue} = \frac{\mathrm{tagValue} - \min(\mathrm{value})}{\max(\mathrm{value}) - \min(\mathrm{value})} \qquad (9) $$

where normValue represents the normalized weight, tagValue represents the second weight before normalization, and max(value) and min(value) represent the maximum and minimum weights, respectively.
Finally, the normalized weights of the emotion classification labels corresponding to each sample song are obtained, specifically the normalized weights of the 4 emotion labels: the sum of the normalized weights of the 2 depressed emotion labels (depressed, lonely) is taken as the depression confidence, and the sum of the normalized weights of the 2 cured emotion labels (cured, relaxed) is taken as the cure confidence.
For example, suppose the normalized weights of sample song S1 for the above 4 emotion labels are: 0.7 for the depression label, 0.9 for the lonely label, and 0 for both the cure and relax labels; then the depressed emotion confidence of sample song S1 is 1.6 and its cured emotion confidence is 0. Optionally, to facilitate subsequently determining the first emotion information label of each sample media resource based on its emotion confidence, the emotion confidences may be normalized so that the emotion confidence of each sample media resource lies between 0 and 1.
After the emotion confidence of each sample media resource is determined, the first emotion information label corresponding to each sample media resource is determined based on its emotion confidence, specifically including any one of the following operations d1-d4:
d1, if the confidence of the negative emotion of the sample media resource reaches a first set value, determining a label of the negative emotion grade corresponding to the confidence of the negative emotion of the sample media resource based on the corresponding relationship between the preset negative emotion grade and the confidence of the negative emotion;
d2, if the confidence coefficient of the positive emotion of the sample media resource reaches a second set value, determining a label of the positive emotion level corresponding to the confidence coefficient of the positive emotion of the sample media resource based on the corresponding relationship between the preset positive emotion level and the confidence coefficient of the positive emotion;
d3, if the negative emotion confidence of the sample media resource does not reach the first set value, determining that the first emotion information label of the sample media resource is a no emotion label;
d4, if the confidence of positive emotion of the sample media resource does not reach the second set value, determining that the first emotion information label of the sample media resource is a no emotion label.
Specifically, a minimum confidence threshold for negative emotion (i.e., the first set value) and a minimum confidence threshold for positive emotion (i.e., the second set value) are set. When the emotion confidence of a sample media resource is a negative emotion confidence and reaches the first set value, the negative emotion level corresponding to that confidence is determined based on the preset correspondence between negative emotion levels and negative emotion confidences, giving the sample media resource its negative emotion level label; if the negative emotion confidence does not reach the first set value, the sample media resource is determined to correspond to the no-emotion label. When the emotion confidence of a sample media resource is a positive emotion confidence, the process of determining the first emotion information label is similar and is not repeated here.
The correspondence between negative emotion levels and negative emotion confidences may be set as needed. Illustratively, assuming the negative emotion levels include mild, moderate, and severe negative emotion, the negative emotion confidence range [0.9,1] may be set to correspond to severe negative emotion, [0.65,0.9) to moderate negative emotion, [0.5,0.65) to mild negative emotion, and [0,0.5) to no emotion. The correspondence between positive emotion levels and positive emotion confidences may likewise be set as needed, in a manner similar to the above, and is not repeated here.
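A minimal sketch of this threshold mapping, using the illustrative confidence ranges above:

```python
def negative_emotion_label(confidence: float) -> str:
    """Map a negative emotion confidence to a first emotion information label,
    using the illustrative ranges above (first set value = 0.5)."""
    if confidence >= 0.9:
        return "severe negative emotion"
    if confidence >= 0.65:
        return "moderate negative emotion"
    if confidence >= 0.5:
        return "mild negative emotion"
    return "no emotion"

print(negative_emotion_label(0.7))   # -> "moderate negative emotion"
```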
Further, after the first emotion information labels of the sample media resources are obtained, in order to ensure the accuracy of the samples, partial sampling may be performed on the sample media resources to perform manual verification, so as to ensure the accuracy of the first emotion information labels.
And secondly, training an initial media resource emotion classification model based on the second sample data set obtained in the first step.
As can be seen from the foregoing, the second sample data set includes a part of sample data that is tagged and another part of sample data that is not tagged. The embodiment of the disclosure can adopt semi-supervised learning to train the initial media resource emotion classification model, and iteratively execute the following two steps until the predicted effect of the trained media resource emotion classification model is stable:
the first step is as follows: and training the initial media resource emotion classification model by using part of labeled sample data to obtain a trained first version media resource emotion classification model.
The second step: predict all sample data using the first version of the media resource emotion classification model to obtain prediction results for all sample data. The output of the media resource emotion classification model is a prediction probability for each of a plurality of categories, the categories comprising a plurality of negative emotion levels, a plurality of positive emotion levels, and no emotion; the category with the highest prediction probability is taken as the model's prediction result. Then, a portion of the sample data is sampled and labeled through manual verification.
When sampling from all sample data, the sampling may be based on the prediction results, followed by manual verification. Specifically, part of the sample data is sampled uniformly, i.e., sampled separately for each of the classification categories; manual verification then provides the model with more input and prior knowledge, improving sample quality and helping the model converge quickly to an optimal solution.
In the process of iteratively executing the above two steps, the media resource emotion classification model obtained in each iteration may be evaluated with a set index. For example, the set index includes the AUC (Area Under ROC Curve) index, where ROC stands for receiver operating characteristic curve. The AUC is the area under the ROC curve and ranges from 0 to 1: an AUC below 0.5 means the model predicts worse than random guessing, an AUC equal to 0.5 means the model is equivalent to random guessing, and an AUC above 0.5 means the model has some predictive ability; the larger the AUC, the stronger the model's predictive ability.
When the set index of the media resource emotion classification model obtained in the iteration is not obviously improved compared with the set index of the media resource emotion classification model obtained in the last iteration, the iteration step can be stopped, and the finally trained media resource emotion classification model is obtained.
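A runnable sketch of this semi-supervised loop under simplifying assumptions: a logistic regression stands in for the DNN, the task is binary, and the "manual verification" of uniformly sampled predictions is simulated by revealing held-out ground-truth labels:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 8))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # synthetic stand-in task
labeled = set(range(50))                        # the initially labeled part

model, best_auc = LogisticRegression(), 0.0
while True:
    idx = sorted(labeled)
    model.fit(X[idx], y[idx])              # first step: train on labeled data
    proba = model.predict_proba(X)[:, 1]   # second step: predict all samples
    auc = roc_auc_score(y, proba)
    if auc - best_auc < 1e-3:              # no obvious improvement: stop iterating
        break
    best_auc = auc
    # uniformly sample unlabeled data per predicted class and "manually verify"
    for cls in (0, 1):
        pool = [i for i in range(len(y))
                if i not in labeled and int(proba[i] > 0.5) == cls]
        if pool:
            labeled.update(rng.choice(pool, size=min(20, len(pool)), replace=False))

print(f"final AUC: {best_auc:.3f}, labeled samples: {len(labeled)}")
```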
Fig. 5 and 6 show logic diagrams of a training process of a media asset emotion classification model.
Illustratively, taking a sample media resource as a sample song, as shown in fig. 5, before training the media resource emotion classification model, the song information of each sample song is extracted, including the semantic features of the lyrics (e.g., text vectors), the audio features of the melody (e.g., BPM), and the emotion confidence identified based on the pageRank algorithm; for the extraction process, refer to the above embodiments of the present disclosure. Then, the construction process of the media resource emotion classification model is executed: an initial training sample is obtained based on the song information of the sample songs, the training sample is manually checked, and the media resource emotion classification model is trained based on the training sample.
As shown in FIG. 6, the training process of the media resource emotion classification model includes the following steps S601-S606:
step S601, obtaining the following song information of the sample song: semantic features of lyrics, audio features of melodies, and emotional confidence;
step S602, obtaining a part of sample data set of the labeled sample based on the song information of each sample song;
step S603, manually checking a part of the labeled sample data set;
step S604, training the media resource emotion classification model based on the initial training sample;
step S605, predicting a second sample data set based on the trained media resource emotion classification model, wherein the second sample data set comprises a part of sample data sets which are labeled and another part of sample data sets which are not labeled;
and step S606, sampling from the second sample data set based on the prediction result, performing manual verification, and returning to the step S604.
And (4) iteratively executing the steps S604 to S606 until the prediction result of the media resource emotion classification model is stable.
After the training process of the media resource emotion classification model is introduced, a manner of determining emotion information of a text is introduced below.
In the embodiment of the disclosure, considering that text expressed by a user carries emotional color and reflects the user's current emotion, and that text expressed by a user with a negative emotion may read as sad, negative, or even gloomy, identifying the emotion information of the text expressed by the user helps determine whether the user has a negative emotion and at what level.
The emotional information of the text includes a negative emotion, a positive emotion, and no emotion, the negative emotion corresponding to the at least one negative emotion level, and the positive emotion corresponding to the at least one positive emotion level. For example: negative emotions correspond to mild negative emotions, moderate negative emotions, severe negative emotions; the positive emotions correspond to mild positive emotions, moderate positive emotions and severe positive emotions; this may be specifically set as needed, and is not limited thereto.
In some embodiments, text with negative emotions may be determined through the following steps e1-e4:
e1, aiming at any text expressed by the user, executing text processing operation on the text through an embedding layer of the text emotion classification model, and obtaining the embedding characteristics corresponding to the text.
Wherein, any text expressed by the user comprises any one of the following texts: published text, title information of the created media asset collection, and/or attribute tags.
The embedding layer of the text emotion classification model comprises a tokenization module and a pre-trained language model. First, the tokenization module tokenizes the text and converts it into the input format required by the pre-trained language model, such as token_ids, attention_mask, and token_type_ids; for example, the tokenization module may adopt an existing tokenizer tool. Then, the converted tokenized text is input into the pre-trained model, which outputs the embedding features corresponding to the text.
For example, the pre-trained language model may be BERT. BERT uses a bidirectional Transformer, which captures context-dependent feature representations well and handles text containing ambiguous words; BERT is also friendly to downstream tasks, whose training can be completed by fine-tuning at low cost. More specifically, the pre-trained model may employ a BERT Chinese model (e.g., bert-base-chinese).
e2, executing corresponding operation on the embedded features through a full connection layer of the text emotion classification model, and obtaining target features.
The fully connected layers further fine-tune the embedding features output by the embedding layer to fit the classification task. The fully connected part may consist of 3 fully connected layers, and its final output dimensionality equals the number of classes; for example, if the classification result of the text emotion classification model comprises 4 categories, the final output dimensionality is 4.
e3, executing corresponding operation on the target characteristics through an output layer of the text emotion classification model, and obtaining second emotion information of the text, wherein the second emotion information comprises a negative emotion level, a positive emotion level or no emotion.
The output layer is used for performing softmax operation on the target features output by the full connection layer, and calculating to obtain the probability of each classification. The calculation formula of softmax operation can be referred to as formula (5) in the above embodiment, and details are not repeated here.
e4, determining the text with the second emotion information having the negative emotion level as the text with the negative emotion.
In the embodiment of the disclosure, the second emotion information of the text can be accurately predicted through the text emotion classification model, and the text with the second emotion information having a negative emotion grade is used as the text with the negative emotion.
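For illustration, a minimal PyTorch sketch of the model in steps e1-e3, assuming a BERT embedding layer, a 3-layer fully connected head, and a softmax output; the hidden sizes, checkpoint name, and [CLS]-pooling choice are illustrative assumptions:

```python
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class TextEmotionClassifier(nn.Module):
    def __init__(self, num_classes: int = 4):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-chinese")  # embedding layer
        hidden = self.bert.config.hidden_size                       # 768 for bert-base
        self.fc = nn.Sequential(                                    # 3-layer FC head
            nn.Linear(hidden, 256), nn.ReLU(), nn.Dropout(0.1),
            nn.Linear(256, 64), nn.ReLU(), nn.Dropout(0.1),
            nn.Linear(64, num_classes),
        )

    def forward(self, token_ids, attention_mask, token_type_ids):
        out = self.bert(input_ids=token_ids,
                        attention_mask=attention_mask,
                        token_type_ids=token_type_ids)
        cls_vec = out.last_hidden_state[:, 0]            # [CLS] token as text feature
        return torch.softmax(self.fc(cls_vec), dim=-1)   # output layer: class probs

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = TextEmotionClassifier().eval()
enc = tokenizer("今天什么都提不起兴趣", return_tensors="pt")
probs = model(enc["input_ids"], enc["attention_mask"], enc["token_type_ids"])
```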
Next, a training process of the text emotion classification model in the above embodiment of the present disclosure is described.
Step one, obtaining a training sample of an initial text emotion classification model, namely a third sample data set.
In an embodiment of the present disclosure, each sample data in the third sample data set of the initial text emotion classification model includes: and the sample text and a second emotion information label corresponding to the sample text.
The following describes an acquisition manner of the sample text and the second emotion information tag corresponding to the sample text.
In some embodiments, the obtaining process of the sample text and the second emotion information tag corresponding to the sample text may include the following steps f1-f4:
f1, aiming at any emotion level label, acquiring a preset keyword set corresponding to the emotion level label.
Specifically, any emotion level tag may be a negative emotion level tag or a positive emotion level tag; aiming at any passive emotion level label, acquiring a preset keyword set corresponding to the passive emotion level label, wherein each keyword in the preset keyword set is used for describing the passive degree of the passive emotion level; and aiming at any positive emotion level label, acquiring a preset keyword set corresponding to the positive emotion level label, wherein each keyword in the preset keyword set is used for describing the positive degree of the positive emotion level. The negative emotion level is taken as an example, and a corresponding preset keyword set is exemplarily described below.
Assuming that the negative mood is depression, the mood rating labels are respectively mild depression, moderate depression and severe depression, and the preset keyword sets corresponding to the mood rating labels are respectively as follows.
(I) Preset keyword set corresponding to mild depression
Mild depression is characterized by persistent low mood, depressed spirits, and a lack of interest in daily activities, leading to listlessness and slowed thinking. Cognitively, attention cannot be focused, memory declines, thinking slows, self-esteem and self-confidence drop, self-evaluation is lowered, and one's own defects and errors are often exaggerated. Behaviorally, actions are slowed and the person appears listless, passive, dependent, withdrawn, and reluctant to interact with others. Thus, keywords for mild depression may include sadness, depression, emptiness, lassitude, guilt, dullness, autism, and the like.
(II) Preset keyword set corresponding to moderate depression
Moderate depression is more painful than mild depression, both physically and emotionally, and its symptoms are more prominent, as shown by: a despairing mood, negative and pessimistic thinking, and slowed thinking, with even simple work becoming difficult to handle; physically, there may be gastrointestinal discomfort, palpitations, insomnia, body pain, and the like. Because the symptoms are prominent, severe, and uncontrolled, often accompanied by distressing insomnia, they are no longer manageable by the person alone. Thus, keywords for moderate depression may include negative, pessimistic, insomnia, and the like.
(III) Preset keyword set corresponding to severe depression
Severe depression can cause despair, hallucinations, reduced functioning, and behavior seriously harmful to physical health, with loss of interest and pleasure in daily activities; the person often feels continuously fatigued, even hopeless, experiences life as suffering, and feels that days pass like years. Thus, keywords for severe depression may include despair, helplessness, life is suffering, days pass like years, and the like.
f2, preprocessing any text to obtain a target keyword set corresponding to the text.
In the embodiment of the disclosure, considering the large data volume of texts, any text is preprocessed to obtain the target keyword set corresponding to the text, so as to determine whether the text contains a keyword from the preset keyword set corresponding to any emotion level label; if it does, the text can be used as a sample text.
Specifically, preprocessing a text is the process of extracting keywords from the text to represent it. The preprocessing mainly comprises two stages, word segmentation and stop-word removal; after these are performed on the text, the target keyword set corresponding to the text is obtained. Word segmentation is performed because words are closer to human expression than individual characters, i.e., word granularity is better than character granularity. Stop words are removed because they carry no real meaning in the expression of the whole text, for example: "and", "some", etc. Text segmentation can adopt existing word segmentation tools, such as the open source tool Jieba, which segments quickly, supports user-defined dictionaries, and handles Chinese well; other word segmentation tools may also be used, without limitation.
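A minimal sketch of this preprocessing with Jieba; the stop-word and punctuation lists are tiny illustrative stand-ins for real ones:

```python
import jieba

STOP_WORDS = {"的", "了", "和", "一些"}   # placeholder stop-word list
PUNCT = set("，。！？、；：（）…")          # placeholder punctuation set

def extract_keywords(text: str) -> list:
    tokens = jieba.lcut(text)             # text word segmentation
    return [t for t in tokens
            if t.strip() and t not in STOP_WORDS and t not in PUNCT]

print(extract_keywords("最近总是很压抑，对什么都提不起兴趣"))
```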
f3, if any keyword in the target keyword set of the text belongs to a preset keyword set corresponding to any emotion level label, taking the text as a sample text.
And comparing any keyword in the target keyword set of any text with the preset keyword set of each emotion level label respectively to determine whether any keyword of the text belongs to the preset keyword set of any emotion level label, and if so, taking the text as a sample text.
f4, taking the emotion level label corresponding to the attributed preset keyword set as a second emotion information label corresponding to the sample text.
For example, any keyword of any sample text belongs to a preset keyword set of a certain negative emotion level label, and the negative emotion level label is used as a second emotion information label corresponding to the text.
Further, to ensure the accuracy of the second emotion information labels of the sample texts, the labels may also be manually checked, considering that different negative emotion level labels (or different positive emotion level labels) may correspond to the same keyword. During manual verification, the preset keyword set corresponding to each emotion level label can be consulted, and modifiers in the sample text, such as "very", "especially", and "a little", can be taken into account: "very depressed" indicates a more serious degree of depression than "a little depressed". The context expressed by the sample text can also be considered.
Through the first step, the sample texts and the second emotion information label of each sample text are obtained, where the second emotion information label comprises a negative emotion level label, a positive emotion level label, or a no-emotion label. To facilitate subsequent training of the text emotion classification model, the second emotion information labels may be encoded, e.g., with one-hot encoding: label 0 is represented as the vector [1,0,0,0], label 1 as the vector [0,1,0,0], and so on.
And step two, after the third sample data set is obtained through the step one, training an initial text emotion classification model based on the third sample data set.
As can be seen from the foregoing, the initial text emotion classification model includes an embedding layer, a full connection layer and an output layer, as shown in fig. 7, the training process of the initial text emotion classification model includes the following steps:
1. and inputting sample text in the sample data into the embedding layer.
2. The embedding layer comprises a tokenization module and a pre-trained model. The tokenization module tokenizes the sample text and converts it into the input format required by the pre-trained model; the converted tokenized text is then input into the pre-trained model, which outputs the embedding features corresponding to the sample text. The pre-trained model may adopt BERT.
3. The embedding features output by the embedding layer are further fine-tuned through the fully connected layers to fit the classification task. The fully connected part consists of 3 fully connected layers; to avoid overfitting, a dropout operation is added to them, and the final output dimension equals the number of classes.
4. And performing softmax operation on the output result of the full connection layer through the output layer, and calculating to obtain the probability of each classification. The calculation formula of softmax is shown in formula (5) of the above embodiment, and is not described in detail here.
5. An objective function. The cross entropy loss is used to calculate the loss value of the sample data, and the calculation formula is referred to as formula (7) in the above embodiment, which is not described herein again.
6. Training and optimization. During training, an optimizer (e.g., AdamOptimizer) may be used to optimize the model parameters. After the loss value of the sample data is calculated from the forward-propagated objective function, back propagation is performed using the optimizer, and the model parameters are iteratively optimized until the model converges or a convergence condition is reached.
Second emotion information of the input text may be predicted based on the trained text emotion classification model in order to obtain text with a negative emotion.
The following describes a guidance process for a target user with a corresponding negative emotion rating in an embodiment of the present disclosure.
Based on the above embodiment of the disclosure, after the negative emotion level of the target user is determined, the media resource with the positive emotion level matched with the negative emotion level is selected from the media resource library and recommended to the target user, so that the operation behavior of the target user for the media resource is correctly guided, and the negative emotion of the target user is relieved.
In some embodiments, the selecting, in step S203, of a media resource from the media resource library whose positive emotion rating matches the negative emotion rating of the target user, and the recommending of the selected media resource to the target user, may include the following steps g1-g3:
g1, selecting a plurality of media resources from the media resource library, wherein the positive emotion level is matched with the negative emotion level of the target user, and forming a media resource candidate set.
Wherein the plurality of positive emotion ratings, which are assigned to positive emotions of the media asset, may correspond to the plurality of negative emotion ratings, which are assigned to negative emotions of the user, that is, one positive emotion rating corresponds to one negative emotion rating. The active mood level of the selected media asset may correspond to the negative mood level of the target user, may be higher than the negative mood level of the target user, and in some cases may be lower than the negative mood level of the target user, and may be set according to actual conditions.
Illustratively, assume that the plurality of negative emotion ratings classified for negative emotions of the user include a mild negative emotion, a moderate negative emotion, a severe negative emotion, and the plurality of positive emotion ratings classified for positive emotions of the media resource include a low positive emotion rating, a moderate positive emotion rating, a high positive emotion rating; when the target user's negative mood level is a mild negative mood, media resources of a low positive mood level and a medium positive mood level may be selected; when the negative emotion rating of the target user is a moderate negative emotion, media resources with a medium positive emotion rating and a high positive emotion rating can be selected; when the target user's negative mood level is a severe negative mood, media resources with a medium positive mood level and a high positive mood level may be selected.
g2, for each media resource in the candidate set of media resources, performing the following operations g21-g22:
g21, fusing the first characteristics of the media resources with the second characteristics of the target user to obtain a context characteristic vector; the first feature includes at least one of: positive mood characteristics, attribute characteristics, the second characteristics including at least one of: a negative mood feature, a second portrait feature.
In the first feature of the media resource, the positive emotion feature may be a positive emotion level, the attribute feature may include an audio feature of an audio contained in the media resource, a semantic feature of a text corresponding to the audio, and may also include features of a genre, a language type, and the like of the media resource, taking a song as an example, and the attribute information of the song includes: song style, song language, semantic characteristics of lyrics, audio characteristics of melodies (e.g., BPM, etc.). In the second characteristic of the target user, the negative emotion characteristic may be a negative emotion level of the target user, and the second portrait characteristic includes a user portrait such as an age, a gender, a region, an educational level, and the like of the target user, and may further include an interest preference (e.g., a genre, a language, and the like of a favorite media source).
Specifically, the first feature of the media resource is concatenated with the second feature of the target user to obtain the context feature vector. For example: if the first feature is a d1-dimensional vector and the second feature is a d2-dimensional vector, concatenating them yields a d-dimensional context feature vector, where d = d1 + d2.
g22, determining the evaluation value corresponding to the media resource based on the context feature vector and the preset feature parameter of the media resource.
Specifically, the preset characteristic parameters may be set as required. For example: the context feature vector is a d-dimensional vector; for each media resource in the candidate set, if the media resource has never been recommended to the target user, a d×d-dimensional weight matrix Q (e.g., initialized to the d-dimensional identity matrix) and a d-dimensional coefficient vector R (e.g., initialized to the d-dimensional zero vector) are initialized for the media resource, and Q and R serve as its preset characteristic parameters. Further, once the media resource has been recommended to the target user, feedback information from the target user on the media resource is obtained, and the preset characteristic parameters Q and R of the media resource are updated based on that feedback.
Therefore, for each media resource in the media resource candidate set, the preset feature parameter may be initialized or the updated preset feature parameter may be acquired, and based on the context feature vector and the preset feature parameter of the media resource, the evaluation value corresponding to the media resource may be determined in a set evaluation manner. The larger the evaluation value of the media asset is, the higher the acceptance of the media asset by the target user is indicated.
Illustratively, one media resource in the candidate set of media resources is denoted by item, and the evaluation value of item may be calculated based on the following equations (10) and (11):
$$ A_i = Q_i^{-1} R_i \qquad (10) $$

$$ \mathrm{gain}_i = A_i^{\top} z_i + a\,\sqrt{z_i^{\top} Q_i^{-1} z_i} \qquad (11) $$

wherein gain_i is the evaluation value of the item; z_i is the context feature vector of the item; A_i is the parameter vector of the item; a is an initialized hyperparameter that can be set as needed, for example a = 0.3, which is not limited herein; Q_i is the weight matrix of the item, and R_i is the coefficient vector of the item.
g3, selecting the first media resource with the evaluation value satisfying the preset condition from the media resource candidate set, and recommending the first media resource to the target user.
Wherein the preset condition can be set as required. For example, the preset condition is that the evaluation value is the largest, and the first media resource with the largest evaluation value is selected from the media resource candidate set to be recommended to the target user, so that the media resource with the positive emotion rating matched with the negative emotion rating of the target user is accurately recommended to the target user.
The following is an exemplary description of a specific process for recommending media assets to a target user.
Illustratively, as shown in fig. 8, after the media resource candidate set corresponding to the target user is obtained, a specific process of recommending media resources to the target user includes the following steps:
step S801, traverse the media resource candidate set corresponding to the target user.
Step S802, the first feature of the media resource and the second feature of the target user are concatenated to obtain a d-dimensional context feature vector z_i.
Step S803, query the preset feature parameters of the media resource, including the weight matrix Q and the coefficient vector R.
Here, the preset feature parameters of the media resources are queried from a preset feature parameter library, in which the preset feature parameters corresponding to each media resource are stored.
Step S804, determining whether to obtain the weight matrix Q and the coefficient vector R, if yes, performing step S806, and if no, performing step S805.
In step S805, a d × d dimensional weight matrix Q is initialized, and a d dimensional coefficient vector R is initialized.
Step S806, calculating the evaluation value of the media resource according to the context feature vector z_i, the weight matrix Q, and the coefficient vector R.
In step S807, the first media resource with the largest evaluation value is selected and recommended to the target user.
Step S808, obtaining feedback information of the target user, and updating the weight matrix Q and the coefficient vector R of the first media resource based on the feedback information.
Specifically, the weight matrix Q and the coefficient vector R of the first media resource in the preset feature parameter library are updated based on the updated weight matrix Q and the updated coefficient vector R of the first media resource.
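The flow of steps S801-S808 matches a LinUCB-style contextual bandit, which is consistent with the Q/R parameters and formulas (10)-(13). A numpy sketch of one possible reading (an illustration, not an authoritative implementation of the patent):

```python
import numpy as np

class MediaResourceBandit:
    def __init__(self, d: int, a: float = 0.3):
        self.a = a      # exploration hyperparameter from formula (11)
        self.d = d
        self.Q = {}     # per-item d x d weight matrix
        self.R = {}     # per-item d-dimensional coefficient vector

    def _params(self, item_id):
        if item_id not in self.Q:             # steps S804-S805: initialize
            self.Q[item_id] = np.eye(self.d)
            self.R[item_id] = np.zeros(self.d)
        return self.Q[item_id], self.R[item_id]

    def evaluate(self, item_id, z):           # formulas (10) and (11)
        Q, R = self._params(item_id)
        Q_inv = np.linalg.inv(Q)
        A = Q_inv @ R                         # parameter vector A_i
        return A @ z + self.a * np.sqrt(z @ Q_inv @ z)

    def recommend(self, candidates):          # steps S801-S807: pick max gain
        return max(candidates, key=lambda c: self.evaluate(c[0], c[1]))

    def feedback(self, item_id, z, positive: bool):   # step S808 / (12)-(13)
        Q, R = self._params(item_id)
        Q += np.outer(z, z)                   # in-place update of the stored matrix
        R += (1.0 if positive else 0.0) * z   # reward r is 1 or 0

bandit = MediaResourceBandit(d=6)
cands = [("song_a", np.random.rand(6)),       # (item id, context vector z)
         ("song_b", np.random.rand(6))]
item, z = bandit.recommend(cands)
bandit.feedback(item, z, positive=True)       # e.g. the user fully played it
```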
The embodiment of the disclosure provides a strategy of small step guidance, which performs appropriate and effective personalized incentive guidance on target users according to the passive emotion level, portrait characteristics and the like of each target user and the active emotion level, attribute characteristics and the like of media resources.
Further, for the recommended first media resource, feedback information of the target user on the first media resource, including positive feedback or negative feedback, may be obtained, so as to update the preset characteristic parameter of the first media resource based on the feedback information, thereby adjusting the recommended media resource, and making it easier for the target user to accept the media resource recommended to the target user.
Optionally, on the basis of the above steps g1-g3, the following steps g4-g7 may also be performed:
g4, acquiring feedback information of the target user for the recommended first media resource;
wherein the feedback information comprises positive feedback or negative feedback. The forward feedback may be obtained based on forward operations performed by the target user on the first media asset, for example, forward operations including, but not limited to, favorites, adding lists, full play, and the like; negative feedback may be obtained based on negative actions performed by the target user on the first media asset, for example, negative actions including, but not limited to, skipping, joining a trash can, and the like. Specifically, after the first media resource is recommended to the target user, the positive operation or the negative operation performed by the target user on the first media resource may be obtained by collecting the user log of the target user in real time.
And g5, updating the preset characteristic parameters according to the feedback information and the context characteristic vector to obtain the updated preset characteristic parameters.
In this step, for example, the preset feature parameters include the weight matrix and the coefficient vector, where the weight matrix may represent the number of times the first media resource has been recommended, and the coefficient vector may represent the number of times the target user has given positive feedback on the first media resource. If the feedback information of the first media resource is positive feedback, the weight matrix and the coefficient vector can both be increased, which raises the probability that the first media resource is recommended; if the feedback information is negative feedback, the weight matrix can be increased while the coefficient vector is kept unchanged.
Illustratively, the weight matrix Q_i may be updated according to the following equation (12), and the coefficient vector R_i according to the following equation (13):

Q_i = Q_i + z_i * z_i^T (12)

R_i = R_i + r * z_i (13)

where the reward parameter r of the first media resource may be set to 1 if the feedback information of the target user on the first media resource is positive feedback, and set to 0 if the feedback information is negative feedback. An illustrative code sketch of this update is given after step g7 below.
g6, determining an updated evaluation value corresponding to the first media resource based on the context feature vector of the first media resource and the updated preset feature parameter.
This step is similar to the implementation of step g22, and will not be described herein.
And g7, selecting a second media resource with the evaluation value meeting the preset condition from the media resource candidate set, and recommending the second media resource to the target user.
This step is similar to the implementation of step g3, and will not be described herein.
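Continuing the sketch above, the update of equations (12) and (13) in step g5 can be written as follows; Q grows with every recommendation while R grows only on positive feedback, matching the description above.

```python
import numpy as np

def update_parameters(z: np.ndarray, Q: np.ndarray, R: np.ndarray,
                      positive_feedback: bool):
    """Step g5 / equations (12)-(13): Q accumulates every recommendation of
    the first media resource; R accumulates only positively rewarded ones."""
    r = 1.0 if positive_feedback else 0.0
    Q = Q + np.outer(z, z)   # equation (12): weight matrix update
    R = R + r * z            # equation (13): coefficient vector update
    return Q, R
```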
In the embodiment of the disclosure, in order to make the target user more receptive to the recommended media resources and to ensure that the target user's emotion is not harmed during recommendation, a recommendation strategy is determined using an online learning (Bandit) algorithm, and media resources are recommended to the target user based on that strategy. The strategy is as follows: media resources that have been recommended many times are recommended to more target users, while media resources that have been recommended few times are tried on a small number of target users; the more times a media resource has been recommended, the higher its cure probability for extreme emotion can be considered. In this way, the negative emotion of the target user can be relieved to the greatest extent. Moreover, feedback information of the target user on the recommended media resources is received in real time and the parameters of the recommendation strategy are adjusted flexibly, so that the remedy fits the symptom, gradually relieving the negative emotion of the target user and finally curing it.
Considering that the target user's negative emotion rating may change, it is necessary to dynamically adjust the recommended media resources for the target user according to the target user's real-time negative emotion rating, and therefore, on the basis of the above steps g1-g3, the following steps g8-g10 may also be performed:
g8, if the negative emotion level of the target user changes, selecting a plurality of media resources with the positive emotion level matched with the changed negative emotion level from the media resource library based on the changed negative emotion level, and obtaining an updated media resource candidate set;
g9, determining an evaluation value of the media resource for each media resource in the updated candidate set of media resources;
g10, selecting a third media resource with the evaluation value meeting the preset condition from the updated media resource candidate set, and recommending the third media resource to the target user.
The implementation process of the steps g9-g10 is described in the steps g1-g3, and will not be described herein.
The guiding process for the target user is described in detail below with reference to fig. 9-10.
As shown in fig. 9, the embodiment of the present disclosure employs a dynamic guidance mechanism, which includes obtaining the negative emotion level of the target user in real time and dynamically updating the media resource candidate set for the target user.
Firstly, the negative emotion level of the target user is acquired in real time, where the negative emotion level comprises mild negative emotion, moderate negative emotion, or severe negative emotion. Media resources with corresponding positive emotion levels are then selected to form the media resource candidate set, where the positive emotion levels of the media resources comprise a low positive emotion level, a medium positive emotion level, and a high positive emotion level; under normal circumstances, the positive emotion level of a selected media resource corresponds to, or is higher than, the negative emotion level of the target user. In some special cases, the positive emotion level of a selected media resource may also be lower: for example, if the negative emotion level of the target user is severe negative emotion but the supply of media resources with the high positive emotion level is insufficient, media resources with the medium positive emotion level may also be selected.
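As an illustration of the matching rule and its fallback just described, the following sketch builds the candidate set by emotion level; the function name and the candidate-set size k are assumptions, while the level names follow the document.

```python
# An illustrative level-matching selection with the fallback described above.
LEVEL_ORDER = ["low", "medium", "high"]                    # positive levels
MATCH = {"mild": "low", "moderate": "medium", "severe": "high"}

def build_candidate_set(negative_level: str, library: dict, k: int = 100):
    """library: positive emotion level -> list of media resource ids."""
    start = LEVEL_ORDER.index(MATCH[negative_level])
    candidates = []
    for level in LEVEL_ORDER[start:]:            # matching or higher levels
        candidates.extend(library.get(level, []))
    for level in reversed(LEVEL_ORDER[:start]):  # special case: fall back to
        if len(candidates) >= k:                 # lower levels when supply of
            break                                # higher levels is short
        candidates.extend(library.get(level, []))
    return candidates[:k]
```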
In particular, the negative emotion level of the target user may be obtained in real time by a processing engine, for example Apache Flink, a framework and distributed processing engine for stateful computations over unbounded and bounded data streams.
As shown in fig. 10, after the media resource candidate set corresponding to the target user is determined, the process of recommending media resources to the target user includes two parts. In the first part, for each media resource in the candidate set, the first feature of the media resource and the second feature of the target user are spliced into a context feature vector; the evaluation value of the media resource is then calculated based on the context feature vector and the preset feature parameters of the media resource, and the media resource with the largest evaluation value is recommended to the target user. In the second part, the preset feature parameters of the recommended media resource are updated according to the feedback information of the target user, and the updated parameters are written back into the preset feature parameter library for the next recommendation.
Fig. 11 shows a schematic implementation logic diagram of a user emotion adjustment method according to an embodiment of the present disclosure.
As shown in fig. 11, the implementation flow of the user emotion adjusting method of the embodiment of the present disclosure includes the following parts:
In the first part, user data is acquired, including: expression information of the user for texts, operation information of the user for media resources, and feedback information of the user for recommended media resources.
In the second part, a media resource emotion classification model is constructed to predict the first emotion information of any media resource, where the first emotion information includes a negative emotion level, a positive emotion level, or no emotion.
In the third part, a text emotion classification model is constructed to predict the second emotion information of any text expressed by a user, where the second emotion information likewise includes a negative emotion level, a positive emotion level, or no emotion.
In the fourth part, target users with negative emotion are first identified, and a user negative emotion classification model is then constructed to predict the negative emotion level of any target user.
In the fifth part, media resources are recommended to the target user, gradually curing the negative emotion of the target user.
For the specific implementation processes of the above parts, reference is made to the above embodiments of the present disclosure, and details are not described herein.
In the embodiment of the disclosure, the first emotion information of any media resource is predicted by the media resource emotion classification model, so that media resources with the corresponding negative emotion levels can be identified and media resources with the corresponding positive emotion levels can be screened out; target users with negative emotion are identified based on the users' expression information for texts and operation information for media resources, and the negative emotion level of each target user is predicted by the user negative emotion classification model; suitable media resources are then screened from the media resource candidate set and recommended to the user based on the target user's real-time negative emotion level, user portrait, and interest preferences, and the recommendation strategy is adjusted in real time according to the target user's feedback information. In this way, the media resources received by the target user are intervened in flexibly and scientifically, guiding the target user step by step out of the negative emotion and toward a positive, optimistic, and stable mood.
Based on the same inventive concept, the embodiment of the present disclosure further provides a user emotion adjusting device, and the principle of the device to solve the problem is similar to the method of the above embodiment, so that the implementation of the device may refer to the implementation of the method, and repeated details are omitted.
Referring to fig. 12, a user emotion adjusting apparatus according to an embodiment of the present disclosure includes a negative user identification module 120, a user rating determination module 121, and a first recommendation module 122.
A negative user identification module 120 for identifying a target user having a negative emotion based on behavior information of any one user; the behavior information includes at least one of: operation information for media resources, expression information for text;
a user rank determination module 121, configured to determine a negative emotion rank of any target user based on the behavior feature and/or the first image feature of the target user; the behavioral characteristics include at least one of: an operational characteristic for media assets having a negative emotion, an expression characteristic for text having a negative emotion;
and the first recommending module 122 is used for selecting the media resources with the positive emotion ratings matched with the negative emotion ratings of the target users from the media resource library and recommending the selected media resources to the target users.
In the embodiment of the disclosure, the target user with the negative emotion is identified, and the negative emotion grade of the target user is determined, so that the media resource with the positive emotion grade matched with the negative emotion grade of the target user is recommended to the target user, and therefore the operation behavior of the target user on the media resource is correctly guided, and the negative emotion of the target user is relieved.
Optionally, the negative user identification module 120 is further configured to:
determining whether the user meets at least one of the following conditions based on the behavior information of any user: the operation information aiming at the media resource with the negative emotion meets a first preset condition, and the expression information aiming at the text with the negative emotion meets a second preset condition;
if so, the user is determined to be a target user with a negative emotion.
Optionally, the user level determining module 121 is further configured to:
inputting the behavior characteristics and/or the first image characteristics of the target user into a user negative emotion classification model to obtain a negative emotion grade of the target user;
the user negative emotion classification model comprises a first-order feature crossing module, a second-order feature crossing module and classification modules connected with the first-order feature crossing module and the second-order feature crossing module respectively.
Optionally, the user negative emotion classification model is obtained by training based on a first sample data set, where each sample data in the first sample data set includes: the behavior characteristics and/or the first image characteristics of the sample user and the negative emotion level label corresponding to the sample user;
the apparatus further comprises a first tag determination module 123 for determining a corresponding negative emotion rating tag for the sample user by:
determining, for each sample user of the plurality of sample users, a negative sentiment score for the sample user based on the specified operating characteristics of the sample user for the media asset having the corresponding negative sentiment rating and/or the specified expression characteristics for the text having the corresponding negative sentiment rating over a set period of time;
and determining a corresponding negative emotion grade label of the negative emotion score of each sample user based on the preset corresponding relation between the negative emotion score range and the negative emotion grade.
Optionally, the media resource has a negative emotion corresponding to at least one negative emotion rating, and the media resource has a positive emotion corresponding to at least one positive emotion rating;
the apparatus further comprises a media sentiment determination module 124 for determining media assets having negative sentiments by:
inputting the attribute characteristics of any media resource into a media resource emotion classification model to obtain first emotion information of the media resource, wherein the first emotion information comprises one of the following: negative mood level, positive mood level, no mood;
determining the media resource with the first emotion information having the negative emotion level as the media resource with the negative emotion;
the media resource emotion classification model is a neural network model, and the attribute characteristics of the media resource comprise at least one of the following: audio characteristics of audio contained by the media asset; semantic features of text corresponding to the audio.
Optionally, the media resource emotion classification model is obtained by training based on a second sample data set, where sample data included in the second sample data set includes: the method comprises the steps of obtaining a sample media resource, attribute characteristics of the sample media resource and a first emotion information label corresponding to the sample media resource;
the apparatus further comprises a second tag determination module 125 for determining a first emotional information tag that the sample media asset has by:
determining respective emotion confidence degrees of a plurality of sample media resources contained in a plurality of sample media resource sets based on respective attribute tags and/or title information of the plurality of sample media resource sets and preset emotion classification tags; wherein the sentiment classification tags include at least one negative sentiment tag and at least one positive sentiment tag, the sentiment confidence including a negative sentiment confidence or a positive sentiment confidence;
and determining a first emotion information label corresponding to any sample media resource based on the emotion confidence of the sample media resource.
Optionally, when determining the emotion confidence degrees of the multiple sample media resources included in the multiple sample media resource sets based on the attribute tags and/or title information of the multiple sample media resource sets and the preset emotion classification tags, the second tag determining module 125 is further configured to:
determining initial first weights of emotion classification labels corresponding to a plurality of sample media resource sets based on attribute labels and/or title information of the sample media resource sets to obtain an initial first weight matrix;
determining initial second weights of emotion classification labels corresponding to a plurality of sample media resources contained in a plurality of sample media resource sets based on the initial first weight matrix, and obtaining an initial second weight matrix;
iteratively performing the following steps until a second weight of the emotion classification label corresponding to each of the plurality of sample media assets converges: updating the initial first weight matrix based on the initial second weight matrix to obtain an updated first weight matrix, and updating the initial second weight matrix based on the updated first weight matrix to obtain an updated second weight matrix;
and aiming at any sample media resource, normalizing the second weight of the emotion classification label corresponding to the sample media resource to obtain the emotion confidence corresponding to the sample media resource.
Optionally, when determining the first emotion information tag corresponding to the sample media resource based on the emotion confidence of the sample media resource, the second tag determining module 125 is further configured to perform one of the following operations:
if the negative emotion confidence of the sample media resource reaches a first set value, determining a negative emotion grade label corresponding to the negative emotion confidence of the sample media resource based on the corresponding relation between the preset negative emotion grade and the negative emotion confidence;
if the positive emotion confidence of the sample media resource reaches a second set value, determining a positive emotion grade label corresponding to the positive emotion confidence of the sample media resource based on the corresponding relation between the preset positive emotion grade and the positive emotion confidence;
if the negative emotion confidence of the sample media resource does not reach a first set value, determining that a first emotion information label of the sample media resource is a no emotion label;
and if the confidence coefficient of the positive emotion of the sample media resource does not reach the second set value, determining that the first emotion information label of the sample media resource is a no emotion label.
Optionally, the apparatus further comprises a text emotion determining module 126 for determining text with negative emotion by:
aiming at any text expressed by a user, executing text processing operation on the text through an embedding layer of a text emotion classification model to obtain embedding characteristics corresponding to the text;
executing corresponding operation on the embedded features through a full connection layer of the text emotion classification model to obtain target features;
corresponding operation is performed on the target characteristics through an output layer of the text emotion classification model, and second emotion information of the text is obtained, wherein the second emotion information comprises a negative emotion level, a positive emotion level or no emotion;
and confirming the text with the negative emotion grade of the second emotion information as the text with the negative emotion.
Optionally, the text emotion classification model is obtained by training based on a third sample data set, where each sample data in the third sample data set includes: the sample text and a second emotion information label corresponding to the sample text;
the device further comprises a sample obtaining module 127, configured to obtain the sample text and a second emotion information tag corresponding to the sample text by:
aiming at any emotion level label, acquiring a preset keyword set corresponding to the emotion level label;
preprocessing a text aiming at any text to obtain a target keyword set corresponding to the text;
if any keyword in the target keyword set of the text belongs to a preset keyword set corresponding to any emotion level label, taking the text as a sample text;
and taking the emotion level label corresponding to the belonged preset keyword set as a second emotion information label corresponding to the sample text.
Optionally, the operational characteristics for media assets with negative emotions include at least one of:
the target user operates the number and/or proportion of media resources with negative emotions between a set historical time point and a current time point;
the target users respectively execute the specified operation times aiming at the media resources with negative emotion in at least one historical time period;
a frequency at which the target user performs a specified operation with respect to the media assets having negative emotions for at least one historical period of time, respectively;
the target users respectively execute the specified operation times aiming at the media resources with the corresponding negative emotion grades in at least one historical time period;
a frequency at which the target user performs a specified operation with respect to the media assets having the corresponding negative emotion rating, for at least one historical time period, respectively;
whether the target user has the following historical operating states: the number of times of executing specified operations aiming at the media resources with negative emotions within a first set time length reaches a first set number of times;
the time length from the time point at which the historical operating state was last reached to the current time point.
Optionally, the expression features for text with negative emotions include at least one of:
the target users respectively execute the times of the specified expression behaviors aiming at the texts with the negative emotions in at least one historical time period;
the target users respectively execute the frequency of appointed expression behaviors aiming at the texts with negative emotions in at least one historical time period;
the target users respectively execute the times of the specified expression behaviors aiming at the texts with the corresponding negative emotion levels in at least one historical time period; wherein the negative emotion of the text corresponds to at least one negative emotion rating;
the target users respectively execute the frequency of the appointed expression behaviors aiming at the texts with the corresponding negative emotion levels in at least one historical time period;
whether the target user has the following historical behavior state: the number of times of performing the specified expression behavior for the text with the negative emotion within a second set time length reaches a second set number of times;
the time length from the time point at which the historical behavior state was last reached to the current time point.
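An illustrative extraction of a few of the operational features listed above from a flat event log follows; the event format, window boundaries, and the set of specified operations are assumptions.

```python
from collections import Counter

# Operations counted as "specified operations" here are assumptions.
SPECIFIED_OPS = {"full_play", "favorite", "single_loop"}

def operation_features(events, windows):
    """events: list of (timestamp, operation, negative_level) tuples for one
    target user, restricted to media resources with negative emotion.
    windows: list of (start, end) historical periods in the same time unit.
    Returns per-window counts, frequencies, and per-level counts."""
    feats = {}
    for i, (start, end) in enumerate(windows):
        hits = [e for e in events
                if e[1] in SPECIFIED_OPS and start <= e[0] < end]
        feats[f"count_w{i}"] = len(hits)                 # times in window i
        feats[f"freq_w{i}"] = len(hits) / max(end - start, 1)  # frequency
        for level, n in Counter(e[2] for e in hits).items():
            feats[f"count_w{i}_{level}"] = n             # per negative level
    return feats
```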
Optionally, the first recommending module 122 is further configured to:
selecting a plurality of media resources with positive emotion grades matched with the negative emotion grades of the target users from a media resource library to form a media resource candidate set;
for each media asset in the candidate set of media assets, performing the following:
fusing the first characteristics of the media resources with the second characteristics of the target user to obtain a context characteristic vector; the first feature includes at least one of: positive emotional characteristics, attribute characteristics, the second characteristics including at least one of: a negative mood feature, a second portrait feature;
determining an evaluation value corresponding to the media resource based on the context feature vector and a preset feature parameter of the media resource;
and selecting a first media resource with an evaluation value meeting a preset condition from the media resource candidate set, and recommending the first media resource to the target user.
Optionally, the apparatus further comprises a second recommending module 128 for:
acquiring feedback information of a target user aiming at a recommended first media resource;
updating the preset characteristic parameters according to the feedback information and the context characteristic vector to obtain the updated preset characteristic parameters;
determining an updated evaluation value corresponding to the first media resource based on the context feature vector of the first media resource and the updated preset feature parameter;
and selecting a second media resource with an evaluation value meeting a preset condition from the media resource candidate set, and recommending the second media resource to the target user.
Optionally, the apparatus further comprises a third recommending module 129, configured to:
if the negative emotion level of the target user is changed, selecting a plurality of media resources with the positive emotion level matched with the changed negative emotion level from the media resource library based on the changed negative emotion level, and obtaining an updated media resource candidate set;
determining an evaluation value of the media resource for each media resource in the updated media resource candidate set;
and selecting a third media resource with an evaluation value meeting a preset condition from the updated media resource candidate set, and recommending the third media resource to the target user.
Based on the same inventive concept, the embodiment of the present disclosure further provides an electronic device, and the principle of the electronic device to solve the problem is similar to the method of the above embodiment, so that the implementation of the electronic device may refer to the implementation of the method, and repeated details are not repeated.
Referring to fig. 13, the electronic device may include a processor 132 and a memory 131. The memory 131 stores computer programs and data and provides them to the processor 132. In the embodiment of the present disclosure, the memory 131 may be used to store the program of the user emotion adjusting method in the embodiment of the present disclosure.
The processor 132 is configured to execute the user emotion adjusting method in any of the above-described method embodiments, for example the user emotion adjusting method provided in the embodiment shown in fig. 2, by calling the computer program stored in the memory 131.
The specific connection medium between the memory 131 and the processor 132 is not limited in the embodiments of the present disclosure. In fig. 13, the memory 131 and the processor 132 are connected by a bus 133, represented by a thick line; the connection manner between other components is merely illustrative and not limiting. The bus 133 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in fig. 13, but this does not indicate only one bus or one type of bus.
The Memory may include a Read-Only Memory (ROM) and a Random Access Memory (RAM), and may further include a Non-Volatile Memory (NVM), such as at least one disk Memory. Alternatively, the memory may be at least one memory device located remotely from the processor.
The Processor may be a general-purpose Processor, including a Central Processing Unit, a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like.
The disclosed embodiment also provides a computer storage medium in which a computer program is stored; a processor of an electronic device reads the computer program from the computer-readable storage medium and executes it, causing the electronic device to execute the user emotion adjusting method in any of the above method embodiments.
In particular implementations, the computer storage medium may include various storage media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Based on the same inventive concept as the above method embodiments, the present application embodiments provide a computer program product comprising a computer program stored in a computer readable storage medium. The processor of the electronic device reads the computer program from the computer-readable storage medium, and the processor executes the computer program, so that the electronic device performs the steps of any one of the above-described user emotion adjusting methods.
The computer program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the present disclosure. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made to the present disclosure without departing from the spirit and scope of the disclosure. Thus, if such modifications and variations of the present disclosure fall within the scope of the claims of the present disclosure and their equivalents, the present disclosure is intended to include such modifications and variations as well.

Claims (16)

1. A method for adjusting user emotion, comprising:
identifying a target user with a negative emotion based on behavioral information of any user; the behavior information includes at least one of: operation information for the media resource, expression information for the text;
determining a negative emotion level of any target user based on the behavior characteristics and/or the first image characteristics of the target user; the behavioral characteristics include at least one of: an operational characteristic for media assets having a negative emotion, an expression characteristic for text having a negative emotion;
and selecting the media resources with the positive emotion ratings matched with the negative emotion ratings of the target users from a media resource library, and recommending the selected media resources to the target users.
2. The method of claim 1, wherein identifying a target user with a negative emotion based on behavioral information of either user comprises:
determining whether any user meets at least one of the following conditions based on the behavior information of the user: the operation information aiming at the media resource with the negative emotion meets a first preset condition, and the expression information aiming at the text with the negative emotion meets a second preset condition;
if so, the user is determined to be a target user with a negative emotion.
3. The method of claim 1, wherein determining the negative emotion rating of any target user based on the behavioral characteristics and/or first imagery characteristics of the target user comprises:
inputting the behavior characteristics and/or the first image characteristics of the target user into a user negative emotion classification model to obtain a negative emotion grade of the target user;
the user negative emotion classification model comprises a first-order feature crossing module, a second-order feature crossing module and classification modules which are respectively connected with the first-order feature crossing module and the second-order feature crossing module.
4. The method of claim 3, wherein the user negative emotion classification model is obtained based on a first sample data set, each sample data in the first sample data set comprising: behavior characteristics and/or first image characteristics of sample users and negative emotion level labels corresponding to the sample users;
the corresponding negative emotion rating label for the sample user is determined by:
determining, for each sample user of a plurality of sample users, a negative sentiment score for the sample user based on a specified operating characteristic of the sample user for a media asset having a corresponding negative sentiment rating and/or a specified expression characteristic for text having a corresponding negative sentiment rating over a set period of time;
and determining a corresponding negative emotion grade label of the negative emotion score of each sample user based on the preset corresponding relation between the negative emotion score range and the negative emotion grade.
5. The method of any of claims 1-4, wherein the media asset has a negative emotion corresponding to at least one negative emotion rating and the media asset has a positive emotion corresponding to at least one positive emotion rating;
the media assets having negative emotions are determined by:
inputting attribute characteristics of any media resource into a media resource emotion classification model to obtain first emotion information of the media resource, wherein the first emotion information comprises one of the following: negative mood level, positive mood level, no mood;
determining the media resource with the first emotion information having the negative emotion level as the media resource with the negative emotion;
wherein the media resource emotion classification model is a neural network model, and the attribute characteristics of the media resource include at least one of the following: audio characteristics of audio contained by the media asset; semantic features of text corresponding to the audio.
6. The method of claim 5, wherein the media resource emotion classification model is trained based on a second sample data set, wherein sample data contained in the second sample data set comprises: sample media resources, attribute characteristics of the sample media resources, and first emotion information labels corresponding to the sample media resources;
wherein the sample media asset has a first sentiment information tag determined by:
determining respective emotion confidence degrees of a plurality of sample media resources contained in a plurality of sample media resource sets based on respective attribute tags and/or title information of the sample media resource sets and preset emotion classification tags; wherein the sentiment classification tags include at least one negative sentiment tag and at least one positive sentiment tag, the sentiment confidence including a negative sentiment confidence or a positive sentiment confidence;
and determining a first emotion information label corresponding to any sample media resource based on the emotion confidence of the sample media resource.
7. The method according to claim 6, wherein the determining emotion confidence degrees of the sample media resources included in the sample media resource sets based on the attribute tags and/or title information of the sample media resource sets and preset emotion classification tags comprises:
determining initial first weights of the emotion classification labels corresponding to the sample media resource sets based on attribute labels and/or title information of the sample media resource sets to obtain an initial first weight matrix;
determining initial second weights of the emotion classification labels corresponding to the sample media resources contained in the sample media resource sets based on the initial first weight matrix, and obtaining an initial second weight matrix;
iteratively performing the following steps until a second weight of the emotion classification label corresponding to each of the plurality of sample media resources converges: updating the initial first weight matrix based on the initial second weight matrix to obtain an updated first weight matrix, and updating the initial second weight matrix based on the updated first weight matrix to obtain an updated second weight matrix;
and for any sample media resource, normalizing the second weight of the emotion classification label corresponding to the sample media resource to obtain the emotion confidence corresponding to the sample media resource.
8. The method of claim 6, wherein determining the corresponding first sentiment information tag of the sample media asset based on the sentiment confidence of the sample media asset comprises one of:
if the negative emotion confidence of the sample media resource reaches a first set value, determining a negative emotion grade label corresponding to the negative emotion confidence of the sample media resource based on a preset corresponding relationship between a negative emotion grade and the negative emotion confidence;
if the positive emotion confidence of the sample media resource reaches a second set value, determining a positive emotion grade label corresponding to the positive emotion confidence of the sample media resource based on a preset corresponding relationship between the positive emotion grade and the positive emotion confidence;
if the negative emotion confidence of the sample media resource does not reach the first set value, determining that a first emotion information label of the sample media resource is a no emotion label;
and if the confidence of the positive emotion of the sample media resource does not reach the second set value, determining that the first emotion information label of the sample media resource is a no emotion label.
9. The method of claim 1, wherein the text with negative emotions is determined by:
aiming at any text expressed by the user, executing text processing operation on the text through an embedding layer of a text emotion classification model to obtain embedding characteristics corresponding to the text;
executing corresponding operation on the embedded features through a full connection layer of the text emotion classification model to obtain target features;
performing corresponding operation on the target features through an output layer of the text emotion classification model to obtain second emotion information of the text, wherein the second emotion information comprises a negative emotion level, a positive emotion level or no emotion;
and determining the text with the second emotion information having the negative emotion level as the text with the negative emotion.
10. The method of claim 9, wherein the text emotion classification model is trained based on a third sample data set, each sample data in the third sample data set comprising: a sample text and a second emotion information label corresponding to the sample text;
the sample text and the second emotion information label corresponding to the sample text are obtained by the following method:
aiming at any emotion level label, acquiring a preset keyword set corresponding to the emotion level label;
for any text, preprocessing the text to obtain a target keyword set corresponding to the text;
if any keyword in the target keyword set of the text belongs to a preset keyword set corresponding to any emotion level label, taking the text as a sample text;
and taking the emotion grade label corresponding to the belonged preset keyword set as a second emotion information label corresponding to the sample text.
11. The method of any of claims 1-4 and 9-10, wherein selecting a media asset from the library of media assets that has a positive mood rating that matches the target user's negative mood rating, recommending the selected media asset to the target user comprises:
selecting a plurality of media resources with positive emotion grades matched with the negative emotion grades of the target users from the media resource library to form a media resource candidate set;
for each media resource in the candidate set of media resources, performing the following:
fusing the first characteristics of the media resources with the second characteristics of the target user to obtain a context characteristic vector; the first feature includes at least one of: positive emotional characteristics, attribute characteristics, the second characteristics including at least one of: a negative mood feature, a second portrait feature;
determining an evaluation value corresponding to the media resource based on the context feature vector and a preset feature parameter of the media resource;
and selecting a first media resource with an evaluation value meeting a preset condition from the media resource candidate set, and recommending the first media resource to the target user.
12. The method of claim 11, further comprising:
obtaining feedback information of the target user for the recommended first media resource;
updating the preset characteristic parameters according to the feedback information and the context characteristic vector to obtain updated preset characteristic parameters;
determining an updated evaluation value corresponding to the first media resource based on the context feature vector of the first media resource and the updated preset feature parameter;
and selecting a second media resource with an evaluation value meeting the preset condition from the media resource candidate set, and recommending the second media resource to the target user.
13. The method of claim 11, further comprising:
if the negative emotion level of the target user is changed, selecting a plurality of media resources with positive emotion levels matched with the changed negative emotion level from the media resource library based on the changed negative emotion level, and obtaining an updated media resource candidate set;
determining an evaluation value of the media resource for each media resource in the updated candidate set of media resources;
and selecting a third media resource with an evaluation value meeting the preset condition from the updated media resource candidate set, and recommending the third media resource to the target user.
14. A user emotion adjusting apparatus, comprising:
a negative user identification module for identifying a target user having a negative emotion based on the behavior information of any user; the behavior information includes at least one of: operation information for the media resource, expression information for the text;
the user grade determining module is used for determining the negative emotion grade of any target user based on the behavior characteristics and/or the first image characteristics of the target user; the behavioral characteristics include at least one of: an operational characteristic for media assets having a negative emotion, an expression characteristic for text having a negative emotion;
and the first recommending module is used for selecting the media resources with the positive emotion grades matched with the negative emotion grades of the target user from the media resource library and recommending the selected media resources to the target user.
15. An electronic device, comprising a processor and a memory, the memory having stored thereon a computer program operable on the processor, the computer program, when executed by the processor, causing the processor to perform the steps of the method of any of claims 1 to 13.
16. A computer-readable storage medium, in which a computer program is stored which, when run on an electronic device, is adapted to cause the electronic device to carry out the steps of the method of any one of claims 1 to 13.
CN202210894354.XA 2022-07-27 2022-07-27 User emotion adjusting method and device, electronic equipment and storage medium Active CN114969554B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210894354.XA CN114969554B (en) 2022-07-27 2022-07-27 User emotion adjusting method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210894354.XA CN114969554B (en) 2022-07-27 2022-07-27 User emotion adjusting method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114969554A true CN114969554A (en) 2022-08-30
CN114969554B CN114969554B (en) 2022-11-15

Family

ID=82969234

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210894354.XA Active CN114969554B (en) 2022-07-27 2022-07-27 User emotion adjusting method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114969554B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115578115A (en) * 2022-09-21 2023-01-06 支付宝(杭州)信息技术有限公司 Resource decimation processing method and device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090226046A1 (en) * 2008-03-07 2009-09-10 Yevgeniy Eugene Shteyn Characterizing Or Recommending A Program
CN111326235A (en) * 2020-01-21 2020-06-23 京东方科技集团股份有限公司 Emotion adjusting method, device and system
CN112364168A (en) * 2020-11-24 2021-02-12 中国电子科技集团公司电子科学研究院 Public opinion classification method based on multi-attribute information fusion
CN112463994A (en) * 2020-11-25 2021-03-09 北京达佳互联信息技术有限公司 Multimedia resource display method, device, system and storage medium
CN112667887A (en) * 2020-12-22 2021-04-16 北京达佳互联信息技术有限公司 Content recommendation method and device, electronic equipment and server
CN113572893A (en) * 2021-07-13 2021-10-29 青岛海信移动通信技术股份有限公司 Terminal device, emotion feedback method and storage medium
CN113934937A (en) * 2021-10-22 2022-01-14 平安国际智慧城市科技股份有限公司 Intelligent content recommendation method and device, terminal and storage medium

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090226046A1 (en) * 2008-03-07 2009-09-10 Yevgeniy Eugene Shteyn Characterizing Or Recommending A Program
CN111326235A (en) * 2020-01-21 2020-06-23 京东方科技集团股份有限公司 Emotion adjusting method, device and system
CN112364168A (en) * 2020-11-24 2021-02-12 中国电子科技集团公司电子科学研究院 Public opinion classification method based on multi-attribute information fusion
CN112463994A (en) * 2020-11-25 2021-03-09 北京达佳互联信息技术有限公司 Multimedia resource display method, device, system and storage medium
CN112667887A (en) * 2020-12-22 2021-04-16 北京达佳互联信息技术有限公司 Content recommendation method and device, electronic equipment and server
CN113572893A (en) * 2021-07-13 2021-10-29 青岛海信移动通信技术股份有限公司 Terminal device, emotion feedback method and storage medium
CN113934937A (en) * 2021-10-22 2022-01-14 平安国际智慧城市科技股份有限公司 Intelligent content recommendation method and device, terminal and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHANG Hong et al.: "The influence of information emotion type on users' sharing intention: a study based on Weibo hot events", Journal of Intelligence (情报杂志) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115578115A (en) * 2022-09-21 2023-01-06 支付宝(杭州)信息技术有限公司 Resource decimation processing method and device
CN115578115B (en) * 2022-09-21 2023-09-08 支付宝(杭州)信息技术有限公司 Resource lottery processing method and device

Also Published As

Publication number Publication date
CN114969554B (en) 2022-11-15

Similar Documents

Publication Publication Date Title
Serban et al. A deep reinforcement learning chatbot
Wang et al. Sequence-based context-aware music recommendation
Wang et al. Improving content-based and hybrid music recommendation using deep learning
Fan et al. Augmenting transformers with KNN-based composite memory for dialog
CN110175227B (en) Dialogue auxiliary system based on team learning and hierarchical reasoning
Castaneda et al. Evaluation of maxout activations in deep learning across several big data domains
Tran et al. Ensemble application of ELM and GPU for real-time multimodal sentiment analysis
Deldjoo et al. Content-driven music recommendation: Evolution, state of the art, and challenges
Goh et al. A Novel Sentiments Analysis Model Using Perceptron Classifier
Chang et al. Music recommender using deep embedding-based features and behavior-based reinforcement learning
US10755177B1 (en) Voice user interface knowledge acquisition system
CN113505198B (en) Keyword-driven generation type dialogue reply method and device and electronic equipment
Chamishka et al. A voice-based real-time emotion detection technique using recurrent neural network empowered feature modelling
Sekulić et al. User engagement prediction for clarification in search
De Boom et al. Large-scale user modeling with recurrent neural networks for music discovery on multiple time scales
CN114969554B (en) User emotion adjusting method and device, electronic equipment and storage medium
Sarin et al. SentiSpotMusic: a music recommendation system based on sentiment analysis
Chen et al. Robotic musicianship based on least squares and sequence generative adversarial networks
Li et al. Attention learning with retrievable acoustic embedding of personality for emotion recognition
CN114254205A (en) Music multi-modal data-based user long-term and short-term preference recommendation prediction method
Sánchez-Moreno et al. Dynamic inference of user context through social tag embedding for music recommendation
Kai Automatic recommendation algorithm for video background music based on deep learning
Baxter et al. Context-Based Music Recommendation Algorithm Evaluation
Chen et al. RACL: A robust adaptive contrastive learning method for conversational satisfaction prediction
Xu et al. MMusic: a hierarchical multi-information fusion method for deep music recommendation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant