WO2021114841A1

WO2021114841A1 - User report generating method and terminal device

Info

Publication number: WO2021114841A1
Application number: PCT/CN2020/119300
Authority: WO
Inventors: 邓悦; 金戈; 徐亮
Original assignee: 平安科技（深圳）有限公司
Priority date: 2020-05-14
Filing date: 2020-09-30
Publication date: 2021-06-17
Also published as: CN111694940A

Abstract

A user report generating method and device, comprising: acquiring a plurality of voice signals generated by a target user during a session, and converting the voice signals into a corresponding session text (S101); implementing semantic analysis of the session text to obtain session keywords corresponding to the session text and a session label corresponding to the keywords, and generating a session content set (S102); acquiring a session word vector corresponding to the session keywords in the session content set and, on the basis of the session word vectors, determining an emotion feature value corresponding to the voice signals (S103); and, on the basis of the emotion feature values of all of the voice signals, generating a personality analysis report of the target user (S104). The present method does not require the user to spend additional time writing a personality analysis report relating to the target user and can thereby greatly reduce user operations, and determines emotion feature values based on the voice signals at different stages during the session, increasing the accuracy of the personality analysis report.

Description

Method for generating user report and terminal equipment

This application affirms that it enjoys the priority of the Chinese patent application with the application number 202010406546.2 and the name "A method and terminal device for generating user reports" filed on May 14, 2020. The entire content of the Chinese patent application is incorporated by reference In this application.

Technical field

This application belongs to the field of artificial intelligence technology, and in particular relates to a method for generating a user report and a terminal device.

Background technique

With the continuous expansion of the scale of the enterprise, the number of employees has also increased. Therefore, the inventor realized how to efficiently screen interviewers and determine the personality characteristics of the interviewers, which directly affects the efficiency of the interview and the speed of decision-making. The inventor found that through the user analysis report, the situation of the interviewed user can be quickly understood, and the interview efficiency can be greatly improved.

With the existing user analysis report generation technology, the inventor found that it mainly relies on the interviewer to analyze the interviewer’s personality. By collecting the interviewer’s answers to preset questions, the interviewer’s personality characteristics are subjectively determined, and the user analysis report is generated. People realize that the existing user analysis reports are completed by manpower, and the generation efficiency is low, thereby reducing the efficiency of personnel management.

technical problem

In view of this, the embodiment of the present application provides a method and terminal device for generating user reports to solve the problem of existing user report generation technology, which relies on manpower to complete, and the report generation efficiency is low, thereby reducing the efficiency of personnel management. .

Technical solutions

The first aspect of the embodiments of the present application provides a method for generating a user report, including:

Acquiring multiple voice signals generated by the target user during the conversation, and converting each of the voice signals into corresponding conversation text;

Performing semantic analysis on the conversation text, obtaining conversation keywords corresponding to the conversation text and conversation tags corresponding to each of the keywords, and generating a conversation content set;

Obtaining a conversation word vector corresponding to each of the conversation keywords in the conversation content set, and determining an emotional feature value corresponding to the voice signal based on each of the conversation word vectors;

Based on the emotional feature values of all voice signals, a personality analysis report of the target user is generated.

Beneficial effect

The embodiment of the application collects the voice signal of the target user during the conversation with the target user, converts the voice signal into the corresponding conversation text, and performs semantic analysis on the conversation text to obtain the corresponding conversation content collection, which is based on the conversation The conversation word vector of each conversation keyword in the content collection generates the emotional feature value corresponding to the voice signal, and based on the emotional feature value of all voice signals, determines the personality type of the target user, and generates a personality analysis report about the target user, so as to be able to During the conversation with the target user, the target user’s language is used to determine the personality, which achieves the purpose of automatically outputting an analysis report. Compared with the existing user report technology, this embodiment does not rely on the interviewer or the conversation object to manually fill in or subjectively judge, and does not require the user to spend extra time writing a personality analysis report on the target user, thereby greatly reducing user operations, and The above process can determine the emotional characteristic value through the voice signals at different stages in the conversation process, instead of using a single utterance or sentence to judge the personality, thereby improving the accuracy of the personality analysis report.

Description of the drawings

FIG. 1 is an implementation flowchart of a method for generating a user report provided by the first embodiment of the present application;

2 is a specific implementation flow chart of a method S103 for generating a user report provided by the second embodiment of the present application;

FIG. 3 is a specific implementation flow chart of a method S1031 for generating a user report provided by the third embodiment of the present application;

FIG. 4 is a specific implementation flow chart of a method S301 for generating a user report provided by the fourth embodiment of the present application;

FIG. 5 is a specific implementation flowchart of a method S302 for generating a user report provided by the fifth embodiment of the present application;

6 is a specific implementation flow chart of a method S1034 for generating a user report provided by the sixth embodiment of the present application;

FIG. 7 is a specific implementation flowchart of a method S104 for generating a user report provided by the seventh embodiment of the present application;

FIG. 8 is a structural block diagram of a device for generating a user report according to an embodiment of the present application;

FIG. 9 is a schematic diagram of a terminal device provided by another embodiment of the present application.

Embodiments of the present invention

In the embodiment of the present application, the execution subject of the process is a terminal device, which includes but is not limited to: servers, computers, smart phones, tablet computers, and other devices capable of executing the method for generating user reports. Fig. 1 shows an implementation flow chart of the method for generating a user report provided by the first embodiment of the present application, and the details are as follows:

In S101, multiple voice signals generated by a target user during a conversation are acquired, and each of the voice signals is converted into a corresponding conversation text.

In this embodiment, the terminal device may be a server of the user database, and the server may be connected to the distributed microphone module through a communication link. The communication link may be a physical link for wired communication, or may be through a local area network or the Internet. The virtual link established by the method. The microphone module can be deployed in the same area as the terminal device, or distributed in various interview locations to collect voice signals generated during the interview.

Optionally, in this embodiment, the microphone module is specifically a microphone array. The microphone array contains multiple microphone devices. During the process of collecting voice signals, the microphone array can obtain information about the current interview scene from multiple different angles. The voice signal is filtered and shaped through multiple voice signals to obtain a target signal for voice recognition. A microphone array composed of a certain number of microphones to collect voice signals to sample and process the spatial characteristics of the sound field. It is used in the complex environment of the interview environment and can effectively solve noise, reverberation, vocal interference, echo, etc. The problem is to improve the signal quality of the voice signal collection, so that when the text information is subsequently output, the success rate of the text information conversion can be improved.

In this embodiment, the terminal device may be set with an interview time period. If the terminal device detects that the current time has reached the preset interview start time, the microphone module is turned on to obtain the voice signal of the current interview scene through the microphone module. In addition, when the terminal device detects that the current time reaches the preset interview end time, the microphone module is turned off, and all the voice signals collected during the interview time period are converted into text information. Since during the meeting, the user's speech is not continuous, but intermittent, the terminal device can be configured with a start decibel value and an end decibel value. When the microphone module detects that the decibel value of the current interview scene is greater than the start decibel value, it will Start to collect the voice signal, and when the decibel value is less than the end decibel value, end the collection of the voice signal, take each collected voice signal as a conversation paragraph in the conversation process, and output the corresponding conversation text for each conversation paragraph In the interview process, multiple conversational paragraphs are generated between the target user and the interviewer based on the question-and-answer process. The terminal device can recognize the emotional feature value of each conversational paragraph, and generate all the conversational texts generated during the entire conversation. Personality analysis report of target users.

Optionally, in this embodiment, the terminal device may perform the output operation of text information after receiving a segment of voice signal, and after detecting the end of the current interview (for example, reaching the preset interview end time or detecting the pre-interview) If no voice signal is received within the waiting time), the operation of S102 is executed based on the text information corresponding to all the collected voice signals, that is, the collection operation is performed in parallel with the voice recognition operation; the terminal device can also collect the current conference All the voice information of is stored in the database, and after the interview is over, the operation of S102 is executed.

In this embodiment, the terminal device may be provided with a voice recognition algorithm. The terminal device may parse the voice signal through the voice recognition algorithm and output text information corresponding to the voice signal. This achieves the purpose of voice recognition, automatically records the interview content, and obtains the target. The conversation text of the user during the conversation. Optionally, in the process of voice recognition, the terminal device can determine the interview language used in the interview process, and adjust the voice recognition algorithm based on the interview language, thereby improving the accuracy of recognition. Specifically, the manner of determining the language of the interview may be: obtaining user information of the target user participating in the interview, the user information including information such as the user's household registration or residential address; and determining the language of the interview based on the household register or residential address of the target user.

In a possible implementation, the terminal device can divide the conversation text into multiple conversation segments based on the preset maximum number of sentences, and each conversation segment contains no more than the preset maximum number of sentences. When the conversation duration is long, the amount of generated conversation text is relatively large. By dividing the conversation text, the efficiency of subsequent recognition operations can be improved, and the number of marks can be stabilized. Of course, the terminal device can generate a corresponding sentence selection box based on the maximum number of sentences, and traverse the conversation text based on the sentence selection box to select consecutive conversation segments of multiple sentences, thereby stabilizing the number of sentences recognized each time. The consistency of the identification parameters is improved.

In a possible implementation manner, the manner of converting the voice signal into conversational text may specifically be: parsing the voice signal, and extracting the waveform characteristics and pitch characteristics corresponding to each frame of the voice signal. The waveform characteristics and pitch characteristics corresponding to each frame of speech signal are sequentially input into the trained speech recognition model. The speech recognition model is specifically trained based on the standard waveforms and pitch waveforms corresponding to all candidate characters, and the similarity with each candidate character can be calculated by importing the speech signal of each frame into the aforementioned speech recognition model. The candidate character with the highest similarity is selected as the text corresponding to the speech signal of the frame, and the conversation text corresponding to the speech signal is generated based on the text of all frames.

In S102, semantic analysis is performed on the conversation text to obtain conversation keywords corresponding to the conversation text and conversation tags corresponding to each of the keywords, and a conversation content set is generated.

In this embodiment, the terminal device may be equipped with a semantic recognition algorithm, which can perform semantic analysis on the conversation text, and extract the conversation keywords contained in the aforementioned conversation text. The process of extracting conversation keywords by the semantic recognition algorithm can be specifically as follows: the conversation text is divided into words, divided into multiple phrases containing several characters, and each phrase contains at least one character and no more than 4 characters; The part-of-speech recognition of phrases can filter invalid phrases that are not related to emotions. For example, some connectives have less relevance to the analysis of emotional personality, such as the connectives "and", "and" and "union", etc. There are some particles such as "的", "地" and "得". After the terminal device filters out invalid phrases, the effective phrases containing user emotions are obtained, and the above effective phrases are recognized as conversation keywords; optionally The terminal device stores a key dictionary and judges whether the valid phrase is in the key dictionary. If it exists, the valid phrase is recognized as a conversation keyword; otherwise, the valid phrase is recognized as an invalid phrase.

In this embodiment, the terminal device may configure a corresponding session tag for the session keyword, and the session tag is used to indicate the feature value of the session keyword in the preset word dimension. For example, the conversation tag can be used to mark the part of speech of the conversation keyword, such as the conversation keyword "today". In the case of part-of-speech classification, the conversation label can be set to "noun", and in the case of word content classification, the conversation label Can be set to "time qualifier" and so on. Based on different division methods and the needs of the emotion recognition process, different session tags can be configured for session keywords. The number of the above-mentioned session tags can be one, or two or more, which is not limited here. Encapsulate all session keywords with session tags to obtain the above-mentioned set of session content. Illustratively, the set of session content can be expressed as

Where, i = 1, ..., N ; j = 1, ..., N i; the N speech signal included in the total for the entire session, i.e. the session number of text; N _i indicates the The number of sentences contained in the i-th session text.

For example, the conversation between the interviewer and the interviewer is as follows: "Interviewer: Hello, please introduce yourself. Interviewer: Hello, interviewer. My name is Zhang San. I am from Shenzhen. Graduated from university. Good at testing. Interviewer: What do you know about our position?” Among them, there are 3 speech signals in the above conversation process, that is, the number of conversation texts is 3, and i is the number of conversation texts. For example, the conversation sequence of "Hello, please introduce yourself" is 1. And each dialogue contains the corresponding number of sentences. For example, "Hello, please introduce themselves" statement includes the number is 2, namely, "Hello" and "Please introduce yourself" At this point, N _i 2.

Further, as another embodiment of the present application, before S102, it may also include: before determining the tag corresponding to the keyword, the automatic tag recognition algorithm can be trained to maximize the value of the maximization function, and the automatic recognition can be performed at this time. The label recognition algorithm has been adjusted, where the maximization function can be expressed as:

Among them, θ represents model parameters.

In S103, a conversation word vector corresponding to each of the conversation keywords in the conversation content set is obtained, and an emotional feature value corresponding to the voice signal is determined based on each of the conversation word vectors.

In this embodiment, the terminal device may generate a conversation word vector corresponding to the conversation keyword according to each conversation keyword and the corresponding conversation label in the conversation content set. In a possible implementation manner, the method for generating the above-mentioned conversational word vector may be as follows: the terminal device is configured with a key dictionary, and each candidate keyword in the key dictionary is configured with a corresponding word number, and the conversation keyword is identified in the above-mentioned The word number in the key dictionary determines the value of the first dimension based on the word number; correspondingly, the terminal device can generate a tag dictionary and determine the second dimension value of the conversation keyword by querying the tag number of the conversation tag in the tag dictionary. A conversation word vector is generated based on the first dimension value and the second dimension value.

In a possible implementation, the method for generating the above-mentioned conversational word vector may also be: obtaining the parameter values of the conversation keyword in multiple parts of speech dimensions, generating a multi-dimensional vector, and correspondingly, obtaining the conversation label in multiple parts of speech dimensions. The parameter value can also generate a multi-dimensional vector about the conversation tag, and merge the multi-dimensional vector of the conversation keyword with the multi-dimensional vector of the tag key to obtain the above-mentioned conversation word vector.

In this embodiment, the terminal device may be configured with an emotion recognition network, and the terminal device imports the emotion recognition network in sequence according to the appearance order of each session keyword, and imports the preset end mark after all the session keywords are input. The emotion recognition network outputs the emotional feature value corresponding to the above-mentioned conversational text, that is, the speech signal. Specifically, the aforementioned emotional feature value may include scores in multiple emotional dimensions, such as an emotional magnitude dimension and a positive degree dimension.

In S104, based on the emotional feature values of all voice signals, a personality analysis report of the target user is generated.

In an embodiment, the generated personality analysis report of the target user is stored in the blockchain network, and the data information can be shared between different platforms through the storage of the blockchain, and the data can also be prevented from being tampered with.

Blockchain is a new application mode of computer technology such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm. Blockchain, essentially a decentralized database, is a series of data blocks associated with cryptographic methods. Each data block contains a batch of network transaction information for verification. The validity of the information (anti-counterfeiting) and the generation of the next block. The blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.

In this embodiment, the terminal device can generate a user portrait of the target user according to the emotional characteristics corresponding to all conversational content, determine the probability score corresponding to each personality type, and finally select the personality type with the highest probability score as the personality type of the target user. And generate the personality analysis report of the target user mentioned above. Optionally, the terminal device can also record the probability scores of all personality types in the personality analysis report, so that the interview manager can determine the potential personality characteristics of the target user based on the personality analysis report, which improves the richness of the content of the personality analysis report.

It can be seen from the above that the method for generating a user report provided by the embodiment of the present application collects the voice signal of the target user during a conversation with the target user, converts the voice signal into the corresponding conversation text, and responds to the conversation. The text is semantically analyzed to obtain the corresponding conversational content collection, and based on the conversational word vector of each conversation keyword in the conversational content collection, the emotional feature value corresponding to the voice signal is generated, and based on the emotional feature value of all voice signals, the personality of the target user is determined Type, and generate a personality analysis report on the target user, so that the target user’s language can be used to determine the personality during the conversation with the target user, and the purpose of automatically outputting the analysis report is realized. Compared with the existing user report technology, this embodiment does not rely on the interviewer or the conversation object to manually fill in or subjectively judge, and does not require the user to spend extra time writing a personality analysis report on the target user, thereby greatly reducing user operations, and The above process can determine the emotional characteristic value through the voice signals at different stages in the conversation process, instead of using a single utterance or sentence to judge the personality, thereby improving the accuracy of the personality analysis report.

Fig. 2 shows a specific implementation flow chart of a method S103 for generating a user report provided by the second embodiment of the present application. Referring to FIG. 2, compared with the embodiment described in FIG. 1, S103 in a method for generating a user report provided in this embodiment includes: S1031 to S1036, which are detailed as follows:

Further, the obtaining the conversation word vector corresponding to each of the conversation keywords in the conversation content set, and determining the emotional feature value corresponding to the voice signal based on each of the conversation word vectors includes:

In S1031, determine the associated entity of each of the session keywords in the preset knowledge graph, and obtain the weighted weight corresponding to each of the associated entities.

In this embodiment, the terminal device is configured with a knowledge graph, which contains multiple knowledge nodes, and there are corresponding association relationships between different knowledge nodes, thereby forming a network connected by multiple knowledge nodes, that is, the above Knowledge graph. The terminal device can determine the knowledge node associated with the session keyword on the above-mentioned knowledge graph, and identify other nodes adjacent to the associated knowledge node, that is, other knowledge nodes with an association relationship, as the associated entity of the session keyword.

In a possible implementation manner, the terminal device may determine the weighted weight of the above-mentioned concern entity according to the confidence of the association relationship between the knowledge node associated with the session keyword and the associated entity.

In S1032, the word concept vector of the conversation keyword is generated according to the weighted weight of all the associated entities.

In this embodiment, since it is determined based on the knowledge graph that the associated entity does not have features related to the contextual concept and emotion of the conversational sentence where the conversation keyword is located, and the weighted weight is calculated considering the contextual relevance and emotional features, it can be converted to A concept vector containing the above two features. The specific calculation method is:

Wherein, c(t) is the above-mentioned word concept vector; g(t) is the total number of associated entities contained in the session keyword; c _k is the word vector of the k-th associated entity of the above-mentioned session keyword, and w _k is the above-mentioned The weighted weight of the k-th associated entity of the session keyword.

Preferably, as another embodiment of the present application, after the word concept vector corresponding to the conversation keyword is calculated, the word concept vector can be converted into a word feature vector by linear change, and the specific conversion method can be:

among them,

Is the word feature vector of the above-mentioned session keywords; W is the model parameter, and the model parameter W∈R ^d*2d ; t is the sentence vector of the sentence where the session keyword is located; Embed(t) is the size of the embedded session keyword; Post( t) is the coding at the position of the conversation sentence; R ^d is the word vector size of the conversation keyword.

In S1033, based on the conversational sentence to which each of the conversational keywords belongs, encapsulate all word concept vectors belonging to the same conversational sentence to generate the sentence concept vector of the conversational sentence; The conversation text is obtained after sentence division.

In this embodiment, the conversation text may contain multiple conversation sentences. The terminal device may divide the conversation keywords based on the conversation sentences described by the conversation keywords to obtain multiple conversation keyword groups, and all conversation keywords in each conversation keyword group correspond to the same conversation sentence. The terminal device can encapsulate the word concept vectors belonging to the same conversation sentence to generate a sentence probability vector corresponding to the conversation sentence.

In S1034, the sentence concept vector of each conversation sentence is imported into the first attention algorithm to obtain the dialogue update vector of each conversation sentence.

In this embodiment, the above-mentioned dialogue update vector sentence is used to characterize the emotional characteristics of the conversation sentence in this application. Therefore, the terminal device can separately import the sentence concept vector of each conversation sentence into the first attention algorithm to obtain the dialogue update vector.

In S1035, encapsulate the sentence concept vectors of all conversation sentences of the conversation text, generate the conversation concept vectors of the conversation text, and import the conversation concept vectors into the second attention model to generate the conversation Text concept vector for text.

In this embodiment, since the first attention model is specifically used to determine the emotional characteristics of a single sentence, the terminal device can determine the overall emotional characteristics of the entire conversation text according to the contextual connections between different sentences. Therefore, the sentence concept vectors of all conversational sentences can be encapsulated to obtain the conversation concept vector, and the conversation concept vector can be imported into the aforementioned second attention model to obtain the text concept vector. Among them, the text approximate vector can be specifically expressed as:

FF(x)=max(0,W ₁ x+b ₁ )W ₂ +b ₂

Where is the text concept vector, which is the dialogue concept vector of the i-th conversational text; W ₁ , W ₂ , b ₁ and b ₂ are the model parameters of the second attention model; d _s is the number of endpoints based on linear transformation The coefficient value determined by h, d _s =d/h. L(x) is a linear transformation based on the number of endpoints h; L'(x) is an inverse linear transformation based on the number of endpoints h.

In S1036, the emotional feature value is determined according to the dialogue update vector and the text concept vector.

In this embodiment, the terminal device can import the dialogue update vector and the text concept vector into the third attention model to obtain the emotion concept vector corresponding to the conversation text. The emotional concept vector can be specifically:

Among them, R ⁱ is the above-mentioned emotional concept vector;

Update the vector for the dialogue. The terminal device can import the foregoing emotional concept vector into a preset pooling layer, perform emotional feature extraction, and obtain the emotional feature value corresponding to the foregoing emotional concept vector. The pooling layer can be expressed as:

O=max_pool(R ⁱ )

p=softmax(O*W ₃ +b ₃ )

Among them, p is the above-mentioned emotional feature value; W ₃ ∈ R ^d*q , b ₃ ∈ R ^q represents the model parameter, and q represents the number of classes.

In the embodiment of this application, the conversational content is extended by obtaining the associated entities of the conversation keywords, and the conversation update vector based on a single sentence and the text concept vector based on all sentences are determined respectively to determine the emotional feature value of the target user. Determine the user's emotional characteristics from multiple dimensions, thereby improving the accuracy of the emotional characteristics.

Fig. 3 shows a specific implementation flow chart of a method S1031 for generating a user report provided by the third embodiment of the present application. Referring to FIG. 3, compared to the embodiment described in FIG. 2, S1031 in a method for generating a user report provided in this embodiment includes: S301 to S303, which are detailed as follows:

Further, the determining the associated entity of each of the session keywords in the preset knowledge graph and obtaining the weighted weight corresponding to each of the associated entities includes:

In S301, obtain the correlation strength factor between each of the associated entities and the session keywords.

In this embodiment, according to the closeness of the association between the different knowledge nodes, the association confidence between the different associated nodes can be determined. For example, if two knowledge nodes have a co-occurrence relationship in most of the text (the co-occurrence relationship is that multiple knowledge nodes appear in the same sentence at the same time), the correlation between the above-mentioned knowledge nodes has a higher confidence; , If two knowledge nodes only have a co-occurrence relationship in a small amount of text, the confidence of the association between the above-mentioned knowledge nodes is low. According to the confidence of the association between the knowledge node and the associated entity associated with the session keyword, the above-mentioned association strength factor can be obtained.

In a possible implementation manner, the terminal device may include a conversion algorithm of the correlation strength factor, and import the correlation confidence corresponding to the associated entity into the conversion algorithm to generate the aforementioned correlation strength factor.

In S302, based on a preset emotion measurement algorithm, the emotion intensity factor of each associated entity is determined.

In this embodiment, because different words have corresponding emotional characteristics, for example, the word "smile" is a more positive word emotionally, while the word "cry" is a negative word emotionally, and can be based on different words The corresponding content and meaning can be converted into corresponding emotional intensity factors. The terminal device can be equipped with an emotion measurement algorithm, which can convert words into a computer-recognizable emotion intensity factor. In this case, the terminal device can import the associated entity into the aforementioned emotion measurement algorithm, and output the emotion intensity factor corresponding to the associated entity.

In S303, a weighted weight of the associated entity is constructed based on the emotional intensity factor and the associated intensity factor.

In this embodiment, the terminal device can generate the weighted weight of the associated entity according to the emotion factor and the correlation strength factor, and the weighted weight includes the closeness of the association with the session keyword and the emotional feature, which is convenient for the subsequent emotional feature value. Of ok. Wherein, the weighting weight may specifically be:

w _k =λ _k *rel _k +(1-λ _k )*aff _k

Where w _k is the weighted weight corresponding to the k-th associated entity; rel _k is the associated intensity factor corresponding to the k-th associated entity, aff _k is the emotional intensity factor of the k-th associated entity, and λ _k is the k-th associated entity. The preset parameters of the associated entity.

In the embodiment of the present application, by calculating the correlation strength between the associated entity and the session keyword and the emotional feature of the keyword in this application, the weighted weight corresponding to the associated entity when calculating the emotional feature value is determined. The higher the degree of association, The higher the corresponding weighted weight, the greater the contribution of the emotional feature of the present application to the emotional feature value of the subsequent conversational text, so that the accuracy of the emotional feature value can be improved.

FIG. 4 shows a specific implementation flowchart of a method S301 for generating a user report provided by the fourth embodiment of the present application. Referring to FIG. 4, compared with the embodiment described in FIG. 3, a method S301 for generating a user report provided by this embodiment includes: S3011 to S3013, which are detailed as follows:

Further, the obtaining the correlation strength factor between each of the associated entities and the session keywords includes:

In S3011, based on the knowledge graph, the association confidence between the associated entity and the session keyword is determined.

In this embodiment, the confidence level of the association relationship between each knowledge node may be recorded in the knowledge graph, and the terminal device marks the session keyword and the associated entity in the knowledge graph to determine the confidence level of the association relationship between the two. The confidence of the correlation is identified as the confidence of the correlation between the above two. Among them, the more the number of co-occurrences between the associated entity and the session keyword, the higher the corresponding confidence of the association; conversely, the less the number of co-occurrences between the two, the lower the confidence of the corresponding association.

In S3012, the conversation sentence associated with the conversation keyword is imported into a preset pooling layer, a sentence vector of the conversation sentence associated with each of the conversation keywords is generated, and the conversation keyword is determined based on the sentence vector The conversational text vector of the segment; the conversational text vector is specifically:

Wherein, CR(X ⁱ ) is the conversation text vector of the conversation keyword, and the conversation text number where the conversation keyword is located is i;

Is the sentence vector of the conversation sentence where the conversation keyword is located, and the sentence number of the conversation sentence in the conversation text is j; the M is a preset correlation coefficient.

In this embodiment, the conversation text contains multiple conversation sentences. Assume that the conversation sentence where the conversation keyword is

Then the associated sentence that has an association relationship with the session keyword is

to

M conversational sentences in, where M is the preset correlation coefficient. In order to control the data processing volume of the terminal device, the terminal device can be configured with the number of contacts M. In the process of emotional feature recognition, the maximum number of sessions that need to be uniformly recognized can be determined based on the correlation coefficient M. By importing the sentence vectors corresponding to the M conversational sentences into the above-mentioned text-to-vector conversion function, the conversational text vector corresponding to the conversational text based on the conversational keywords can be determined.

In S3013, the correlation strength factor is calculated based on the conversation text vector and the correlation confidence; the correlation strength factor is specifically:

rel _k =max-min(s _k )*|cos(CR(X ⁱ ),c _k )|

Where rel _k is the correlation strength factor of the k-th session keyword; c _k is the correlation confidence of the k-th associated entity of the session keyword; max-min(s _k ) is the session keyword The emotions corresponding to k associated entities are extremely poor.

In this embodiment, the terminal device may include multiple different emotion measurement algorithms, and the emotional parameter values of the associated entities determined by different emotion measurement algorithms may be different. The terminal device may determine the related entities according to different emotion measurement algorithms. The emotional range, that is, the above-mentioned max-min(s _k ), the emotional range of the related entity and the above two parameters are imported into a preset correlation strength conversion algorithm to obtain the correlation strength value of the related entity.

In the embodiment of the present application, by determining the conversational text vector, in the process of calculating the associated entity, the association between different conversational sentences in the entire conversational text is considered, so that the accuracy of the correlation strength factor can be improved.

FIG. 5 shows a specific implementation flowchart of a method S302 for generating a user report provided by the fifth embodiment of the present application. Referring to FIG. 5, compared to the embodiment described in FIG. 3, a method S302 for generating a user report provided in this embodiment includes: S3021 to S3023, which are detailed as follows:

Further, the determining the emotion intensity factor of each associated entity based on a preset emotion measurement algorithm includes:

In S3021, the emotional attributes of the associated entity are identified.

In this embodiment, the terminal device can determine the emotional intensity factor in different ways according to the different emotional attributes of the associated entities. For example, if the demonstrative pronoun "I" does not contain emotional characteristics, the corresponding emotional attribute is a non-affective type; the adjective "great" contains a certain degree of emotional characteristics, and the corresponding emotional attribute is an emotional type. Based on this, the terminal device can identify the emotional attribute of each associated entity. If the emotional attribute of the associated entity is a non-emotional type, the operation of S3022 is performed; conversely, if the emotional type of the associated entity is an emotional type, the operation of S3023 is performed.

In S3022, if the affective attribute of the associated entity is a non-emotional type, the affective intensity factor is configured as a preset default value.

In this embodiment, the terminal device can configure all non-emotional related entities with a fixed value of emotional intensity factor, and the value of the emotional intensity factor can be configured to be 0.5.

In S3023, if the emotion attribute of the associated entity is an emotion type, the emotion intensity factor of the conversation keyword is calculated through a preset emotion conversion algorithm; the emotion intensity factor is specifically:

Where aff _k is the emotional intensity factor of the k-th associated entity; VAD(c _k ) is the positive emotional score of the k-th associated entity; A(c _k ) is the k-th associated entity The emotional magnitude score of the associated entity.

In this embodiment, the emotional intensity factor is specifically composed of two different emotional dimensions, divided into a positive emotional dimension and an emotional amplitude dimension, where the positive emotional dimension is specifically used to identify whether the corresponding emotional feature of the entity is positive, if the degree of positive The higher the value, the higher the corresponding emotional score. For example, the positive emotional score corresponding to "laugh" is positive, while the positive emotional score corresponding to "cry" is negative, and the emotional positive score corresponding to "optimism" is higher than the emotional positive score of "acceptance" The emotional amplitude score is used to identify the emotional fluctuation amplitude of the entity, for example, the emotional amplitude score of "laugh" will be lower than the emotional amplitude score of "large". The terminal device can determine the corresponding emotional scores of each associated entity in the above two dimensions through a preset emotional measurement algorithm, and obtain the corresponding emotional intensity factor. among them,

It is based on the norm of 2.

In the embodiment of the present application, the emotion attribute of the associated entity is identified, and the calculation method of the corresponding emotion intensity factor is selected, thereby improving the accuracy of the emotion intensity factor.

FIG. 6 shows a specific implementation flow chart of a method S1034 for generating a user report provided by the sixth embodiment of the present application. Referring to FIG. 6, compared to the embodiment described in FIG. 2, S1034 in a method for generating a user report provided in this embodiment includes: S601 to S603, which are detailed as follows:

Further, the respectively importing the sentence concept vector of each of the conversational sentences into the first attention algorithm to obtain the dialogue update vector of each of the conversational sentences includes:

In S601, the sentence concept vector of the conversation sentence is linearly changed to obtain a linear vector containing h endpoints; where h is the preset number of endpoints.

In this embodiment, the terminal device can perform a linear transformation on the sentence concept of the conversational sentence, and project the sentence concept vector into h more endpoints to obtain a linear vector about the sentence concept vector. Wherein, the above-mentioned value of h may be a preset linear transformation parameter of the first attention algorithm, or may be changed based on the text amount of the conversational text.

In S602, the linear vector is imported into the multi-head self-attention layer of the first attention algorithm to obtain the attention vector of the conversation sentence; the attention vector is specifically:

among them,

Is the attention vector of the nth conversation sentence in the ith conversation text;

Is the linear vector; d _s is a coefficient value determined based on the number of endpoints h of the linear vector.

In this embodiment, the terminal device may import the linear vector obtained by the foregoing calculation to the foregoing multi-head attention layer, and the attention layer includes three nodes. First, the terminal device can calculate the product between the linear vector and the transposition of the linear vector, and process the multiplied vector through the softmax function, and finally multiply the linear vector again to achieve triple iteration to improve feature extraction Accuracy.

In S603, a dialogue update vector of the conversation sentence is generated based on the attention vector; the dialogue update vector is specifically:

Wherein, W ₁ , W ₂ , b ₁ and b ₂ are model parameters of the first attention model.

In this embodiment, the terminal device can import the generated attention vector into the feedforward layer of the first shocking network to obtain the dialogue update vector corresponding to the conversation sentence. The feedforward layer can first perform an inverse linear transformation on the attention vector, transform the attention vector containing multiple endpoints to a vector containing a single endpoint, and then perform subsequent operations.

In the embodiment of the application, after adding the emotion judgment method based on the new NLP transformer, the interviewer can quickly judge certain personality characteristics of the candidate through the candidate's answer, and give necessary and reasonable follow-up questions. In the actual AI interview application, because the judgment is more accurate, the response speed of the hardware has also been improved, so it not only saves the hardware space, but also improves the running speed and interview experience. AI intelligent interview can judge the emotion of the candidate based on the answer to the candidate, and then judge the personality of the candidate. The interviewer can analyze the candidate's personality characteristics after the interview is completed, and use it as the basis for selecting candidates.

FIG. 7 shows a specific implementation flowchart of a method S104 for generating a user report provided by the seventh embodiment of the present application. Referring to FIG. 7, relative to any one of the embodiments described in FIG. 1 to FIG. 6, S104 in a method for generating a user report provided in this embodiment includes: S1041-S1043, which are detailed as follows:

Further, the generating a personality analysis report of the target user based on the emotional characteristic values of all voice signals includes:

In S1041, an emotion waveform diagram of the target user is generated according to the emotion characteristic value of each voice signal.

In this embodiment, the terminal device can mark each emotional feature value on a preset coordinate axis according to each conversation text, that is, the generation sequence of the voice signal, and connect each emotional feature value in turn to obtain the target user in the entire conversation process. Corresponding sentiment waveform in.

In S1042, the emotion waveform diagram is matched with the standard personality waveform diagrams of each candidate personality to determine the user personality of the target user.

In this embodiment, the terminal device can calculate the deviation value between the standard waveform diagrams of each candidate personality of the emotion waveform diagram, and calculate the matching degree between the target user and each candidate personality based on the inverse of the deviation value, and select the matching degree The highest candidate personality is taken as the user personality of the target user. Of course, multiple candidate personalities with a matching degree greater than a preset matching threshold can also be selected as the user personality of the target user.

In S1043, the personality analysis report is obtained based on the user's personality.

In this embodiment, the terminal device can obtain the standard language segment corresponding to each user's personality, and generate a personality analysis report based on the above-mentioned standard language segment.

In the embodiment of the present application, by generating the emotional waveform diagram of the target user, and identifying the user personality of the target user from the candidate personality, the personality analysis report is generated, thereby improving the generation efficiency of the personality analysis report.

It should be understood that the size of the sequence number of each step in the foregoing embodiment does not mean the order of execution, and the execution sequence of each process should be determined by its function and internal logic, and should not constitute any limitation to the implementation process of the embodiment of the present application.

FIG. 8 shows a structural block diagram of a device for generating a user report according to an embodiment of the present application, and each unit included in the device for generating a user report is used to execute each step in the embodiment corresponding to FIG. 1. For details, please refer to the relevant description in the embodiment corresponding to FIG. 8 and FIG. 1. For ease of description, only the parts related to this embodiment are shown.

Referring to Figure 8, the device for generating the user report includes:

The conversation text obtaining unit 81 is configured to obtain multiple voice signals generated by the target user during a conversation, and convert each of the voice signals into corresponding conversation text;

The conversation content collection generating unit 82 is configured to perform semantic analysis on the conversation text to obtain the conversation keywords corresponding to the conversation text and the conversation tags corresponding to each of the keywords, and generate a conversation content collection;

The emotion feature value determining unit 83 is configured to obtain the conversation word vector corresponding to each of the conversation keywords in the conversation content set, and determine the emotion feature value corresponding to the voice signal based on each of the conversation word vectors;

The personality analysis report generating unit 84 is configured to generate a personality analysis report of the target user based on the emotional feature values of all voice signals.

Optionally, the emotional feature value determining unit 83 includes:

A weighted weight determining unit, configured to determine the associated entity of each of the session keywords in the preset knowledge graph, and obtain the weighted weight corresponding to each of the associated entities;

A word concept vector generating unit, configured to generate the word concept vector of the conversation keyword according to the weighted weights of all the associated entities;

The sentence concept vector generating unit is used to encapsulate all the word concept vectors belonging to the same conversation sentence based on the conversation sentence to which each of the conversation keywords belongs to generate the sentence concept vector of the conversation sentence; the conversation sentence Is obtained after sentence division of the conversation text;

A dialogue update vector generating unit, configured to respectively import the sentence concept vector of each of the conversational sentences into the first attention algorithm to obtain the dialogue update vector of each of the conversational sentences;

The text concept vector generating unit is used to encapsulate the sentence concept vectors of all conversation sentences of the conversation text, generate the conversation concept vectors of the conversation text, and import the conversation concept vectors into the second attention model, Generating a text concept vector of the conversation text;

The emotional feature value calculation unit is configured to determine the emotional feature value according to the dialogue update vector and the text concept vector.

Optionally, the weighting weight determining unit includes:

An association strength factor determination unit, configured to obtain an association strength factor between each of the associated entities and the session keywords;

The emotion intensity factor determination unit is configured to determine the emotion intensity factor of each associated entity based on a preset emotion measurement algorithm;

The weighted weight calculation unit is configured to construct the weighted weight of the associated entity based on the emotional intensity factor and the associated intensity factor.

Optionally, the correlation strength factor determination unit includes:

An association confidence degree determining unit, configured to determine the association confidence degree between the associated entity and the session keyword based on the knowledge graph;

The conversational text vector determining unit is used to import the conversational sentences associated with the conversational keywords into the preset pooling layer, generate the sentence vectors of the conversational sentences associated with each of the conversational keywords, and determine all the conversational sentences based on the sentence vectors. The conversational text vector of the segment where the conversational keywords are located; the conversational text vector is specifically:

Is the sentence vector of the conversation sentence where the conversation keyword is located, the sentence number of the conversation sentence in the conversation text is j; the M is a preset correlation coefficient;

The correlation strength factor calculation unit is configured to calculate the correlation strength factor based on the conversational text vector and the correlation confidence; the correlation strength factor is specifically:

rel _k =max-min(s _k )*|cos(CR(X ⁱ ),c _k )|

Optionally, the emotion intensity factor determination unit includes:

A time distribution diagram generating unit, configured to generate a request time distribution diagram for the service type according to the request initiation time included in all the service requests;

The emotional attribute recognition unit is used to recognize the emotional attribute of the associated entity;

A non-emotional type processing unit, configured to configure the emotional intensity factor to a preset default value if the emotional attribute of the associated entity is a non-emotional type;

The emotion type processing unit is configured to calculate the emotion intensity factor of the conversation keyword by using a preset emotion conversion algorithm if the emotion attribute of the associated entity is an emotion type; the emotion intensity factor is specifically :

Optionally, the dialog update vector generating unit includes:

The linear vector generating unit is used to linearly change the sentence concept vector of the conversation sentence to obtain a linear vector containing h endpoints; wherein, the h is the preset number of endpoints;

The attention vector generating unit is configured to import the linear vector into the multi-head self-attention layer of the first attention algorithm to obtain the attention vector of the conversation sentence; the attention vector is specifically:

among them,

Is the linear vector; d _s is a coefficient value determined based on the number of endpoints h of the linear vector;

The dialogue update vector determining unit is configured to generate a dialogue update vector of the conversation sentence based on the attention vector; the dialogue update vector is specifically:

Optionally, the personality analysis report generating unit 84 includes:

An emotion waveform diagram generating unit, configured to generate the emotion waveform diagram of the target user according to the emotion characteristic value of each of the voice signals;

A user personality determination unit, configured to match the emotion waveform diagram with the standard personality waveform diagrams of each candidate personality to determine the user personality of the target user;

The personality analysis report output unit is configured to obtain the personality analysis report based on the user's personality.

Therefore, the user report generation device provided by the embodiment of the present application also does not rely on the interviewer or the conversation object to manually fill in or subjectively judge, and does not require the user to spend extra time writing a personality analysis report on the target user, thereby greatly reducing user operations. In addition, the above process can determine the emotional characteristic value through the voice signals at different stages in the conversation process, instead of using a single utterance or sentence to judge the personality, so that the accuracy of the personality analysis report can be improved.

FIG. 9 is a schematic diagram of a terminal device provided by another embodiment of the present application. As shown in FIG. 9, the terminal device 9 of this embodiment includes: a processor 90, a memory 91, and a computer program 92 stored in the memory 91 and running on the processor 90, such as a program for generating user reports . When the processor 90 executes the computer program 92, the steps in the foregoing method for generating user reports are implemented, such as S101 to S104 shown in FIG. 1. Alternatively, when the processor 90 executes the computer program 92, the functions of the units in the foregoing device embodiments, such as the functions of the modules 81 to 84 shown in FIG. 8, are realized.

Exemplarily, the computer program 92 may be divided into one or more units, and the one or more units are stored in the memory 91 and executed by the processor 90 to complete the application. The one or more units may be a series of computer program instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer program 92 in the terminal device 9. For example, the computer program 92 may be divided into a conversation text acquisition unit, a conversation content collection generation unit, an emotional feature value determination unit, and a personality analysis report generation unit, and the specific functions of each unit are as described above.

The terminal device 9 may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud terminal device. The terminal device may include, but is not limited to, a processor 90 and a memory 91. Those skilled in the art can understand that FIG. 9 is only an example of the terminal device 9 and does not constitute a limitation on the terminal device 9. It may include more or less components than shown in the figure, or a combination of certain components, or different components. For example, the terminal device may also include input and output devices, network access devices, buses, and so on.

The so-called processor 90 may be a central processing unit (Central Processing Unit, CPU), other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), Ready-made programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc. The general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.

The memory 91 may be an internal storage unit of the terminal device 9, for example, a hard disk or a memory of the terminal device 9. The memory 91 may also be an external storage device of the terminal device 9, such as a plug-in hard disk equipped on the terminal device 9, a smart memory card (Smart Media Card, SMC), or a Secure Digital (SD). Card, Flash Card, etc. Further, the memory 91 may also include both an internal storage unit of the terminal device 9 and an external storage device. The memory 91 is used to store the computer program and other programs and data required by the terminal device. The memory 91 can also be used to temporarily store data that has been output or will be output.

The embodiments of the present application also provide a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps in each of the foregoing method embodiments can be realized. The computer-readable storage medium may be non-volatile or volatile. The computer-readable storage medium includes: U disk, mobile hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), magnetic disks or optical disks, etc., which can store program codes Medium.

In addition, the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.

The above-mentioned embodiments are only used to illustrate the technical solutions of the present application, not to limit them; although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still implement the foregoing The technical solutions recorded in the examples are modified, or some of the technical features are equivalently replaced; these modifications or replacements do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the application, and should be included in Within the scope of protection of this application.

Claims

A method for generating user reports, including:

Acquiring multiple voice signals generated by the target user during the conversation, and converting each of the voice signals into corresponding conversation text;

Performing semantic analysis on the conversation text, obtaining conversation keywords corresponding to the conversation text and conversation tags corresponding to each of the keywords, and generating a conversation content set;

Obtaining a conversation word vector corresponding to each of the conversation keywords in the conversation content set, and determining an emotional feature value corresponding to the voice signal based on each of the conversation word vectors;

Based on the emotional feature values of all voice signals, a personality analysis report of the target user is generated.
The generating method according to claim 1, wherein said obtaining a conversation word vector corresponding to each of the conversation keywords in the conversation content set, and determining the emotion corresponding to the speech signal based on each of the conversation word vectors Characteristic values, including:

Determine the associated entity of each of the session keywords in the preset knowledge graph, and obtain the weighted weight corresponding to each of the associated entities;

Generating the word concept vector of the conversation keyword according to the weighted weights of all the associated entities;

Based on the conversational sentence to which each of the conversational keywords belongs, encapsulate all word concept vectors belonging to the same conversational sentence to generate the sentence concept vector of the conversational sentence; the conversational sentence is a sentence for the conversational text Obtained after division;

Importing the sentence concept vector of each of the conversational sentences into the first attention algorithm to obtain the dialogue update vector of each of the conversational sentences;

Encapsulate the sentence concept vectors of all the conversation sentences of the conversation text to generate the conversation concept vector of the conversation text, and import the conversation concept vector into the second attention model to generate the text concept of the conversation text vector;

The emotional feature value is determined according to the dialogue update vector and the text concept vector.
The generating method according to claim 2, wherein the determining the associated entity of each of the session keywords in a preset knowledge graph and obtaining the weighted weight corresponding to each of the associated entities comprises:

Acquiring the correlation strength factor between each of the associated entities and the session keywords;

Determine the emotional intensity factor of each associated entity based on a preset emotion measurement algorithm;

Based on the emotion intensity factor and the association intensity factor, a weighted weight of the associated entity is constructed.
The generating method according to claim 3, wherein said obtaining the correlation strength factor between each of said associated entities and said session keywords comprises:

Based on the knowledge graph, determining the association confidence between the associated entity and the session keyword;

Import the conversational sentences associated with the conversation keywords into the preset pooling layer, generate the sentence vectors of the conversational sentences associated with each of the conversation keywords, and determine based on the sentence vectors the segment of the conversation keywords. Conversation text vector; the conversation text vector is specifically:

Wherein, CR(X i ) is the conversation text vector of the conversation keyword, and the conversation text number where the conversation keyword is located is i;
Is the sentence vector of the conversation sentence where the conversation keyword is located, the sentence number of the conversation sentence in the conversation text is j; the M is a preset correlation coefficient;

Based on the conversational text vector and the correlation confidence, the correlation strength factor is calculated; the correlation strength factor is specifically:

rel k =max-min(s k )*|cos(CR(X i ),c k )|

Where rel k is the correlation strength factor of the k-th session keyword; c k is the correlation confidence of the k-th associated entity of the session keyword; max-min(s k ) is the session keyword The emotions corresponding to k associated entities are extremely poor.
The generating method according to claim 3, wherein the determining the emotion intensity factor of each of the associated entities based on a preset emotion measurement algorithm comprises:

Identifying the emotional attributes of the associated entity;

If the emotional attribute of the associated entity is a non-emotional type, configure the emotional intensity factor as a preset default value;

If the emotion attribute of the associated entity is an emotion type, the emotion intensity factor of the conversation keyword is calculated through a preset emotion conversion algorithm; the emotion intensity factor is specifically:

Where aff k is the emotional intensity factor of the k-th associated entity; VAD(c k ) is the positive emotional score of the k-th associated entity; A(c k ) is the k-th associated entity The emotional magnitude score of the associated entity.
The generating method according to claim 2, wherein the respectively importing the sentence concept vector of each of the conversational sentences into a first attention algorithm to obtain the dialogue update vector of each of the conversational sentences comprises:

Linearly change the sentence concept vector of the conversation sentence to obtain a linear vector containing h endpoints; wherein, h is the preset number of endpoints;

The linear vector is imported into the multi-head self-attention layer of the first attention algorithm to obtain the attention vector of the conversation sentence; the attention vector is specifically:

among them,
Is the attention vector of the nth conversation sentence in the ith conversation text;
Is the linear vector; d s is a coefficient value determined based on the number of endpoints h of the linear vector;

The dialogue update vector of the conversation sentence is generated based on the attention vector; the dialogue update vector is specifically:

Wherein, W 1 , W 2 , b 1 and b 2 are model parameters of the first attention model.
The generating method according to any one of claims 1 to 6, wherein the generating the personality analysis report of the target user based on the emotional characteristic values of all voice signals comprises:

Generating the emotional waveform diagram of the target user according to the emotional feature value of each of the voice signals;

Matching the emotion waveform diagram with the standard personality waveform diagrams of each candidate personality to determine the user personality of the target user;

The personality analysis report is obtained based on the user's personality.
A device for generating user reports, including:

The conversation text obtaining unit is used to obtain multiple voice signals generated by the target user in the conversation process, and convert each of the voice signals into corresponding conversation text;

A conversation content collection generating unit, configured to perform semantic analysis on the conversation text to obtain conversation keywords corresponding to the conversation text and conversation tags corresponding to each of the keywords, and generate a conversation content collection;

An emotion feature value determining unit, configured to obtain a conversation word vector corresponding to each of the conversation keywords in the conversation content set, and determine the emotion feature value corresponding to the voice signal based on each of the conversation word vectors;

The personality analysis report generating unit is configured to generate the personality analysis report of the target user based on the emotional characteristic values of all voice signals.
8. The generating device according to claim 8, wherein the emotional feature value determining unit comprises:

A weighted weight determining unit, configured to determine the associated entity of each of the session keywords in the preset knowledge graph, and obtain the weighted weight corresponding to each of the associated entities;

A word concept vector generating unit, configured to generate the word concept vector of the conversation keyword according to the weighted weights of all the associated entities;

The sentence concept vector generating unit is used to encapsulate all the word concept vectors belonging to the same conversation sentence based on the conversation sentence to which each of the conversation keywords belongs to generate the sentence concept vector of the conversation sentence; the conversation sentence Is obtained after sentence division of the conversation text;

A dialogue update vector generating unit, configured to respectively import the sentence concept vector of each of the conversational sentences into the first attention algorithm to obtain the dialogue update vector of each of the conversational sentences;

The text concept vector generating unit is used to encapsulate the sentence concept vectors of all conversation sentences of the conversation text, generate the conversation concept vectors of the conversation text, and import the conversation concept vectors into the second attention model, Generating a text concept vector of the conversation text;

The emotional feature value calculation unit is configured to determine the emotional feature value according to the dialogue update vector and the text concept vector.
The generating device according to claim 8, wherein the weighting weight determining unit comprises:

An association strength factor determination unit, configured to obtain an association strength factor between each of the associated entities and the session keywords;

The emotion intensity factor determination unit is configured to determine the emotion intensity factor of each associated entity based on a preset emotion measurement algorithm;

The weighted weight calculation unit is configured to construct the weighted weight of the associated entity based on the emotional intensity factor and the associated intensity factor.
A terminal device, wherein the terminal device includes a memory, a processor, and computer-readable instructions that are stored in the memory and can run on the processor, and when the processor executes the computer-readable instructions Perform the following steps:

Acquiring multiple voice signals generated by the target user during the conversation, and converting each of the voice signals into corresponding conversation text;

Performing semantic analysis on the conversation text, obtaining conversation keywords corresponding to the conversation text and conversation tags corresponding to each of the keywords, and generating a conversation content set;

Obtaining a conversation word vector corresponding to each of the conversation keywords in the conversation content set, and determining an emotional feature value corresponding to the voice signal based on each of the conversation word vectors;

Based on the emotional feature values of all voice signals, a personality analysis report of the target user is generated.
The terminal device according to claim 11, wherein said obtaining a conversation word vector corresponding to each of the conversation keywords in the conversation content set, and determining the emotion corresponding to the voice signal based on each of the conversation word vectors Characteristic values, including:

Determine the associated entity of each of the session keywords in the preset knowledge graph, and obtain the weighted weight corresponding to each of the associated entities;

Generating the word concept vector of the conversation keyword according to the weighted weights of all the associated entities;

Based on the conversational sentence to which each of the conversational keywords belongs, encapsulate all word concept vectors belonging to the same conversational sentence to generate the sentence concept vector of the conversational sentence; the conversational sentence is a sentence for the conversational text Obtained after division;

Importing the sentence concept vector of each of the conversational sentences into the first attention algorithm to obtain the dialogue update vector of each of the conversational sentences;

Encapsulate the sentence concept vectors of all the conversation sentences of the conversation text, generate the conversation concept vectors of the conversation text, and import the conversation concept vectors into the second attention model to generate the text concept of the conversation text vector;

The emotional feature value is determined according to the dialogue update vector and the text concept vector.
The terminal device according to claim 12, wherein the determining the associated entity of each of the session keywords in a preset knowledge graph and obtaining the weighted weight corresponding to each of the associated entities comprises:

Acquiring the correlation strength factor between each of the associated entities and the session keywords;

Determine the emotional intensity factor of each associated entity based on a preset emotion measurement algorithm;

Based on the emotion intensity factor and the association intensity factor, a weighted weight of the associated entity is constructed.
The terminal device according to claim 13, wherein said obtaining the correlation strength factor between each of said associated entities and said session keywords comprises:

Based on the knowledge graph, determining the association confidence between the associated entity and the session keyword;

Import the conversational sentences associated with the conversational keywords into the preset pooling layer, generate sentence vectors of conversational sentences associated with each of the conversational keywords, and determine the segment of the conversational keywords based on the sentence vectors. Conversation text vector; the conversation text vector is specifically:

Wherein, CR(X i ) is the conversation text vector of the conversation keyword, and the conversation text number where the conversation keyword is located is i;
Is the sentence vector of the conversation sentence where the conversation keyword is located, the sentence number of the conversation sentence in the conversation text is j; the M is a preset correlation coefficient;

The correlation strength factor is calculated based on the conversation text vector and the correlation confidence; the correlation strength factor is specifically:

rel k =max-min(s k )*|cos(CR(X i ),c k )|

Where rel k is the correlation strength factor of the k-th session keyword; c k is the correlation confidence of the k-th associated entity of the session keyword; max-min(s k ) is the session keyword The emotions corresponding to k associated entities are extremely poor.
The terminal device according to claim 13, wherein the determining the emotion intensity factor of each of the associated entities based on a preset emotion measurement algorithm comprises:

Identifying the emotional attributes of the associated entity;

If the emotional attribute of the associated entity is a non-emotional type, configure the emotional intensity factor as a preset default value;

If the emotion attribute of the associated entity is an emotion type, the emotion intensity factor of the conversation keyword is calculated through a preset emotion conversion algorithm; the emotion intensity factor is specifically:

Where aff k is the emotional intensity factor of the k-th associated entity; VAD(c k ) is the positive emotional score of the k-th associated entity; A(c k ) is the k-th associated entity The emotional magnitude score of the associated entity.
A computer-readable storage medium, the computer-readable storage medium stores computer-readable instructions, wherein, when the computer-readable instructions are executed by a processor, the following steps are implemented:

Acquiring multiple voice signals generated by the target user during the conversation, and converting each of the voice signals into corresponding conversation text;

Performing semantic analysis on the conversation text, obtaining conversation keywords corresponding to the conversation text and conversation tags corresponding to each of the keywords, and generating a conversation content set;

Obtaining a conversation word vector corresponding to each of the conversation keywords in the conversation content set, and determining an emotional feature value corresponding to the voice signal based on each of the conversation word vectors;

Based on the emotional feature values of all voice signals, a personality analysis report of the target user is generated.
The computer-readable storage medium according to claim 16, wherein said obtaining a conversation word vector corresponding to each of the conversation keywords in the conversation content set, and determining the voice signal based on each of the conversation word vectors Corresponding emotional feature values include:

Determine the associated entity of each of the session keywords in the preset knowledge graph, and obtain the weighted weight corresponding to each of the associated entities;

Generating the word concept vector of the conversation keyword according to the weighted weights of all the associated entities;

Based on the conversational sentence to which each of the conversational keywords belongs, encapsulate all word concept vectors belonging to the same conversational sentence to generate the sentence concept vector of the conversational sentence; the conversational sentence is a sentence for the conversational text Obtained after division;

Importing the sentence concept vector of each of the conversational sentences into the first attention algorithm to obtain the dialogue update vector of each of the conversational sentences;

Encapsulate the sentence concept vectors of all the conversation sentences of the conversation text, generate the conversation concept vectors of the conversation text, and import the conversation concept vectors into the second attention model to generate the text concept of the conversation text vector;

The emotional feature value is determined according to the dialogue update vector and the text concept vector.
18. The computer-readable storage medium according to claim 17, wherein said determining the associated entity of each of the session keywords in a preset knowledge graph and obtaining the weighted weight corresponding to each of the associated entities comprises:

Acquiring the correlation strength factor between each of the associated entities and the session keywords;

Determine the emotional intensity factor of each associated entity based on a preset emotion measurement algorithm;

Based on the emotion intensity factor and the association intensity factor, a weighted weight of the associated entity is constructed.
18. The computer-readable storage medium according to claim 18, wherein said obtaining the correlation strength factor between each of said associated entities and said session keywords comprises:

Based on the knowledge graph, determining the association confidence between the associated entity and the session keyword;

Import the conversational sentences associated with the conversational keywords into the preset pooling layer, generate sentence vectors of conversational sentences associated with each of the conversational keywords, and determine the segment of the conversational keywords based on the sentence vectors. Conversation text vector; the conversation text vector is specifically:

Wherein, CR(X i ) is the conversation text vector of the conversation keyword, and the conversation text number where the conversation keyword is located is i;
Is the sentence vector of the conversation sentence where the conversation keyword is located, the sentence number of the conversation sentence in the conversation text is j; the M is a preset correlation coefficient;

The correlation strength factor is calculated based on the conversation text vector and the correlation confidence; the correlation strength factor is specifically:

rel k =max-min(s k )*|cos(CR(X i ),c k )|

Where rel k is the correlation strength factor of the k-th session keyword; c k is the correlation confidence of the k-th associated entity of the session keyword; max-min(s k ) is the session keyword The emotions corresponding to k associated entities are extremely poor.
18. The computer-readable storage medium of claim 18, wherein the determining the emotional intensity factor of each of the associated entities based on a preset emotion measurement algorithm comprises:

Identifying the emotional attributes of the associated entity;

If the emotional attribute of the associated entity is a non-emotional type, configure the emotional intensity factor as a preset default value;

If the emotion attribute of the associated entity is an emotion type, the emotion intensity factor of the conversation keyword is calculated through a preset emotion conversion algorithm; the emotion intensity factor is specifically:

Where aff k is the emotional intensity factor of the k-th associated entity; VAD(c k ) is the positive emotional score of the k-th associated entity; A(c k ) is the k-th associated entity The emotional magnitude score of the associated entity.