CN109087670B - Emotion analysis method, system, server and storage medium - Google Patents

Emotion analysis method, system, server and storage medium

Info

Publication number
CN109087670B
CN109087670B
Authority
CN
China
Prior art keywords
target
emotion
data
feature points
intonation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811005214.2A
Other languages
Chinese (zh)
Other versions
CN109087670A (en)
Inventor
申王萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Wingtech Electronic Technology Co Ltd
Original Assignee
Xian Wingtech Electronic Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Wingtech Electronic Technology Co Ltd filed Critical Xian Wingtech Electronic Technology Co Ltd
Priority to CN201811005214.2A
Publication of CN109087670A
Application granted
Publication of CN109087670B

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/63 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/02 - Feature extraction for speech recognition; Selection of recognition unit

Abstract

The embodiments of the invention disclose an emotion analysis method, an emotion analysis system, a server and a storage medium. The method comprises the following steps: identifying language keywords and a target intonation included in acquired voice data; analyzing and determining voice feature points according to the acquired language keywords and the target intonation; generating a target emotion model based on the voice feature points, and calibrating target voice feature points on the target emotion model; matching the target emotion model with standard emotion models in a standard emotion model library so as to adjust the calibrated target voice feature points on the target emotion model, and recording the change data of the target voice feature points; and matching the change data with the intonation characteristic data and the psychological behavior characteristic data in a database, and outputting the emotion or emotion change data of the user according to the matching result. In this way, the emotion changes of the user can be analyzed objectively and accurately, helping the user manage emotions.

Description

Emotion analysis method, system, server and storage medium
Technical Field
The invention relates to the technical field of data analysis, in particular to an emotion analysis method, system, server and storage medium.
Background
The ability to express emotion is extremely important to an individual: if emotions are not released at the proper time, psychological and physiological changes can accumulate over time. With today's rapid economic development, the care given to special groups receives less and less attention. There is therefore an urgent need for an intelligent method that helps special groups with emotional problems analyze their emotions objectively and manage them.
At present, the most common approach is affective computing: a computer system is given the ability to recognize, understand, express and adapt to human emotions so as to establish a harmonious human-machine environment and help users analyze and manage their emotions objectively. Common ways of analyzing a user's emotion include speech recognition and face recognition. Analyzing emotion only by recognizing sensitive words in speech through speech recognition yields low accuracy. With face recognition, the user's emotion is analyzed by collecting and analyzing facial expression features. However, facial expressions usually last a very short time and may vanish in an instant, which makes them inconvenient to collect, and a facial expression sometimes conveys the opposite emotion, so analysis based on face recognition is not always accurate either. It is therefore difficult to analyze a user's emotional changes objectively and accurately and thereby help the user manage emotions.
Disclosure of Invention
The embodiment of the invention provides an emotion analysis method, an emotion analysis system, a server and a storage medium, which aim to objectively and accurately analyze emotion changes of a user and help the user to manage emotion.
In a first aspect, an embodiment of the present invention provides an emotion analysis method, including:
identifying language keywords and target tones included in the acquired voice data;
analyzing and determining voice feature points according to the acquired language keywords and the target intonation, wherein the voice feature points are keywords and intonations which mark user emotion in the language keywords and the target intonation;
generating a target emotion model based on the voice feature points, and calibrating the target voice feature points on the target emotion model;
matching the target emotion model with standard emotion models in a standard emotion model library to adjust the calibrated target voice feature points on the target emotion model and recording the change data of the target voice feature points;
and matching the change data with the intonation characteristic data and the psychological behavior characteristic data in a database, and outputting the emotion or emotion change data of the user according to the matching result.
In a second aspect, an embodiment of the present invention provides an emotion analysis system, including:
the recognition module is used for recognizing the language keywords and the target intonation included in the acquired voice data;
the feature point analysis module is used for analyzing and determining voice feature points according to the acquired language key words and the target intonation, wherein the voice feature points are the key words and the intonations which mark the emotion of the user in the language key words and the target intonation;
the modeling module is used for generating a target emotion model based on the analyzed and determined voice feature points and calibrating the target voice feature points on the target emotion model;
the adjustment recording module is used for matching the target emotion model with a standard emotion model in a standard emotion model library so as to adjust the calibrated target voice feature points on the target emotion model and record the change data of the target voice feature points;
and the emotion analysis module is used for matching the change data with the intonation characteristic data and the psychological behavior characteristic data in the database and outputting the emotion or emotion change data of the user according to the matching result.
In a third aspect, an embodiment of the present invention further provides a server, including:
one or more processors;
a memory for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement a method of emotion analysis as in any of the embodiments of the invention.
In a fourth aspect, the embodiments of the present invention further provide a computer-readable storage medium on which a computer program is stored, and the program, when executed by a processor, implements the emotion analysis method according to any of the embodiments of the invention.
According to the emotion analysis method, system, server and storage medium provided by the embodiments of the invention, a target emotion model is established from the extracted voice feature points, and target voice feature points are calibrated on the target emotion model so as to screen out the voice feature points that accurately reflect the user's emotion. The target emotion model is then matched with the standard emotion models in a standard emotion model library, and the calibrated target voice feature points are adjusted so that the target emotion model approaches the standard emotion model. Finally, the change data of the target voice feature points are matched with the intonation characteristic data and the psychological behavior characteristic data in the database, so that the user's emotional changes are analyzed quickly and accurately.
Drawings
Fig. 1 is a schematic flow chart of an emotion analysis method provided in the first embodiment of the present invention;
Fig. 2 is a schematic flow chart of an emotion analysis method provided in the second embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an emotion analysis system provided in the third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a server provided in the fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1 is a flowchart of an emotion analysis method provided in the first embodiment of the present invention. The method is applicable to objectively analyzing and managing emotions for special groups with emotional problems, and may be executed by an emotion analysis system, which may, for example, be configured in a server. The method specifically comprises the following steps:
and S110, identifying the language key words and the target intonation included in the acquired voice data.
Voice data of the user may be collected according to a preset rule. Illustratively, voice data within a certain period, or within different periods of one day, is collected according to a preset time rule; the collection may specifically be performed by corresponding voice acquisition software.
For the collected voice data, the semantic content and the intonation included in the voice data can be recognized through speech recognition technology, where the intonation covers the volume, speech rate, pitch and the like of the voice data. Illustratively, the semantic content, i.e. the textual content, of the voice data may be recognized through acoustic model and language model analysis.
Language keywords and the target intonation are then extracted from the recognized semantic content and intonation according to the speech recognition result, where the target intonation includes at least one of the volume, speech rate and pitch of the voice data and their respective change trends. Illustratively, a word-segmentation lexicon can be used to remove meaningless words from the semantic content while extracting the language keywords that can indicate the user's emotion; for the recognized intonation, the parts meeting preset conditions are selected as the target intonation, for example intonation whose volume exceeds a maximum preset threshold or falls below a minimum preset threshold, or whose speech rate exceeds a certain preset threshold.
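Illustratively, this extraction step could be sketched as follows, assuming the voice data have already been transcribed into word tokens and per-frame intonation measurements; the stop-word list, the IntonationFrame fields and the threshold values are assumptions made for illustration and are not taken from the disclosure:

```python
# Illustrative sketch only: the stop-word list, field names and thresholds are assumptions.
from dataclasses import dataclass

@dataclass
class IntonationFrame:
    time_s: float       # position of the frame within the utterance
    volume_db: float    # loudness of the frame
    speech_rate: float  # e.g. syllables per second
    pitch_hz: float     # fundamental frequency

STOP_WORDS = {"the", "a", "of", "um", "uh"}  # stand-in for the word-segmentation lexicon

def extract_keywords(tokens):
    """Remove meaningless words and keep candidate language keywords."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]

def extract_target_intonation(frames, vol_max=75.0, vol_min=40.0, rate_max=6.0):
    """Keep frames whose volume or speech rate violates the preset thresholds."""
    return [f for f in frames
            if f.volume_db > vol_max or f.volume_db < vol_min or f.speech_rate > rate_max]

if __name__ == "__main__":
    print(extract_keywords(["I", "am", "um", "really", "angry"]))
    frames = [IntonationFrame(0.5, 80.0, 6.5, 220.0), IntonationFrame(1.0, 60.0, 4.0, 180.0)]
    print(extract_target_intonation(frames))
```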
And S120, analyzing and determining voice feature points according to the acquired language keywords and the target intonation, wherein the voice feature points are the keywords and the intonation which mark the emotion of the user in the language keywords and the target intonation.
The acquired language keywords and target intonation are further analyzed and screened, and the keywords and intonation that clearly indicate the user's emotion are determined as voice feature points; the voice feature points thus include keyword feature points and intonation feature points. Illustratively, the language keywords can be screened against a pre-established emotion-sensitive lexicon, and the keywords that pass the screening are taken as keyword feature points, where the emotion-sensitive lexicon contains words the user frequently utters under various emotions. Since the target intonation is usually displayed as a waveform, points with a clearly pronounced change trend, for example a point at which the speech rate suddenly increases, can be taken as intonation feature points.
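Illustratively, the screening of keyword feature points and intonation feature points could take the following form; the contents of the emotion-sensitive lexicon and the speech-rate jump threshold are assumptions made for illustration:

```python
# Sketch under stated assumptions: the lexicon contents and the jump threshold are illustrative only.
EMOTION_SENSITIVE_WORDS = {"angry", "hate", "afraid", "wonderful", "tired"}

def keyword_feature_points(keywords):
    """Keywords that appear in the pre-established emotion-sensitive lexicon."""
    return [w for w in keywords if w.lower() in EMOTION_SENSITIVE_WORDS]

def intonation_feature_points(speech_rates, jump_threshold=1.5):
    """Indices where the speech rate suddenly increases between adjacent frames."""
    return [i for i in range(1, len(speech_rates))
            if speech_rates[i] - speech_rates[i - 1] > jump_threshold]

print(keyword_feature_points(["really", "angry", "today"]))  # -> ['angry']
print(intonation_feature_points([3.8, 4.0, 6.2, 6.1]))       # -> [2]
```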
And S130, generating a target emotion model based on the voice feature points, and calibrating the target voice feature points on the target emotion model.
A target emotion model is generated from the determined voice feature points so that the user's emotion can be analyzed from the model. Target voice feature points are then calibrated on the target emotion model; these can be the subset of the voice feature points determined in step S120 whose features are most prominent, which further screens the user's emotional features and makes them more distinct.
And S140, matching the target emotion model with a standard emotion model in a standard emotion model library to adjust the calibrated target voice feature points on the target emotion model and record the change data of the target voice feature points.
The target emotion model is matched with the standard emotion models in the standard emotion model library, that is, the voice feature points calibrated on the target emotion model are matched with the feature points on a standard emotion model, including matching of keywords and of intonation. The calibrated target voice feature points on the target emotion model are then finely adjusted according to the matching result and the data trend of each target voice feature point, and the change data of the target voice feature points are recorded. Illustratively, if a target voice feature point is adjusted from position A to position B, the change of each of its features between A and B is recorded, such as the change of speech rate, pitch and volume.
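Illustratively, the matching and adjustment could be sketched as below; representing a feature point as a (speech rate, pitch, volume) triple and adjusting it toward the nearest standard point are assumptions made for illustration, since the disclosure does not fix the model representation or the matching rule:

```python
# Hedged sketch: the triple representation and the nearest-point adjustment are assumptions.
def match_and_record(target_points, standard_points):
    """Move each calibrated target point toward the closest standard point and
    record the per-feature change data (speech rate, pitch, volume)."""
    changes = []
    for name, (rate, pitch, vol) in target_points.items():
        nearest = min(standard_points.values(),
                      key=lambda s: (s[0] - rate) ** 2 + (s[1] - pitch) ** 2 + (s[2] - vol) ** 2)
        changes.append({"point": name,
                        "rate_change": nearest[0] - rate,
                        "pitch_change": nearest[1] - pitch,
                        "volume_change": nearest[2] - vol})
    return changes

target = {"p1": (6.5, 240.0, 78.0)}                                        # calibrated target point
standard = {"calm": (4.0, 180.0, 60.0), "agitated": (6.0, 230.0, 75.0)}    # standard model points
print(match_and_record(target, standard))
```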
S150, matching the change data with the intonation characteristic data and the psychological behavior characteristic data in the database, and outputting the emotion or emotion change data of the user according to the matching result.
The emotion or emotion change data of the user are output according to the result of matching the change data of the target voice feature points with the intonation characteristic data and the psychological behavior characteristic data in the database. Illustratively, if the speech rate is fast, the volume is high and the pitch varies widely, the user's emotion is considered agitated; if the speech rate is slow, the volume is moderate and the pitch is gentle, the user's emotion is considered stable.
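Illustratively, the rule given above could be written as a small decision function; the numeric cut-offs are assumptions made for illustration rather than values specified by the disclosure:

```python
# Minimal rule sketch mirroring the example in the text; the cut-offs are assumptions.
def classify_emotion(speech_rate, volume_db, pitch_range_hz):
    if speech_rate > 6.0 and volume_db > 75.0 and pitch_range_hz > 80.0:
        return "agitated"   # fast speech, high volume, wide pitch variation
    if speech_rate < 4.5 and 50.0 <= volume_db <= 70.0 and pitch_range_hz < 40.0:
        return "stable"     # slow speech, moderate volume, gentle pitch
    return "neutral"        # fallback when no rule matches

print(classify_emotion(6.8, 80.0, 95.0))  # -> agitated
print(classify_emotion(4.0, 60.0, 30.0))  # -> stable
```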
In this embodiment, a target emotion model is established from the extracted voice feature points, and target voice feature points are calibrated on the model to screen out the voice feature points that accurately reflect the user's emotion. The target emotion model is then matched with the models in a standard emotion model library, and the calibrated target voice feature points are adjusted so that the target emotion model approaches the standard model. Finally, the change data of the target voice feature points are matched with the intonation characteristic data and the psychological behavior characteristic data in the database, so that the user's emotional changes are analyzed quickly and accurately.
Example two
Fig. 2 is a schematic flow chart of an emotion analysis method provided in the second embodiment of the present invention. This embodiment is an optimization of the embodiment above, and the method includes:
s210, identifying the language keywords and the target intonation included in the acquired voice data.
S220, analyzing and determining voice feature points according to the obtained language keywords and the target intonation, wherein the voice feature points are the keywords and the intonation which mark the emotion of the user in the language keywords and the target intonation.
S230, carrying out convolution operation on the voice feature points to obtain a convolution operation result, generating a target emotion model based on the convolution operation result, and calibrating extreme points in the convolution operation result as target voice feature points.
In this embodiment, a convolution operation is performed on the voice feature points obtained in S220, and a target emotion model is generated from the data trend of each feature point in the convolution result; the generated target emotion model may be a waveform diagram. Further, in order to screen out the feature points that clearly indicate the user's emotion, the voice feature points corresponding to the extreme points of the model curve, such as the peak and trough positions, are calibrated as target voice feature points.
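Illustratively, this step could be sketched as below; the smoothing kernel and the local-extremum test are assumptions made for illustration, since the disclosure does not specify the convolution kernel or the exact form of the model:

```python
# Sketch only: the kernel and the extremum test are assumptions.
import numpy as np

def build_target_model(feature_values, kernel=(0.25, 0.5, 0.25)):
    """Convolve the voice-feature sequence to obtain the model curve, then mark
    local maxima and minima (peaks and troughs) as target voice feature points."""
    curve = np.convolve(feature_values, kernel, mode="same")
    extrema = [i for i in range(1, len(curve) - 1)
               if (curve[i] > curve[i - 1] and curve[i] > curve[i + 1])
               or (curve[i] < curve[i - 1] and curve[i] < curve[i + 1])]
    return curve, extrema

curve, target_idx = build_target_model([3.9, 4.2, 6.5, 5.0, 3.8, 4.1])
print(target_idx)  # indices calibrated as target voice feature points
```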
S240, matching the target emotion model with a standard emotion model in a standard emotion model library to adjust the calibrated target voice feature points on the target emotion model and recording the change data of the target voice feature points.
And S250, matching the change data with the tone characteristic data and the psychological behavior characteristic data in the database, and outputting the emotion or emotion change data of the user according to the matching result.
Further, in this embodiment, in order to improve the accuracy of the user emotion analysis, the intonation characteristic data and the psychological behavior characteristic data in the database are updated through big data analysis and calculation. Illustratively, internet data are collected periodically, the collected data are analyzed, and the intonation characteristic data and the psychological behavior characteristic data in the database are continuously updated, where the database is a local database or an internet database.
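Illustratively, the update step could be sketched as below; the sample fields and the merge-by-label logic are assumptions made for illustration, and the periodic collection of internet data is only indicated by the example input:

```python
# Hypothetical sketch of the database update; the data source and the
# aggregation by emotion label are assumptions, not the patent's procedure.
def update_feature_database(db, new_samples):
    """Merge newly analysed intonation / psychological-behaviour samples into
    the database, keyed by emotion label."""
    for s in new_samples:
        db.setdefault(s["label"], []).append((s["speech_rate"], s["volume_db"]))
    return db

# samples that a periodic internet-data collection job might produce (assumed)
collected = [{"speech_rate": 6.1, "volume_db": 77.0, "label": "agitated"}]
print(update_feature_database({}, collected))
```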
In this embodiment, the target emotion model is established by performing a convolution operation on the acquired voice feature points, and the target voice feature points are calibrated at the extreme values of the result, which screens out the feature points that clearly indicate the user's emotion and improves the accuracy of the emotion analysis. In addition, the intonation characteristic data and the psychological behavior characteristic data in the database are updated periodically, making the emotion analysis more accurate and helping the user manage emotions.
Example three
Fig. 3 is a schematic structural diagram of an emotion analyzing system provided in the third embodiment of the present invention, and as shown in fig. 3, the system includes:
the recognition module 310 is configured to recognize a language keyword and a target intonation included in the acquired voice data;
the feature point analysis module 320 is configured to analyze and determine a voice feature point according to the obtained language keyword and the target intonation, where the voice feature point is a keyword and an intonation that identify a user emotion in the language keyword and the target intonation;
the modeling module 330 is configured to generate a target emotion model based on the analyzed and determined voice feature points, and calibrate the target voice feature points on the target emotion model;
the adjustment recording module 340 is configured to match the target emotion model with a standard emotion model in a standard emotion model library, so as to adjust a calibrated target voice feature point on the target emotion model and record change data of the target voice feature point;
and the emotion analysis module 350 is configured to match the change data with the intonation feature data and the psychological behavior feature data in the database, and output emotion or emotion change data of the user according to a matching result.
In this embodiment, the feature point analysis module determines voice feature points according to the recognition result of the recognition module, the modeling module establishes a target emotion model from the voice feature points, the adjustment recording module finely adjusts the target emotion model and records the changes of the target voice feature points, and the emotion analysis module analyzes the user's emotion and emotional changes from the change data of the target voice feature points. In this way, the user's emotional changes can be analyzed objectively and accurately, helping the user manage emotions.
On the basis of the above embodiment, the identification module includes:
the acquisition unit is used for acquiring voice data of a user according to a preset rule;
the recognition unit is used for recognizing semantic content and intonation included in the voice data;
and the extraction unit is used for extracting the language key words and the target intonation from the identified semantic content and the intonation, wherein the target intonation comprises the volume, the speed and the tone of the voice data and the respective change trend.
On the basis of the above embodiment, the modeling module includes:
and the modeling unit is used for performing convolution operation on the voice feature points to obtain a convolution operation result and generating a target emotion model based on the convolution operation result.
On the basis of the above embodiment, the modeling module includes:
and the calibration unit is used for calibrating the extreme value points in the convolution operation result as target voice characteristic points.
On the basis of the above embodiment, the system further includes:
and the big data processing and updating module is used for updating the intonation characteristic data and the psychological behavior characteristic data in the database through big data analysis and calculation, wherein the database is a local database or an Internet database.
The emotion analysis system provided by the embodiment of the invention can execute the emotion analysis method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Example four
Fig. 4 is a schematic structural diagram of a server provided in the fourth embodiment of the present invention, showing a block diagram of an exemplary server 12 suitable for implementing embodiments of the invention. The server 12 shown in Fig. 4 is only an example and should not impose any limitation on the function or scope of use of the embodiments of the invention.
As shown in FIG. 4, the server 12 is in the form of a general purpose computing device. The components of the server 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The server 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by server 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. The server 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 4, and commonly referred to as a "hard drive"). Although not shown in FIG. 4, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
The server 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with the server 12, and/or with any devices (e.g., network card, modem, etc.) that enable the server 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, the server 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet) via the network adapter 20. As shown, the network adapter 20 communicates with the other modules of the server 12 via the bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the server 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example implementing the emotion analysis method provided by the embodiments of the present invention, which includes:
identifying language keywords and target tones included in the acquired voice data;
analyzing and determining voice feature points according to the acquired language keywords and the target intonation, wherein the voice feature points are keywords and intonations which mark user emotion in the language keywords and the target intonation;
generating a target emotion model based on the voice feature points, and calibrating the target voice feature points on the target emotion model;
matching the target emotion model with standard emotion models in a standard emotion model library to adjust the calibrated target voice feature points on the target emotion model and recording the change data of the target voice feature points;
and matching the change data with the intonation characteristic data and the psychological behavior characteristic data in the database, and outputting the emotion or emotion change data of the user according to the matching result.
In one embodiment, processing unit 16 implements a method of emotion analysis by executing a program stored in system memory 28, further comprising:
acquiring voice data of a user according to a preset rule;
recognizing semantic content and intonation included in the voice data;
extracting language keywords and target intonations from the identified semantic content and intonations, wherein the target intonation comprises at least one of volume, speed, tone and respective trend of change of the voice data.
In one embodiment, processing unit 16 implements a method of emotion analysis by executing a program stored in system memory 28, further comprising:
and carrying out convolution operation on the voice feature points to obtain a convolution operation result, and generating a target emotion model based on the convolution operation result.
In one embodiment, processing unit 16 implements a method of emotion analysis by executing a program stored in system memory 28, further comprising:
and marking the extreme value points in the convolution operation result as target voice characteristic points.
In one embodiment, processing unit 16 implements a method of emotion analysis by executing a program stored in system memory 28, further comprising:
and updating the intonation characteristic data and the psychological behavior characteristic data in the database through big data analysis and calculation, wherein the database is a local database or an Internet database.
Example five
An embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the emotion analysis method provided in the embodiment of the present invention, and the method includes:
identifying language keywords and target tones included in the acquired voice data;
analyzing and determining voice feature points according to the acquired language keywords and the target intonation, wherein the voice feature points are keywords and intonations which mark user emotion in the language keywords and the target intonation;
generating a target emotion model based on the voice feature points, and calibrating the target voice feature points on the target emotion model;
matching the target emotion model with standard emotion models in a standard emotion model library to adjust the calibrated target voice feature points on the target emotion model and recording the change data of the target voice feature points;
and matching the change data with the intonation characteristic data and the psychological behavior characteristic data in the database, and outputting the emotion or emotion change data of the user according to the matching result.
In one embodiment, the program when executed by the processor may further implement:
acquiring voice data of a user according to a preset rule;
recognizing semantic content and intonation included in the voice data;
extracting language keywords and target intonations from the identified semantic content and intonations, wherein the target intonation comprises at least one of volume, speed, tone and respective trend of change of the voice data.
In one embodiment, the program when executed by the processor may further implement:
and carrying out convolution operation on the voice feature points to obtain a convolution operation result, and generating a target emotion model based on the convolution operation result.
In one embodiment, the program when executed by the processor may further implement:
and marking the extreme value points in the convolution operation result as target voice characteristic points.
In one embodiment, the program when executed by the processor may further implement:
and updating the intonation characteristic data and the psychological behavior characteristic data in the database through big data analysis and calculation, wherein the database is a local database or an Internet database.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. An emotion analysis method, the method comprising:
identifying language keywords and target tones included in the acquired voice data;
analyzing and determining voice feature points according to the acquired language keywords and the target intonation, wherein the voice feature points are keywords and intonations which mark user emotion in the language keywords and the target intonation;
generating a target emotion model based on the voice feature points, and calibrating the target voice feature points on the target emotion model;
matching the target emotion model with standard emotion models in a standard emotion model library to adjust the calibrated target voice feature points on the target emotion model and recording the change data of the target voice feature points;
and matching the change data with the intonation characteristic data and the psychological behavior characteristic data in the database, and outputting the emotion or emotion change data of the user according to the matching result.
2. The method according to claim 1, wherein the recognizing the language keyword and the target intonation included in the acquired voice data comprises:
acquiring voice data of a user according to a preset rule;
recognizing semantic content and intonation included in the voice data;
extracting language keywords and target intonations from the identified semantic content and intonations, wherein the target intonation comprises at least one of volume, speed, tone and respective trend of change of the voice data.
3. The method of claim 1, wherein generating a target emotion model based on the speech feature points comprises:
and carrying out convolution operation on the voice feature points to obtain a convolution operation result, and generating a target emotion model based on the convolution operation result.
4. The method of claim 3, wherein the step of calibrating the target speech feature points on the target emotion model comprises:
and marking the extreme value points in the convolution operation result as target voice characteristic points.
5. The method of claim 1, further comprising:
and updating the intonation characteristic data and the psychological behavior characteristic data in the database through big data analysis and calculation, wherein the database is a local database or an Internet database.
6. An emotion analysis system, characterized in that the system comprises:
the recognition module is used for recognizing the language keywords and the target intonation included in the acquired voice data;
the feature point analysis module is used for analyzing and determining voice feature points according to the acquired language key words and the target intonation, wherein the voice feature points are the key words and the intonations which mark the emotion of the user in the language key words and the target intonation;
the modeling module is used for generating a target emotion model based on the analyzed and determined voice feature points and calibrating the target voice feature points on the target emotion model;
the adjustment recording module is used for matching the target emotion model with a standard emotion model in a standard emotion model library so as to adjust the calibrated target voice feature points on the target emotion model and record the change data of the target voice feature points;
and the emotion analysis module is used for matching the change data with the intonation characteristic data and the psychological behavior characteristic data in the database and outputting the emotion or emotion change data of the user according to the matching result.
7. The system of claim 6, wherein the identification module comprises:
the acquisition unit is used for acquiring voice data of a user according to a preset rule;
the recognition unit is used for recognizing semantic content and intonation included in the voice data;
and the extraction unit is used for extracting the language key words and the target intonation from the identified semantic content and the intonation, wherein the target intonation comprises at least one of the volume, the speed, the tone and the change trend of the voice data.
8. The system of claim 6, further comprising:
and the big data processing and updating module is used for updating the intonation characteristic data and the psychological behavior characteristic data in the database through big data analysis and calculation, wherein the database is a local database or an Internet database.
9. A server, characterized in that the server comprises:
one or more processors;
a memory for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the emotion analysis method as recited in any of claims 1-5.
10. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, is adapted to carry out the emotion analyzing method as set forth in any one of claims 1 to 5.
CN201811005214.2A 2018-08-30 2018-08-30 Emotion analysis method, system, server and storage medium Active CN109087670B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811005214.2A CN109087670B (en) 2018-08-30 2018-08-30 Emotion analysis method, system, server and storage medium

Publications (2)

Publication Number Publication Date
CN109087670A CN109087670A (en) 2018-12-25
CN109087670B (en) 2021-04-20

Family

ID=64840278

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811005214.2A Active CN109087670B (en) 2018-08-30 2018-08-30 Emotion analysis method, system, server and storage medium

Country Status (1)

Country Link
CN (1) CN109087670B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110110135A (en) * 2019-04-17 2019-08-09 西安极蜂天下信息科技有限公司 Voice characteristics data library update method and device
CN112037820B (en) * 2019-05-16 2023-09-05 杭州海康威视数字技术股份有限公司 Security alarm method, device, system and equipment
CN112037821A (en) * 2019-06-03 2020-12-04 阿里巴巴集团控股有限公司 Visual representation method and device of voice emotion and computer storage medium
CN111354377B (en) * 2019-06-27 2022-11-18 深圳市鸿合创新信息技术有限责任公司 Method and device for recognizing emotion through voice and electronic equipment
CN110399837B (en) * 2019-07-25 2024-01-05 深圳智慧林网络科技有限公司 User emotion recognition method, device and computer readable storage medium
CN110534135A (en) * 2019-10-18 2019-12-03 四川大学华西医院 A method of emotional characteristics are assessed with heart rate response based on language guidance
CN112462622A (en) * 2020-04-02 2021-03-09 张瑞华 Intelligent home control method and intelligent control equipment based on biological feature recognition
CN112071304B (en) * 2020-09-08 2024-03-15 深圳市天维大数据技术有限公司 Semantic analysis method and device
CN113284494B (en) * 2021-05-25 2023-12-01 北京基智科技有限公司 Voice assistant recognition method, device, equipment and computer readable storage medium
CN113468983A (en) * 2021-06-15 2021-10-01 杭州海康威视系统技术有限公司 Emotion analysis method, device, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104050965A (en) * 2013-09-02 2014-09-17 广东外语外贸大学 English phonetic pronunciation quality evaluation system with emotion recognition function and method thereof
WO2018061839A1 (en) * 2016-09-29 2018-04-05 株式会社村田製作所 Transmission device, transmission method, and transmission program
CN108010516A (en) * 2017-12-04 2018-05-08 广州势必可赢网络科技有限公司 A kind of semanteme independent voice mood characteristic recognition method and device

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102099853B (en) * 2009-03-16 2012-10-10 富士通株式会社 Apparatus and method for recognizing speech emotion change
CN101789990A (en) * 2009-12-23 2010-07-28 宇龙计算机通信科技(深圳)有限公司 Method and mobile terminal for judging emotion of opposite party in conservation process
CN103593054B (en) * 2013-11-25 2018-04-20 北京光年无限科技有限公司 A kind of combination Emotion identification and the question answering system of output
CN104102627B (en) * 2014-07-11 2016-10-26 合肥工业大学 A kind of multi-modal noncontact sentiment analysis record system
US9576190B2 (en) * 2015-03-18 2017-02-21 Snap Inc. Emotion recognition in video conferencing
CN105159979A (en) * 2015-08-27 2015-12-16 广东小天才科技有限公司 Good friend recommendation method and apparatus
CN106548788B (en) * 2015-09-23 2020-01-07 中国移动通信集团山东有限公司 Intelligent emotion determining method and system
CN107948417A (en) * 2017-11-22 2018-04-20 周燕红 A kind of method, apparatus, terminal and the storage medium of voice data monitoring
CN108010512B (en) * 2017-12-05 2021-04-30 广东小天才科技有限公司 Sound effect acquisition method and recording terminal

Also Published As

Publication number Publication date
CN109087670A (en) 2018-12-25

Similar Documents

Publication Publication Date Title
CN109087670B (en) Emotion analysis method, system, server and storage medium
WO2021208287A1 (en) Voice activity detection method and apparatus for emotion recognition, electronic device, and storage medium
US10522136B2 (en) Method and device for training acoustic model, computer device and storage medium
WO2021000408A1 (en) Interview scoring method and apparatus, and device and storage medium
CN107193973B (en) Method, device and equipment for identifying field of semantic analysis information and readable medium
CN110069608B (en) Voice interaction method, device, equipment and computer storage medium
EP3839942A1 (en) Quality inspection method, apparatus, device and computer storage medium for insurance recording
CN106897439B (en) Text emotion recognition method, device, server and storage medium
US10529340B2 (en) Voiceprint registration method, server and storage medium
WO2021000497A1 (en) Retrieval method and apparatus, and computer device and storage medium
US20200219487A1 (en) Information processing apparatus and information processing method
CN110675862A (en) Corpus acquisition method, electronic device and storage medium
CN110880329A (en) Audio identification method and equipment and storage medium
CN109299227B (en) Information query method and device based on voice recognition
CN113707173B (en) Voice separation method, device, equipment and storage medium based on audio segmentation
WO2020233381A1 (en) Speech recognition-based service request method and apparatus, and computer device
CN111180025A (en) Method and device for representing medical record text vector and inquiry system
CN111145903A (en) Method and device for acquiring vertigo inquiry text, electronic equipment and inquiry system
CN107767716B (en) Data processing method and device, mobile terminal and storage medium
CN113628627A (en) Electric power industry customer service quality inspection system based on structured voice analysis
CN108847251B (en) Voice duplicate removal method, device, server and storage medium
Park et al. Towards soundscape information retrieval (SIR)
CN113761206A (en) Intelligent information query method, device, equipment and medium based on intention recognition
CN110276001B (en) Checking page identification method and device, computing equipment and medium
CN113808577A (en) Intelligent extraction method and device of voice abstract, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant