CN116503841A - A mental health intelligent emotion recognition method - Google Patents


Info

Publication number
CN116503841A
Authority
CN
China
Prior art keywords
model
emotion recognition
recognition
algorithm
face
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310478600.8A
Other languages
Chinese (zh)
Inventor
姚尧
袁礼承
徐锋
陈冠伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Good Feeling Health Industry Group Co ltd
Original Assignee
Good Feeling Health Industry Group Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Good Feeling Health Industry Group Co ltd filed Critical Good Feeling Health Industry Group Co ltd
Priority to CN202310478600.8A priority Critical patent/CN116503841A/en
Publication of CN116503841A publication Critical patent/CN116503841A/en
Pending legal-status Critical Current

Links

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/63 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G06V20/46 - Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/59 - Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G06V20/597 - Recognising the driver's state or behaviour, e.g. attention or drowsiness
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 - Classification, e.g. identification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174 - Facial expression recognition
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Psychiatry (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Hospice & Palliative Care (AREA)
  • Child & Adolescent Psychology (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The invention provides an intelligent mental-health emotion recognition method involving a user terminal and a service cloud. The service cloud creates the model, trains it, adjusts the model parameters, and provides the follow-up services after emotion recognition; the user terminal is responsible for capturing, storing, and recognizing expressions in voice and pictures and for uploading them to the service cloud. The server continuously trains the model on the collected feature-expression data set and optimizes the model parameters, forming an effective model algorithm and continuously improving the accuracy and effectiveness of emotion recognition. The driver's face photos can be acquired in real time, facial expressions recognized, and a preliminary emotion recognition report produced.

Description

A mental health intelligent emotion recognition method

Technical Field

The invention relates to the field of emotion recognition, and in particular to an intelligent mental-health emotion recognition method.

Background Art

Psychological emotion is a way in which people react to objective things and a way in which feelings are expressed. As an important part of human psychological activity, emotion plays a pivotal role in organizing and guiding behavior, in communication, and in predicting the intentions of others. Physiological changes, changes in facial expression, changes in the voice, and similar factors cause people to emit corresponding signals when expressing their emotions, from which others draw inferences; this process is emotion recognition.

Haoxinqing ("Good Mood") is an Internet medical platform focused on mental-health services in the CNS (central nervous system) field. Through analysis of large amounts of data and of the literature, it has examined the definition and classification of emotions and considered the future directions of emotion recognition research and their application value.

Long-distance truck drivers have long had a high accident rate, causing considerable personal and property losses. Follow-up analysis shows that most accidents are caused by improper driving operations triggered by large emotional fluctuations. To provide better service and reduce the accident rate, insurers have proposed monitoring truck drivers' facial expressions in real time and analyzing their emotions; when a driver exhibits a dangerous emotional state, it is reported to the server in time, and a server-side voice intervention calms the driver and encourages safe driving behavior, thereby reducing the probability of an accident.

Based on the analysis of these problems and a survey of the product solutions currently on the market, the proposed solution is as follows. A face recognition model is created by enrolling and modeling the truck driver's face. The terminal device captures facial feature photos at the driving position in real time to identify the truck driver, recognizes the driver's facial expressions from expression photos, and analyzes the driver's emotions over the time dimension. When the driver's emotions become abnormal, the analysis results and photo information are reported to the server in real time and an alarm is raised. On receiving the alarm, the server has the dangerous behavior analyzed manually and issues a command that delivers a voice package with a gentle spoken reminder, effectively ending the driver's dangerous driving emotions and behavior and thereby reducing accidents.

Summary of the Invention

In view of the above problems, the present invention is proposed to provide an intelligent mental-health emotion recognition method that overcomes, or at least partially solves, these problems.

According to one aspect of the present invention, an intelligent mental-health emotion recognition method is provided; the emotion recognition method involves a user terminal and a service cloud.

The service cloud creates the model and handles model training, model parameter adjustment, and the follow-up services after emotion recognition.

The user terminal is responsible for capturing, storing, and recognizing expressions in voice and photos, and for uploading them to the service cloud.

Optionally, the emotion recognition method specifically comprises:

uploading a face photo to the service cloud through the user terminal;

the service cloud running the face photo through a face analysis model to create face recognition model parameters, then delivering the parameters to the user terminal's model for on-terminal face recognition; and

the user terminal taking photo groups at timed intervals, recognizing the face, analyzing the facial expression results to determine the person's current emotion, and uploading the analysis results to the service cloud.
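The terminal-side loop described above can be sketched as follows. All function names, the emotion labels, and the majority-vote analysis are illustrative stand-ins, not details taken from the patent:

```python
# Illustrative emotion labels; the patent does not enumerate them.
EMOTIONS = ["calm", "angry", "anxious", "sad"]

def capture_photo_group(n=3):
    """Stand-in for the terminal's timed burst of photos."""
    return [f"frame_{i}" for i in range(n)]

def recognize_face(frame, model_params):
    """Stand-in for on-terminal face recognition using cloud-issued parameters."""
    return {"frame": frame, "driver_id": model_params["driver_id"]}

def classify_expression(face):
    """Stand-in expression classifier; a real terminal would run a trained model."""
    return "calm"

def analyze_emotion(labels):
    """Majority vote over the photo group approximates analysis over the time dimension."""
    return max(set(labels), key=labels.count)

def terminal_cycle(model_params):
    """One timed cycle: capture a photo group, recognize, classify, summarize."""
    frames = capture_photo_group()
    labels = [classify_expression(recognize_face(f, model_params)) for f in frames]
    # In the real system this result dict would be uploaded to the service cloud.
    return {"driver_id": model_params["driver_id"], "emotion": analyze_emotion(labels)}

print(terminal_cycle({"driver_id": "D001"}))
```

In a deployed system each stand-in would be replaced by the cloud-issued recognition model and the terminal's camera pipeline.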

Optionally, after the analysis results are uploaded to the service cloud, the method further comprises:

the service cloud providing differentiated follow-up services according to the different emotion results.
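A minimal sketch of such differentiated dispatch, assuming a hypothetical label set and service actions (the patent does not enumerate either):

```python
# Hypothetical mapping from recognized emotion to a follow-up service action.
FOLLOW_UP = {
    "calm": None,                      # no intervention needed
    "anxious": "play_soothing_voice",  # push a calming voice package
    "angry": "alert_and_voice_prompt", # raise an alarm and prompt the driver
}

def dispatch_service(emotion):
    """Pick the follow-up service; unknown emotions are escalated for manual review."""
    return FOLLOW_UP.get(emotion, "escalate_to_operator")

print(dispatch_service("angry"))    # alert_and_voice_prompt
print(dispatch_service("unknown"))  # escalate_to_operator
```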

Optionally, the face recognition algorithm and the emotion analysis algorithm dynamically adjust their algorithm models and parameters as business volume grows, business scenarios change, and data volume increases.

Optionally, the expression recognition results are deeply integrated with the business lines to provide auxiliary support for business-line decision-making.

The present invention thus provides an intelligent mental-health emotion recognition method involving a user terminal and a service cloud. The service cloud creates the model and handles model training, model parameter adjustment, and follow-up services after emotion recognition; the user terminal is responsible for capturing, storing, and recognizing expressions in voice and photos and for uploading them to the service cloud. Based on the collected feature-expression data set, the server continuously trains the model and optimizes its parameters, forming an effective model algorithm and continuously improving the accuracy and effectiveness of emotion recognition. The driver's face photos can be acquired in real time, facial expressions recognized, and a preliminary emotion recognition report produced.

The above description is only an overview of the technical solution of the present invention. So that the technical means of the present invention may be understood more clearly and implemented according to the contents of the specification, and so that the above and other objects, features, and advantages of the present invention may become more apparent, specific embodiments of the invention are set out below.

Brief Description of the Drawings

To illustrate the technical solutions of the embodiments more clearly, the drawings required in the description of the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of facial emotion recognition provided by an embodiment of the present invention;

Fig. 2 is a flowchart of a combined facial and speech emotion recognition method provided by an embodiment of the present invention.

Detailed Description of Embodiments

Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth here. Rather, these embodiments are provided so that the present disclosure will be understood more thoroughly and its scope fully conveyed to those skilled in the art.

The terms "comprising" and "having," and any variations thereof, in the description, embodiments, claims, and drawings of the present invention are intended to cover non-exclusive inclusion, for example, of a series of steps or units.

The technical solutions of the present invention are described in further detail below in conjunction with the drawings and embodiments.

A face recognition model is created by enrolling and modeling the truck driver's face. The terminal device captures facial feature photos at the driving position in real time to identify the truck driver, recognizes the driver's facial expressions from expression photos, and analyzes the driver's emotions over the time dimension. When the driver's emotions become abnormal, the analysis results and photo information are reported to the server in real time and an alarm is raised. On receiving the alarm, the server has the dangerous behavior analyzed manually and issues a command that delivers a voice package with a gentle spoken reminder, effectively ending the driver's dangerous driving emotions and behavior and thereby reducing accidents.

In the current product solution, the server remotely enrolls face information, generates the basic face modeling information, and delivers the face model to the terminal device. The terminal device embeds the face recognition and expression recognition algorithms, controls the photo capture frequency and the expression recognition frequency, saves key feature expression data, and uploads it to the server. The terminal device can receive voice playback commands from the server, as well as server-side algorithm parameter adjustments and algorithm upgrades.
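The terminal-side controls mentioned above (capture frequency, recognition frequency, retention of key feature frames) and the cloud-pushed parameter adjustments could look roughly like this; all field names and default values are assumptions for illustration, not values from the patent:

```python
from dataclasses import dataclass

@dataclass
class TerminalConfig:
    """Illustrative terminal-side controls described in the text."""
    capture_interval_s: int = 5   # how often a photo group is taken
    recognize_every_n: int = 1    # run expression recognition on every Nth group
    keep_key_frames: bool = True  # persist key feature expression data for upload

def apply_cloud_update(cfg: TerminalConfig, update: dict) -> TerminalConfig:
    """Apply a parameter adjustment pushed down from the service cloud,
    ignoring any keys the terminal does not recognize."""
    for key, value in update.items():
        if hasattr(cfg, key):
            setattr(cfg, key, value)
    return cfg

cfg = apply_cloud_update(TerminalConfig(), {"capture_interval_s": 10})
print(cfg.capture_interval_s)  # 10
```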

Based on the collected feature-expression data set, the server continuously trains the model and optimizes its parameters, forming an effective model algorithm and continuously improving the accuracy and effectiveness of emotion recognition. By continuously enriching the content and presentation of the voice service packages, it provides more and better follow-up services.

At present, the system can already collect the driver's face photos in real time, recognize facial expressions, and produce a preliminary emotion recognition report.

Emotion recognition comprises two major parts: the user terminal and the server cloud. The cloud is responsible for model creation, model training, model parameter adjustment, and the follow-up services provided after emotion recognition. The terminal is responsible for capturing and storing voice and photos, expression recognition, and uploading to the cloud.

The basic technical route of emotion recognition is as follows: the terminal uploads face photos to the cloud; the cloud runs the photos through a face analysis model to create face recognition model parameters and delivers the parameters to the terminal's model for on-terminal face recognition. The terminal takes photo groups at timed intervals, recognizes the face, analyzes the facial expression results to determine the person's current emotion, and uploads the analysis results to the cloud. The cloud provides differentiated follow-up services according to the different emotion results.

The face recognition and emotion analysis algorithms dynamically adjust their models and parameters as business volume grows, business scenarios change, and data volume increases, continuously improving algorithm accuracy and business-scenario coverage.

Expression recognition is currently deployed only on Linux systems; in the future it will be made compatible with other terminal devices, covering multi-scenario applications. As the algorithms deepen, demand for hardware computing power will grow considerably, which will inevitably lead to deep partnerships with hardware manufacturers and promote hardware development.

The expression recognition results are deeply integrated with other business lines to provide auxiliary support for their decision-making.

The core technology of combined facial and speech emotion recognition has two levels: the algorithm model, and compatibility across multi-scenario devices.

On the algorithm side, the face recognition and expression recognition algorithms currently on the market suffer from low recognition accuracy and overly simple recognition results; for speech emotion recognition data sets in particular, recognition accuracy is very low. Our algorithms are trained on large amounts of data for specific business scenarios, continuously improving model recognition accuracy, expression recognition accuracy, the multi-scenario coverage of emotion analysis, and the modeling of an individual's expression trends, as well as building models of the basic expression characteristics of populations in different regions and occupations and of the emotional characteristics of patients with depression.

At present, combined face recognition and expression recognition models are mostly deployed on fixed terminals, with very little coverage of mobile or industrial terminals. Most hardware devices on the market do not support embedding algorithm models, and dynamic upgrading of those models in particular is not practicable. Industrial-grade hardware is also expensive and limited in computing power; by continuously covering more business scenarios and reducing costs, the technology can benefit a wider population.

Industrial-grade deployments of face algorithms are currently implemented mostly in basic C, whereas the bulk of the algorithm models are written in Python.

As shown in Fig. 1, facial emotion recognition is an artificial-intelligence technology for analyzing emotions in pictures and videos from different sources. Signals are generally obtained from cameras, social media pages, video libraries, and the like; static and dynamic facial expressions are detected and the emotional states then classified. Facial emotion recognition is usually based on deep learning and proceeds in three stages: face information preprocessing, feature learning, and emotion recognition. In the field of mental illness, facial emotion recognition can be used to predict the probability of illness, assist diagnosis, and help improve the quality of care provided by medical staff. It currently carries inherent risks concerning data accuracy, algorithmic fairness, data privacy, and user reactivity.
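The three stages can be sketched end to end as below. The random linear "model" only illustrates the data flow from preprocessing through feature learning to classification; a real system would use a trained deep network, and the label set is an assumption:

```python
import numpy as np

rng = np.random.default_rng(0)
LABELS = ["neutral", "happy", "angry", "sad"]  # illustrative label set

def preprocess(image):
    """Normalize pixel values and flatten; stands in for crop/align steps."""
    return (image.astype(np.float32) / 255.0).ravel()

def extract_features(x, weights):
    """One linear layer plus ReLU as a stand-in for learned feature extraction."""
    return np.maximum(x @ weights, 0.0)

def classify(features, head):
    """Softmax over a linear classification head."""
    logits = features @ head
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

image = rng.integers(0, 256, size=(48, 48))  # fake 48x48 face crop
W = rng.normal(size=(48 * 48, 16))           # random "feature" weights
H = rng.normal(size=(16, len(LABELS)))       # random classification head
probs = classify(extract_features(preprocess(image), W), H)
print(LABELS[int(np.argmax(probs))])
```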

As shown in Fig. 2, speech emotion recognition means that, given a speech signal, a computer automatically infers multi-dimensional information about the speaker. Human speech production is a complex, multi-system coordination process involving the brain's cognitive activity and the body's muscular movement. A speech signal carries three layers of information: acoustic, linguistic, and emotional; because the motor fibers involved are highly mechanized, the generated speech signal has objective, repeatable characteristics. The recognition process comprises three stages: speech signal processing, feature extraction, and emotion modeling; the recognition algorithms involved include traditional algorithms, deep-learning-based algorithms, and end-to-end algorithms. Speech biomarkers are features, or combinations of features, of a speech audio signal that correlate with clinical outcomes; they can be used for the screening, diagnosis, and monitoring of mental and psychological disorders, and in interventions such as AI-assisted CBT and digital therapeutics. At present, speech emotion recognition remains technically difficult owing to the vague definition of emotion, scarce data, and the difficulty of labeling.
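A minimal front-end sketch of the signal-processing and feature-extraction stages, using two classic frame-level features (short-time energy and zero-crossing rate) on a synthetic signal; the frame sizes and the test tone are illustrative assumptions:

```python
import numpy as np

def frame_signal(signal, frame_len=400, hop=200):
    """Split a 1-D signal into overlapping frames (50% overlap by default)."""
    n = 1 + max(0, (len(signal) - frame_len) // hop)
    return np.stack([signal[i * hop: i * hop + frame_len] for i in range(n)])

def short_time_energy(frames):
    """Mean squared amplitude per frame."""
    return np.mean(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of adjacent samples whose sign changes, per frame."""
    return np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)

sr = 8000
t = np.arange(sr) / sr                      # one second of audio
signal = 0.5 * np.sin(2 * np.pi * 220 * t)  # synthetic 220 Hz tone
frames = frame_signal(signal)
features = np.column_stack([short_time_energy(frames), zero_crossing_rate(frames)])
print(features.shape)  # (39, 2)
```

A real pipeline would feed such per-frame features (or spectral features like MFCCs) into the emotion model of the third stage.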

Combining facial emotion recognition with speech emotion recognition can effectively improve recognition accuracy: the two channels can cross-validate against the interference factors that arise during recognition and would otherwise degrade the accuracy of the results.
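Such combination can be sketched as late fusion: a weighted average of the two channels' probability outputs, with channel agreement as a simple cross-check. The labels and weights below are assumptions, not values from the patent:

```python
import numpy as np

LABELS = ["calm", "anxious", "angry"]  # illustrative label set

def fuse(face_probs, voice_probs, w_face=0.6, w_voice=0.4):
    """Weighted average of the per-channel distributions; also report whether
    the two channels agree on the top label, as a cross-validation signal."""
    fused = w_face * np.asarray(face_probs) + w_voice * np.asarray(voice_probs)
    agree = int(np.argmax(face_probs)) == int(np.argmax(voice_probs))
    return LABELS[int(np.argmax(fused))], agree

label, channels_agree = fuse([0.1, 0.2, 0.7], [0.2, 0.1, 0.7])
print(label, channels_agree)  # angry True
```

When the channels disagree, a deployed system might lower its confidence or defer to manual review rather than act on the fused label.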

Beneficial effects: the server remotely enrolls face information, generates the basic face modeling information, and delivers the face model to the terminal device. The terminal device embeds the face recognition and expression recognition algorithms, controls the photo capture frequency and the expression recognition frequency, saves key feature expression data, and uploads it to the server. The terminal device can receive voice playback commands from the server, as well as server-side algorithm parameter adjustments and algorithm upgrades.

Based on the collected feature-expression data set, the server continuously trains the model and optimizes its parameters, forming an effective model algorithm and continuously improving the accuracy and effectiveness of emotion recognition. By continuously enriching the content and presentation of the voice service packages, it provides more and better follow-up services.

At present, the system can already collect the driver's face photos in real time, recognize facial expressions, and produce a preliminary emotion recognition report.

The specific embodiments above describe the objects, technical solutions, and beneficial effects of the present invention in further detail. It should be understood that the above are only specific embodiments of the present invention and are not intended to limit its scope of protection; any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (5)

1. A mental health intelligent emotion recognition method, characterized in that the emotion recognition method involves a user terminal and a service cloud; the service cloud creates the model and handles model training, model parameter adjustment, and follow-up services after emotion recognition; and the user terminal is responsible for capturing, storing, and recognizing expressions in voice and photos, and for uploading them to the service cloud.

2. The mental health intelligent emotion recognition method according to claim 1, characterized in that the emotion recognition method specifically comprises: uploading a face photo to the service cloud through the user terminal; the service cloud running the face photo through a face analysis model to create face recognition model parameters, then delivering the parameters to the user terminal's model for on-terminal face recognition; and the user terminal taking photo groups at timed intervals, recognizing the face, analyzing the facial expression results to determine the person's current emotion, and uploading the analysis results to the service cloud.

3. The mental health intelligent emotion recognition method according to claim 2, characterized in that, after the analysis results are uploaded to the service cloud, the method further comprises: the service cloud providing differentiated follow-up services according to the different emotion results.

4. The mental health intelligent emotion recognition method according to claim 2, characterized in that the face recognition algorithm and the emotion analysis algorithm dynamically adjust their algorithm models and parameters as business volume grows, business scenarios change, and data volume increases.

5. The mental health intelligent emotion recognition method according to claim 1, characterized in that the expression recognition results are deeply integrated with the business lines to provide auxiliary support for business-line decision-making.
CN202310478600.8A 2023-04-28 2023-04-28 A mental health intelligent emotion recognition method Pending CN116503841A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310478600.8A CN116503841A (en) 2023-04-28 2023-04-28 A mental health intelligent emotion recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310478600.8A CN116503841A (en) 2023-04-28 2023-04-28 A mental health intelligent emotion recognition method

Publications (1)

Publication Number Publication Date
CN116503841A true CN116503841A (en) 2023-07-28

Family

ID=87324499

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310478600.8A Pending CN116503841A (en) 2023-04-28 2023-04-28 A mental health intelligent emotion recognition method

Country Status (1)

Country Link
CN (1) CN116503841A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117496580A (en) * 2023-11-23 2024-02-02 好心情健康产业集团有限公司 Facial expression intelligent recognition robot terminal based on multi-person synchronous interaction

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108960023A (en) * 2017-09-19 2018-12-07 炬大科技有限公司 A kind of portable Emotion identification device
CN113221821A (en) * 2021-05-28 2021-08-06 中国工商银行股份有限公司 Business data pushing method and device and server
CN113741863A (en) * 2021-07-29 2021-12-03 南方电网深圳数字电网研究院有限公司 Application program generation method based on algorithm model, electronic device and storage medium
CN115543089A (en) * 2022-10-20 2022-12-30 昆明奥智科技有限公司 Virtual human emotion interaction system and method based on five-dimensional emotion model

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108960023A (en) * 2017-09-19 2018-12-07 炬大科技有限公司 A kind of portable Emotion identification device
CN113221821A (en) * 2021-05-28 2021-08-06 中国工商银行股份有限公司 Business data pushing method and device and server
CN113741863A (en) * 2021-07-29 2021-12-03 南方电网深圳数字电网研究院有限公司 Application program generation method based on algorithm model, electronic device and storage medium
CN115543089A (en) * 2022-10-20 2022-12-30 昆明奥智科技有限公司 Virtual human emotion interaction system and method based on five-dimensional emotion model

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117496580A (en) * 2023-11-23 2024-02-02 好心情健康产业集团有限公司 Facial expression intelligent recognition robot terminal based on multi-person synchronous interaction
CN117496580B (en) * 2023-11-23 2024-05-07 好心情健康产业集团有限公司 Facial expression intelligent recognition robot terminal based on multi-person synchronous interaction

Similar Documents

Publication Publication Date Title
US10878818B2 (en) Methods and apparatus for silent speech interface
US12205595B2 (en) Wearable for suppressing sound other than a wearer's voice
US12204627B2 (en) Using a wearable to interpret facial skin micromovements
US20170187880A1 (en) Coordinating voice calls between representatives and customers to influence an outcome of the call
US12105876B2 (en) System and method for using gestures and expressions for controlling speech applications
CN110262665A (en) Method and apparatus for output information
US12230407B2 (en) Medical intelligence system and method
CN110175526A (en) Dog Emotion identification model training method, device, computer equipment and storage medium
CN113468983A (en) Emotion analysis method, device, equipment and storage medium
US11699043B2 (en) Determination of transcription accuracy
CN118098587A (en) An AI suicide risk analysis method and system based on digital doctor
CN116503841A (en) A mental health intelligent emotion recognition method
Bunton Effects of nasal port area on perception of nasality and measures of nasalance based on computational modeling
US11556720B2 (en) Context information reformation and transfer mechanism at inflection point
CN206892866U (en) Intelligent dialogue device with scenario analysis function
CN115471890A (en) Vehicle interaction method and device, vehicle and storage medium
CN119182748B (en) Instant messaging method, system and related device
CN119539711A (en) A personalized remote trial system
CN118587757A (en) AR-based emotional data processing method, device and electronic device
JP2025051679A (en) system
CN117095573A (en) PBL teaching analysis system and method based on audio and video
JP2025048829A (en) system
JP2025048852A (en) system
JP2025048984A (en) system
JP2025049250A (en) system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination