WO2021031817A1 - Emotion recognition method and device, computer device, and storage medium - Google Patents
Emotion recognition method and device, computer device, and storage medium
- Publication number
- WO2021031817A1 (PCT/CN2020/105630; CN2020105630W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- training sample
- user
- features
- training
- sample set
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
Definitions
- This application relates to the field of artificial intelligence technology, in particular to an emotion recognition method, device, computer device, and storage medium.
- Emotion recognition has become one of the most active research topics in the field of artificial intelligence. Its purpose is to detect, track, and identify human image sequences and to explain human behavior more scientifically. Emotion recognition can be applied to many aspects of life: game manufacturers can intelligently analyze players' emotions and interact with players in a targeted manner according to different expressions, improving the gaming experience; camera manufacturers can use this technology to capture human expressions, for example capturing the facial expression of the person being photographed when they smile or look angry and quickly completing the shot; governments or sociologists can install cameras in public places to analyze the facial expressions and body movements of social groups to understand people's life and work pressure; shopping malls can conduct market research on products based on videos of customers' actions and facial expressions while shopping.
- the first aspect of the present application provides an emotion recognition method, wherein the method includes:
- each training sample in the training sample set is a time series of the acceleration of the user walking, each training sample has a label, and the label marks the emotion category corresponding to the training sample;
- the second aspect of the present application provides a computer device, wherein the computer device includes a processor configured to execute computer-readable instructions stored in a memory to implement the following steps:
- each training sample in the training sample set is a time series of the acceleration of the user walking, each training sample has a label, and the label marks the emotion category corresponding to the training sample;
- a third aspect of the present application provides a storage medium with computer-readable instructions stored on the storage medium, where the computer-readable instructions implement the following steps when executed by a processor:
- each training sample in the training sample set is a time series of the acceleration of the user walking, each training sample has a label, and the label marks the emotion category corresponding to the training sample;
- a fourth aspect of the present application provides an emotion recognition device, wherein the device includes:
- An obtaining module configured to obtain a training sample set, each training sample in the training sample set is a time series of the acceleration of a user's walking, each training sample has a label, and the label marks the emotion category corresponding to the training sample;
- An extraction module for extracting multiple features for each training sample in the training sample set
- a construction module for constructing multiple classification regression trees according to multiple characteristics of each training sample in the training sample set
- the recognition module is configured to input multiple features of the user to be recognized into the random forest and determine the emotion category of the user to be recognized according to the output of the random forest, wherein the multiple features of the user to be recognized are obtained from the acceleration time series of the user to be recognized walking.
- In the present application, users' walking acceleration time series with emotion category labels are used as training samples, a random forest is generated from the training samples, and the acceleration time series of the user to be identified is recognized using the random forest.
- The application thus realizes recognition of the user's emotions based on the acceleration data collected during the user's walking process.
- Fig. 1 is a flowchart of an emotion recognition method provided by an embodiment of the present application.
- Fig. 2 is a structural diagram of an emotion recognition device provided by an embodiment of the present application.
- Fig. 3 is a schematic diagram of a computer device provided by an embodiment of the present application.
- the emotion recognition method of the present application is applied in one or more computer devices.
- the computer device is a device that can automatically perform numerical calculation and/or information processing in accordance with pre-set or stored instructions.
- Its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (ASIC), field-programmable gate arrays (FPGA), digital signal processors (DSP), embedded devices, and the like.
- the computer device may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server.
- the computer device can interact with the user through a keyboard, a mouse, a remote control, a touch panel, or a voice control device.
- FIG. 1 is a flowchart of the emotion recognition method provided by Embodiment 1 of the present application.
- the emotion recognition method is applied to a computer device.
- the emotion recognition method of the present application involves machine learning, which is used to recognize the user's emotion according to the acceleration data during the walking process of the user.
- the emotion recognition method includes:
- each training sample in the training sample set is a time series of a user's walking acceleration, and each training sample has a label, and the label marks the emotion category corresponding to the training sample.
- the acceleration data during the walking process of the user can be collected through acceleration sensors on the wrist and/or ankle of the user within a preset time, and the acceleration time series can be obtained according to the acceleration data.
- Each acceleration time series may include a preset number of acceleration data, for example, 100 acceleration data.
- each acceleration time series may include acceleration data within a preset time (for example, 60 seconds).
- the acceleration data may be acceleration data in the X-axis, Y-axis or Z-axis directions, so as to obtain the acceleration time series in the X-axis, Y-axis or Z-axis directions.
- For example, an acceleration sensor on the user's wrist collects a preset number (for example, 100) of acceleration data points in the X-axis direction, and the collected X-axis acceleration data compose an acceleration time series, yielding one training sample.
- Alternatively, an acceleration sensor on the user's ankle collects acceleration data in the X-axis direction over a preset time interval (for example, 60 seconds) while the user walks, and the X-axis acceleration data collected during that period compose an acceleration time series, yielding one training sample.
- Each training sample corresponds to a label, which is used to identify the emotion category.
- The emotion category may include positive emotions (e.g., excited, happy), neutral emotions (e.g., calm), or negative emotions (e.g., sadness).
- the label may be a number, such as 1, 2, 3. For example, if the user’s emotion is positive, the corresponding label is 3; if the user’s emotion is neutral, the corresponding label is 2; if the user’s emotion is negative, the corresponding label is 1.
- When the user is in different emotional states, the acceleration data of the user's walking differs.
- Therefore, the user's acceleration data can be collected while the user experiences different emotions, yielding training samples with different labels.
- a plurality of training samples obtained by collecting acceleration data of a user's walking constitute the training sample set.
- the training sample set may include training samples of multiple users, that is, a time series of accelerations of multiple users walking.
- the training sample set may include a training sample of a user, that is, a time series of acceleration of a user's walking.
- Extracting multiple features for each training sample in the training sample set means extracting the same set of features for every training sample.
- the multiple features may include the standard deviation, average value, peak value, skewness coefficient, FFT coefficient, power spectral density average, power spectral density standard deviation, and coordinate axis coefficient of the acceleration time series.
- The skewness coefficient of the acceleration time series measures the asymmetry of the distribution of the series. If a training sample is symmetric, its skewness coefficient equals 0; if a training sample is left-skewed, its skewness coefficient is less than 0; if a training sample is right-skewed, its skewness coefficient is greater than 0.
- The FFT coefficients of the acceleration time series are the coefficients obtained by applying the Fast Fourier Transform (FFT) to the acceleration time series; the FFT coefficients from the 2nd dimension to the 32nd dimension may be taken.
- When the training sample is an acceleration time series in the X-axis direction, the corresponding coordinate axis coefficient is δ_YZ = cov(Y, Z) / (√D(Y) · √D(Z)), where cov(Y, Z) is the covariance of the acceleration time series in the Y-axis direction and the acceleration time series in the Z-axis direction of the training sample, D(Y) is the variance of the acceleration time series in the Y-axis direction of the training sample, and D(Z) is the variance of the acceleration time series in the Z-axis direction of the training sample;
- when the training sample is an acceleration time series in the Y-axis direction, the coordinate axis coefficient is δ_XZ, and the calculation formula of δ_XZ is analogous to that of δ_YZ above;
- when the training sample is an acceleration time series in the Z-axis direction, the coordinate axis coefficient is δ_XY, and the calculation formula of δ_XY is likewise analogous to that of δ_YZ above.
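The feature set above can be computed with standard numerical tooling. The following is a minimal sketch using NumPy/SciPy; the exact estimators (e.g., the periodogram for power spectral density) and the helper names are illustrative assumptions, not the patent's reference implementation.

```python
import numpy as np
from scipy.stats import skew
from scipy.signal import periodogram

def extract_features(x, y=None, z=None):
    """Compute the features described above for an X-axis acceleration
    time series x; y and z are the other two axes, used only for the
    coordinate axis coefficient."""
    feats = {
        "std": np.std(x),           # standard deviation
        "mean": np.mean(x),         # average value
        "peak": np.max(np.abs(x)),  # peak value
        "skewness": skew(x),        # skewness coefficient
    }
    # FFT coefficients: magnitudes of dimensions 2 through 32.
    fft_mag = np.abs(np.fft.rfft(x))
    feats["fft_2_to_32"] = fft_mag[1:32]
    # Power spectral density mean and standard deviation.
    _, psd = periodogram(x)
    feats["psd_mean"] = np.mean(psd)
    feats["psd_std"] = np.std(psd)
    # Coordinate axis coefficient for an X-axis sample:
    # cov(Y, Z) / (sqrt(D(Y)) * sqrt(D(Z))).
    if y is not None and z is not None:
        feats["axis_coeff"] = np.cov(y, z, bias=True)[0, 1] / np.sqrt(np.var(y) * np.var(z))
    return feats
```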
- the multiple features of each training sample in the training sample set may be normalized to obtain multiple normalized features of each training sample.
- The normalizing of the multiple features of each training sample in the training sample set is performed feature by feature, where B_ij is the normalized value of the j-th feature of the i-th training sample and b_ij is the value of that feature before normalization; i = 1, 2, …, N, where N is the number of training samples in the training sample set; j = 1, 2, …, M, where M is the number of features of each training sample.
- The j-th feature of the i-th training sample refers to the j-th feature among the multiple features of the i-th training sample. A per-feature scaling such as min-max normalization can be used, as sketched below.
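As a hedged illustration of this step, the sketch below applies per-feature min-max scaling across the N training samples; the patent text does not pin the formula down, so min-max is an assumption here, and any scaler that maps each feature into a common range could be substituted.

```python
import numpy as np

def normalize_features(B):
    """B: (N, M) array, where B[i, j] is the j-th feature of the
    i-th training sample. Returns the normalized feature matrix."""
    col_min = B.min(axis=0)
    col_max = B.max(axis=0)
    span = np.where(col_max > col_min, col_max - col_min, 1.0)  # guard constant features
    return (B - col_min) / span
```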
- the method further includes:
- Preprocessing is performed on each training sample in the training sample set.
- The preprocessing of each training sample in the training sample set includes:
- performing noise reduction on the training samples may include: performing a moving average noise reduction on the training samples.
- The training samples can be denoised by a moving average according to the following formula: output[i] = (1/w) · Σ_{j = −(w−1)/2}^{(w−1)/2} input[i+j], where output[i] is the output corresponding to the i-th acceleration data point in the training sample (i.e., the acceleration time series), w is a constant window width whose value is 3 or 5, and input[i+j] is the (i+j)-th acceleration data point in the training sample.
- wavelet noise reduction can be performed on the training samples.
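A minimal sketch of the moving-average denoising follows, assuming a centered window of width w (w = 3 or 5, as stated above); the handling of edge samples is an illustrative choice.

```python
import numpy as np

def moving_average_denoise(series, w=5):
    """Replace each acceleration data point with the average of the
    w points centered on it; edges use the available neighbors."""
    series = np.asarray(series, dtype=float)
    half = w // 2
    out = np.empty_like(series)
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        out[i] = series[lo:hi].mean()
    return out
```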
- The filling in of missing values in the training sample may include: taking several acceleration data points before and after the missing value in the training sample (for example, the 5 acceleration data points before and the 5 after the missing value) and filling the missing value with the average of those points.
- Alternatively, the K-nearest-neighbor algorithm can be used to determine the K training samples closest to the training sample with the missing value (for example, the K closest training samples determined by Euclidean distance), and a weighted average of the data of those K training samples is used to estimate the missing value of the training sample.
- other methods can be used to fill in the missing values.
- the missing value can be filled by regression fitting method or interpolation method.
- the method of correcting the outliers in the training sample can be the same as the method of filling in missing values.
- For example, several acceleration data points before and after the abnormal value in the training sample can be taken (for example, the 5 acceleration data points before and the 5 after the abnormal value), and the abnormal value can be corrected with the average of those points.
- Alternatively, the K-nearest-neighbor algorithm can be used to determine the K training samples closest to the training sample with the outlier (for example, the K closest training samples determined by Euclidean distance), and a weighted average of the data of those K training samples is used to estimate a corrected value for the outlier.
- other methods can be used to correct the abnormal value.
- the abnormal value can be corrected by regression fitting method or interpolation method.
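A sketch of the neighbor-average strategy described above, usable both for filling a missing value and for correcting an outlier once its position is known; the function name and the NaN convention for missing data are assumptions for illustration.

```python
import numpy as np

def fill_with_neighbors(series, idx, k=5):
    """Estimate the value at position idx from the average of up to k
    acceleration data points before and after it."""
    series = np.asarray(series, dtype=float)
    before = series[max(0, idx - k):idx]
    after = series[idx + 1:idx + 1 + k]
    neighbors = np.concatenate([before, after])
    neighbors = neighbors[~np.isnan(neighbors)]  # skip other missing points
    return float(neighbors.mean())
```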
- the constructing multiple classification regression trees according to multiple characteristics of each training sample of the training sample set may include:
- The optimal segmentation feature and segmentation point can be determined according to the following objective function:
- min_{j,s} [ min_{c_1} Σ_{x_i ∈ R_1(j,s)} (y_i − c_1)² + min_{c_2} Σ_{x_i ∈ R_2(j,s)} (y_i − c_2)² ]
- The above formula means traversing all feature values (i.e., candidate segmentation points s) of the K features (i.e., candidate segmentation features j) of the samples to be classified, and finding the optimal segmentation feature and segmentation point according to the minimum squared-error criterion, where x_i is the i-th training sample among the samples to be classified and y_i is the label of x_i.
- R_1(j, s) = {x | x^(j) ≤ s} is the set of samples to be classified whose value of the j-th feature is less than or equal to s, and R_2(j, s) = {x | x^(j) > s} is the set of samples to be classified whose value of the j-th feature is greater than s. The optimal c_1 and c_2 are the label averages over the two subsets, c_1 = (1/N_1) Σ_{x_i ∈ R_1} y_i and c_2 = (1/N_2) Σ_{x_i ∈ R_2} y_i, where N_1 is the number of samples to be classified in the subset R_1 and N_2 is the number of samples to be classified in the subset R_2.
- When the preset stopping condition is satisfied, the division stops and a classification regression tree is obtained.
- the root node of the classification regression tree corresponds to the initial sample to be classified, and each leaf node of the classification regression tree corresponds to a subset that is no longer divided.
- The output of the classification regression tree is the output corresponding to a leaf node, that is, the average value of the labels of the samples to be classified that fall into that leaf node.
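The split search at the heart of this construction can be sketched as follows; this is a brute-force least-squares search over the K randomly selected features, with the recursion, stopping condition, and tree bookkeeping omitted for brevity.

```python
import numpy as np

def best_split(X, y):
    """X: (n, K) matrix of the K selected features of the samples to be
    classified; y: their labels. Returns (feature j, point s, error)."""
    best_j, best_s, best_err = None, None, np.inf
    for j in range(X.shape[1]):          # candidate segmentation features
        for s in np.unique(X[:, j]):     # candidate segmentation points
            left, right = y[X[:, j] <= s], y[X[:, j] > s]
            if len(left) == 0 or len(right) == 0:
                continue
            err = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
            if err < best_err:
                best_j, best_s, best_err = j, s, err
    return best_j, best_s, best_err
```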
- Multiple classification regression trees are formed into the random forest, and different classification regression trees are independent of each other.
- the input of the random forest is the input of each classification regression tree in the random forest; the output of the random forest is the average value of the outputs of all classification regression trees in the random forest.
- the generating a random forest according to the multiple classification regression trees includes:
- the random forest is generated according to the multiple classification regression trees after the pruning process.
- Pruning the multiple classification regression trees includes:
- For each internal node t of a classification regression tree, compute g(t) = (C(t) − C(T_t)) / (|T_t| − 1), where T_t represents the subtree with t as its root node, C(T_t) is the prediction error obtained from the samples to be classified that fall into the subtree T_t, C(t) is the prediction error obtained from the samples to be classified that fall into the node t when the subtree is collapsed into a single leaf, and |T_t| is the number of leaf nodes of the subtree T_t; the subtree whose g(t) is smallest is pruned, and repeating this yields a subtree sequence T_0, T_1, …, T_n;
- The cross-validation method is then used to select the optimal subtree T_α from the subtree sequence T_0, T_1, …, T_n.
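The per-node quantity driving the pruning can be written directly from the definitions above; the function below is a small sketch, with the surrounding weakest-link loop and the tree data structure left out.

```python
def g(C_t, C_Tt, n_leaves):
    """g(t) = (C(t) - C(T_t)) / (|T_t| - 1): the increase in prediction
    error per leaf removed when the subtree rooted at the internal node t
    is collapsed into a single leaf (n_leaves >= 2 for internal nodes)."""
    return (C_t - C_Tt) / (n_leaves - 1)
```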
- Each classification regression tree in the random forest takes the multiple features of the user to be identified as input and classifies the user to be identified according to those features, yielding the output of that classification regression tree; the average value of the outputs of all classification regression trees in the random forest is then calculated to obtain the output of the random forest, and the emotion category of the user to be identified is determined according to the output of the random forest.
- The emotion category whose label value is closest to the output of the random forest may be selected as the emotion category of the user to be identified.
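Putting the last two steps together, a sketch of the prediction stage might look like this; the `tree.predict` interface and the numeric label mapping (1/2/3, as in the labeling example above) are assumptions.

```python
import numpy as np

LABELS = {1: "negative", 2: "neutral", 3: "positive"}  # example mapping from above

def predict_emotion(trees, features):
    """Average the regression outputs of all trees in the forest and pick
    the emotion label whose numeric value is closest to that average."""
    forest_output = np.mean([tree.predict(features) for tree in trees])
    closest = min(LABELS, key=lambda lab: abs(lab - forest_output))
    return LABELS[closest]
```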
- The user to be identified may be among the users corresponding to the training samples.
- the training sample set includes training samples of user A, and the user to be identified is user A.
- the training sample set includes training samples of user A, user B, user C, and user D, and the user to be identified is user A.
- The user to be identified may also not be among the users corresponding to the training samples.
- the training sample set includes training samples of user A, user B, user C, and user D, and the user to be identified is user E.
- the emotion recognition method of the first embodiment takes the acceleration time series of the user walking with emotion category tags as training samples, generates a random forest according to each training sample, and uses the random forest to recognize the acceleration time series of the user to be identified.
- the first embodiment realizes the recognition of the user's emotion according to the acceleration data of the user during walking.
- FIG. 2 is a structural diagram of an emotion recognition device provided in Embodiment 2 of the present application.
- the emotion recognition device 20 is applied to a computer device.
- the emotion recognition device 20 recognizes the emotion of the user according to the acceleration data during the walking process of the user.
- the emotion recognition device 20 may include an acquisition module 201, an extraction module 202, a construction module 203, a generation module 204, and an identification module 205.
- the obtaining module 201 is configured to obtain a training sample set.
- Each training sample in the training sample set is a time series of the acceleration of a user's walking.
- Each training sample has a label, and the label marks the emotion category corresponding to the training sample.
- the acceleration data during the walking process of the user can be collected through acceleration sensors on the wrist and/or ankle of the user within a preset time, and the acceleration time series can be obtained according to the acceleration data.
- Each acceleration time series may include a preset number of acceleration data, for example, 100 acceleration data.
- each acceleration time series may include acceleration data within a preset time (for example, 60 seconds).
- the acceleration data may be acceleration data in the X-axis, Y-axis or Z-axis directions, so as to obtain the acceleration time series in the X-axis, Y-axis or Z-axis directions.
- For example, an acceleration sensor on the user's wrist collects a preset number (for example, 100) of acceleration data points in the X-axis direction, and the collected X-axis acceleration data compose an acceleration time series, yielding one training sample.
- Alternatively, an acceleration sensor on the user's ankle collects acceleration data in the X-axis direction over a preset time interval (for example, 60 seconds) while the user walks, and the X-axis acceleration data collected during that period compose an acceleration time series, yielding one training sample.
- Each training sample corresponds to a label, which is used to identify the emotion category.
- The emotion category may include positive emotions (e.g., excited, happy), neutral emotions (e.g., calm), or negative emotions (e.g., sadness).
- the label may be a number, such as 1, 2, 3. For example, if the user’s emotion is positive, the corresponding label is 3; if the user’s emotion is neutral, the corresponding label is 2; if the user’s emotion is negative, the corresponding label is 1.
- When the user is in different emotional states, the acceleration data of the user's walking differs.
- Therefore, the user's acceleration data can be collected while the user experiences different emotions, yielding training samples with different labels.
- a plurality of training samples obtained by collecting acceleration data of a user's walking constitute the training sample set.
- the training sample set may include training samples of multiple users, that is, a time series of accelerations of multiple users walking.
- the training sample set may include a training sample of a user, that is, a time series of acceleration of a user's walking.
- the extraction module 202 is configured to extract multiple features for each training sample in the training sample set.
- Extracting multiple features for each training sample in the training sample set means extracting the same set of features for every training sample.
- the multiple features may include the standard deviation, average value, peak value, skewness coefficient, FFT coefficient, power spectral density average, power spectral density standard deviation, and coordinate axis coefficient of the acceleration time series.
- The skewness coefficient of the acceleration time series measures the asymmetry of the distribution of the series. If a training sample is symmetric, its skewness coefficient equals 0; if a training sample is left-skewed, its skewness coefficient is less than 0; if a training sample is right-skewed, its skewness coefficient is greater than 0.
- The FFT coefficients of the acceleration time series are the coefficients obtained by applying the Fast Fourier Transform (FFT) to the acceleration time series; the FFT coefficients from the 2nd dimension to the 32nd dimension may be taken.
- When the training sample is an acceleration time series in the X-axis direction, the corresponding coordinate axis coefficient is δ_YZ = cov(Y, Z) / (√D(Y) · √D(Z)), where cov(Y, Z) is the covariance of the acceleration time series in the Y-axis direction and the acceleration time series in the Z-axis direction of the training sample, D(Y) is the variance of the acceleration time series in the Y-axis direction of the training sample, and D(Z) is the variance of the acceleration time series in the Z-axis direction of the training sample;
- when the training sample is an acceleration time series in the Y-axis direction, the coordinate axis coefficient is δ_XZ, and the calculation formula of δ_XZ is analogous to that of δ_YZ above;
- when the training sample is an acceleration time series in the Z-axis direction, the coordinate axis coefficient is δ_XY, and the calculation formula of δ_XY is likewise analogous to that of δ_YZ above.
- the multiple features of each training sample in the training sample set may be normalized to obtain multiple normalized features of each training sample.
- The normalizing of the multiple features of each training sample in the training sample set is performed feature by feature, where B_ij is the normalized value of the j-th feature of the i-th training sample and b_ij is the value of that feature before normalization; i = 1, 2, …, N, where N is the number of training samples in the training sample set; j = 1, 2, …, M, where M is the number of features of each training sample.
- The j-th feature of the i-th training sample refers to the j-th feature among the multiple features of the i-th training sample.
- the method further includes:
- Preprocessing is performed on each training sample in the training sample set.
- The preprocessing of each training sample in the training sample set includes:
- performing noise reduction on the training samples may include: performing a moving average noise reduction on the training samples.
- The training samples can be denoised by a moving average according to the following formula: output[i] = (1/w) · Σ_{j = −(w−1)/2}^{(w−1)/2} input[i+j], where output[i] is the output corresponding to the i-th acceleration data point in the training sample (i.e., the acceleration time series), w is a constant window width whose value is 3 or 5, and input[i+j] is the (i+j)-th acceleration data point in the training sample.
- wavelet noise reduction can be performed on the training samples.
- The filling in of missing values in the training sample may include: taking several acceleration data points before and after the missing value in the training sample (for example, the 5 acceleration data points before and the 5 after the missing value) and filling the missing value with the average of those points.
- Alternatively, the K-nearest-neighbor algorithm can be used to determine the K training samples closest to the training sample with the missing value (for example, the K closest training samples determined by Euclidean distance), and a weighted average of the data of those K training samples is used to estimate the missing value of the training sample.
- other methods can be used to fill in the missing values.
- the missing value can be filled in by regression fitting method or interpolation method.
- the method of correcting the outliers in the training sample can be the same as the method of filling in missing values.
- For example, several acceleration data points before and after the abnormal value in the training sample can be taken (for example, the 5 acceleration data points before and the 5 after the abnormal value), and the abnormal value can be corrected with the average of those points.
- Alternatively, the K-nearest-neighbor algorithm can be used to determine the K training samples closest to the training sample with the outlier (for example, the K closest training samples determined by Euclidean distance), and a weighted average of the data of those K training samples is used to estimate a corrected value for the outlier.
- other methods can be used to correct the abnormal value.
- the abnormal value can be corrected by regression fitting method or interpolation method.
- the construction module 203 is configured to construct multiple classification regression trees according to multiple characteristics of each training sample in the training sample set.
- the constructing multiple classification regression trees according to multiple characteristics of each training sample of the training sample set may include:
- The optimal segmentation feature and segmentation point can be determined according to the following objective function:
- min_{j,s} [ min_{c_1} Σ_{x_i ∈ R_1(j,s)} (y_i − c_1)² + min_{c_2} Σ_{x_i ∈ R_2(j,s)} (y_i − c_2)² ]
- The above formula means traversing all feature values (i.e., candidate segmentation points s) of the K features (i.e., candidate segmentation features j) of the samples to be classified, and finding the optimal segmentation feature and segmentation point according to the minimum squared-error criterion, where x_i is the i-th training sample among the samples to be classified and y_i is the label of x_i.
- R_1(j, s) = {x | x^(j) ≤ s} is the set of samples to be classified whose value of the j-th feature is less than or equal to s, and R_2(j, s) = {x | x^(j) > s} is the set of samples to be classified whose value of the j-th feature is greater than s. The optimal c_1 and c_2 are the label averages over the two subsets, c_1 = (1/N_1) Σ_{x_i ∈ R_1} y_i and c_2 = (1/N_2) Σ_{x_i ∈ R_2} y_i, where N_1 is the number of samples to be classified in the subset R_1 and N_2 is the number of samples to be classified in the subset R_2.
- When the preset stopping condition is satisfied, the division stops and a classification regression tree is obtained.
- the root node of the classification regression tree corresponds to the initial sample to be classified, and each leaf node of the classification regression tree corresponds to a subset that is no longer divided.
- The output of the classification regression tree is the output corresponding to a leaf node, that is, the average value of the labels of the samples to be classified that fall into that leaf node.
- the generating module 204 is configured to generate a random forest according to the multiple classification regression trees.
- Multiple classification regression trees are formed into the random forest, and different classification regression trees are independent of each other.
- the input of the random forest is the input of each classification regression tree in the random forest; the output of the random forest is the average value of the outputs of all classification regression trees in the random forest.
- the generating a random forest according to the multiple classification regression trees includes:
- the random forest is generated according to the multiple classification regression trees after the pruning process.
- Pruning the multiple classification regression trees includes:
- For each internal node t of a classification regression tree, compute g(t) = (C(t) − C(T_t)) / (|T_t| − 1), where T_t represents the subtree with t as its root node, C(T_t) is the prediction error obtained from the samples to be classified that fall into the subtree T_t, C(t) is the prediction error obtained from the samples to be classified that fall into the node t when the subtree is collapsed into a single leaf, and |T_t| is the number of leaf nodes of the subtree T_t; the subtree whose g(t) is smallest is pruned, and repeating this yields a subtree sequence T_0, T_1, …, T_n;
- The cross-validation method is then used to select the optimal subtree T_α from the subtree sequence T_0, T_1, …, T_n.
- the recognition module 205 is configured to input multiple features of the user to be recognized into the random forest, and determine the emotion category of the user to be recognized according to the output of the random forest, wherein the multiple features of the user to be recognized are obtained from the acceleration time series of the user to be recognized walking.
- Each classification regression tree in the random forest takes the multiple features of the user to be identified as input and classifies the user to be identified according to those features, yielding the output of that classification regression tree; the average value of the outputs of all classification regression trees in the random forest is then calculated to obtain the output of the random forest, and the emotion category of the user to be identified is determined according to the output of the random forest.
- The emotion category whose label value is closest to the output of the random forest may be selected as the emotion category of the user to be identified.
- The user to be identified may be among the users corresponding to the training samples.
- the training sample set includes training samples of user A, and the user to be identified is user A.
- the training sample set includes training samples of user A, user B, user C, and user D, and the user to be identified is user A.
- The user to be identified may also not be among the users corresponding to the training samples.
- the training sample set includes training samples of user A, user B, user C, and user D, and the user to be identified is user E.
- the emotion recognition device 20 of the second embodiment uses the acceleration time series of the user walking with emotion category tags as training samples, generates a random forest according to each training sample, and uses the random forest to recognize the acceleration time series of the user to be identified.
- the second embodiment realizes the recognition of the user's emotion according to the acceleration data of the user during walking.
- This embodiment provides a storage medium that stores computer-readable instructions. When the computer-readable instructions are executed by a processor, the steps in the foregoing embodiment of the emotion recognition method are implemented, such as steps 101-105 shown in FIG. 1:
- each training sample in the training sample set is a time series of a user's walking acceleration, and each training sample has a label, and the label marks the emotion category corresponding to the training sample;
- the obtaining module 201 is configured to obtain a training sample set, each training sample in the training sample set is a time series of the acceleration of the user's walking, each training sample has a label, and the label marks the emotion category corresponding to the training sample;
- the extraction module 202 is configured to extract multiple features for each training sample in the training sample set
- the construction module 203 is configured to construct multiple classification regression trees according to multiple characteristics of each training sample in the training sample set;
- a generating module 204 configured to generate a random forest according to the multiple classification regression trees
- the recognition module 205 is configured to input multiple features of the user to be recognized into the random forest, and determine the emotion category of the user to be recognized according to the output of the random forest, wherein the multiple features of the user to be recognized are obtained from the acceleration time series of the user to be recognized walking.
- FIG. 3 is a schematic diagram of a computer device provided in Embodiment 4 of this application.
- the computer device 30 includes a memory 301, a processor 302, and a computer program 303 stored in the memory 301 and running on the processor 302, such as an emotion recognition program.
- When the processor 302 executes the computer program 303, the steps in the foregoing embodiment of the emotion recognition method are implemented, for example, steps 101-105 shown in FIG. 1:
- each training sample in the training sample set is a time series of a user's walking acceleration, and each training sample has a label, and the label marks the emotion category corresponding to the training sample;
- Alternatively, when the processor 302 executes the computer program 303, the functions of the modules in the above device embodiment are realized, for example, modules 201-205 in FIG. 2:
- the obtaining module 201 is configured to obtain a training sample set, each training sample in the training sample set is a time series of the acceleration of the user's walking, each training sample has a label, and the label marks the emotion category corresponding to the training sample;
- the extraction module 202 is configured to extract multiple features for each training sample in the training sample set
- the construction module 203 is configured to construct multiple classification regression trees according to multiple characteristics of each training sample in the training sample set;
- a generating module 204 configured to generate a random forest according to the multiple classification regression trees
- the recognition module 205 is configured to input multiple features of the user to be recognized into the random forest, and determine the emotion category of the user to be recognized according to the output of the random forest, wherein the multiple features of the user to be recognized are obtained from the acceleration time series of the user to be recognized walking.
- the computer program 303 may be divided into one or more modules, and the one or more modules are stored in the memory 301 and executed by the processor 302 to complete the method.
- the one or more modules may be a series of computer-readable instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer program 303 in the computer device 30.
- The computer program 303 may be divided into the acquisition module 201, the extraction module 202, the construction module 203, the generation module 204, and the identification module 205 in FIG. 2.
- For the specific functions of each module, refer to the second embodiment.
- The computer device 30 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server.
- The schematic diagram in FIG. 3 is only an example of the computer device 30 and does not constitute a limitation on the computer device 30; the computer device may include more or fewer components than those shown in the figure, combine certain components, or have a different arrangement of components.
- the computer device 30 may also include input and output devices, network access devices, buses, etc.
- The so-called processor 302 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
- the general-purpose processor can be a microprocessor or the processor 302 can also be any conventional processor, etc.
- The processor 302 is the control center of the computer device 30 and uses various interfaces and lines to connect the various parts of the entire computer device 30.
- The memory 301 may be used to store the computer program 303; the processor 302 implements the various functions of the computer device 30 by running or executing the computer program or modules stored in the memory 301 and calling data stored in the memory 301.
- the memory 301 may mainly include a program storage area and a data storage area.
- The program storage area may store an operating system, an application program required by at least one function (such as a sound playback function or an image playback function), and the like; the data storage area may store data created according to the use of the computer device 30 (such as audio data) and the like.
- The memory 301 may include volatile and non-volatile memory, such as a hard disk, memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), at least one magnetic disk storage device, a flash memory device, or another storage device.
- If the integrated modules of the computer device 30 are implemented in the form of software functional modules and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the present application implements all or part of the processes in the above-mentioned embodiments and methods, which can also be completed by instructing the relevant hardware through a computer program.
- The computer program can be stored in a storage medium, and when the computer program is executed by the processor, the steps of the foregoing method embodiments can be implemented.
- the computer program includes computer readable instruction code, and the computer readable instruction code may be in the form of source code, object code, executable file, or some intermediate form.
- The computer-readable storage medium may include: any entity or device capable of carrying the computer-readable instruction code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a memory, a read-only memory (ROM), a random access memory (RAM), etc.
- the computer-readable storage medium may be non-volatile or volatile.
- modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical modules, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
- each functional module in each embodiment of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module.
- the above-mentioned integrated modules can be implemented in the form of hardware, or in the form of hardware plus software functional modules.
- the above-mentioned integrated modules implemented in the form of software functional modules may be stored in a computer-readable storage medium.
- The above-mentioned software functional modules are stored in a storage medium and include several instructions to make a computer device (which may be a personal computer, a server, a network device, etc.) or a processor execute part of the steps of the methods described in each embodiment of the present application.
Claims (20)
- An emotion recognition method, wherein the method comprises: obtaining a training sample set, each training sample in the training sample set being a time series of the acceleration of a user walking, each training sample carrying a label, the label marking the emotion category corresponding to the training sample; extracting multiple features for each training sample in the training sample set; constructing multiple classification regression trees according to the multiple features of each training sample in the training sample set; generating a random forest according to the multiple classification regression trees; and inputting multiple features of a user to be identified into the random forest, and determining the emotion category of the user to be identified according to the output of the random forest, wherein the multiple features of the user to be identified are obtained from the acceleration time series of the user to be identified walking.
- The emotion recognition method according to claim 1, wherein the multiple features include any combination of the following: the standard deviation, average value, peak value, skewness coefficient, FFT coefficients, power spectral density average, power spectral density standard deviation, and coordinate axis coefficient of the acceleration time series.
- The emotion recognition method according to claim 1, wherein the method further comprises: normalizing the multiple features of each training sample in the training sample set to obtain multiple normalized features of each training sample; and the constructing multiple classification regression trees according to the multiple features of each training sample in the training sample set comprises: constructing the multiple classification regression trees according to the multiple normalized features of each training sample.
- The emotion recognition method according to claim 1, wherein before the extracting multiple features for each training sample in the training sample set, the method further comprises: performing noise reduction on the training samples; and/or filling in missing values in the training samples; and/or correcting outliers in the training samples.
- The emotion recognition method according to claim 1, wherein the constructing multiple classification regression trees according to the multiple features of each training sample in the training sample set comprises: randomly selecting Q training samples from the training sample set as samples to be classified; randomly selecting K features from the multiple features of the samples to be classified; determining the optimal segmentation feature and segmentation point among the K features of the samples to be classified, and dividing the samples to be classified into two subsets according to the optimal segmentation feature and segmentation point; calculating the average value of the labels of the samples to be classified in each divided subset; and, for each divided subset, repeating the steps from the randomly selecting K features to the calculating the average value of the labels of the samples to be classified in each divided subset, until a preset stopping condition is satisfied.
- The emotion recognition method according to claim 1, wherein the generating a random forest according to the multiple classification regression trees comprises: pruning the multiple classification regression trees; and generating the random forest according to the multiple pruned classification regression trees.
- A computer device, wherein the computer device comprises a processor configured to execute computer-readable instructions stored in a memory to implement the following steps: obtaining a training sample set, each training sample in the training sample set being a time series of the acceleration of a user walking, each training sample carrying a label, the label marking the emotion category corresponding to the training sample; extracting multiple features for each training sample in the training sample set; constructing multiple classification regression trees according to the multiple features of each training sample in the training sample set; generating a random forest according to the multiple classification regression trees; and inputting multiple features of a user to be identified into the random forest, and determining the emotion category of the user to be identified according to the output of the random forest, wherein the multiple features of the user to be identified are obtained from the acceleration time series of the user to be identified walking.
- The computer device according to claim 8, wherein the multiple features include any combination of the following: the standard deviation, average value, peak value, skewness coefficient, FFT coefficients, power spectral density average, power spectral density standard deviation, and coordinate axis coefficient of the acceleration time series.
- The computer device according to claim 8, wherein the processor executes the computer-readable instructions to further implement the following steps: normalizing the multiple features of each training sample in the training sample set to obtain multiple normalized features of each training sample; and when the processor executes the computer-readable instructions to implement the constructing multiple classification regression trees according to the multiple features of each training sample in the training sample set, the steps specifically comprise: constructing the multiple classification regression trees according to the multiple normalized features of each training sample.
- The computer device according to claim 8, wherein before the processor executes the computer-readable instructions to implement the extracting multiple features for each training sample in the training sample set, the processor further implements the following steps: performing noise reduction on the training samples; and/or filling in missing values in the training samples; and/or correcting outliers in the training samples.
- The computer device according to claim 8, wherein when the processor executes the computer-readable instructions to implement the constructing multiple classification regression trees according to the multiple features of each training sample in the training sample set, the steps specifically comprise: randomly selecting Q training samples from the training sample set as samples to be classified; randomly selecting K features from the multiple features of the samples to be classified; determining the optimal segmentation feature and segmentation point among the K features of the samples to be classified, and dividing the samples to be classified into two subsets according to the optimal segmentation feature and segmentation point; calculating the average value of the labels of the samples to be classified in each divided subset; and, for each divided subset, repeating the steps from the randomly selecting K features to the calculating the average value of the labels of the samples to be classified in each divided subset, until a preset stopping condition is satisfied.
- The computer device according to claim 8, wherein when the processor executes the computer-readable instructions to implement the generating a random forest according to the multiple classification regression trees, the steps specifically comprise: pruning the multiple classification regression trees; and generating the random forest according to the multiple pruned classification regression trees.
- A storage medium, the storage medium storing computer-readable instructions, wherein, when the computer-readable instructions are executed by a processor, the following steps are implemented: obtaining a training sample set, each training sample in the training sample set being a time series of the acceleration of a user walking, each training sample carrying a label, the label marking the emotion category corresponding to the training sample; extracting multiple features for each training sample in the training sample set; constructing multiple classification regression trees according to the multiple features of each training sample in the training sample set; generating a random forest according to the multiple classification regression trees; and inputting multiple features of a user to be identified into the random forest, and determining the emotion category of the user to be identified according to the output of the random forest, wherein the multiple features of the user to be identified are obtained from the acceleration time series of the user to be identified walking.
- The storage medium according to claim 15, wherein the computer-readable instructions, when executed by the processor, further implement the following steps: normalizing the multiple features of each training sample in the training sample set to obtain multiple normalized features of each training sample; and when the computer-readable instructions are executed by the processor to implement the constructing multiple classification regression trees according to the multiple features of each training sample in the training sample set, the steps specifically comprise: constructing the multiple classification regression trees according to the multiple normalized features of each training sample.
- The storage medium according to claim 15, wherein before the computer-readable instructions are executed by the processor to implement the extracting multiple features for each training sample in the training sample set, the following steps are further implemented: performing noise reduction on the training samples; and/or filling in missing values in the training samples; and/or correcting outliers in the training samples.
- The storage medium according to claim 15, wherein when the computer-readable instructions are executed by the processor to implement the constructing multiple classification regression trees according to the multiple features of each training sample in the training sample set, the steps specifically comprise: randomly selecting Q training samples from the training sample set as samples to be classified; randomly selecting K features from the multiple features of the samples to be classified; determining the optimal segmentation feature and segmentation point among the K features of the samples to be classified, and dividing the samples to be classified into two subsets according to the optimal segmentation feature and segmentation point; calculating the average value of the labels of the samples to be classified in each divided subset; and, for each divided subset, repeating the steps from the randomly selecting K features to the calculating the average value of the labels of the samples to be classified in each divided subset, until a preset stopping condition is satisfied.
- An emotion recognition device, wherein the device comprises: an obtaining module, configured to obtain a training sample set, each training sample in the training sample set being a time series of the acceleration of a user walking, each training sample carrying a label, the label marking the emotion category corresponding to the training sample; an extraction module, configured to extract multiple features for each training sample in the training sample set; a construction module, configured to construct multiple classification regression trees according to the multiple features of each training sample in the training sample set; a generation module, configured to generate a random forest according to the multiple classification regression trees; and a recognition module, configured to input multiple features of a user to be identified into the random forest and determine the emotion category of the user to be identified according to the output of the random forest, wherein the multiple features of the user to be identified are obtained from the acceleration time series of the user to be identified walking.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910775783.3 | 2019-08-21 | ||
CN201910775783.3A CN110705584A (zh) | 2019-08-21 | 2019-08-21 | Emotion recognition method and device, computer device and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021031817A1 (zh) | 2021-02-25 |
Family
ID=69193369
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/105630 (WO2021031817A1) | Emotion recognition method and device, computer device and storage medium | 2019-08-21 | 2020-07-29 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110705584A (zh) |
WO (1) | WO2021031817A1 (zh) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110705584A (zh) | 2019-08-21 | 2020-01-17 | 深圳壹账通智能科技有限公司 | Emotion recognition method and device, computer device and storage medium |
CN111643098A (zh) | 2020-06-09 | 2020-09-11 | 深圳大学 | Gait recognition and emotion perception method and system based on an intelligent acoustic device |
CN111881972B (zh) | 2020-07-24 | 2023-11-07 | 腾讯音乐娱乐科技(深圳)有限公司 | Black-market user identification method and device, server, and storage medium |
CN114334090B (zh) | 2022-03-02 | 2022-07-12 | 博奥生物集团有限公司 | Data analysis method, device, and electronic device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120308971A1 (en) * | 2011-05-31 | 2012-12-06 | Hyun Soon Shin | Emotion recognition-based bodyguard system, emotion recognition device, image and sensor control apparatus, personal protection management apparatus, and control methods thereof |
CN109447324A (zh) * | 2018-09-30 | 2019-03-08 | 深圳个人数据管理服务有限公司 | Behavioral activity prediction method, device and equipment, and emotion prediction method |
CN109492682A (zh) * | 2018-10-30 | 2019-03-19 | 桂林电子科技大学 | Multi-branch random forest data classification method |
CN110705584A (zh) * | 2019-08-21 | 2020-01-17 | 深圳壹账通智能科技有限公司 | Emotion recognition method and device, computer device and storage medium |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103400123A (zh) * | 2013-08-21 | 2013-11-20 | 山东师范大学 | Gait type identification method based on a tri-axial acceleration sensor and neural network |
US9786299B2 (en) * | 2014-12-04 | 2017-10-10 | Microsoft Technology Licensing, Llc | Emotion type classification for interactive dialog system |
CN105306703A (zh) * | 2015-09-30 | 2016-02-03 | 西安沧海网络科技有限公司 | Smartphone-based wearable emotion recognition device |
CN106097360A (zh) * | 2016-06-17 | 2016-11-09 | 中南大学 | Strip steel surface defect recognition method and device |
JP7083809B2 (ja) * | 2016-08-02 | 2022-06-13 | アトラス5ディー, インコーポレイテッド | Systems and methods for identifying persons and/or identifying and quantifying pain, fatigue, mood, and intent, with privacy protection |
CN107220591A (zh) * | 2017-04-28 | 2017-09-29 | 哈尔滨工业大学深圳研究生院 | Multimodal intelligent emotion perception system |
CN107582037A (zh) * | 2017-09-30 | 2018-01-16 | 深圳前海全民健康科技有限公司 | Method for designing medical products based on pulse waves |
CN109846496B (zh) * | 2017-11-30 | 2022-06-10 | 昆山光微电子有限公司 | Hardware implementation method and combination for the emotion perception function of smart wearable devices |
CN108537123A (zh) * | 2018-03-08 | 2018-09-14 | 四川大学 | Electrocardiogram recognition method based on multi-feature extraction |
CN109255391B (zh) * | 2018-09-30 | 2021-07-23 | 武汉斗鱼网络科技有限公司 | Method, device, and storage medium for identifying malicious users |
CN109480780B (zh) * | 2018-11-14 | 2021-08-24 | 重庆三峡医药高等专科学校 | Evaluation method and system for a stroke early-warning system |
CN109933782B (zh) * | 2018-12-03 | 2023-11-28 | 创新先进技术有限公司 | User emotion prediction method and device |
- 2019-08-21: CN application CN201910775783.3A filed (published as CN110705584A); status: active, pending
- 2020-07-29: PCT application PCT/CN2020/105630 filed (published as WO2021031817A1); status: active, application filing
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113569482A (zh) * | 2021-07-29 | 2021-10-29 | 石家庄铁道大学 | Evaluation method, device, terminal, and storage medium for tunnel service performance |
CN113569482B (zh) * | 2021-07-29 | 2024-02-06 | 石家庄铁道大学 | Evaluation method, device, terminal, and storage medium for tunnel service performance |
CN115919313A (zh) * | 2022-11-25 | 2023-04-07 | 合肥工业大学 | Facial electromyography emotion recognition method based on spatiotemporal features |
CN115919313B (zh) * | 2022-11-25 | 2024-04-19 | 合肥工业大学 | Facial electromyography emotion recognition method based on spatiotemporal features |
Also Published As
Publication number | Publication date |
---|---|
CN110705584A (zh) | 2020-01-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021031817A1 (zh) | Emotion recognition method and device, computer device and storage medium | |
Xia et al. | Multi-stage feature constraints learning for age estimation | |
US10776470B2 (en) | Verifying identity based on facial dynamics | |
CN106295313B (zh) | Object identity management method, device, and electronic equipment | |
WO2019119505A1 (zh) | Face recognition method and device, computer device, and storage medium | |
JP6517681B2 (ja) | Video pattern learning device, method, and program | |
WO2016110005A1 (zh) | Multimodal face recognition device and method based on multi-layer fusion of grayscale and depth information | |
WO2022105118A1 (zh) | Image-based health status recognition method, device, equipment, and storage medium | |
CN109145766A (zh) | Model training method and device, recognition method, electronic equipment, and storage medium | |
WO2015070764A1 (zh) | Face positioning method and device | |
CN113313053B (zh) | Image processing method, device, equipment, medium, and program product | |
Santhalingam et al. | Sign language recognition analysis using multimodal data | |
CN108985133B (zh) | Age prediction method and device for face images | |
Yi et al. | Multi-modal learning for affective content analysis in movies | |
CN111401339A (zh) | Method, device, and electronic equipment for recognizing the age of a person in a face image | |
WO2023151237A1 (zh) | Face pose estimation method, device, electronic equipment, and storage medium | |
Sayed | Biometric Gait Recognition Based on Machine Learning Algorithms. | |
CN111340213A (zh) | Neural network training method, electronic equipment, and storage medium | |
CN114519401B (zh) | Image classification method and device, electronic equipment, and storage medium | |
CN110516638B (zh) | Sign language recognition method based on trajectories and random forests | |
CN113987188B (zh) | Short text classification method, device, and electronic equipment | |
CN111666976A (zh) | Feature fusion method, device, and storage medium based on attribute information | |
Verma et al. | Estimation of sex through morphometric landmark indices in facial images with strength of evidence in logistic regression analysis | |
CN111753583A (zh) | Recognition method and device | |
Travieso et al. | Using a discrete Hidden Markov Model Kernel for lip-based biometric identification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20854168; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 20854168; Country of ref document: EP; Kind code of ref document: A1 |
| 32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 25.08.22) |