CN108256307B - Hybrid enhanced intelligent cognitive method of intelligent business travel motor home - Google Patents
- Publication number: CN108256307B (application CN201810030098.3A)
- Authority
- CN
- China
- Prior art keywords
- driver
- passenger
- gesture
- intelligent
- sojourn
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06F21/32 — User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
- G06F3/017 — Gesture based interaction, e.g. based on a set of recognized hand gestures
- G06V20/597 — Recognising the driver's state or behaviour, e.g. attention or drowsiness
- G06V40/171 — Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
- G06V40/19 — Eye characteristics, e.g. of the iris; Sensors therefor
- G10L17/00 — Speaker identification or verification techniques
- G10L17/02 — Preprocessing operations, e.g. segment selection; Pattern representation or modelling; Feature selection or extraction
- G10L17/04 — Training, enrolment or model building
Abstract
The invention relates to a hybrid enhanced intelligent cognition method for an intelligent business sojourn motor home, which specifically comprises the following steps: S1: the driver and passengers communicate with the vehicle-mounted electronic equipment, and the dialogue state between the user and the equipment is tracked; S2: the identities of the driver and passengers are authenticated from the voiceprint information collected during tracking; S3: the behavioral intentions of the driver and passengers are analyzed; S4: face recognition is performed on the driver and passengers for identity authentication and fatigue monitoring; S5: gesture recognition is performed on the driver and passengers; S6: an overall analysis and recognition result is obtained. The invention introduces human action and a human cognitive model into the business sojourn motor home to form a stronger form of intelligence, improves the machine's ability to understand and adapt to the internal and external environments of the motor home, completes complex spatio-temporally correlated tasks, and enhances the functional and spatial experience of the motor home.
Description
Technical Field
The invention belongs to the field of hybrid enhanced intelligent cognition within artificial intelligence, and relates to a hybrid enhanced intelligent cognition method for an intelligent business sojourn motor home.
Background
At present, China has entered an era of nationwide self-driving travel. Business sojourn motor-home tourism is increasingly popular with consumers, and the government is vigorously developing the motor-home industry to drive coordinated development of the tourism and automobile industries. Meanwhile, the development of China's automobile industry is also moving into the era of intelligent connected vehicles. The sojourn car is a product combining the intelligent connected automobile with the smart home; it embodies the entry of artificial intelligence into human social life and realizes the deep integration of technology with daily life. In the intelligentization of a business sojourn motor home, the recognition, reasoning and cognition of the information carried by each modality of the on-board system is one of the core problems to be solved.
Disclosure of Invention
In view of the above, the present invention provides a hybrid enhanced intelligent cognition method for an intelligent business sojourn motor home. By constructing a cross-media unified semantic expression of the motor home's multi-dimensional intelligent space, it introduces human actions and a human cognitive model into the motor home to form a stronger form of intelligence, improving the machine's understanding of and adaptation to the motor home's internal and external environments, completing complex spatio-temporally correlated tasks, and enhancing the functional and spatial experience of the motor home.
In order to achieve the purpose, the invention provides the following technical scheme:
A hybrid enhanced intelligent cognition method for an intelligent business sojourn motor home specifically comprises the following steps:
S1: the driver and passengers communicate with the vehicle-mounted electronic equipment, and the dialogue state between the user and the equipment is tracked;
S2: the identities of the driver and passengers are authenticated from the voiceprint information collected during tracking;
S3: the behavioral intentions of the driver and passengers are analyzed;
S4: face recognition is performed on the driver and passengers for identity authentication and fatigue monitoring;
S5: gesture recognition is performed on the driver and passengers;
S6: an overall analysis and recognition result is obtained.
Further, step S1 specifically includes:
s11: converting the voice of the driver and the crew into text data through a voice recognition engine, and performing spelling error correction on the text data;
S12: performing word segmentation on the corrected text data to obtain a word sequence, and obtaining word vectors through Word2Vec;
s13: processing the word vectors through a cascade convolution neural network to obtain a conversation scene type;
s14: constructing a deep reinforcement learning network, and iteratively finishing reinforcement learning of a dialogue state behavior strategy through two independent deep reinforcement learning networks;
s15: constructing a semantic knowledge graph through the triples, and calculating the upper and lower bounds of the association score and the embedding cost of the knowledge atoms in the semantic knowledge graph in real time to obtain a knowledge query result;
s16: and generating corresponding spectral parameters and fundamental frequency by adopting a multi-space probability distribution HMM parameter generation algorithm to generate a smooth acoustic feature sequence, and submitting the acoustic feature sequence to a synthesizer to generate final voice.
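A minimal sketch of steps S12-S13 (word vectors feeding a dialogue-scene classifier): the vocabulary, two-dimensional vectors and scene centroids below are illustrative stand-ins for Word2Vec output and the cascade convolutional network, not values from the patent.

```python
import numpy as np

# Toy stand-ins for Word2Vec vectors (S12) and learned scene prototypes (S13).
WORD_VECS = {
    "navigate": np.array([1.0, 0.0]), "route": np.array([0.9, 0.1]),
    "music":    np.array([0.0, 1.0]), "song":  np.array([0.1, 0.9]),
}
SCENE_CENTROIDS = {"navigation": np.array([0.95, 0.05]),
                   "entertainment": np.array([0.05, 0.95])}

def sentence_vector(words):
    """Average the vectors of known words (a crude S12 encoding)."""
    vecs = [WORD_VECS[w] for w in words if w in WORD_VECS]
    return np.mean(vecs, axis=0) if vecs else np.zeros(2)

def scene_type(words):
    """Nearest-centroid stand-in for the cascade CNN of S13."""
    v = sentence_vector(words)
    return min(SCENE_CENTROIDS,
               key=lambda s: np.linalg.norm(v - SCENE_CENTROIDS[s]))

print(scene_type(["navigate", "route"]))  # navigation
```

In the patent's actual pipeline the word vectors would be fusion-coded and classified by the cascade convolutional neural network rather than by centroid distance.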
Further, step S2 specifically includes:
s21: carrying out pre-emphasis, framing and windowing and end point detection processing on the voice information of the drivers and passengers which is tracked and collected;
s22: fourier transform is carried out on the processed voice information to obtain frequency spectrum energy distribution, a triangular filter bank is adopted to carry out critical band division, and amplitude weighting calculation and discrete cosine transform are carried out to obtain cepstrum coefficients;
S23: inputting the cepstrum coefficients into a universal background model (UBM) for speaker identification to obtain the voiceprint features;
s24: and matching the voiceprint templates and judging whether the voiceprint information corresponds to the voiceprint template.
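Steps S21-S22 describe a standard cepstral-coefficient (MFCC-style) front end. The sketch below implements it with common default constants (25 ms frames, 10 ms hop, 26 triangular mel filters, 13 coefficients) that are assumptions rather than values from the patent, and omits endpoint detection.

```python
import numpy as np

def mfcc_like(signal, sr=16000, n_fft=512, n_filt=26, n_ceps=13):
    """Sketch of S21-S22: pre-emphasis, framing/windowing, FFT power
    spectrum, triangular (mel-spaced) filter bank, DCT -> cepstra."""
    # S21: pre-emphasis
    emph = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # S21: 25 ms frames with 10 ms hop, Hamming window
    flen, hop = int(0.025 * sr), int(0.010 * sr)
    n_frames = 1 + max(0, (len(emph) - flen) // hop)
    frames = np.stack([emph[i * hop:i * hop + flen] for i in range(n_frames)])
    frames = frames * np.hamming(flen)
    # S22: Fourier transform -> spectral energy distribution
    pspec = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # S22: triangular filter bank dividing critical (mel) bands
    mel = np.linspace(0, 2595 * np.log10(1 + (sr / 2) / 700), n_filt + 2)
    hz = 700 * (10 ** (mel / 2595) - 1)
    bins = np.floor((n_fft + 1) * hz / sr).astype(int)
    fbank = np.zeros((n_filt, n_fft // 2 + 1))
    for m in range(1, n_filt + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    # S22: amplitude weighting -> log filter-bank energies
    feat = np.log(pspec @ fbank.T + 1e-10)
    # S22: orthonormal DCT-II -> cepstral coefficients
    basis = np.cos(np.pi / n_filt * np.outer(np.arange(n_filt),
                                             np.arange(n_filt) + 0.5))
    ceps = feat @ basis.T * np.sqrt(2.0 / n_filt)
    ceps[:, 0] /= np.sqrt(2.0)
    return ceps[:, :n_ceps]
```

A one-second 16 kHz signal yields 98 frames of 13 coefficients with these settings.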
Further, step S24 specifically includes:
S241: defining the energy function
E(v, h | θ) = −Σ_{i=1..n} a_i v_i − Σ_{j=1..m} b_j h_j − Σ_{i=1..n} Σ_{j=1..m} v_i W_ij h_j,
where h and v are vectors representing the states of the hidden and visible layers, a and b represent the biases of the visible and hidden layers, v_i and h_j represent the states of the i-th visible-layer node and the j-th hidden-layer node, n and m are the numbers of visible- and hidden-layer nodes, W_ij is the connection weight between the visible and hidden layers, a_i is the i-th visible-layer bias, and b_j is the j-th hidden-layer bias;
S242: given the model θ = {W_ij, a_i, b_j}, obtaining the joint probability distribution of the states (v, h):
P(v, h | θ) = e^{−E(v, h | θ)} / Z(θ), where Z(θ) = Σ_{v,h} e^{−E(v, h | θ)} is the normalization factor;
s243: obtaining two types of high-dimensional Gaussian supervectors of RBM-i-vector by the energy evolution calculation of RBM, and performing channel compensation by adopting linear discriminant analysis;
s244: and performing cosine similarity calculation on the two types of compensated high-dimensional Gaussian supervectors, comparing the two types of compensated high-dimensional Gaussian supervectors with a preset threshold value, and judging whether the voiceprint information corresponds to the voiceprint information or not.
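The RBM energy, joint distribution and cosine-similarity decision of S241-S244 can be sketched directly. The brute-force partition function is only feasible for toy layer sizes, and the 0.8 decision threshold is an illustrative assumption, not a value from the patent.

```python
import numpy as np
from itertools import product

def rbm_energy(v, h, W, a, b):
    """S241: E(v,h|theta) = -a.v - b.h - v.W.h."""
    return -(a @ v) - (b @ h) - (v @ W @ h)

def joint_prob(v, h, W, a, b):
    """S242: P(v,h|theta) = exp(-E)/Z, with Z enumerated over all
    binary states (toy sizes only)."""
    Z = sum(np.exp(-rbm_energy(np.array(vv, float), np.array(hh, float),
                               W, a, b))
            for vv in product([0, 1], repeat=len(v))
            for hh in product([0, 1], repeat=len(h)))
    return np.exp(-rbm_energy(v, h, W, a, b)) / Z

def same_speaker(sv1, sv2, threshold=0.8):
    """S244: cosine similarity of two supervectors vs. a preset
    threshold (the 0.8 default is an assumption)."""
    cos = sv1 @ sv2 / (np.linalg.norm(sv1) * np.linalg.norm(sv2))
    return cos >= threshold
```

With all weights and biases zero, every (v, h) configuration is equally likely, which gives a quick sanity check on the normalization.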
Further, step S3 specifically includes:
s31: constructing a depth cascade network with head and shoulder recognition, and intercepting a series of candidate image blocks according to a preset step length by adopting a multi-scale sliding window for each frame of image to form a sample to be recognized;
s32: inputting a sample to be recognized into a trained head-shoulder/non-head-shoulder recognition model for recognition and classification;
s33: introducing a nonlinear model and an appearance model for correlation analysis;
s34: and detecting the attitude coincidence degree and comparing thresholds.
Further, step S34 specifically includes:
s341: fusing continuously detected passenger posture frames into a complete action;
S342: designing a two-pose fusion rule
f(i, j) = 1 if S_IoU > T_IoU and S_his > T_his and |t_1 − t_2| < ΔT, and f(i, j) = 0 otherwise,
where f(i, j) is the fusion function (1 means the two poses can be fused, 0 means they cannot), S_IoU is the overlap ratio of the two pose detection frames, T_IoU is the overlap-ratio threshold, S_his is the histogram matching score within the two detection boxes, T_his is the histogram matching threshold, t_1 and t_2 are the times of the two poses, and ΔT is the two-pose time-difference threshold.
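The fusion rule of S342 reduces to a few threshold comparisons. The sketch below uses the example threshold values given later in the embodiment (0.5, 35, 25); the IoU helper assumes (x1, y1, x2, y2) boxes, which is a representational assumption.

```python
def iou(box1, box2):
    """Overlap ratio S_IoU of two detection boxes (x1, y1, x2, y2)."""
    x1, y1 = max(box1[0], box2[0]), max(box1[1], box2[1])
    x2, y2 = min(box1[2], box2[2]), min(box1[3], box2[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    a1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    a2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    return inter / (a1 + a2 - inter)

def can_fuse(s_iou, s_his, t1, t2, t_iou=0.5, t_his=35, dt=25):
    """Two-pose fusion rule f(i, j) of S342: fuse (return 1) only when
    overlap, histogram score and time gap all pass their thresholds.
    Defaults follow the embodiment's example values."""
    return 1 if (s_iou > t_iou and s_his > t_his and abs(t1 - t2) < dt) else 0
```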
Further, the identification of the identity of the driver and the passenger in step S4 is specifically as follows:
s401: roughly positioning the human face through radial symmetric transformation;
S402: obtaining the optimal iterative vector from the current point to the target point by supervised descent method (SDM) learning, and establishing a linear regression model between the shape offset Δx = x* − x and the features φ(x) of the current shape x: Δx = R·φ(x) + b;
S403: iterating with the current shape x and the offset Δx to obtain the expected position vector:
x := x + Δx,
where x represents the expected position vector;
S404: constructing the SDM learning target from the true deviation of the i-th point from the actual boundary point:
min Σ_i || Δx_i^k − R_k φ(x_i^k) − b_k ||²,
where k is the number of iterations, x^k is the shape vector at the k-th iteration, x_i^k is the coordinate of the i-th point in the shape vector, Δx_i^k is the coordinate offset of the i-th point, and b_k is the bias at the k-th iteration;
S405: accurately locating the face organs by the integro-differential operator
max_{(a,b,r)} | G_σ(r) * ∂/∂r ∮_{(a,b,r)} I(x, y) / (2πr) ds |,
where G_σ(r) is a smoothing function, I(x, y) is the image gray-level matrix, (a, b) is the circle center, and r is the radius;
s406: fitting the human face organs in the target area with the characteristic points to obtain characteristic point mark positions;
S407: intercepting a sub-image in the neighborhood of each feature point to obtain the neighborhood features of the face organs, and concatenating the neighborhood features of all feature points to form the extreme-learning feature of the feature points;
S408: collecting the extreme-learning features of each image partition as the training set of a generalized single-hidden-layer feedforward neural network, training an extreme learning machine, retrieving the identity label of the specific person matching the fused features, and completing identity recognition;
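The cascaded-regression update of S402-S403 can be sketched generically. Here the feature function φ and the per-stage regressor/bias pairs (R_k, b_k) are supplied by the caller as placeholders; the patent learns them by linear regression on training data.

```python
import numpy as np

def sdm_align(x0, features, regressors):
    """Sketch of S402-S403: cascaded regression x := x + R_k*phi(x) + b_k.
    `features` maps a shape vector to its feature vector phi(x);
    `regressors` is a list of learned (R_k, b_k) pairs."""
    x = np.asarray(x0, dtype=float)
    for R, b in regressors:
        dx = R @ features(x) + b   # predicted shape offset delta-x (S402)
        x = x + dx                 # S403: step toward the target shape
    return x
```

For example, with φ(x) = x* − x and R_k = 0.5·I, each stage halves the remaining error, so the iterate converges geometrically to the target shape x*.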
the fatigue monitoring of the driver and the passengers in the step S4 is specifically as follows:
S411: building a 3D face model of the driver using a 3D face modeling method, and tracking the driver's head pose in real time according to the face recognition method of S406-S408;
s412: solving the positions of the eyes in the 2D face image by using the positions of the eyes and the head posture in the 3D face model;
S413: locating feature points in the eye region with a constrained local model (CLM) point-detection algorithm, and verifying the located feature points using face-image texture normalization;
s414: positioning the center of the iris according to the physiological structure characteristics of the iris;
s415: positioning the upper eyelid and the lower eyelid according to the parameterized template, and extracting the eye movement of the driver and the crew;
s416: respectively extracting fatigue characteristics related to the opening and closing degree of the eyelids, the opening and closing speed of the eyes and the motion characteristics of the iris according to the eye movement, and comparing the fatigue characteristics with the characteristics in the waking state to obtain variation characteristics;
s417: and analyzing the relevance among all fatigue characteristics by adopting a Bayesian network classifier to complete the fatigue monitoring of the drivers and passengers.
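A minimal sketch of the variation features of S416: eyelid openness is normalized by the waking-state baseline, and a PERCLOS-style closure fraction is computed. The 20% openness cut-off and the two summary features are assumptions for illustration, not values specified by the patent.

```python
import numpy as np

def fatigue_variation(eyelid_openness, waking_baseline):
    """Compare per-frame eyelid openness (0 = closed, 1 = fully open)
    against the waking-state baseline to obtain variation features."""
    openness = np.asarray(eyelid_openness, dtype=float)
    rel = openness / waking_baseline              # normalize by waking state
    closed = rel < 0.2                            # "mostly closed" frames
    return {
        "mean_openness_ratio": float(rel.mean()), # deviation from baseline
        "closure_fraction": float(closed.mean()), # PERCLOS-style fraction
    }
```

In the patent, such variation features would then feed the Bayesian network classifier of S417 together with eye-opening speed and iris-motion features.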
Further, step S5 specifically includes:
S51: collecting gesture images of the driver and passengers and converting them into an image sequence I_rgb;
S52: converting the image sequence I_rgb into a grayscale image sequence I_gray and a binary skin-tone image sequence I_skin;
S53: calculating motion parameters from the grayscale image sequence I_gray and the binary skin-tone image sequence I_skin as the inter-frame motion features;
S54: normalizing the duration of the gesture motion and constructing the probability function
p(x_{i,j} | λ) = 1 / (√(2π) σ_{i,j}) · exp( −(x_{i,j} − μ_{i,j})² / (2 σ_{i,j}²) ),
where i denotes the i-th state, j denotes the j-th characteristic parameter, x_{i,j} is the time-normalized motion and shape feature of the gesture sequence, λ denotes the gesture class, μ is the mathematical-expectation matrix of the characteristic parameters and σ the standard deviation, μ_{i,j} being the expectation and σ_{i,j} the standard deviation of the j-th characteristic parameter of the i-th state;
S55: constructing the probability function of the complete gesture-sequence observation
P(X | λ) = Π_{i=1..m} Π_{j=1..n} p(x_{i,j} | λ),
where X is the observation of the complete gesture sequence, m is the total number of gesture states, and n is the total number of gesture features;
S56: for all gesture classes, calculating −ln P(X | λ); the class attaining the minimum value is the recognized gesture category.
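The classification rule of S54-S56 (independent per-state, per-feature Gaussians, minimum negative log-likelihood) can be sketched directly. The `models` mapping from class name to (μ, σ) arrays is an assumed interface for illustration.

```python
import numpy as np

def gesture_class(X, models):
    """Sketch of S54-S56: X is an (m states, n features) observation;
    `models` maps gesture class -> (mu, sigma) arrays of the same shape.
    The class minimizing -ln P(X|lambda) (i.e. maximizing the product of
    per-cell Gaussian densities) is returned."""
    X = np.asarray(X, dtype=float)
    def neg_log_lik(mu, sigma):
        # -ln of the product over i, j of N(x_ij; mu_ij, sigma_ij)
        return float(np.sum(np.log(np.sqrt(2 * np.pi) * sigma)
                            + (X - mu) ** 2 / (2 * sigma ** 2)))
    return min(models, key=lambda g: neg_log_lik(*models[g]))
```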
The invention has the following beneficial effects. The invention is a hybrid enhanced intelligent cognition technology for the business sojourn motor home based on deep learning.
First, to meet the intelligent voice-interaction needs of the driver and passengers with respect to in-vehicle electronics, infotainment and similar facilities, this patent designs a human-vehicle multi-party dialogue model that realizes intelligent voice communication between occupants and mobile devices. The module specifically comprises a voice data acquisition layer, a preprocessing layer, a semantic analysis layer, a dialogue management layer, a knowledge reasoning layer and a voice output layer. The voice data are analyzed and processed layer by layer to achieve voice communication between occupants and the in-vehicle system.
Second, identity recognition and fatigue detection for drivers. Driver identity recognition comprises voiceprint recognition and face recognition, performed respectively through an algorithm-driven model and a feedforward neural network. Recognizing the occupants safeguards the motor home and the property inside it.
Third, driver fatigue detection to ensure driving safety is essential among the motor home's in-vehicle functions. Fatigue detection extracts head-pose and eye-movement information, computes feature values, compares them with those of the waking state to obtain variation features, and judges fatigue from the magnitude of those variations.
Fourth, this patent also contributes innovative ideas for behavior analysis and gesture recognition of drivers and passengers. The behavioral intention of the occupants is analyzed, to assist driving, by training and designing a deep cascade network with a nonlinear motion model and an appearance model. Gesture recognition forms part of in-vehicle human-vehicle interaction; its purpose is to simplify the driver's operating burden by extracting and analyzing hand motion and color information through a multi-pose model to interact with the driver and meet the driver's needs. The methods proposed in this patent enrich the actual experience of drivers and passengers while ensuring property and driving safety.
Drawings
In order to make the object, technical scheme and beneficial effect of the invention more clear, the invention provides the following drawings for explanation:
FIG. 1 is a multi-turn dialogue model based on a POMDP strategy and a ternary knowledge graph according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of modeling an overall factor based on a constrained Boltzmann machine according to an embodiment of the present invention;
FIG. 3 is a diagram of a three-level deep cascade network according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating multi-target tracking for online learning of non-linear motion patterns and robust appearance models in accordance with an embodiment of the present invention;
FIG. 5 is a diagram of a face feature extraction area and a positioning effect in an embodiment of the present invention;
FIG. 6 is a monitoring diagram of fatigue status of drivers and passengers according to the embodiment of the invention;
FIG. 7 is a diagram of gesture feature spatiotemporal representations according to an embodiment of the present invention.
Detailed Description
Preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
The invention comprises the following steps:
1. Human-vehicle intelligent voice interaction technology. Around the requirement for intelligent human-vehicle voice interaction between the driver and passengers and devices such as the vehicle-mounted electronics, a human-vehicle multi-turn dialogue model based on a POMDP strategy and a ternary knowledge graph is designed using dialogue-state tracking and management techniques, realizing smooth communication between occupants and the vehicle-mounted devices. As shown in fig. 1, the specific steps are as follows:
a) a data acquisition layer: and converting the user voice into text data through a voice recognition engine, and completing the spelling error correction of the characters.
b) A pretreatment layer: and performing Word segmentation processing on the corrected text data to obtain a Word sequence, completing part-of-speech tagging, entity naming, common-finger disambiguation and relationship dependence in vocabulary and semantics, and acquiring Word vectors by means of Word2 Vec.
c) A semantic analysis layer: and submitting the word vectors subjected to fusion coding to a cascade convolution neural network to complete primary semantic analysis and obtain the conversation scene type.
d) Dialogue management layer: a dialogue-problem guidance strategy is designed in the POMDP model to realize dialogue state tracking, and reinforcement learning of the dialogue state-behavior strategy is completed by constructing a deep Q-network (DQN) and iterating two independent Q-networks.
e) Knowledge reasoning layer: the semantic knowledge graph is constructed by constructing triples, the upper and lower bounds of knowledge atom association scores and embedding costs in the knowledge graph are calculated in real time under the condition that indexes are not adopted, the knowledge query result of Top-k is deduced, corresponding score functions are respectively designed on the basis of determining a single scene knowledge atom combination set and a cross-scene knowledge atom combination set, training of the cross-boundary combination set is driven by combining a multi-column convolution network, and cross-boundary knowledge fusion scores are calculated.
f) A voice output layer: the method comprises the steps of performing text analysis on an input text, generating corresponding spectral parameters and fundamental frequency by adopting a multi-space probability distribution HMM parameter generation algorithm to generate a smooth acoustic feature sequence, and submitting the acoustic feature sequence to a synthesizer to generate final voice.
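The reinforcement learning of the dialogue management layer (d) can be illustrated, in grossly simplified tabular form, by a standard Q-update. The states, actions and rewards below are invented placeholders; the patent's design uses two deep Q-networks over POMDP dialogue states rather than a table.

```python
import random

# Tabular Q-learning stand-in for the DQN dialogue policy of layer (d).
STATES = ["need_slot", "slot_filled"]
ACTIONS = ["ask", "answer"]

def train_policy(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    random.seed(seed)
    Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        s = "need_slot"
        while s != "done":
            a = random.choice(ACTIONS)  # exploratory behavior policy
            if s == "need_slot":        # asking fills the missing slot
                r, s2 = (1.0, "slot_filled") if a == "ask" else (-1.0, "need_slot")
            else:                       # answering ends the dialogue turn
                r, s2 = (1.0, "done") if a == "answer" else (-1.0, "slot_filled")
            nxt = 0.0 if s2 == "done" else max(Q[(s2, b)] for b in ACTIONS)
            Q[(s, a)] += alpha * (r + gamma * nxt - Q[(s, a)])  # Q-update
            s = s2
    return Q

Q = train_policy()  # greedy policy: ask first, then answer
```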
2. Identifying the driver by voiceprint. As shown in fig. 2, around the identity-authentication requirement for occupants in the intelligent security field of business sojourn motor homes, i-vector feature extraction is replaced by restricted-Boltzmann-machine feature extraction under the total-variability factor, a UBM model driven by the EM algorithm is designed, and voiceprint recognition under a high-dimensional Gaussian-component representation is realized.
a) And collecting and processing the voice fragments. The collected voice segments are processed by pre-emphasis, framing and windowing and end point detection. Fourier transform is carried out on the signals to obtain frequency spectrum energy distribution, a triangular filter bank is adopted to carry out critical band division, and amplitude weighting calculation and discrete cosine transform are carried out to obtain a cepstrum coefficient (MFCC).
b) Voiceprint features are obtained. And submitting the cepstrum coefficient to a UBM model trained by an EM algorithm to obtain probability scores of the voiceprint characteristics, and performing template matching with corresponding Gaussian components.
c) Matching the voiceprint templates. A restricted Boltzmann machine (RBM) consisting of a visible layer of n nodes and a hidden layer of m nodes is designed, and its energy function is defined as
E(v, h | θ) = −Σ_i a_i v_i − Σ_j b_j h_j − Σ_i Σ_j v_i W_ij h_j,
where the vectors h and v represent the states of the hidden and visible layers respectively, a and b represent the biases of the visible and hidden layers, and v_i and h_j represent the states of the i-th visible-layer node and the j-th hidden-layer node. Given the model θ = {W_ij, a_i, b_j}, the joint probability distribution of the states (v, h) is P(v, h | θ) = e^{−E(v, h | θ)} / Z(θ), where Z(θ) = Σ_{v,h} e^{−E(v, h | θ)} is the normalization factor.
d) And judging the speaker. And obtaining two types of high-dimensional Gaussian supervectors of the RBM-i-vector by the energy evolution calculation of the RBM, and performing channel compensation by adopting Linear Discriminant Analysis (LDA). And performing cosine similarity calculation on the two compensated RBM-i-vectors, and comparing the two compensated RBM-i-vectors with a preset threshold value, thereby judging the attribution of the voiceprint to a specific speaker.
3. And analyzing the behavior of the driver. Around the analysis requirement of the driver and passenger intentions in the intelligent behavior interaction field of the commercial motor caravan, a deep cascade network with a head and shoulder recognition function is designed by adopting nonlinear motion mode learning and appearance model multi-instance learning, and the analysis of the driver and passenger behavior intentions driven by a hierarchical association multi-target tracking learning strategy is realized.
a) And constructing a deep cascade network screening sample. Constructing a depth cascade network (HsNet) with head and shoulder identification, and intercepting a series of candidate blocks (Patch) according to a preset step length by adopting a multi-scale sliding window for each frame of image to form a sample to be identified; the samples are sent to a pre-trained head-shoulder/non-head-shoulder recognition model HsNet and a three-level CNN cascade network, and classified as shown in figure 3. In the specific classification process, the Patch judged as a negative sample is directly abandoned, and the rest samples continue to enter the next stage of the network for more strict identification and classification, so that three stages of CNN network classification and identification are sequentially carried out; the output result of the third level of the network is used for judging whether the image Patch belongs to a head and shoulder area or not, and the height of the head and shoulder frame is expanded to be 3 times of that of the original corresponding sliding window to obtain a whole body frame detected by the passenger; and for the same passenger, a plurality of detection frames are formed, and finally, redundant detection frames are removed by using a non-maximum suppression strategy, and only one most possible detection frame-passenger detection identification result is reserved at each position.
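The final non-maximum-suppression step of (a), which keeps only the most probable passenger detection frame at each position, can be sketched as follows; boxes are assumed to be (x1, y1, x2, y2) tuples, and the 0.5 IoU threshold is a common default assumed here.

```python
def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression: keep indices of highest-scoring
    boxes, discarding any box overlapping a kept box too strongly."""
    def iou(b1, b2):
        x1, y1 = max(b1[0], b2[0]), max(b1[1], b2[1])
        x2, y2 = min(b1[2], b2[2]), min(b1[3], b2[3])
        inter = max(0, x2 - x1) * max(0, y2 - y1)
        a1 = (b1[2] - b1[0]) * (b1[3] - b1[1])
        a2 = (b2[2] - b2[0]) * (b2[3] - b2[1])
        return inter / (a1 + a2 - inter)
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep
```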
b) Introducing a nonlinear model and an appearance model for association analysis. Nonlinear motion-pattern learning and appearance-model multi-instance learning are introduced into the hierarchical-association multi-target tracking learning strategy: detected objects are first reliably associated at the low level to form track segments, and the track segments are then connected using online nonlinear motion-pattern learning and appearance-model multi-instance learning to obtain reliable object tracks. Parameters such as speed, direction and distance extracted from an object's motion track serve as features, and several features are combined into higher-level semantics describing the object's behavior, from which the driver's or passenger's behavior intention is judged.
c) Pose overlap detection and threshold comparison. As shown in fig. 4, to improve the robustness of behavior detection, continuously detected passenger pose frames are fused into a complete action. The two-pose fusion rule is designed as

f(i, j) = 1 if S_IoU > T_IoU and S_his > T_his and Δt < T_Δ, otherwise f(i, j) = 0,

where f(i, j) is the fusion function (1 indicates the two poses can be fused, 0 that they cannot), S_IoU is the overlap ratio of the two pose detection frames, T_IoU = 0.5 is the overlap-ratio threshold, S_his is the histogram matching score within the two detection frames, T_his = 35 is the histogram-matching threshold, Δt is the time difference of the two poses, and T_Δ = 25 is the two-pose time-difference threshold.
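A minimal sketch of the fusion rule using the thresholds quoted above. Reading the rule as a conjunction of the three tests is our interpretation, since the original formula image is missing from this text:

```python
def can_fuse(s_iou, s_his, dt, t_iou=0.5, t_his=35, t_delta=25):
    """Two-pose fusion rule f(i, j).

    s_iou: overlap ratio of the two pose detection frames
    s_his: histogram matching score of the two detection frames
    dt:    time difference between the two poses
    Returns 1 if the two poses can be fused into one action, else 0.
    """
    return 1 if (s_iou > t_iou and s_his > t_his and dt < t_delta) else 0
```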
4. Driver and passenger face recognition and fatigue monitoring. To meet the needs of driver identity authentication and fatigue-state monitoring in intelligent safety for commercial motor homes, a supervised gradient descent algorithm and a CLM positioning algorithm are respectively adopted: a generalized single-hidden-layer feedforward neural network based on an extreme learning machine authenticates the identity of a specific driver, and a matching template based on face 3D modeling monitors the driver's fatigue state.
a) Identity authentication based on face features. The face is coarsely located by radial symmetry transform, and supervised descent method (SDM) learning obtains the optimal iterative vector from the current point to the target point: a linear regression model is established between the shape offset Δx = x* − x and the features φ(x) of the current shape x, and the desired position vector is obtained by iterating x := x + Δx. The SDM learning objective is constructed as

argmin_{R_k, b_k} Σ_i || Δx_i^{k*} − R_k φ_i^k − b_k ||²,

where k is the number of iterations, x_k is the shape vector at the k-th iteration, x_i^k is the coordinate of the i-th point in the shape vector, Δx_i^{k*} = x_i^* − x_i^k is its offset to the target, φ_i^k are the local features at that point, and R_k, b_k are the regression matrix and bias learned at step k. Repeated iterative learning yields the true deviation of the i-th point from the actual boundary point. An integro-differential operator (Daugman's form)

max_{(r, a, b)} | G_σ(r) * ∂/∂r ∮_{(r, a, b)} I(x, y) / (2πr) ds |

is then adopted to accurately locate the eyes, nose and mouth in the face, where G_σ(r) is a smoothing function, I(x, y) is the image gray matrix, (a, b) is the circle center, and r is the radius. The face localization result is shown in fig. 5.
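The SDM inference loop x := x + Δx described above can be sketched as follows. The feature extractor and the learned regressors (R_k, b_k) are placeholders; their training on annotated faces is not shown:

```python
import numpy as np

def sdm_align(x0, features, regressors):
    """Supervised Descent Method inference.

    x0:         initial shape vector (flattened landmark coordinates)
    features:   callable returning the feature vector phi(x) at a shape
                (in practice e.g. local appearance descriptors)
    regressors: list of learned (R_k, b_k) pairs, applied in order
    """
    x = np.asarray(x0, dtype=float)
    for R, b in regressors:
        phi = features(x)      # features of the current shape
        x = x + R @ phi + b    # x := x + Δx, with Δx = R_k φ + b_k
    return x
```

With a toy regressor that exactly predicts the offset to a known target, one iteration recovers the target shape.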
A generalized single-hidden-layer feedforward neural network is designed for face recognition; the network is invariant to monotone gray-level change and angular rotation, and insensitive to image change caused by uneven illumination. In the recognition process, face organs in the target region are fitted with feature points to obtain the feature-point mark positions. A sub-image is intercepted in the neighborhood of each feature point to acquire the adjacent features of the face organs, and the adjacent features of all feature points are finally concatenated to form the feature-point extreme learning features. The extreme learning features of each image partition are separately counted as the training set of the generalized single-hidden-layer feedforward neural network, several extreme learning machines are trained, their outputs are combined, and, driven by the output decision of the optimal integrated classifier, the identity label of the specific person matching the fused features is retrieved to complete identity authentication. The face extreme-learning-feature extraction result is shown in fig. 5.
b) Fatigue monitoring based on face features. 3D face modeling of the driver is realized with a 3D face-modeling method, and the driver's head pose is tracked in real time in combination with the face recognition method above. The eye positions in the 2D face image are indirectly solved from the eye positions and head pose in the 3D face model; feature points in the eye regions are located with a CLM algorithm, and the localization is verified by texture normalization of the face image. The iris center is located through the physiological structure characteristics of the iris, overcoming the imaging differences of the iris under different illumination conditions. The upper and lower eyelids are located with a parameterized template to extract the driver's eye movements. From these eye-movement characteristics, fatigue features related to eyelid opening degree, eye opening/closing speed and iris movement are extracted and compared with the corresponding values in the waking state to obtain variation features. A Bayesian network classifier is constructed to analyze the correlation among the fatigue features and complete the judgment of the driver's fatigue state, as shown in FIG. 6.
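As an illustration of eyelid-based fatigue cues compared against a waking baseline, the following toy sketch computes a PERCLOS-style closed-eye fraction and a variation feature. The 0.3 closed-eye threshold and the feature names are assumptions, and the Bayesian-network classifier that consumes such cues is not reproduced:

```python
def fatigue_features(openness_seq, awake_mean, closed_thresh=0.3):
    """Toy fatigue cues from a per-frame eyelid-openness sequence.

    openness_seq: per-frame eyelid openness, 0 = closed .. 1 = fully open
    awake_mean:   mean openness measured in the waking state (baseline)
    """
    n = len(openness_seq)
    # Fraction of frames with the eye essentially closed (PERCLOS-like)
    perclos = sum(o < closed_thresh for o in openness_seq) / n
    mean_open = sum(openness_seq) / n
    # Variation feature: drop in openness relative to the waking baseline
    delta_vs_awake = awake_mean - mean_open
    return {"perclos": perclos, "delta_open": delta_vs_awake}
```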
5. Driver and passenger gesture recognition. To meet the need for typical human-vehicle gesture interaction between the driver or passengers and facilities such as vehicle-mounted electronics and entertainment, a multi-state Gaussian probability model under a complex background is designed; hand segmentation and gesture recognition combine the hand's motion information with its color information, as shown in FIG. 7.
a) Image conversion and processing. The captured color image sequence I_rgb is, on the one hand, converted into a 256-level gray image sequence I_gray for motion-parameter analysis; on the other hand, according to the distribution of RGB colors in HSI space, it is converted into a binary skin-color signal image sequence I_skin, divided into skin-color and non-skin-color regions.
b) Feature extraction and image fusion. The gray image sequence I_gray is processed to obtain a coarse binary motion image sequence I_mov. An AND operation between corresponding images of I_mov and I_skin then yields a binary skin-motion region image sequence I_mov-skin, whose foreground is the moving skin region. Because I_mov-skin does not necessarily contain the complete hand region, a seed algorithm is designed to find it. First, assuming the hand's motion region lies in I_mov-skin, the largest connected domain B is found in I_mov-skin by the seed algorithm according to region connectivity and taken as part of the hand. Then, connected domain B is mapped to the same position in I_skin and the seed algorithm is applied there, growing within I_skin to obtain the complete hand-region image sequence I_hand. Shape features of the hand region are extracted from I_hand and, combining I_gray and I_hand, the motion parameters between the hand regions of two adjacent frames are calculated as the inter-frame motion features.
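The seed-algorithm step — finding the largest connected domain B in I_mov-skin — can be sketched with a flood fill. 4-connectivity is assumed; the subsequent growth inside I_skin works the same way, seeded from B's pixels:

```python
from collections import deque

def largest_component(mask):
    """Largest 4-connected region of a binary mask (list of 0/1 rows).

    A minimal stand-in for the patent's seed algorithm: returns the
    pixel set of the biggest connected domain.
    """
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = set()
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] and not seen[sy][sx]:
                comp, q = set(), deque([(sy, sx)])
                seen[sy][sx] = True
                while q:  # breadth-first flood fill from the seed pixel
                    y, x = q.popleft()
                    comp.add((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            q.append((ny, nx))
                if len(comp) > len(best):
                    best = comp
    return best
```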
c) Gesture motion and shape feature extraction. Let L denote the time length of the gesture, s[t] the shape feature of the t-th frame, and m[t] the motion feature between frames t and t+1. An 8-dimensional feature vector f[t] = [m[t], s[t]^T] uniformly describes the apparent features of the gesture, forming a time-scale-invariant feature sequence and realizing time-scale-invariant feature extraction and matching. The spatiotemporal apparent feature of a gesture is constructed as A = [f[0], f[1], …, f[L−2]]^T, describing the change of f[t] over time. After normalizing the gesture time length L, the probability of observing the j-th characteristic parameter x_{i,j} in the i-th state is constructed as

b_{i,j}(x_{i,j}) = (1 / (√(2π) σ_{i,j})) · exp(−(x_{i,j} − μ_{i,j})² / (2σ_{i,j}²)),

where x_{i,j} represents the motion and shape features of the time-normalized gesture sequence, λ denotes any gesture-type model, μ is the mathematical-expectation matrix of the characteristic parameters, and σ is the standard deviation. For the gesture model λ(μ, σ), the probability of the complete gesture sequence observation X can then be constructed as

P(X | λ) = Π_{i=1}^{m} Π_{j=1}^{n} b_{i,j}(x_{i,j}).

For each gesture to be recognized, −ln P(X | λ) is calculated for every gesture model; the gesture category with the smallest value is the attributed category.
Finally, it is noted that the above-mentioned preferred embodiments illustrate rather than limit the invention, and that, although the invention has been described in detail with reference to the above-mentioned preferred embodiments, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the scope of the invention as defined by the appended claims.
Claims (7)
1. A hybrid enhanced intelligent cognitive method of an intelligent business sojourn motor home is characterized by comprising the following steps: the method specifically comprises the following steps:
s1: the driver and passengers converse with the vehicle-mounted electronic equipment, and the dialogue state between the user and the equipment is tracked;
s2: carrying out identity authentication on the driver and the passenger according to the voiceprint information of the driver and the passenger which is tracked and collected;
s3: analyzing the behavior intention of the driver and the passenger;
s4: carrying out face recognition on a driver and a passenger to carry out identity authentication and fatigue monitoring on the driver and the passenger;
s5: performing gesture recognition on the driver and the passengers;
s6: comprehensively obtaining an analysis and identification result;
step S1 specifically includes:
s11: converting the voice of the driver and the crew into text data through a voice recognition engine, and performing spelling error correction on the text data;
s12: performing word segmentation on the corrected text data to obtain a word sequence, and obtaining word vectors through Word2Vec;
s13: processing the word vectors through a cascade convolution neural network to obtain a conversation scene type;
s14: constructing a deep reinforcement learning network, and iteratively finishing reinforcement learning of a dialogue state behavior strategy through two independent deep reinforcement learning networks;
s15: constructing a semantic knowledge graph through the triples, and calculating the upper and lower bounds of the association score and the embedding cost of the knowledge atoms in the semantic knowledge graph in real time to obtain a knowledge query result;
s16: and generating corresponding spectral parameters and fundamental frequency by adopting a multi-space probability distribution HMM parameter generation algorithm to generate a smooth acoustic feature sequence, and submitting the acoustic feature sequence to a synthesizer to generate final voice.
2. The hybrid enhanced intelligent cognitive method of the intelligent business sojourn caravan according to claim 1, wherein: step S2 specifically includes:
s21: carrying out pre-emphasis, framing and windowing and end point detection processing on the voice information of the drivers and passengers which is tracked and collected;
s22: fourier transform is carried out on the processed voice information to obtain frequency spectrum energy distribution, a triangular filter bank is adopted to carry out critical band division, and amplitude weighting calculation and discrete cosine transform are carried out to obtain cepstrum coefficients;
s23: inputting the cepstrum coefficient into a speaker identification UBM model to obtain voiceprint characteristics;
s24: and matching the voiceprint templates and judging whether the voiceprint information corresponds to the voiceprint template.
3. The hybrid enhanced intelligent cognitive method of the intelligent business sojourn caravan according to claim 2, wherein: step S24 specifically includes:
s241: defining the energy function

E(v, h | θ) = −Σ_i a_i v_i − Σ_j b_j h_j − Σ_i Σ_j v_i W_ij h_j,

wherein v and h are vectors representing the states of the visible layer and the hidden layer respectively, a and b are the biases of the visible layer and the hidden layer, v_i and h_j are the states of the i-th visible-layer node and the j-th hidden-layer node, m is the number of hidden-layer nodes, W_ij is the connection weight between the visible and hidden layers, i and j are node indices, a_i is the i-th visible-layer bias, and b_j is the j-th hidden-layer bias;
s242: given the model parameters θ = {W_ij, a_i, b_j}, obtaining the joint probability distribution of the state (v, h),

P(v, h | θ) = e^{−E(v, h | θ)} / Z(θ), with Z(θ) = Σ_{v,h} e^{−E(v, h | θ)};
s243: obtaining two types of high-dimensional Gaussian supervectors of RBM-i-vector by the energy evolution calculation of RBM, and performing channel compensation by adopting linear discriminant analysis;
s244: and performing cosine similarity calculation on the two types of compensated high-dimensional Gaussian supervectors, comparing the two types of compensated high-dimensional Gaussian supervectors with a preset threshold value, and judging whether the voiceprint information corresponds to the voiceprint information or not.
4. The hybrid enhanced intelligent cognitive method of the intelligent business sojourn caravan according to claim 2, wherein: step S3 specifically includes:
s31: constructing a depth cascade network with head and shoulder recognition, and intercepting a series of candidate image blocks according to a preset step length by adopting a multi-scale sliding window for each frame of image to form a sample to be recognized;
s32: inputting a sample to be recognized into a trained head-shoulder/non-head-shoulder recognition model for recognition and classification;
s33: introducing a nonlinear model and an appearance model for correlation analysis;
s34: and detecting the attitude coincidence degree and comparing thresholds.
5. The hybrid enhanced intelligent cognitive method of the intelligent business sojourn caravan according to claim 4, wherein: step S34 specifically includes:
s341: fusing continuously detected passenger posture frames into a complete action;
s342: designing the two-pose fusion rule,

f(i, j) = 1 if S_IoU > T_IoU and S_his > T_his and Δt < T_Δ, otherwise f(i, j) = 0,

wherein f(i, j) is the fusion function, 1 indicates the two poses can be fused and 0 that they cannot, S_IoU is the detection-frame overlap ratio of the two poses with threshold T_IoU, S_his is their histogram matching score with threshold T_his, and Δt is their time difference with threshold T_Δ.
6. The hybrid enhanced intelligent cognitive method of the intelligent business sojourn caravan according to claim 4, wherein: the identity authentication of the driver and the passenger in the step S4 specifically comprises the following steps:
s401: roughly positioning the human face through radial symmetric transformation;
s402: obtaining the optimal iterative vector from the current point to the target point by supervised descent method (SDM) learning, and establishing a linear regression model between the shape offset Δx = x* − x and the features φ(x) of the current shape x;
s403: iterating with the current shape x and the deformation vector Δx to obtain the desired position vector:
x := x + Δx,
where x denotes the desired position vector after the update;
s404: constructing the SDM learning objective

argmin_{R_k, b_k} Σ_i || Δx_i^{k*} − R_k φ_i^k − b_k ||²,

and obtaining the true deviation of the i-th point from the actual boundary point, where k is the number of iterations, x_k represents the shape vector at the k-th iteration, x_i^k represents the coordinates of the i-th point in the shape vector, Δx_i^{k*} is the coordinate deformation amount of the i-th point, φ_i^k are its local features, R_k is the regression matrix and b_k is the bias at the k-th iteration;
s405: by adopting a differential operator, the method adopts the following steps,
accurately positioning a human face organ, wherein Gσ(r) is a smoothing function, I (x, y) is an image gray matrix, (a, b) is a circle center, and r is a radius;
s406: fitting the human face organs in the target area with the characteristic points to obtain characteristic point mark positions;
s407: intercepting a sub-image in the neighborhood of each feature point, obtaining the adjacent features of the face organs, and connecting the adjacent features of all feature points in series to form the feature-point extreme learning features;
s408: counting the extreme learning characteristics of each image partition as a training set of a generalized single hidden layer feedforward neural network, training an extreme learning machine, searching an identity label of a specific person with the characteristics fused, and completing identity identification;
the fatigue monitoring of the driver and the passengers in the step S4 is specifically as follows:
s411: 3D face modeling of a driver is realized by adopting a 3D face modeling method, and the head posture of the driver is tracked in real time according to the face recognition method of S406-S408;
s412: solving the positions of the eyes in the 2D face image by using the positions of the eyes and the head posture in the 3D face model;
s413: locating the feature points in the eye region with a constrained local model (CLM) algorithm, and verifying the located feature points using face-image texture normalization;
s414: positioning the center of the iris according to the physiological structure characteristics of the iris;
s415: positioning the upper eyelid and the lower eyelid according to the parameterized template, and extracting the eye movement of the driver and the crew;
s416: respectively extracting fatigue characteristics related to the opening and closing degree of the eyelids, the opening and closing speed of the eyes and the motion characteristics of the iris according to the eye movement, and comparing the fatigue characteristics with the characteristics in the waking state to obtain variation characteristics;
s417: and analyzing the relevance among all fatigue characteristics by adopting a Bayesian network classifier to complete the fatigue monitoring of the drivers and passengers.
7. The hybrid enhanced intelligent cognitive method of the intelligent business sojourn caravan according to claim 6, wherein: step S5 specifically includes:
s51: collecting gesture images of the driver and passengers and converting them into a color image sequence I_rgb;
s52: converting the image sequence I_rgb into a gray image sequence I_gray and into a binary skin-color image sequence I_skin;
s53: calculating motion parameters from the gray image sequence I_gray and the binary skin-color signal image sequence I_skin as the inter-frame motion features;
s54: normalizing the time length of the gesture motion and constructing the probability function

b_{i,j}(x_{i,j}) = (1 / (√(2π) σ_{i,j})) · exp(−(x_{i,j} − μ_{i,j})² / (2σ_{i,j}²)),

where i denotes the i-th state, j denotes the j-th characteristic parameter, x_{i,j} is the time-normalized motion and shape features of the gesture sequence, λ represents the gesture class, μ represents the mathematical-expectation matrix of the characteristic parameters, σ is the standard deviation, μ_{i,j} is the mathematical expectation of the j-th characteristic parameter of the i-th state, and σ_{i,j} is its standard deviation;
s55: constructing the probability function of the complete gesture sequence observation,

P(X | λ) = Π_{i=1}^{m} Π_{j=1}^{n} b_{i,j}(x_{i,j}),

wherein X is the complete gesture sequence observation, m is the total number of gesture states, and n is the total number of gesture features;
s56: for all gesture classes, calculating −ln P(X | λ); the gesture class with the minimum value is the attributed gesture category.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810030098.3A CN108256307B (en) | 2018-01-12 | 2018-01-12 | Hybrid enhanced intelligent cognitive method of intelligent business travel motor home |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108256307A CN108256307A (en) | 2018-07-06 |
CN108256307B true CN108256307B (en) | 2021-04-02 |
Family
ID=62727133
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810030098.3A Active CN108256307B (en) | 2018-01-12 | 2018-01-12 | Hybrid enhanced intelligent cognitive method of intelligent business travel motor home |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108256307B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109034020A (en) * | 2018-07-12 | 2018-12-18 | 重庆邮电大学 | A kind of community's Risk Monitoring and prevention method based on Internet of Things and deep learning |
CN109079813A (en) * | 2018-08-14 | 2018-12-25 | 重庆四通都成科技发展有限公司 | Automobile Marketing service robot system and its application method |
CN109143870B (en) * | 2018-10-23 | 2021-08-06 | 宁波溪棠信息科技有限公司 | Multi-target task control method |
CN110070884B (en) * | 2019-02-28 | 2022-03-15 | 北京字节跳动网络技术有限公司 | Audio starting point detection method and device |
CN109918513B (en) * | 2019-03-12 | 2023-04-28 | 北京百度网讯科技有限公司 | Image processing method, device, server and storage medium |
CN110111795B (en) * | 2019-04-23 | 2021-08-27 | 维沃移动通信有限公司 | Voice processing method and terminal equipment |
CN112308116B (en) * | 2020-09-28 | 2023-04-07 | 济南大学 | Self-optimization multi-channel fusion method and system for old-person-assistant accompanying robot |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120317621A1 (en) * | 2011-06-09 | 2012-12-13 | Canon Kabushiki Kaisha | Cloud system, license management method for cloud service |
US9286029B2 (en) * | 2013-06-06 | 2016-03-15 | Honda Motor Co., Ltd. | System and method for multimodal human-vehicle interaction and belief tracking |
CN105654753A (en) * | 2016-01-08 | 2016-06-08 | 北京乐驾科技有限公司 | Intelligent vehicle-mounted safe driving assistance method and system |
CN105812129A (en) * | 2016-05-10 | 2016-07-27 | 成都景博信息技术有限公司 | Method for monitoring vehicle running state |
CN104183091B (en) * | 2014-08-14 | 2017-02-08 | 苏州清研微视电子科技有限公司 | System for adjusting sensitivity of fatigue driving early warning system in self-adaptive mode |
CN106682603A (en) * | 2016-12-19 | 2017-05-17 | 陕西科技大学 | Real time driver fatigue warning system based on multi-source information fusion |
Non-Patent Citations (5)
Title |
---|
Hybrid-augmented intelligence:collaboration and cognition;Nan-ning ZHENG,et al;《Frontiers of Information Technology & Electronic Engineering》;20170215;第18卷(第2期);第153-179页 * |
Toward Intelligent Driver-Assistance and Safety Warning Systems;Nan-Ning Zheng,et al;《IEEE Intelligent System》;20040430;第8-11页 * |
Visually Guided Landing of an Unmanned Aerial Vehicle;Srikanth Saripalli,et al;《IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION》;20030625;第19卷(第3期);第371-380页 * |
基于数据资源的认知图挖掘系统研究;李嫄源,等;《重庆邮电大学学报 (自然科学版 )》;20110630;第23卷(第3期);第374-378页 * |
车用自组网媒体访问控制机制改进;李嫄源,等;《微电子学》;20110630;第41卷(第3期);第372-376,380页 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||