CN108256307A

CN108256307A - Hybrid-augmented intelligent cognition method for an intelligent commercial recreational vehicle

Info

Publication number: CN108256307A
Application number: CN201810030098.3A
Authority: CN
Inventors: 朱智勤; 王冠; 李鹏华; 李嫄源; 秦石磊
Original assignee: Chongqing University of Posts and Telecommunications
Current assignee: Chongqing University of Posts and Telecommunications
Priority date: 2018-01-12
Filing date: 2018-01-12
Publication date: 2018-07-06
Anticipated expiration: 2038-01-12
Also published as: CN108256307B

Abstract

The present invention relates to a hybrid-augmented intelligent cognition method for an intelligent commercial recreational vehicle (RV). The method comprises the following steps. S1: the driver and passengers communicate with the on-board electronic equipment, and the dialogue state between the user and the on-board electronic equipment is tracked. S2: the driver and passengers are authenticated according to the voiceprint information collected during tracking. S3: the behavioral intentions of the driver and passengers are analyzed. S4: face recognition is performed on the driver and passengers for identity authentication and fatigue monitoring. S5: gesture recognition is performed on the driver and passengers. S6: the analysis and recognition results are combined. By introducing human involvement and human cognitive models into the commercial RV, the invention forms a stronger form of intelligence, improves the machine's ability to understand and adapt to the environment inside and outside the vehicle and to complete complex spatio-temporally correlated tasks, and enhances the functional and spatial experience of the commercial RV.

Description

Hybrid-augmented intelligent cognition method for an intelligent commercial recreational vehicle

Technical field

The invention belongs to the field of hybrid-augmented intelligent cognition within artificial intelligence, and relates to a hybrid-augmented intelligent cognition method for an intelligent commercial recreational vehicle (RV).

Background

China has entered an era of mass self-drive tourism. Travel by commercial RV is increasingly popular with consumers, and the government is vigorously developing the commercial RV industry to drive the coordinated development of the tourism and automobile industries. At the same time, China's automobile industry is moving into the era of "intelligent connected" vehicles. The RV combines the intelligent connected vehicle with the smart home; it embodies the entry of artificial intelligence into everyday life and the deep integration of science with social life. In making the commercial RV intelligent, recognizing, reasoning about, and understanding the information carried by each modality of data from the vehicle's upper-body (superstructure) system is one of the key problems to be solved.

Summary of the invention

In view of this, the object of the invention is to provide a hybrid-augmented intelligent cognition method for an intelligent commercial RV. By building a unified cross-media semantic representation of the multi-dimensional intelligent space of the commercial RV, and by introducing human involvement and human cognitive models into the vehicle, the method forms a stronger form of intelligence, improves the machine's ability to understand and adapt to the environment inside and outside the commercial RV and to complete complex spatio-temporally correlated tasks, and enhances the functional and spatial experience of the commercial RV.

To achieve the above object, the invention provides the following technical solution:

A hybrid-augmented intelligent cognition method for an intelligent commercial RV, comprising the following steps:

S1: The driver and passengers communicate with the on-board electronic equipment, and the dialogue state between the user and the on-board electronic equipment is tracked;

S2: The driver and passengers are authenticated according to the voiceprint information collected during tracking;

S3: The behavioral intentions of the driver and passengers are analyzed;

S4: Face recognition is performed on the driver and passengers for identity authentication and fatigue monitoring;

S5: Gesture recognition is performed on the driver and passengers;

S6: The analysis and recognition results are combined.

Further, step S1 is specifically:

S11: The speech of the driver and passengers is converted to text data by a speech recognition engine, and spelling errors in the text data are corrected;

S12: The corrected text data is segmented into a word sequence, and word vectors are obtained with Word2Vec;

S13: The word vectors are processed by a cascaded convolutional neural network to obtain the dialogue scene type;

S14: A deep reinforcement learning network is built, and reinforcement learning of the dialogue-state behavior policy is completed by iterating two independent deep reinforcement learning networks;

S15: A semantic knowledge graph is built from triples, and the upper and lower bounds of the knowledge-atom relation scores and embedding costs in the semantic knowledge graph are computed in real time to obtain the knowledge query result set;

S16: A multi-space probability distribution HMM parameter generation algorithm generates the corresponding spectral parameters and fundamental frequency to produce a smooth acoustic feature sequence, which is submitted to the synthesizer to generate the final speech.

Further, step S2 is specifically:

S21: Pre-emphasis, framing and windowing, and endpoint detection are applied to the speech collected from the driver and passengers during tracking;

S22: A Fourier transform of the processed speech gives the spectral energy distribution; a triangular filter bank divides it into critical bands, and amplitude weighting followed by a discrete cosine transform yields the cepstral coefficients;

S23: The cepstral coefficients are input to the speaker-recognition universal background model (UBM) to obtain the voiceprint features;

S24: Voiceprint template matching is performed to decide whether the voiceprint information matches.
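As a rough illustration of S21 and S22, the sketch below computes cepstral coefficients with NumPy. It assumes 16 kHz audio, 25 ms frames with a 10 ms hop, a Hamming window, 26 mel-spaced triangular filters, and 13 coefficients; none of these values come from the patent. Endpoint detection is omitted, and log compression of the filter-bank energies stands in for the amplitude weighting step, which is also an assumption.

```python
import numpy as np

def mel(f_hz):
    """Convert frequency in Hz to the mel scale."""
    return 2595.0 * np.log10(1.0 + f_hz / 700.0)

def inv_mel(m):
    """Convert mel values back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    points = np.linspace(mel(0.0), mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * inv_mel(points) / sample_rate).astype(int)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for k in range(1, n_filters + 1):
        left, center, right = bins[k - 1], bins[k], bins[k + 1]
        for b in range(left, center):          # rising edge
            bank[k - 1, b] = (b - left) / max(center - left, 1)
        for b in range(center, right):         # falling edge
            bank[k - 1, b] = (right - b) / max(right - center, 1)
    return bank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_ceps=13, alpha=0.97):
    # Pre-emphasis: boost high frequencies.
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    # Framing and Hamming windowing.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # Fourier transform -> power spectrum (spectral energy distribution).
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Triangular filter bank, then log compression (assumed weighting).
    bank = mel_filterbank(n_filters, n_fft, sample_rate)
    log_energy = np.log(np.maximum(power @ bank.T, 1e-10))
    # DCT-II of the log energies gives the cepstral coefficients.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), n + 0.5) / n_filters)
    return log_energy @ basis.T
```

Each row of the returned array holds the coefficients of one frame; a downstream model such as the UBM in S23 would consume these rows.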

Further, step S24 is specifically:

S241: The energy function is defined as

$$E(v,h\mid\theta) = -\sum_{i=1}^{n} a_i v_i - \sum_{j=1}^{m} b_j h_j - \sum_{i=1}^{n}\sum_{j=1}^{m} v_i W_{ij} h_j$$

where h and v are vectors representing the states of the hidden layer and the visible layer respectively, a and b are the biases of the visible layer and the hidden layer, v_i and h_j are the states of the i-th visible node and the j-th hidden node, n is the number of visible nodes, m is the number of hidden nodes, W_ij is the connection weight between the visible layer and the hidden layer, i and j are the node indices, a_i is the bias of the i-th visible node, and b_j is the bias of the j-th hidden node;

S242: Given the model θ = {W_ij, a_i, b_j}, the joint probability distribution of the state (v, h) is

$$P(v,h\mid\theta) = \frac{e^{-E(v,h\mid\theta)}}{Z(\theta)}$$

where $Z(\theta) = \sum_{v,h} e^{-E(v,h\mid\theta)}$ is the normalization factor;

S243: The high-dimensional Gaussian supervectors of the two classes of RBM-i-vector are obtained from the energy evolution of the RBM, and channel compensation is performed by linear discriminant analysis;

S244: The cosine similarity of the two compensated high-dimensional Gaussian supervectors is computed and compared with a preset threshold to decide whether the voiceprint information matches.
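The final decision in S244 can be sketched as a cosine score compared against a threshold. The function names and the threshold value of 0.7 are hypothetical; the patent only states that a preset threshold is used.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_u * norm_v)

def same_speaker(enrolled, probe, threshold=0.7):
    """Accept the probe when its score reaches the (assumed) threshold."""
    return cosine_similarity(enrolled, probe) >= threshold
```

Here `enrolled` and `probe` stand for the two channel-compensated supervectors; in practice the threshold would be tuned on held-out speaker data.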

Further, step S3 is specifically:

S31: A deep cascade network with head-and-shoulder recognition is built; for each frame, multi-scale sliding windows cut out a series of candidate patches at a predetermined stride to form the samples to be recognized;

S32: The samples are input to the trained head-shoulder/non-head-shoulder recognition model for classification;

S33: A nonlinear motion model and an appearance model are introduced for association analysis;

S34: The pose overlap is tested and compared with thresholds.
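The candidate generation in S31 can be sketched as a generator of multi-scale sliding windows. The base window size, the scale set, and the stride ratio are hypothetical parameters chosen for illustration; the patent specifies only that the stride is predetermined.

```python
def sliding_windows(width, height, base=64, scales=(1.0, 1.5, 2.0),
                    stride_ratio=0.5):
    """Yield (x, y, w, h) square candidate patches over a frame.

    Each scale multiplies the base window size, and the stride is a
    fixed fraction of the current window size.
    """
    for scale in scales:
        size = int(round(base * scale))
        if size > width or size > height:
            continue  # window does not fit in the frame at this scale
        step = max(1, int(size * stride_ratio))
        for y in range(0, height - size + 1, step):
            for x in range(0, width - size + 1, step):
                yield (x, y, size, size)
```

Each yielded patch would then be cropped from the frame and passed to the head-shoulder classifier of S32.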

Further, step S34 is specifically:

S341: The passenger pose boxes detected in consecutive frames are fused into one complete action;

S342: A fusion rule for two poses is designed,

where f(i, j) is the fusion function, with 1 meaning that the two poses are fused and 0 that they are not;

IoU is the overlap of the two pose detection boxes and T_IoU is the overlap threshold; S_his is the histogram matching score of the two detection boxes and T_his is the histogram matching threshold; t₁ and t₂ are the times of the two poses, and ΔT is the time-difference threshold between them.

Further, the identity authentication of the driver and passengers in step S4 is specifically:

S401: The face is coarsely located by a radial symmetry transform;

S402: The supervised descent method (SDM) is used to learn the optimal iteration vector from the current point to the target point, establishing a linear regression model between the shape offset Δx = x^* - x and the features of the current shape x,

where x^* is the target shape, b is the bias, and the remaining term is the regression model;

S403: The desired position vector is obtained iteratively from the current shape x and the deformation vector Δx:

x := x + Δx

where x denotes the desired position vector;

S404: The learning objective of SDM is constructed to obtain the true deviation between the i-th point and the actual boundary point,

where k is the iteration number, x_k is the shape vector at the k-th iteration, b_k is the bias at the k-th iteration, and the remaining terms denote the coordinate of the i-th point in the shape vector and its coordinate deformation;

S405: A differential operator is used,

to accurately locate the facial organs, where G_σ(r) is the smoothing function, I(x, y) is the image gray-level matrix, (a, b) is the circle center, and r is the radius;

S406: The facial organs in the target region are fitted with feature points to obtain the feature-point positions;

S407: Sub-images are cropped in the neighborhood of each feature point to obtain the neighborhood features of the facial organs, and the neighborhood features of all feature points are concatenated into the extreme learning feature;

S408: The extreme learning features of each image partition are collected as the training set of a generalized single-hidden-layer feedforward neural network; extreme learning machines are trained to search for the identity label of the specific person from the fused features, completing identity authentication;
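A minimal extreme learning machine, as a sketch of the classifier family named in S408: the hidden-layer weights are drawn at random and left fixed, and only the output weights are solved in closed form with a pseudo-inverse. The class name, the tanh activation, and the hidden-layer size are assumptions; the patent does not give these details, and this sketch trains a single machine rather than the ensemble described in the embodiment.

```python
import numpy as np

class ExtremeLearningMachine:
    """Single-hidden-layer feedforward network trained without backprop."""

    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Random, fixed projection followed by a nonlinearity.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        """X: (samples, features) floats; y: integer identity labels."""
        n_classes = int(y.max()) + 1
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        targets = np.eye(n_classes)[y]  # one-hot identity labels
        # Least-squares output weights via the Moore-Penrose pseudo-inverse.
        self.beta = np.linalg.pinv(self._hidden(X)) @ targets
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)
```

In the setting of S407 and S408, each row of `X` would be the concatenated neighborhood features of one face image and `y` the identity label.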

The fatigue monitoring of the driver and passengers in step S4 is specifically:

S411: A 3D face model is built to model the driver's face in 3D, and the driver's head pose is tracked in real time using the face recognition method of S406-S408;

S412: The eye positions in the 2D face image are solved from the eye positions in the 3D face model and the head pose;

S413: The feature points in the eye region are located with a facial landmark detection algorithm (CLM), and the located feature points are verified by face-image texture normalization;

S414: The iris center is located from the physiological structure of the iris;

S415: The upper and lower eyelids are located with a parameterized template, and the eye movements of the driver and passengers are extracted;

S416: From the eye movements, the fatigue-related features of eyelid opening degree, eye opening and closing speed, and iris motion are extracted and compared with the features in the awake state to obtain the variation features;

S417: A Bayesian network classifier analyzes the correlations among the fatigue features to complete fatigue monitoring of the driver and passengers.
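The comparison against the awake state in S416 can be illustrated with a small sketch. It assumes the eyelid opening degree is already available per frame as a number between 0 and 1, and it covers only opening degree and opening/closing speed; iris motion and the Bayesian network classifier of S417 are left out. All names are hypothetical.

```python
def eyelid_features(openness, fps):
    """Summarize a per-frame eyelid opening series.

    openness: opening degree per frame, 0 = closed, 1 = fully open.
    fps: frame rate, used to express speed per second.
    """
    mean_open = sum(openness) / len(openness)
    speeds = [abs(b - a) * fps for a, b in zip(openness, openness[1:])]
    mean_speed = sum(speeds) / len(speeds) if speeds else 0.0
    return {"mean_open": mean_open, "mean_speed": mean_speed}

def variation_features(current, awake):
    """Difference between current features and the awake-state baseline."""
    return {name: current[name] - awake[name] for name in awake}
```

A classifier would then take the variation features as input; negative `mean_open` variation, for example, indicates eyes that are less open than in the awake baseline.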

Further, step S5 is specifically:

S51: Gesture images of the driver and passengers are captured and converted into the image sequence I_rgb;

S52: The image sequence I_rgb is converted into the grayscale image sequence I_gray and into the binary skin-color information image sequence I_skin;

S53: Motion parameters are computed from the grayscale image sequence I_gray and the binary skin-color information image sequence I_skin as the inter-frame motion features;

S54: The time length of the gesture motion is normalized and a probability function is constructed,

where i denotes the i-th state, j denotes the j-th feature parameter, x_i,j is the motion and shape feature of the time-normalized gesture sequence, λ denotes the gesture class, μ is the matrix of expected values of the feature parameters, σ is the standard deviation, μ_i,j is the expected value of the j-th feature parameter of the i-th state, and σ_i,j is the standard deviation of the j-th feature parameter of the i-th state;

S55: The probability function of observing the complete gesture sequence is constructed,

where X is the observation of the complete gesture sequence, m is the number of gesture states, and n is the number of gesture features;

S56: For each gesture class, the following is calculated,

and the class that yields the minimum value is the gesture class to which the observation belongs.
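One plausible reading of S54 to S56 is sketched below. It assumes that each feature parameter of each state follows an independent Gaussian with mean μ_i,j and standard deviation σ_i,j, that the sequence probability is the product over all states and features, and that the minimized quantity is the negative log of that product. The patent's own formulas are not reproduced in this text, so these are assumptions rather than the authors' exact expressions.

```python
import math

def neg_log_likelihood(x, mu, sigma):
    """Negative log-likelihood of an m x n observation matrix.

    x, mu, sigma: nested lists with m rows (states) and n columns
    (feature parameters); every sigma entry must be positive.
    """
    total = 0.0
    for x_row, mu_row, sigma_row in zip(x, mu, sigma):
        for x_ij, mu_ij, s_ij in zip(x_row, mu_row, sigma_row):
            total += 0.5 * math.log(2.0 * math.pi * s_ij ** 2)
            total += (x_ij - mu_ij) ** 2 / (2.0 * s_ij ** 2)
    return total

def classify(x, models):
    """Return the gesture label whose model gives the smallest value.

    models: dict mapping a label to its (mu, sigma) pair.
    """
    return min(models, key=lambda label: neg_log_likelihood(x, *models[label]))
```

Minimizing the negative log-likelihood is equivalent to maximizing the product of the Gaussian densities, which is why the smallest value identifies the best-fitting class.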

The beneficial effects of the invention are as follows. The invention is a deep-learning-based hybrid-augmented intelligent cognition technology for commercial RVs.

First, to meet the need for intelligent voice interaction between the driver and passengers and facilities such as the vehicle electronics and in-vehicle infotainment, this patent designs a human-vehicle multi-turn dialogue model that enables intelligent voice communication between the driver and passengers and the on-board equipment. The module comprises a speech data acquisition layer, a preprocessing layer, a semantic parsing layer, a dialogue management layer, a knowledge reasoning layer, and a speech output layer. The speech data is analyzed and processed layer by layer to realize spoken communication between people and the on-board system.

Second, driver identification and fatigue detection. Driver identification includes voiceprint recognition and face recognition, which perform identity verification with an algorithm-driven model and a BP network model respectively. Identifying the driver safeguards the RV and the property inside it.

Third, driver fatigue detection, which the on-board equipment of an RV needs in order to ensure driving safety. Fatigue detection extracts head pose and eye movement information, compares the computed feature values with those of the awake state to obtain variation features, and judges fatigue from the values of the variation features.

Fourth, this patent also proposes its own innovations in driver behavior analysis and gesture recognition. A deep cascade network is designed and trained with a nonlinear motion model and an appearance model to analyze the driver's behavioral intentions and assist driving. Gesture recognition is part of in-vehicle human-vehicle interaction, and its purpose is to simplify the driver's operations. A multi-state model extracts and analyzes the motion and color information of the hand to interact with the driver and meet the driver's needs. The proposed method enriches the practical experience of the driver and passengers while ensuring property safety and driving safety.

Description of the drawings

To make the object, technical solution, and beneficial effects of the invention clearer, the following drawings are provided for illustration:

Fig. 1 shows the multi-turn dialogue model based on a POMDP policy and a triple-based knowledge graph according to an embodiment of the invention;

Fig. 2 is a schematic diagram of total variability factor modeling based on a restricted Boltzmann machine according to an embodiment of the invention;

Fig. 3 is a structural diagram of the three-stage deep cascade network according to an embodiment of the invention;

Fig. 4 shows multi-target tracking with an online-learned nonlinear motion model and a robust appearance model according to an embodiment of the invention;

Fig. 5 shows the face feature extraction regions and localization results according to an embodiment of the invention;

Fig. 6 shows fatigue state monitoring of the driver and passengers according to an embodiment of the invention;

Fig. 7 shows the spatio-temporal appearance of gesture features according to an embodiment of the invention.

Detailed description of the embodiments

The preferred embodiments of the invention are described in detail below with reference to the drawings.

The invention comprises the following steps:

1. Human-vehicle intelligent voice interaction. To meet the need for intelligent voice interaction between the driver and passengers and facilities such as the vehicle electronics and in-vehicle infotainment, dialogue state tracking and management techniques are used to design a human-vehicle multi-turn dialogue model based on a POMDP policy and a triple-based knowledge graph, enabling fluent communication between the driver and passengers and the on-board equipment. As shown in Fig. 1, the model is as follows:

a) Data acquisition layer: the user's speech is converted to text data by a speech recognition engine, and word spelling errors are corrected.

b) Preprocessing layer: the corrected text data is segmented into a word sequence; part-of-speech tagging, named-entity recognition, coreference resolution, and dependency parsing are completed at the lexical and semantic levels; and word vectors are obtained with Word2Vec.

c) Semantic parsing layer: the merged, encoded word vectors are submitted to a cascaded convolutional neural network, which completes a preliminary semantic parse and yields the dialogue scene type.

d) Dialogue management layer: a dialogue question-guidance policy is designed and dialogue state tracking is implemented within the POMDP model; a deep reinforcement learning network (DQN) is built, and reinforcement learning of the dialogue-state behavior policy is completed by iterating two independent Q networks.

e) Knowledge reasoning layer: a semantic knowledge graph is built from triples. Without using an index, the upper and lower bounds of the knowledge-atom relation scores and embedding costs in the knowledge graph are computed in real time to derive the Top-k knowledge query result set. After the single-scene and cross-scene knowledge-atom combination sets are determined, a scoring function is designed for each, and a multi-column convolutional network drives training on the cross-domain combination sets to compute the cross-domain knowledge fusion score.

f) Speech output layer: the input text is analyzed, a multi-space probability distribution HMM parameter generation algorithm generates the corresponding spectral parameters and fundamental frequency to produce a smooth acoustic feature sequence, and this is submitted to the synthesizer to generate the final speech.
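The two-network iteration of the dialogue management layer can be illustrated with a tabular stand-in for the neural Q networks. In this sketch, which follows the common DQN arrangement rather than anything the patent spells out, an online table is updated toward targets computed from a separate, periodically synchronized target table. The dialogue states, actions, rewards, learning rate, and discount factor are all hypothetical.

```python
import copy

def td_targets(batch, q_target, gamma=0.9):
    """Compute r + gamma * max_a Q_target(s', a) for each transition.

    batch: list of (state, action, reward, next_state, done) tuples.
    q_target: dict mapping a state to a list of action values.
    """
    targets = []
    for _state, _action, reward, next_state, done in batch:
        bootstrap = 0.0 if done else gamma * max(q_target[next_state])
        targets.append(reward + bootstrap)
    return targets

def train_step(batch, q_online, q_target, lr=0.5, gamma=0.9):
    """Move the online estimates toward targets from the target table."""
    targets = td_targets(batch, q_target, gamma)
    for (state, action, *_), y in zip(batch, targets):
        q_online[state][action] += lr * (y - q_online[state][action])

def sync(q_online, q_target):
    """Copy the online table into the target table."""
    q_target.clear()
    q_target.update(copy.deepcopy(q_online))
```

Keeping the target table fixed between calls to `sync` is what makes the two networks independent during an update, which tends to stabilize learning.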

2. Driver identification by voiceprint. As shown in Fig. 2, to meet the need for driver identity authentication in the intelligent safety domain of the commercial RV, a restricted Boltzmann machine feature extraction technique under the total variability factor replaces i-vector feature extraction, and a UBM driven by the EM algorithm is designed, realizing voiceprint discrimination characterized by high-dimensional Gaussian components.

a) Acquire and process speech segments. Pre-emphasis, framing and windowing, and endpoint detection are applied to the acquired speech segments. A Fourier transform of the signal gives the spectral energy distribution; a triangular filter bank divides it into critical bands, and amplitude weighting followed by a discrete cosine transform yields the cepstral coefficients (MFCC).

b) Obtain voiceprint features. The cepstral coefficients are submitted to the UBM trained by the EM algorithm to obtain the probability scores of the voiceprint features, which are template-matched against the corresponding Gaussian components.

c) Voiceprint template matching. A restricted Boltzmann machine (RBM) consisting of a visible layer of n nodes and a hidden layer of m nodes is designed, with the energy function

$$E(v,h\mid\theta) = -\sum_{i=1}^{n} a_i v_i - \sum_{j=1}^{m} b_j h_j - \sum_{i=1}^{n}\sum_{j=1}^{m} v_i W_{ij} h_j$$

where the vectors h and v represent the states of the hidden layer and the visible layer respectively, a and b are the biases of the visible layer and the hidden layer, and v_i and h_j are the states of the i-th visible node and the j-th hidden node. Given the model θ = {W_ij, a_i, b_j}, the joint probability distribution of the state (v, h) is

$$P(v,h\mid\theta) = \frac{e^{-E(v,h\mid\theta)}}{Z(\theta)}$$

where $Z(\theta) = \sum_{v,h} e^{-E(v,h\mid\theta)}$ is the normalization factor.

d) Decide the speaker. The high-dimensional Gaussian supervectors of the two classes of RBM-i-vector are obtained from the energy evolution of the RBM, and channel compensation is performed by linear discriminant analysis (LDA). The cosine similarity of the two compensated RBM-i-vectors is computed and compared with a preset threshold to decide whether the voiceprint belongs to the specific speaker.
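For a very small RBM, the energy and joint probability in step c) can be evaluated exactly by enumerating every state. The sketch assumes binary units and the standard RBM energy E(v, h) = -Σ a_i v_i - Σ b_j h_j - Σ v_i W_ij h_j; enumeration is exponential in n + m, so it is useful only as a check on tiny models, not as a training method.

```python
import itertools
import math

def energy(v, h, W, a, b):
    """Energy of one (visible, hidden) configuration."""
    e = -sum(a_i * v_i for a_i, v_i in zip(a, v))
    e -= sum(b_j * h_j for b_j, h_j in zip(b, h))
    e -= sum(v[i] * W[i][j] * h[j]
             for i in range(len(v)) for j in range(len(h)))
    return e

def joint_probability(v, h, W, a, b):
    """exp(-E(v, h)) divided by the partition function Z."""
    n, m = len(a), len(b)
    z = sum(math.exp(-energy(vv, hh, W, a, b))
            for vv in itertools.product((0, 1), repeat=n)
            for hh in itertools.product((0, 1), repeat=m))
    return math.exp(-energy(v, h, W, a, b)) / z
```

With all weights and biases at zero, every one of the 2^(n+m) configurations has the same probability, which gives a quick sanity check.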

3. Driver behavior analysis. To meet the need for analyzing the intentions of the driver and passengers in the intelligent behavior-interaction domain of the commercial RV, nonlinear motion pattern learning and multi-instance learning of the appearance model are used, and a deep cascade network with head-and-shoulder recognition is designed, realizing behavior-intention analysis driven by a hierarchical-association multi-target tracking learning strategy.

a) Build the deep cascade network and screen samples. A deep cascade network with head-and-shoulder recognition (HsNet) is built. For each frame, multi-scale sliding windows cut out a series of candidate patches at a predetermined stride to form the samples to be recognized. These samples are fed to the pre-trained head-shoulder/non-head-shoulder recognition model HsNet, a three-stage CNN cascade network shown in Fig. 3, for classification. During classification, patches judged to be negative samples are discarded directly, while the remaining samples pass to the next stage of the network for stricter classification, so that the three CNN stages are applied in turn. The output of the third stage decides whether an image patch belongs to a head-and-shoulder region; the height of the head-and-shoulder box is then extended to 3 times that of the corresponding sliding window to obtain the full-body box of the detected passenger. Several detection boxes may be formed for the same passenger, so a non-maximum suppression strategy finally removes the redundant boxes and keeps only the single most probable detection box at each position as the passenger detection result.

b) Introduce the nonlinear motion model and the appearance model for association analysis. Nonlinear motion pattern learning and multi-instance learning of the appearance model are introduced into the hierarchical-association multi-target tracking learning strategy. Reliable low-level association of the detected objects forms track fragments; online learning of the nonlinear motion pattern and multi-instance learning of the appearance model then connect the track fragments effectively to obtain reliable object trajectories. Parameters extracted from the trajectories, such as speed, direction, and distance, serve as features, and several features are combined into higher-level semantics that describe the object's behavior, from which the behavioral intentions of the driver and passengers are judged.

c) Test the pose overlap and compare thresholds. As shown in Fig. 4, to improve the robustness of behavior detection, the passenger pose boxes detected in consecutive frames are fused into one complete action. The fusion rule for two poses is designed as:

where f(i, j) is the fusion function, with 1 meaning that the two poses are fused and 0 that they are not;

IoU is the overlap of the two pose detection boxes and T_IoU = 0.5 is the overlap threshold; S_his is the histogram matching score of the two detection boxes and T_his = 35 is the histogram matching threshold; and T_Δ = 25 is the time-difference threshold between the two poses.
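A sketch of the fusion rule in step c), using the thresholds quoted above. The rule's exact form is not reproduced in this text, so the comparison directions are assumptions: two poses are fused when their overlap exceeds T_IoU, their histogram matching score exceeds T_his, and their time difference is below T_Δ. The box format (x1, y1, x2, y2) and the function names are also hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def fuse(box_i, box_j, s_his, t_i, t_j, t_iou=0.5, t_his=35, t_delta=25):
    """Return 1 when the two pose boxes should be fused, otherwise 0."""
    overlapping = iou(box_i, box_j) > t_iou
    similar = s_his > t_his              # assumed: higher score = better match
    close_in_time = abs(t_i - t_j) < t_delta
    return int(overlapping and similar and close_in_time)
```

If the histogram score is actually a distance, where smaller means more similar, the `similar` comparison would need to be reversed.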

4. Face recognition and fatigue monitoring of the driver and passengers. To meet the needs for identity authentication and fatigue state monitoring in the intelligent safety domain of the commercial RV, the supervised descent algorithm and the CLM localization algorithm are used respectively. A generalized single-hidden-layer feedforward neural network based on extreme learning machines is designed to authenticate the specific driver, and a matching template based on 3D face modeling is designed to monitor the driver's fatigue state.

a) Identity authentication based on facial features: the face is coarsely located by a radial symmetry transform. The supervised descent method (SDM) learns the optimal iteration vector from the current point to the target point, establishing a linear regression model between the shape offset Δx = x^* - x and the features of the current shape x. The desired position vector is then obtained iteratively from the current shape x and the deformation vector Δx as x := x + Δx. The learning objective of SDM is constructed:

where k is the iteration number, x_k is the shape vector at the k-th iteration, and the remaining term is the coordinate of the i-th point in the shape vector. Several iterations of learning give the true deviation between the i-th point and the actual boundary point. An integro-differential operator is then used:

to accurately locate organs such as the eyes, nose, and mouth in the face, where G_σ(r) is the smoothing function, I(x, y) is the image gray-level matrix, (a, b) is the circle center, and r is the radius. The face localization result is shown in Fig. 5.

A generalized single-hidden-layer feedforward neural network is designed for face recognition. The network is invariant to monotonic gray-level changes and to rotation, and is insensitive to image changes caused by uneven illumination. During recognition, the facial organs in the target region are fitted with feature points to obtain the feature-point positions. Sub-images are cropped in the neighborhood of each feature point to obtain the neighborhood features of the facial organs, and the neighborhood features of all feature points are finally concatenated into the extreme learning feature. The extreme learning features of each image partition are collected as the training set of the generalized single-hidden-layer feedforward neural network, and several extreme learning machines are trained. The outputs of the feedforward networks are combined and, driven by the output decision of the optimal ensemble classifier, the identity label of the specific person is searched from the fused features to complete identity authentication. The extraction result for the face extreme learning features is shown in Fig. 5.

b) Fatigue monitoring based on facial features: a 3D face model is built to model the driver's face in 3D, and the driver's head pose is tracked in real time in combination with the face recognition method above. The eye positions in the 2D face image are solved indirectly from the eye positions in the 3D face model and the head pose; the feature points in the eye region are located with the CLM algorithm, and the feature-point localization is verified by face-image texture normalization. The iris center is located from the physiological structure of the iris, which overcomes differences in iris imaging under different lighting conditions. The upper and lower eyelids are located with a parameterized template to extract the driver's eye movements. From the eye movement characteristics, the fatigue-related features of eyelid opening degree, eye opening and closing speed, and iris motion are extracted and compared with the feature values in the awake state to obtain the variation features. A Bayesian network classifier is built to analyze the correlations among the fatigue features and determine the driver's fatigue state, as shown in Fig. 6.

5. Gesture recognition for the driver and passengers. To meet the need for typical human-vehicle gesture interaction between the driver and passengers and facilities such as the vehicle electronics and in-vehicle infotainment, a multi-state Gaussian probability model for complex backgrounds is designed, and the motion and color information of the hand are combined to segment the hand and recognize the gestures of the driver and passengers, as shown in Fig. 7.

a) Image conversion and processing. The color image sequence I_rgb is obtained by the camera. On the one hand it is converted into the 256-level grayscale image sequence I_gray for analyzing the motion parameters; on the other hand, according to the distribution of RGB colors in HSI space, it is converted into the binary skin-color information image sequence I_skin, which is divided into skin-color and non-skin-color regions.

b) Feature extraction and image fusion. The grayscale image sequence I_gray is processed to obtain the coarse binary motion image sequence I_mov. A logical AND of corresponding images in I_mov and I_skin gives the binary moving-skin region image sequence I_mov-skin, whose regions are moving skin. Because I_mov-skin does not necessarily contain the complete hand region, a seed algorithm is designed to find it. First, assuming that the moving hand region is the main part of I_mov-skin, the seed algorithm is applied in I_mov-skin according to region connectivity to find the largest connected domain B, which is taken as part of the hand. Then B is mapped to the same position in I_skin and, with this position as the seed, the seed algorithm grows the region in I_skin to obtain the image sequence I_hand of the complete hand region. For the hand region image sequence I_hand, the shape features of the hand region are extracted; combining I_gray and I_hand, the motion parameters are computed over the hand regions of adjacent frames as the inter-frame motion features.

c) Extraction of gesture motion and shape features. Let L be the time length of the gesture, s[t] the shape feature of frame t, and m[t] the motion feature between frames t and t+1. The 8-dimensional feature vector f[t] = [m[t], s[t]]^T is defined to describe the appearance of the gesture uniformly and to form a time-scale-invariant feature sequence, realizing time-scale-invariant feature extraction and matching. The spatio-temporal appearance feature A = [f[0], f[1], ..., f[L-2]]^T of the gesture describes how the feature vector f[t] changes over time. The time length L of the gesture motion is normalized, and the probability function of observing x_i,j for the j-th feature parameter of the i-th state is constructed:

where x_i,j is the motion and shape feature of the time-normalized gesture sequence, λ is the model of any gesture class, μ is the matrix of expected values of the feature parameters, and σ is the standard deviation. For the gesture model λ(μ, σ), the probability function of observing the complete gesture sequence X can then be built:

For each gesture to be recognized, the score is calculated, and the class that yields the minimum value is taken as the gesture class to which it belongs.
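The hand-region extraction described in step b) can be sketched on small binary masks. The sketch treats the seed algorithm as 4-connected region growing, which is an assumption, and represents each mask as a nested list of 0/1 values; function names are hypothetical.

```python
from collections import deque

def flood(mask, seeds):
    """Return the foreground pixels 4-connected to any seed pixel."""
    rows, cols = len(mask), len(mask[0])
    region = {p for p in seeds if mask[p[0]][p[1]]}
    queue = deque(region)
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and mask[nr][nc] and (nr, nc) not in region:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

def largest_component(mask):
    """Largest 4-connected foreground region of a binary mask."""
    seen, best = set(), set()
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value and (r, c) not in seen:
                component = flood(mask, [(r, c)])
                seen |= component
                if len(component) > len(best):
                    best = component
    return best

def hand_region(i_mov, i_skin):
    """Grow the largest moving-skin blob B inside the skin mask."""
    i_mov_skin = [[m & s for m, s in zip(mov_row, skin_row)]
                  for mov_row, skin_row in zip(i_mov, i_skin)]
    b = largest_component(i_mov_skin)   # part of the hand
    return flood(i_skin, b)             # complete hand region
```

Growing from B inside I_skin recovers skin pixels of the hand that were not moving, while skin regions not connected to B, such as a face elsewhere in the frame, are left out.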

Finally, the preferred embodiments above are intended only to illustrate the technical solution of the invention, not to limit it. Although the invention has been described in detail through the preferred embodiments above, those skilled in the art should understand that various changes may be made to it in form and detail without departing from the scope defined by the claims of the invention.

Claims

1. A hybrid-augmented intelligent cognition method for an intelligent business recreational vehicle, characterized in that the method specifically comprises the following steps:

S1: The driver and passengers communicate with the on-board electronic device, and the dialogue state between the user and the on-board electronic device is tracked;

S2: The driver and passengers are authenticated according to the voiceprint information collected from them;

S3: The behavioral intentions of the driver and passengers are analyzed;

S4: Face recognition is performed on the driver and passengers to carry out identity authentication and fatigue monitoring;

S5: Gesture recognition is performed on the driver and passengers;

S6: The analysis and recognition results are obtained by combining the above.

2. The hybrid-augmented intelligent cognition method for an intelligent business recreational vehicle according to claim 1, characterized in that step S1 specifically comprises:

S11: The speech of the driver and passengers is converted into text data by a speech recognition engine, and spelling correction is performed on the text data;

S12: The corrected text data is segmented into words to obtain a word sequence, and word vectors are obtained with Word2Vec;

S14: A deep reinforcement learning network is built, and reinforcement learning of the dialogue-state behavior policy is completed iteratively by two independent deep reinforcement learning networks;

S15: A semantic knowledge graph is built from triples, and the knowledge-atom relation scores and the upper and lower bounds of the embedding cost in the semantic knowledge graph are calculated in real time to obtain the knowledge query result;

S16: A multi-space probability distribution HMM parameter generation algorithm is used to generate the corresponding spectral parameters and fundamental frequency, producing a smooth acoustic feature sequence that is submitted to the synthesizer to generate the final speech.

3. The hybrid-augmented intelligent cognition method for an intelligent business recreational vehicle according to claim 2, characterized in that step S2 specifically comprises:

S22: A Fourier transform is applied to the processed speech information to obtain the spectral energy distribution, a triangular filter bank is used to divide the critical bands, and amplitude weighting and a discrete cosine transform are performed to obtain the cepstral coefficients;

4. The hybrid-augmented intelligent cognition method for an intelligent business recreational vehicle according to claim 3, characterized in that step S24 specifically comprises:

S241: The energy function is defined as

Wherein h and v are vectors representing the states of the hidden layer and the visible layer respectively; a and b are the biases of the visible layer and the hidden layer; v_i and h_j are the states of the i-th visible node and the j-th hidden node; m is the number of hidden nodes; W_ij is the connection weight between the visible layer and the hidden layer; i and j are the node indices; a_i is the bias of the i-th visible node; and b_j is the bias of the j-th hidden node;

Wherein,For normalization factor；

S243: The high-dimensional Gaussian supervectors of the two classes of RBM-i-vector are calculated through the energy evolution of the RBM, and channel compensation is performed using linear discriminant analysis;

S244: The cosine similarity of the two compensated high-dimensional Gaussian supervectors is calculated and compared with a preset threshold to judge whether the voiceprint information matches.

5. The hybrid-augmented intelligent cognition method for an intelligent business recreational vehicle according to claim 3, characterized in that step S3 specifically comprises:

S31: A deep cascade network with head-and-shoulder recognition is built, and for each frame a multi-scale sliding window crops a series of candidate regions at a predetermined step size to form the samples to be recognized;

S33: A nonlinear model and an appearance model are introduced for association analysis;

6. The hybrid-augmented intelligent cognition method for an intelligent business recreational vehicle according to claim 5, characterized in that step S34 specifically comprises:

S342: Fusion rules for two postures are designed,

Wherein the rules involve the overlap of the two posture detection boxes; T_IoU is the overlap threshold; S_his is the histogram matching score of the two detection boxes; T_his is the histogram matching threshold; t₁ and t₂ are time instants; and ΔT is the time difference threshold of the two postures.

7. The hybrid-augmented intelligent cognition method for an intelligent business recreational vehicle according to claim 5, characterized in that the identity authentication of the driver and passengers in step S4 specifically comprises:

S401: The face is coarsely located by a radial symmetry transform;

S402: The optimal iteration vector from the current point to the target point is learned using the supervised descent method (SDM), and a linear regression model is established between the shape offset Δx = x^* - x and the features of the current shape x,

Wherein x^*For, b is biasing,For regression model；

x := x + Δx

x: represents the desired position vector;

Wherein k is the iteration number, x_k is the shape vector at the k-th iteration, and b_k is the bias at the k-th iteration; the remaining symbols denote the coordinates of the i-th point in the shape vector and the coordinate deformation of the i-th point in the shape vector;

S405: Using the differential operator,

the facial organs are precisely located, wherein G_σ(r) is the smoothing function, I(x, y) is the image grayscale matrix, (a, b) is the circle center, and r is the radius;

S407: Sub-images are cropped within the neighborhood of each feature point to obtain the neighboring facial features, and the neighboring features of all feature points are concatenated to form the feature-point extreme learning feature,

S408: The extreme learning features of each image partition are collected as the training set of a generalized single-hidden-layer feedforward neural network, the extreme learning machine is trained, and the identity label of the specific person matching the fused features is searched to complete identity authentication;

The fatigue monitoring of the driver and passengers in step S4 specifically comprises:

S411: A 3D face model of the driver is built through 3D face modeling, and the head pose of the driver is tracked in real time according to the face recognition method of S406-S408;

S413: Feature points in the eye region are located using the facial point detection algorithm (CLM), and the located feature points are verified using facial image texture normalization;

S416: According to eye movement, the fatigue-related features of eyelid opening degree, eye opening and closing speed, and iris motion characteristics are extracted respectively and compared with the features in the awake state to obtain the change features;

S417: The correlations among the fatigue features are analyzed using a Bayesian network classifier to complete the fatigue monitoring of the driver and passengers.

8. The hybrid-augmented intelligent cognition method for an intelligent business recreational vehicle according to claim 7, characterized in that step S5 specifically comprises:

S53: Motion parameters are calculated from the grayscale image sequence I_gray and the binary skin-color image sequence I_skin and used as the inter-frame motion features;

S54: The time length of the gesture motion is normalized, and the probability function is constructed,

Wherein i denotes the i-th state, j denotes the j-th feature parameter, x_i,j is the motion feature and shape feature of the time-normalized gesture sequence, λ denotes the gesture class, μ denotes the mathematical expectation matrix of the feature parameters, σ is the standard deviation, μ_i,j is the mathematical expectation of the j-th feature parameter of the i-th state, and σ_i,j is the standard deviation of the j-th feature parameter of the i-th state;

S56: For each gesture class, calculate

and the class that yields the minimum value is taken as the gesture class to which the gesture belongs.