CN115440221B - Vehicle-mounted intelligent voice interaction method and system based on cloud computing - Google Patents


Info

Publication number
CN115440221B
CN115440221B
Authority
CN
China
Prior art keywords: information, instruction, interactive, vehicle, target user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211395643.1A
Other languages
Chinese (zh)
Other versions
CN115440221A (en)
Inventor
徐俊
Current Assignee
Foshan Tiandixing Technology Co., Ltd.
Original Assignee
Foshan Tiandixing Technology Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Foshan Tiandixing Technology Co., Ltd.
Priority to CN202211395643.1A
Publication of CN115440221A
Application granted
Publication of CN115440221B

Classifications

    • G10L 15/22 — Speech recognition: procedures used during a speech recognition process, e.g. man-machine dialogue
    • B60R 16/0373 — Vehicle electric circuits for occupant comfort: voice control
    • B60W 50/08 — Road vehicle drive control systems: interaction between the driver and the control system
    • G06F 16/3343 — Information retrieval of unstructured textual data: query execution using phonetics
    • G06F 16/3344 — Information retrieval of unstructured textual data: query execution using natural language analysis
    • G06F 16/35 — Information retrieval of unstructured textual data: clustering; classification
    • G06F 16/367 — Creation of semantic tools: ontology
    • G06F 40/30 — Handling natural language data: semantic analysis
    • G10L 15/063 — Speech recognition: creation of reference templates; training of speech recognition systems
    • G10L 15/16 — Speech classification or search using artificial neural networks
    • G10L 15/34 — Speech recognition: adaptation of a single recogniser for parallel processing, e.g. by use of multiple processors or cloud computing
    • G10L 17/02 — Speaker identification or verification: preprocessing operations; pattern representation or modelling; feature selection or extraction


Abstract

The invention discloses a vehicle-mounted intelligent voice interaction method and system based on cloud computing, comprising the following steps: acquiring attention information of a driving user during driving in the vehicle-mounted environment, and analyzing the driving user's state information together with the vehicle-mounted environment information to generate a driving scene for the vehicle's current driving process; acquiring the position and identity information of a target user through voice information, and initializing an instruction hierarchical graph; performing semantic recognition on the voice information based on machine learning, searching the instruction hierarchical graph to generate an interactive instruction, generating comprehensive constraints from the driving scene, and correcting the interactive instruction through those constraints; and acquiring the target user's feedback on the interactive instruction, analyzing the target user's interactive instruction habit information, and compensating the correction of the interactive instruction accordingly. The method and system intelligently analyze interactive instructions on the basis of the driving scene, better match the voice-interaction behavior of vehicle users without sacrificing recognition efficiency, and improve the in-vehicle interaction experience.

Description

Vehicle-mounted intelligent voice interaction method and system based on cloud computing
Technical Field
The invention relates to the technical field of voice interaction, and in particular to a vehicle-mounted intelligent voice interaction method and system based on cloud computing.
Background
With the rise of the Internet of Vehicles and intelligent automobiles, intelligent transportation has become a topic of wide concern, and vehicle head units carry ever more functions. Voice interaction technology has already been applied successfully in scenarios such as smart speakers and input methods; in the vehicle it helps reduce the driver's manual dependence on in-car equipment and improves driving safety. Against the background of rapidly developing intelligent technologies, the key technologies of voice interaction comprise speech recognition, semantic understanding and speech synthesis. Speech recognition converts the human voice signal, i.e. natural language, into corresponding text or instructions. Semantic understanding processes the received text or instructions so as to convert natural language into a language the machine can understand, and thereby grasp the user's intent.
Besides text content, speech recognition in vehicle-mounted voice interaction is also used to recognize the speaker's identity, so that differentiated services can be provided to the driver and passengers according to the application scenario. Compared with the voice interaction already widely deployed in vehicles, voiceprint recognition is a field with a relatively high technical threshold. The problem is therefore how to use voiceprint recognition to acquire user habit information more accurately, and to construct an intelligent voice interaction model for the special vehicle-mounted scenario, so as to better match the voice-interaction behavior of vehicle users, improve the in-vehicle interaction experience, strengthen vehicle safety protection, guarantee driver safety, and improve the interaction experience during driving.
Disclosure of Invention
In order to solve the technical problems, the invention provides a vehicle-mounted intelligent voice interaction method and system based on cloud computing.
The invention provides a vehicle-mounted intelligent voice interaction method based on cloud computing, which comprises the following steps:
acquiring attention information of a driving user in a driving process in a vehicle-mounted environment, analyzing state information of the driving user according to the attention information and the vehicle-mounted environment information, and generating a driving scene in the current driving process of a vehicle;
acquiring interactive voice information in the vehicle-mounted environment, acquiring the position and identity information of a target user through the interactive voice information, judging whether the target user is the driving user, matching an interactive instruction set corresponding to the target user according to the judgment result, and initializing an instruction hierarchical graph;
the cloud carries out semantic recognition on interactive voice information based on machine learning, searches in an instruction hierarchical graph to generate an interactive instruction, generates comprehensive constraints according to the driving scene, and corrects the interactive instruction through the comprehensive constraints;
feedback information of the target user to the interactive instruction is obtained, instruction habit information of the target user is analyzed through matching of voiceprint information and the feedback information of the target user, and correction of the interactive instruction is compensated based on the instruction habit information.
In this scheme, the state information of the driving user is analyzed by combining the attention information with the vehicle-mounted environment information, and the driving scene of the current vehicle driving process is generated, specifically:
acquiring facial frame image data of a driver through an in-vehicle camera, preprocessing the facial frame image data, and extracting a key frame of the facial frame image data;
extracting human face characteristic points of a driving user according to key frames of the facial frame image data, and acquiring human face orientation information, human eye closing degree and sight line direction according to the human face characteristic points;
comparing and analyzing the acquired face orientation information, the eye closing degree and the sight line direction with a preset threshold value, reading the attention information of a driving user, setting weight information according to road condition information of a current driving road section, and adjusting the attention threshold value by using the weight information;
evaluating the attention information of the driving user according to the attention threshold value at the current moment, acquiring vehicle-mounted environment information and an attention evaluation result, and performing matching analysis on the state information of the driving user;
when the state information of the driver is in a fatigue state, generating voice information to remind the driver, making a decision according to the vehicle-mounted environment information to generate a vehicle-mounted environment change suggestion, and acquiring voice feedback of the driver to execute a corresponding instruction;
in addition, the driving scene in the current vehicle driving process is generated through the state information of the driving user, the vehicle-mounted environment information and the vehicle driving information.
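The threshold-weighting step above can be sketched as follows. This is an illustrative reading only, not part of the disclosure: the condition names, weights, and the eye-closure interpretation of the attention threshold are all assumptions.

```python
# Illustrative sketch: the tolerated eye-closure ratio is scaled down on
# congested sections, so the same closure reading is judged more strictly.
# Weights and condition names are invented for this example.
ROAD_WEIGHTS = {
    "open": 1.0,        # clear road: no adjustment
    "busy": 0.85,       # somewhat stricter
    "congested": 0.7,   # reduced threshold, as in the congestion example above
}

def adjusted_closure_threshold(base: float, road_condition: str) -> float:
    """Scale the tolerated eye-closure ratio by the road-condition weight."""
    return base * ROAD_WEIGHTS.get(road_condition, 1.0)

def driver_state(eye_closure_ratio: float, base: float, road_condition: str) -> str:
    """Classify the driving user against the adjusted attention threshold."""
    if eye_closure_ratio > adjusted_closure_threshold(base, road_condition):
        return "fatigued"
    return "attentive"
```

Under this sketch, a closure ratio of 0.3 counts as attentive on an open road but as fatigued on a congested one, matching the idea that congestion lowers the tolerated threshold.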
In this scheme, the interactive voice information in the vehicle-mounted environment is acquired, and the position and identity information of the target user are acquired through the interactive voice information, specifically:
acquiring interactive voice information in the vehicle-mounted environment through a voice receiving module in the vehicle-mounted environment, filtering and denoising the interactive voice information, and dividing the vehicle-mounted environment into a preset number of sub-areas;
acquiring sound energy information and arrival time difference of the received interactive voice information in each subarea, and judging a source subarea of the interactive voice information according to the sound energy information and the arrival time difference;
determining the position of the interactive voice information and then carrying out voiceprint recognition, retrieving identity information from big data at the cloud according to the voiceprint recognition result, and computing the similarity between the voiceprint corresponding to the interactive voice information and the cloud-stored data;
acquiring the data whose similarity meets a preset similarity standard, extracting the corresponding identity information as the identity information of the target user, and reading the matched stored voice habit characteristics through that identity information; if no cloud-stored data meets the preset similarity standard, creating a voiceprint sequence and storing it in the cloud;
and matching an interactive instruction set corresponding to the function information according to the position information of the target user, and initializing an instruction hierarchical graph from the interactive instruction set based on the identity information.
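The energy-plus-arrival-time judgment of the source sub-area might be sketched as below; the tuple layout of the readings and the simple lexicographic decision rule (loudest first, earliest arrival as tie-break) are assumptions made for illustration.

```python
def locate_source(readings: dict) -> str:
    """Pick the source sub-area of an utterance.

    readings maps a sub-area name to (sound_energy, arrival_time_s) as
    measured by that sub-area's microphone.  The loudest signal wins, with
    the earliest arrival breaking ties; a deliberately simple stand-in for
    the combined energy / arrival-time-difference judgment in the text.
    """
    return max(readings, key=lambda zone: (readings[zone][0], -readings[zone][1]))
```

For example, a louder, earlier-arriving signal at the driver's seat identifies the driver sub-area as the source, after which voiceprint recognition proceeds on the located utterance.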
In the scheme, the cloud carries out semantic recognition on interactive voice information based on machine learning, searches in an instruction hierarchical graph to generate an interactive instruction, generates comprehensive constraints according to the driving scene, and corrects the interactive instruction through the comprehensive constraints, specifically comprising the following steps:
preprocessing interactive voice information, extracting Word vectors from the preprocessed interactive voice information through a Word2vec model, performing weighted average construction on sentence vector expression according to the Word vectors, and taking the Word vectors and the sentence vector expression as semantic features;
establishing a key information extraction model based on a bidirectional long-short term memory neural network model, inputting semantic features into the key information extraction model, and configuring differentiated weights by combining an attention mechanism and context to obtain key information in interactive voice information;
classifying the key information and attaching category labels, retrieving in the initialized instruction hierarchical graph to obtain the instruction corresponding to the key information, and thereby obtaining the intention of the target user;
when the retrieval path in the instruction hierarchical graph corresponds to more than one instruction, setting question-back voice information according to the retrieved content, updating the intention according to the target user's feedback, and matching the corresponding interactive instruction through the updated intention;
and setting comprehensive constraints based on the current driving scene, judging whether the matched interactive instruction falls within the range of the comprehensive constraints, and, if not, modifying the interactive instruction and querying the target user for feedback by voice.
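A minimal sketch of the comprehensive-constraint correction follows. The scene keys (`speed_kmh`, `driver_is_target`) and both rules are invented for illustration; in the disclosure the constraints are derived from the driving scene itself.

```python
def correct_instruction(instruction: str, scene: dict):
    """Check a matched instruction against scene constraints and correct it.

    Returns (possibly corrected instruction, needs_voice_confirmation).
    The rules below are hypothetical examples, not the patent's constraints.
    """
    if instruction == "open_sunroof" and scene.get("speed_kmh", 0) > 100:
        # unsafe at highway speed: downgrade and ask the target user by voice
        return "open_window_gap", True
    if instruction == "play_video" and scene.get("driver_is_target", False):
        # video playback for the driver conflicts with the driving scene
        return "play_audio_only", True
    return instruction, False
```

An out-of-range instruction is thus replaced with a safer variant and flagged so the system queries the target user for feedback before executing it.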
This scheme further includes monitoring the attention information of the driving user during voice interaction, specifically:
after receiving the interactive instruction, acquiring the state information of the current timestamp of the driving user, and creating a temporary attention monitoring task based on the state information of the current timestamp;
acquiring the sight line drop point frequency of a driving user in the driving scene of the current timestamp to acquire a watching hot spot area, acquiring the sight line drop point of each timestamp in the attention monitoring task through the sight line direction of the driving user, and marking the watching duration of the sight line drop point;
judging whether the sight line drop point of each timestamp in the attention monitoring task falls in the watching hotspot area; if the sight line drop point stays outside the watching hotspot area for longer than a preset threshold, judging whether to suspend voice interaction according to the type of the interaction instruction, and generating a voice prompt;
and after voice interaction is suspended, when it is detected that the driving user's sight line drop point has returned to the watching hotspot area, restoring the voice interaction scene and carrying out the operation corresponding to the instruction according to the historical interaction instruction.
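The dwell-time check in the attention monitoring task can be sketched as below, assuming time-ordered gaze samples and a caller-supplied hotspot predicate; the sample shape and the "critical instruction" flag are assumptions.

```python
def longest_off_hotspot_run(samples, in_hotspot):
    """samples: time-ordered (timestamp_s, gaze_point) pairs; in_hotspot is a
    predicate over gaze points.  Returns the longest continuous time the
    sight line drop point stayed outside the watching hotspot area."""
    longest, away_since = 0.0, None
    for t, point in samples:
        if in_hotspot(point):
            away_since = None          # gaze returned: close the current run
        else:
            if away_since is None:
                away_since = t         # gaze just left the hotspot
            longest = max(longest, t - away_since)
    return longest

def should_pause(samples, in_hotspot, limit_s, instruction_is_critical):
    """Suspend non-critical voice interaction when the gaze has left the
    hotspot for longer than the preset threshold."""
    return (not instruction_is_critical
            and longest_off_hotspot_run(samples, in_hotspot) > limit_s)
```

With a 2-second limit, a gaze that lingers 2.5 seconds on the screen triggers a pause for a non-critical instruction but not for a critical one.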
In this scheme, feedback information of the target user on the interactive instruction is acquired, the instruction habit information of the target user is analyzed by matching the target user's voiceprint information with the feedback information, and the correction of the interactive instruction is compensated based on the instruction habit information, specifically:
after the interactive instruction is executed, feedback information of the target user on the interactive instruction is obtained, a supplementary data set of each interactive instruction is set through the feedback information, and the supplementary data set is set as a voiceprint information tag of the target user;
performing supplementary correction on the instruction hierarchical graph based on the supplementary data set of each interactive instruction, extracting a graph structure from the corrected instruction hierarchical graph, and training a graph convolutional neural network on the extracted graph structure to obtain the instruction habit information of the target user;
establishing a personalized database of the target user by combining the instruction habit information with the corresponding vehicle-mounted environment, learning from the personalized data, and compensating the correction precision of the interactive instruction so that the interactive instruction achieves the target user's expected effect in one pass;
and presetting a cloud storage time threshold, and deleting the personalized database of the target user when the time for which the personalized database corresponding to the target user's voiceprint information has not been called exceeds the preset storage time threshold.
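The storage-time threshold cleanup might look like the following sketch, assuming the cloud keeps a last-called timestamp per voiceprint identity; the dictionary shape is an assumption.

```python
def purge_stale_profiles(profiles: dict, now_s: float, ttl_s: float) -> dict:
    """profiles maps a voiceprint id to the last time (in seconds) its
    personalized database was called.  Entries idle longer than the preset
    storage-time threshold are deleted; everything else is retained."""
    return {vid: last for vid, last in profiles.items() if now_s - last <= ttl_s}
```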
The second aspect of the invention further provides a cloud-computing-based vehicle-mounted intelligent voice interaction system, comprising a memory and a processor, wherein the memory stores a cloud-computing-based vehicle-mounted intelligent voice interaction program, and when the processor executes the program, the following steps are implemented:
acquiring attention information of a driving user in a driving process in a vehicle-mounted environment, analyzing state information of the driving user according to the attention information and the vehicle-mounted environment information, and generating a driving scene in the current driving process of a vehicle;
acquiring interactive voice information in a vehicle-mounted environment, acquiring position and identity information of a target user through the interactive voice information, judging whether the target user is a driving user, matching an interactive instruction set corresponding to the target user through a judgment result, and initializing an instruction hierarchical graph;
the cloud carries out semantic recognition on interactive voice information based on machine learning, searches in an instruction hierarchical graph to generate an interactive instruction, generates comprehensive constraints according to the driving scene, and corrects the interactive instruction through the comprehensive constraints;
feedback information of the target user to the interactive instruction is obtained, instruction habit information of the target user is analyzed through matching of voiceprint information and the feedback information of the target user, and correction of the interactive instruction is compensated based on the instruction habit information.
In summary, the invention discloses a vehicle-mounted intelligent voice interaction method and system based on cloud computing, comprising: acquiring attention information of a driving user during driving in the vehicle-mounted environment, and analyzing the driving user's state information together with the vehicle-mounted environment information to generate a driving scene for the current vehicle driving process; acquiring the position and identity information of a target user through voice information, and initializing an instruction hierarchical graph; performing semantic recognition on the voice information based on machine learning, searching the instruction hierarchical graph to generate an interactive instruction, generating comprehensive constraints from the driving scene, and correcting the interactive instruction through those constraints; and acquiring the target user's feedback on the interactive instruction, analyzing the target user's interactive instruction habit information, and compensating the correction of the interactive instruction. The method and system intelligently analyze interactive instructions on the basis of the driving scene, better match the voice-interaction behavior of vehicle users without sacrificing recognition efficiency, and improve the in-vehicle interaction experience.
Drawings
FIG. 1 is a flow chart of a cloud computing-based vehicle-mounted intelligent voice interaction method according to the invention;
FIG. 2 is a flowchart illustrating a method for obtaining location and identity information of a target user through interactive voice information according to the present invention;
FIG. 3 is a flow chart of a method for semantic recognition of interactive voice information based on machine learning according to the present invention;
FIG. 4 shows a block diagram of a cloud-computing-based vehicle-mounted intelligent voice interaction system of the invention.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those specifically described herein, and therefore the scope of the present invention is not limited by the specific embodiments disclosed below.
Fig. 1 shows a flow chart of a cloud computing-based vehicle-mounted intelligent voice interaction method of the invention.
As shown in fig. 1, a first aspect of the present invention provides a cloud-computing-based vehicle-mounted intelligent voice interaction method, including:
s102, acquiring attention information of a driving user in a driving process in a vehicle-mounted environment, analyzing state information of the driving user according to the attention information and the vehicle-mounted environment information, and generating a driving scene in the current vehicle driving process;
s104, acquiring interactive voice information in a vehicle-mounted environment, acquiring position and identity information of a target user through the interactive voice information, judging whether the target user is a driving user, matching an interactive instruction set corresponding to the target user through a judgment result, and initializing an instruction hierarchical graph;
s106, performing semantic recognition on interactive voice information by the cloud based on machine learning, searching in an instruction hierarchical graph to generate an interactive instruction, generating comprehensive constraint according to the driving scene, and correcting the interactive instruction through the comprehensive constraint;
and S108, acquiring feedback information of the target user on the interactive instruction, analyzing the instruction habit information of the target user by matching the target user's voiceprint information with the feedback information, and compensating the correction of the interactive instruction based on the instruction habit information.
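The Word2vec step of S106, building a sentence representation as a weighted average of word vectors, might be sketched as follows; the shapes of the vector store and weight dictionary are assumptions, and a trained Word2vec model would supply the vectors in practice.

```python
def sentence_vector(tokens, word_vectors, weights=None):
    """Weighted average of word vectors as a sentence representation.

    word_vectors maps a token to its vector; weights optionally maps a token
    to its averaging weight (default 1.0).  Out-of-vocabulary tokens are
    skipped.  This mirrors the weighted-average construction in S106."""
    dims = len(next(iter(word_vectors.values())))
    total, weight_sum = [0.0] * dims, 0.0
    for tok in tokens:
        if tok not in word_vectors:
            continue
        w = 1.0 if weights is None else weights.get(tok, 1.0)
        for i, v in enumerate(word_vectors[tok]):
            total[i] += w * v
        weight_sum += w
    if weight_sum == 0.0:
        return total
    return [v / weight_sum for v in total]
```

The resulting sentence vector, together with the word vectors themselves, serves as the semantic feature fed to the key information extraction model.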
Specifically, facial frame image data of the driving user are acquired through the in-vehicle camera, the frame image data are preprocessed, and key frames of the facial frame image data are extracted. Face characteristic points of the driving user are then extracted from the key frames, and face orientation information, eye closing degree and sight line direction are obtained from them. Corresponding threshold intervals are set for the extracted data and matched to attention levels; the acquired face orientation information, eye closing degree and sight line direction are compared with the preset thresholds, the attention information of the driving user is read, and the driving user's current attention level is judged.
Weight information is set according to the road condition information of the current driving section and used to adjust the attention threshold; for example, when the current section is congested, the corresponding attention threshold is reduced so that the driving user stays more concentrated while driving. The attention information of the driver is evaluated against the attention threshold at the current moment, and the vehicle-mounted environment information and the attention evaluation result are matched against the driver's state information, where the vehicle-mounted environment information includes the number of people in the vehicle and the sound, temperature and air quality of the vehicle-mounted environment. When the driver's state information indicates a fatigue state, voice information is generated to remind the driver, a decision is made according to the vehicle-mounted environment information to generate a vehicle-mounted environment change suggestion, and the driver's voice feedback is acquired to execute the corresponding instruction; for example, when the driver is in a slight fatigue state, voice interaction asks whether to open a window, play music or lower the air-conditioner temperature. In addition, the driving scene of the current vehicle driving process is generated from the driving user's state information, the vehicle-mounted environment information and the vehicle driving information.
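The fatigue-driven environment-change suggestion could be sketched as below; the fatigue levels, the 24 °C cutoff and the suggestion names are invented for the example, with only the window / music / air-conditioner options taken from the text above.

```python
def environment_suggestions(fatigue_level: str, cabin_temp_c: float):
    """Map a fatigue assessment to cabin-change suggestions offered by voice.

    fatigue_level is one of "none", "slight", "severe" (assumed scale)."""
    if fatigue_level == "none":
        return []
    suggestions = ["open_window", "play_music"]
    if cabin_temp_c > 24.0:
        suggestions.append("lower_ac_temperature")  # a warm cabin worsens drowsiness
    if fatigue_level == "severe":
        suggestions.insert(0, "recommend_rest_stop")  # escalate beyond comfort tweaks
    return suggestions
```

Each suggestion would then be offered by voice, and the driver's spoken feedback decides which corresponding instruction is executed.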
Fig. 2 is a flowchart illustrating a method for obtaining location and identity information of a target user through interactive voice information according to the present invention.
According to the embodiment of the invention, the interactive voice information in the vehicle-mounted environment is acquired, and the position and identity information of the target user is acquired through the interactive voice information, which specifically comprises the following steps:
s202, acquiring interactive voice information in the vehicle-mounted environment according to a voice receiving module in the vehicle-mounted environment, filtering and denoising the interactive voice information, and dividing the vehicle-mounted environment into a preset number of sub-regions;
s204, acquiring sound energy information and arrival time difference of the received interactive voice information in each subarea, and judging a source subarea of the interactive voice information according to the sound energy information and the arrival time difference;
s206, determining the position of the interactive voice information, then performing voiceprint recognition, retrieving identity information through big data at the cloud according to the voiceprint recognition result, and calculating the similarity between the voiceprint corresponding to the interactive voice information and cloud storage data;
s208, acquiring data with the similarity meeting a preset similarity standard, extracting corresponding identity information as the identity information of a target user, reading the voice habit characteristics matched with the stored voice habit characteristics through the identity information, and if the cloud storage data does not meet the preset similarity standard, creating a voiceprint sequence and storing the voiceprint sequence in the cloud;
and S210, matching an interactive instruction set corresponding to the function information according to the position information of the target user, and initializing an instruction hierarchical graph based on the identity information through the interactive instruction set.
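Steps S204 and S206 can be sketched as a combination of a per-region energy vote and an arrival-time-difference (TDOA) check. The two-microphone layout, the sign convention and the scoring rule are assumptions for illustration only:

```python
# Minimal sketch: decide the source sub-region of an utterance from the
# per-region sound energy plus the sign of the arrival-time difference.

def rms_energy(samples):
    return (sum(s * s for s in samples) / len(samples)) ** 0.5

def locate_subregion(region_samples, tdoa_seconds):
    """region_samples: {region_name: samples from that region's microphone}.
    tdoa_seconds: left-minus-right arrival-time difference; negative means
    the sound reached the left microphone first (hypothetical convention)."""
    energies = {r: rms_energy(s) for r, s in region_samples.items()}
    candidate = max(energies, key=energies.get)   # loudest region
    side = "left" if tdoa_seconds < 0 else "right"
    # Accept the energy candidate only if it agrees with the TDOA side;
    # otherwise fall back to a region on the TDOA side (assumes region
    # names contain "left"/"right").
    if side in candidate:
        return candidate
    return next(r for r in energies if side in r)

regions = {
    "front-left": [0.9, -0.8, 0.7],
    "front-right": [0.1, -0.1, 0.1],
}
print(locate_subregion(regions, -0.0002))  # "front-left"
```

In practice the TDOA would be estimated by cross-correlating the microphone signals rather than supplied directly.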
It should be noted that the interactive instruction set corresponding to the function information is matched according to the position information of the target user, and the instruction hierarchical graph is initialized from the interactive instruction set based on the identity information, specifically: the interactive instructions are classified based on position information within the vehicle-mounted environment, keyword information corresponding to each interactive instruction is obtained through big-data retrieval, and an interactive instruction knowledge graph is constructed from the interactive instructions, the keyword information and the classification results. A corresponding interactive instruction set is extracted from the interactive instruction knowledge graph according to the source sub-region of the target user's interactive voice information. Similar historical driving scenes are retrieved according to the current driving scene, the usage frequency of each interactive instruction in those similar scenes is extracted, and the instructions in the interactive instruction set are prioritized by usage frequency. If the identity information of the target user is already stored in the cloud, a user portrait is constructed from the historical interactive instructions corresponding to that identity, the priorities are adjusted through the user portrait, and the instruction hierarchical graph is generated from the interactive instruction set according to the adjusted priorities.
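The frequency-based prioritization with a user-portrait adjustment can be sketched as a simple scoring sort. The additive boost and all names are hypothetical:

```python
# Hedged sketch: rank interactive instructions by their usage frequency in
# similar historical driving scenes, then re-weight with a user-portrait boost.

def build_instruction_hierarchy(usage_freq, portrait_boost=None):
    """usage_freq: {instruction: count in similar historical scenes}.
    portrait_boost: {instruction: additive weight from the user portrait}."""
    portrait_boost = portrait_boost or {}
    score = {i: f + portrait_boost.get(i, 0) for i, f in usage_freq.items()}
    # Higher score = higher priority = earlier in the hierarchy.
    return sorted(score, key=score.get, reverse=True)

freq = {"navigate": 12, "play music": 30, "open window": 5}
boost = {"navigate": 25}  # this user issues navigation commands often
print(build_instruction_hierarchy(freq, boost))
# ['navigate', 'play music', 'open window']
```

Without the boost, "play music" would rank first; the portrait lifts "navigate" ahead of it, mirroring the priority adjustment described above.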
FIG. 3 is a flow chart illustrating a method for semantic recognition of interactive voice information based on machine learning according to the present invention.
According to the embodiment of the invention, the cloud carries out semantic recognition on interactive voice information based on machine learning, searches in an instruction hierarchical graph to generate an interactive instruction, generates comprehensive constraints according to the driving scene, and corrects the interactive instruction through the comprehensive constraints, specifically:
s302, preprocessing interactive voice information, extracting Word vectors from the preprocessed interactive voice information through a Word2vec model, performing weighted average according to the Word vectors to construct sentence vector expression, and taking the Word vectors and the sentence vector expression as semantic features;
s304, establishing a key information extraction model based on the bidirectional long-short term memory neural network model, inputting semantic features into the key information extraction model, and configuring differentiation weights by combining an attention mechanism and context to obtain key information in the interactive voice information;
s306, classifying by using the key information, labeling category labels, retrieving the corresponding instructions of the key information in the initialized instruction hierarchical diagram, and determining the intention of the target user;
s308, when the retrieval path in the instruction hierarchical graph corresponds to an instruction which is not unique, setting question-back voice information according to the retrieval content, updating the intention according to the feedback of the target user, and matching the corresponding interactive instruction according to the updated intention;
s310, comprehensive constraints are set based on the current driving situation, whether the matched interactive instruction meets the range of the comprehensive constraints or not is judged, if not, the interactive instruction is corrected, and then feedback information of a target user is inquired through voice.
It should be noted that after an interactive instruction is received, the state information of the driving user at the current timestamp is obtained, and a temporary attention monitoring task is created based on it. The gaze-point frequency of the driving user in the driving scene of the current timestamp is acquired to obtain a watching hotspot region; the gaze point at each timestamp of the monitoring task is acquired from the driving user's gaze direction, and the watching duration of each gaze point is recorded. Whether the gaze point at each timestamp falls inside the watching hotspot region is judged; if the duration for which the gaze point stays outside the hotspot region exceeds a preset threshold, whether to suspend the voice interaction is judged according to the type of the interactive instruction, and a voice prompt is generated. After the voice interaction is suspended, when the driving user's gaze point is detected to return to the watching hotspot region, the voice interaction scene is restored and the operation corresponding to the instruction is carried out according to the historical interactive instruction, where the target user's gaze point can be detected by eye-tracking hardware or by a pupil projection space method.
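The dwell-time check of the monitoring task can be sketched as follows. The rectangular hotspot region, the sampling format and the timings are assumptions:

```python
# Sketch of the temporary attention-monitoring task: flag the voice
# interaction for suspension when the gaze point stays outside the watching
# hotspot region (e.g. the road area) longer than a threshold.

def monitor_gaze(samples, hotspot, max_away_s):
    """samples: list of (timestamp_s, (x, y)) gaze points, in time order.
    hotspot: ((x_min, y_min), (x_max, y_max)) rectangle."""
    (x0, y0), (x1, y1) = hotspot
    away_start = None
    for t, (x, y) in samples:
        inside = x0 <= x <= x1 and y0 <= y <= y1
        if inside:
            away_start = None              # gaze returned to the hotspot
        elif away_start is None:
            away_start = t                 # gaze just left the hotspot
        elif t - away_start > max_away_s:
            return "suspend"               # away too long
    return "continue"

gaze = [(0.0, (5, 5)), (1.0, (20, 20)), (3.5, (20, 20))]
print(monitor_gaze(gaze, ((0, 0), (10, 10)), 2.0))  # "suspend"
```

A real system would run this continuously per timestamp and also consult the interaction-instruction type before suspending, as the text describes.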
The context features of the interactive voice information are integrated through the bidirectional long short-term memory (BiLSTM) neural network model, ensuring the semantic integrity of the interactive voice information. A data set is constructed by analyzing interactive instruction keywords through big data and is divided into a training set and a validation set; the training set is represented as word vectors and fed into the BiLSTM network combined with an attention mechanism for training, and the key information in the interactive voice information is extracted by the trained model.
In addition, comprehensive constraints are set based on the current driving scene: risk factors in the current scene are analyzed, and constraint information is attached to some interactive instructions accordingly. When an interactive instruction corresponding to the target user's interactive voice information falls outside the preset constraint range, a voice prompt is generated and the corresponding instruction is executed according to the target user's feedback; for example, when a child is detected in the vehicle from the vehicle-mounted environment information, the opening height of the window beside the child is constrained.
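The child-window example above can be sketched as a simple constraint filter over an instruction before execution. The instruction schema, the 0.3 opening limit and the action names are illustrative assumptions:

```python
# Hedged sketch of the comprehensive-constraint check: clamp window-height
# instructions at seats where a child was detected and ask the user first.

def apply_constraints(instruction, environment):
    """instruction: {'action', 'seat', 'value'}; value = window opening (0..1).
    environment: {'child_seats': set of seats where a child was detected}."""
    if (instruction["action"] == "open_window"
            and instruction["seat"] in environment["child_seats"]):
        max_open = 0.3                       # hypothetical safety limit
        if instruction["value"] > max_open:
            corrected = dict(instruction, value=max_open)
            return corrected, "prompt_user"  # voice prompt before executing
    return instruction, "execute"

cmd = {"action": "open_window", "seat": "rear-left", "value": 1.0}
env = {"child_seats": {"rear-left"}}
print(apply_constraints(cmd, env))
# ({'action': 'open_window', 'seat': 'rear-left', 'value': 0.3}, 'prompt_user')
```

Instructions that stay inside the constraint range pass through unchanged and are executed directly.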
After the interactive instruction is executed, feedback information of the target user on the interactive instruction is acquired; a supplementary data set is built for each interactive instruction from the feedback information and attached as a tag to the target user's voiceprint information. The instruction hierarchical graph is supplemented and corrected based on the supplementary data set of each interactive instruction; a graph structure representing the target user's habits with respect to the interactive instructions is extracted from the corrected instruction hierarchical graph; and a graph convolutional neural network is trained on the extracted graph structure to obtain the instruction habit information of the target user. Message propagation is performed between target user nodes and interactive instruction nodes that are connected in the graph structure; the propagation process consists of feature transformation, neighborhood aggregation and nonlinear activation, so that a target user node carrying its own attribute features also encodes local neighborhood information, expressed in vector form, specifically:
h_u^{(k)} = σ( Σ_{v ∈ N(u)} (1 / √(d_v · d_u)) · W^{(k)} · h_v^{(k-1)} )
wherein h_u^{(k)} represents the feature-vector representation of the target user node u after the k-th convolution, N(u) represents the set of interactive-instruction neighbor nodes of u, d_v and d_u represent the degrees of the historical interactive-instruction node v and of the target user node u respectively, W^{(k)} and σ are the feature transformation and nonlinear activation of the k-th convolution, and h_v^{(k-1)} represents the feature-vector representation of the interactive-instruction node v after the (k-1)-th convolution;
The instruction habit information is combined with the corresponding vehicle-mounted environment to construct a personalized database for the target user; learning on the personalized data compensates the correction precision of the interactive instruction, so that an interactive instruction achieves the effect the target user expects in one pass. A cloud storage time threshold is preset, and when the time for which the personalized database corresponding to the target user's voiceprint information has not been called exceeds this threshold, the personalized database of that target user is deleted.
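One round of the message propagation described above can be sketched in a few lines. This simplified version uses an identity feature transformation and ReLU activation (no learned weight matrix), so it illustrates only the degree-normalized neighborhood aggregation:

```python
# Minimal one-round sketch of degree-normalized GCN message passing between
# target user nodes and interactive instruction nodes.
import math

def gcn_step(features, edges):
    """features: {node: [floats]}; edges: list of (user_node, instr_node)."""
    degree = {n: 0 for n in features}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    out = {}
    for node, feat in features.items():
        acc = [0.0] * len(feat)
        for u, v in edges:
            if node in (u, v):
                nbr = v if u == node else u
                norm = 1.0 / math.sqrt(degree[node] * degree[nbr])
                for i, x in enumerate(features[nbr]):
                    acc[i] += norm * x      # neighborhood aggregation
        out[node] = [max(0.0, x) for x in acc]  # nonlinear activation (ReLU)
    return out

feats = {"user": [0.0, 0.0], "navigate": [1.0, 0.0], "music": [0.0, 2.0]}
edges = [("user", "navigate"), ("user", "music")]
print(gcn_step(feats, edges)["user"])  # ≈ [0.707, 1.414]
```

After one round, the user node's vector is a degree-normalized mix of its instruction neighbors, which is the "local neighborhood information" the text refers to.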
Fig. 4 shows a block diagram of a cloud computing-based vehicle-mounted intelligent voice interaction system.
The second aspect of the present invention further provides a cloud computing-based vehicle-mounted intelligent voice interaction system 4, which includes: the memory 41 and the processor 42, where the memory includes a cloud-computing-based vehicle-mounted intelligent voice interaction method program, and when executed by the processor, the cloud-computing-based vehicle-mounted intelligent voice interaction method program implements the following steps:
acquiring attention information of a driving user in a driving process in a vehicle-mounted environment, analyzing state information of the driving user according to the attention information and the vehicle-mounted environment information, and generating a driving scene in the current driving process of a vehicle;
acquiring interactive voice information in a vehicle-mounted environment, acquiring position and identity information of a target user through the interactive voice information, judging whether the target user is a driving user, matching an interactive instruction set corresponding to the target user through a judgment result, and initializing an instruction hierarchical diagram;
the cloud carries out semantic recognition on interactive voice information based on machine learning, searches in an instruction hierarchical graph to generate an interactive instruction, generates comprehensive constraints according to the driving scene, and corrects the interactive instruction through the comprehensive constraints;
feedback information of the target user on the interactive instruction is obtained, instruction habit information of the target user is analyzed through matching of voiceprint information and the feedback information of the target user, and correction of the interactive instruction is compensated based on the instruction habit information.
It should be noted that the method includes acquiring facial frame image data of the driving user through an in-vehicle camera, preprocessing the facial frame image data, and extracting key frames from it. Facial feature points of the driving user are extracted from the key frames, and face orientation information, eye-closure degree and gaze direction are obtained. Corresponding threshold intervals are set for the extracted data and attention levels are matched to those intervals; the acquired face orientation, eye-closure degree and gaze direction are compared with the preset thresholds to read the attention information of the driving user and judge the current attention level. Weight information is set according to the road condition of the current driving section and is used to adjust the attention threshold; for example, when the current section is congested, the corresponding attention threshold is lowered so that the driving user stays more focused while driving. The attention information of the driving user is evaluated against the attention threshold at the current moment, the vehicle-mounted environment information is acquired, and the state information of the driving user is analyzed by matching it with the attention evaluation result, where the vehicle-mounted environment information includes the number of occupants, the in-vehicle sound, the in-vehicle temperature and the in-vehicle air quality. When the state information indicates that the driving user is fatigued, voice information is generated to remind the driver, a decision is made from the vehicle-mounted environment information to generate a suggestion for changing the vehicle-mounted environment, and the driver's voice feedback is acquired to execute the corresponding instruction; for example, when the driver is slightly fatigued, the system asks through voice interaction whether to open a window, play music or lower the air-conditioner temperature. In addition, the driving scene of the current trip is generated from the state information of the driving user, the vehicle-mounted environment information and the vehicle driving information.
According to the embodiment of the invention, the interactive voice information in the vehicle-mounted environment is acquired, and the position and identity information of the target user is acquired through the interactive voice information, which specifically comprises the following steps:
acquiring interactive voice information in the vehicle-mounted environment according to a voice receiving module in the vehicle-mounted environment, carrying out filtering and denoising on the interactive voice information, and dividing the vehicle-mounted environment into sub-areas with preset quantity;
acquiring sound energy information and arrival time difference of the received interactive voice information in each subarea, and judging a source subarea of the interactive voice information according to the sound energy information and the arrival time difference;
determining the position of the interactive voice information and then performing voiceprint recognition, retrieving identity information at the cloud through big data according to the voiceprint recognition result, and calculating the similarity between the voiceprint corresponding to the interactive voice information and the cloud-stored data;
acquiring data with the similarity meeting a preset similarity standard, extracting corresponding identity information as the identity information of a target user, reading the voice habit characteristics matched with the stored voice habit characteristics through the identity information, and if the cloud storage data do not meet the preset similarity standard, creating a voiceprint sequence and storing the voiceprint sequence in the cloud;
and matching an interactive instruction set corresponding to the functional information according to the position information of the target user, and initializing an instruction hierarchical graph based on the identity information through the interactive instruction set.
It should be noted that the interactive instruction set corresponding to the function information is matched according to the position information of the target user, and the instruction hierarchical graph is initialized from the interactive instruction set based on the identity information, specifically: the interactive instructions are classified based on position information within the vehicle-mounted environment, keyword information corresponding to each interactive instruction is obtained through big-data retrieval, and an interactive instruction knowledge graph is constructed from the interactive instructions, the keyword information and the classification results. A corresponding interactive instruction set is extracted from the interactive instruction knowledge graph according to the source sub-region of the target user's interactive voice information. Similar historical driving scenes are retrieved according to the current driving scene, the usage frequency of each interactive instruction in those similar scenes is extracted, and the instructions in the interactive instruction set are prioritized by usage frequency. If the identity information of the target user is already stored in the cloud, a user portrait is constructed from the historical interactive instructions corresponding to that identity, the priorities are adjusted through the user portrait, and the instruction hierarchical graph is generated from the interactive instruction set according to the adjusted priorities.
According to the embodiment of the invention, the cloud carries out semantic recognition on interactive voice information based on machine learning, searches in an instruction hierarchical graph to generate an interactive instruction, generates comprehensive constraints according to the driving scene, and corrects the interactive instruction through the comprehensive constraints, specifically:
preprocessing interactive voice information, extracting Word vectors from the preprocessed interactive voice information through a Word2vec model, performing weighted average construction on sentence vector expression according to the Word vectors, and taking the Word vectors and the sentence vector expression as semantic features;
establishing a key information extraction model based on a bidirectional long-short term memory neural network model, inputting semantic features into the key information extraction model, and configuring differentiated weights by combining an attention mechanism and context to obtain key information in interactive voice information;
classifying by using the key information, labeling category labels, retrieving the corresponding instructions of the key information in the initialized instruction hierarchical diagram, and determining the intention of the target user;
when the retrieval path in the instruction hierarchical graph corresponds to a non-unique instruction, setting question-back voice information according to retrieval contents, updating intentions according to feedback of a target user, and matching corresponding interactive instructions according to the updated intentions;
and setting comprehensive constraints based on the current driving situation, judging whether the matched interactive instruction accords with the range of the comprehensive constraints, and if not, modifying the interactive instruction and inquiring feedback information of a target user through voice.
It should be noted that after an interactive instruction is received, the state information of the driving user at the current timestamp is obtained, and a temporary attention monitoring task is created based on it. The gaze-point frequency of the driving user in the driving scene of the current timestamp is acquired to obtain a watching hotspot region; the gaze point at each timestamp of the monitoring task is acquired from the driving user's gaze direction, and the watching duration of each gaze point is recorded. Whether the gaze point at each timestamp falls inside the watching hotspot region is judged; if the duration for which the gaze point stays outside the hotspot region exceeds a preset threshold, whether to suspend the voice interaction is judged according to the type of the interactive instruction, and a voice prompt is generated. After the voice interaction is suspended, when the driving user's gaze point is detected to return to the watching hotspot region, the voice interaction scene is restored and the operation corresponding to the instruction is carried out according to the historical interactive instruction, where the target user's gaze point can be detected by eye-tracking hardware or by a projection space method.
The context features of the interactive voice information are integrated through the bidirectional long short-term memory (BiLSTM) neural network model, ensuring the semantic integrity of the interactive voice information. A data set is constructed by analyzing interactive instruction keywords through big data and is divided into a training set and a validation set; the training set is represented as word vectors and fed into the BiLSTM network combined with an attention mechanism for training, and the key information in the interactive voice information is extracted by the trained model.
In addition, comprehensive constraints are set based on the current driving scene: risk factors in the current scene are analyzed, and constraint information is attached to some interactive instructions accordingly. When an interactive instruction corresponding to the target user's interactive voice information falls outside the preset constraint range, a voice prompt is generated and the corresponding instruction is executed according to the target user's feedback; for example, when a child is detected in the vehicle from the vehicle-mounted environment information, the opening height of the window beside the child is constrained.
After the interactive instruction is executed, feedback information of the target user on the interactive instruction is acquired; a supplementary data set is built for each interactive instruction from the feedback information and attached as a tag to the target user's voiceprint information. The instruction hierarchical graph is supplemented and corrected based on the supplementary data set of each interactive instruction; a graph structure representing the target user's habits with respect to the interactive instructions is extracted from the corrected instruction hierarchical graph; and a graph convolutional neural network is trained on the extracted graph structure to obtain the instruction habit information of the target user. Message propagation is performed between target user nodes and interactive instruction nodes that are connected in the graph structure; the propagation process consists of feature transformation, neighborhood aggregation and nonlinear activation, so that a target user node carrying its own attribute features also encodes local neighborhood information, expressed in vector form, specifically:
h_u^{(k)} = σ( Σ_{v ∈ N(u)} (1 / √(d_v · d_u)) · W^{(k)} · h_v^{(k-1)} )
wherein h_u^{(k)} represents the feature-vector representation of the target user node u after the k-th convolution, N(u) represents the set of interactive-instruction neighbor nodes of u, d_v and d_u represent the degrees of the historical interactive-instruction node v and of the target user node u respectively, W^{(k)} and σ are the feature transformation and nonlinear activation of the k-th convolution, and h_v^{(k-1)} represents the feature-vector representation of the interactive-instruction node v after the (k-1)-th convolution;
The instruction habit information is combined with the corresponding vehicle-mounted environment to construct a personalized database for the target user; learning on the personalized data compensates the correction precision of the interactive instruction, so that an interactive instruction achieves the effect the target user expects in one pass. A cloud storage time threshold is preset, and when the time for which the personalized database corresponding to the target user's voiceprint information has not been called exceeds this threshold, the personalized database of that target user is deleted.
The third aspect of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium includes a cloud-computing-based vehicle-mounted intelligent voice interaction method program, and when the cloud-computing-based vehicle-mounted intelligent voice interaction method program is executed by a processor, the steps of the cloud-computing-based vehicle-mounted intelligent voice interaction method are implemented.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; can be located in one place or distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium includes: a mobile storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Alternatively, the integrated unit of the present invention may be stored in a computer-readable storage medium if it is implemented in the form of a software functional module and sold or used as a separate product. Based on such understanding, the technical solutions of the embodiments of the present invention or portions thereof contributing to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic or optical disk, or various other media that can store program code.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A vehicle-mounted intelligent voice interaction method based on cloud computing is characterized by comprising the following steps:
acquiring attention information of a driving user in a driving process in a vehicle-mounted environment, analyzing state information of the driving user according to the attention information and the vehicle-mounted environment information, and generating a driving scene in the current driving process of a vehicle;
acquiring interactive voice information in a vehicle-mounted environment, acquiring position and identity information of a target user through the interactive voice information, judging whether the target user is a driving user, matching an interactive instruction set corresponding to the target user through a judgment result, and initializing an instruction hierarchical diagram;
the cloud carries out semantic recognition on interactive voice information based on machine learning, searches in an instruction hierarchical graph to generate an interactive instruction, generates comprehensive constraints according to the driving scene, and corrects the interactive instruction through the comprehensive constraints;
feedback information of the target user to the interactive instruction is obtained, instruction habit information of the target user is analyzed through matching of voiceprint information and the feedback information of the target user, and correction of the interactive instruction is compensated based on the instruction habit information.
2. The cloud-computing-based vehicle-mounted intelligent voice interaction method according to claim 1, characterized in that analyzing the state information of the driving user according to the attention information and the vehicle-mounted environment information, and generating the driving scene for the current driving process of the vehicle, specifically comprises:
acquiring facial frame image data of the driving user through an in-vehicle camera, preprocessing the facial frame image data, and extracting key frames from the facial frame image data;
extracting facial feature points of the driving user from the key frames, and obtaining face orientation information, eye-closure degree, and gaze direction from the facial feature points;
comparing the obtained face orientation information, eye-closure degree, and gaze direction against preset thresholds to read the attention information of the driving user; setting weight information according to road condition information of the current road section, and adjusting the attention threshold with the weight information;
evaluating the attention information of the driving user against the attention threshold at the current moment, and analyzing the state information of the driving user by matching the vehicle-mounted environment information with the attention evaluation result;
when the state information indicates that the driving user is fatigued, generating voice information to alert the driving user, making a decision according to the vehicle-mounted environment information to generate a vehicle-mounted environment adjustment suggestion, and acquiring the driving user's voice feedback to execute the corresponding instruction;
generating the driving scene for the current driving process of the vehicle from the state information of the driving user, the vehicle-mounted environment information, and the vehicle driving information.
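A minimal sketch of the eye-closure evaluation described in claim 2, assuming dlib-style six-point eye landmarks and a PERCLOS-style closed-frame fraction. The threshold values and the road-condition weighting scheme are illustrative assumptions; the patent does not specify them:

```python
import numpy as np

def eye_closure_ratio(eye_landmarks):
    """Eye aspect ratio from six (x, y) landmarks in dlib's ordering
    (assumed layout): small values indicate a nearly closed eye."""
    p = np.asarray(eye_landmarks, dtype=float)
    vertical = np.linalg.norm(p[1] - p[5]) + np.linalg.norm(p[2] - p[4])
    horizontal = 2.0 * np.linalg.norm(p[0] - p[3])
    return vertical / horizontal

def evaluate_attention(closure_ratios, base_threshold=0.2, road_weight=1.0):
    """PERCLOS-style check: fraction of frames counted as eyes-closed,
    compared with a fatigue limit scaled by a road-condition weight
    (larger weight on complex road sections -> stricter limit)."""
    closed_fraction = sum(r < base_threshold for r in closure_ratios) / len(closure_ratios)
    fatigue_limit = 0.3 / road_weight  # illustrative value
    return "fatigued" if closed_fraction > fatigue_limit else "alert"
```

A "fatigued" result would then trigger the voice alert and environment-adjustment suggestion described above.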
3. The cloud-computing-based vehicle-mounted intelligent voice interaction method according to claim 1, characterized in that acquiring the interactive voice information in the vehicle-mounted environment, and obtaining the position and identity information of the target user from the interactive voice information, specifically comprises:
acquiring the interactive voice information through a voice receiving module in the vehicle-mounted environment, filtering and denoising the interactive voice information, and dividing the vehicle-mounted environment into a preset number of sub-regions;
acquiring the sound energy information and the arrival time difference of the received interactive voice information in each sub-region, and determining the source sub-region of the interactive voice information from the sound energy information and the arrival time difference;
after determining the position of the interactive voice information, performing voiceprint recognition, retrieving identity information at the cloud through big data according to the voiceprint recognition result, and computing the similarity between the voiceprint of the interactive voice information and cloud-stored data;
obtaining the data whose similarity meets a preset similarity standard and extracting the corresponding identity information as the identity information of the target user; retrieving the stored voice habit features matched to the identity information; if no cloud-stored data meets the preset similarity standard, creating a new voiceprint sequence and storing it at the cloud;
matching, according to the position information of the target user, an interactive instruction set corresponding to the available functions, and initializing the instruction hierarchy graph based on the identity information through the interactive instruction set.
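The sub-region decision and the cloud voiceprint matching of claim 3 can be sketched as follows. Ranking sub-regions by sound energy with arrival time as a tie-breaker, and matching voiceprints by cosine similarity against stored embeddings, are simplifying assumptions; the patent names the cues but not the exact formulas:

```python
import numpy as np

def locate_subregion(energies, arrival_times):
    """energies[i] and arrival_times[i] are measurements for sub-region i.
    Pick the sub-region with the highest energy; earlier arrival wins ties."""
    return max(range(len(energies)),
               key=lambda i: (energies[i], -arrival_times[i]))

def match_identity(query_embedding, stored, threshold=0.8):
    """Cosine similarity of the query voiceprint against cloud-stored
    embeddings (dict of user_id -> vector, an assumed format). Returns the
    best-matching id above the similarity standard, or None, in which case
    a new voiceprint sequence would be created and stored."""
    q = np.asarray(query_embedding, dtype=float)
    q = q / np.linalg.norm(q)
    best_id, best_sim = None, threshold
    for user_id, emb in stored.items():
        e = np.asarray(emb, dtype=float)
        sim = float(q @ (e / np.linalg.norm(e)))
        if sim >= best_sim:
            best_id, best_sim = user_id, sim
    return best_id
```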
4. The cloud-computing-based vehicle-mounted intelligent voice interaction method according to claim 1, characterized in that the cloud performing semantic recognition on the interactive voice information based on machine learning, searching the instruction hierarchy graph to generate the interactive instruction, generating the comprehensive constraint according to the driving scene, and correcting the interactive instruction through the comprehensive constraint, specifically comprises:
preprocessing the interactive voice information, extracting word vectors from the preprocessed interactive voice information through a Word2vec model, constructing sentence vector representations as weighted averages of the word vectors, and taking the word vectors and the sentence vector representations as semantic features;
building a key information extraction model based on a bidirectional long short-term memory (BiLSTM) neural network, inputting the semantic features into the key information extraction model, and configuring differentiated weights by combining an attention mechanism with context to obtain the key information in the interactive voice information;
classifying the key information, attaching category labels, retrieving the instructions corresponding to the key information in the initialized instruction hierarchy graph, and determining the intent of the target user;
when the instruction corresponding to a retrieval path in the instruction hierarchy graph is not unique, generating follow-up query voice information according to the retrieved content, updating the intent according to the target user's feedback, and matching the corresponding interactive instruction through the updated intent;
setting the comprehensive constraint based on the current driving scene, and judging whether the matched interactive instruction falls within the range of the comprehensive constraint; if not, modifying the interactive instruction and querying the target user for feedback by voice.
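The sentence-vector construction and the hierarchy-graph retrieval of claim 4 can be sketched as below. Representing the instruction hierarchy graph as a nested dict, and using uniform (or TF-IDF-style) averaging weights, are assumptions for illustration only:

```python
import numpy as np

def sentence_vector(word_vectors, weights=None):
    """Sentence representation as the weighted average of Word2vec word
    vectors, per the claim; the weights (e.g. TF-IDF) are hypothetical
    and default to uniform."""
    vecs = np.asarray(word_vectors, dtype=float)
    w = np.ones(len(vecs)) if weights is None else np.asarray(weights, dtype=float)
    return (w[:, None] * vecs).sum(axis=0) / w.sum()

def retrieve_instruction(hierarchy, labels):
    """Walk the instruction hierarchy graph (nested dict here) along the
    predicted category labels. A non-dict leaf is a concrete instruction;
    stopping at a dict means the instruction is not unique, so the
    candidates are returned for a follow-up voice query to the user."""
    node = hierarchy
    for label in labels:
        node = node[label]
    if isinstance(node, dict):
        return None, list(node)  # ambiguous path -> ask back
    return node, []
```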
5. The cloud-computing-based vehicle-mounted intelligent voice interaction method according to claim 1, further comprising monitoring the attention information of the driving user during voice interaction, specifically:
after an interactive instruction is received, acquiring the state information of the driving user at the current timestamp, and creating a temporary attention monitoring task based on that state information;
acquiring the gaze-point frequency of the driving user in the driving scene at the current timestamp to obtain a gaze hotspot region; obtaining the gaze point at each timestamp of the attention monitoring task from the driving user's gaze direction, and recording the dwell duration of each gaze point;
judging whether the gaze point at each timestamp of the attention monitoring task falls within the gaze hotspot region; if the dwell duration outside the gaze hotspot region exceeds a preset threshold, deciding whether to suspend the voice interaction according to the type of the interactive instruction, and generating a voice prompt;
after the voice interaction is suspended, resuming the voice interaction scene when the driving user's gaze point is detected to return to the gaze hotspot region, and executing the corresponding operation according to the historical interactive instruction.
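The gaze-dwell check of claim 5 that decides whether to suspend the interaction can be sketched as follows. The rectangular hotspot region, the sample format, and the dwell limit are illustrative assumptions:

```python
def should_pause(gaze_samples, hotspot, dwell_limit=1.5):
    """gaze_samples: list of (timestamp_s, x, y) gaze points, in time order.
    hotspot: (x0, y0, x1, y1) axis-aligned gaze hotspot region.
    Returns True when the gaze has stayed outside the hotspot region for
    at least dwell_limit seconds (value is a placeholder threshold)."""
    outside_since = None
    for t, x, y in gaze_samples:
        inside = hotspot[0] <= x <= hotspot[2] and hotspot[1] <= y <= hotspot[3]
        if inside:
            outside_since = None          # gaze returned; reset the timer
        elif outside_since is None:
            outside_since = t             # gaze just left the hotspot
        elif t - outside_since >= dwell_limit:
            return True                   # dwell outside exceeded the threshold
    return False
```

In the claimed flow, a True result would be weighed against the interaction-instruction type before actually suspending and issuing the voice prompt.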
6. The cloud-computing-based vehicle-mounted intelligent voice interaction method according to claim 1, characterized in that acquiring the feedback information of the target user on the interactive instruction, analyzing the instruction habit information of the target user by matching the voiceprint information with the feedback information of the target user, and compensating the correction of the interactive instruction based on the instruction habit information, specifically comprises:
after the interactive instruction is executed, acquiring the feedback information of the target user on the interactive instruction, building a supplementary data set for each interactive instruction from the feedback information, and labeling the supplementary data set with the target user's voiceprint information;
supplementing and correcting the instruction hierarchy graph based on the supplementary data set of each interactive instruction, extracting the graph structure from the corrected instruction hierarchy graph, and training a graph convolutional neural network on the extracted graph structure to obtain the instruction habit information of the target user;
constructing a personalized database for the target user by combining the instruction habit information with the corresponding vehicle-mounted environment, learning from the personalized data, and compensating the correction precision of the interactive instruction, so that the interactive instruction achieves the effect expected by the target user in a single attempt;
presetting a cloud storage time threshold, and deleting the personalized database of the target user when the idle time of the personalized database corresponding to the target user's voiceprint information exceeds the preset storage time threshold.
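The storage-time-threshold deletion at the end of claim 6 amounts to an idle-time purge. A minimal sketch, assuming the personalized databases are tracked as a mapping from voiceprint id to last-access time (that structure is an assumption, not from the patent):

```python
import time

def purge_stale_profiles(profiles, ttl_seconds, now=None):
    """profiles: {voiceprint_id: last_access_epoch_seconds}.
    Deletes every personalized database whose idle time exceeds the preset
    cloud storage threshold; returns the ids that were removed."""
    now = time.time() if now is None else now
    stale = [vid for vid, last in profiles.items() if now - last > ttl_seconds]
    for vid in stale:
        del profiles[vid]
    return stale
```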
7. A vehicle-mounted intelligent voice interaction system based on cloud computing, characterized in that the system comprises a memory and a processor, the memory storing a program for the cloud-computing-based vehicle-mounted intelligent voice interaction method; when the program is executed by the processor, the following steps are implemented:
acquiring attention information of a driving user during driving in a vehicle-mounted environment, analyzing state information of the driving user according to the attention information and vehicle-mounted environment information, and generating a driving scene for the current driving process of the vehicle;
acquiring interactive voice information in the vehicle-mounted environment, obtaining position and identity information of a target user from the interactive voice information, judging whether the target user is the driving user, matching an interactive instruction set corresponding to the target user according to the judgment result, and initializing an instruction hierarchy graph;
performing, at the cloud, semantic recognition on the interactive voice information based on machine learning, searching the instruction hierarchy graph to generate an interactive instruction, generating a comprehensive constraint according to the driving scene, and correcting the interactive instruction through the comprehensive constraint;
acquiring feedback information of the target user on the interactive instruction, analyzing instruction habit information of the target user by matching the target user's voiceprint information with the feedback information, and compensating the correction of the interactive instruction based on the instruction habit information.
8. The cloud-computing-based vehicle-mounted intelligent voice interaction system according to claim 7, characterized in that acquiring the interactive voice information in the vehicle-mounted environment, and obtaining the position and identity information of the target user from the interactive voice information, specifically comprises:
acquiring the interactive voice information through a voice receiving module in the vehicle-mounted environment, filtering and denoising the interactive voice information, and dividing the vehicle-mounted environment into a preset number of sub-regions;
acquiring the sound energy information and the arrival time difference of the received interactive voice information in each sub-region, and determining the source sub-region of the interactive voice information from the sound energy information and the arrival time difference;
after determining the position of the interactive voice information, performing voiceprint recognition, retrieving identity information at the cloud through big data according to the voiceprint recognition result, and computing the similarity between the voiceprint of the interactive voice information and cloud-stored data;
obtaining the data whose similarity meets a preset similarity standard and extracting the corresponding identity information as the identity information of the target user; retrieving the stored voice habit features matched to the identity information; if no cloud-stored data meets the preset similarity standard, creating a new voiceprint sequence and storing it at the cloud;
matching, according to the position information of the target user, an interactive instruction set corresponding to the available functions, and initializing the instruction hierarchy graph based on the identity information through the interactive instruction set.
9. The cloud-computing-based vehicle-mounted intelligent voice interaction system according to claim 7, characterized in that the cloud performing semantic recognition on the interactive voice information based on machine learning, searching the instruction hierarchy graph to generate the interactive instruction, generating the comprehensive constraint according to the driving scene, and correcting the interactive instruction through the comprehensive constraint, specifically comprises:
preprocessing the interactive voice information, extracting word vectors from the preprocessed interactive voice information through a Word2vec model, constructing sentence vector representations as weighted averages of the word vectors, and taking the word vectors and the sentence vector representations as semantic features;
building a key information extraction model based on a bidirectional long short-term memory (BiLSTM) neural network, inputting the semantic features into the key information extraction model, and configuring differentiated weights by combining an attention mechanism with context to obtain the key information in the interactive voice information;
classifying the key information, attaching category labels, retrieving the instructions corresponding to the key information in the initialized instruction hierarchy graph, and determining the intent of the target user;
when the instruction corresponding to a retrieval path in the instruction hierarchy graph is not unique, generating follow-up query voice information according to the retrieved content, updating the intent according to the target user's feedback, and matching the corresponding interactive instruction through the updated intent;
setting the comprehensive constraint based on the current driving scene, and judging whether the matched interactive instruction falls within the range of the comprehensive constraint; if not, modifying the interactive instruction and querying the target user for feedback by voice.
10. The cloud-computing-based vehicle-mounted intelligent voice interaction system according to claim 7, characterized in that acquiring the feedback information of the target user on the interactive instruction, analyzing the instruction habit information of the target user by matching the voiceprint information with the feedback information of the target user, and compensating the correction of the interactive instruction based on the instruction habit information, specifically comprises:
after the interactive instruction is executed, acquiring the feedback information of the target user on the interactive instruction, building a supplementary data set for each interactive instruction from the feedback information, and labeling the supplementary data set with the target user's voiceprint information;
supplementing and correcting the instruction hierarchy graph based on the supplementary data set of each interactive instruction, extracting the graph structure from the corrected instruction hierarchy graph, and training a graph convolutional neural network on the extracted graph structure to obtain the instruction habit information of the target user;
constructing a personalized database for the target user by combining the instruction habit information with the corresponding vehicle-mounted environment, learning from the personalized data, and compensating the correction precision of the interactive instruction, so that the interactive instruction achieves the effect expected by the target user in a single attempt;
presetting a cloud storage time threshold, and deleting the personalized database of the target user when the idle time of the personalized database corresponding to the target user's voiceprint information exceeds the preset storage time threshold.
CN202211395643.1A 2022-11-09 2022-11-09 Vehicle-mounted intelligent voice interaction method and system based on cloud computing Active CN115440221B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211395643.1A CN115440221B (en) 2022-11-09 2022-11-09 Vehicle-mounted intelligent voice interaction method and system based on cloud computing


Publications (2)

Publication Number | Publication Date
CN115440221A (en) | 2022-12-06
CN115440221B (en) | 2023-03-24

Family

ID=84252910



Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118248135A (en) * 2022-12-23 2024-06-25 北京罗克维尔斯科技有限公司 Voice interaction method and device of intelligent equipment and vehicle
CN116741175B (en) * 2023-08-14 2023-11-03 深圳市实信达科技开发有限公司 Block chain-based intelligent data transmission supervision system and method
CN117115788B (en) * 2023-10-19 2024-01-02 天津所托瑞安汽车科技有限公司 Intelligent interaction method for vehicle, back-end server and front-end equipment

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9085303B2 (en) * 2012-11-15 2015-07-21 Sri International Vehicle personal assistant
CN108896061A (en) * 2018-05-11 2018-11-27 京东方科技集团股份有限公司 A kind of man-machine interaction method and onboard navigation system based on onboard navigation system
CN110019740B (en) * 2018-05-23 2021-10-01 京东方科技集团股份有限公司 Interaction method of vehicle-mounted terminal, server and storage medium
CN110874202B (en) * 2018-08-29 2024-04-19 斑马智行网络(香港)有限公司 Interaction method, device, medium and operating system
CN111653277A (en) * 2020-06-10 2020-09-11 北京百度网讯科技有限公司 Vehicle voice control method, device, equipment, vehicle and storage medium
US11897331B2 (en) * 2021-01-14 2024-02-13 Baidu Usa Llc In-vehicle acoustic monitoring system for driver and passenger
CN115294976A (en) * 2022-06-23 2022-11-04 中国第一汽车股份有限公司 Error correction interaction method and system based on vehicle-mounted voice scene and vehicle thereof
CN115273797A (en) * 2022-06-28 2022-11-01 智己汽车科技有限公司 Sound-based automobile interaction method and system and storage medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant