CN117115788B - Intelligent interaction method for vehicle, back-end server and front-end equipment - Google Patents

Intelligent interaction method for vehicle, back-end server and front-end equipment

Info

Publication number
CN117115788B
CN117115788B · Application CN202311352746.4A
Authority
CN
China
Prior art keywords
information
driver
word
state
vehicle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311352746.4A
Other languages
Chinese (zh)
Other versions
CN117115788A (en)
Inventor
徐显杰
赵伟亭
刘�东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suoto Hangzhou Automotive Intelligent Equipment Co Ltd
Tianjin Soterea Automotive Technology Co Ltd
Original Assignee
Suoto Hangzhou Automotive Intelligent Equipment Co Ltd
Tianjin Soterea Automotive Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suoto Hangzhou Automotive Intelligent Equipment Co Ltd, Tianjin Soterea Automotive Technology Co Ltd filed Critical Suoto Hangzhou Automotive Intelligent Equipment Co Ltd
Priority to CN202311352746.4A priority Critical patent/CN117115788B/en
Publication of CN117115788A publication Critical patent/CN117115788A/en
Application granted granted Critical
Publication of CN117115788B publication Critical patent/CN117115788B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/59Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G06V20/597Recognising the driver's state or behaviour, e.g. attention or drowsiness
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W40/00Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models
    • B60W40/08Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models related to drivers or passengers
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W50/00Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • B60W50/08Interaction between the driver and the control system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2457Query processing with adaptation to user needs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/216Parsing using statistical methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/237Lexical tools
    • G06F40/247Thesauruses; Synonyms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/04Inference or reasoning models
    • G06N5/041Abduction
    • GPHYSICS
    • G07CHECKING-DEVICES
    • G07CTIME OR ATTENDANCE REGISTERS; REGISTERING OR INDICATING THE WORKING OF MACHINES; GENERATING RANDOM NUMBERS; VOTING OR LOTTERY APPARATUS; ARRANGEMENTS, SYSTEMS OR APPARATUS FOR CHECKING NOT PROVIDED FOR ELSEWHERE
    • G07C5/00Registering or indicating the working of vehicles
    • G07C5/08Registering or indicating performance data other than driving, working, idle, or waiting time, with or without registering driving, working, idle or waiting time
    • G07C5/0808Diagnosing performance data
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W40/00Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models
    • B60W40/08Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models related to drivers or passengers
    • B60W2040/089Driver voice
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2540/00Input parameters relating to occupants
    • B60W2540/21Voice

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Automation & Control Theory (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Mathematical Physics (AREA)
  • Mechanical Engineering (AREA)
  • Transportation (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Human Computer Interaction (AREA)
  • Probability & Statistics with Applications (AREA)
  • Multimedia (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention discloses an intelligent interaction method for a vehicle, together with a back-end server and front-end equipment. The method comprises the following steps: collecting at least one of a driver question and the driver's driving state as trigger information; collecting at least one of the driving environment, the driver's personal information and vehicle information as additional information; combining the trigger information and the additional information to obtain state information describing the driver's current state; analyzing the driver's current demand from the state information and generating output information that meets that demand; and determining a target feedback mode based on the collected information and feeding the output information back to the driver in that mode, so as to achieve intelligent interaction with the driver and prevent a long-distance driver from lapsing into distraction, fatigue and similar conditions.

Description

Intelligent interaction method for vehicle, back-end server and front-end equipment
Technical Field
The embodiments of the invention relate to the technical field of driving assistance, and in particular to an intelligent interaction method for a vehicle, a back-end server and front-end equipment.
Background
During the long-distance driving of a commercial vehicle, after driving for a long time the driver inevitably lapses into distraction, fatigue and similar conditions because of monotony and boredom. These conditions raise the accident rate and can cause serious harm to individuals, families and society.
To reduce the accident rate, various means exist to help drivers learn the real-time condition of the vehicle and to warn them of danger. However, these warnings are only issued when danger is imminent, and suffer from problems such as lag and a harsh warning style, so the driver receives the information too late or disregards the warning, and accidents still occur.
Disclosure of Invention
The invention provides an intelligent interaction method for a vehicle, a back-end server and front-end equipment, which realize intelligent interaction with the driver and prevent a long-distance driver from lapsing into distraction, fatigue and similar conditions.
In a first aspect, an embodiment of the present invention provides an intelligent interaction method for a vehicle, where the method includes:
collecting at least one of a driver problem and a driver driving state as trigger information;
collecting at least one of driving environment, personal information of a driver and vehicle information as additional information;
combining the trigger information and the additional information to obtain state information describing the current state of the driver;
analyzing the current demand of the driver according to the state information, and generating output information meeting the current demand of the driver;
and determining a target feedback mode based on the acquired information, and feeding back the output information to the driver in the target feedback mode so as to realize intelligent interaction with the driver.
Further, the analyzing of the driver's current demand according to the state information and generating output information meeting the driver's current demand includes:
extracting word content features and word position features of the state information, and performing semantic analysis of a set depth on the state information based on the word content features and the word position features;
and outputting output information responding to the driver based on the semantic analysis result and pre-learned knowledge, wherein the pre-learned knowledge comprises daily drivers' interaction data during long-distance driving and the knowledge involved in such interaction.
Further, the performing of semantic analysis of a set depth based on the word content features and the word position features, and the outputting of output information responding to the driver based on the semantic analysis result and pre-learned knowledge (the pre-learned knowledge comprising daily drivers' interaction data during long-distance driving and the knowledge involved in such interaction), include the following steps:
inputting the word content features and the word position features into a pre-trained intelligent interaction network, and outputting the word vector probability distributions of the candidate replies arranged in sequence;
determining the word vectors of the output information according to the word vector probability distributions of the candidate replies;
synthesizing the word vectors of the output information and mapping them to an output information text, and taking the output information text as the output information responding to the driver;
wherein the intelligent interaction network is obtained by training a large language model with daily drivers' interaction data during long-distance driving and the knowledge involved in such interaction, and the number of network layers of the intelligent interaction network is smaller than a set layer-number threshold.
Further, the extracting of the word content features of the state information includes:
if the state information is a vector formed by combining the word vector matrix of each word in the trigger information with the word vector matrix of each word in the additional information, taking the vector of the state information as the word content features;
if the state information is a text obtained by combining the text of the trigger information with the text of the additional information, performing word segmentation on the text of the state information, converting the segmented words into vectors to generate a word vector matrix, and taking the generated word vector matrix as the word content features;
and the extracting of the word position features of the state information includes:
calculating the absolute position and the relative position of each word in the trigger information and the additional information, encoding the absolute position and the relative position to generate the word's position vector matrix, and taking the generated position vector matrix as the word's position feature.
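Purely as an illustrative sketch of the position feature described above, the following Python combines a sinusoidal encoding of each word's absolute index with its relative offset from an anchor word; the dimension, the sinusoidal form and the anchor choice are assumptions, not details taken from the patent:

```python
import math

def position_features(num_words, d=8, anchor=0):
    """For each word position, build a vector holding a sinusoidal
    encoding of the absolute index plus the signed relative offset
    from an anchor word (assumed here to be word 0)."""
    feats = []
    for pos in range(num_words):
        vec = []
        for i in range(d // 2):
            freq = 1.0 / (10000 ** (2 * i / d))
            vec.append(math.sin(pos * freq))  # absolute position
            vec.append(math.cos(pos * freq))
        vec.append(float(pos - anchor))       # relative position
        feats.append(vec)
    return feats
```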
Further, if the state information is a vector formed by combining the word vector matrix of each word in the trigger information with the word vector matrix of each word in the additional information, before taking the vector of the state information as the word content features, the method further includes:
performing structuring processing on the additional information;
vectorizing the structured additional information and the trigger information respectively;
and combining the additional information and the trigger information in vector form to obtain the vector of the state information describing the driver's current state.
Further, the intelligent interaction network has a Decoder (decoding-layer) structure; the decoding layers adopt a GQA (Grouped-Query Attention) mechanism, the number of branches set for the Key and Value in the attention mechanism is smaller than a set branch threshold, and normalization in the decoding layers uses RMS Norm (Root Mean Square Layer Normalization).
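For illustration only, a minimal NumPy sketch of the two components named in this claim is given below: RMS Norm, and grouped-query attention in which several query heads share each key/value head so that the Key and Value branches shrink. All shapes, and the omission of learned projection weights, are simplifying assumptions:

```python
import numpy as np

def rms_norm(x, eps=1e-6):
    """RMS Norm: scale by the root mean square, with no mean-centering."""
    return x / np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)

def grouped_query_attention(q, k, v, n_kv_heads):
    """Grouped-query attention: n_q_heads query heads share n_kv_heads
    key/value heads. Shapes: q (n_q_heads, seq, d); k, v (n_kv_heads, seq, d)."""
    n_q_heads, seq, d = q.shape
    group = n_q_heads // n_kv_heads
    out = np.empty_like(q)
    for h in range(n_q_heads):
        kh, vh = k[h // group], v[h // group]  # shared KV head
        scores = q[h] @ kh.T / np.sqrt(d)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)  # softmax
        out[h] = weights @ vh
    return out
```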
Further, after determining the target feedback mode based on the collected information and feeding back the output information to the driver in the target feedback mode, the method further includes:
obtaining the driver's satisfaction with the output information, and taking output information whose satisfaction is lower than a set satisfaction threshold, together with the associated state information, as a badcase negative sample;
correcting the negative sample, and amending the pre-learned knowledge content based on the corrected data;
or
identifying the driver's likes and dislikes from the driver's satisfaction with the output information, and adjusting the learned knowledge with respect to those likes and dislikes so as to match the driver's preferences.
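Purely illustratively, collecting badcase negative samples from low-satisfaction replies, as described above, might look like the following sketch; the record fields and the rating scale are assumptions:

```python
SATISFACTION_THRESHOLD = 3  # assumed: ratings on a 1-5 scale

def collect_badcases(interactions, threshold=SATISFACTION_THRESHOLD):
    """Keep low-satisfaction replies together with their associated
    state text as negative samples for later correction."""
    return [
        {"state_text": r["state_text"], "output": r["output"]}
        for r in interactions
        if r["satisfaction"] < threshold
    ]
```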
Further, before the analyzing of the driver's current demand according to the state information, the method further includes:
judging whether the length of the state information is greater than a set length threshold;
and if the length of the state information is greater than the set length threshold, filtering or replacing the state information.
In a second aspect, an embodiment of the present invention further provides an intelligent interaction backend server for a vehicle, including:
one or more processors;
storage means for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the intelligent interaction method for a vehicle as described in any of the embodiments of the present invention.
In a third aspect, an embodiment of the present invention further provides an intelligent interactive front-end device for a vehicle, where the device includes:
a vehicle-mounted microphone, responsible for receiving the driver's voice information;
a signal receiver, connected with the vehicle-mounted equipment and used for receiving vehicle information, the driver's driving state and/or the driver's driving environment information;
a signal transmitter, connected with the intelligent interaction back-end server for the vehicle and used for transmitting the received information to that server;
and a player, connected with the signal receiver and used for playing the received output information fed back to the driver.
According to the technical scheme provided by the embodiments of the invention, the driver's questions and/or the driver's driving state are collected as the trigger information for intelligent interaction, so that intelligent interaction with the driver takes place when the driver actively interacts or the driver's driving state is poor, preventing the driver from lapsing into distraction, fatigue and similar conditions.
By incorporating the additional information such as the driver's driving environment, the driver's personal information and the vehicle information, the driver is understood more accurately and can therefore be answered or informed more accurately, and by determining a target feedback mode based on the collected information, the poor experience of merely issuing harsh warnings to the driver is avoided.
Semantic analysis is performed on the complete text based on the word content features and the word position features, and the addition of the word position features improves the accuracy of the semantic analysis. Because the semantic analysis has a set depth, the depth can be chosen according to the computing power of the actual equipment. In a driving environment, the real-time requirement of intelligent interaction feedback must be balanced against the limited computing power of the equipment, and setting the semantic analysis depth appropriately satisfies this scenario.
The output information responding to the driver is produced from the semantic analysis result and pre-learned knowledge, where the pre-learned knowledge comprises daily drivers' interaction data during long-distance driving and the knowledge involved in such interaction, so the driver's daily interaction needs are met and the driver's experience is further improved.
The absolute position and the relative position of each word after word segmentation are calculated and encoded to generate the word's position vector matrix, which is taken as the word's position feature; because the absolute and relative positions are combined, the position of a word is described better.
For the application scenario of real-time in-vehicle feedback, the intelligent interaction network is built from a decoding-layer structure, a GQA attention mechanism and RMS Norm, where the number of decoding layers is smaller than a set layer-number threshold and the number of branches set for the Key and Value in the attention mechanism is smaller than a set branch-number threshold. The network is thus kept as small as possible while meeting the accuracy requirement: the designed model has a certain depth but is not overly complicated, which suits the practical commercial-vehicle scenario of danger reminders and companionable chat replies.
To better cater to the driver's preferences, after the output information is fed back in the target feedback mode, the driver's satisfaction with the output information is obtained; output information whose satisfaction is below a set satisfaction threshold, together with the associated state information, is treated as a badcase. The badcase is corrected, and the pre-learned knowledge content is amended or adjusted based on the corrected data, continuously optimizing the accumulated knowledge.
Drawings
Other features, objects and advantages of the present invention will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the accompanying drawings in which:
fig. 1 is a schematic flow chart of an intelligent interaction method for a vehicle according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of another intelligent interaction method for a vehicle according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of another intelligent interaction method for a vehicle according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of an intelligent interaction back-end server for a vehicle according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of an intelligent interactive front-end device for a vehicle according to an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, techniques, etc., in order to provide a thorough understanding of the embodiments of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
In the description of the present invention, "/" means "or" unless otherwise indicated; for example, A/B may mean A or B. "And/or" herein merely describes an association relationship between the associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist together, or B exists alone. Further, "at least one" and "a plurality" mean two or more. The terms "first", "second" and the like do not limit number or execution order, and objects termed "first" and "second" are not necessarily different.
In the embodiments of the present application, words such as "exemplary" or "such as" are used to mean serving as examples, illustrations, or descriptions. Any embodiment or design described herein as "exemplary" or "for example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion that may be readily understood.
Furthermore, references to the terms "comprising" and "having" and any variations thereof in the description of the present application are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or modules is not limited to only those steps or modules but may, alternatively, include other steps or modules not listed or inherent to such process, method, article, or apparatus.
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the following description will be made with reference to the accompanying drawings of the present invention by way of specific embodiments.
Fig. 1 is a schematic flow chart of an intelligent interaction method for a vehicle according to an embodiment of the present invention. The method is suitable for scenarios of intelligently accompanying a driver on a long-distance drive, and may be performed by a back-end server in conjunction with front-end interactive equipment. Referring to fig. 1, the intelligent interaction method for a vehicle provided by the embodiment of the invention includes:
s110, collecting at least one of the problem of the driver and the driving state of the driver as trigger information.
A driver question is a question output by the driver, and the output form may be spoken or entered as text.
The driver's driving state is information reflecting how the driver is driving, and may specifically be fatigue, boredom, smoking, inattention and the like.
Collection of the driver's question may be realized by any method in the prior art, for example through voice recognition or text collection.
The driver's driving state may likewise be obtained by any method in the prior art, in particular by collecting images of the driver while driving and performing image recognition.
The triggering information is information for triggering the execution of the intelligent interaction method for the vehicle according to the embodiment.
S120, collecting at least one of driving environment of a driver, personal information of the driver and vehicle information as additional information.
The driver's driving environment is external environment data while the driver is driving, and may specifically include the date, the day of the week, the time period, the season, the weather, the wind direction, road information, the current city and the next city to be approached.
The driver's personal information describes the driver's personal profile, and may specifically include age, sex, years of driving experience, hometown, dialect type, preferences, time already spent on the current journey, time since the last stop, and the like.
Vehicle information describes the current running state of the vehicle, the vehicle's historical running information and the vehicle's own attributes, and may specifically include: current speed, braking system state, blind-spot warnings, fuel level, tire state, historical mileage, historical maintenance records, vehicle part models and the like.
The additional information is information assisting in intelligent interaction.
S130, combining the trigger information and the additional information to obtain state information describing the current state of the driver.
The status information refers to information describing the current status of the driver, and the form of the information is not limited, and may be text or vector.
Specifically, combining the trigger information and the additional information may be: splicing the additional information directly before or after the trigger information.
If the state information is in text form, the semantics expressed by the same words at different positions in the text can differ greatly (for example, "San owes Si 3,000 yuan" versus "Si owes San 3,000 yuan"). So, to allow the driver's current state to be understood accurately from the state text, combining the trigger information and the additional information may be:
setting a text combination template of various information according to the grammar structure;
and combining the trigger information and the additional information according to the text combination template to obtain a state text that describes the driver's current state and conforms to the grammar structure.
Specifically, the text combination template may be: driver driving environment, driver driving state, driver question, vehicle information, driver personal information.
The driver's driving environment information may also have a specific environment description template, such as: on [date], [weather], on the [road name] road from [city name] to [next city name].
By way of example, the state text may be: Today is Friday, 9 August 2023, on the road from city A to city B, and the weather is sunny. The driver says: "What day is it today?" The current running state of the vehicle is: the current speed is 100 km/h, the braking system is normal, there is no blind-spot warning, the fuel level is above the warning threshold, the tire pressure is normal, and so on. The driver profile is: male, 36 years old, likes football, likes rock music, etc.
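The template combination described above can be sketched as follows; the English template wording, the field names and the formatting details are illustrative assumptions rather than the patent's actual templates:

```python
def build_state_text(env, state, question, vehicle, profile):
    """Combine the pieces in the claimed template order: driving
    environment, driver state, driver question, vehicle information,
    driver profile. Empty pieces are skipped."""
    parts = ["Today is {date}, {weather}, on the road from "
             "{origin} to {destination}.".format(**env)]
    if state:
        parts.append("The driver appears {s}.".format(s=state))
    if question:
        parts.append('The driver says: "{q}"'.format(q=question))
    parts.append("Vehicle status: {v}.".format(v=vehicle))
    parts.append("Driver profile: {p}.".format(p=profile))
    return " ".join(parts)
```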
S140, analyzing the current demand of the driver according to the state information, and generating output information meeting the current demand of the driver.
Alternatively, the identification and output of the demand may be achieved through a pre-statistical mapping.
Specifically, if the question matches "how is the weather" or "what is the weather", weather information is output.
If the driver is fatigued, pre-prepared refreshing audio is selected and played according to the driver's preferences.
And S150, determining a target feedback mode based on the acquired information, and feeding back the output information to the driver in the target feedback mode so as to realize intelligent interaction with the driver.
The collected information includes the preference of the driver, the dialect type of the driver, etc.
The feedback mode is determined according to the preference of the driver and/or the dialect type of the driver.
For example, if the question matches "how is the weather" or "what is the weather", the weather information is output in the driver's dialect type, using a description style the driver likes.
The description style may be a female voice, a male voice, playful and cute, knowledgeable, calm and steady, and the like.
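As a sketch only, selecting a target feedback mode from the driver's collected preferences and dialect type might look like this; the profile keys, mode names and defaults are invented for the example:

```python
DEFAULT_MODE = ("female_voice", "calm")  # assumed fallback mode

def choose_feedback_mode(profile):
    """Pick voice, speaking style and dialect for feedback from the
    driver's collected preference information."""
    voice = profile.get("preferred_voice", DEFAULT_MODE[0])
    style = profile.get("preferred_style", DEFAULT_MODE[1])
    dialect = profile.get("dialect", "mandarin")
    return {"voice": voice, "style": style, "dialect": dialect}
```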
According to the above technical scheme, the driver's questions and/or the driver's driving state are collected as the trigger information for intelligent interaction, and intelligent interaction with the driver is performed when the driver actively interacts or the driver's driving state is poor, so that the driver is prevented from lapsing into distraction, fatigue and similar conditions.
By incorporating the additional information such as the driver's driving environment, the driver's personal information and the vehicle information, the driver is understood more accurately and can therefore be answered or informed more accurately, and by determining a target feedback mode based on the collected information, the poor experience of merely issuing harsh warnings to the driver is avoided.
To better meet the scenario of accompanying a driver on long-distance drives, the amount of computation needs to be reduced and the response speed improved, so the intelligent interaction method for a vehicle is further refined. Specifically, before the analyzing of the driver's current demand according to the state information, the method further includes:
judging whether the length of the state information is greater than a set length threshold;
and if the length of the state information is greater than the set length threshold, filtering or replacing the state information.
The set length threshold can be set according to actual needs.
If the state information is in text form, the text filtering may specifically be: identifying nonsensical words in the text, and deleting the nonsensical words or duplicate description words.
The text replacement may be: identifying words in the text whose length is greater than a set word-length threshold, searching for similar words, and selecting a relatively shorter word among the similar words to replace the original word.
By filtering or replacing the state text to limit the input size, the amount of computation is reduced and the response time is shortened.
If the state information is in vector form, it may be filtered and replaced in vector form based on the same principles.
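As a minimal sketch of the length-limiting step above, assuming a toy stop-word list and synonym table (both hypothetical; real lexicons would be used in practice, and the word-level deduplication here is a deliberate simplification):

```python
# Illustrative stop words ("nonsensical words") and shorter synonyms.
STOP_WORDS = {"um", "uh", "like"}
SYNONYMS = {"approximately": "about", "automobile": "car"}

def shrink_state_text(text: str, max_len: int) -> str:
    """If the text exceeds the set length threshold, filter and replace it."""
    if len(text) <= max_len:                      # under the threshold: keep as-is
        return text
    words, seen = [], set()
    for w in text.split():
        if w.lower() in STOP_WORDS:               # delete nonsensical words
            continue
        if w.lower() in seen:                     # delete duplicate description words
            continue
        seen.add(w.lower())
        words.append(SYNONYMS.get(w.lower(), w))  # replace long words with shorter ones
    return " ".join(words)
```

For example, `shrink_state_text("um the automobile is approximately near", 10)` drops the filler word and substitutes the shorter synonyms, yielding `"the car is about near"`.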
In order to improve accuracy of semantic recognition of the state text, the analyzing the current requirement of the driver according to the state text to generate output information meeting the current requirement of the driver includes:
extracting word content features and word position features of the state text, and carrying out semantic analysis of set depth on the state text based on the word content features and the word position features;
and outputting output information of the responding driver based on the semantic analysis result and pre-learned knowledge, wherein the pre-learned knowledge comprises interaction data of daily drivers during long-distance driving and knowledge related to interaction.
The word content features refer to features describing the content of words, and the word position features refer to features describing the position of words in state text.
Depth can be understood as the number of semantic parsing passes over the text: the more passes, the more accurate the parsing, but the longer the computation time.
To meet the real-time response requirement of the scene, the parsing depth may be set as needed; typically the depth cannot exceed a set depth threshold.
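The depth cap described above can be sketched as follows; `refine` is a placeholder standing in for one real semantic-parsing pass, and the default threshold value is an assumption:

```python
def refine(state):
    # placeholder for one semantic-parsing pass over the representation
    return state + ["parsed"]

def parse_with_depth(features, depth, depth_threshold=4):
    """Run up to depth_threshold parsing passes: more passes are more
    accurate but take longer, so the count is capped for real-time use."""
    depth = min(depth, depth_threshold)   # depth cannot exceed the set threshold
    state = list(features)
    for _ in range(depth):
        state = refine(state)
    return state
```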
Fig. 2 is a schematic flow chart of another intelligent interaction method for a vehicle according to an embodiment of the present invention. This embodiment is an alternative to the embodiments described above. Referring to fig. 2, an intelligent interaction method for a vehicle provided by an embodiment of the present invention includes:
s210, collecting at least one of the problem of the driver and the driving state of the driver as trigger information.
S220, collecting at least one of driving environment of a driver, personal information of the driver and vehicle information as additional information.
S230, word segmentation is carried out on the trigger information, and vector conversion is carried out on words after word segmentation, so that a word vector matrix is generated; and carrying out structuring treatment on the additional information, and carrying out vectorization on the additional information after structuring treatment.
The structured template can be set from the viewpoint of ease of semantic analysis.
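A possible structured template for the additional information, shown purely for illustration (the field names and values are assumptions, not the patent's template):

```python
# Hypothetical structured template chosen for ease of later semantic analysis.
TEMPLATE = "environment: {env}; driver: {driver}; vehicle: {vehicle}"

def structure_additional(env: str, driver: str, vehicle: str) -> str:
    """Fill the template with the three kinds of additional information."""
    return TEMPLATE.format(env=env, driver=driver, vehicle=vehicle)

text = structure_additional("night, raining", "age 45, 6h on duty", "truck, 80 km/h")
# text == "environment: night, raining; driver: age 45, 6h on duty; vehicle: truck, 80 km/h"
```

The structured text would then be vectorized before being combined with the trigger information.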
S240, combining the additional information and the triggering information in the form of vectors to obtain vectors of state information describing the current state of the driver, and taking the generated vectors of the state information as the word content characteristics.
S250, calculating absolute positions and relative positions of words in the trigger information and the additional information, coding the absolute positions and the relative positions, generating a position vector matrix of the words, and taking the generated position vector matrix as a word position feature.
Alternatively, the vector of state information may be encoded by RoPE rotary position encoding to obtain the word position feature.
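RoPE (rotary position embedding) rotates pairs of vector dimensions by position-dependent angles, so that dot products between tokens come to depend on their relative positions. A minimal pure-Python sketch follows (the base constant 10000 is the common convention; this is an illustration, not the patent's implementation):

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate a token vector (even length) by angles that grow with its
    absolute position pos; each dimension pair gets its own frequency."""
    half = len(vec) // 2
    out = [0.0] * len(vec)
    for i in range(half):
        theta = pos * base ** (-i / half)       # angle for this dimension pair
        c, s = math.cos(theta), math.sin(theta)
        x1, x2 = vec[i], vec[i + half]
        out[i] = x1 * c - x2 * s                # 2-D rotation of the pair
        out[i + half] = x1 * s + x2 * c
    return out
```

At position 0 the vector is unchanged, and since each step is a pure rotation the vector norm is preserved at every position.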
S260, inputting the word content characteristics and the word position characteristics into a pre-trained intelligent interaction network, and outputting word vector probability distribution of candidate replies arranged in sequence.
In the chat scenario inside a commercial vehicle, the driver's main concerns center on vehicle operation, together with local points of interest, the surrounding environment, city information, and the like, and the system needs to provide functions including chat topics for the driver as well as driver-state care and reminders. It is therefore necessary to consider how to introduce relevant prior knowledge and to weight the information of key concern (e.g., danger alerts) more heavily, so that when chatting the model gives priority to feedback on the relevant aspects. For this purpose, the intelligent interaction network is obtained by training a large language model with the interaction data of daily drivers during long-distance driving and the knowledge involved in such interaction.
Current large AI models pursue comprehensive mastery of general world knowledge and highly accurate answers, adopt an autoregressive reasoning mode, and therefore have a huge number of parameters, and their replies tend to be long. Even when a user converses directly over the network with ChatGPT, which is backed by enormous computing power, the reasoning speed is not fast enough, let alone within the reasoning performance that the related equipment in a commercial vehicle can support. For this reason, the model needs to be simplified and modified so as to retain its reasoning capability in the fixed domain while placing more emphasis on reasoning performance, guaranteeing smooth dialogue reasoning with the support of commercial-vehicle-side equipment. Therefore, the number of network layers of the intelligent interaction network is smaller than a set layer-number threshold, so as to improve the network reasoning speed.
In order to further improve the network reasoning speed, the intelligent interaction network has a Decoder (decoding-layer) structure; a GQA (grouped-query attention) mechanism is adopted in the decoding layer, the number of heads set for Key and Value in the attention mechanism is smaller than a set head-number threshold, and normalization in the decoding layer adopts RMS Norm (root-mean-square layer normalization). The number of heads refers to the number of parts into which each Key and Value is split; these parts are the "heads", essentially the same concept as the heads in the multi-head attention mechanism.
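The two decoder-layer ingredients named above can be illustrated in miniature: RMS normalization (no mean subtraction, cheaper than LayerNorm) and the head-grouping idea of GQA, in which several query heads share one Key/Value head. All names and numbers below are illustrative assumptions:

```python
import math

def rms_norm(x, eps=1e-6):
    """Root-mean-square normalization: scale by the RMS, no mean subtraction."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms for v in x]

def kv_head_for_query(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """GQA head mapping: consecutive query heads share one Key/Value head,
    so fewer KV heads are stored and the KV cache shrinks."""
    group = n_q_heads // n_kv_heads   # query heads per KV head
    return q_head // group
```

For example, with 8 query heads and only 2 KV heads, query heads 0-3 read KV head 0 and query heads 4-7 read KV head 1.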
S270, determining word vectors of the output information according to the word vector probability distribution of the candidate replies; and synthesizing and mapping the word vector of the output information to an output information text, and taking the output information text as the output information of the responding driver.
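A toy illustration of turning the per-step word-vector probability distributions of the candidate replies into reply text by greedy selection; the vocabulary, the end-of-sequence token, and the probabilities are invented for the example:

```python
# Hypothetical vocabulary; "<eos>" marks the end of the reply.
VOCAB = ["<eos>", "take", "a", "rest", "soon"]

def greedy_decode(step_probs):
    """step_probs: one probability list over VOCAB per output step.
    Pick the most probable word at each step until <eos>."""
    words = []
    for probs in step_probs:
        idx = max(range(len(probs)), key=probs.__getitem__)
        if VOCAB[idx] == "<eos>":
            break
        words.append(VOCAB[idx])
    return " ".join(words)

reply = greedy_decode([
    [0.0, 0.7, 0.1, 0.1, 0.1],
    [0.0, 0.0, 0.6, 0.2, 0.2],
    [0.1, 0.0, 0.0, 0.8, 0.1],
    [0.9, 0.0, 0.0, 0.0, 0.1],
])
# reply == "take a rest"
```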
S280, converting the text of the output information into voice and playing the voice to a driver.
This embodiment combines the network structure, the attention-mechanism algorithm, and the normalization algorithm to streamline the network to suit the scene requirements, on the basis of a ChatGPT-style model.
To better cater to the driver's preferences, the method further includes, after determining the target feedback mode based on the collected information and feeding the output information back to the driver in the target feedback mode:
obtaining the driver's satisfaction with the output information, and taking the output information whose satisfaction is lower than a set satisfaction threshold, together with the associated state text, as negative samples;
modifying the negative samples, and correcting the pre-learned knowledge content based on the corrected data;
or
identifying the driver's likes and dislikes according to the driver's satisfaction with the output information, and adjusting the learned knowledge with respect to those likes and dislikes, so as to accord with the driver's preferences.
Specifically, the pre-learned knowledge can be adjusted according to the driver's likes and dislikes, for example by adding knowledge the driver likes and deleting knowledge the driver dislikes.
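The negative-sample collection step might be sketched as follows; the satisfaction threshold and the record format are assumptions made for the example:

```python
# Hypothetical satisfaction threshold on a 0-1 scale.
SATISFACTION_THRESHOLD = 0.6

def collect_negative_samples(interactions):
    """interactions: (state_text, output_text, satisfaction) tuples.
    Keep low-satisfaction replies, with their state text, as negative samples."""
    return [
        {"state": s, "output": o}
        for s, o, score in interactions
        if score < SATISFACTION_THRESHOLD
    ]

neg = collect_negative_samples([
    ("tired driver", "keep driving", 0.2),   # below threshold: kept
    ("asks weather", "sunny today", 0.9),    # above threshold: discarded
])
```

The collected samples would then be corrected and fed back to adjust the pre-learned knowledge.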
Fig. 3 is a schematic flow chart of another intelligent interaction method for a vehicle according to an embodiment of the present invention. This embodiment is an alternative to the embodiments described above. Referring to fig. 3, an intelligent interaction method for a vehicle provided by an embodiment of the present invention includes:
s310, collecting at least one of the problem of the driver and the driving state of the driver as trigger information.
S320, collecting at least one of driving environment of a driver, personal information of the driver and vehicle information as additional information.
S330, combining the text of the trigger information and the text of the additional information to obtain the text of the state information, segmenting the text of the state information, performing vector conversion on segmented words to generate a word vector matrix, and using the generated word vector matrix as word content characteristics.
S340, calculating the absolute position and the relative position of each word in the trigger information and the additional information, coding the absolute position and the relative position, generating a position vector matrix of the word, and taking the generated position vector matrix as a word position feature.
S350, inputting the word content characteristics and the word position characteristics into a pre-trained intelligent interaction network, and outputting word vector probability distribution of candidate answers arranged in sequence.
S360, determining word vectors of output information according to word vector probability distribution of candidate replies; and synthesizing and mapping the word vector of the output information to an output information text, and taking the output information text as the output information of the responding driver.
S370, converting the text of the output information into voice, and playing the voice to a driver.
In the technical scheme provided by this embodiment of the invention, the additional-information vector is added at the beginning of the model pipeline as information compensation, which ensures that the information is available while interfering as little as possible with reasoning over the trigger information as the core content, and keeps the model's side-branch computation as small as possible.
Fig. 4 is a schematic structural diagram of an intelligent interaction back-end server for a vehicle according to an embodiment of the present invention. As shown in fig. 4, the apparatus comprises a processor 40, a memory 41, an input device 42 and an output device 43; the number of processors 40 in the device may be one or more, one processor 40 being taken as an example in fig. 4; the processor 40, the memory 41, the input means 42 and the output means 43 in the device may be connected by a bus or other means, in fig. 4 by way of example.
The memory 41 is a computer-readable storage medium and may be used to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the intelligent interaction method for a vehicle in the embodiments of the present invention. The processor 40 executes various functional applications of the device and performs data processing, i.e., implements the above-described intelligent interaction method, by running the software programs, instructions, and modules stored in the memory 41.
The memory 41 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, at least one application program required for functions; the storage data area may store data created according to the use of the terminal, etc. In addition, memory 41 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, memory 41 may further include memory located remotely from processor 40, which may be connected to the device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input means 42 may be used to receive entered numeric or character information and to generate key signal inputs related to user settings and function control of the device. The output means 43 may comprise a display device such as a display screen.
Fig. 5 is a schematic structural diagram of an intelligent interactive front-end device for a vehicle according to an embodiment of the present invention. As shown in fig. 5, the apparatus includes:
the vehicle-mounted microphone 51 for receiving the driver's voice information;
a signal receiver 52 connected to the in-vehicle apparatus for receiving vehicle information, a driver driving state, and/or driver driving environment information;
the signal transmitter 53 is connected with the intelligent interaction back-end server for vehicle according to the above embodiment, and is configured to send the received information to the intelligent interaction back-end server for vehicle;
and the player 54 is connected with the signal receiver and is used for playing the received output information fed back to the driver.
Alternatively, this embodiment is not limited to the specific component forms, as long as the above functions can be achieved; the components may be set according to actual needs.
To add other functions, relevant functional components may be added on top of the components described above, which is not limited in this embodiment.
Note that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the invention, which is set forth in the following claims.

Claims (7)

1. An intelligent interaction method for a vehicle is characterized by comprising the following steps:
collecting at least one of a driver problem and a driver driving state as trigger information;
collecting a driving environment of a driver, personal information of the driver and vehicle information as additional information;
combining the trigger information and the additional information to obtain state information describing the current state of the driver;
analyzing the current demand of the driver according to the state information, and generating output information meeting the current demand of the driver;
determining a target feedback mode based on the acquired information, and feeding back output information to a driver in the target feedback mode so as to realize intelligent interaction with the driver;
analyzing the current demand of the driver according to the state information to generate output information meeting the current demand of the driver, including:
extracting word content characteristics and word position characteristics of the state information;
inputting the word content characteristics and the word position characteristics into a pre-trained intelligent interaction network, and outputting word vector probability distribution of candidate replies arranged in sequence;
determining word vectors of the output information according to the word vector probability distribution of the candidate replies;
synthesizing and mapping word vectors of the output information to an output information text, and taking the output information text as the output information of a responding driver;
the intelligent interaction network is obtained by training a large language model by utilizing interaction data of a daily driver during long-distance driving and knowledge related to interaction; the network layer number of the intelligent interaction network is smaller than a set network layer number threshold value;
before the analyzing the current demand of the driver according to the state information, the method further comprises the following steps:
judging whether the length of the state information is greater than a set length threshold;
and if the length of the state information is greater than the set length threshold, filtering or replacing the state information.
2. The method of claim 1, wherein extracting word content features of the state information comprises:
if the state information is a vector formed by combining a word vector matrix of each word in the trigger information and a word vector matrix of each word in the additional information, the vector of the state information is used as the word content characteristic;
if the state information is a text obtained by combining the text of the trigger information and the text of the additional information, word segmentation is carried out on the text of the state information, vector conversion is carried out on words after word segmentation, a word vector matrix is generated, and the generated word vector matrix is used as the word content characteristic;
the word position feature of the extracted state information comprises:
calculating the absolute position and the relative position of each word in the trigger information and the additional information, coding the absolute position and the relative position, generating a position vector matrix of the word, and taking the generated position vector matrix as the position characteristic of the word.
3. The method of claim 2, wherein if the state information is a vector formed by combining a word vector matrix of each word in the trigger information and a word vector matrix of each word in the additional information, the method further comprises, before characterizing the vector of the state information as the word content:
carrying out structuring treatment on the additional information;
vectorizing the additional information and the trigger information after the structuring treatment respectively;
and combining the additional information with the triggering information in the form of vectors to obtain vectors describing the state information of the current state of the driver.
4. The method of claim 1, wherein the intelligent interaction network is a Decoder decoding-layer structure, a GQA grouped-query attention mechanism is adopted in the decoding layer, the number of heads set for Key keys and Value values in the attention mechanism is smaller than a set head-number threshold, and normalization in the decoding layer adopts RMS Norm root-mean-square layer normalization.
5. The method of claim 1, wherein after determining the target feedback mode based on the collected information and feeding back the output information to the driver in the target feedback mode, the method further comprises:
obtaining the driver's satisfaction with the output information, and taking the output information whose satisfaction is lower than a set satisfaction threshold, together with the associated state text, as bad-case negative samples;
modifying the negative sample, and correcting the pre-learned knowledge content based on the corrected data;
or
identifying the driver's likes and dislikes according to the driver's satisfaction with the output information, and adjusting the learned knowledge with respect to those likes and dislikes, so as to accord with the driver's preferences.
6. An intelligent interactive back-end server for a vehicle, the server comprising:
one or more processors;
storage means for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the intelligent interaction method for vehicles of any of claims 1-5.
7. An intelligent interactive front-end device for a vehicle, the device comprising:
the vehicle-mounted microphone, configured to receive voice information of a driver;
the signal receiver is connected with the vehicle-mounted equipment and used for receiving vehicle information, driving state of a driver and/or driving environment information of the driver;
a signal transmitter connected to the intelligent interactive back-end server for vehicles according to claim 6, for transmitting the received information to the intelligent interactive back-end server for vehicles;
and the player is connected with the signal receiver and used for playing the received output information fed back to the driver.
CN202311352746.4A 2023-10-19 2023-10-19 Intelligent interaction method for vehicle, back-end server and front-end equipment Active CN117115788B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311352746.4A CN117115788B (en) 2023-10-19 2023-10-19 Intelligent interaction method for vehicle, back-end server and front-end equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311352746.4A CN117115788B (en) 2023-10-19 2023-10-19 Intelligent interaction method for vehicle, back-end server and front-end equipment

Publications (2)

Publication Number Publication Date
CN117115788A CN117115788A (en) 2023-11-24
CN117115788B (en) 2024-01-02

Family

ID=88796805

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311352746.4A Active CN117115788B (en) 2023-10-19 2023-10-19 Intelligent interaction method for vehicle, back-end server and front-end equipment

Country Status (1)

Country Link
CN (1) CN117115788B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106650633A (en) * 2016-11-29 2017-05-10 上海智臻智能网络科技股份有限公司 Driver emotion recognition method and device
CN112784695A (en) * 2020-12-31 2021-05-11 南京视察者智能科技有限公司 Driver abnormal state detection method based on image and voice recognition
CN113723528A (en) * 2021-09-01 2021-11-30 斑马网络技术有限公司 Vehicle-mounted voice-video fusion multi-mode interaction method, system, device and storage medium
CN114445888A (en) * 2022-01-21 2022-05-06 常州大学 Vehicle-mounted interaction system based on emotion perception and voice interaction
CN114925157A (en) * 2022-03-07 2022-08-19 武汉理工大学 Nuclear power station maintenance experience text matching method based on pre-training model
CN115440221A (en) * 2022-11-09 2022-12-06 佛山市天地行科技有限公司 Vehicle-mounted intelligent voice interaction method and system based on cloud computing
CN116403576A (en) * 2023-03-10 2023-07-07 中汽创智科技有限公司 Interaction method, device, equipment and storage medium of intelligent cabin of vehicle
CN116443025A (en) * 2023-03-06 2023-07-18 吉林大学 Operation vehicle driver fatigue driving intervention system


Also Published As

Publication number Publication date
CN117115788A (en) 2023-11-24

Similar Documents

Publication Publication Date Title
CN107316643B (en) Voice interaction method and device
CN107240398B (en) Intelligent voice interaction method and device
CN107665706B (en) Rapid voice interaction method and system
CN112100349A (en) Multi-turn dialogue method and device, electronic equipment and storage medium
CN111966320B (en) Multimodal interaction method for vehicle, storage medium, and electronic device
CN107665704B (en) Voice instruction detection model construction method, detection method and system, and man-machine interaction method and equipment
CN113539242A (en) Speech recognition method, speech recognition device, computer equipment and storage medium
WO2019214799A1 (en) Smart dialogue system and method of integrating enriched semantics from personal and contextual learning
CN111161726B (en) Intelligent voice interaction method, device, medium and system
CN115440221B (en) Vehicle-mounted intelligent voice interaction method and system based on cloud computing
US20230331250A1 (en) Method and apparatus for configuring deep learning algorithm for autonomous driving
CN113954855B (en) Self-adaptive matching method for automobile driving mode
CN117079299A (en) Data processing method, device, electronic equipment and storage medium
CN117115788B (en) Intelligent interaction method for vehicle, back-end server and front-end equipment
US20240046931A1 (en) Voice interaction method and apparatus
CN111196124B (en) In-vehicle environment regulation method and device, electronic equipment and storage medium
CN115470799B (en) Text transmission and semantic understanding integrated method for network edge equipment
CN116450799A (en) Intelligent dialogue method and equipment applied to traffic management service
CN116453506A (en) Audio classification method, system and device based on feature fusion
CN115168558A (en) Method for realizing multi-round man-machine conversation
CN115544232A (en) Vehicle-mounted intelligent question answering and information recommending method and device
DE102021212744A1 (en) DIALOGUE SYSTEM, VEHICLE WITH THE SAME AND METHOD FOR CONTROLLING A DIALOGUE SYSTEM
CN113409776B (en) Voice recognition method and device, electronic equipment and storage medium
CN116110396B (en) Voice interaction method, server and computer readable storage medium
CN111899729B (en) Training method and device for voice model, server and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant