CN117877475A - Voice interaction result presentation method and device based on environment and storage medium - Google Patents


Publication number
CN117877475A
CN117877475A
Authority
CN
China
Prior art keywords: vehicle, dimension, current, message, result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311847695.2A
Other languages
Chinese (zh)
Inventor
袁学勇
张海波
章伟
雷琴辉
刘俊峰
孟操
陆金兵
梅林海
刘权
王士进
刘聪
胡国平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
iFlytek Co Ltd
Original Assignee
iFlytek Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by iFlytek Co Ltd filed Critical iFlytek Co Ltd
Priority to CN202311847695.2A priority Critical patent/CN117877475A/en
Publication of CN117877475A publication Critical patent/CN117877475A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L 15/063 Training
    • G10L 15/26 Speech to text systems
    • G10L 2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Navigation (AREA)

Abstract

The application discloses an environment-based voice interaction result presentation method, device and storage medium. The method automatically judges current scene information from vehicle-related data and, on that basis, provides multiple message output strategies for vehicle messages, adapting to changing scenes and thereby bringing the user a better interaction experience.

Description

Voice interaction result presentation method and device based on environment and storage medium
Technical Field
The application relates to the technical fields of voice technology and artificial intelligence, and in particular to an environment-based voice interaction result presentation method, device and storage medium.
Background
With the vigorous development of artificial intelligence and the mobile internet, voice interaction has become the most convenient, safest and most widely used interaction mode in current intelligent-vehicle human-machine scenarios. Based on voice technology, the vehicle can listen to instructions and answer questions, and can be linked with the navigation, audio, air-conditioning and other systems of the vehicle, realizing voice control of the in-vehicle system. For example, a user can control navigation, play music, adjust the air conditioner and operate the windows through voice instructions, which greatly improves driving safety and comfort.
With the advent of in-car virtual assistants, the car is gradually becoming a travel partner: it can remind us to fasten the seat belt, monitor whether we are driving while fatigued, check the weather, plan an optimal route, and even book flights and hotels for us. However, users often feel that the voice interaction of the car machine is not intelligent enough, or even that it is noisy. For example, when the vehicle is on a congested road and the driver is busy avoiding a scrape, the driver still has to listen to a lengthy navigation broadcast; or when the occupants are chatting or listening to music, the car machine keeps outputting navigation prompts. This seriously affects the driving and riding experience. Because the voice interaction system of the existing car machine does not consider simplifying voice output as a whole, its voice output is not intelligent enough and cannot bring users a good interaction experience.
Disclosure of Invention
In view of the above, the present application provides an environment-based voice interaction result presentation method, device and storage medium, so as to solve the problems in the prior art that the voice output of the car machine is not intelligent enough and cannot bring users a good interaction experience.
According to an aspect of the present application, there is provided a voice interaction result presenting method based on environment, the method including:
obtaining a message to be output by a vehicle machine and determining a grade result of the message;
collecting vehicle related data, and obtaining a reduction degree judgment result based on the vehicle related data;
and determining a message output strategy of the vehicle machine according to the reduction degree judging result and the message grade result, and controlling the vehicle machine to execute follow-up actions according to the determined message output strategy.
Further, the method for obtaining the message to be output by the car machine and determining the grade result of the message includes the following steps:
grading the message using a trained message grading model to determine the grade result of the message.
Further, the step of collecting vehicle-related data and obtaining a reduction degree determination result based on the vehicle-related data includes:
performing scene calculation based on the acquired vehicle related data to obtain scene dimension data;
and carrying out the reduction degree scoring according to the scene dimension data to obtain a reduction degree scoring result, and carrying out the reduction degree judgment according to the reduction degree scoring result so as to output a reduction degree judgment result.
Further, the scene dimension data includes at least one or a combination of the following:
(1) Vehicle speed change dimension data;
(2) Road condition dimension data;
(3) Sound dimension data;
(4) Video dimension data.
Further, performing the reduction degree scoring according to the scene dimension data to obtain a reduction degree scoring result includes:
acquiring a correction coefficient of the current vehicle speed change dimension and the current vehicle speed change;
acquiring a correction coefficient of the current road condition dimension, the average vehicle speed in the current analysis period, and the sum of the vehicle speed changes corresponding to all sampling points in the current analysis period;
acquiring a correction coefficient of the current in-vehicle sound dimension and a scene score of the current in-vehicle sound dimension;
acquiring a correction coefficient of the current in-vehicle video dimension and a scene score of the current in-vehicle video dimension;
multiplying the correction coefficient of the current vehicle speed change dimension by the current vehicle speed change to obtain a first reduction degree scoring result corresponding to the vehicle speed change dimension;
dividing the sum of the vehicle speed changes corresponding to all sampling points in the current analysis period by the average vehicle speed in the current analysis period, and multiplying the quotient by the correction coefficient of the current road condition dimension, to obtain a second reduction degree scoring result corresponding to the road condition dimension;
multiplying the correction coefficient of the current in-vehicle sound dimension by the scene score of the current in-vehicle sound dimension to obtain a third reduction degree scoring result corresponding to the sound dimension;
multiplying the correction coefficient of the current in-vehicle video dimension by the scene score of the current in-vehicle video dimension to obtain a fourth reduction degree scoring result corresponding to the video dimension;
and summing the first, second, third and fourth reduction degree scoring results to obtain the corresponding reduction degree scoring result.
Further, performing the reduction degree scoring according to the scene dimension data to obtain a reduction degree scoring result includes:
calculating the reduction degree score of the scene dimension data according to the following formula:
P = C·A' + R·(Σ|ΔV|)/V' + S·S' + M·M'
wherein P represents the reduction degree scoring result, C represents the correction coefficient of the current vehicle speed change dimension, A' represents the current vehicle speed change, R represents the correction coefficient of the current road condition dimension, Σ|ΔV| represents the sum of the vehicle speed changes corresponding to all sampling points in the current analysis period, V' represents the average vehicle speed in the current analysis period, S represents the correction coefficient of the current in-vehicle sound dimension, S' represents the scene score of the current in-vehicle sound dimension, M represents the correction coefficient of the current in-vehicle video dimension, and M' represents the scene score of the current in-vehicle video dimension.
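Reading each "correction coefficient with its term" combination as a product and the final accumulation as a sum, the scoring can be sketched in code as follows (a minimal illustration; the function name and all example values are hypothetical, not taken from the patent):

```python
def reduction_score(C, A_prime, R, dv_sum, V_prime, S, S_prime, M, M_prime):
    """Compute P = C*A' + R*(sum of speed changes)/V' + S*S' + M*M'.

    Each correction coefficient (C, R, S, M) weights one scene dimension;
    the four weighted terms are summed into the overall score P.
    """
    speed_term = C * A_prime               # vehicle speed change dimension
    traffic_term = R * (dv_sum / V_prime)  # road condition dimension
    sound_term = S * S_prime               # in-vehicle sound dimension
    video_term = M * M_prime               # in-vehicle video dimension
    return speed_term + traffic_term + sound_term + video_term

# Example with purely illustrative values:
P = reduction_score(C=1.0, A_prime=2.5, R=0.8, dv_sum=12.0, V_prime=40.0,
                    S=1.2, S_prime=3.0, M=0.5, M_prime=2.0)
```

A higher P then indicates a scene where the voice output should be more aggressively reduced.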
Further, the reduction degree judgment result includes a plurality of different reduction degree levels, and the method further includes:
mapping the reduction degree scoring result to the corresponding reduction degree level, and determining the corresponding message output strategy according to the reduction degree level.
Further, the method further comprises:
when it is detected that the message conforms to a preset rule, controlling the car machine to broadcast the message according to the determined message output strategy while triggering the car machine to execute the corresponding car machine event action.
Further, the vehicle event action includes at least one or a combination of the following:
instrument panel animation display, steering wheel vibration and voice message prompt.
According to still another aspect of the present application, there is provided a voice interaction result presentation apparatus based on environment, the apparatus including:
the message acquisition module is used for acquiring a message to be output by the vehicle machine and determining a grade result of the message;
the reduction degree judgment module is used for collecting vehicle-related data and obtaining a reduction degree judgment result based on the vehicle-related data;
and the control module is used for determining the message output strategy of the vehicle machine according to the reduction degree judging result and the grade result of the message and controlling the vehicle machine to execute follow-up actions according to the determined message output strategy.
According to a further aspect of the present application, there is provided a computer readable storage medium storing a computer program loadable by a processor to perform the steps of any of the above-described context-based speech interaction result presentation methods.
The environment-based voice interaction result presentation method, device and storage medium of the present application obtain the message to be output by the car machine and determine the grade result of the message; collect vehicle-related data and obtain a reduction degree judgment result based on the vehicle-related data; and determine the message output strategy of the car machine according to the reduction degree judgment result and the grade result of the message, controlling the car machine to execute subsequent actions according to the determined strategy. The application thus provides multiple message output strategies for vehicle messages based on environment data, adapting to changing scenes and bringing users a better interaction experience.
Further, when broadcasting important information, the car machine cooperates with related car machine event actions such as instrument panel animation and steering wheel vibration, reminding the user of the important information both somatically and visually, thereby meeting the requirements of intelligent driving scenarios.
Drawings
Technical solutions and other advantageous effects of the present application will be made apparent from the following detailed description of specific embodiments of the present application with reference to the accompanying drawings.
FIG. 1 is a flow chart of a method for presenting context-based voice interaction results according to an embodiment of the present application;
FIG. 2 is a functional block diagram of an environment-based voice interaction result presentation method according to an embodiment of the present application;
FIG. 3 is a schematic illustration of the overall course of motion of an S-shaped velocity profile provided in accordance with an embodiment of the present application;
FIG. 4 is a schematic diagram illustrating three different target speed phases according to an embodiment of the present application;
FIG. 5 is a schematic view showing a sub-process of step S300 shown in FIG. 1;
fig. 6 is a block diagram of a structure of an environment-based voice interaction result presentation apparatus according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
The terms "first," "second," and the like herein are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more of the described features. In the description of the present application, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
In the description of the present application, it should be noted that, unless explicitly specified and limited otherwise, the terms "mounted", "linked" and "connected" are to be construed broadly; for example, a connection may be fixed, detachable or integral; it may be mechanical or electrical, or the two elements may communicate with each other; it may be direct or indirect through an intermediate medium, and may be a communication between the interiors of two elements or an interaction relationship between two elements. The specific meaning of these terms in this application will be understood by those of ordinary skill in the art as the case may be.
The following disclosure provides many different embodiments or examples for implementing different structures of the present application. In order to simplify the disclosure of the present application, the components and arrangements of specific examples are described below. Of course, they are merely examples and are not intended to limit the present application. Furthermore, the present application may repeat reference numerals and/or letters in the various examples, which are for the purpose of brevity and clarity, and which do not in themselves indicate the relationship between the various embodiments and/or arrangements discussed.
The inventor finds that existing car machine voice interaction does not consider output simplification as a whole. Although in mobile-phone navigation applications a user can manually set the broadcast mode and speech rate, this requires manual selection and confirmation; in some scenarios the application automatically pops up a dialog box for the user to confirm a broadcast mode, but such recommendations are mostly inferred, from the historical navigation route, of the user's familiarity with the current driving route. Moreover, the existing broadcast mode depends entirely on the navigation application's own functions: the navigation scene is judged from the road conditions and the route the user has travelled, and in the reduced mode the navigation broadcasts information only for specific categories, still relying on the user's manual selection or confirmation. Therefore, the existing presentation of voice interaction results can be neither intelligent nor concise, and cannot bring users a good voice interaction experience.
In view of this, the present application aims to provide an environment-based voice interaction result presentation method, device and storage medium, which provide multiple output strategies for car machine voice messages based on environment data, adapting to changing scenes and thereby bringing users a better voice interaction experience.
Fig. 1 is a flowchart of an environment-based voice interaction result presentation method according to an embodiment of the present application, and fig. 2 is a schematic block diagram of the environment-based voice interaction result presentation method according to an embodiment of the present application.
Referring to fig. 1 and 2, an embodiment of the present application provides a method for presenting a voice interaction result based on environment, where the method includes:
step S100, obtaining a message to be output by a vehicle machine and determining a grade result of the message;
step S200, collecting vehicle related data, and obtaining a reduction degree judgment result based on the vehicle related data;
and step S300, determining a message output strategy of the vehicle according to the reduction degree judging result and the grade result of the message, and controlling the vehicle to execute subsequent actions according to the determined message output strategy.
The steps S100 to S300 will be specifically described below.
In step S100, a voice message to be output by the vehicle is acquired, and a ranking result of the message is determined.
In this embodiment, the car machine is a general term for the complete set of control and processing systems of the vehicle, including, for example, a Text To Speech (TTS) device, an intelligent central control device, and the like. Text-to-speech is a technology that generates artificial voice by mechanical and electronic means, converting arbitrary text information into standard, fluent speech in real time.
Illustratively, the vehicle-to-machine interaction device includes, but is not limited to, a smart vehicle-to-machine system, an information search application, a map application, a social platform application, an audio-video playing application, a smart assistant application, and the like; the in-vehicle interaction device may be one having associated sound pickup means (e.g. one or more microphones) for capturing voice instructions of a user, and may be one having associated sound playing means (e.g. one or more speakers) for playing sound outwards.
In this embodiment, the method for obtaining a message to be output by a vehicle and determining a class result of the message includes: the message is ranked using a trained message ranking model to determine ranking results for the message.
Specifically, in the process of car machine interaction, the message to be output is first input into the trained message grading model for grading; the specific message grades are as follows:
(1) A level I message represents extremely important information, such as overspeed prompts and steering prompts in navigation output and question-answer messages in the voice wake-up state; the car machine must broadcast it.
(2) A level II message represents moderately important information, such as weather reminders; the car machine can broadcast it at an appropriate time.
(3) A level III message represents unimportant information, such as various content recommendations; the car machine should broadcast it only at a time that does not disturb the user.
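The three grades and the broadcast obligation attached to each can be sketched as follows (names and the helper function are illustrative, not from the patent):

```python
from enum import IntEnum

class MessageLevel(IntEnum):
    """The three message grades produced by the grading model."""
    LEVEL_I = 1    # extremely important: overspeed/steering prompts, wake-up Q&A
    LEVEL_II = 2   # moderately important: e.g. weather reminders
    LEVEL_III = 3  # unimportant: content recommendations

def must_broadcast_now(level: MessageLevel) -> bool:
    """Only level-I messages must always be broadcast immediately;
    the other grades wait for a suitable or non-disturbing time."""
    return level == MessageLevel.LEVEL_I
```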
In this embodiment, the message grading model may be obtained by machine learning methods such as labeling of the message samples and model training.
Illustratively, the training method of the message grading model comprises the following steps:
acquiring sample data of information and corresponding labeling data;
outputting a grade result corresponding to the message through the message grading model according to the sample data of the message;
and generating a loss value through a loss function based on the grade result and the labeling data, and updating the message grading model until convergence to obtain the trained message grading model.
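The patent builds the grading model on BERT; as a self-contained illustration of the same train-predict-update loop, the sketch below substitutes a tiny bag-of-words multi-class perceptron (all sample texts, labels and names are hypothetical stand-ins, not the patent's actual model or data):

```python
# Toy stand-in for the message grading model: bag-of-words + perceptron.
SAMPLES = [
    ("overspeed warning turn ahead", 0),      # level I
    ("steering prompt turn left", 0),
    ("weather reminder rain today", 1),       # level II
    ("weather alert wind tomorrow", 1),
    ("recommended playlist for you", 2),      # level III
    ("recommended podcast new episode", 2),
]
VOCAB = sorted({w for text, _ in SAMPLES for w in text.split()})

def featurize(text):
    words = text.split()
    return [1.0 if w in words else 0.0 for w in VOCAB]

def train(samples, n_classes=3, epochs=20):
    weights = [[0.0] * len(VOCAB) for _ in range(n_classes)]
    for _ in range(epochs):
        for text, label in samples:
            x = featurize(text)
            scores = [sum(wi * xi for wi, xi in zip(w, x)) for w in weights]
            pred = scores.index(max(scores))
            if pred != label:  # update only on misclassification
                for i, xi in enumerate(x):
                    weights[label][i] += xi
                    weights[pred][i] -= xi
    return weights

def predict(weights, text):
    x = featurize(text)
    scores = [sum(wi * xi for wi, xi in zip(w, x)) for w in weights]
    return scores.index(max(scores))
```

The real model would replace the featurizer and update rule with BERT encoding and gradient descent on the loss value described above; the control flow is the same.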
Alternatively, in this embodiment, the message grading model may be built based on BERT.
It should be understood that, in this embodiment, after the message to be output is subjected to the grading process of the message grading model, a grade result of the corresponding message may be obtained, so as to confirm whether the message needs to be output according to the grade result of the message.
In step S200, vehicle-related data is collected, and a result of the reduction degree determination is obtained based on the vehicle-related data.
In this embodiment, the vehicle-related data includes at least vehicle environment data, and further, in some embodiments, the vehicle-related data includes scene dimension data.
Specifically, scene calculation is performed based on the collected vehicle-related data to obtain scene dimension data; reduction degree scoring is then performed according to the scene dimension data to obtain a reduction degree scoring result, and reduction degree judgment is performed according to the scoring result to output a reduction degree judgment result.
Illustratively, the scene dimension data includes at least one or a combination of the following:
(1) Vehicle speed change dimension data;
(2) Road condition dimension data;
(3) Sound dimension data;
(4) Video dimension data.
(1) The vehicle speed change dimension data can be obtained directly from the vehicle-mounted system; for example, vehicle speed data are collected at a sampling interval of 0.1 s, and the samples from multiple sampling points are cached for subsequent use in the reduction degree score calculation.
(2) The road condition dimension data mainly capture road condition information of the road being travelled; with navigation on, they can be obtained directly from the navigation system, and with navigation off they are deduced from the vehicle speed.
(3) The sound dimension data can be collected by sound pickup devices such as the in-vehicle microphone.
(4) The video dimension data are obtained directly from the vehicle-mounted system, mainly the video playing state.
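The 0.1 s sampling and caching of speed data described in item (1) can be sketched with a fixed-size ring buffer (the window length and class name are assumptions for illustration):

```python
from collections import deque

SAMPLE_INTERVAL_S = 0.1   # one vehicle-speed sample every 0.1 s (per the text)
WINDOW_S = 5.0            # analysis window length (hypothetical)

class SpeedBuffer:
    """Caches the most recent vehicle-speed samples for later scoring."""
    def __init__(self):
        self.samples = deque(maxlen=int(WINDOW_S / SAMPLE_INTERVAL_S))

    def push(self, speed_kmh: float):
        self.samples.append(speed_kmh)  # oldest sample drops automatically

    def average_speed(self) -> float:
        return sum(self.samples) / len(self.samples)

    def total_speed_change(self) -> float:
        # sum of |dv| between adjacent sampling points in the window
        s = list(self.samples)
        return sum(abs(b - a) for a, b in zip(s, s[1:]))
```

`average_speed` and `total_speed_change` correspond to V' and the summed speed changes used later in the score calculation.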
Based on the collected vehicle-related data, the embodiment of the application proposes a plurality of scene evaluation dimensions:
first, the vehicle speed change dimension, for example, the method for acquiring the vehicle speed change dimension data includes:
acquiring and caching a plurality of pieces of vehicle speed data acquired by the vehicle based on a preset sampling frequency;
constructing a relation curve of speed and time based on the cached multiple pieces of vehicle speed data;
dividing the current vehicle speed change dimension into a plurality of different target speed stage categories based on a first preset rule according to the relation curve of the speed and the time, and configuring correction coefficients of the vehicle speed change dimension corresponding to the plurality of different target speed stage categories.
Illustratively, normal speed-time curves fall into two broad categories: the T-type curve and the S-type curve. The T-type curve (in which acceleration changes abruptly and is constant over time) occurs only in ideal situations and can be regarded as a special case of the S-type curve.
In actual driving the speed follows an S-shaped curve, whose overall motion process is divided into 7 stages, namely increasing acceleration, uniform acceleration, decreasing acceleration, constant speed, increasing deceleration, uniform deceleration and decreasing deceleration; the acceleration is continuous at the junctions between stages. The speed, acceleration and jerk curves of the S-shaped profile are shown in Fig. 3.
In this embodiment, the absolute value of the rate of speed change (the acceleration) is used as the scene evaluation dimension, and deceleration is treated as negative acceleration, so the speed change can be divided into three target speed phases: rapid acceleration, acceleration and constant speed. The vehicle speed change correction coefficient C (C1, C2, C3) participates in the subsequent reduction degree score calculation. It should be understood that the vehicle speed change in this embodiment can be understood as acceleration.
As an example, referring to fig. 4, the method for classifying the current vehicle speed change dimension into a plurality of different target speed stage categories based on the first preset rule according to the speed versus time curve includes:
determining all current speed stages according to the relation curve of the speed and the time;
For each speed stage, calculating an absolute value of a speed change corresponding to the speed stage, and comparing the absolute value of the speed change with a first preset threshold and a second preset threshold respectively, wherein,
if the absolute value of the speed change is larger than the first preset threshold value, determining the speed stage as a type of 'rapid acceleration stage';
if the absolute value of the speed change is smaller than the second preset threshold value, determining the speed stage as a uniform speed stage type;
if the absolute value of the speed change is between the second preset threshold and the first preset threshold, the speed stage is determined as an acceleration stage type.
Specifically, regarding the calculation of the vehicle speed change phase: it can be measured by the change in speed over the past t seconds, with a first preset threshold a1 and a second preset threshold a2 (a1 > a2). The three target speed phases are then:
1. Rapid acceleration phase: the absolute value of the speed change is greater than a1, i.e. |vt − v0| > a1;
2. Constant speed phase: the absolute value of the speed change is less than a2, i.e. |vt − v0| < a2;
3. Acceleration phase: the absolute value of the speed change is between a2 and a1, i.e. a2 ≤ |vt − v0| ≤ a1;
wherein vt represents the speed t seconds before the current time, and v0 represents the speed at the current time.
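The three-way classification above can be sketched as follows (the threshold values are hypothetical, since the patent leaves a1 and a2 as preset parameters):

```python
A1 = 15.0   # first preset threshold a1 (hypothetical, km/h change over t seconds)
A2 = 3.0    # second preset threshold a2 (hypothetical), with A1 > A2

def classify_speed_phase(v_t: float, v_0: float) -> str:
    """Classify the target speed phase from the speed change over the past t s.

    v_t: speed t seconds before the current time; v_0: speed now.
    """
    dv = abs(v_0 - v_t)
    if dv > A1:
        return "rapid acceleration"
    if dv < A2:
        return "constant speed"
    return "acceleration"   # A2 <= dv <= A1
```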
Second, the method for obtaining the dimension data of the road condition includes:
dividing the road condition dimension into a plurality of different road condition categories based on a second preset rule, and configuring road condition correction coefficients corresponding to the plurality of different road condition categories.
In this embodiment, the road condition is divided into three evaluation categories: smooth, slightly congested and congested, with the correction coefficient R (R1, R2, R3) participating in the subsequent reduction degree score calculation. The road condition is judged according to the following two scenarios:
A. first, under the condition that the navigation system is started, judging the road condition category to which the current road condition belongs according to the congestion condition data acquired by the navigation system.
B. Secondly, under the condition that the navigation system is not started, judging the road condition category of the current road condition according to the vehicle speed data acquired by the vehicle-to-machine system.
Further, the method for determining the road condition category to which the current road condition belongs according to the vehicle speed data acquired by the vehicle-to-machine system under the condition that the navigation system is not started comprises the following steps:
setting the maximum vehicle speed threshold to v1, the minimum vehicle speed threshold to v2, the total number of analyses to x, the first rapid-acceleration count threshold to α, the second rapid-acceleration count threshold to β, and the interval between two adjacent vehicle speed sampling points in a single analysis interval to k;
wherein, when the vehicle speed data of the sampling points is greater than v1 for x consecutive times and the number of second rapid accelerations does not exceed β, the road condition is judged to be unblocked;
when the vehicle speed data of the sampling points is greater than or equal to v2 for x consecutive times and the number of second rapid accelerations is greater than or equal to β and less than or equal to α, the road condition is judged to be slightly congested;
when the vehicle speed data of the sampling points is less than v2 for x consecutive times and the number of first rapid accelerations exceeds α, the road condition is judged to be congested.
It should be noted that the "first rapid acceleration" denotes a larger rapid acceleration and the "second rapid acceleration" a smaller one; the specific values for both may be defined according to the actual application and are not repeated here. Here, k × x is the duration of the whole analysis interval.
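The speed-based judgment above might be sketched as follows; all thresholds (v1, v2, α, β, and the two rapid-acceleration magnitudes) are illustrative assumptions, since the patent leaves their values to the application:

```python
def classify_road_condition(speeds, v1=60.0, v2=20.0, k=1.0,
                            minor_accel=2.0, major_accel=4.0,
                            alpha=5, beta=2):
    """Classify the road condition from x consecutive speed samples (km/h),
    taken k seconds apart. Threshold values are illustrative assumptions."""
    # Per-interval accelerations between adjacent sampling points.
    accels = [(b - a) / k for a, b in zip(speeds, speeds[1:])]
    minor = sum(1 for a in accels if a >= minor_accel)  # "second rapid acceleration" count
    major = sum(1 for a in accels if a >= major_accel)  # "first rapid acceleration" count
    if all(s > v1 for s in speeds) and minor <= beta:
        return "unblocked"
    if all(s >= v2 for s in speeds) and beta <= minor <= alpha:
        return "slight congestion"
    if all(s < v2 for s in speeds) and major > alpha:
        return "congestion"
    return "undetermined"
```

A sample window of steady high speeds yields "unblocked", while low speeds with many sharp accelerations yield "congestion".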
Third, the method for acquiring sound dimension data includes:
and continuously detecting a sound signal through the pickup device in the vehicle, performing feature-value matching against a pre-trained baseline model, and analyzing frame by frame whether the feature value of the sound signal reaches the preset threshold of the baseline model within a period of time; if so, the signal is marked as human voice and the scene score value of the sound dimension is recorded.
Specifically, in the present embodiment, the scene score value of the sound dimension may be marked as S', and the correction coefficient of the sound dimension is set as S to participate in the subsequent reduction score calculation.
Fourth, the method for obtaining the video dimension data includes:
and detecting whether music or video is being played in the current vehicle; if so, marking the video flag and recording the scene score value of the video dimension.
Specifically, in this embodiment, the scene score value of the video dimension may be marked as M', and the correction coefficient of the video dimension is set as M to participate in the subsequent computation of the reduction score.
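As an illustrative sketch (not the patent's baseline-model implementation), the sound and video scene scores S' and M' could be produced like this; the frame threshold, the majority-of-frames rule, and the score values of 10 are assumptions:

```python
def sound_scene_score(frame_energies, threshold=0.5, voice_score=10):
    """Return S': voice_score when a majority of frames exceed the
    (assumed) baseline threshold, otherwise 0."""
    voiced = sum(1 for e in frame_energies if e >= threshold)
    return voice_score if voiced > len(frame_energies) // 2 else 0

def video_scene_score(media_playing, video_score=10):
    """Return M': video_score when music or video is playing in the
    vehicle, otherwise 0."""
    return video_score if media_playing else 0
```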
In this embodiment, the reduction degree judgment result is divided into four broadcasting mode levels: mute broadcasting, very simple broadcasting, reduced broadcasting, and complete broadcasting. The indexes affecting the voice output reduction degree include at least one or a combination of vehicle speed change, road condition, voice (talking), and video/audio, and the reduction degree score can be calculated based on the above four kinds of scene dimension data.
Illustratively, in this embodiment, the calculating the reduction degree score according to the scene dimension data to obtain the reduction degree score result includes:
acquiring a correction coefficient of a current vehicle speed change dimension and a current vehicle speed change, wherein the current vehicle speed change refers to a current acceleration;
Acquiring a correction coefficient of the dimension of the current road condition, an average vehicle speed in the current analysis period and the sum of vehicle speed changes corresponding to all sampling points in the current analysis period;
acquiring a correction coefficient of the current in-vehicle sound dimension and a scene score of the current in-vehicle sound dimension;
acquiring a correction coefficient of the current in-vehicle video dimension and a scene score of the current in-vehicle video dimension;
accumulating the correction coefficient of the current vehicle speed change dimension with the current vehicle speed change to obtain a first reduction degree scoring result corresponding to the vehicle speed change dimension;
dividing the sum of the vehicle speed changes corresponding to all sampling points in the current analysis period by the average vehicle speed in the current analysis period, and then accumulating the quotient with the correction coefficient of the current road condition dimension to obtain a second reduction degree scoring result corresponding to the road condition dimension;
accumulating the correction coefficient of the current in-vehicle sound dimension and the scene score of the current in-vehicle sound dimension to obtain a third reduction degree scoring result of the corresponding sound dimension;
accumulating the correction coefficient of the current in-vehicle video dimension and the scene score of the current in-vehicle video dimension to obtain a fourth reduction scoring result corresponding to the video dimension;
And accumulating the first reduction degree scoring result, the second reduction degree scoring result, the third reduction degree scoring result and the fourth reduction degree scoring result to obtain the corresponding reduction degree scoring result.
For example, the reduction degree score is calculated for the scene dimension data according to the following formula:

P = C × A' + R × (Σi Δvi / V') + S × S' + M × M'

wherein P represents the reduction degree scoring result, C represents the correction coefficient of the current vehicle speed change dimension, A' represents the current vehicle speed change, R represents the correction coefficient of the current road condition dimension, Σi Δvi represents the sum of the vehicle speed changes corresponding to all sampling points in the current analysis period, V' represents the average vehicle speed in the current analysis period, S represents the correction coefficient of the current in-vehicle sound dimension, S' represents the scene score of the current in-vehicle sound dimension, M represents the correction coefficient of the current in-vehicle video dimension, and M' represents the scene score of the current in-vehicle video dimension.
When the vehicle speed change is in different target speed stages, C takes different correction coefficient values, such as C1, C2, and C3; the larger the vehicle speed change, the larger the value of C.
Wherein the formula for A' is A' = (v_t − v_0) / t, where v_t represents the vehicle speed t seconds after the current moment, v_0 represents the vehicle speed at the current moment, and t represents the time. The reduction degree score is proportional to the vehicle speed change: the larger the speed change, the higher the reduction degree score.
Under different road conditions, the correction coefficient of the road condition dimension takes different values: the more congested the road, the larger the value of R. It can be seen that the reduction degree score of the road condition dimension is negatively correlated with the average speed and positively correlated with the sum of the speed changes; the larger the average speed, the smaller the score, and the larger the sum of speed changes, the larger the score.
Wherein S' represents the scene score of in-vehicle chatting or talking; S' is non-zero when someone in the vehicle is talking or chatting, and 0 otherwise.
Wherein M' is the scene score of the video dimension; M' is non-zero when video, audio, or broadcast content is being played in the vehicle, and 0 otherwise.
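As a non-authoritative sketch, the four-term score above (with the products and the road-condition quotient matching the worked example later in the text) can be written as:

```python
def reduction_score(C, A_prime, R, delta_v_sum, V_prime, S, S_prime, M, M_prime):
    """Compute P = C*A' + R*(sum of speed changes / V') + S*S' + M*M'."""
    speed_term = C * A_prime                 # first score: speed-change dimension
    road_term = R * (delta_v_sum / V_prime)  # second score: road-condition dimension
    sound_term = S * S_prime                 # third score: sound dimension
    video_term = M * M_prime                 # fourth score: video dimension
    return speed_term + road_term + sound_term + video_term
```

With the example values C = 1, A' = 10, R = 7, ΣΔv = 100, V' = 20, S = 5, S' = 10 and no media playing, this yields 10 + 35 + 50 = 95.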
Further, the reduction degree judgment result includes a plurality of different reduction degree levels, and the method further includes:
mapping the reduction degree scoring result into a corresponding reduction degree level, and determining a corresponding message output strategy according to the reduction degree level.
Specifically, a reduction degree scoring result P is obtained at a preset time interval, and the reduction degree judgment is performed on P as follows:
When P is in the first reduction scoring interval, determining that the reduction judgment result is a complete broadcasting mode;
when the P is in the second reduction scoring interval, determining that the reduction judgment result is a reduced broadcasting mode;
when P is in the third reduction degree scoring interval, determining that the reduction degree judgment result is a 'very simple broadcasting mode';
and when the P is in the fourth reduction degree scoring interval, determining that the reduction degree judgment result is a mute broadcasting mode.
Illustratively, in some embodiments of the present application,
when P ∈ [0, 40), determining that the reduction degree judgment result is the "complete broadcasting mode";
when P ∈ [40, 80), determining that the reduction degree judgment result is the "reduced broadcasting mode";
when P ∈ [80, 100], determining that the reduction degree judgment result is the "very simple broadcasting mode";
and when P ∈ (100, +∞), determining that the reduction degree judgment result is the "mute broadcasting mode".
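A minimal sketch of this interval-to-mode mapping, using the example thresholds 40, 80, and 100 given above:

```python
def broadcast_mode(p):
    """Map a reduction degree score P onto one of the four broadcasting
    modes, using the example interval boundaries 40 / 80 / 100."""
    if p < 40:
        return "complete"
    if p < 80:
        return "reduced"
    if p <= 100:
        return "very simple"
    return "mute"
```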
Specifically, a timing scheduling module may be provided that, at every time interval T, calculates the voice output reduction degree score P from the user's current driving scene dimension data and then maps P into a reduction degree level.
It should be understood that those skilled in the art may divide the above reduction degree scoring intervals into 2, 3, or more levels, with corresponding adjustments made according to actual application requirements; this is not limited herein.
In step S300, a message output policy of the vehicle is determined according to the result of the degree of simplification determination and the result of the level of the message, and the vehicle is controlled to execute a subsequent action according to the determined message output policy. Step S300 may be performed by a message executor, for example.
In some embodiments, the message output policy of the vehicle machine includes, for example, voice synthesis broadcasting, a phonological feature of voice synthesis, and the like.
Fig. 5 is a schematic flow chart of a sub-process of step S300 shown in fig. 1, and as shown in fig. 5, the steps of determining a message output policy of the vehicle according to the result of the reduction degree determination and the result of the level of the message, and controlling the vehicle to execute a subsequent action according to the determined message output policy include:
step S310, if the detected result of the degree of simplification judgment is a mute broadcast mode, determining the result as a first message output strategy according to the level result of the message;
step S320, if the result of the degree of simplification judgment is detected to be a 'very simple broadcasting mode', determining that the message is a second message output strategy according to the level result of the message;
Step S330, if the detected result of the degree of simplification judgment is a 'simplified broadcasting mode', determining that the message is a third message output strategy according to the level result of the message;
step S340, if the result of the degree of simplification judgment is detected to be a 'complete broadcasting mode', determining that the message is a fourth message output strategy according to the level result of the message;
and step S350, controlling the vehicle to execute follow-up actions according to the first message output strategy, the second message output strategy, the third message output strategy and the fourth message output strategy.
Further, the message grade determined according to the grade result of the message comprises: class I messages, class II messages, and class III messages;
the step of determining the message output strategy of the vehicle according to the reduction degree judging result and the grade result of the message and controlling the vehicle to execute the follow-up action according to the determined message output strategy further comprises the following steps:
if the current reduction degree judging result is detected to be a mute broadcasting mode, the first message output strategy is as follows: only broadcasting the I-level message;
if the current reduction degree judging result is detected to be a 'very simple broadcasting mode', the second message output strategy is as follows: only broadcasting I-level messages, filtering all non-I-level messages, adding II-level messages into a delay queue, and filtering all III-level messages;
If the current reduction degree judging result is detected to be a 'reduced broadcasting mode', the third message output strategy is as follows: only broadcasting the I-level message and the II-level message, and filtering all the III-level messages;
if the current simplifying degree judging result is detected to be a complete broadcasting mode, the fourth message output strategy is as follows: all levels of messages will be broadcast.
It should be noted that, in this embodiment, for a II-level message added to the delay queue, its broadcast content may be output when a subsequent reduction degree judgment result of the vehicle machine permits playing II-level messages; the message filtering and delay-queue processing may be performed by the message filter.
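The four strategies and the delay queue for level-II messages might be sketched as follows; the class and method names are illustrative, not from the patent:

```python
from collections import deque

class MessageFilter:
    """Apply the per-mode output strategies: I = safety-critical,
    II = functional, III = informational (level names assumed)."""

    def __init__(self):
        self.delay_queue = deque()  # level-II messages held for a later mode

    def filter(self, messages, mode):
        """messages: list of (text, level) pairs with level in {1, 2, 3}."""
        out = []
        for text, level in messages:
            if mode in ("mute", "very simple"):
                if level == 1:
                    out.append(text)              # only level I is broadcast
                elif level == 2 and mode == "very simple":
                    self.delay_queue.append(text)  # replay when next allowed
                # level III (and level II in mute mode) is filtered out
            elif mode == "reduced":
                if level <= 2:
                    out.append(text)              # I and II; III filtered
            else:  # "complete"
                out.append(text)                  # all levels broadcast
        return out
```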
Further, the method further comprises:
when a message is detected to meet a preset rule, in order to remind the driver comprehensively, the vehicle machine is controlled to broadcast the message according to the determined message output strategy while being triggered to execute the corresponding vehicle event action. For example, the preset rule may be that, when a level-I message is played, the vehicle machine is triggered to execute the corresponding vehicle event action.
Illustratively, the vehicle event action includes at least one or a combination of the following: the dashboard displays animations, steering wheel vibrations and voice message cues.
The following describes the technical solution of the present application using an exemplary scenario in which Zhang San is driving on a congested road.
The specific steps for calculating the reduction score are as follows:
1. zhang Sanzheng is not expected to be disturbed by the voice of the car machine when the car is chatted, and the degree of simplification score of the sound dimension is 50 can be obtained by assuming that the scene score S' of the sound dimension is 10 and the correction coefficient of the sound dimension is set to s=5;
2. combining road condition dimensions, and currently locating in a congested road condition, so that a correction coefficient R=7 of the road condition dimensions is set, the average vehicle speed is 20km/h, and the sum of speed changes of multiple sampling points in an analysis period is 100, thereby obtaining a streamline degree score of the road condition dimensions of 35;
3. in combination with the speed change dimension, if the current speed stage is in the "acceleration stage", assuming that the speed change dimension correction coefficient c=1, the current speed change is 10, the reduction degree score of the speed change dimension is 10.
Thus, the current fitness score p=40+35+10=85 is calculated.
It should be noted that: the correction coefficients and specific numerical values are assumed to facilitate understanding, and actual engineering practice can be correspondingly adjusted.
It can be derived that the final reduction degree judgment result in the example scenario is the "very simple broadcasting mode", in which the vehicle machine automatically filters non-I-level messages.
The environment-based voice interaction result presentation method obtains the message to be output by the vehicle machine and determines the level result of the message; collects vehicle-related data and obtains a reduction degree judgment result based on it; and determines the message output strategy of the vehicle machine according to the reduction degree judgment result and the message level result, controlling the vehicle machine to execute subsequent actions accordingly. The method thereby provides multiple message output strategies for vehicle machine messages based on vehicle-related data, adapting to changing scenarios and bringing a better interaction experience to users.
Further, when broadcasting important messages, the vehicle machine is matched with related vehicle event actions such as instrument panel animation and steering wheel vibration, reminding the user of important messages comprehensively in terms of body feeling and visual effect, thereby meeting the requirements of intelligent driving scenarios.
According to yet another aspect of the present application, there is provided an environment-based voice interaction result presentation apparatus.
Fig. 6 is a block diagram of a voice interaction result presentation device based on environment according to an embodiment of the present application, and as shown in fig. 6, an apparatus 600 described in the present application includes:
the message obtaining module 610 is configured to obtain a message to be output by the vehicle, and determine a level result of the message;
the reduction degree judging module 620 is configured to collect vehicle related data and obtain a reduction degree judging result based on the vehicle related data;
and the control module 630 is configured to determine a message output policy of the vehicle according to the result of the degree of simplification determination and the result of the level of the message, and control the vehicle to execute a subsequent action according to the determined message output policy.
Illustratively, in this embodiment, the control module 630 may be executed by a message executor.
It should be noted that, the voice interaction result presenting device based on the environment provided in this embodiment may execute the voice interaction result presenting method based on the environment described in the embodiments of the present application (for example, the embodiments of executing steps S100 to S300), and its implementation principle and technical effect are similar, and will not be described herein.
Furthermore, the present application provides a computer storage medium storing a computer program loadable by a processor to perform the steps of any of the above-described environment-based voice interaction result presentation methods.
The specific limitation and implementation of the above steps may refer to an embodiment of the method for presenting a voice interaction result based on environment, which is not described herein.
Those skilled in the art will appreciate that implementing all or part of the above-described methods may be accomplished by way of a computer program stored on a non-transitory computer-readable storage medium which, when executed, may include the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. The non-volatile memory can include Read-Only Memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM), etc.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to related descriptions of other embodiments.
The method, device, and storage medium for presenting voice interaction results based on the environment provided by the embodiments of the present application have been described in detail above, and specific examples are used herein to explain the principles and implementation of the present application; the description of the above embodiments is only intended to help understand the technical solutions and core ideas of the present application. Those of ordinary skill in the art will appreciate that the technical solutions described in the foregoing embodiments may still be modified, or some technical features thereof may be equivalently replaced; such modifications and replacements do not depart the essence of the corresponding technical solutions from the scope of the technical solutions of the embodiments of the present application.

Claims (11)

1. A method for presenting a voice interaction result based on an environment, the method comprising:
Obtaining a message to be output by a vehicle machine and determining a grade result of the message;
collecting vehicle related data, and obtaining a reduction degree judgment result based on the vehicle related data;
and determining a message output strategy of the vehicle machine according to the reduction degree judging result and the message grade result, and controlling the vehicle machine to execute follow-up actions according to the determined message output strategy.
2. The method of claim 1, wherein the method of obtaining a message to be output by a vehicle and determining a ranking result of the message comprises:
the message is ranked using a trained message ranking model to determine ranking results for the message.
3. The method of claim 1, wherein the step of collecting vehicle-related data and obtaining a reduction degree judgment result based on the vehicle-related data comprises:
performing scene calculation based on the acquired vehicle related data to obtain scene dimension data;
and carrying out the reduction degree scoring according to the scene dimension data to obtain a reduction degree scoring result, and carrying out the reduction degree judgment according to the reduction degree scoring result so as to output a reduction degree judgment result.
4. A method according to claim 3, wherein the scene dimension data comprises at least one or a combination of the following:
(1) Vehicle speed change dimension data;
(2) Road condition dimension data;
(3) Sound dimension data;
(4) Video dimension data.
5. The method of claim 4, wherein the calculating a reduction degree score according to the scene dimension data comprises:
acquiring a correction coefficient of the dimension of the current vehicle speed change and the current vehicle speed change;
acquiring a correction coefficient of the dimension of the current road condition, an average vehicle speed in the current analysis period and the sum of vehicle speed changes corresponding to all sampling points in the current analysis period;
acquiring a correction coefficient of the current in-vehicle sound dimension and a scene score of the current in-vehicle sound dimension;
acquiring a correction coefficient of the current in-vehicle video dimension and a scene score of the current in-vehicle video dimension;
accumulating the correction coefficient of the current vehicle speed change dimension with the current vehicle speed change to obtain a first reduction degree scoring result corresponding to the vehicle speed change dimension;
dividing the sum of the vehicle speed changes corresponding to all sampling points in the current analysis period by the average vehicle speed in the current analysis period, and then accumulating the quotient with the correction coefficient of the current road condition dimension to obtain a second reduction degree scoring result corresponding to the road condition dimension;
Accumulating the correction coefficient of the current in-vehicle sound dimension and the scene score of the current in-vehicle sound dimension to obtain a third reduction degree scoring result of the corresponding sound dimension;
accumulating the correction coefficient of the current in-vehicle video dimension and the scene score of the current in-vehicle video dimension to obtain a fourth reduction scoring result corresponding to the video dimension;
and accumulating the first reduction degree scoring result, the second reduction degree scoring result, the third reduction degree scoring result and the fourth reduction degree scoring result to obtain the corresponding reduction degree scoring result.
6. The method of claim 5, wherein the calculating a reduction degree score according to the scene dimension data comprises:
calculating the reduction degree score for the scene dimension data according to the following formula:

P = C × A' + R × (Σi Δvi / V') + S × S' + M × M'

wherein P represents the reduction degree scoring result, C represents the correction coefficient of the current vehicle speed change dimension, A' represents the current vehicle speed change, R represents the correction coefficient of the current road condition dimension, Σi Δvi represents the sum of the vehicle speed changes corresponding to all sampling points in the current analysis period, V' represents the average vehicle speed in the current analysis period, S represents the correction coefficient of the current in-vehicle sound dimension, S' represents the scene score of the current in-vehicle sound dimension, M represents the correction coefficient of the current in-vehicle video dimension, and M' represents the scene score of the current in-vehicle video dimension.
7. The method of claim 6, wherein the reduction degree judgment result comprises a plurality of different reduction degree levels, the method further comprising:
mapping the reduction degree scoring result into a corresponding reduction degree level, and determining a corresponding message output strategy according to the reduction degree level.
8. The method according to claim 1, wherein the method further comprises:
when the condition that the message accords with the preset rule is detected, the vehicle is controlled to broadcast the message according to the determined message output strategy, and meanwhile, the vehicle is triggered to execute corresponding vehicle event actions.
9. The method of claim 8, wherein the vehicle event action comprises at least one or a combination of:
the dashboard displays animations, steering wheel vibrations and voice message cues.
10. An environment-based voice interaction result presentation apparatus, the apparatus comprising:
the message acquisition module is used for acquiring a message to be output by the vehicle machine and determining a grade result of the message;
the system comprises a reduction degree judging module, a reduction degree judging module and a control module, wherein the reduction degree judging module is used for acquiring vehicle related data and obtaining a reduction degree judging result based on the vehicle related data;
And the control module is used for determining the message output strategy of the vehicle machine according to the reduction degree judging result and the grade result of the message and controlling the vehicle machine to execute follow-up actions according to the determined message output strategy.
11. A computer readable storage medium, characterized in that it stores a computer program loadable by a processor to perform the steps in the context-based speech interaction result presentation method according to any of claims 1 to 9.
CN202311847695.2A 2023-12-27 2023-12-27 Voice interaction result presentation method and device based on environment and storage medium Pending CN117877475A (en)


Publications (1)

Publication Number Publication Date
CN117877475A true CN117877475A (en) 2024-04-12



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination