CN113516978A - Method and device for controlling sound output - Google Patents


Info

Publication number
CN113516978A
Authority
CN
China
Prior art keywords
user
factor
sound output
notification
stop instruction
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110284661.1A
Other languages
Chinese (zh)
Inventor
安原真也
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Honda Motor Co Ltd
Original Assignee
Honda Motor Co Ltd
Application filed by Honda Motor Co Ltd filed Critical Honda Motor Co Ltd


Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • B - PERFORMING OPERATIONS; TRANSPORTING
    • B60 - VEHICLES IN GENERAL
    • B60W - CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W 50/00 - Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • B60W 50/08 - Interaction between the driver and the control system
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/16 - Sound input; Sound output
    • G06F 3/167 - Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 2015/223 - Execution procedure of a spoken command

Abstract

A sound output control method and a sound output control apparatus are provided that make it possible to restart a sound output stopped by the user while keeping the user aware of, and accepting of, the restart. The method comprises the following steps: a stop step of stopping a sound output to a user in response to receiving a stop instruction for that sound output from the user; a factor estimation step of estimating, in response to receiving the stop instruction, the factor that caused the user to issue it; and a notification step of notifying the user of the estimated factor when the stopped sound output is restarted.

Description

Method and device for controlling sound output
Technical Field
The present invention relates to a method and apparatus for controlling audio output.
Background
Conventionally, there are known in-vehicle devices that reproduce music and the like in accordance with an instruction from a user, or that provide various information the user needs by voice. For example, when the user utters a so-called wake-up word indicating the start of a voice instruction, followed by a voice instruction such as "tell me today's headline news", the in-vehicle device queries a news server on the Internet and starts reading the headline news aloud.
If the user wishes to stop the sound output partway through for some reason, the user can do so by, for example, a voice instruction, and can later instruct the desired sound output again by issuing another voice instruction as necessary.
However, a user may stop sound output for various reasons, and depending on the factor there are cases where it is desirable not to end the sound output entirely but to stop it only temporarily (i.e., to interrupt it), and to restart it once the factor has been eliminated.
For example, while a relatively long news article is being read aloud, the user may prefer that a stop instruction not end the reading but merely interrupt it, with reading resuming from the point of interruption once the cause of the stop instruction has been eliminated, so that the same portion of the news does not have to be heard again.
It is likewise desirable for a dialogue device to restart sound output appropriately in a voice dialogue with the user. In particular, in a voice dialogue in which a single user instruction is completed through a plurality of exchanges, restarting the dialogue under appropriate conditions after a stop instruction from the user allows the instruction to be completed through an efficient dialogue.
Therefore, when the user stops sound output, it is convenient for the user if the stopped output is restarted at a timing or under conditions appropriate to the cause of the stop. In this case, because sound output that the user instructed to stop temporarily is restarted without a user instruction, it is considered necessary to secure the user's awareness of, and assent to, the restart. That is, it is desirable to realize a technique that, when the user stops sound output, can restart the stopped output while keeping the user aware of the restart.
As prior art, patent document 1 discloses an on-vehicle dialogue device that converses with the driver, withholds notifications while the driver's driving load is high, and starts to speak when the driving load is low and the driver is in an inattentive state (a state of lowered attention, such as when the driving operation is sluggish or large corrective operations are being made). Further, patent document 2 discloses a voice dialogue device mounted on a vehicle that accepts speech from the driver when the driver's driving margin, determined from signals of a brake sensor and the like, is at a level at which a voice message can be recognized.
However, these conventional techniques merely determine, based on the driving load, whether to permit speech output to the driver or acceptance of speech from the driver; they give no consideration to improving the user's convenience in the scene described above, in which the user instructs sound output to stop.
Patent document
Patent document 1: Japanese Patent Laid-Open No. 2017-067849
Patent document 2: Japanese Patent Laid-Open No. 2018-063338
Disclosure of Invention
Problems to be solved by the invention
In light of the above background, there is a demand for a technique that can restart sound output stopped by the user while keeping the user aware of, and accepting of, the restart.
Means for solving the problems
One aspect of the present invention is a method for controlling sound output, including: a stop step of stopping a sound output to a user in response to receiving a stop instruction for that sound output from the user; a factor estimation step of estimating, in response to receiving the stop instruction, the factor that caused the user to issue the stop instruction; and a notification step of notifying the user of the estimated factor when the stopped sound output is restarted.
According to another aspect of the present invention, the notification includes a reason, corresponding to the estimated factor, for restarting the sound output.
According to another aspect of the present invention, the notification includes a restart condition of the sound output corresponding to the estimated factor.
According to another aspect of the present invention, the user includes a driver of a vehicle, and when the estimated factor is an increase in the driver's driving load, the notification step includes, as the reason for the restart, the fact that the driving scene causing the increase in driving load has ended.
According to another mode of the present invention, the notification includes an inquiry to the user as to whether or not the stopped sound output can be restarted.
According to another aspect of the present invention, the user includes a driver of a vehicle, and the notification step notifies the user with a predetermined notification sound when the estimated factor is an increase in the driver's driving load, the elapsed time from the stop instruction to the end of the driving scene causing the increase is a predetermined time or less, and the reliability of the determination that that driving scene has ended is a predetermined value or more.
According to another aspect of the present invention, the notification consisting of the predetermined notification sound does not include an inquiry to the user as to whether the restart is permitted.
Another aspect of the present invention is a sound output control device that controls sound output, the device including: a stop instruction unit that stops a sound output to a user in response to receiving a stop instruction for that sound output from the user; a factor estimation unit that estimates, in response to receiving the stop instruction, the factor that caused the user to issue it; and a notification unit that notifies the user of the estimated factor when the stopped sound output is restarted.
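The aspects above can be read as a small decision rule: a bare notification sound suffices only when the factor is an increased driving load, the interruption was short, and the scene-end determination is reliable; otherwise the notification carries the factor, the restart reason or condition, and an inquiry. The following sketch is purely illustrative; the names, the dictionary shape, and the two thresholds are assumptions, not anything specified in the patent.

```python
from dataclasses import dataclass

# Hypothetical factor label; the patent names the factor only descriptively.
HIGH_DRIVING_LOAD = "high_driving_load"

@dataclass
class RestartContext:
    factor: str                   # estimated factor behind the stop instruction
    elapsed_s: float              # time from stop instruction to end of the driving scene
    scene_end_reliability: float  # confidence that the demanding scene ended, in [0, 1]

def choose_notification(ctx, max_elapsed_s=30.0, min_reliability=0.8):
    """Return a notification plan for restarting a stopped sound output.

    Per the aspects above: when the estimated factor is an increase in the
    driver's driving load, the elapsed time is at most a predetermined value,
    and the scene-end determination is sufficiently reliable, a predetermined
    notification sound is used and no inquiry is made; otherwise the
    notification states the estimated factor and asks whether to restart.
    """
    if (ctx.factor == HIGH_DRIVING_LOAD
            and ctx.elapsed_s <= max_elapsed_s
            and ctx.scene_end_reliability >= min_reliability):
        return {"style": "sound_only", "inquiry": False}
    return {"style": "spoken", "inquiry": True,
            "message": f"Playback was stopped ({ctx.factor}); restart now?"}
```

The threshold values (30 s, 0.8) are placeholders for the "predetermined time" and "predetermined value" the claims leave open.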
Advantageous Effects of the Invention
According to the present invention, the sound output stopped by the user can be restarted while keeping the user aware of, and accepting of, the restart.
Drawings
Fig. 1 is a diagram showing a configuration of a UI control device according to an embodiment of the present invention.
Fig. 2 is a flowchart showing steps of a control process of the UI control device shown in fig. 1.
Fig. 3 is a flowchart showing the steps of the factor estimation process of the control process shown in fig. 2.
Fig. 4 is a flowchart showing the procedure of the condition determination process of the control process shown in fig. 2.
Fig. 5 is a flowchart showing the steps of the notification process of the control process shown in fig. 2.
Description of the reference numerals
100 UI control device, 102 vehicle, 104 in-vehicle network bus, 106 camera control device, 108 vehicle information acquisition device, 110 driving scene evaluation device, 112 driving skill evaluation device, 114 user information management device, 116 driving load calculation device, 118 AV output device, 120 content providing device, 122 vehicle interior camera, 124 vehicle exterior camera, 126 sensor group, 128, 136, 150 processing device, 130, 137, 152 storage device, 132 driving skill DB, 134 preference information DB, 138 news information, 139 sightseeing information, 140 microphone, 142 speaker, 144 display device, 146 touch panel, 156 UI control unit, 158 output control unit, 160 sound output unit, 162 voice recognition unit, 164 display control unit, 166 input processing unit, 170 stop instruction unit, 172 scene determination unit, 174 factor estimation unit, 176 condition determination unit, 178 notification unit, 180 restart instruction unit, 186 load determination unit, 188 conversation determination unit, 190 sleep determination unit, 192 content determination unit.
Detailed Description
Embodiments of the present invention will be described below with reference to the drawings.
[ embodiment 1 ]
First, embodiment 1 of the present invention will be explained. Fig. 1 is a diagram showing a configuration of a user interface control device as a sound output control device according to embodiment 1 of the present invention. The user interface control device (hereinafter, UI control device) 100 is mounted on a vehicle 102 as a mobile body. The UI control device 100 as an audio output control device is communicably connected to a camera control device 106, a vehicle information acquisition device 108, a driving scene evaluation device 110, a driving skill evaluation device 112, a user information management device 114, a driving load calculation device 116, an AV (audio visual) output device 118, and a content providing device 120 via an in-vehicle network bus 104.
The UI control device 100 mediates interaction between the AV output device 118 and the content providing device 120 as clients and the user via a user interface configured by the microphone 140, the speaker 142, the display device 144, and the touch panel 146. In particular, the UI control device 100 controls the stop and restart of sound output from these client devices to the user via the speaker 142.
Hereinafter, the user refers to a user of the vehicle 102 including a driver and a fellow passenger of the vehicle 102.
The camera control device 106 captures an image of the interior of the vehicle 102 by the interior camera 122. Further, camera control device 106 captures an image of the environment outside vehicle 102, for example, by an exterior camera 124 provided on the exterior of vehicle 102.
The vehicle information acquisition device 108 detects the driving operation and the motion state (or dynamic state) of the vehicle 102 using the sensor group 126. The sensor group 126 includes sensors that acquire the presence or absence and the amount of user operation of the various controls related to driving, such as an accelerator pedal sensor, a brake pedal sensor, a steering sensor, a shift sensor, and a direction indicator sensor. The sensor group 126 may also include various sensors that detect the motion state or dynamic state of the vehicle, such as a 3-axis acceleration sensor, a yaw rate sensor, and a speed sensor.
The driving scene evaluation device 110 evaluates a driving scene (or traffic scene) that is a scene of a traffic environment in which the vehicle 102 travels, according to the related art. In the present embodiment, the driving scene is obtained by classifying various traffic scenes encountered when driving the vehicle, and can be represented by one or a combination of a plurality of traffic scenes such as intersection passage, intersection right turn, intersection left turn, narrow lane opposite travel, passing ahead, lane change, highway merging, emergency vehicle passage, two-wheel vehicle parallel travel, pedestrian congestion, street congestion, and traveling during stormy weather.
The driving scene evaluation device 110 calculates a confidence level (certainty, probability, or reliability) of determining that the driving scene matches the current driving scene for each of the driving scenes (candidate scenes). According to the calculated confidence degrees of the candidate scenes, the candidate scene with the highest confidence degree can be determined as the current driving scene. Here, the confidence level may be represented as a numerical value in a range of, for example, 0 to 1, in which the higher the degree of confidence is, the higher the value of the confidence level is.
Specifically, the driving scene evaluation device 110 includes a computer, i.e., a processing device, configured by a processor such as a CPU, and calculates the confidence of each driving scene based on, for example, the external environment of the vehicle 102, the driving behavior of the driver of the vehicle 102, and/or the motion state of the vehicle 102.
Here, the external environment may include map information (geometric configuration or lane configuration of a road such as a straight road, a curve, a four-way road, or an expressway entrance) near the current position of the vehicle 102, the presence of another vehicle that can be acquired from the vehicle exterior camera 124, a road sign, an operation state of a road device (such as a lighting color of a traffic light), and a weather state. The driving behavior of the driver may include a line of sight movement of the driver (line of sight movement to a side view mirror or a room mirror for safety confirmation), a type of driving operation (acceleration/deceleration operation, steering operation, and turning on of a winker), and/or an operation amount and an operation sequence of the driving operation. Further, the motion state of the vehicle 102 may include speed, acceleration, deceleration, rotation speed, gradient of the traveling road, and the like.
The driving scene evaluation device 110 acquires map information stored in its own storage device, information on the vehicle environment obtained by the outside camera 124, driver's sight line information obtained by the inside camera 122, and various vehicle information obtained by the vehicle information acquisition device 108.
The driving scene evaluation device 110 may compare the external environment, a series of driving actions, and the motion state of the vehicle, which are characteristic of each candidate scene, with the current external environment of the vehicle 102, the driving action of the driver, and the motion state of the vehicle 102, and calculate the confidence level based on the degree of agreement between them, for example.
However, the method of calculating the confidence is not limited to this. For example, the driving scene evaluation device 110 may calculate the confidence of each candidate scene for the current external environment, driving behavior, and/or motion state using a learned model that has been machine-learned to probabilistically estimate the current driving scene from the external environment, driving behavior, and/or motion state.
The driving scene evaluation device 110 outputs the confidence of each candidate scene to the other devices via the in-vehicle network bus 104, determines the candidate scene with the highest confidence as the current driving scene, and outputs the result of that determination to the other devices as well.
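The selection step described above, picking the most confident candidate as the current scene, can be sketched as follows. The function and the dictionary representation are illustrative assumptions, not the patent's implementation.

```python
def estimate_current_scene(confidences):
    """Pick the current driving scene from per-candidate confidences.

    confidences: dict mapping candidate-scene name -> confidence in [0, 1].
    Returns (scene, confidence) for the candidate with the highest
    confidence, or (None, 0.0) if no candidates were evaluated.
    """
    if not confidences:
        return None, 0.0
    # The candidate whose confidence value is largest wins.
    scene = max(confidences, key=confidences.get)
    return scene, confidences[scene]
```

For instance, given confidences for "intersection right turn" and "lane change", the higher-scoring scene would be reported as current.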
The driving skill evaluation device 112 evaluates the driving skill of the driver of the vehicle 102 according to the related art, and stores the evaluation result. Specifically, the driving skill evaluation device 112 includes, for example: a computer, i.e., a processing device, including a processor such as a CPU; and a storage device. The driving skill evaluation device 112 compares a standard steering flow performed by a standard driver in the same driving scene as the current driving scene acquired from the driving scene evaluation device 110 with an actual execution steering flow performed by the current driver of the vehicle 102, and evaluates the driving skill of the current driver.
These steering flows can be expressed by parameters such as the type, sequence, start timing, speed of the driving operation, and/or magnitude of the operation amount of the driving operation in a series of steering. The driving skill evaluation device 112 evaluates the degree of deviation of each of the parameters in the execution steering flow of the current driver from the standard steering flow, and calculates the evaluation result as a driving skill evaluation score. The driving skill evaluation score may be calculated such that the upper limit is a value of 1, and the lower the driving skill (i.e., the greater the degree of deviation) the smaller the value.
Here, it is assumed that the parameter values constituting the above-described execution operation flow can be acquired from the vehicle information acquisition device 108. Further, the parameter values relating to the standard steering procedure described above can be stored in advance for each driving scenario.
The driving skill evaluation device 112 can calculate the driving skill evaluation score based on data of driving operations during a driving period (for example, 3 month period) of a predetermined length at predetermined time intervals (for example, every half year). For example, when the vehicle 102 is used by a plurality of users, the driving skill evaluation device 112 calculates the driving skill evaluation score for each user.
The driving skill evaluation device 112 outputs the calculated driving skill evaluation score for each driver to other devices via the in-vehicle network bus 104.
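The scoring just described might look like the sketch below. The function name, the normalization of each parameter's deviation to [0, 1], and the use of a simple mean are all assumptions; the text only specifies that the score has an upper limit of 1 and shrinks as the executed steering flow deviates further from the standard flow.

```python
def driving_skill_score(deviations):
    """Compute a driving skill evaluation score from steering-flow deviations.

    deviations: per-parameter deviation (type, sequence, start timing, speed,
    operation amount, ...) of the driver's executed steering flow from the
    standard flow, each normalized so 0 = identical and 1 = maximal deviation.

    Returns a score with upper limit 1.0; greater deviation gives a smaller
    value, matching the description above.
    """
    if not deviations:
        return 1.0  # no observed deviation from the standard flow
    mean_dev = sum(deviations) / len(deviations)
    return max(0.0, 1.0 - min(1.0, mean_dev))
```

A score computed this way could then be stored per user and recomputed at the predetermined intervals mentioned above.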
The user information management device 114 manages information (user information) about the users who use the vehicle 102 as drivers. The user information may include each user's driving skill evaluation score and preference information. Specifically, the user information management device 114 has a processing device 128 and a storage device 130. The processing device 128 is, for example, a computer having a processor such as a CPU. The storage device 130 is configured by, for example, volatile and/or nonvolatile semiconductor memory, a hard disk device, or the like. The storage device 130 stores a driving skill database (driving skill DB) 132 and a preference information database (preference information DB) 134.
The driving skill evaluation score of each user is stored in the driving skill DB 132. The processing device 128 receives the driving skill evaluation score for each user output from the driving skill evaluation device 112, and stores the score in the driving skill DB 132.
The preference information DB 134 stores the preference information of each user. The preference information is constituted by, for example, information indicating the preference categories of the corresponding user. A preference category may be expressed by words representing, for example, a field of content (music, movies, news, etc.), a subcategory within each field, and/or specific content. The subcategories represent, for example, the distinction between classical and popular in the case of music, between action, horror, sci-fi, and the like in the case of movies, and between sports, a specific country, a specific news source, and the like in the case of news.
The processing device 128 acquires, from the AV output device 118 and the content providing device 120, for example, information on music or videos the user has played via the AV output device 118 (described later), keywords searched with the browser provided by the AV output device 118, and information on content the user has instructed the content providing device 120 (described later) to output. The processing device 128 then generates preference information for the corresponding user based on the acquired information and stores it in the preference information DB 134.
The user information management device 114 also determines the user who is currently utilizing the vehicle 102 as a driver. For example, the processing device 128 identifies the current driver by authentication processing using ID information acquired from a smart key or a portable terminal used by each user or a face image of the driver acquired from the vehicle interior camera 122, or the like, according to the conventional technique.
The driving load calculation means 116 estimates the current driving load of the driver. The driving load calculation device 116 includes a processing device including a processor such as a CPU and a storage device, and calculates the current driving load of the driver based on the current driving scene of the vehicle 102 and the current degree of driving skill of the driver.
Specifically, the driving load calculation device 116 acquires the current driving scene of the vehicle 102 from the driving scene evaluation device 110. The driving load calculation device 116 also acquires a driving skill evaluation score indicating the current driving skill of the driver of the vehicle 102 from the user information management device 114.
Then, the driving load calculation means 116 calculates the current driving load of the current driver by, for example, multiplying the standard driving load, which numerically represents the driving load that the standard driver (standard driver) receives while traveling in the current driving scene, by the driving skill evaluation score.
Here, the standard driving load can be expressed by a numerical value having a larger value as the driving load is higher, for example. As described above, the standard driving load can be determined in advance and stored for each of the classified driving scenes, for example.
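As a rough sketch of the calculation above: the per-scene standard loads below are made-up placeholders, the names are illustrative, and the combination of standard load and skill score simply follows the multiplication the description gives as its example.

```python
# Hypothetical predetermined standard driving loads per classified driving
# scene (higher value = higher load); the real values would be stored in
# advance as the text describes.
STANDARD_LOAD = {
    "straight_road": 0.2,
    "intersection_right_turn": 0.8,
    "highway_merging": 0.9,
}

def current_driving_load(scene, skill_score, default_load=0.5):
    """Combine the scene's standard driving load with the driver's skill
    evaluation score, following the example combination stated above
    (standard load multiplied by the skill score)."""
    return STANDARD_LOAD.get(scene, default_load) * skill_score
```

The `default_load` fallback for an unclassified scene is an added assumption; the patent only covers loads predetermined per classified scene.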
The AV output device 118 includes a processing device, such as a computer having a processor such as a CPU, and reproduces music or moving images according to the conventional technique. The AV output device 118 has, for example, a browser, and provides a user with functions of information retrieval and/or information browsing.
The AV output device 118 performs interaction with the driver via the UI control device 100. For example, the driver can give an instruction to reproduce music or moving images or an instruction to search for information by voice instruction via the microphone 140. The AV output device 118 receives the voice recognition result of the voice instruction via the UI control device 100, and executes the operation specified by the voice instruction. The AV output device 118 outputs reproduced sound or moving images to the speaker 142 or the display device 144 via the UI control device 100, and/or displays retrieved information on the display device 144.
Further, the AV output device 118 can obtain a single instruction through a plurality of exchanges with the driver according to the conventional technique. For example, the AV output device 118 receives from the driver a voice instruction for reproducing a song of a specific artist, such as "please play a song by A" (where A is an artist name). In response, the AV output device 118, for example, retrieves the corresponding artist's songs from the music stored in its storage device, displays a list of them on the display device 144, and instructs the UI control device 100 to utter speech such as "please select a song to play". The AV output device 118 then receives the driver's selection as a voice response or as input via the touch panel of the display device 144.
The content providing apparatus 120 provides text information such as news and sightseeing information to the user in a reading manner. The content providing apparatus 120 includes a processing apparatus 136 including a processor such as a CPU and a storage apparatus 137. The content providing apparatus 120 cooperates with the AV output apparatus 118, for example, and stores text information, which is information retrieved by the browser of the AV output apparatus 118 in response to an instruction from the user, in the storage apparatus 137. The text information is stored in the storage device 137 as news information 138 or sightseeing information 139 for each category, for example.
Further, the processing device 136 reads text information stored in the storage device 137 aloud and outputs the information as sound from the speaker 142 in accordance with an instruction from the user via the UI control device 100. Here, the reading sound of the text information can be generated by various methods according to the related art. In addition to the sound information of the generated reading sound, the processing device 136 may display image information or display information associated with the provision of the reading sound on the display device 144 via the UI control device 100.
The UI control device 100 has the AV output device 118 and the content providing device 120 as clients, and outputs sound information and image information output from these client devices from the speaker 142 and the display device 144. The UI control device 100 acquires a voice instruction and an input instruction or input data of the user from the microphone 140 and the touch panel 146, and outputs the instructions and the input data to the corresponding client devices. As described above, in particular, the UI control device 100 controls the stop and restart of the sound output from these client devices to the user via the speaker 142.
Specifically, the UI control device 100 has a processing device 150 and a storage device 152. The storage device 152 is configured by, for example, a volatile and/or nonvolatile semiconductor memory, a hard disk device, or the like.
The processing device 150 is a computer having a processor such as a CPU, for example. The processing device 150 may be configured to have a ROM into which programs are written, a RAM for temporarily storing data, and the like. The processing device 150 includes a UI (user interface) control unit 156 and an output control unit 158 as functional elements or functional means.
The UI control unit 156 includes a voice output unit 160, a voice recognition unit 162, a display control unit 164, and an input processing unit 166 as functional elements or functional means. The output control unit 158 includes a stop instruction unit 170, a scene determination unit 172, a factor estimation unit 174, a condition determination unit 176, a notification unit 178, and a restart instruction unit 180, which are functional elements or functional means. The factor estimation unit 174 includes a load determination unit 186, a conversation determination unit 188, a sleep determination unit 190, and a content determination unit 192 as functional elements or functional units.
These functional elements of the processing device 150 are realized by the processing device 150 as a computer executing a program, for example. The computer program can be stored in advance in any computer-readable storage medium. Alternatively, all or a part of the functional elements included in the processing device 150 may be configured by hardware including one or more electronic circuit components.
The UI control section 156 controls the microphone 140, the speaker 142, the display device 144, and the touch panel 146 provided on the display screen of the display device 144 as the user interface.
The audio output unit 160 of the UI control unit 156 outputs audio information generated by the client apparatuses from the speaker 142 in accordance with instructions from the AV output apparatus 118 and the content providing apparatus 120 as the client apparatuses. The audio information may include audio information attached to music or moving images, in addition to audio generated by the client device.
The voice recognition unit 162 acquires the user's speech through the microphone 140 according to the conventional technique, performs voice recognition processing on the acquired speech, and outputs the result to the AV output device 118 and the content providing device 120. Alternatively, the voice recognition unit 162 may analyze the meaning of the voice recognition processing result and output the analysis result to the AV output device 118 and the content providing device 120, according to the conventional technique.
The display control unit 164 controls the display device 144 to output an image or video instructed by the AV output device 118 and the content providing device 120. Further, the input processing unit 166 acquires the input of the driver from the touch panel 146 according to the conventional technique, and outputs the processing result of the acquired input to the AV output device 118 and the content providing device 120.
The output control unit 158 controls the sound output from the speaker 142. The output control unit 158 stops the audio output from the speaker 142 in response to a stop instruction from the user. The output control unit 158 estimates a factor that the user has given a stop instruction, and determines the restart condition of the stopped audio output based on the factor. Then, the output control unit 158 restarts the audio output in accordance with the determined restart condition. In particular, when the audio output is resumed, the output control unit 158 notifies the user of the factor estimated as described above.
The stop instruction unit 170 of the output control unit 158 obtains, for example, a voice instruction from the user instructing that the audio output be stopped, via the voice recognition unit 162. The voice instruction may be, for example, an utterance such as "stop the audio". The stop instruction unit 170 can acquire, for example, the voice recognition result of the voice instruction and volume information of the voice instruction from the UI control unit 156.
The scene determination unit 172 evaluates the driving scene of the vehicle 102 in cooperation with the driving scene evaluation device 110. The scene determination unit 172 determines the development of the driving scene, that is, the start and end of various driving scenes that change with time. Specifically, the scene determination unit 172 acquires the confidence level of each candidate scene calculated by the driving scene evaluation device 110 and the current driving scene at predetermined time intervals.
When the current driving scene acquired from the driving scene evaluation device 110 changes, the scene determination unit 172 determines that a new driving scene is started. When a new driving scene starts, the scene determination unit 172 calculates a confidence level (scene end confidence level) that the immediately preceding driving scene has been determined to have ended, based on the confidence level of the candidate scene corresponding to the immediately preceding driving scene. Here, as described above, the confidence of the candidate scene may be expressed as a numerical value in a range of, for example, 0 or more and 1 or less, which is larger as the degree of confidence is higher. Further, the scene end confidence may be calculated by, for example, subtracting the confidence of the candidate scene corresponding to the immediately preceding driving scene from 1.
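As a concrete illustration, the scene end confidence calculation described above can be sketched as follows. This is a minimal sketch: the scene names and confidence values are hypothetical, and only the subtract-from-1 rule comes from the description.

```python
def scene_end_confidence(candidate_confidences, previous_scene):
    """Confidence that the immediately preceding driving scene has ended.

    candidate_confidences maps each candidate scene to a confidence in
    [0, 1]; the scene names used here are illustrative, not from the
    patent. The closer the previous scene's confidence is to 0, the
    more certain we are that it has ended.
    """
    return 1.0 - candidate_confidences[previous_scene]


# Hypothetical snapshot of confidences acquired from the driving scene
# evaluation device at one sampling instant.
confidences = {"emergency_vehicle_passing": 0.2, "normal_cruising": 0.7}
end_conf = scene_end_confidence(confidences, "emergency_vehicle_passing")
```

With these made-up numbers, the immediately preceding scene's confidence of 0.2 yields a scene end confidence of 0.8.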
When the stop instruction unit 170 receives a stop instruction of the voice output from the user, the factor estimation unit 174 estimates a factor for which the user has performed the stop instruction. Specifically, the factor estimation unit 174 determines whether the factor of the stop instruction is an increase in the driving load of the current driver driving the vehicle 102 by the load determination unit 186.
More specifically, load determination unit 186 acquires the current driving load of the current driver from driving load calculation device 116 at predetermined time intervals. Further, the load determination unit 186 determines whether or not the current driving load at the time of receiving the stop instruction is equal to or higher than a predetermined level. When the current driving load is equal to or higher than a predetermined level when the stop instruction is received, the load determination unit 186 determines that the factor that the user has performed the stop instruction is an increase in the driving load.
The factor estimation unit 174 determines whether or not the factor of the stop instruction is a conversation between the user and the fellow passenger of the vehicle 102 by the conversation determination unit 188. Here, the conversation between the user and the fellow passenger may include a conversation between the driver and the fellow passenger and a conversation between the fellow passengers.
Specifically, the conversation determination unit 188 detects whether or not a plurality of occupants including the driver are present, based on the image of the in-vehicle camera 122 obtained via the camera control device 106. When a plurality of occupants are detected, the conversation determination unit 188 acquires speech sounds in the vehicle cabin from the microphone 140 via the UI control unit 156. Then, the conversation determination unit 188 analyzes the acquired speech sounds, determines that a conversation is taking place between the occupants when the occupants have been speaking alternately (taking turns as speakers) for a predetermined time or longer, and determines that the factor of the stop instruction is a conversation with a fellow passenger.
When it is determined that a conversation is taking place between occupants and that the driver is participating in the conversation, the conversation determination unit 188 may determine that the factor of the stop instruction is a conversation with the fellow passenger. Whether or not the driver is participating in the conversation can be determined by whether or not the driver's voice is included in the conversation. Here, whether or not the driver's voice is included in the conversation can be determined from, for example, a voice sample of the driver recorded in advance and stored in the user information management device 114.
The factor estimation unit 174 determines whether or not the factor of the stop instruction is the sleep of the passenger of the vehicle 102 by the sleep determination unit 190. Specifically, the sleep determination unit 190 detects the presence or absence of a passenger based on an image of the vehicle interior camera 122 obtained via the camera control device 106. When the passenger is detected, the sleep determination unit 190 acquires the speech sound in the vehicle cabin from the microphone 140 via the UI control unit 156. Then, when the volume of the acquired speech sound is equal to or less than a predetermined level, the sleep determination unit 190 determines that the cause of the stop instruction is the sleep of the fellow passenger.
The factor estimation unit 174 determines, by the content determination unit 192, whether or not the factor of the stop instruction is the content of the information provided by the audio output targeted by the stop instruction. Specifically, the content determination unit 192 acquires the preference information of the current user from the user information management device 114, and calculates the degree of deviation between the category of the information provided by that audio output and the preference category indicated by the acquired preference information. Then, when the calculated degree of deviation is equal to or higher than a predetermined level, the content determination unit 192 determines that the factor of the stop instruction is the content of the information provided by the audio output.
The degree of deviation can be calculated by various known methods. For example, the category of the information provided by the audio output and the preference category can each be plotted in a multidimensional space formed by a plurality of coordinate axes defined in an arbitrary predetermined manner, and the distance between the two categories in that space can be calculated as the degree of deviation. The coordinate axes can be defined arbitrarily; for example, an axis whose opposite poles are the words "active" and "thinking" representing characteristics of the categories, an axis whose opposite poles are "outdoor" and "indoor", and the like can be used.
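One minimal way to realize such a distance-based deviation degree is sketched below, assuming a hypothetical two-axis space ("active vs. thinking", "outdoor vs. indoor"). The category names and coordinate values are invented for illustration; only the plot-and-measure-distance idea comes from the description.

```python
import math

# Hypothetical coordinates on two axes, each in [-1, 1]:
# axis 0: active (+1) vs. thinking (-1); axis 1: outdoor (+1) vs. indoor (-1).
CATEGORY_COORDS = {
    "basketball": (0.9, 0.8),    # active, outdoor
    "chess":      (-0.8, -0.9),  # thinking, indoor
    "hiking":     (0.3, 1.0),    # moderately active, outdoor
}

def deviation_degree(provided_category, preferred_category):
    """Euclidean distance between two categories in the axis space."""
    p = CATEGORY_COORDS[provided_category]
    q = CATEGORY_COORDS[preferred_category]
    return math.dist(p, q)

d = deviation_degree("chess", "basketball")
```

With these made-up coordinates, "chess" deviates far from a "basketball" preference, while "hiking" is close to it; a threshold on this distance would play the role of the "predetermined level" above.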
Here, when receiving a stop instruction from the user, the factor estimation unit 174 gives priority to the determination as to whether or not the factor of the stop instruction is an increase in the driving load of the driver, over the determination as to other factors (for example, a conversation with the fellow passenger, the sleep of the fellow passenger, and the content of information). For example, the factor estimating unit 174 sequentially performs the determinations in the load determining unit 186, the session determining unit 188, the sleep determining unit 190, and the content determining unit 192, and estimates a factor related to the determination that an affirmative result is obtained first as a factor of the stop instruction.
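The prioritized, first-positive-result estimation order described above can be sketched as follows. The boolean inputs are illustrative stand-ins for the results of the load, conversation, sleep, and content determination units; the factor names are labels chosen here, not from the patent.

```python
def estimate_factor(driving_load_high, conversation_active,
                    passenger_sleeping, content_mismatch):
    """Check candidate factors in priority order and return the first hit.

    The driving-load determination is checked first, ahead of all other
    factors; "unknown" is returned when every determination is negative.
    """
    checks = [
        ("driving_load", driving_load_high),      # highest priority
        ("conversation", conversation_active),
        ("passenger_sleep", passenger_sleeping),
        ("content", content_mismatch),
    ]
    for factor, positive in checks:
        if positive:
            return factor
    return "unknown"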
Next, the condition determining unit 176 of the output control unit 158 determines the restart condition of the audio output stopped by the stop instruction, based on the factor of the stop instruction of the user estimated by the factor estimating unit 174. Specifically, for example, when the estimated factor is an increase in the driving load, the condition determining unit 176 determines the end of the driving scene that causes the increase in the driving load as the restart condition.
For example, when the factor estimated by the factor estimation unit 174 is a conversation with a fellow passenger, the condition determination unit 176 determines that the conversation is ended as a restart condition. Further, the condition determining unit 176 determines, for example, when the factor estimated by the factor estimating unit 174 is the sleep of the passenger, the decrease in the volume of the audio output as the restart condition.
Alternatively, for example, when the factor estimated by the factor estimating unit 174 is the content of the information, the condition determining unit 176 determines the change of the content of the information provided by the audio output as the restart condition. When the factor estimation unit 174 cannot specify a factor, that is, when the results of the determinations in the load determination unit 186, the session determination unit 188, the sleep determination unit 190, and the content determination unit 192 are all negative, the condition determination unit 176 determines that a predetermined time has elapsed since the stop instruction as the restart condition.
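Taken together, the factor-to-condition rules of the condition determining unit 176 amount to a simple mapping with a fixed-wait fallback. The sketch below uses hypothetical label strings for the factors and conditions; the pairings themselves follow the description above.

```python
# Factor labels map to restart-condition labels; both sets of names are
# illustrative, not identifiers from the patent.
RESTART_CONDITIONS = {
    "driving_load": "end_of_driving_scene",
    "conversation": "end_of_conversation",
    "passenger_sleep": "reduced_volume",
    "content": "changed_content",
}

def restart_condition(factor):
    """Return the restart condition for an estimated factor.

    When no factor could be identified, fall back to restarting after a
    predetermined time has elapsed since the stop instruction.
    """
    return RESTART_CONDITIONS.get(factor, "predetermined_time_elapsed")
```

The `dict.get` fallback covers the case where all four determinations were negative.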
When the audio output stopped by the stop instruction from the user is resumed, the notification unit 178 of the output control unit 158 notifies the user of the estimated factor through the speaker 142, for example. The notification may include a reason for restarting the audio output according to the factor estimated by the factor estimating unit 174. Alternatively, the notification may include a restart condition of the audio output according to the estimated factor. Further, the notification may include an inquiry to the user as to whether or not the already stopped sound output can be restarted.
For example, when the factor estimated by the factor estimation unit 174 is a conversation with the fellow passenger, the notification unit 178 outputs a notification such as "It seems the conversation with your fellow passenger has ended. May the sightseeing information from just before be resumed?". In this notification, "It seems the conversation with your fellow passenger has ended" is a phrase indicating the reason for restarting the audio output corresponding to the factor estimated by the factor estimation unit 174, and "May the sightseeing information from just before be resumed?" is an inquiry to the user as to whether the audio output stopped by the user's stop instruction may be restarted. Further, the phrase "the sightseeing information from just before" serves as a reminder of the content of the interrupted audio output. By including such a reminder in the notification, when the user's attention has drifted from the content of the audio output, in particular when the interruption of the audio output has lasted longer than a predetermined time or when a conversation with the fellow passenger took place during the interruption, the user's decision on the question of whether to restart the audio output can be facilitated.
Further, for example, when the factor estimated by the factor estimation unit 174 is the sleep of the fellow passenger, the notification unit 178 outputs a notification such as "It seems your fellow passenger has fallen asleep. May the volume be reduced and the previous sightseeing information resumed?". In this case, "It seems your fellow passenger has fallen asleep" is a phrase indicating the reason for restarting the audio output according to the factor estimated by the factor estimation unit 174. Further, "May the volume be reduced and the previous sightseeing information resumed?" is a phrase indicating the restart condition of the audio output corresponding to the estimated factor, and is also an inquiry to the user as to whether the audio output stopped by the user's stop instruction may be restarted.
Further, for example, when the factor estimated by the factor estimation unit 174 is the content of the information, the notification unit 178 outputs a notification such as "Shall we change the topic? There is information about basketball, which you like. How about it?". As can be readily understood, the series of phrases included in this notification presents the reason for restarting the audio output and the restart condition corresponding to the factor estimated by the factor estimation unit 174, together with an inquiry to the user as to whether the audio output may be restarted. In this case, the phrase "Shall we change the topic?" may be omitted, because the phrase "There is information about basketball, which you like. How about it?" already implies that "the content of the information" was estimated as the factor of the stop instruction.
The notification unit 178 acquires the preference information of the current driver from the user information management device 114, and suggests a change of the information content as the above-described restart condition. The notification unit 178 searches, for example, the contents stored in the storage device of the content providing device 120 based on the acquired preference information, and extracts content of a category whose deviation distance from any of the preference categories indicated by the preference information is equal to or less than a predetermined value. Then, the notification unit 178 can present reproduction of the extracted content as the above-described restart condition and suggest that it be performed.
When the factor estimated by the factor estimation unit 174 is "increase in driving load", the notification unit 178 notifies, as the reason for restarting, that the driving scene causing the increase in the driving load has ended. For example, the notification unit 178 outputs a notification such as "The emergency vehicle has passed, so may the sightseeing information from just before be resumed?". Here, "The emergency vehicle has passed" is an expression of the driving scene that caused the increase in the driving load.
When the factor estimated by the factor estimation unit 174 is "increase in driving load", the elapsed time from the user's stop instruction to the end of the driving scene causing the increase in the driving load is equal to or less than a predetermined time, and the reliability of the determination that that driving scene has ended is equal to or greater than a predetermined value, the notification unit 178 gives a notification consisting of a predetermined notification sound. When a notification consisting of the predetermined notification sound is given, the notification need not include an inquiry as to whether the audio output may be restarted. That is, in this case, the audio output is automatically restarted following the notification sound.
Thus, when the user has instructed to stop the audio output because of a temporary increase in the driving load, the user can listen to the audio output again immediately after the driving scene causing the temporary increase ends, without having to respond to an inquiry about restarting each time.
Here, as described above, the condition that "the reliability of the determination of the end of the driving scene is equal to or greater than a predetermined value" serves to more reliably avoid a situation in which the audio output is automatically restarted even though the driving scene has not actually ended.
The "reliability of the determination of the end of the driving scene" corresponds to the scene end confidence calculated by the scene determination unit 172. The elapsed time from the stop instruction to the end of the driving scene can be the time measured by the notification unit 178.
For example, when the stop instruction unit 170 receives a stop instruction from the user, the notification unit 178 starts measuring the elapsed time, and when the factor estimated by the factor estimation unit 174 is "increase in driving load", acquires the scene end confidence calculated by the scene determination unit 172 later. Then, the notification unit 178 may set the elapsed time from the reception of the stop instruction to the reception of the scene end confidence as the elapsed time from the reception of the stop instruction to the end of the driving scene causing an increase in the driving load.
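The two-condition gate for the automatic, notification-sound-only restart can be sketched as follows. The 5-second figure appears elsewhere in the description as an example; the 0.8 confidence threshold is an assumed value, since the patent only says "predetermined".

```python
def auto_restart_allowed(elapsed_seconds, scene_end_confidence,
                         max_elapsed=5.0, min_confidence=0.8):
    """Allow restart with only a notification sound (no inquiry) when
    the driving scene ended quickly after the stop instruction AND the
    end-of-scene determination is reliable enough.

    Both thresholds are illustrative defaults, not values fixed by the
    patent.
    """
    return (elapsed_seconds <= max_elapsed
            and scene_end_confidence >= min_confidence)
```

If either condition fails, the flow instead falls through to the notification that states the restart reason and asks the user whether to restart.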
When the user returns an affirmative response to the notification including the inquiry about whether or not the audio output can be resumed by the notification unit 178, the resume instruction unit 180 of the output control unit 158 instructs the corresponding client apparatus, that is, the AV output apparatus 118 or the content providing apparatus 120, to resume the audio output in accordance with the notification.
Here, "restart of the audio output in accordance with the notification" means, besides simply restarting the stopped audio output, restarting at the reduced volume suggested in the notification when the estimated factor is "sleep of the fellow passenger", or outputting the information suggested in the notification when the estimated factor is "content of the information". When one of these factors has been estimated, the restart instruction unit 180, for example, adds to the restart instruction given to the corresponding client device an instruction specifying the volume of the restarted audio output or specifying the information to be provided. Note that the volume of the restarted audio output may instead be specified by the restart instruction unit 180 to the audio output unit 160 of the UI control unit 156.
In the UI control device 100 having the above-described configuration, if an instruction to stop the audio output is received from the user while the audio content or the like is being output, the factor estimation unit 174 estimates the factor causing the user to perform the instruction to stop. Then, the condition determining unit 176 determines a restart condition for the stopped audio output based on the factor estimated by the factor estimating unit 174. Thus, the UI control device 100 can restart the audio output stopped by the user under appropriate conditions according to the cause of the stop.
Further, in the UI control device 100, when the audio output stopped by the user's stop instruction is restarted, the user is notified of the estimated factor. The notification may include the reason for restarting the audio output and/or the restart condition corresponding to the estimated factor, and/or an inquiry to the user as to whether the audio output may be restarted. Thus, the UI control device 100 can restart the audio output stopped by the user while keeping the user aware of why it is being restarted.
Next, a control process of the audio output performed by the output control unit 158 of the UI control device 100 will be described. Fig. 2 is a flowchart showing the steps of the control process. This process starts when the power of the UI control device 100 is turned on, and ends when the power is turned off.
In parallel with this process, the UI control unit 156 of the UI control device 100 outputs audio and video from the speaker 142 and the display device 144 in response to instructions from the AV output device 118 and/or the content providing device 120, which are client devices. Also in parallel with this process, the UI control unit 156 acquires voice and input from the user via the microphone 140 and the touch panel 146, and transmits them to the corresponding client device.
When the processing is started, the output control unit 158 starts evaluation of the driving scene by the scene determination unit 172 (S100). Next, the stop instruction unit 170 of the output control unit 158 determines whether or not there is an audio output from the speaker 142 (S102). For example, when the AV output device 118 and the content providing device 120, which are client devices, start an operation involving audio output to the user, the start of the audio output operation is notified to the UI control device 100, and the stop instruction unit 170 can determine whether or not audio output is present based on whether or not the notification is received.
When there is no audio output (no in S102), the stop instruction unit 170 returns to step S102 and repeats the processing. On the other hand, when there is audio output (yes in S102), the stop instruction unit 170 determines whether or not a stop instruction for the audio output has been given by the user (S104). The stop instruction unit 170 can determine whether or not a stop instruction has been given based on whether or not it receives, from the voice recognition unit 162 or the input processing unit 166 of the UI control unit 156, a stop instruction from the user given as a voice instruction acquired through the microphone 140 or as an input acquired via the touch panel 146.
When the stop instruction unit 170 does not issue a stop instruction (no in S104), it determines whether or not the audio output has ended (S106). For example, when the AV output device 118 and the content providing device 120, which are client devices, end an operation involving audio output to the user, the end of the audio output operation is notified to the UI control device 100, and the stop instruction unit 170 can determine whether or not the audio output has ended based on whether or not the notification is received.
When the audio output has ended (yes in S106), the stop instruction unit 170 returns the process to step S102. On the other hand, when the audio output is not completed (no in S106), the stop instruction unit 170 returns the process to step S104.
On the other hand, when the stop instruction is issued from the user in step S104 (yes in S104), the stop instruction unit 170 instructs the corresponding client apparatus to temporarily interrupt the current audio output operation (S108). Thus, the corresponding client apparatus interrupts the corresponding audio output operation and waits.
Next, the output control unit 158 of the UI control device 100 executes factor estimation processing for estimating the factor for which the user has performed the stop instruction by the factor estimation unit 174 (S110). Next, the output control unit 158 executes a condition determination process (S112) to determine a restart condition corresponding to the estimated factor for the interrupted audio output. Further, the output control unit 158 executes a notification process (S114) to notify the user of the estimated factor when the interrupted audio output is restarted. The procedure of the above-described factor estimation process, condition determination process, and notification process will be described later.
Next, the output control unit 158 instructs the corresponding client apparatus to restart or end the audio output in response to a response to the notification from the user or the like by the restart instruction unit 180 (S116), and returns to step S102 to repeat the processing.
Specifically, when the resume flag set in the notification process described later is 0, the resume instruction unit 180 instructs the corresponding client apparatus to end the audio output. On the other hand, when the restart flag is 1, the corresponding client apparatus is instructed to restart the audio output. At this time, when there is a restart condition set in the notification unit 178, the restart instruction unit 180 instructs the corresponding client apparatus of the restart condition.
Next, the procedure of the process in the above-described factor estimation process (S110) will be described. Fig. 3 is a flowchart showing the steps of the factor estimation process. When the process is started, the factor estimation unit 174 of the output control unit 158 determines whether or not the factor that the user has made the stop instruction is an increase in the driving load of the driver of the vehicle 102, by the load determination unit 186 (S200). When it is determined that the factor is an increase in the driving load (yes at S200), the load determination unit 186 sets the factor flag to 1(S202), and then ends the process.
Thus, the factor estimation unit 174 determines whether or not the factor giving the stop instruction has priority over the determination of the other factors is the increase in the driving load of the driver. After the end of the present process shown in fig. 3, the process of the output control unit 158 proceeds to the condition determination process of step S112 shown in fig. 2.
On the other hand, when determining that the factor of the user' S stop instruction is not an increase in the driving load (no in S200), the factor estimation unit 174 determines whether or not the factor is a conversation between the driver and the fellow passenger of the vehicle 102 by the conversation determination unit 188 (S204). When determining that the factor is a conversation with the fellow passenger (yes in S204), the conversation determination unit 188 sets the factor flag to 2(S206), and then ends the process.
On the other hand, when determining that the factor of the instruction to stop the user is not a conversation with the fellow passenger (no in S204), the factor estimation unit 174 determines whether or not the factor is a sleep of the fellow passenger of the vehicle 102 by the sleep determination unit 190 (S208). When it is determined that the factor is the sleep of the fellow passenger (yes in S208), the sleep determination unit 190 sets the factor flag to 3(S210), and then ends the process.
On the other hand, when it is determined that the cause of the stop instruction by the user is not the sleep of the passenger (no in S208), the cause estimation unit 174 determines whether or not the cause is the content of the information provided by the audio output by the content determination unit 192 (S212). When it is determined that the factor is the content of the information (yes in S212), the content determination unit 192 sets the factor flag to 4(S214), and then ends the process.
On the other hand, when determining that the factor is not the content of the information (no at S212), the factor estimating unit 174 sets the factor flag to 0(S216), and then ends the process.
Next, a procedure of the process in the condition determination process (S112) shown in fig. 2 will be described. Fig. 4 is a flowchart showing the steps of the condition decision processing. When the process is started, the condition determining unit 176 of the output control unit 158 determines whether or not the factor flag set in the factor estimation process (fig. 3) is set to 1 (S300). Then, when the factor flag is 1 (increase in driving load) (S300, yes), the condition determination unit 176 sets the end of the current driving scene causing the increase in driving load as a restart condition for audio output (S302), and then ends the present process. After the end of the present process shown in fig. 4, the process of the output control unit 158 proceeds to the notification process of step S114 shown in fig. 2.
On the other hand, if the factor flag is not 1 in step S300 (no in S300), the condition determining unit 176 determines whether or not the factor flag is set to 2 (S304). Then, when the factor flag is 2 (conversation with the fellow passenger) (yes in S304), the condition determination unit 176 sets the end of the conversation as a resume condition of the audio output (S306), and then ends the present process.
On the other hand, if the factor flag is not 2 in step S304 (no in S304), the condition determining unit 176 determines whether the factor flag is set to 3 (S308). Then, when the factor flag is 3 (the fellow passenger sleeps) (yes in S308), the condition determination unit 176 sets the decrease in the volume of the audio output as a restart condition of the audio output (S310), and then ends the present process.
On the other hand, if the factor flag is not 3 in step S308 (no in S308), the condition determining unit 176 determines whether or not the factor flag is set to 4 (S312). Then, when the factor flag is 4 (the content of the information) (S312, yes), the condition determination unit 176 sets the change of the content of the information provided by the audio output as a resume condition of the audio output (S314), and then ends the present process.
On the other hand, if the factor flag is not 4 in step S312 (no in S312), the condition determination unit 176 sets a predetermined time period after receiving the stop instruction as a restart condition of the audio output (S316), and then ends the present process.
Next, the procedure of the notification process (S114) shown in fig. 2 will be described. Fig. 5 is a flowchart showing the steps of the notification process. When the process is started, the notification unit 178 of the output control unit 158 determines whether or not the factor flag set in the factor estimation process (fig. 3) is set to 1 (increase in driving load) (S400). When the factor flag is 1 (yes in S400), the notification unit 178 waits for the end of the current driving scene causing the increase in the driving load, in accordance with the restart condition determined by the condition determination unit 176 in the condition determination process (S402). The scene determination unit 172 can determine whether or not the driving scene has ended by determining at predetermined time intervals whether or not the current driving scene acquired from the driving scene evaluation device 110 has changed.
Next, the notification unit 178 determines whether or not the elapsed time from the stop instruction to the end of the driving scene is equal to or less than a predetermined time (e.g., 5 seconds) (S404). When the elapsed time is equal to or less than the predetermined time (yes at S404), the notification unit 178 determines whether or not the scene end confidence of the driving scene determined to have ended at step S402 is equal to or more than a predetermined value (S406).
When the scene end confidence is equal to or higher than the predetermined value (yes at S406), the notification unit 178 outputs a notification sound as a notification (S408), sets the restart flag to 1(S410), and then ends the present process. After the end of the present process shown in fig. 5, the process of the output control unit 158 proceeds to step S116 shown in fig. 2.
On the other hand, when the elapsed time exceeds the predetermined time in step S404 (no in S404), or when the scene end confidence is smaller than the predetermined value (no in S406), the notification unit 178 outputs a notification including an expression indicating that the driving scene causing the increase in the driving load has ended as the reason for restarting the audio output and an inquiry sentence indicating whether or not the audio output can be restarted (S412).
Next, the notification unit 178 determines whether the answer to the inquiry is affirmative, that is, whether the restart of the audio output is permitted (S414). When the user's answer is negative (no in S414), the notification unit 178 sets the restart flag to 0 (S416), and the process ends. When the user's answer is affirmative (yes in S414), the process proceeds to step S410.
On the other hand, if the factor flag is not 1 (no in S400), the notification unit 178 determines whether the factor flag is 2 (conversation with the fellow passenger) (S418). When the factor flag is 2 (yes in S418), the notification unit 178 waits for the end of the conversation with the fellow passenger according to the restart condition determined by the condition determination unit 176 in the condition determination process (S420). Based on the sound in the vehicle 102 acquired from the microphone 140, the notification unit 178 can determine that the conversation has ended when, for example, a period without speech from the fellow passenger, or without conversational turn-taking, continues for a predetermined time or longer.
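The silence-based end-of-conversation test can be sketched as follows (a hypothetical illustration; representing the microphone signal as sampled `(timestamp, is_speaking)` pairs is an assumption, not something the patent specifies):

```python
def conversation_ended(speech_activity, silence_threshold):
    """Given a chronological list of (timestamp, is_speaking) samples
    derived from the cabin microphone, decide whether the conversation
    has ended: True when the trailing run of silence has lasted at
    least `silence_threshold`."""
    if not speech_activity:
        return False
    silence_start = None
    for timestamp, is_speaking in speech_activity:
        if is_speaking:
            silence_start = None          # any speech resets the silence run
        elif silence_start is None:
            silence_start = timestamp     # a new silence run begins here
    if silence_start is None:
        return False                      # still speaking at the last sample
    last_timestamp = speech_activity[-1][0]
    return last_timestamp - silence_start >= silence_threshold
```

In practice the "is speaking" signal would come from a voice-activity detector; that component is outside the scope of this sketch.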
Next, the notification unit 178 outputs a notification that states, as the reason for resuming the audio output, that the conversation has ended, together with an inquiry as to whether the audio output may be resumed (S422), and the process proceeds to step S414.
On the other hand, if the factor flag is not 2 (no in S418), the notification unit 178 determines whether the factor flag is 3 (the fellow passenger is sleeping) (S424). When the factor flag is 3 (yes in S424), the notification unit 178 outputs a notification including the restart condition determined by the condition determination unit 176 in the condition determination process (a reduced volume) and an inquiry as to whether the audio output may be restarted (S426), and the process proceeds to step S414.
On the other hand, if the factor flag is not 3 (no in S424), the notification unit 178 determines whether the factor flag is 4 (content of the information) (S428). When the factor flag is not 4 (no in S428), the notification unit 178 waits, according to the restart condition determined by the condition determination unit 176 in the condition determination process, until a predetermined time has elapsed since the user's stop instruction was received (S430). The notification unit 178 then outputs a notification including an inquiry as to whether the sound output may be restarted (S432), and the process proceeds to step S414.
On the other hand, when the factor flag is 4 (yes in S428), the notification unit 178 outputs a notification including the restart condition determined by the condition determination unit 176 in the condition determination process (a change of content) and an inquiry as to whether the audio output may be restarted (S434), and the process proceeds to step S414.
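The branching in steps S400 to S434 can be summarized as a dispatch on the factor flag (a sketch under assumptions: the flag constants mirror the four factors, the 5-second limit comes from the text, and the 0.8 confidence threshold is an invented placeholder, not a value from the patent):

```python
# Factor flags as set in the factor estimation process (fig. 3)
DRIVING_LOAD, CONVERSATION, PASSENGER_ASLEEP, CONTENT = 1, 2, 3, 4

def choose_notification(factor_flag, elapsed_time=None, confidence=None,
                        max_elapsed=5.0, min_confidence=0.8):
    """Return which kind of resumption notification to give.

    Mirrors S400-S434: only a short, high-confidence end of a high-load
    driving scene gets the bare notification sound without an inquiry."""
    if factor_flag == DRIVING_LOAD:
        if (elapsed_time is not None and elapsed_time <= max_elapsed
                and confidence is not None and confidence >= min_confidence):
            return "notification_sound"       # S408: no inquiry needed
        return "reason_plus_inquiry"          # S412: scene ended + ask to resume
    if factor_flag == CONVERSATION:
        return "reason_plus_inquiry"          # S422: conversation ended + ask
    if factor_flag == PASSENGER_ASLEEP:
        return "condition_plus_inquiry"       # S426: reduced volume + ask
    if factor_flag == CONTENT:
        return "condition_plus_inquiry"       # S434: changed content + ask
    return "inquiry_only"                     # S430/S432: fixed wait, then ask
```

The return values are labels for the five notification shapes the flowchart distinguishes; an implementation would map each label to the actual speech or sound output.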
The present invention is not limited to the configurations of the above-described embodiments and modifications, and can be implemented in various embodiments without departing from the scope of the invention.
For example, in the above-described embodiment, the UI control device 100 is shown as an example of the sound output control device, but the sound output control device of the present invention is not limited to the UI control device 100. The sound output control device may be implemented as any device that controls sound output. For example, it may be implemented as a device in which the UI control section 156 is removed from the UI control device 100. Such a sound output control device can execute the control method shown in fig. 2 in cooperation with a device in which the output control unit 158 is removed from the UI control device 100.
Further, in the UI control device 100, an increase in the driving load, a conversation between the user and a fellow passenger, a fellow passenger sleeping, and the content of the provided information are determined as candidates for the factor behind the user's stop instruction, but the factor candidates are not limited to these. For example, at least one of these candidates may be used, and other items may also be determined as factor candidates.
For example, any event that can be a factor behind a stop instruction of the audio output may be used as a factor candidate, such as a conversation through a window with a person outside the vehicle, a replacement of the driver, or the driver temporarily leaving the vehicle. For these factor candidates, the end of the conversation, the completion of the replacement, and the driver re-entering the vehicle may respectively serve as the conditions for restarting the audio output.
In the above-described embodiment, an increase in the driving load (a development of the driving scene) is given as an example of a case where the time from the stop instruction until the factor behind it disappears (hereinafter, the factor disappearance time) is short, and a notification sound is used as the notification to the user about resuming the audio output in that case. However, the case where the factor disappearance time is short is not limited to an increase in the driving load. For example, the notification sound can likewise be used as the resumption notification when the factor disappearance time is short for a replacement of the driver or a temporary alighting of the driver.
In the above-described embodiment, the UI control device 100 as the audio output control device is an in-vehicle device, but the implementation of the audio output control device is not limited to the in-vehicle device. The audio output control device may be any device that controls audio output. Such a device may be, for example, a portable terminal such as a smartphone. In this case, a portion of the portable terminal that functions as the audio output control device may be implemented as a software function portion in the portable terminal. Such a part of the audio output control device has the same configuration as the output control unit 158 of the UI control device 100 shown in fig. 1, and can execute the same control method as that of fig. 2 to 5.
In this way, the software function unit can, in response to a stop instruction from the user, stop the audio output generated by another software function unit (for example, one that controls AV output), estimate the factor behind the stop instruction, determine a restart condition corresponding to the estimated factor, and give a notification corresponding to the estimated factor. In this case, the output control unit implemented as a software function unit of the mobile terminal need not include the parts corresponding to the scene determination unit 172 and the load determination unit 186, which perform operations relating to the driving scene.
As described above, the UI control device 100 as the audio output control device executes the control method shown in figs. 2 to 5 in order to control audio output. The control method includes a step (S108) in which, during audio output to the user, the stop instruction section 170 stops the audio output in response to receiving a stop instruction from the user. The control method further includes: a step (S110) in which the factor estimating unit 174, in response to receiving the stop instruction, estimates the factor behind the user's stop instruction; and a step (S114) in which the notification unit 178, when the stopped audio output is restarted, gives the user a notification corresponding to the estimated factor.
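Those three steps can be sketched as a minimal controller skeleton (hypothetical names and structure; the patent does not prescribe this decomposition):

```python
class SoundOutputController:
    """Minimal sketch of the control method: stop on request (S108),
    estimate the factor behind the request (S110), and notify the user
    with factor-appropriate content when output is resumed (S114)."""

    def __init__(self, estimate_factor, notify):
        self.estimate_factor = estimate_factor  # callable: context -> factor
        self.notify = notify                    # callable: factor -> None
        self.playing = True
        self.factor = None

    def on_stop_instruction(self, context):
        self.playing = False                         # S108: stop the output
        self.factor = self.estimate_factor(context)  # S110: estimate why

    def resume(self):
        self.notify(self.factor)                # S114: factor-dependent notice
        self.playing = True
```

The `estimate_factor` and `notify` callables stand in for the factor estimating unit 174 and notification unit 178, respectively.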
According to this configuration, the audio output stopped by the user can be restarted appropriately while securing the user's acceptance.
The notification may include a reason for restarting the audio output corresponding to the estimated factor (S412, S422). The notification may also include a restart condition of the audio output corresponding to the estimated factor (S426, S434). With these configurations, the factor estimated for the stop instruction is presented to the user either explicitly or implicitly, so the audio output can be restarted appropriately while securing the user's acceptance.
Further, the user may include a driver of the vehicle 102. In the step of notifying (S114), when the estimated factor is an increase in the driving load of the driver, the notification includes, as the restart reason, the end of the driving scene that caused the increase in the driving load (S412). With this configuration, the restart reason is expressed not with the sensory phrase "a decrease in the driving load" but in terms of "the end of the driving scene", which the driver can readily understand. Therefore, the sound output can be restarted appropriately while securing the acceptance of the user as the driver.
Further, the notification includes an inquiry to the user as to whether the stopped sound output may be restarted (S412, S422, S426, S432, S434). With this configuration, the user's permission is requested before the audio output is restarted, so the user's acceptance can be secured more reliably.
In the step of notifying (S114), when the estimated factor is an increase in the driving load of the driver (yes in S400), the elapsed time from the stop instruction to the end of the driving scene causing the increase is equal to or less than a predetermined time (yes in S404), and the reliability of the determination that this driving scene has ended (the scene end confidence) is equal to or greater than a predetermined value (yes in S406), a notification including a predetermined notification sound is given to the user (S408).
In general, the driving load of a vehicle can change rapidly as the driving scene develops. In a vehicle there may therefore be many situations in which the sound output needs to be stopped for only a very short time. If a restart reason were announced on every such occasion, it could be more of a nuisance than a help to the driver. With the above configuration, a rapid development of the driving scene is captured quickly and reliably, and the notification is given by an intuitively understandable notification sound without announcing a restart reason, so the sound output can be restarted appropriately while preserving the user's convenience.
The notification including the predetermined notification sound does not include an inquiry to the user as to whether the stopped sound output may be restarted (S408). With this configuration, in the situations described above where the sound output should be stopped for only a very short time, the user is not asked whether to restart, so the sound output can be restarted appropriately without disturbing a driver who is coping with a rapidly developing driving scene.
Further, the UI control device 100 as the sound output control device controls sound output. The UI control device 100 includes: a stop instruction section 170 that, during voice output to the user, stops the voice output in response to receiving a stop instruction of the voice output from the user; a factor estimating unit 174 that, in response to receiving the stop instruction, estimates the factor behind the user's stop instruction; and a notification unit 178 that, when the stopped sound output is restarted, notifies the user in accordance with the estimated factor.
According to this configuration, the audio output stopped by the user can be restarted appropriately while securing the user's acceptance.

Claims (8)

1. A method of controlling sound output, the method having the steps of:
a stop instruction section, during sound output to a user, stopping the sound output in response to receiving a stop instruction of the sound output from the user;
a factor estimating section, in response to receiving the stop instruction, estimating a factor for which the user made the stop instruction; and
a notification unit, when the stopped sound output is restarted, notifying the user of the estimated factor.
2. The control method of sound output according to claim 1,
the notification includes a reason for restarting the sound output corresponding to the estimated factor.
3. The control method of sound output according to claim 1 or 2,
the notification includes a restart condition of the sound output corresponding to the estimated factor.
4. The control method of sound output according to claim 2,
the user includes a driver of the vehicle,
in the notifying step, when the estimated factor is an increase in the driving load of the driver, the notification is performed including, as the restart reason, the end of a driving scene causing the increase in the driving load.
5. The control method of sound output according to any one of claims 1 to 4,
the notification includes a query to the user regarding whether the stopped sound output can be restarted.
6. The control method of sound output according to any one of claims 1 to 5,
the user includes a driver of the vehicle,
in the step of making the notification,
when the estimated factor is an increase in the driving load of the driver, and the elapsed time from the stop instruction to the end of the driving scene causing the increase in the driving load is equal to or less than a predetermined time, and the reliability of the determination of the end of the driving scene causing the increase in the driving load is equal to or more than a predetermined value,
the notification including a predetermined notification sound is performed to the user.
7. The control method of sound output according to claim 6,
the notification including the prescribed notification sound does not include a query to the user as to whether or not resumption is possible.
8. An audio output control device for controlling audio output,
the sound output control device includes:
a stop instruction unit that, during audio output to a user, stops the audio output in response to receiving a stop instruction of the audio output from the user;
a factor estimating unit that, in response to receiving the stop instruction, estimates a factor for which the user made the stop instruction; and
and a notification unit configured to notify the user of the estimated factor when the stopped sound output is restarted.
CN202110284661.1A 2020-03-26 2021-03-17 Method and device for controlling sound output Pending CN113516978A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2020-055563 2020-03-26
JP2020055563A JP7407046B2 (en) 2020-03-26 2020-03-26 Audio output control method and audio output control device

Publications (1)

Publication Number Publication Date
CN113516978A true CN113516978A (en) 2021-10-19

Family

ID=77917692

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110284661.1A Pending CN113516978A (en) 2020-03-26 2021-03-17 Method and device for controlling sound output

Country Status (2)

Country Link
JP (1) JP7407046B2 (en)
CN (1) CN113516978A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1293411A (en) * 1999-10-15 2001-05-02 精工爱普生株式会社 Data transmission control device and electronic equipment
US20050267759A1 (en) * 2004-01-29 2005-12-01 Baerbel Jeschke Speech dialogue system for dialogue interruption and continuation control
JP2010035138A (en) * 2008-07-03 2010-02-12 Denso It Laboratory Inc Communication processing apparatus
CN102906732A (en) * 2010-05-20 2013-01-30 瑞萨电子株式会社 Data processor and electronic control unit
US20150006166A1 (en) * 2013-07-01 2015-01-01 Toyota Motor Engineering & Manufacturing North America, Inc. Systems and vehicles that provide speech recognition system notifications
JP2016050964A (en) * 2014-08-28 2016-04-11 株式会社デンソー Reading control unit and telephone call control unit
CN106030701A (en) * 2014-02-22 2016-10-12 奥迪股份公司 Method for acquiring at least two pieces of information to be acquired, comprising information content to be linked, using a speech dialogue device, speech dialogue device, and motor vehicle
WO2019026360A1 (en) * 2017-07-31 2019-02-07 ソニー株式会社 Information processing device and information processing method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6896540B2 (en) 2017-07-18 2021-06-30 アルパイン株式会社 In-vehicle system
WO2019146309A1 (en) 2018-01-26 2019-08-01 ソニー株式会社 Information processing device, information processing method, and program

Also Published As

Publication number Publication date
JP7407046B2 (en) 2023-12-28
JP2021156993A (en) 2021-10-07

Similar Documents

Publication Publication Date Title
CN106803423B (en) Man-machine interaction voice control method and device based on user emotion state and vehicle
AU2020202415B2 (en) Modifying operations based on acoustic ambience classification
JP4534925B2 (en) Vehicle information providing device
US10929652B2 (en) Information providing device and information providing method
CN111402925B (en) Voice adjustment method, device, electronic equipment, vehicle-mounted system and readable medium
US20130325478A1 (en) Dialogue apparatus, dialogue system, and dialogue control method
JP6713490B2 (en) Information providing apparatus and information providing method
US20190287520A1 (en) Dialog processing system, vehicle having the same, dialog processing method
JP2006092430A (en) Music reproduction apparatus
JP2003337039A (en) Interactive information providing apparatus, interactive information providing program and storage medium for storing the same
CN111681651B (en) Agent device, agent system, server device, method for controlling agent device, and storage medium
CN113450788A (en) Method and device for controlling sound output
US11282517B2 (en) In-vehicle device, non-transitory computer-readable medium storing program, and control method for the control of a dialogue system based on vehicle acceleration
JP7235554B2 (en) AGENT DEVICE, CONTROL METHOD OF AGENT DEVICE, AND PROGRAM
CN109785861B (en) Multimedia playing control method based on driving data, storage medium and terminal
CN113879235A (en) Method, system, equipment and storage medium for multi-screen control of automobile
CN113516978A (en) Method and device for controlling sound output
JP3505982B2 (en) Voice interaction device
JP2001014599A (en) Device and method for controlling vigilance and computer readable recording medium with vigilance management program stored therein
US11897481B2 (en) Driver assistance system and driver assistance method
JP7388962B2 (en) Standby time adjustment method and device
JP7039872B2 (en) Vehicle travel recording device and viewing device
CN111993997A (en) Pedestrian avoidance prompting method, device, equipment and storage medium based on voice
JP6555113B2 (en) Dialogue device
US20240025416A1 (en) In-vehicle soundscape and melody generation system and method using continuously interpreted spatial contextualized information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination