WO2023209888A1 - Information processing device - Google Patents

Information processing device

Info

Publication number
WO2023209888A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
unit
guidance voice
route
information processing
Prior art date
Application number
PCT/JP2022/019144
Other languages
French (fr)
Japanese (ja)
Inventor
Satoru Takizawa (悟 滝澤)
Original Assignee
Pioneer Corporation (パイオニア株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Pioneer Corporation (パイオニア株式会社)
Priority to PCT/JP2022/019144 priority Critical patent/WO2023209888A1/en
Publication of WO2023209888A1 publication Critical patent/WO2023209888A1/en

Links

Images

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 — Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 — Input arrangements or combined input and output arrangements for interaction between user and computer
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 — Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/16 — Sound input; Sound output

Definitions

  • the present invention relates to an information processing device.
  • Car navigation systems that provide route guidance using audio information are known (for example, see Patent Documents 1 and 2).
  • Patent Document 1 discloses a route guidance device that includes a guidance output section that outputs a guidance voice and a direction input section that serves as an input means.
  • Patent Document 2 discloses a speech recognition device that includes a determination unit that determines a recognition vocabulary comprehension level that indicates the extent to which a user understands speech-recognizable vocabulary from the acquired user's utterance content.
  • Patent Document 1: JP 2008-261641 A; Patent Document 2: Japanese Patent Application Publication No. 2012-27487
  • In a route guidance device such as that described in Patent Document 1, the user inputs the direction into the direction input section and queries the system to check whether the recognized direction is correct.
  • In the speech recognition device of Patent Document 2, a microphone collects speech indicating the content of the user's utterance, a speech recognition unit performs speech recognition processing on the collected speech, and the recognition vocabulary comprehension level is determined based on, for example, the number of timeouts counted during the speech recognition processing. The guidance is then changed according to the recognition vocabulary comprehension level.
  • Although such a system can determine whether or not the user was able to speak with the system when determining the recognition vocabulary comprehension level, it is difficult for the system to understand why the user was able to speak (or why the user was unable to speak).
  • One example of the problem to be solved by the present invention is the following: when a situation arises in which there is a matter the user wants to confirm, for reasons such as the user not being able to understand the guidance voice or the guidance voice lacking some information, enable the device or system to understand the cause of that matter so that the information can be used to improve the guidance voice.
  • The invention according to claim 1 includes: an audio output unit that outputs a guidance voice to be provided to a user; a detection unit that detects the user's reaction to the guidance voice; an extraction unit that extracts the user's confirmation behavior regarding the guidance voice from among the detected reactions; an estimation unit that estimates the matter the user wants to confirm based on the confirmation behavior; and a storage unit that stores information about the estimated matter.
  • the invention according to claim 10 is an information processing method executed by a computer, comprising: an audio output step of outputting a guidance voice to be provided to a user; a detection step of detecting a reaction of the user to the guidance voice; an extraction step of extracting the user's confirmation behavior regarding the guidance voice from among the reactions detected in the detection step; an estimation step of estimating what the user wants to confirm based on the confirmation behavior;
  • the method is characterized by comprising a storage step of storing information about the estimated item.
  • the invention according to claim 11 is characterized in that the information processing method according to claim 10 is executed by a computer as an information processing program.
  • The invention according to claim 12 is characterized in that the information processing program according to claim 11 is stored in a computer-readable storage medium.
  • FIG. 1 is a schematic configuration diagram of an information processing device according to an embodiment of the present invention.
  • FIG. 2 is a flowchart of the operation of the information processing device shown in FIG. 1.
  • FIG. 3 is a schematic configuration diagram of an information processing device in Example 2.
  • FIG. 4 is a schematic configuration diagram of an information processing device in Example 3.
  • the audio output unit outputs guidance audio provided to the user.
  • the detection unit detects the user's reaction to the guidance voice.
  • the extraction unit extracts the user's confirmation behavior regarding the guidance voice from among the reactions detected by the detection unit.
  • the estimating unit estimates what the user wants to confirm based on the confirmation behavior.
  • the storage unit stores information about the items estimated by the estimation unit.
  • the extraction section can extract the user's confirmation behavior from among the user's reactions detected by the detection section, so that it is possible to detect that the user needs some kind of confirmation regarding the guidance voice.
  • This makes it possible for the information processing device to understand that a situation has occurred in which there is a matter the user wants to confirm, for reasons such as the user not being able to understand the guidance voice or information being lacking in the guidance voice.
  • the estimating section estimates the items that the user wants to confirm based on the confirmation behavior
  • The storage section can store information about the matters estimated by the estimating section, so this information can be accumulated and utilized. Therefore, by analyzing the accumulated information, the device or system side can understand the cause of the occurrence of the matters that the user wants to confirm regarding the guidance voice, so that it can be used to improve the guidance voice.
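The flow described above (audio output → detection → extraction → estimation → storage) can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: all class names, marker keywords, and category strings are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Reaction:
    kind: str       # e.g. "utterance", "device_use", "vehicle" (assumed labels)
    content: str

@dataclass
class InformationProcessingDevice:
    stored_items: list = field(default_factory=list)

    def output_guidance(self, text: str) -> str:
        # audio output unit: provide the guidance voice to the user
        return text

    def detect(self, raw_events) -> list:
        # detection unit: every observed event is a candidate reaction
        return [Reaction(kind, content) for kind, content in raw_events]

    def extract(self, reactions) -> list:
        # extraction unit: keep only reactions that look like confirmation behavior
        markers = ("smartphone", "where is", "don't understand", "too fast")
        return [r for r in reactions
                if any(m in r.content.lower() for m in markers)]

    def estimate(self, behavior: Reaction) -> str:
        # estimation unit: guess the matter the user wants to confirm
        text = behavior.content.lower()
        if "too fast" in text:
            return "matter related to the speed of the guidance voice"
        if "where is" in text:
            return "matter related to a place name"
        return "content of the guidance voice"

    def store(self, item: str) -> None:
        # storage unit: accumulate estimated matters for later analysis
        self.stored_items.append(item)

device = InformationProcessingDevice()
device.output_guidance("Turn right at the next intersection.")
reactions = device.detect([
    ("utterance", "Where is the interchange?"),
    ("vehicle", "speed steady"),
])
for behavior in device.extract(reactions):
    device.store(device.estimate(behavior))
```

After this run, `device.stored_items` holds one accumulated matter, since only the "Where is…" utterance looked like confirmation behavior.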
  • the storage unit may be configured to classify and store information about the items estimated by the estimation unit according to predetermined conditions.
  • Information about the matters estimated by the estimation unit can be classified and stored as, for example, "the content of the guidance voice," "the voice of the guidance voice itself," "matters related to place names," "matters related to the speed of the guidance voice," "matters related to the pronunciation of the guidance voice," and so on. Thus, for example, regarding the first guidance voice, it is possible to analyze by category what kind of content the user needed to confirm.
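As a rough sketch, classification under such predetermined conditions could be implemented as a keyword lookup. The category names follow the text above, while the keyword rules and function names are assumptions for illustration.

```python
from collections import defaultdict

# Category names follow the text; the keyword rules are assumptions.
RULES = {
    "matters related to the speed of the guidance voice": ("too fast",),
    "matters related to the pronunciation of the guidance voice": ("pronunciation",),
    "matters related to place names": ("place name", "where is"),
}

def classify(estimated_item: str) -> str:
    text = estimated_item.lower()
    for category, keywords in RULES.items():
        if any(k in text for k in keywords):
            return category
    return "content of the guidance voice"  # default bucket

def store_classified(items):
    # group each estimated matter under its category for later analysis
    buckets = defaultdict(list)
    for item in items:
        buckets[classify(item)].append(item)
    return dict(buckets)

buckets = store_classified([
    "the guidance was too fast to follow",
    "could not catch the pronunciation",
    "where is the named interchange",
])
```

Grouping by category like this is what makes the per-category analysis mentioned above straightforward.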
  • the information processing device may include a route search unit that searches for a route for a moving body
  • the guidance voice may be a voice that conveys to the user the route for the mobile body that has been searched by the route search unit.
  • the confirmation action does not have to be an action for the route search unit.
  • For example, there are cases where a user driving a mobile body stops the vehicle on the roadside after listening to a guidance voice and searches for information using a device such as a smartphone, or where the driver instructs a fellow passenger to re-check the route using a mobile device.
  • Such behavior is not directed at the route search unit, but it can still be extracted as a confirmation action. Therefore, it becomes easier for the information processing device to understand that a situation in which the user wants to confirm the guidance voice has occurred, and to analyze the cause of that situation, which can contribute to improving the guidance voice.
  • the information processing device also includes a route search unit that searches for a route for the mobile object, and the guidance voice is a voice that conveys the route of the mobile body searched by the route search unit to the user.
  • the output section, the detection section, the extraction section, the estimation section, and the storage section may be provided in a terminal device that moves together with the mobile object. By doing so, the information processing device can be configured with one terminal device without using external equipment such as a server device.
  • the information processing device also includes a route search unit that searches for a route for the mobile object, and the guidance voice is a voice that conveys the route of the mobile body searched by the route search unit to the user.
  • the output section, the extraction section, the estimation section, and the storage section may be provided in a first terminal device that moves together with the mobile object.
  • the detection unit may be provided in a second terminal device that moves together with the mobile object and can communicate with the first terminal device. By doing so, the load on each device can be reduced compared to an information processing device configured with one terminal device.
  • the information processing device also includes a route search unit that searches for a route for the mobile object, and the guidance voice is a voice that conveys the route of the mobile body searched by the route search unit to the user.
  • the output unit may be provided in a first terminal device that moves together with the mobile object.
  • the detection unit may be provided in a second terminal device that moves together with the mobile object and can communicate with the first terminal device.
  • The extraction unit, the estimation unit, and the storage unit may be provided in a server device that can communicate with the first terminal device and the second terminal device. By doing so, the load on each device can be reduced compared to an information processing device configured with one terminal device, or one configured with a first terminal device and a second terminal device.
  • the second terminal device may include an information search unit independent from the route search unit.
  • The confirmation actions may include the user using the information search unit to search for a route for the moving object, the user using the information search unit to search for the coordinates of a specific point, and the user using the information search unit to perform a keyword search for a specific word. By doing this, if a problem occurs, such as the user not being able to understand the guidance voice or something being missing from the guidance voice, the problem that the user attempted to solve by using the information search unit, which is independent from the route search section, can be extracted as a confirmation action.
  • The confirmation action may be the user uttering a voice in response to the guidance voice.
  • the estimating unit may estimate the item the user wants to confirm based on the sound made by the user.
  • By doing this, the information processing device can understand, from the voice uttered by the user, that the user needs to confirm something about the guidance voice, and can estimate the matter the user wants to confirm based on that voice. For this reason, points in the guidance voice that were difficult to understand, such as "the guidance voice was too fast to understand" or "the pronunciation of the guidance voice was poor and I couldn't understand it," can be analyzed for each user, and it becomes possible to adjust the guidance voice so that it is optimal for each user.
  • The extraction unit can also extract actions other than the user's direct response to the guidance voice as confirmation actions. For example, a user might talk to himself or herself, saying things like "Where is (place name)?", "I don't really understand (place name)," "I couldn't catch it because it was too fast," or "I couldn't catch it because the pronunciation was poor," or might voluntarily give instructions to a fellow passenger, such as "look up (place name) on your smartphone" or "check the route again on your smartphone." The fact that the user spontaneously uttered such a voice, rather than speaking to the information processing device, can be extracted as a confirmation action. Therefore, there is no need for the information processing device to specifically guide the user to speak while driving, and it is possible to avoid diverting the user's attention while driving or making the user feel bothered by the guidance.
  • an information processing method is an information processing method executed by a computer.
  • In the audio output step, a guidance voice to be provided to the user is output.
  • In the detection step, the user's reaction to the guidance voice is detected.
  • In the extraction step, the user's confirmation behavior regarding the guidance voice is extracted from among the reactions detected in the detection step.
  • In the estimation step, the matters that the user wants to confirm are estimated based on the confirmation behavior.
  • In the storage step, information about the matters estimated in the estimation step is stored.
  • In this way, the matters that the user wants to confirm are estimated based on the confirmation behavior, and information about the estimated matters can be stored in the storage step, so this information can be accumulated and utilized. Therefore, by analyzing the accumulated information, the device or system side can understand the cause of the occurrence of the matters that the user wants to confirm regarding the guidance voice, so that it can be used to improve the guidance voice.
  • the above-described information processing method may be executed by a computer as an information processing program.
  • When a situation occurs in which the user wants to confirm the guidance voice, the situation can be grasped using the computer.
  • the above-described information processing program may be stored in a computer-readable storage medium.
  • the information processing program can be distributed as a standalone program in addition to being incorporated into a device, and version upgrades can be easily performed.
  • FIG. 1 is a schematic configuration diagram of an information processing apparatus 1 according to Example 1 of this embodiment.
  • the information processing device 1 includes, for example, a terminal device 10 (first terminal device) that moves together with a vehicle (mobile object).
  • the terminal device 10 is, for example, a navigation device installed in a vehicle, and includes a control section 11, an input section 12, an output section 13 (sound output section), and a storage section 14.
  • the control unit 11 is composed of a CPU equipped with a memory such as a RAM or ROM, and controls the entire terminal device 10.
  • The control unit 11 includes a user identification unit 110, a route search unit 111, a detection unit 112, an extraction unit 113, and an estimation unit 114, and performs route search processing to search for a route, audio output processing to output information about the searched route to the user as a guidance voice, and so on.
  • The control unit 11 also monitors various devices installed in the vehicle and performs detection processing to detect the user's reactions to the guidance voice.
  • the various devices monitored by the control unit 11 are, for example, devices such as a GPS receiver, various sensors such as an acceleration sensor and a gyro sensor, an in-vehicle camera, or a microphone.
  • The control unit 11 further performs an extraction process to extract the user's confirmation behavior related to the guidance voice from among the user's reactions detected by the detection process, and an estimation process to estimate the confirmation matters that the user wants to confirm based on the extracted confirmation behavior.
  • the control unit 11 also performs a classification process of classifying confirmation items according to predetermined conditions, a storage process of storing the classified confirmation items in the storage unit 14, and the like. Note that details of the user's reaction, user's confirmation behavior, and confirmation items will be explained later when the operation of the information processing device 1 is explained.
  • the input unit 12 is composed of a device such as a microphone, an input button, or a touch panel, for example.
  • the input unit 12 receives a user's instruction by the user's voice or by operating an input button or touch panel, and transmits a signal indicating the instruction to the control unit 11.
  • the output unit 13 is comprised of devices such as a speaker that outputs guidance audio and an amplifier that drives the speaker.
  • the output unit 13 outputs information about the vehicle route searched by the route search process as a guidance voice, and transmits it to the user.
  • The storage unit 14 is composed of, for example, a hard disk or a non-volatile memory, and stores programs and data for the control unit 11 to perform the above-mentioned control (for example, map data necessary for route search processing and voice data necessary for voice output processing), confirmation data 140 (described later) indicating confirmation behavior, estimated data 141 (described later) indicating confirmation matters, and so on.
  • this information processing method can be executed as an information processing program on a computer equipped with a CPU or the like. Further, this information processing program may be stored in a computer-readable storage medium.
  • the user identification unit 110 performs user identification processing (step S100).
  • the user who uses the terminal device 10 is identified.
  • For example, the user may be automatically identified by matching an in-vehicle image taken by the in-vehicle camera with images of the vehicle's driver and passengers stored in advance in the storage unit 14, or a question may be output from the terminal device 10 to the driver and fellow passengers via the output unit 13, the user may be requested to input an answer to the question into the input unit 12, and the user may be identified based on the answer.
  • Alternatively, a user terminal such as a smartphone owned by the user may be registered in advance in the terminal device 10, and the user may be identified based on the terminal device 10 detecting the user terminal by wireless means or the like, or on the user terminal being directly connected to the terminal device 10.
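The third identification method above (a pre-registered user terminal detected wirelessly or connected directly) could be sketched as a simple lookup. The terminal IDs and the table layout are illustrative assumptions.

```python
# User-identification sketch: match a detected terminal against terminals
# registered in advance in the terminal device. IDs and table are assumptions.
REGISTERED_TERMINALS = {
    "aa:bb:cc:01": "driver",
    "aa:bb:cc:02": "passenger",
}

def identify_user(detected_terminal_id):
    # Returns the registered user, or None so the terminal device can fall
    # back to camera matching or a spoken question, as described above.
    return REGISTERED_TERMINALS.get(detected_terminal_id)
```

Returning `None` rather than raising keeps the fallback path (camera matching or a question via the output unit) easy to chain.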
  • In step S200, it is determined whether to perform route search processing.
  • the input unit 12 is first monitored, and it is determined whether an instruction for performing route search processing is input.
  • The instruction for performing the route search process may be given by the user inputting the starting point and destination into the input section 12, by voice, such as "Check the route from XX (starting point) to XX (destination)," or by operating a touch panel or input button. If the result of the determination is YES, the process advances to the next step S300; if the result is NO, the process waits until the instruction is input.
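Recognizing such a spoken route-search instruction could be sketched as below, assuming an English phrasing like "Check the route from X to Y"; the actual wording and the speech recognizer are not specified in the text, so this parser is purely illustrative.

```python
import re

# Hypothetical parser for a spoken route-search instruction.
# The "route from ... to ..." phrasing is an assumption.
PATTERN = re.compile(r"route from (?P<origin>.+?) to (?P<dest>.+)", re.IGNORECASE)

def parse_instruction(utterance: str):
    m = PATTERN.search(utterance)
    if m is None:
        return None  # not a route-search instruction: keep waiting (NO branch)
    # YES branch: hand the origin/destination pair to the route search unit
    return m.group("origin").strip(), m.group("dest").strip()
```

A `None` result corresponds to the NO branch of step S200 (keep waiting for an instruction).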
  • step S300 the route search unit 111 performs a route search process.
  • In this process, a route from the departure point input into the input unit 12 to the destination is searched, and guidance information about the searched route is generated. Since the contents of the route search process are publicly known, a detailed explanation is omitted.
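The text notes that route search itself is publicly known; a typical approach is a shortest-path search such as Dijkstra's algorithm, sketched here on a toy road graph. The node names and costs are invented for illustration and are not from the patent.

```python
import heapq

def dijkstra(graph, start, goal):
    # graph: {node: [(neighbor, cost), ...]}; returns (total_cost, path)
    queue = [(0, start, [start])]  # priority queue ordered by accumulated cost
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, step in graph.get(node, []):
            if nxt not in seen:
                heapq.heappush(queue, (cost + step, nxt, path + [nxt]))
    return float("inf"), []  # no route found

# Toy road network (illustrative only)
roads = {
    "A": [("B", 2), ("C", 5)],
    "B": [("C", 1), ("D", 4)],
    "C": [("D", 1)],
}
cost, route = dijkstra(roads, "A", "D")
```

The resulting node sequence is what would be turned into guidance information for the voice output step.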
  • Step S400 is a step corresponding to the audio output process in the above embodiment.
  • the control section 11 performs voice output processing, and the output section 13 outputs a guidance voice.
  • the guidance voice is a voice provided to the user, and in this case indicates the guidance information generated in step S300 described above.
  • the vehicle route searched by the route search unit is transmitted to the user by this guidance voice.
  • For example, the guidance voice may explain the route to the user using road names, such as "Go along XX road (name of the road), enter the XX expressway (name of the expressway) from the XX interchange (name of the interchange), and get off at the XX interchange (name of the interchange)," or using place names, such as "This is the route that passes through XX (place name) and goes to XX (place name)."
  • The guidance voice may also use facility names, such as "Turn right at the intersection before XX (name of facility), then turn left at the traffic light after XX (name of facility)."
  • The route may also be explained to the user by a combination of these road names, place names, and facility names.
  • The guidance voice may include a request for a response from the user, such as "If you do not understand the guidance, please reply that you did not understand." However, from the viewpoint of preventing distraction and annoyance to the user, it is more preferable that the guidance voice not require the user to respond.
  • step S400 ends and the process advances to step S500.
  • Step S500 is a detection step in the embodiment described above.
  • the detection unit 112 performs detection processing.
  • devices such as a GPS receiver, various sensors such as an acceleration sensor and a gyro sensor, an in-vehicle camera, or a microphone are monitored. Then, through these devices, sounds, user actions, behavior of the vehicle operated by the user, etc. that occur after the guidance voice is output are detected as the user's reaction.
  • the detected reaction is treated as reaction data indicating the reaction, and is temporarily stored in the storage unit 14 in a ring buffer format.
  • the detection unit 112 detects the user's reaction to the guidance voice, and the reaction is temporarily stored in the storage unit 14. Then, the process advances to step S600.
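The temporary ring-buffer storage of reaction data described here can be sketched with a bounded deque: once the buffer is full, appending a new reaction silently discards the oldest one. The buffer size of 3 is an arbitrary assumption for illustration.

```python
from collections import deque

# A deque with maxlen behaves as a ring buffer for temporary storage of
# reaction data; the capacity (3) is an illustrative assumption.
reaction_buffer = deque(maxlen=3)

def record_reaction(event: str) -> None:
    # newest reaction in, oldest reaction out once the buffer is full
    reaction_buffer.append(event)

for event in ["utterance: where is it?", "vehicle: stopped",
              "device: smartphone unlocked", "utterance: check again"]:
    record_reaction(event)
```

After the loop, only the three most recent reactions remain, matching the ring-buffer behavior described for the storage unit 14.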
  • Step S600 is a step corresponding to the extraction step in the embodiment described above.
  • the extraction unit 113 performs extraction processing.
  • the storage unit 14 is monitored for a predetermined period of time.
  • the predetermined time may be, for example, about 1 to 3 minutes after the guidance voice is output. After the monitoring, it is determined whether or not data indicating the user's confirmation behavior regarding the guidance voice is found among the temporarily stored reaction data.
  • the extraction unit 113 extracts the user's confirmation behavior regarding the guidance voice from among the reactions detected by the detection unit 112. Then, if the confirmation data 140 can be extracted (in the case of YES), the process advances to step S700, and if the confirmation data 140 cannot be extracted (in the case of NO), the operation of the terminal device 10 is ended.
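A sketch of this extraction step, assuming timestamped reaction records and simple keyword markers for confirmation behavior: the 180-second window follows the "about 1 to 3 minutes" mentioned above, but the markers and exact criteria are assumptions.

```python
# Extraction sketch: monitor temporarily stored reactions for a fixed window
# after the guidance voice and keep only those that look like confirmation
# behavior. Timestamps are seconds since the guidance voice was output.
WINDOW_S = 180  # "about 1 to 3 minutes" per the text
MARKERS = ("smartphone", "where is", "check the route", "too fast")

def extract_confirmation(reactions):
    # reactions: [(t_seconds, text), ...]
    return [text for t, text in reactions
            if t <= WINDOW_S and any(m in text.lower() for m in MARKERS)]

found = extract_confirmation([
    (30, "Where is the interchange?"),
    (60, "radio volume changed"),
    (400, "check the route again"),   # outside the monitoring window
])
```

An empty result corresponds to the NO branch (no confirmation data 140 extracted, operation ends); a non-empty result advances to step S700.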
  • The confirmation action extracted as the confirmation data 140 refers to an action taken when a user who has heard the guidance voice encounters something he or she wants to confirm about it; that is, an action indicating that the user was for some reason unable to understand the guidance voice, or that something was missing from the guidance voice.
  • The confirmation behavior is, for example, the user "stopping the vehicle on the roadside and operating a device such as a smartphone."
  • This confirmation behavior is discovered and extracted by the extraction unit 113 when the reaction data created by the detection unit 112 includes data indicating that the user is operating a device such as a smartphone, or video data of the user operating such a device.
  • The confirmation behavior may also be the user "talking to himself or herself," saying things like "Where is XX (place name)?", "I don't really understand XX (place name)," "I couldn't catch it because it was too fast," or "I couldn't catch it because the pronunciation was poor," or "giving instructions to fellow passengers," such as "look up XX (place name) on your smartphone" or "check the route again on your smartphone." This confirmation behavior is discovered and extracted by the extraction unit 113 when the reaction data created by the detection unit 112 includes voice data indicating such soliloquy or instructions to a fellow passenger.
  • As described above, confirmation actions include actions other than, for example, re-searching for a route using the route search unit 111, that is, actions not directed at the route search unit 111 that initially performed the route search; such behavior is also extracted by the extraction unit 113.
  • The user's utterance of a voice in response to the guidance voice is also extracted as confirmation behavior, and this behavior includes spontaneous actions other than replying to the guidance voice, such as "talking to oneself" and "giving instructions to fellow passengers."
  • Step S700 is a step corresponding to the estimation process in the above-mentioned embodiment.
  • the estimation unit 114 performs estimation processing.
  • confirmation items that the user wants to confirm are estimated based on the confirmation data 140 (that is, the extracted confirmation behavior).
  • For example, if the confirmation data 140 indicates that the user has stopped the vehicle on the roadside and is operating a device such as a smartphone, there is a high possibility that the user has interrupted driving to do some research concerning the guidance voice.
  • The fact that the user is likely doing such research means that the user could most likely hear the guidance voice itself. Therefore, when such confirmation data is extracted, the confirmation matter is presumed to be "the content of the guidance voice."
  • The same applies when confirmation data 140 with the content "giving instructions to fellow passengers" is extracted: since the user is giving instructions to fellow passengers, the user can hear the guidance voice, but it can be estimated that the user does not understand what the guidance voice is saying, because the user is having a fellow passenger check the route using another device, namely a smartphone. The same is also true when confirmation data 140 containing "talking to oneself," such as "Where is XX (place name)?" or "I don't really understand XX (place name)," is extracted.
  • In step S700, confirmation matters are thus estimated based on the user's confirmation behavior, such as using another device after listening to the guidance voice or uttering a voice in response to the guidance voice.
  • Also, in step S700, regardless of whether the confirmation matter is "the content of the guidance voice" or "the voice of the guidance voice itself," the confirmation matter may be specified based on the voice uttered by the user, and when this is possible, such identification is made. For example, if the content of the confirmation data 140 is "talking to oneself," such as "Where is XX (place name)?" or "I don't really understand XX (place name)," the confirmation matter is identified as "related to place names." If the content of the confirmation data 140 is "talking to oneself" such as "It was too fast to hear," the confirmation matter is identified as "related to the speed of the guidance voice." If the content of the confirmation data 140 is "talking to oneself" such as "I couldn't understand it because of poor pronunciation," the confirmation matter is identified as "related to the pronunciation of the guidance voice."
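The identification rules described here amount to mapping the wording of the extracted utterance to a confirmation matter. A hedged sketch follows; the keyword rules and return strings merely mirror the examples above and are not the patented logic.

```python
# Sketch of the identification in step S700: map the wording of the user's
# extracted utterance to a confirmation matter. Rules are assumptions that
# follow the examples in the text.
def estimate_confirmation_item(utterance: str) -> str:
    text = utterance.lower()
    if "too fast" in text:
        return "related to the speed of the guidance voice"
    if "pronunciation" in text:
        return "related to the pronunciation of the guidance voice"
    if "where is" in text or "don't really understand" in text:
        return "related to place names"
    # fallback when the wording gives no finer detail
    return "content of the guidance voice"
```

The fallback branch corresponds to the case where only "the content of the guidance voice" can be presumed, as described for device-operation confirmation data.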
  • the estimation unit 114 creates estimated data 141 as information regarding the confirmation items.
  • This estimated data 141 indicates confirmation items. This ends step S700, and the process advances to step S800.
  • Step S800 is a storage step in the above embodiment.
  • the control unit 11 performs classification processing and storage processing.
  • the estimated data 141 is classified according to predetermined conditions.
  • For example, the estimated data 141 can be classified using the types of confirmation matters mentioned above ("content of the guidance voice," "the voice of the guidance voice itself," "matters related to place names," "matters related to the speed of the guidance voice," "matters related to the pronunciation of the guidance voice").
  • the estimated data 141 is stored as data that cannot be overwritten.
  • The estimated data 141 may be stored as-is, but, for example, data on the user specified by the user identification section 110, data on instructions input into the input section 12, data on the guidance voice output from the output section 13, the confirmation data 140 that is the basis for creating the estimated data 141, and the current position and current time received by the GPS receiver may be added to the estimated data 141 and stored. By doing this, the reason why the guidance voice was not understood by the user can be analyzed from various viewpoints. The operation of the terminal device 10 then ends.
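An append-only storage sketch for step S800 is shown below, attaching the kind of contextual data mentioned above to each estimated item. The field names and the in-memory list standing in for the storage unit 14 are assumptions.

```python
# Append-only storage sketch for step S800. The record fields mirror the
# data the text says may be added (user, guidance voice, position, time);
# all field names are illustrative assumptions.
storage = []  # stands in for the non-volatile storage unit 14

def store_estimated_data(item, user_id, guidance_text, position, timestamp):
    record = {
        "confirmation_item": item,
        "user": user_id,
        "guidance_voice": guidance_text,
        "position": position,    # current position from the GPS receiver
        "time": timestamp,       # current time from the GPS receiver
    }
    storage.append(record)       # append only: earlier records are never overwritten
    return record

store_estimated_data("related to place names", "user-1",
                     "Turn right at the XX interchange.",
                     (35.68, 139.69), "2022-04-28T10:00:00")
```

Appending rather than overwriting preserves the accumulation property the text relies on for later multi-viewpoint analysis.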
As described above, the extraction unit 113 can extract the user's confirmation behavior from among the user's reactions detected by the detection unit 112, so it is possible to detect that the user needs some kind of confirmation regarding the guidance voice. In other words, the information processing device 1 can grasp that a situation has occurred in which the user has something to confirm, for reasons such as the user not being able to understand the guidance voice or information being missing from it.
Furthermore, the estimation unit 114 estimates the confirmation items based on the confirmation behavior, and the storage unit 14 can store the estimated data 141 (information about the confirmation items estimated by the estimation unit 114), so confirmation items can be accumulated and utilized. Therefore, by analyzing the accumulated estimated data 141, the device or system side can grasp why the matters the user wants to confirm arose, and use this to improve the guidance voice.
When the estimated data 141 is stored together with the above-mentioned user data, the instruction data input to the input section 12, the guidance voice data output by the output section 13, the confirmation data 140 on which the estimated data 141 is based, and the current location and current time received by the GPS receiver, it becomes possible, for example, to check the relationship between the guidance voice and the confirmation items for each user. The guidance voice can then be optimized according to the characteristics of each user, and expressions that many users find difficult to understand can be examined. Furthermore, since the relationship between the user's information, the current location, the current time, and the confirmation items can be checked, it is also possible to identify points and routes whose guidance voice is difficult for many users to understand.
Because the information about the confirmation items estimated by the estimation unit 114 is classified into categories such as "content of the guidance voice," "the voice of the guidance voice itself," "related to place names," "related to the speed of the guidance voice," and "related to the pronunciation of the guidance voice," handling the stored items is easier when they are later utilized, for example when analyzing by category what kind of content users needed to confirm for a given guidance voice.
In addition, when a problem occurs, such as the user being unable to understand the guidance voice or information being missing from it, an attempt by the user to solve the problem by means other than the route search unit 111 can be extracted as a confirmation behavior. Specifically, when a user driving a vehicle stops on the roadside after listening to the guidance voice and operates a device such as a smartphone, or when the driver instructs a passenger to re-check the route using a mobile terminal, that action can be extracted as a confirmation behavior.
Since the information processing device 1 can be configured as a single terminal device 10 without using external equipment such as a server device, its configuration can be simplified.
The information processing device 1 can also recognize, from the voice uttered by the user, that the user needs to confirm something about the guidance voice, and can estimate the confirmation items from that voice. It therefore becomes possible to analyze, for each user, data on the points of the guidance voice that were difficult to understand, such as "the guidance voice was too fast to catch" or "the pronunciation was poor and I couldn't catch it," and to adjust the guidance voice so that it is optimal for each user.
Furthermore, the extraction unit 113 can extract actions other than the user's direct response to the guidance voice as confirmation behaviors, for example self-directed remarks such as "Where is (place name)?," "I don't really understand (place name)," "It was too fast to catch," or "The pronunciation was poor and I couldn't catch it."
Example 2: Next, the information processing device 1 according to Example 2 of this embodiment will be described with reference to FIG. 3. The same parts as in Example 1 described above are given the same reference numerals, and their explanation is omitted or simplified.
In Example 2, the information processing device 1 includes a terminal device 10 (first terminal device) and a mobile terminal 20 (second terminal device).
The terminal device 10 is a device that moves with the vehicle, as in Example 1, and has the same configuration as the terminal device 10 of Example 1 except that it does not have the detection section 112 and instead has a communication section 15.
The mobile terminal 20 is a device that moves with the vehicle (mobile object) and is provided so as to be able to communicate with the terminal device 10. The mobile terminal 20 is, for example, a smartphone or a tablet, and includes a control section 21 and a communication section 22. The control unit 21 is composed of a CPU equipped with memory such as RAM and ROM, and controls the entire mobile terminal 20. The control unit 21 includes a detection unit 211 and an information search unit 212. The detection unit 211 corresponds to the detection unit 112 in Example 1.
The information search unit 212 performs information searches independently of the route search unit 111 of the terminal device 10: route search processing (a vehicle route search performed independently of the route search unit 111), point search processing (checking the coordinates of a specific point), and keyword search processing (searching for a specific word).
The information search unit 212 may use a known search engine used on the World Wide Web, and the search results may be output as images or as audio by an output unit (not shown) provided in the mobile terminal 20.
The communication unit 22 is provided so as to be able to communicate with the communication unit 15 of the terminal device 10, so that the reaction data created by the detection unit 211 can be transmitted from the mobile terminal 20 to the terminal device 10.
In step S100, in addition to the user identification processing, cooperation processing of the mobile terminal 20 with the terminal device 10 is performed. Specifically, authentication is performed to determine whether the mobile terminal 20 is a terminal that can cooperate with the terminal device 10, and if cooperation is accepted, the terminal device 10 and the mobile terminal 20 become able to communicate. This allows the terminal device 10 to grasp the user's reactions detected by the detection unit 211 of the mobile terminal 20 and the content of the information searches performed by the information search unit 212. The process then advances to step S200.
In step S500, detection processing is performed: the information search section 212 of the mobile terminal 20 is monitored, and any operation of the information search section 212 that occurs after the guidance voice is output is detected as a user reaction. In this case, the confirmation behavior is that the user "performed a search action using the information search unit 212." This confirmation behavior is extracted by the extraction unit 113 as the confirmation data 140 when the reaction data created by the detection unit 211 includes data indicating that the information search unit 212 was operated for a route search, point search, keyword search, or the like. That is, in addition to the confirmation behaviors of Example 1, the confirmation behaviors include the user using the information search unit 212 to search for a vehicle route, to perform a point search checking the coordinates of a specific point, and to perform a keyword search for a specific word.
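The detection-and-extraction rule of step S500 in Example 2 can be sketched as a time-windowed filter over reported events. The event representation and the fixed time window below are assumptions; the patent only states that search operations occurring after the guidance voice is output are treated as reactions and extracted as confirmation behaviors.

```python
# Hedged sketch of Example 2's extraction: an operation of the information
# search unit 212 (route / point / keyword search) occurring after the
# guidance voice was output is extracted as a confirmation action.
# Event tuples and the 120-second window are illustrative assumptions.

SEARCH_KINDS = {"route_search", "point_search", "keyword_search"}

def extract_confirmation_actions(events, guidance_time, window_s=120):
    """events: list of (timestamp_s, kind) tuples from detection unit 211.
    Returns search operations within window_s seconds after the guidance."""
    return [
        (t, kind) for t, kind in events
        if kind in SEARCH_KINDS and guidance_time <= t <= guidance_time + window_s
    ]
```

A bounded window is one plausible way to tie a search action to a specific guidance voice; the patent itself does not specify how that association is made.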
With this configuration, when a problem occurs, such as the user being unable to understand the guidance voice or information being missing from it, an attempt by the user to solve the problem using the information search unit 212, which is independent of the route search unit 111, can be extracted as a confirmation behavior. Therefore, for example, if a user driving a vehicle stops on the roadside after listening to the guidance voice and searches for information using a device such as a smartphone, that action can be extracted as a confirmation behavior.
Furthermore, since the terminal device 10 and the mobile terminal 20 can be linked, it is possible to refer to the search word (keyword) input to the information search unit 212, the output method of the search result (i.e., audio or image), and so on. For example, an action in which the user performs a route search using the information search unit 212 and displays the result on a display unit (not shown) can be regarded as a confirmation behavior and extracted as the confirmation data 140; in this case, it can be inferred that the user was unable to visualize the route from the guidance voice alone and wanted to confirm it with images.
Similarly, an action in which the user performs a keyword search using the information search unit 212 can be regarded as a confirmation behavior and extracted as the confirmation data 140. If the search string of the keyword search is the name of a facility, a place name, or a road name, it can be presumed that the user was unable to understand that facility name, place name, or road name in the guidance voice. If the search string is written in hiragana or katakana, it can be presumed that the user does not know the written form of the facility name, place name, or road name. If the user searches for the same facility name, place name, or road name multiple times by keyword, it can be presumed that the user could think of multiple locations with the same sound as the name that was announced.
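The hiragana/katakana inference above can be sketched with standard Unicode block ranges (Hiragana U+3040–U+309F, Katakana U+30A0–U+30FF). The function names and the inference labels are illustrative assumptions; the character-range check itself is standard.

```python
# Illustrative sketch: if the keyword-search string is written only in kana,
# presume the user heard the name but could not identify its written form.
# The Unicode ranges are standard; the inference rule is a simplification.

def is_kana_only(query: str) -> bool:
    """True if every character is hiragana (U+3040-U+309F) or
    katakana (U+30A0-U+30FF)."""
    return all('\u3040' <= ch <= '\u30ff' for ch in query)

def infer_confusion(query: str) -> str:
    if is_kana_only(query):
        return "user knows the sound but not the written name"
    return "user could not catch the name in the guidance voice"
```

For example, a search for "しんじゅく" (kana) rather than "新宿" (kanji) suggests the user caught the sound of the place name but not its written form.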
Since the information processing device 1 of Example 2 is configured with the terminal device 10 and the mobile terminal 20, the load on each device can be reduced compared to an information processing device 1 configured with a single terminal device 10.
Example 3: Next, the information processing device 1 according to Example 3 of this embodiment will be described with reference to FIG. 4. The same parts as in Examples 1 and 2 described above are denoted by the same reference numerals, and their description is omitted or simplified.
In Example 3, the information processing device 1 includes a terminal device 10 (first terminal device), a mobile terminal 20 (second terminal device), and a server device 30. The terminal device 10 is a device that moves with the vehicle, as in Example 1, and has the same configuration as the terminal device 10 of Example 2 except that it does not include the extraction unit 113, the estimation unit 114, and the storage unit 14. The mobile terminal 20 has the same configuration as the mobile terminal 20 of Example 2.
The server device 30 is provided so as to be able to communicate with the terminal device 10 and the mobile terminal 20, and includes a control section 31, a storage section 32, and a communication section 33. The control unit 31 includes an extraction unit 310 and an estimation unit 311. The extraction unit 310, the estimation unit 311, and the storage unit 32 correspond to the extraction unit 113, the estimation unit 114, and the storage unit 14 in Examples 1 and 2, respectively, and have the same functions. The communication unit 33 is provided so as to be able to communicate with the communication unit 15 and the communication unit 22, and can transmit and receive the above-mentioned reaction data, confirmation data 140, and estimated data 141.
With this configuration, the load on each device can be reduced compared to an information processing device 1 configured with a single terminal device 10, or one configured with the terminal device 10 (first terminal device) and the mobile terminal 20 (second terminal device).
As a modification, the items that the user wants to confirm, estimated in the estimation processing, may include "detailed information on the facility." In this modification, the device is configured to be able to provide audio guidance on information about the destination facility, and actions such as "the user looked up detailed information about the facility using a smartphone or the like" are extracted as confirmation behaviors. From this confirmation behavior, it is estimated that the information the user needed (i.e., the information missing from the guidance voice) was "detailed information about the facility."

Abstract

When there is a point that a user wishes to confirm, because the user could not understand a guidance voice or the guidance voice was insufficient, this invention enables a device or system to grasp that situation and the cause of the point the user wishes to confirm, in order to improve the guidance voice. In this information processing device, a voice output unit outputs the guidance voice to be provided to the user. A detection unit detects the user's response to the guidance voice. An extraction unit extracts, from the responses detected by the detection unit, a confirmation action of the user pertaining to the guidance voice. An estimation unit estimates, on the basis of the confirmation action, the point the user wishes to confirm. A storage unit stores information about the point estimated by the estimation unit.

Description

Information processing device
The present invention relates to an information processing device.
Car navigation systems that provide route guidance using audio information are known (for example, see Patent Documents 1 and 2).
Patent Document 1 discloses a route guidance device that includes a guidance output section that outputs a guidance voice and a direction input section that serves as an input means. Patent Document 2 discloses a speech recognition device that includes a determination unit that determines a recognition-vocabulary comprehension level, indicating the extent to which a user understands speech-recognizable vocabulary, from the acquired content of the user's utterances.
Patent Document 1: JP 2008-261641 A
Patent Document 2: JP 2012-27487 A
In a route guidance device such as that described in Patent Document 1, a user who judges that the direction to proceed is uncertain even after listening to the guidance voice inputs a direction into the direction input section to query the system, and checks whether the recognized direction is correct. In this case, however, it is difficult for the system to grasp why the user judged the direction to be uncertain, and it is therefore difficult to improve the guidance voice that the user judged to be uncertain.
In a speech recognition device such as that described in Patent Document 2, a microphone collects speech representing the content of a user's utterance, a speech recognition unit performs speech recognition processing on the collected speech, and the recognition-vocabulary comprehension level is determined based on, for example, the number of timeouts counted during the speech recognition processing. The guidance is then changed according to the recognition-vocabulary comprehension level. However, even if the system can grasp whether the user was able to speak with it when determining the comprehension level, it is difficult for the system to grasp why the user was, or was not, able to speak.
One problem to be solved by the present invention is, when the user has something to confirm, for reasons such as not being able to understand the guidance voice or information being missing from it, to enable the device or system side to grasp that situation and the cause of the matter the user wants to confirm, so that this can be used to improve the guidance voice.
To solve the above problem, the invention according to claim 1 comprises: an audio output unit that outputs a guidance voice to be provided to a user; a detection unit that detects the user's reaction to the guidance voice; an extraction unit that extracts, from the reactions detected by the detection unit, the user's confirmation behavior regarding the guidance voice; an estimation unit that estimates, based on the confirmation behavior, the matter the user wants to confirm; and a storage unit that stores information about the matter estimated by the estimation unit.
The invention according to claim 10 is an information processing method executed by a computer, comprising: an audio output step of outputting a guidance voice to be provided to a user; a detection step of detecting the user's reaction to the guidance voice; an extraction step of extracting, from the reactions detected in the detection step, the user's confirmation behavior regarding the guidance voice; an estimation step of estimating, based on the confirmation behavior, the matter the user wants to confirm; and a storage step of storing information about the matter estimated in the estimation step.
The invention according to claim 11 is characterized in that the information processing method according to claim 10 is executed by a computer as an information processing program.
The invention according to claim 11 (as numbered in the original) is characterized in that the information processing program according to claim 10 is stored in a computer-readable storage medium.
FIG. 1 is a schematic configuration diagram of an information processing device according to an embodiment of the present invention. FIG. 2 is a flowchart of the operation of the information processing device shown in FIG. 1. FIG. 3 is a schematic configuration diagram of the information processing device in Example 2. FIG. 4 is a schematic configuration diagram of the information processing device in Example 3.
An information processing device according to an embodiment of the present invention will be described below. In the information processing device of the present invention, the audio output unit outputs a guidance voice to be provided to the user. The detection unit detects the user's reaction to the guidance voice. The extraction unit extracts, from the reactions detected by the detection unit, the user's confirmation behavior regarding the guidance voice. The estimation unit estimates, based on the confirmation behavior, the matter the user wants to confirm. The storage unit stores information about the matter estimated by the estimation unit. In this way, the extraction unit can extract the user's confirmation behavior from among the user's reactions detected by the detection unit, so the information processing device can grasp that the user needs some kind of confirmation regarding the guidance voice, that is, that a situation has occurred in which the user has something to confirm, for reasons such as not being able to understand the guidance voice or information being missing from it.
Furthermore, the estimation unit estimates the matter the user wants to confirm based on the confirmation behavior, and the storage unit can store information about the matter estimated by the estimation unit, so information about the estimated matter can be accumulated and utilized. By analyzing the accumulated information, the device or system side can grasp the cause of the matters the user wants to confirm regarding the guidance voice, and use this to improve the guidance voice.
The storage unit may be configured to classify and store the information about the matters estimated by the estimation unit according to predetermined conditions. In this way, the information can be divided into categories such as "content of the guidance voice," "the voice of the guidance voice itself," "related to place names," "related to the speed of the guidance voice," and "related to the pronunciation of the guidance voice," which makes the information easier to handle when it is later utilized, for example when analyzing by category what content users needed to confirm for a given guidance voice.
The information processing device may include a route search unit that searches for a route for a mobile object, and the guidance voice may be a voice that conveys to the user the route searched by the route search unit. The confirmation behavior need not be an action directed at the route search unit. In this way, when a problem occurs, such as the user being unable to understand the guidance voice or information being missing from it, an attempt by the user to solve the problem by means other than the route search unit can be extracted as a confirmation behavior. Specifically, when a user driving a mobile object stops the vehicle on the roadside after listening to the guidance voice and searches for information using a device such as a smartphone, or when the driver instructs a passenger to re-check the route using a mobile terminal, that action can be extracted as a confirmation behavior.
This makes it even easier for the information processing device to grasp that a situation has occurred in which the user wants to confirm something about the guidance voice, and contributes to analyzing the cause of such situations and to improving the guidance voice.
The information processing device may include a route search unit that searches for a route for a mobile object, the guidance voice may be a voice that conveys to the user the route searched by the route search unit, and the route search unit, the audio output unit, the detection unit, the extraction unit, the estimation unit, and the storage unit may be provided in a terminal device that moves together with the mobile object. In this way, the information processing device can be configured with a single terminal device without using external equipment such as a server device.
Alternatively, the route search unit, the audio output unit, the extraction unit, the estimation unit, and the storage unit may be provided in a first terminal device that moves together with the mobile object, and the detection unit may be provided in a second terminal device that moves together with the mobile object and can communicate with the first terminal device. In this way, the load on each device can be reduced compared to an information processing device configured with a single terminal device.
Alternatively, the route search unit and the audio output unit may be provided in a first terminal device that moves together with the mobile object, the detection unit may be provided in a second terminal device that moves together with the mobile object and can communicate with the first terminal device, and the extraction unit, the estimation unit, and the storage unit may be provided in a server device that can communicate with the first and second terminal devices. In this way, the load on each device can be reduced compared to an information processing device configured with a single terminal device, or one configured with a first and a second terminal device.
The second terminal device may include an information search unit independent of the route search unit. In this case, the confirmation behavior may be any of: the user using the information search unit to search for a route for the mobile object; the user using the information search unit to perform a point search checking the coordinates of a specific point; and the user using the information search unit to perform a keyword search for a specific word. In this way, when a problem occurs, such as the user being unable to understand the guidance voice or information being missing from it, an attempt by the user to solve the problem using the information search unit, which is independent of the route search unit, can be extracted as a confirmation behavior. Therefore, even if the user tries to solve the problem by searching for a route on a terminal device other than the one that originally performed the route search (for example, a smartphone), the information processing device is less likely to overlook that such a problem occurred, and the information accumulated in the storage unit can be used to improve the content of the guidance voice.
The confirmation behavior may also be the user uttering a voice in response to the guidance voice, and the estimation unit may estimate the matter the user wants to confirm based on the uttered voice. In this way, the information processing device can recognize from the user's voice that the user needs to confirm something about the guidance voice, and can estimate the matter from that voice. This makes it possible, for example, to analyze data for each user on points of the guidance voice that were difficult to understand, such as "the guidance voice was too fast to catch" or "the pronunciation was poor and I couldn't catch it," and to adjust subsequent guidance voices to be optimal for each user.
 Further, the guidance voice does not need to request a response from the user. In this way, the extraction unit can extract, as confirmation actions, actions other than the user responding to the guidance voice. For example, the user muttering to themselves, such as "Where is (place name)?", "I don't really understand (place name)", "It was too fast to catch", or "The pronunciation was unclear and I couldn't catch it", or the user spontaneously instructing a passenger, such as "Look up (place name) on your smartphone" or "Check the route again on your smartphone", can be extracted as a confirmation action; that is, what is extracted is the user spontaneously uttering a voice, not the user speaking to the information processing device in response to a request for a reply. Therefore, there is no need for the information processing device to deliberately prompt the driving user to speak, and it is possible to prevent such prompting from diverting the attention of the driving user or making the user feel annoyed.
 Further, an information processing method according to an embodiment of the present invention is an information processing method executed by a computer. In this information processing method, a guidance voice to be provided to a user is output in an audio output step. Then, in a detection step, the user's reaction to the guidance voice is detected. Then, in an extraction step, the user's confirmation action regarding the guidance voice is extracted from the reactions detected in the detection step. Then, in an estimation step, the item the user wants to confirm is estimated based on the confirmation action. Then, in a storage step, information about the item estimated in the estimation step is stored. In this way, the user's confirmation action can be extracted in the extraction step from the user's reactions detected in the detection step, so the occurrence of a confirmation action can be recognized. That is, it can be recognized that the content of the guidance voice required confirmation by the user. Furthermore, since the estimation step estimates the item the user wants to confirm based on the confirmation action, and the storage step stores information about the estimated item, information about the estimated item can be accumulated and utilized. Therefore, by analyzing the accumulated information, the device or system can grasp the cause of the item the user wanted to confirm about the guidance voice, which can be used to improve the guidance voice.
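 The chain of steps described above (audio output, detection, extraction, estimation, storage) can be sketched as a simple processing pipeline. This is a minimal sketch, not the claimed method itself; all function names, record fields, and the keyword rule are illustrative assumptions.

```python
# Minimal sketch of the pipeline: detect -> extract -> estimate -> store.
# All names, record fields, and the keyword rule below are assumptions.

def detect_reactions(events):
    """Detection step: every event observed after the guidance voice counts as a reaction."""
    return list(events)

def extract_confirmation_actions(reactions):
    """Extraction step: keep only reactions that look like confirmation actions."""
    return [r for r in reactions if r.get("kind") in ("utterance", "device_use")]

def estimate_item(action):
    """Estimation step: map a confirmation action to the item the user wants to confirm."""
    if action["kind"] == "device_use":
        return "content of the guidance voice"
    if "too fast" in action.get("text", ""):
        return "speed of the guidance voice"
    return "content of the guidance voice"

storage = []  # storage step: accumulated information about estimated items

def run_pipeline(events):
    for action in extract_confirmation_actions(detect_reactions(events)):
        storage.append(estimate_item(action))
    return storage
```

A steering event passes through detection but is not a confirmation action, so only the utterance reaches the estimation and storage steps.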
 Further, the above-described information processing method may be executed by a computer as an information processing program. In this way, when a situation occurs in which there is an item the user wants to confirm about the guidance voice, that situation can be grasped using the computer. In addition, the cause of the item the user wanted to confirm about the guidance voice can be grasped so that it can be used to improve the guidance voice.
 Further, the above-described information processing program may be stored in a computer-readable storage medium. In this way, the information processing program can be distributed on its own in addition to being incorporated into a device, and version upgrades and the like can be performed easily.
[Example 1]
 FIG. 1 is a schematic configuration diagram of an information processing device 1 according to Example 1 of the present embodiment. The information processing device 1 includes, for example, a terminal device 10 (first terminal device) that moves together with a vehicle (mobile object). The terminal device 10 is, for example, a navigation device installed in the vehicle, and includes a control unit 11, an input unit 12, an output unit 13 (audio output unit), and a storage unit 14.
 The control unit 11 is composed of, for example, a CPU equipped with memory such as RAM and ROM, and controls the entire terminal device 10. The control unit 11 includes a user identification unit 110, a route search unit 111, a detection unit 112, an extraction unit 113, and an estimation unit 114, and performs identification processing for identifying the user of the terminal device 10, route search processing for searching for a route for the vehicle, audio output processing for outputting information about the searched route as a guidance voice provided to the user, and the like.
 The control unit 11 also monitors various devices mounted on the vehicle and performs detection processing for detecting the user's reaction to the guidance voice. Here, the various devices monitored by the control unit 11 are, for example, a GPS receiver, various sensors such as an acceleration sensor and a gyro sensor, an in-vehicle camera, and a microphone.
 The control unit 11 also performs extraction processing for extracting, from the user's reactions detected by the detection processing, the user's confirmation action regarding the guidance voice, and estimation processing for estimating, based on the extracted confirmation action, a confirmation item, which is an item the user wants to confirm. The control unit 11 further performs classification processing for classifying confirmation items according to predetermined conditions, storage processing for saving the classified confirmation items in the storage unit 14, and the like. Details of the user's reactions, the user's confirmation actions, and the confirmation items will be described later in the explanation of the operation of the information processing device 1.
 The input unit 12 is composed of devices such as a microphone, input buttons, and a touch panel. The input unit 12 receives the user's instructions through the user's voice or through the user's operation of the input buttons or touch panel, and transmits a signal indicating the instruction to the control unit 11.
 The output unit 13 is composed of devices such as a speaker that outputs the guidance voice and an amplifier that drives the speaker. The output unit 13 outputs information about the vehicle route found by the route search processing as a guidance voice and conveys it to the user.
 The storage unit 14 is composed of, for example, a hard disk or non-volatile memory, and stores programs and data used by the control unit 11 to perform the above-described control (for example, map data necessary for the route search processing and voice data necessary for the audio output processing), as well as confirmation data 140 (described later) indicating confirmation actions, estimated data 141 (described later) indicating confirmation items, and the like.
 Next, an example of the operation (information processing method) of the information processing device 1 configured as described above will be described with reference to the flowchart in FIG. 2. Note that this information processing method can be executed as an information processing program on a computer equipped with a CPU or the like. This information processing program may also be stored in a computer-readable storage medium.
 First, when the terminal device 10 is started, the user identification unit 110 performs user identification processing (step S100). In this step, the user who uses the terminal device 10 is identified. At this time, the user may be identified automatically by matching an in-vehicle image captured by the in-vehicle camera against images of the vehicle's driver and passengers stored in advance in the storage unit 14; alternatively, the terminal device 10 may output a question to the driver and passengers via the output unit 13, request that the user input an answer to the question into the input unit 12, and identify the user based on the answer. Furthermore, when a user terminal such as a smartphone owned by the user has been registered in the terminal device 10 in advance, the user may be identified based on the user terminal being detected by the terminal device 10 via wireless means or the like, or being directly connected to the terminal device 10.
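 The last identification variant in step S100, matching a detected user terminal against terminals registered in advance, could look like the following sketch. The registry contents and device identifiers are hypothetical, not part of the described device.

```python
# Sketch of step S100: identify the user from a pre-registered paired terminal.
# The registry contents and device identifiers below are hypothetical.

REGISTERED_TERMINALS = {
    "phone-aaa": "user_A",
    "phone-bbb": "user_B",
}

def identify_user(detected_device_ids):
    """Return the first registered user whose terminal was detected, else None."""
    for device_id in detected_device_ids:
        if device_id in REGISTERED_TERMINALS:
            return REGISTERED_TERMINALS[device_id]
    return None  # fall back to asking a question via the output unit 13
```

Returning None models the fallback path in which the device instead questions the occupants.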
 Next, the control unit 11 determines whether an instruction for route search processing has been given (step S200). In this step, the input unit 12 is first monitored, and it is determined whether an instruction for performing route search processing has been input. Specifically, the instruction for performing route search processing is the user inputting a departure point and a destination into the input unit 12; it may be given by voice, such as "Find the route from 〇〇 (departure point) to ×× (destination)", or by operating the touch panel or input buttons. If the result of the determination is YES, the process advances to the next step S300; if NO, the process waits until such an instruction is input.
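 The YES/NO judgment in step S200 amounts to recognizing a departure point and a destination in the user's voice input. A minimal pattern-matching sketch follows; the phrase pattern is an assumption for illustration, not the device's actual recognizer.

```python
import re

# Sketch of step S200: detect a route-search instruction of the form
# "... from X to Y". The phrase pattern is an illustrative assumption.
ROUTE_REQUEST = re.compile(r"from (?P<origin>.+?) to (?P<dest>.+?)(?:\.|$)")

def parse_route_instruction(text):
    """Return (origin, destination) if the text is a route request, else None."""
    m = ROUTE_REQUEST.search(text)
    if m:
        return m.group("origin"), m.group("dest")
    return None  # no instruction: keep waiting (the NO branch of step S200)
```

A None result corresponds to the NO branch, in which the control unit keeps monitoring the input unit.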
 In step S300, the route search unit 111 performs route search processing. In this step, a route from the departure point input into the input unit 12 to the destination is searched for, and guidance information about that route is created. Since the content of route search processing is well known, a detailed explanation is omitted; using the map data stored in the storage unit 14 and the vehicle's current position and current time received by the GPS receiver, a route from the departure point to the destination is searched for, and guidance information about the found route is generated. When the route search processing is complete, step S300 ends and the process advances to step S400.
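 The well-known route search that step S300 relies on is typically a shortest-path search over a map graph; a textbook Dijkstra sketch is shown below. The graph nodes and travel costs are invented for illustration and do not come from the map data described above.

```python
import heapq

# Sketch of step S300: Dijkstra shortest-path search over an illustrative map graph.
# Node names and travel costs are invented for this example.
GRAPH = {
    "start":    [("junction", 2), ("bypass", 5)],
    "junction": [("goal", 4)],
    "bypass":   [("goal", 3)],
    "goal":     [],
}

def search_route(origin, destination, graph=GRAPH):
    """Return (cost, node list) of the cheapest route, or None if unreachable."""
    queue = [(0, origin, [origin])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == destination:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph[node]:
            heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return None
```

The returned node list would then be turned into guidance information for the audio output step.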
 Step S400 corresponds to the audio output step in the above-described embodiment. In this step, the control unit 11 performs audio output processing, and the output unit 13 outputs a guidance voice. The guidance voice is a voice provided to the user, and in this case conveys the guidance information generated in step S300 described above. Through this guidance voice, the vehicle route found by the route search unit is communicated to the user.
 The guidance voice may explain the route to the user using road names and the like, for example, "Proceed along 〇〇 Road (road name), enter the □□ Expressway (expressway name) at the ×× Interchange (interchange name), and exit at the △△ Interchange (interchange name)", or using place names, for example, "This route passes through ●● (place name) and heads toward ◎◎ (place name)". The guidance voice may also explain the route using facility names, for example, "Turn right at the intersection before ◇◇ (facility name), and turn left at the traffic light past ◆◆ (facility name)", or by combining road names, place names, and facility names.
 Note that the guidance voice may include a request for a response from the user, such as "If you did not understand the guidance, please reply that you did not understand"; however, from the viewpoint of avoiding diverting the attention of the driving user or annoying the user, it is more preferable that the guidance voice does not request a response from the user. When the guidance voice has been output, step S400 ends and the process advances to step S500.
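 The example guidance sentences above suggest simple template filling from route elements. The sketch below paraphrases the road-name style example; the template wording is an assumption, not the device's actual phrasing.

```python
# Sketch of building a guidance voice text from route elements, following the
# road-name style example above. The template wording is an assumption.

def build_guidance(road, entry_ic, expressway, exit_ic):
    """Fill the road/interchange placeholders of the example sentence."""
    return (f"Proceed along {road}, enter {expressway} at {entry_ic}, "
            f"and exit at {exit_ic}.")
```

The resulting string would be passed to the output unit 13 for speech synthesis or playback.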
 Step S500 is the detection step in the above-described embodiment. In this step, the detection unit 112 performs detection processing. Here, first, devices such as the GPS receiver, various sensors such as the acceleration sensor and gyro sensor, the in-vehicle camera, and the microphone are monitored. Through these devices, sounds, user actions, the behavior of the vehicle operated by the user, and the like that occur after the guidance voice is output are detected as the user's reactions. Each detected reaction is turned into reaction data indicating that reaction, and is temporarily saved in the storage unit 14 in a ring buffer format. In this way, in step S500, the detection unit 112 detects the user's reactions to the guidance voice, and those reactions are temporarily saved in the storage unit 14. The process then advances to step S600.
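 The temporary "ring buffer format" saving of reaction data can be sketched with a fixed-capacity deque that silently overwrites the oldest entries, in contrast to the permanent storage performed later in step S800. The capacity is an assumption.

```python
from collections import deque

# Sketch of step S500's temporary storage: a ring buffer in which the oldest
# reaction data is overwritten once capacity is reached. Capacity is an assumption.
class ReactionBuffer:
    def __init__(self, capacity=4):
        self._buf = deque(maxlen=capacity)

    def save(self, reaction):
        self._buf.append(reaction)  # silently drops the oldest entry when full

    def snapshot(self):
        """Return the currently held reaction data, oldest first."""
        return list(self._buf)
```

Overwritability is the point: reaction data is disposable unless the extraction step promotes it to confirmation data 140.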
 Step S600 corresponds to the extraction step in the above-described embodiment. In this step, the extraction unit 113 performs extraction processing. Here, first, the storage unit 14 is monitored for a predetermined time. The predetermined time may be, for example, about one to three minutes after the guidance voice is output. After this monitoring, it is determined whether data indicating a confirmation action by the user regarding the guidance voice is found among the temporarily saved reaction data.
 When data indicating a confirmation action is found, that data is extracted as confirmation data 140. In this way, the extraction unit 113 extracts the user's confirmation action regarding the guidance voice from the reactions detected by the detection unit 112. If confirmation data 140 could be extracted (YES), the process advances to step S700; if confirmation data 140 could not be extracted (NO), the operation of the terminal device 10 ends.
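 The extraction in step S600, scanning reaction data recorded within a fixed window after the guidance voice for entries that qualify as confirmation actions, could look like the following sketch. The window length matches the upper end of the one-to-three-minute figure above; the record field names and the set of qualifying kinds are assumptions.

```python
# Sketch of step S600: from timestamped reaction data, keep only entries that
# occurred within the monitoring window and whose kind marks a confirmation
# action. Field names and the kind set are illustrative assumptions.

WINDOW_SECONDS = 180  # upper end of the 1-3 minute monitoring window

CONFIRMATION_KINDS = {
    "vehicle_stopped_and_device_used",
    "utterance",
    "passenger_instruction",
}

def extract_confirmation_data(reactions, guidance_time, window=WINDOW_SECONDS):
    """Return reaction records inside the window that indicate a confirmation action."""
    return [
        r for r in reactions
        if 0 <= r["time"] - guidance_time <= window
        and r["kind"] in CONFIRMATION_KINDS
    ]
```

An empty result corresponds to the NO branch, where the terminal device simply ends its operation.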
 Here, a confirmation action extracted as confirmation data 140 is an action a user who has heard the guidance voice takes when some item the user wants to confirm about that guidance voice has arisen, that is, an action indicating that the user could not understand the guidance voice for some reason, that the guidance voice lacked some information, or the like. A confirmation action is, for example, the user "stopping the vehicle on the road shoulder and operating a device such as a smartphone"; this confirmation action is found and extracted by the extraction unit 113 when, for example, the reaction data created by the detection unit 112 includes data indicating that the vehicle has stopped together with video data of the user operating a device such as a smartphone.
 A confirmation action may also be the user "muttering to themselves", such as "Where is ●● (place name)?", "I don't really understand ●● (place name)", "It was too fast to catch", or "The pronunciation was unclear and I couldn't catch it", or "instructing a passenger", such as "Look up ●● (place name) on your smartphone" or "Check the route again on your smartphone"; these confirmation actions are found and extracted by the extraction unit 113 when, for example, the reaction data created by the detection unit 112 includes voice data indicating such muttering or instructions to a passenger.
 In this way, confirmation actions include actions other than re-searching for a route using the route search unit 111, that is, actions not directed at the route search unit 111 that originally performed the route search, and such actions are also extracted by the extraction unit 113. In addition, as described above, the user uttering a voice in response to the guidance voice is also extracted as a confirmation action, and such actions include spontaneous actions other than replying to the guidance voice, such as "muttering to oneself" and "instructing a passenger".
 Step S700 corresponds to the estimation step in this example. In this step, the estimation unit 114 performs estimation processing. Here, a confirmation item, which is an item the user wants to confirm, is estimated based on the confirmation data 140 (that is, the extracted confirmation action). For example, if the content of the confirmation data 140 is "stopping the vehicle on the road shoulder and operating a device such as a smartphone", the user has interrupted driving to operate a device such as a smartphone, so it is highly likely that the user is looking something up about the guidance voice. At the same time, the fact that the user is likely looking something up means that the user most likely heard the guidance voice itself. Therefore, when such confirmation data is extracted, the confirmation item is estimated to be the "content of the guidance voice".
 The same applies when confirmation data 140 with the content "instructing a passenger" is extracted. Since the user is giving instructions to a passenger, the user heard the guidance voice; however, because the user is having the passenger re-check the route using a separate device, namely a smartphone, it can be estimated that the user did not understand what the guidance voice conveyed. The same also applies when confirmation data 140 with the content "muttering to oneself", such as "Where is ●● (place name)?" or "I don't really understand ●● (place name)", is extracted.
 In contrast, when confirmation data 140 with the content "muttering to oneself" such as "It was too fast to catch" or "The pronunciation was unclear and I couldn't catch it" is extracted, it is highly likely that the user could not accurately hear the voice itself, let alone the content of the guidance voice. Therefore, when confirmation data 140 with such content is extracted, the confirmation item is estimated to be the "voice itself of the guidance voice". In this way, in step S700, the confirmation item may be estimated based on a confirmation action such as the user using another device after hearing the guidance voice, or uttering a voice in response to the guidance voice.
 Also in step S700, in both the case where the confirmation item is the "content of the guidance voice" and the case where it is the "voice itself of the guidance voice", the confirmation item may be specifiable based on the voice uttered by the user, and in that case such specification is performed. For example, if the content of the confirmation data 140 is "muttering to oneself" such as "Where is ●● (place name)?" or "I don't really understand ●● (place name)", the confirmation item is specified as "relating to a place name"; if the content of the confirmation data 140 is "muttering to oneself" such as "It was too fast to catch", the confirmation item is specified as "relating to the speed of the guidance voice"; and if the content of the confirmation data 140 is "muttering to oneself" such as "The pronunciation was unclear and I couldn't catch it", the confirmation item is specified as "relating to the pronunciation of the guidance voice".
 When these confirmation items have been estimated or specified, the estimation unit 114 creates estimated data 141 as information about the confirmation items. The estimated data 141 indicates the confirmation items. Step S700 then ends, and the process advances to step S800.
 Step S800 is the storage step in the above-described embodiment. In this step, the control unit 11 performs classification processing and storage processing. Here, first, the estimated data 141 is classified according to predetermined conditions. In this case, the estimated data 141 can be classified by, for example, deciding on a condition of sorting by the types of confirmation items described above ("content of the guidance voice", "voice itself of the guidance voice", "relating to a place name", "relating to the speed of the guidance voice", "relating to the pronunciation of the guidance voice") and assigning a classification code to each type.
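 Assigning a classification code per confirmation-item type, as described for step S800, might look like the following sketch. The code values are invented; the category labels paraphrase the types listed above.

```python
# Sketch of step S800's classification: attach a classification code to each
# piece of estimated data based on its confirmation-item type. Codes are invented.

CLASSIFICATION_CODES = {
    "content of the guidance voice": "C01",
    "voice itself of the guidance voice": "C02",
    "relating to a place name": "C03",
    "relating to the speed of the guidance voice": "C04",
    "relating to the pronunciation of the guidance voice": "C05",
}

def classify(estimated_items):
    """Return (code, item) pairs; unknown types get the catch-all code 'C99'."""
    return [(CLASSIFICATION_CODES.get(item, "C99"), item) for item in estimated_items]
```

Coding the categories this way makes later per-category analysis a simple group-by over the stored records.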
 When the classification of the estimated data 141 is complete, storage processing is performed to save the classified estimated data 141 in the storage unit 14. In this storage processing, unlike the temporary saving of the reaction data described above, the estimated data 141 is saved as data that cannot be overwritten. At this time, the estimated data 141 may simply be saved as is, but, for example, the data of the user identified by the user identification unit 110, the data of the instruction input into the input unit 12, the data of the guidance voice output by the output unit 13, the confirmation data 140 from which the estimated data 141 was created, and the current position and current time data received by the GPS receiver may be attached to the estimated data 141 and saved together with it. In this way, the reason why the guidance voice was not understood by the user, and the like, can be analyzed from various viewpoints. The operation of the terminal device 10 then ends.
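 The non-overwritable saving with attached metadata (user, instruction, guidance voice, confirmation data, position, time) could be sketched as building append-only records. All field names are assumptions for illustration.

```python
# Sketch of step S800's storage: append-only records bundling the estimated
# data with the contextual metadata listed above. Field names are assumptions.

PERSISTENT_STORE = []  # stands in for the non-volatile part of storage unit 14

def save_estimated_data(item, user, instruction, guidance, confirmation, position, time):
    """Append one non-overwritable record combining the estimate and its context."""
    record = {
        "confirmation_item": item,
        "user": user,
        "instruction": instruction,
        "guidance_voice": guidance,
        "confirmation_data": confirmation,
        "position": position,
        "time": time,
    }
    PERSISTENT_STORE.append(record)  # never overwritten, unlike the ring buffer
    return record
```

Keeping the context alongside each estimate is what makes the later per-user and per-location analyses possible.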
 According to Example 1, the extraction unit 113 can extract the user's confirmation action from the user's reactions detected by the detection unit 112, so the information processing device 1 can recognize that the user needed to confirm something about the guidance voice, that is, that a situation occurred in which there was an item the user wanted to confirm because, for example, the user could not understand the guidance voice or the guidance voice lacked some information. In addition, the estimation unit 114 estimates the confirmation item (item) the user wants to confirm based on the confirmation action, and the storage unit can store the estimated data 141 (information about the confirmation item estimated by the estimation unit 114), so confirmation items can be accumulated and utilized. Therefore, by analyzing the accumulated estimated data 141, the device or system can grasp the cause of the item the user wanted to confirm about the guidance voice, which can be used to improve the guidance voice.
 In this case, if the above-described user data, the data of the instruction input into the input unit 12, the data of the guidance voice output by the output unit 13, the confirmation data 140 from which the estimated data 141 was created, and the current position and current time data received by the GPS receiver are attached to the estimated data 141, it becomes possible, for example, to examine the relationship between the guidance voice and the confirmation items for each user; accordingly, the guidance voice can be optimized to suit the characteristics of each user, and expressions in the guidance voice that many users find difficult to understand can be investigated. It also becomes possible to examine the relationships among the user's information, the current position and current time, and the confirmation items, so points and routes at which the guidance voice is difficult for many users to understand can also be identified.
 Furthermore, the information about the confirmation items estimated by the estimation unit 114 can be divided into categories such as "content of the guidance voice", "voice itself of the guidance voice", "relating to a place name", "relating to the speed of the guidance voice", and "relating to the pronunciation of the guidance voice". Therefore, when the items stored in the storage unit 14 are utilized in the future, for example when analyzing, category by category, what content users needed to confirm about a given guidance voice, handling of those items can be made easier.
 In addition, when a problem occurs such as the user not being able to understand the guidance voice or the guidance voice lacking some information as described above, the user's attempt to resolve the problem by means other than using the route search unit 111 can be extracted as a confirmation action. Specifically, when the user driving the vehicle stops the vehicle on the road shoulder after hearing the guidance voice and operates a device such as a smartphone, or when the driver instructs a passenger to re-check the route using a mobile terminal, that is, when the user attempts to resolve the above-described problem by means other than using the route search unit 111, that action can be extracted as a confirmation action. This makes it even easier for the information processing device 1 to recognize that a situation has occurred in which there is an item the user wants to confirm about the guidance voice, which can contribute to analyzing the cause of such items and to improving the guidance voice.
 Furthermore, since the information processing device 1 can be configured from a single terminal device 10 without using external equipment such as a server device, the configuration of the information processing device 1 can be kept simple.
 Furthermore, the information processing device 1 can recognize, from the voice uttered by the user, that the user needs to confirm something about the guidance voice, and can estimate the confirmation item based on that utterance. Therefore, data on the points that were hard to understand in the guidance voice, for example "the guidance voice was too fast to catch" or "the pronunciation of the guidance voice was too poor to catch", can be analyzed for each user, and subsequent guidance voices can be adjusted so that they are optimal for each user.
 Furthermore, according to the present example, the extraction unit 113 can extract, as confirmation behaviors, behaviors other than the user replying to the guidance voice. For example, the user muttering to himself or herself "Where is ●● (place name)?", "I'm not sure about ●● (place name)", "It was too fast to catch", or "The pronunciation was too poor to catch", or the user spontaneously instructing a passenger "Look up ●● (place name) on your smartphone" or "Check the route again on your smartphone", can be extracted as confirmation behaviors. In other words, what is extracted is a voice uttered spontaneously, not speech addressed to the information processing device 1 in response to a request for a reply. Therefore, the information processing device 1 does not need to deliberately prompt the driving user to speak, which avoids distracting the user from driving or making the user feel bothered by such prompts.
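 As a rough illustration of how spontaneous utterances such as those above might be flagged as confirmation behaviors, the recognized text could be matched against characteristic phrases. The phrase list and function names below are assumptions made for this sketch; the publication does not disclose a specific matching method.

```python
# Hypothetical phrase-matching sketch: flag spontaneous utterances that
# suggest the user needed to confirm something about the guidance voice.
CONFIRMATION_PHRASES = [
    "where is",            # "Where is XX (place name)?"
    "not sure about",      # "I'm not sure about XX"
    "too fast",            # "It was too fast to catch"
    "pronunciation",       # "The pronunciation was too poor to catch"
    "look up",             # instruction to a passenger
    "check the route",     # instruction to a passenger
]

def is_confirmation_utterance(text: str) -> bool:
    """True if the utterance matches any characteristic confirmation phrase."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in CONFIRMATION_PHRASES)

def extract_confirmations(utterances):
    """Keep only utterances that look like confirmation behaviors."""
    return [u for u in utterances if is_confirmation_utterance(u)]
```

In practice such matching would operate on speech-recognition output (in the user's language) and could be replaced by a learned classifier; the list above only conveys the idea.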
[Example 2]
 Next, the information processing device 1 according to Example 2 of the present embodiment will be described with reference to FIG. 3. The same parts as in Example 1 described above are given the same reference numerals, and their description is omitted or simplified.
 As shown in FIG. 3, the information processing device 1 according to the present example includes a terminal device 10 (first terminal device) and a mobile terminal 20 (second terminal device).
 The terminal device 10 is a device that moves with the vehicle as in Example 1, and has the same configuration as the terminal device 10 of Example 1 except that it does not have the detection unit 112 and does have a communication unit 15.
 The mobile terminal 20 is a device that moves with the vehicle (mobile object) and is provided so as to be able to communicate with the terminal device 10. The mobile terminal 20 is, for example, a device such as a smartphone or tablet, and includes a control unit 21 and a communication unit 22.
 The control unit 21 is composed of, for example, a CPU with memory such as RAM and ROM, and controls the mobile terminal 20 as a whole. The control unit 21 includes a detection unit 211 and an information search unit 212. The detection unit 211 corresponds to the detection unit 112 in Example 1.
 The information search unit 212 performs information searches independently of the route search unit 111 of the terminal device 10, and is provided so as to be able to execute, independently of the route search unit 111, route search processing (route search) that searches for a route for the vehicle, point search processing (point search) that looks up the coordinates of a specific point, and keyword search processing (keyword search) that searches on a specific word.
 Note that the information search unit 212 may use a known search engine used on the World Wide Web, and the search results may be output as images or audio by an output unit (not shown) provided in the mobile terminal 20.
 The communication unit 22 is provided so as to be able to communicate with the communication unit 15 of the terminal device 10, so that the reaction data created by the detection unit 211 can be transmitted from the mobile terminal 20 to the terminal device 10.
 Next, the operation (information processing method) of the information processing device 1 with this configuration will be described. Although some of the processing is performed by a different device than in Example 1, the flow (steps) of the operation is the same as in Example 1, and is therefore described with reference to FIG. 2.
 First, in step S100, in addition to the user identification processing, processing for linking the mobile terminal 20 with the terminal device 10 is performed. Here, authentication is performed to determine whether the mobile terminal 20 is a terminal that can be linked with the terminal device 10; if the link is approved, the terminal device 10 and the mobile terminal 20 become able to communicate. This allows the terminal device 10 to grasp the user's reactions detected by the detection unit 211 of the mobile terminal 20 and the content of information searches performed by the information search unit 212. The process then advances to step S200.
 Steps S200 to S400 are the same as in Example 1, and their description is omitted. In step S500, detection processing is performed. Here, in addition to the configuration of Example 1, the information search unit 212 of the mobile terminal 20 is monitored, and operation of the information search unit 212 occurring after the guidance voice has been output is detected as a user reaction.
 In step S600, in addition to the examples of Example 1, the user "having performed a search action using the information search unit 212" is treated as a confirmation behavior. This confirmation behavior is found by the extraction unit 113 and extracted as confirmation data 140 when, for example, the reaction data created by the detection unit 211 contains data indicating that the information search unit 212 operated for a route search, point search, or keyword search. That is, in addition to the confirmation behaviors of Example 1, the confirmation behaviors include the user using the information search unit 212 to search for a vehicle route, the user using the information search unit 212 to perform a point search to look up the coordinates of a specific point, and the user using the information search unit 212 to perform a keyword search on a specific word.
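 The extraction performed in step S600 might look like the following sketch; the reaction-data schema (dictionaries with "type" and "after_guidance" fields) is an assumption made for illustration, not the disclosed format.

```python
# Hypothetical sketch of step S600: pull search actions out of the
# reaction data sent from the mobile terminal 20. The dict schema is assumed.
SEARCH_TYPES = {"route_search", "point_search", "keyword_search"}

def extract_confirmation_data(reaction_data):
    """Return reactions indicating the information search unit 212 operated
    after the guidance voice was output (treated as confirmation behaviors)."""
    return [
        r for r in reaction_data
        if r.get("type") in SEARCH_TYPES and r.get("after_guidance", False)
    ]
```

Filtering on both the action type and the timing relative to the guidance voice keeps unrelated terminal activity (for example a volume change, or a search made before guidance) out of the confirmation data 140.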
 According to Example 2, when a problem occurs such as the user being unable to understand the guidance voice or the guidance voice lacking some information, the fact that the user tried to solve the problem using the information search unit 212, which is independent of the route search unit 111, can be extracted as a confirmation behavior. Therefore, even when, for example, a user driving the vehicle stops the vehicle on the shoulder after hearing the guidance voice and searches for information on a device such as a smartphone, that behavior can be extracted as a confirmation behavior. Moreover, since the terminal device 10 and the mobile terminal 20 can be linked, the confirmation item can be estimated with reference to the search words (keywords) entered into the information search unit 212 and the output method of the search results (that is, audio or image), so that what the user wanted to confirm can be grasped in more detail.
 Therefore, for example, the user performing a route search with the information search unit 212 and displaying the result on a display unit (not shown) can be treated as a confirmation behavior and extracted as confirmation data 140; in this case it can be inferred that the user could not picture the route from the guidance voice and wanted to confirm the route with an image. Likewise, the user performing a keyword search with the information search unit 212 can be treated as a confirmation behavior and extracted as confirmation data 140; in this case, if the search string of the keyword search is the name of a facility, a place name, or the name of a road, it can be inferred that the user could not understand the facility name, place name, or road name in the guidance voice.
 Moreover, in this case, especially if the keyword search string is in hiragana or katakana, it can be inferred that the user did not know the facility name, place name, or road name. Also, if the user has performed keyword searches for the same facility name, place name, or road name multiple times, it can be inferred that the user can think of multiple locations with the same sound as the announced facility name, place name, or road name.
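 The inferences described here can be sketched as below; the function names and the repetition threshold are assumptions for illustration. Hiragana occupies U+3040 to U+309F and katakana U+30A0 to U+30FF in Unicode, which makes an all-kana check straightforward.

```python
# Hypothetical sketch of the inferences described above: an all-kana query
# suggests the user did not know the written name; a repeated query suggests
# the user can think of several same-sounding places.
from collections import Counter

def is_all_kana(text: str) -> bool:
    """True if every character is hiragana or katakana (spaces ignored)."""
    chars = [c for c in text if not c.isspace()]
    return bool(chars) and all(
        "\u3040" <= c <= "\u309f" or "\u30a0" <= c <= "\u30ff" for c in chars
    )

def infer_from_searches(queries):
    """Map each search query to an inference about why the user searched."""
    counts = Counter(queries)
    inferences = {}
    for query, n in counts.items():
        if is_all_kana(query):
            inferences[query] = "user likely did not know the name"
        elif n > 1:
            inferences[query] = "user may know multiple same-sounding places"
    return inferences
```

A query typed in kana (e.g. うえの) rather than kanji (上野) is treated as a sign the user heard the name but could not write it, matching the inference in the text above.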
 In this way, even when the user tries to solve the problem by searching for information on a mobile terminal 20 such as a smartphone (a terminal device other than the terminal device 10 that originally performed the route search), it becomes difficult for the information processing device 1 to overlook that the problem occurred, and the estimated data 141 accumulated in the storage unit 14 can be utilized to improve the content of the guidance voice. Furthermore, since the information processing device 1 can be configured from the terminal device 10 and the mobile terminal 20, the load on each device can be reduced compared with an information processing device 1 configured from a single terminal device 10.
[Example 3]
 Next, the information processing device 1 according to Example 3 of the present embodiment will be described with reference to FIG. 4. The same parts as in Examples 1 and 2 described above are given the same reference numerals, and their description is omitted or simplified.
 As shown in FIG. 4, the information processing device 1 according to the present example includes a terminal device 10 (first terminal device), a mobile terminal 20 (second terminal device), and a server device 30.
 The terminal device 10 is a device that moves with the vehicle as in Example 1, and has the same configuration as the terminal device 10 of Example 2 except that it does not include the extraction unit 113, the estimation unit 114, or the storage unit 14. The mobile terminal 20 has the same configuration as the mobile terminal 20 of Example 2.
 The server device 30 is provided so as to be able to communicate with the terminal device 10 and the mobile terminal 20, and includes a control unit 31, a storage unit 32, and a communication unit 33. The control unit 31 includes an extraction unit 310 and an estimation unit 311. The extraction unit 310, the estimation unit 311, and the storage unit 32 correspond to the extraction unit 113, the estimation unit 114, and the storage unit 14 in Examples 1 and 2, respectively, and have the same functions as those components. The communication unit 33 is provided so as to be able to communicate with the communication unit 15 and the communication unit 22, and can transmit and receive the above-described reaction data, confirmation data 140, and estimated data 141.
 With this arrangement, the load on each device can be reduced compared with an information processing device 1 configured from a single terminal device 10 or an information processing device 1 configured from the terminal device 10 (first terminal device) and the mobile terminal 20 (second terminal device).
 Note that the present invention is not limited to the above examples. That is, those skilled in the art can implement various modifications based on conventionally known knowledge without departing from the gist of the present invention. Such modifications are, of course, included within the scope of the present invention as long as they still comprise the information processing device of the present invention.
 For example, the items the user wants to confirm that are estimated in the estimation step may include "detailed information about a facility". Specifically, first, in the audio output step, information about the destination facility is announced by voice. Then, when information about the facility is announced, a behavior such as "the user looked up detailed information about the facility on a smartphone or the like" is extracted as a confirmation behavior. From that confirmation behavior, it is estimated that the information the user needed (or that was missing from the guidance voice) was "detailed information about the facility". In this way, the device or system can grasp more of the information that the user needed (or that was missing from the voice guidance).
 1   Information processing device
 13  Output unit (audio output unit)
 14  Storage unit
 112 Detection unit
 113 Extraction unit
 114 Estimation unit

Claims (12)

  1.  An information processing device comprising:
     an audio output unit that outputs a guidance voice to be provided to a user;
     a detection unit that detects a reaction of the user to the guidance voice;
     an extraction unit that extracts, from among the reactions detected by the detection unit, a confirmation behavior of the user regarding the guidance voice;
     an estimation unit that estimates an item the user wants to confirm based on the confirmation behavior; and
     a storage unit that stores information about the item estimated by the estimation unit.
  2.  The information processing device according to claim 1, wherein the storage unit classifies and stores the information about the items estimated by the estimation unit according to predetermined conditions.
  3.  The information processing device according to claim 1, further comprising a route search unit that searches for a route of a mobile object,
     wherein the guidance voice is a voice that conveys to the user the route of the mobile object searched for by the route search unit, and
     the confirmation behavior is not a behavior directed toward the route search unit.
  4.  The information processing device according to claim 1, further comprising a route search unit that searches for a route of a mobile object,
     wherein the guidance voice is a voice that conveys to the user the route of the mobile object searched for by the route search unit, and
     the route search unit, the audio output unit, the detection unit, the extraction unit, the estimation unit, and the storage unit are provided in a terminal device that moves together with the mobile object.
  5.  The information processing device according to claim 1, further comprising a route search unit that searches for a route of a mobile object,
     wherein the guidance voice is a voice that conveys to the user the route of the mobile object searched for by the route search unit,
     the route search unit, the audio output unit, the extraction unit, the estimation unit, and the storage unit are provided in a first terminal device that moves together with the mobile object, and
     the detection unit is provided in a second terminal device that moves together with the mobile object and can communicate with the first terminal device.
  6.  The information processing device according to claim 1, further comprising a route search unit that searches for a route of a mobile object,
     wherein the guidance voice is a voice that conveys to the user the route of the mobile object searched for by the route search unit,
     the route search unit and the audio output unit are provided in a first terminal device that moves together with the mobile object,
     the detection unit is provided in a second terminal device that moves together with the mobile object and can communicate with the first terminal device, and
     the extraction unit, the estimation unit, and the storage unit are provided in a server device that can communicate with the first terminal device and the second terminal device.
  7.  The information processing device according to claim 5 or 6, wherein the second terminal device includes an information search unit independent of the route search unit, and
     the confirmation behavior is any of: the user using the information search unit to search for a route of the mobile object; the user using the information search unit to perform a point search to look up the coordinates of a specific point; and the user using the information search unit to perform a keyword search on a specific word.
  8.  The information processing device according to claim 1, wherein the confirmation behavior is the user uttering a voice in response to the guidance voice, and
     the estimation unit estimates the item based on the voice uttered by the user.
  9.  The information processing device according to claim 1, wherein the guidance voice does not request a response from the user.
  10.  An information processing method executed by a computer, the method comprising:
     an audio output step of outputting a guidance voice to be provided to a user;
     a detection step of detecting a reaction of the user to the guidance voice;
     an extraction step of extracting, from among the reactions detected in the detection step, a confirmation behavior of the user regarding the guidance voice;
     an estimation step of estimating an item the user wants to confirm based on the confirmation behavior; and
     a storage step of storing information about the item estimated in the estimation step.
  11.  An information processing program that causes a computer to execute the information processing method according to claim 10.
  12.  A computer-readable storage medium storing the information processing program according to claim 11.
PCT/JP2022/019144 2022-04-27 2022-04-27 Information processing device WO2023209888A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/019144 WO2023209888A1 (en) 2022-04-27 2022-04-27 Information processing device


Publications (1)

Publication Number Publication Date
WO2023209888A1

Family

ID=88518361

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/019144 WO2023209888A1 (en) 2022-04-27 2022-04-27 Information processing device

Country Status (1)

Country Link
WO (1) WO2023209888A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09329456A (en) * 1996-06-12 1997-12-22 Alpine Electron Inc Navigation system
JP2004053541A (en) * 2002-07-24 2004-02-19 Mazda Motor Corp Route guidance apparatus, route guidance method, and program for route guidance
JP2010038821A (en) * 2008-08-07 2010-02-18 Sharp Corp Information processor and information processing method
US20220005469A1 (en) * 2018-09-27 2022-01-06 Bayerische Motoren Werke Aktiengesellschaft Providing Interactive Feedback, on a Spoken Announcement, for Vehicle Occupants

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MORIMOTO, Yosuke; MORI, Hiroki: "An automated speech guidance that is aware of listener's response", Materials from the 87th Language/Speech Understanding and Dialogue Processing Study Group (SIG-SLUD-B902), 20 November 2019, pages 99-100, XP093103173 *


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22940166

Country of ref document: EP

Kind code of ref document: A1