CN110562260A - Dialogue system and dialogue processing method

Dialogue system and dialogue processing method

Info

Publication number
CN110562260A
Authority
CN
China
Prior art keywords
vehicle
passenger
utterance
information
dialog
Prior art date
Legal status
Pending
Application number
CN201811497854.XA
Other languages
Chinese (zh)
Inventor
朴贞美
石东熙
申东洙
李廷馣
金佳熙
金宣我
卢熙真
金桂润
Current Assignee
Hyundai Motor Co
Kia Corp
Original Assignee
Hyundai Motor Co
Kia Motors Corp
Priority date
Filing date
Publication date
Priority claimed from KR1020180056497A (KR20190131741A)
Priority claimed from KR1020180067127A (KR102562227B1)
Priority claimed from KR1020180073824A (KR20200001188A)
Priority claimed from KR1020180077027A (KR20200004054A)
Application filed by Hyundai Motor Co and Kia Motors Corp
Publication of CN110562260A

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60W: CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W40/00: Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models
    • B60W40/08: Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models, related to drivers or passengers
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00: Speaker identification or verification techniques
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00, characterised by the analysis technique
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00, specially adapted for particular use
    • G10L25/51: Speech or voice analysis techniques specially adapted for particular use, for comparison or discrimination

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Automation & Control Theory (AREA)
  • Mathematical Physics (AREA)
  • Transportation (AREA)
  • Mechanical Engineering (AREA)
  • Navigation (AREA)

Abstract

A dialogue system for a vehicle may include: an input processor configured to receive a conversation between occupants of a vehicle including a driver and at least one passenger, detect vehicle operation information, identify the at least one passenger based on the conversation between the occupants or the vehicle operation information, generate passenger number information estimating a change in the number of passengers in the vehicle when the vehicle reaches a stop point based on the conversation between the occupants, and acquire a pre-utterance message according to the passenger number information; and a result processor configured to output a pre-utterance according to the pre-utterance message.

Description

Dialogue system and dialogue processing method
Technical Field
Embodiments of the present disclosure relate generally to a dialogue system and a dialogue processing method, and more particularly, to a dialogue system configured to provide information or services required by a user by recognizing the user's intention through dialogue with the user, and to a corresponding dialogue processing method.
Background
Many conventional audio-video navigation (AVN) devices for vehicles have relatively small screens and buttons, similar to mobile devices. Such small screens and buttons may inconvenience the user when the device provides visual information or receives user input.
In particular, there is a serious safety risk if, while driving, the user takes his or her hands off the steering wheel or looks away from the road to operate the device or view the displayed information.
Therefore, when a dialogue system is implemented in a vehicle, it should be able to recognize the user's intention through dialogue with the user and provide the information or services the user requires in a more convenient and safer manner.
Disclosure of Invention
Accordingly, an aspect of the present disclosure provides a dialogue system, a vehicle having the dialogue system, and a dialogue processing method capable of providing a service conforming to a user's real intention or a service required by the user by accurately recognizing the user's intention based on various information such as a dialogue with the user, vehicle state information, and user information.
Another aspect of the present disclosure provides a dialogue system, a vehicle having the dialogue system, and a dialogue processing method capable of identifying passengers based on a dialogue between passengers in the vehicle while driving and managing a change in the number of passengers in the vehicle.
Additional aspects of the disclosure will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
According to an embodiment of the present disclosure, a dialogue system for a vehicle may include: an input processor configured to receive a conversation between occupants of a vehicle including a driver and at least one passenger, detect vehicle operation information, identify the at least one passenger based on the conversation between the occupants or the vehicle operation information, generate passenger number information estimating a change in the number of passengers in the vehicle when the vehicle reaches a stop point based on the conversation between the occupants, and acquire a pre-utterance message according to the passenger number information; and a result processor configured to output a pre-utterance according to the pre-utterance message.
The pre-utterance message may indicate at least one of: the likelihood of each of the at least one passenger leaving the vehicle at the stop point, the likelihood of each of the at least one passenger boarding the vehicle again after leaving the vehicle at the stop point, and the likelihood of a potential passenger boarding the vehicle at the stop point.
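To make these likelihoods concrete, the following is a minimal sketch of how the passenger number information might be structured. All class, field, and method names are illustrative assumptions, not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class PassengerEstimate:
    """Per-passenger likelihoods carried by the passenger number information.

    Hypothetical structure; probabilities are assumed to lie in [0.0, 1.0].
    """
    passenger_id: str
    p_exit_at_stop: float        # likelihood of leaving the vehicle at the stop point
    p_reboard_after_exit: float  # likelihood of boarding again after leaving

@dataclass
class PassengerCountInfo:
    """Estimated change in the number of passengers at the next stop point."""
    stop_point: str
    estimates: List[PassengerEstimate] = field(default_factory=list)
    p_new_boarding: float = 0.0  # likelihood of a potential passenger boarding

    def expected_change(self) -> float:
        """Expected net change in the passenger count at the stop point."""
        expected_exits = sum(e.p_exit_at_stop * (1.0 - e.p_reboard_after_exit)
                             for e in self.estimates)
        return self.p_new_boarding - expected_exits
```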
The input processor may include a voice input processor configured to determine whether at least one passenger boards the vehicle based on one or more voice features of the voice of the at least one passenger included in the conversation between the occupants; and a context information processor configured to determine whether the at least one passenger gets on the vehicle based on the vehicle operation information.
When it is determined that at least one passenger is boarding the vehicle, the input processor may acquire a pre-utterance message corresponding to the boarding passenger, receive an utterance of the passenger related to the pre-utterance message, and identify the passenger by applying a natural language understanding algorithm to that utterance. When it is determined that at least one passenger is not boarding the vehicle, the input processor may acquire a pre-utterance message corresponding to the passenger not boarding the vehicle, receive an utterance of the driver related to the pre-utterance message, and verify the presence of the passenger by applying a natural language understanding algorithm to the driver's utterance.
The input processor may determine a likelihood of each of the at least one passenger exiting the vehicle at the stop point and a likelihood of each of the at least one passenger boarding the vehicle again after exiting at the stop point by applying a natural language understanding algorithm to the dialogue between the occupants, and generate the passenger number information based on the determined likelihoods.
The input processor may receive a call session in the vehicle, determine a likelihood of a potential passenger boarding the vehicle at the stop point by applying a natural language understanding algorithm to the received call session, and generate the passenger number information based on that likelihood.
After the vehicle leaves the stop point, the input processor may determine the change in the number of passengers in the vehicle based on the dialogue between the occupants and the vehicle operation information, compare the change estimated from the passenger number information with the determined change, and acquire the pre-utterance message based on the comparison result.
After the vehicle leaves the stop point, the input processor may acquire the pre-utterance message to determine the change in the number of passengers in the vehicle, receive an utterance of at least one passenger related to the pre-utterance message, and determine the change in the number of passengers in the vehicle by applying a natural language understanding algorithm to the utterance of the at least one passenger.
The dialogue system may further include a storage device configured to store the travel-related information of the vehicle and the passenger information of each of the at least one passenger when the vehicle stops traveling.
The passenger information may include at least one of passenger identification information, voice feature information, seat position information, boarding time information, boarding location information, departure time information, or departure location information.
The input processor may receive the dialogue between the occupants and the vehicle operation information, determine whether at least one passenger boards the vehicle based on the dialogue and the vehicle operation information, determine whether a feature of each of the at least one passenger corresponds to the passenger information, and acquire the pre-utterance message by verifying whether a first passenger having a feature corresponding to the passenger information participated in a previous trip.
The input processor may receive an utterance of at least one passenger related to the pre-utterance message, verify whether the first passenger participated in a previous trip by applying a natural language understanding algorithm to the passenger's utterance, and generate the passenger number information based on the dialogue between the occupants and the passenger information when the first passenger participated in the previous trip.
Further, according to an embodiment of the present disclosure, a dialogue processing method for a vehicle may include: receiving a dialogue between occupants of a vehicle including a driver and at least one passenger; detecting vehicle operation information; identifying the at least one passenger based on the dialogue between the occupants or the vehicle operation information; generating passenger number information that estimates a change in the number of passengers in the vehicle when the vehicle reaches a stop point based on the dialogue between the occupants; acquiring a pre-utterance message according to the passenger number information; and outputting a pre-utterance according to the pre-utterance message.
The pre-utterance message may indicate at least one of: the likelihood of each of the at least one passenger leaving the vehicle at the stop point, the likelihood of each of the at least one passenger again boarding the vehicle after leaving the vehicle at the stop point, and the likelihood of a potential passenger boarding the vehicle at the stop point.
The dialog processing method may further include: determining whether at least one passenger boards the vehicle based on one or more voice features of the voice of the at least one passenger included in the conversation between the occupants; and determining whether the at least one passenger boards the vehicle based on the vehicle operation information.
The dialogue processing method may further include: when it is determined that at least one passenger is boarding the vehicle, acquiring a pre-spoken message corresponding to the at least one passenger boarding the vehicle; receiving an utterance of at least one passenger related to the pre-utterance message; and identifying the at least one passenger by applying a natural language understanding algorithm to the utterance of the at least one passenger; and when it is determined that the at least one passenger is not boarding the vehicle, acquiring a pre-spoken message corresponding to the at least one passenger not boarding the vehicle; receiving an utterance of a driver related to the pre-utterance message; and verifying the presence of the at least one passenger by applying a natural language understanding algorithm to the utterance of the driver.
The dialogue processing method may further include: determining a likelihood of each of the at least one passenger exiting the vehicle at the stop point and a likelihood of each of the at least one passenger boarding the vehicle again after exiting at the stop point by applying a natural language understanding algorithm to the dialogue between the occupants; and generating the passenger number information based on the determined likelihoods.
The dialogue processing method may further include: receiving a call session in the vehicle; determining a likelihood of a potential passenger boarding the vehicle at the stop point by applying a natural language understanding algorithm to the received call session; and generating the passenger number information based on that likelihood.
The dialogue processing method may further include: determining the change in the number of passengers in the vehicle based on the dialogue between the occupants and the vehicle operation information after the vehicle leaves the stop point; comparing the change estimated from the passenger number information with the determined change; and acquiring a pre-utterance message based on the comparison result.
The dialogue processing method may further include: after the vehicle leaves the stop point, acquiring a pre-utterance message to determine the change in the number of passengers in the vehicle; receiving an utterance of at least one passenger related to the pre-utterance message; and determining the change in the number of passengers in the vehicle by applying a natural language understanding algorithm to the utterance of the at least one passenger.
The dialogue processing method may further include: the travel-related information of the vehicle and the passenger information of each of the at least one passenger are stored when the vehicle stops traveling.
The travel-related information may include at least one of a departure point, a stop point, and a destination of the travel.
The passenger information may include at least one of passenger identification information, voice feature information, seat position information, boarding time information, boarding location information, departure time information, or departure location information.
The dialogue processing method may further include: receiving the dialogue between the occupants and the vehicle operation information; determining whether at least one passenger boards the vehicle based on the dialogue and the vehicle operation information; determining whether a feature of each of the at least one passenger corresponds to the passenger information; and acquiring the pre-utterance message by verifying whether a first passenger having a feature corresponding to the passenger information participated in a previous trip.
The dialogue processing method may further include: receiving an utterance of at least one passenger related to the pre-utterance message; verifying whether the first passenger participated in a previous trip by applying a natural language understanding algorithm to the passenger's utterance; and generating the passenger number information based on the dialogue between the occupants and the passenger information when the first passenger participated in the previous trip.
Drawings
These and/or other aspects of the present disclosure will become apparent and more readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
Fig. 1 is a control block diagram illustrating a dialog system according to an embodiment of the present disclosure;
Fig. 2A is a view showing the interior of the vehicle;
Fig. 2B is a view showing the interior of the vehicle when viewed from another angle different from fig. 2A;
Fig. 3 to 6 are views showing examples of a dialog generated between the dialog system and the driver;
Fig. 7 and 8 are views showing a dialogue system configured to estimate a variation in the number of passengers and output a pre-utterance;
FIGS. 9 and 10 are control block diagrams schematically illustrating the connections between the dialog system and the vehicle components;
FIGS. 11 and 12 are control block diagrams schematically illustrating the connections between components of the dialog system and vehicle components;
Fig. 13 is a control block diagram showing a vehicle standalone mode, in which the dialogue system is provided in the vehicle;
Fig. 14 and 15 are control block diagrams showing a vehicle gateway mode, in which the dialogue system is provided in a remote server and the vehicle serves as a gateway connecting the user to the dialogue system;
Fig. 16 is a control block diagram showing a case where the vehicle is able to perform input processing and output processing in the vehicle gateway mode;
Fig. 17 is a control block diagram showing a hybrid mode, in which both the remote dialogue system server and the vehicle execute dialogue processing;
Fig. 18 and 19 are control block diagrams showing a mobile gateway mode, in which a mobile device connected to the vehicle connects the user to the remote dialogue system server;
Fig. 20 is a control block diagram showing a mobile standalone mode, in which the dialogue system is provided in a mobile device;
Fig. 21, 22A, and 22B are control block diagrams showing in detail the configuration of an input processor in the configuration of the dialog system;
Fig. 23A, 23B, and 23C are views showing examples of information stored in the contextual understanding table;
FIG. 24 is a control block diagram showing a dialog system suitable for use in a situation where the dialog system first outputs an utterance before receiving user input;
Fig. 25A, 25B, 25C, and 25D are views showing examples of information stored in the pre-utterance condition table;
Fig. 26 is a control block diagram illustrating in detail the configuration of the dialog manager;
Fig. 27 is a view showing an example of information stored in the relational action DB;
Fig. 28 is a view showing an example of information stored in the action execution condition DB;
Fig. 29 is a view showing an example of information stored in the action parameter DB;
Fig. 30 is a table showing an example of information stored in the ambiguity resolution information DB;
Fig. 31A and 31B are tables showing various examples of vehicle control performed as a result of the ambiguity resolver resolving an ambiguity by referring to the ambiguity resolution information DB and extracting an action;
FIG. 32 is a control block diagram showing in detail the configuration of a result processor;
Fig. 33 to 45 are views showing a specific example in which the dialogue system processes an input, manages a dialogue, and outputs a result when a user inputs an utterance related to route guidance;
Fig. 46 is a flowchart illustrating a method of processing user input in a dialog processing method according to an embodiment;
FIG. 47 is a flowchart illustrating a method of managing dialogs using the output of an input processor in a dialog processing method according to an embodiment;
Fig. 48 is a flowchart illustrating a result processing method for generating a response corresponding to a result of dialog management in the dialog processing method according to the embodiment;
Fig. 49 to 51 are flowcharts showing cases where the dialogue system outputs a pre-utterance before an utterance is input, in a dialogue processing method according to an embodiment;
Fig. 52 is a flowchart illustrating processing of a repetitive task when the dialog system outputs a pre-utterance before a user inputs an utterance in the dialog processing method according to the embodiment;
Fig. 53 is a flowchart illustrating a method of determining that a passenger gets on a vehicle and outputting a pre-utterance in a dialogue processing method according to an embodiment;
Fig. 54 is a flowchart illustrating a method of estimating a variation in the number of passengers and outputting a pre-utterance in a dialogue processing method according to an embodiment; and
Fig. 55 is a flowchart illustrating a method of determining that a passenger participating in a previous trip boards a vehicle and outputting a pre-utterance in a dialogue processing method according to an embodiment.
It should be understood that the foregoing drawings are not necessarily to scale, presenting a somewhat simplified representation of various preferred features illustrative of the basic principles of the disclosure. The specific design features of the present disclosure, including, for example, specific dimensions, orientations, locations, and shapes, will be determined in part by the particular intended application and use environment.
Description of the reference numerals
100: dialogue system
110: input processor
120: dialogue manager
130: result processor
200: vehicle with a steering wheel
210: voice input device
220: information input device other than speech
230: dialogue output device
280: communication device
Detailed Description
Hereinafter, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. Those skilled in the art will recognize that the described embodiments can be modified in various different ways, without departing from the spirit or scope of the present disclosure. In the following description, like reference numerals refer to like elements throughout the specification.
Well-known functions or constructions are not described in detail since they would obscure one or more exemplary embodiments in unnecessary detail. Terms such as "unit," "module," "member," and "block" may be implemented as hardware or software. According to embodiments, a plurality of "units", "modules", "members" and "blocks" may be implemented as a single component or a single "unit", "module", "member" and "block" may include a plurality of components.
It will be understood that when an element is referred to as being "connected" to another element, it can be directly or indirectly connected to the other element, wherein indirect connection includes "connection via a wireless communication network".
Furthermore, when a component "comprises" or "comprising" an element, the component may also include, but not exclude, other elements unless there is a particular description to the contrary.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
The reference numerals are used for convenience of description, but are not intended to illustrate the order of each step. Unless the context clearly dictates otherwise, each step may be performed in a different order than that shown.
It should be understood that the term "automobile" or "vehicle" or other similar terms as used herein include motor vehicles in general, such as passenger vehicles including Sport Utility Vehicles (SUVs), buses, trucks, various commercial vehicles, watercraft including various watercraft, aircraft, and the like, and hybrid vehicles, electric vehicles, plug-in hybrid electric vehicles, hydrogen powered vehicles, and other alternative fuel vehicles (e.g., fuel from resources other than petroleum). As referred to herein, a hybrid vehicle is a vehicle having two or more power sources, such as a gasoline powered vehicle and an electric vehicle.
Additionally, it should be understood that one or more of the following methods or aspects thereof may be performed by at least one controller. The term "controller" may refer to a hardware device that includes a memory and a processor. The memory is configured to store program instructions and the processor is specifically programmed to execute the program instructions to perform one or more processes described further below. As described herein, a control unit may control the operation of units, modules, components, devices, and the like. Further, it should be understood that the following methods may be performed by an apparatus that includes a controller in conjunction with one or more other components, as will be understood by those of ordinary skill in the art.
Further, the controller of the present disclosure may be embodied as a non-transitory computer readable medium containing executable program instructions executed by a processor. Examples of computer readable media include, but are not limited to, ROM, RAM, compact disc (CD)-ROM, magnetic tape, floppy disk, flash drive, smart card, and optical data storage. The computer readable recording medium can also be distributed across a computer network so that the program instructions are stored and executed in a distributed fashion, such as through a telematics server or a controller area network (CAN).
Reference will now be made in detail to embodiments of the present disclosure, examples of which are illustrated in the accompanying drawings.
According to an embodiment, the dialogue system may be configured to recognize the user's intention by using the user's voice and inputs other than voice, and to provide a service appropriate to or required by that intention. The dialogue system may carry on a dialogue with the user by outputting a system utterance, which serves as a means of providing a service or of clearly recognizing the user's intention.
As an example, the dialogue system recognizes dialogue contents between occupants in a vehicle by using voice recognition, recognizes passengers based on the dialogue contents between the occupants in the vehicle, and determines a possibility that each passenger leaves the vehicle and a possibility that each passenger gets on the vehicle again after leaving. The dialogue system can dialogue with the passengers by outputting a system utterance as a means for notifying the driver of the possibility that each passenger leaves the vehicle and the possibility that each passenger gets on the vehicle again after leaving the vehicle.
At this time, the dialogue system may converse with the user by outputting a system utterance in response to a request of the user, or may converse with the user by outputting a pre-utterance without a user request.
The pre-utterance described below denotes an utterance that is output without a user request, for example an utterance output when an immediate response is required, such as when the vehicle reaches a stop point or a destination. In addition, the pre-utterance may include an utterance output when the system, by acquiring and analyzing user information, determines that information should be delivered even though the user has not requested it.
The pre-utterance may include an utterance that is output when information transmission is required by receiving various information from external devices such as a vehicle, a user terminal, and an external server.
Meanwhile, the pre-utterance is not limited to a system utterance output without a user request. It also covers cases where the user's request is not immediate, so that the utterance must be output after a certain period of time or when a certain condition occurs. For example, when the user has requested that an utterance be output after a period of time, or when the requested time is approaching but the output has been postponed because the user is talking, the dialogue system selects an appropriate time and outputs the utterance at that time. Hereinafter, for convenience of description, when there is no need to distinguish a pre-utterance from a system utterance output in response to a user request, both are collectively referred to as an utterance.
The user or occupant of the vehicle described below includes anyone on board the vehicle. For example, the occupants include not only the driver but also any fellow passenger; specifically, the term collectively refers to the persons seated in the driver seat, the passenger seat, and the rear seats. According to an embodiment, the service provided to the user may include all types of operations according to the user's needs or intention, which may include providing information, controlling the vehicle, performing audio/video/navigation functions, and providing content from an external server.
According to one embodiment, a dialogue system provides dialogue processing techniques that are specific to a vehicle environment in order to accurately recognize a user's intent in a particular environment, i.e., a vehicle.
The gateway for connecting the dialog system with the user may be a vehicle or a mobile device connected to a vehicle. As described below, the dialogue system may be provided in a remote server in or outside the vehicle to transmit or receive data through communication with the vehicle or a mobile device connected to the vehicle.
Some components of the dialog system may be located in the vehicle and some components may be located in a remote server. Thus, the vehicle and the remote server may perform a part of the operation of the dialogue system.
Fig. 1 is a control block diagram illustrating a dialog system according to an embodiment of the present disclosure.
Referring to fig. 1, a dialog system 100 may include: an input processor 110 that processes a user input including a user's voice and an input other than the user's voice or an input including information related to the vehicle or information related to the user; a dialogue manager 120 recognizing the user's intention and vehicle state using the processing result of the input processor 110 and determining an action corresponding to the user's intention or vehicle state; a result processor 130 providing a specific service or outputting a system utterance for continuing a conversation according to an output result of the conversation manager 120; and a storage device 140 that stores various information for operations described later.
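As a rough illustration of how these four components could cooperate, here is a minimal Python skeleton; the class and method names and the stub logic inside them are assumptions for illustration, not the patent's implementation.

```python
class DialogueSystem:
    """Skeleton of the pipeline in Fig. 1: input processor (110),
    dialogue manager (120), result processor (130), storage device (140).
    Every method body below is a stub standing in for the real component."""

    def __init__(self, storage: dict):
        self.storage = storage  # stands in for the storage device 140

    def process_input(self, speech: str, vehicle_info: dict) -> dict:
        # Input processor 110: derive an intent and context from the
        # user's speech and the vehicle/user information.
        return {"intent": "route_guidance" if "guide" in speech.lower()
                else "unknown", "context": vehicle_info}

    def manage_dialogue(self, processed: dict) -> dict:
        # Dialogue manager 120: pick an action matching the intent or the
        # current vehicle state and collect its parameters.
        return {"action": processed["intent"], "params": processed["context"]}

    def process_result(self, action: dict) -> str:
        # Result processor 130: emit a dialogue response (and, in a real
        # system, any vehicle-control command) for the chosen action.
        return f"Starting {action['action']} with {action['params']}."

    def handle(self, speech: str, vehicle_info: dict) -> str:
        return self.process_result(
            self.manage_dialogue(self.process_input(speech, vehicle_info)))

# Example: print(DialogueSystem({}).handle("Guide me home", {"fuel": "low"}))
```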
The input processor 110 may receive two types of input: the user's speech and input other than speech. The input other than speech may include input entered by recognizing the user's gesture or by operating an input device, vehicle state information indicating the state of the vehicle, driving environment information related to the driving of the vehicle, and user information indicating the state of the user. In addition, any information about the user and the vehicle may be input to the input processor 110 as long as it can be used to recognize the user's intention or to provide a service to the user or the vehicle. The users may include drivers and passengers.
The input processor 110 converts the user's speech into a text-type utterance by recognizing the user's speech, and recognizes the user's intention by applying a natural language understanding algorithm to the user's utterance.
The input processor 110 may determine whether a passenger boards the vehicle by receiving the passenger's voice as an input and, through the information input device other than voice, inputs other than the user's voice. The dialogue system 100 can output a pre-utterance requesting identification information in order to identify each passenger determined to board the vehicle, and may identify each passenger by receiving that passenger's utterance. Identifying a passenger here means distinguishing each passenger determined to board the vehicle on the basis of his or her identification information.
The input processor 110 collects information related to a vehicle state or related to a driving environment of the vehicle other than the user's voice, and then uses the collected information to understand the context.
The input processor 110 transmits the user's intention obtained through the natural language understanding technology and the information related to the context to the dialog manager 120.
The dialog manager 120 determines an action corresponding to the user's intention or the current context based on the information related to the user's intention and context transmitted from the input processor 110 and manages parameters required to perform the corresponding action.
According to an embodiment, the action may represent various actions for providing a specific service, and the kind of the action may be predetermined. Providing the service may correspond to performing the action, as desired.
For example, actions such as route guidance, vehicle state check, and gas station recommendation may be predefined in the domain/action inference rule DB 141 (refer to fig. 22A), and an action corresponding to an utterance of the user, that is, an action intended by the user, may be extracted according to the stored inference rule. Actions related to events occurring in the vehicle may be predefined and then stored in the relation action DB 146b (refer to fig. 24).
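A minimal sketch of how such a domain/action inference rule DB could be queried follows; the rule format and keywords are invented for illustration and do not reflect the patent's actual rule syntax.

```python
# Hypothetical inference rules: each predefined action is triggered by
# keywords that may appear in the user's utterance.
INFERENCE_RULES = {
    "route_guidance": ("guide", "route", "take me to"),
    "vehicle_state_check": ("check", "status", "tire"),
    "gas_station_recommendation": ("gas station", "fuel", "refuel"),
}

def infer_action(utterance: str) -> str:
    """Return the first predefined action whose rule matches the utterance."""
    text = utterance.lower()
    for action, keywords in INFERENCE_RULES.items():
        if any(kw in text for kw in keywords):
            return action
    return "none"

# Example: infer_action("Let me know a nearby gas station")
# -> "gas_station_recommendation"
```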
The kind of action is not limited. Any action that the dialogue system 100 can perform via the vehicle 200 or the mobile device 400 may serve as an action in the above sense, as long as it is predefined and its inference rules or its relationships with other actions/events are stored.
The dialog manager 120 sends information about the determined action to the results processor 130.
The results processor 130 generates and outputs dialog responses and commands necessary to perform the sent actions. The dialog response may be output in text, image or audio type. When the command is output, services such as vehicle control and external content provision corresponding to the output command may be executed.
For example, the results processor 130 may generate and output dialog responses and commands for identifying passengers. As an example, the result processor 130 may output an utterance requesting identification information of a passenger to identify the passenger determined to board the vehicle through the input processor 110.
In addition, the result processor 130 may generate and output dialog responses and commands for estimating changes in the number of passengers. As an example, the results processor 130 may output a pre-utterance having content related to a likelihood of each passenger leaving the vehicle at a stopping point and a likelihood of each passenger boarding the vehicle again after leaving the vehicle.
It should be understood that the input processor 110 and the result processor 130 may be implemented as separate processors or, alternatively, as a single processor. It should also be understood that the input processor 110 and the result processor 130 may be implemented as one or more hardware processors, one or more software modules, or a combination thereof. It should also be understood that the operation of the input processor 110 and the results processor 130 may be controlled by the vehicle controller 240.
The storage device 140 stores various information for conversation processing and service provision. For example, the storage 140 may pre-store information related to domains, actions, voice behaviors, and entity names for natural language understanding, and a context understanding table for understanding a context from input information. In addition, the storage device 140 may prestore data detected by sensors provided in the vehicle, information related to the user, and information required for action.
For example, when the travel of the vehicle is terminated, the storage device 140 may store travel-related information about the trip and passenger information about the passengers who boarded the vehicle during the trip. Specifically, the storage device 140 may store travel-related information such as the departure point, stop point, and destination of the trip, and passenger information such as personal identification information, voice feature information, seat position information, boarding time information, departure time information, boarding location information, and departure location information. The information stored in the storage device 140 will be described later.
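The stored records might look like the following sketch; the field names simply mirror the items listed above and are assumptions, not the patent's actual schema.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class PassengerRecord:
    """One passenger entry persisted when a trip ends (names assumed)."""
    passenger_id: str
    voice_features: List[float]   # e.g., a speaker-embedding vector (assumed)
    seat_position: str            # e.g., "passenger_seat", "rear_left"
    boarding_time: str
    boarding_location: str
    departure_time: Optional[str] = None
    departure_location: Optional[str] = None

@dataclass
class TravelRecord:
    """Travel-related information persisted for one trip (names assumed)."""
    departure_point: str
    stop_points: List[str]
    destination: str
    passengers: List[PassengerRecord]
```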
As described above, the dialogue system 100 provides dialogue processing techniques specific to a vehicle environment. All or some of the components of the dialog system 100 may be contained in a vehicle. The dialog system 100 may be located in a remote server and the vehicle may act as a gateway between the dialog system 100 and the user. In either case, the dialog system 100 may be connected to the user via the vehicle or a mobile device connected to the vehicle.
Fig. 2A is a view showing the vehicle interior, and fig. 2B is a view showing the vehicle interior when viewed from another angle different from fig. 2A.
Referring to fig. 2A, a display 231 configured to display a screen required for vehicle control, including audio, video, navigation, and call functions, and an input button 221 configured to receive a user's control command may be provided in the center fascia 203, the center portion of the instrument panel 201 inside the vehicle 200.
For the convenience of the user's operation, input buttons may be provided on the steering wheel 207, and a jog shuttle 225 serving as an input button may be provided in the center console area 202 between the driver seat 254a and the passenger seat 254b.
Meanwhile, the seats 254 are not limited to the driver seat 254a and the passenger seat 254b. Referring to fig. 2B, the vehicle 200 may be provided with rear seats 254c and 254d as needed.
When a passenger occupies the passenger seat 254b or the rear seats 254c and 254d, and a dialogue takes place between the occupants in the vehicle, the dialogue system may determine whether the passenger has boarded the vehicle through voice recognition and determine the passenger's seat position by estimating the position where the voice is generated. In addition, the dialogue system may determine whether a passenger is located in the corresponding seat through various sensors provided at the passenger seat 254b and the rear seats 254c and 254d.
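As an illustration of estimating the seat position from where the voice is generated, the sketch below simply picks the seat whose microphone captured the strongest signal; real systems would more likely use beamforming or time-difference-of-arrival, and all names here are assumptions.

```python
def estimate_seat(mic_rms: dict) -> str:
    """Return the seat whose microphone recorded the highest RMS level
    for the detected utterance (a deliberately crude localization cue)."""
    return max(mic_rms, key=mic_rms.get)

# Example: the voice energy is highest at the rear-left microphone.
seat = estimate_seat({"driver_seat": 0.02, "passenger_seat": 0.05,
                      "rear_left": 0.31, "rear_right": 0.07})
print(seat)  # -> "rear_left"
```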
According to one embodiment, the dialog system may output a pre-utterance to request identification information of passengers to identify each passenger determined to board the vehicle, and identify each passenger by receiving an utterance of each passenger. As described above, the identification of the passenger may represent distinguishing the passenger based on the identification information of the passenger determined to board the vehicle.
According to one embodiment, the dialogue system may store passenger information including the seat position of each passenger in the vehicle. A detailed description thereof will be given later.
The module including the display 231, the input buttons 221, and the processor controlling various functions may correspond to an audio-video-navigation (AVN) terminal or a head unit.
The display 231 may be implemented by any of various display devices, for example, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED), a Plasma Display Panel (PDP), an Organic Light Emitting Diode (OLED), and a Cathode Ray Tube (CRT).
As shown in fig. 2, the input button 221 may be provided in a hard key type on an area adjacent to the display 231. Alternatively, when the display 231 is implemented by a touch screen, the display 231 may perform the function of the input button 221.
The vehicle 200 may receive the user control command as voice via the voice input device 210. The voice input device 210 may include a microphone configured to receive sound and then convert the sound into an electrical signal.
As shown in fig. 2, the voice input device 210 may be mounted to the roof panel 205 for efficient voice input, but embodiments of the vehicle 200 are not limited thereto. Thus, the voice input device 210 may be mounted to the dashboard 201 or the steering wheel 207. In addition, voice input devices 210 may be provided at the passenger seat 254b and the rear seats 254c and 254d, respectively, for inputting the voice of the passengers seated there. The voice input device 210 may be installed at any position suitable for receiving the user's voice.
Inside the vehicle 200, a speaker 232 may be provided, configured to carry on a dialogue with the user or to output the sounds required to provide a service desired by the user. For example, the speaker 232 may be disposed inside the driver seat door 253a and the passenger seat door 253b.
The speaker 232 may output voice for navigation route guidance, sound or voice contained in audio and video content, voice for providing information or services desired by the user, and system utterances generated as a response to the utterances of the user.
According to an embodiment, the dialogue system 100 provides a service suitable for a user's life style by using a dialogue processing technology suitable for a vehicle environment, and the dialogue system 100 may implement a new service using technologies such as connected car (connected car), internet of things (IoT), and Artificial Intelligence (AI).
When a dialogue processing technique suited to the vehicle environment, such as the dialogue system 100 according to the embodiment, is applied, key contexts can be easily recognized and responded to while the driver is directly driving the vehicle. Services may be provided by weighting parameters that affect driving, such as fuel shortage and drowsy driving, and information required for a service, such as travel time and destination information, can easily be obtained because in most cases the vehicle is moving toward a destination.
In addition, an intelligent service configured to provide a function can be easily implemented by recognizing the intention of the driver. This is because real-time information and actions are prioritized in the case of direct driver driving. For example, when a driver searches for a gas station while driving, it can be interpreted as the driver's intention to go to the gas station. However, when the driver searches for a gas station in a place other than the vehicle, it can be interpreted as another intention such as a search location information query, a phone number query, and a price query, in addition to the intention that the driver is going to go to the gas station.
Further, although the vehicle has a limited space, various situations may occur in it. For example, a driver may use the dialogue system 100 when driving a vehicle with an unfamiliar interface, such as a rental car, when using a designated driver service, in vehicle management situations such as a car wash, when a baby is in the car, or when visiting a particular destination.
In addition, various service and dialogue situations may occur in each of the stages constituting vehicle travel and the stages before and after travel (for example, a vehicle inspection stage, a start preparation stage, a travel stage, and a parking stage). Specifically, the driver may use the dialogue system 100 in situations such as not knowing how to deal with a problem, associating the vehicle with various external devices, checking driving habits such as fuel mileage, using a safety support function such as smart cruise control, operating the navigation system, drowsy driving, driving along the same route every day, and checking whether parking is possible at a given place.
Fig. 3 to 6 are views showing examples of a dialogue generated between the dialogue system and the driver.
Referring to fig. 3, even though the driver does not input an utterance inquiring about the amount of gasoline currently remaining or requesting gas station guidance, the dialogue system 100 may recognize the current remaining gasoline by itself, and when the recognized remaining gasoline is less than a predetermined value, the dialogue system 100 may first output an utterance providing information about the current remaining gasoline (S1: The vehicle can be driven 43 km with the remaining gasoline.).
In response to the utterance, the driver may input an utterance requesting route guidance to a nearby gas station (U1: Let me know a nearby gas station.), and the dialogue system 100 may output an utterance providing information about the gas stations closest to the current location (S2: Nearby there are the A-Oil Seong-rim, B-Oil Jang-dae, and C-Oil Pacific gas stations.).
The driver may additionally input an utterance asking for the price of gasoline (U2: Where is it cheapest?), and the dialogue system 100 may output an utterance providing price information by fuel type (S3: Gasoline is cheapest at the B-Oil Jang-dae station, at 1,294 won per liter, and diesel is cheapest at the A-Oil Seong-rim station, at 985 won per liter.).
The driver may input an utterance requesting guidance to the B-Oil Jang-dae gas station (U3), and the dialogue system 100 may output an utterance indicating that guidance to the gas station selected by the driver is starting (S4: Starting route guidance to the B-Oil Jang-dae gas station.).
That is, the dialogue system 100 may determine that the currently required service is gas station guidance based on the vehicle state information received via the input processor 110, and output a pre-utterance to provide the required service. In addition, through dialogue with the dialogue system 100, the driver can be guided to a nearby gas station that sells the current vehicle's fuel type at the lowest price. According to one embodiment, a "pre-utterance" denotes an utterance that the dialogue system 100 outputs first, before the user speaks.
Meanwhile, when a gas station is selected as in the example shown in fig. 3, the dialogue system 100 may omit some questions and provide the information directly, thereby reducing the steps and time of the dialogue.
For example, the dialogue system 100 may recognize in advance that the fuel type of the current vehicle is gasoline, and the criterion for the driver to select a gas station is price. The information on the fuel type of the vehicle may be acquired from the vehicle, and the criterion for the driver to select the gas station may be previously input from the driver or acquired by learning a driver conversation history or a gas station selection history. This information may be pre-stored in the storage device 140.
In this case, as shown in fig. 4, the dialogue system 100 may proactively output an utterance providing fuel price information, specifically the price of gasoline, the fuel type of the current vehicle (S2 + S3 = S3'), without the driver inputting an utterance requesting fuel price information, i.e., with U2 omitted.
The driver may omit the utterance (U2) requesting fuel price information, and the response of the dialogue system 100 may be formed such that the utterance (S2) guiding to nearby gas stations and the utterance (S3) providing fuel prices are integrated into a single response, reducing the steps and time of the dialogue.
In addition, the dialogue system 100 can recognize itself that the driver's intention is to search for a gas station based on the fact that the driver inquires about the amount of gasoline currently remaining.
In this case, as shown in fig. 5, even though the driver does not input an utterance asking about a nearby gas station (U1), i.e., with U1 omitted, the dialogue system 100 may proactively output an utterance providing information about the fuel price (S2 + S3 = S3″).
When the gas station closest to the current location and the gas station offering the lowest fuel price are the same, the utterance (S3″) providing the fuel price information may include a question asking whether to guide to that gas station. Thus, the user can request route guidance to the corresponding gas station simply by inputting an utterance agreeing to the dialogue system 100's question (U3′: Yes.), without inputting a specific utterance requesting guidance to a certain gas station.
As described above, the dialog system 100 can recognize the real intention of the user by considering the contents that the user does not utter and actively provide information corresponding to the intention based on the information obtained in advance. Accordingly, a dialog step and time for providing a service desired by a user can be reduced.
Referring to fig. 6, when it is determined that a passenger has boarded the vehicle based on the dialogue between the occupants and the vehicle operation information, the dialogue system 100 may first output an utterance asking the passenger for identification information (S11: Who are you? Please tell me your name.).
In response to the utterance, the passenger may input an utterance containing identification information of the passenger (U11: I am OO), and the dialog system 100 may identify the passenger by the passenger's utterance.
That is, the dialogue system 100 may determine whether a passenger gets on the vehicle based on the dialogue between passengers and the vehicle operation information received through the input processor 110, and output a pre-utterance to recognize the identity of the passenger. In addition, the passenger may provide identification information to the dialog system 100 through a dialog with the dialog system 100.
Fig. 7 and 8 are views showing a dialogue system configured to estimate a variation in the number of passengers and output a pre-utterance.
Referring to fig. 7, after identifying the identity of the passengers, the dialogue system 100 in the vehicle 200 may estimate a change in the number of passengers based on the dialogue between the passengers in the vehicle and output a pre-utterance indicating a result of estimating the change in the number of passengers.
Specifically, by applying a natural language understanding algorithm to the dialogue between the occupants in the vehicle, the dialogue system 100 in the vehicle 200 can acquire the possibility of each passenger leaving the vehicle at the stop point 700 and of boarding the vehicle again after leaving. Further, by applying a natural language understanding algorithm to a call session in the vehicle, the dialogue system 100 in the vehicle 200 may obtain the possibility of a potential passenger intending to board the vehicle at the stop point 700. A detailed description thereof will be given later.
Before reaching the stop point 700, the dialogue system 100 in the vehicle 200 may output pre-utterances reflecting the estimated change in the number of passengers, such as "A will leave at the stop point", "B will board the vehicle again after leaving the vehicle at the stop point", "C does not exit at the stop point", and "D will board the vehicle at the stop point", based on the possibility of each passenger leaving the vehicle and the possibility of each passenger boarding the vehicle again after leaving.
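A minimal sketch of turning such estimated likelihoods into pre-utterance sentences follows; the 0.5 threshold and the dictionary keys are illustrative assumptions.

```python
def passenger_pre_utterances(estimates, threshold=0.5):
    """Map per-passenger likelihoods to pre-utterance sentences.

    Each estimate is assumed to be a dict with keys 'name', 'p_exit'
    (likelihood of leaving at the stop point), and 'p_reboard'
    (likelihood of boarding again after leaving).
    """
    lines = []
    for e in estimates:
        if e["p_exit"] < threshold:
            lines.append(f"{e['name']} does not exit at the stop point.")
        elif e["p_reboard"] >= threshold:
            lines.append(f"{e['name']} will board the vehicle again "
                         "after leaving the vehicle at the stop point.")
        else:
            lines.append(f"{e['name']} will leave at the stop point.")
    return lines

print(passenger_pre_utterances([
    {"name": "A", "p_exit": 0.9, "p_reboard": 0.1},
    {"name": "B", "p_exit": 0.8, "p_reboard": 0.7},
    {"name": "C", "p_exit": 0.2, "p_reboard": 0.0},
]))
```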
Referring to fig. 8, after leaving the stop point 700, the dialogue system 100 in the vehicle 200 may output a pre-utterance related to the result of comparing the change in the number of passengers estimated before reaching the stop point 700 with the actual change in the number of passengers after leaving the stop point 700.
Specifically, after leaving the stop point 700, the dialogue system 100 in the vehicle 200 may compare the estimation result acquired before reaching the stop point 700 with the actual change in the number of passengers after leaving the stop point 700.
This comparison may be performed by determining whether the passengers are on board after leaving the stop point 700, based on the dialogue between the occupants in the vehicle and the vehicle operation information after leaving the stop point 700.
In addition, after leaving the stop point 700, the dialogue system 100 in the vehicle 200 may output a pre-utterance asking whether the estimated change in the number of passengers is correct. The dialogue system 100 in the vehicle 200 can then compare the estimation result acquired before reaching the stop point 700 with the actual change in the number of passengers after leaving the stop point by using the occupants' response utterances to that pre-utterance.
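The comparison step could be sketched as follows; the integer encoding of the change and the returned sentences are assumptions for illustration.

```python
def compare_passenger_change(estimated_change: int, observed_change: int) -> str:
    """Compare the estimated and observed change in the passenger count
    after the stop point and return a pre-utterance for any mismatch."""
    if estimated_change == observed_change:
        return "The change in the number of passengers matches the estimate."
    return ("The current number of passengers is different from the "
            "estimated change in the number of passengers.")

# Example: one passenger was expected to leave (-1) but two left (-2).
print(compare_passenger_change(-1, -2))
```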
After leaving the stop point 700, the dialogue system 100 in the vehicle 200 may output a pre-utterance about the comparison result, such as "The current number of passengers is different from the estimated change in the number of passengers", based on the comparison between the estimation result acquired before reaching the stop point 700 and the actual change in the number of passengers after leaving the stop point 700.
Fig. 9 and 10 are control block diagrams schematically showing the connections between the dialogue system and the vehicle components.
Referring to fig. 9, a user's voice input to the dialog system 100 may be input via a voice input device 210 provided in the vehicle 200. As shown in fig. 2, the voice input device 210 may include a microphone disposed inside the vehicle 200.
Inputs other than voice among the user inputs may be received through the information input device 220 other than voice. The information input device 220 other than voice may include input buttons 221 and 223 for receiving commands through the user's operation, and a jog shuttle 225.
The information input device 220 other than voice may include a camera that captures images of the user. From the captured image, the user's gesture, expression, or gaze direction, used as a means of command input, can be recognized. Alternatively, the user's state (e.g., a drowsy state) may be recognized from the captured image.
In addition, the information input device 220 other than voice may include window adjustment buttons, seat adjustment buttons, and air conditioning adjustment buttons provided at the sides of the passenger seat 254b and the rear seats 254c and 254d, which can be used to determine whether a passenger boards the vehicle and the passenger's seat position.
Information related to the vehicle may be input into the dialog system 100 via the vehicle controller 240. The vehicle-related information may include vehicle state information or surrounding environment information acquired by various sensors provided in the vehicle 200, and information originally stored in the vehicle 200, such as a fuel type of the vehicle.
The dialog system 100 may recognize the user's intention and context using the user's voice input via the voice input device 210, the input other than the user's voice via the information input device 220 other than the voice, and various inputs via the vehicle controller 240. The dialog system 100 outputs a response to perform an action corresponding to the user's intent.
The dialog output device 230 is a device configured to provide output to the user in a visual, auditory, or tactile manner. The dialogue output device 230 may include a display 231 and a speaker 232 provided in the vehicle 200. The display 231 and the speaker 232 may output a response to the user's utterance, a question for the user, or information requested by the user in a visual or audible manner. In addition, vibration may be output by mounting a vibrator in the steering wheel 207.
Further, according to the response output from the dialogue system 100, the vehicle controller 240 may control the vehicle 200 to perform an action corresponding to the user's intention or the current situation.
Meanwhile, in addition to the information acquired by the sensors provided in the vehicle 200, the vehicle 200 may collect, via the communication device 280, information acquired from the external content server 300 or an external device, for example, driving environment information such as traffic conditions, weather, and temperature, and user information such as passenger information and the driver's personal information, and then transmit that information to the dialogue system 100.
As shown in fig. 10, information acquired by sensors provided in the vehicle 200, for example, the remaining fuel amount, rainfall and rain speed, surrounding obstacle information, speed, engine temperature, tire pressure, and current position, may be input to the dialogue system 100 via the internal signal controller 241.
In addition, information acquired by sensors provided in the vehicle, such as window adjustment button operation information, seat adjustment button operation information, and air conditioning adjustment button operation information, may be input to the dialogue system 100 through the interior signal controller 241.
Driving environment information acquired from the outside via vehicle-to-anything (V2X) communication may be input to the dialogue system 100 via the external signal controller 242. V2X may represent vehicles exchanging and sharing various useful information, such as traffic conditions, by communicating with the road infrastructure and other vehicles while driving.
The V2X communications may include vehicle-to-infrastructure (V2I) communication, vehicle-to-vehicle (V2V) communication, and vehicle-to-network (V2N) communication. Therefore, by using V2X communication, information such as traffic information ahead, the approach of another vehicle, or the risk of collision with another vehicle can be transmitted and received through communication performed directly between vehicles or with infrastructure installed on the road, so that the driver can be notified of it.
Accordingly, the driving environment information input to the dialogue system 100 via the external signal controller 242 may include traffic information ahead, proximity information of nearby vehicles, collision warnings with other vehicles, real-time traffic conditions, accident conditions, and traffic flow control states.
Although not shown in the drawings, a signal obtained via V2X may also be input to the vehicle 200 via the communication device 280.
The vehicle controller 240 may include: a memory in which a program for performing the above-described operation and an operation described later is stored; and a processor for executing the stored program. At least one memory and one processor may be provided, and when a plurality of memories and processors are provided, they may be integrated on one chip or physically separated.
In addition, the internal signal controller 241 and the external signal controller 242 may be implemented by the same processor and memory, or by separate processors and memories.
Fig. 11 and 12 are control block diagrams schematically showing the connections between the dialogue system and the vehicle components.
Referring to fig. 11, a user's voice transmitted from the voice input device 210 may be input to the voice input processor 111 provided in the input processor 110, and an input other than the user's voice transmitted from the information input device 220 other than the voice may be input to the contextual information processor 112 provided in the input processor 110.
In addition, information input via the internal signal controller 241 or the external signal controller 242 is input to the context information processor 112 provided in the input processor 110.
The context information input to the context information processor 112 may include vehicle state information, driving environment information, and user information input from the information input device 220 and the vehicle controller 240, in addition to voice. The context information processor 112 can identify a context based on the input context information. The dialog system 100 can accurately recognize the user's intention or efficiently find a service required by the user by recognizing the context.
For example, the voice input processor 111 may determine that a passenger boards the vehicle based on the dialogue between the occupants in the vehicle recognized through the voice input device 210. In addition, the voice input processor 111 may determine the likelihood that each passenger leaves the vehicle at the stop point and the likelihood that each passenger boards the vehicle again after leaving at the stop point, based on that dialogue. The voice input processor 111 may also estimate a passenger intending to board the vehicle based on the call session in the vehicle recognized through the voice input device 210.
The context information processor 112 may recognize the operation of the information input device 220 other than voice and determine whether a passenger boards the vehicle based on the recognition result.
The response output from the result processor 130 may be input to the dialogue output device 230 or the vehicle controller 240 to allow the vehicle 200 to provide the service desired by the user. In addition, the response may be sent to the external content server 300 to request a desired service.
The vehicle state information, the driving environment information, and the user information transmitted from the vehicle controller 240 may be stored in the storage device 140.
Referring to fig. 12, the storage 140 may include a long term memory 143 and a short term memory 144. The data stored in the storage 140 may be classified into short-term memory and long-term memory according to the importance and persistence of the data and the intention of a designer.
The short-term memory 144 may store previously conducted dialogs. A previous dialog may be a dialog performed within a reference time from the current time. Alternatively, dialogs may be stored continuously until the volume of utterance content between the user and the dialogue system 100 reaches a reference value.
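The retention policy just described (a reference time window plus a volume cap) could be sketched as follows; the class, window length, and capacity are illustrative assumptions.

```python
# Minimal sketch of a short-term dialog memory with a reference time window
# and a capacity cap, as described above.
import time
from collections import deque

class ShortTermMemory:
    def __init__(self, reference_seconds=1800, capacity=100):
        self.reference_seconds = reference_seconds
        self.capacity = capacity
        self.dialogs = deque()   # (timestamp, utterance) pairs, oldest first

    def store(self, utterance: str):
        self.dialogs.append((time.time(), utterance))
        self._evict()

    def recent(self):
        self._evict()
        return [u for _, u in self.dialogs]

    def _evict(self):
        now = time.time()
        while self.dialogs and (now - self.dialogs[0][0] > self.reference_seconds
                                or len(self.dialogs) > self.capacity):
            self.dialogs.popleft()
```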
For example, at a meal time, the vehicle 200 may output an utterance via the speaker 232 asking whether to guide the user to a restaurant. Whether it is a meal time may be determined based on whether the current time is within a predetermined meal time range. When the user says "Let me know restaurants near Gangnam Station", or says "Let me know restaurants" while the current position of the vehicle 200 is near Gangnam Station, the dialogue system 100 may search for restaurants near Gangnam Station through the external content server 300 and then provide the user with information about the found restaurants. As an example of providing the information, the dialogue system 100 may display a list of restaurants on the display 231 and, when the user says "the first one", may store the dialog content related to the restaurant request and the restaurant selection in the short-term memory 144.
Alternatively, not only the entire conversation content but also specific information contained in it may be stored. For example, the first restaurant in the restaurant list may be stored in the short-term memory 144 or the long-term memory 143 as the restaurant selected by the user.
When, after the dialogue about restaurants near Gangnam Station, the user asks the dialog system 100 "How is the weather?", the dialog system 100 may assume from the dialog stored in the short-term memory 144 that the user's location of interest is Gangnam Station and then output the response "It is raining at Gangnam Station".
Next, when the user says "Recommend a menu from that restaurant", the dialogue system 100 may assume from the dialogue stored in the short-term memory that "that restaurant" refers to the restaurant near Gangnam Station, acquire information about the recommended menu of the corresponding restaurant through a service provided by the external content server 300, and output the response "Noodles are the best menu at that restaurant".
The long-term memory 143 may store data according to its persistence. For example, the long-term memory 143 may determine that the persistence of data such as the phone numbers of relatives and friends, point of interest (POI) information (e.g., home or company), and user preferences for certain parameters can be guaranteed, and then store such data in the long-term memory. Conversely, when it is determined that the persistence of the data is not guaranteed, the data may be stored in the short-term memory 144.
For example, when a trip of the vehicle ends, the long-term memory 143 may store travel-related information about the trip and passenger information about the passengers who boarded the vehicle during the trip. Specifically, the long-term memory 143 may store travel-related information such as the departure point, stop point, and destination of the trip, and passenger information such as personal identification information, voice feature information, seat position information, boarding time information, leaving time information, boarding position information, and leaving position information. For example, the user's current location may be temporary data and thus stored in the short-term memory 144, whereas the user's restaurant preferences may be persistent data that can be used later and thus stored in the long-term memory 143.
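A toy sketch of this persistence-based routing follows; the key names and the two dictionaries standing in for the memories 143 and 144 are assumptions for illustration.

```python
# Sketch: route data to long-term or short-term memory by persistence.
PERSISTENT_KEYS = {"phone_book", "poi_home", "poi_company",
                   "restaurant_preference", "travel_info", "passenger_info"}

long_term, short_term = {}, {}   # stand-ins for memories 143 and 144

def store(key, value):
    # Persistence-guaranteed data goes to long-term memory;
    # temporary data (e.g. the current location) goes to short-term memory.
    (long_term if key in PERSISTENT_KEYS else short_term)[key] = value

store("restaurant_preference", "Chinese")    # persistent -> long_term
store("current_location", (37.49, 127.03))   # temporary  -> short_term
```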
When the user then asks "Is there any Chinese restaurant near here?", the dialogue system 100 may identify the user's current location from the short-term memory 144 and recommend the user's favorite Chinese restaurant from the long-term memory 143.
In addition, the dialogue system 100 can proactively provide services and information to the user using the data stored in the long-term memory 143 and the short-term memory 144.
For example, information related to the user's home may be stored in the long-term memory 143. The dialogue system 100 may acquire information about the user's home from the external content server 300 and then provide information indicating "Water is expected to be cut off this Friday due to cleaning of the apartment complex".
In addition, when the same passenger as the passenger in the previous trip is seated in the next trip, the dialogue system 100 may estimate the possibility of each passenger leaving the vehicle and the possibility of getting on the vehicle again after leaving based on the stored passenger information and the dialogue between the passengers in the vehicle.
Information related to the vehicle battery status may be stored in the short-term memory 144. The dialogue system 100 may analyze the vehicle battery status stored in the short-term memory 144 and then provide information indicating "The battery is in poor condition. Have it repaired before winter".
Fig. 13 is a control block diagram showing a vehicle-only manner in which the dialogue system is provided in the vehicle.
As shown in fig. 13, the dialogue system 100 having the input processor 110, the dialogue manager 120, the result processor 130, and the storage device 140 may be included in the vehicle 200 according to the vehicle-only manner.
When the dialogue system 100 is incorporated in the vehicle 200, the vehicle 200 can handle dialogue with the user by itself and provide a service required by the user. However, information required for the session processing and the service provision may be obtained from the external content server 300.
Vehicle state information or driving environment information detected by the vehicle detector 260, for example, the remaining fuel amount, rainfall and rain speed, surrounding obstacle information, speed, engine temperature, tire pressure, and current location, may be input to the dialogue system 100 via the vehicle controller 240.
The vehicle controller 240 may control the air conditioner 251, the window 252, the door 253, the seat 254, or the AVN 255 provided in the vehicle 200 according to the response output from the dialogue system 100.
For example, when the dialogue system 100 determines that the user's intention or the service required by the user is to lower the temperature in the vehicle 200 and then generates and outputs a corresponding command, the vehicle controller 240 may lower the temperature in the vehicle 200 by controlling the air conditioner 251.
For another example, when the dialogue system 100 determines that the user's intention or the service required by the user is to raise the driver's seat window 252a and generate and output a corresponding command, the vehicle controller 240 may raise the driver's seat window 252a by controlling the window 252.
For another example, when the dialogue system 100 determines that the user's intention or a service required by the user is to guide a route to a certain destination and generates and outputs a corresponding command, the vehicle controller 240 may perform route guidance by controlling the AVN 255. The communication device 280 may acquire map data and POI information from the external content server 300 and then use the information for service provision, as necessary.
Fig. 14 and 15 are control block diagrams showing a gateway manner of a vehicle in which a dialogue system is provided in a remote server and the vehicle is used as a gateway for connecting a user to the dialogue system.
As shown in fig. 14, according to the vehicle gateway system, the remote dialogue system server 1 may be provided outside the vehicle 200, and a dialogue system client 270 connected with the remote dialogue system server 1 via a communication device 280 may be provided in the vehicle 200. The communication device 280 serves as a gateway for connecting the vehicle 200 and the remote dialogue system server 1.
The dialog system client 270 may serve as an interface connected to the input/output devices; it may collect data and transmit and receive the collected data.
When the voice input device 210 and the information input device 220 other than voice provided in the vehicle 200 receive the input of the user and transmit the user input to the dialogue system client 270, the dialogue system client 270 may transmit the input data to the remote dialogue system server 1 via the communication device 280.
The vehicle controller 240 may also transmit data detected by the vehicle detector 260 to the dialogue system client 270, and the dialogue system client 270 may transmit data detected by the vehicle detector 260 to the remote dialogue system server 1 via the communication device 280.
Since the remote dialogue system server 1 is provided with the above-described dialogue system 100, input data processing, dialogue processing based on the result of the input data processing, and result processing based on the result of the dialogue processing can be executed by the remote dialogue system server 1.
In addition, the remote dialogue system server 1 may acquire information or contents necessary for input data processing, dialogue management, or result processing from the external content server 300.
The vehicle 200 may acquire information or contents of the service required by the user from the external content server 300 according to the response transmitted from the remote dialogue system server 1.
Referring to fig. 15, the communication device 280 may include at least one communication module configured to communicate with an external device. For example, the communication device 280 may include at least one of a short-range communication module 281, a wired communication module 282, and a wireless communication module 283.
The short-range communication module 281 may include various short-range communication modules configured to transmit and receive signals over a short distance, such as a Bluetooth module, an infrared communication module, a Radio Frequency Identification (RFID) communication module, a Wireless Local Area Network (WLAN) communication module, an NFC communication module, and a ZigBee communication module.
The wired communication module 282 may include various wired network communication modules, for example, a Local Area Network (LAN) module, a Wide Area Network (WAN) module, or a Value Added Network (VAN) module, and various cable communication modules, for example, a Universal Serial Bus (USB), a High Definition Multimedia Interface (HDMI), a Digital Visual Interface (DVI), Recommended Standard 232 (RS-232), power line communication, or a Plain Old Telephone Service (POTS).
The wireless communication module 283 may include wireless communication modules supporting various wireless communication methods, for example, a Wi-Fi module, a wireless broadband module, Global System for Mobile communication (GSM), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Universal Mobile Telecommunications System (UMTS), Time Division Multiple Access (TDMA), Long Term Evolution (LTE), 4G, and 5G.
Additionally, the communication device 280 may also include an internal communication module (not shown) for communication between electronic devices in the vehicle 200. The communication protocol of the vehicle 200 may use a Controller Area Network (CAN), a Local Interconnect Network (LIN), FlexRay, and ethernet.
The dialog system 100 may transmit and receive data to and from the external content server 300 or the remote dialogue system server 1 via the wireless communication module 283. The dialog system 100 may perform V2X communication using the wireless communication module 283. In addition, the dialogue system 100 can transmit and receive data to and from a mobile device connected to the vehicle 200 by using the short-range communication module 281 or the wired communication module 282.
Fig. 16 is a control block diagram showing a case where the vehicle can execute the input processing and the output processing in the vehicle gateway system.
As described above, the dialogue system client 270 of the vehicle 200 may only collect and transmit and receive data; however, when an input processor 271, a result processor 273, and a storage device 274 are included in the dialogue system client 270 as shown in fig. 16, the dialogue system client 270 may also process data input from the user or the vehicle, or perform processing related to determining the service required by the user. That is, the operations of the input processor 110 and the result processor 130 may be performed not only by the remote dialogue system server 1 but also by the vehicle 200.
In this case, the dialog system client 270 may perform all or some of the operations of the input processor 110. The dialog system client 270 may perform all or some of the operations of the results processor 130.
The task sharing between the remote dialog system server 1 and the dialog system client 270 may be determined in consideration of the capacity of data to be processed and the data processing speed.
Fig. 17 is a control block diagram showing a hybrid manner in which both the remote dialogue system server and the vehicle execute dialogue processing.
As shown in fig. 17, according to the hybrid manner, the remote dialogue system server 1 can perform dialogue processing because the input processor 110, the dialogue manager 120, the result processor 130, and the storage device 140 are provided in the remote dialogue system server 1, and the vehicle 200 can also perform dialogue processing because a terminal dialogue system 290 provided with an input processor 291, a dialogue manager 292, a result processor 293, and a storage device 294 is provided in the vehicle 200.
However, there may be a difference in capacity or performance between the processor and memory provided in the vehicle 200 and the processor and memory provided in the remote dialogue system server 1. Therefore, when the terminal dialog system 290 can output a result by processing all input data and managing the dialogue, the terminal dialog system 290 may perform the entire process. Otherwise, processing may be requested from the remote dialogue system server 1.
Before performing the dialogue process, the terminal dialogue system 290 may determine whether the dialogue process can be performed based on the data type, and the terminal dialogue system 290 may directly perform the process or request the process from the remote dialogue system server 1 based on the result of the determination.
When an event occurs that the terminal dialog system 290 cannot process while it is performing dialogue processing, the terminal dialog system 290 may request processing from the remote dialogue system server 1 while transmitting the result of its own processing to the remote dialogue system server 1.
For example, the remote dialog system server 1 may perform dialog processing when high-performance computing power or long-term data processing is required, and the terminal dialog system 290 may perform dialog processing when real-time processing is required. For example, when a situation occurs that requires immediate processing and thus processing of data before synchronization, it may be set that the terminal dialog system 290 processes data first.
In addition, when there is an unregistered speaker in the vehicle and user confirmation is thus required, the remote dialogue system server 1 may process the dialogue. That is, the remote dialogue system server 1 may process the dialogue when a new passenger needs to be identified because an unregistered passenger is in the vehicle.
Further, when the terminal dialogue system 290 cannot complete the dialogue processing by itself in a state in which it cannot connect to the remote dialogue system server 1 via the communication device 280, it is possible to notify the user via the dialogue output device 230 that the dialogue processing cannot be executed.
The data stored in the terminal dialog system 290 and the data stored in the remote dialogue system server 1 may be determined according to the data type or the data capacity. For example, data that carries a risk of privacy invasion because it identifies a person may be stored in the storage 294 of the terminal dialog system 290. In addition, a large amount of data may be stored in the storage device 140 of the remote dialogue system server 1, while a small amount of data may be stored in the storage device 294 of the terminal dialog system 290. Alternatively, a small amount of data may be stored in both the storage device 140 of the remote dialogue system server 1 and the storage device 294 of the terminal dialog system 290.
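The storage split just described might be expressed as a simple routing rule; the field names and the size threshold below are illustrative assumptions.

```python
# Sketch: decide where a record is stored in the hybrid manner.
def choose_storage(record: dict) -> str:
    if record.get("personally_identifiable", False):
        return "terminal"    # privacy-sensitive data stays in storage 294
    if record.get("size_bytes", 0) > 10_000_000:
        return "server"      # bulky data goes to storage 140 on the server
    return "terminal"        # small data can stay in the vehicle

print(choose_storage({"personally_identifiable": True, "size_bytes": 512}))
```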
Fig. 18 and 19 are control block diagrams illustrating a mobile gateway manner in which a mobile device connected to a vehicle connects a user to a remote dialogue system server.
As shown in fig. 18, according to the mobile gateway system, the mobile device 400 may receive vehicle state information, driving environment information, and the like from the vehicle 200 and transmit user input and the vehicle state information to the remote dialogue system server 1. That is, the mobile device 400 may act as a gateway connecting the user to the remote dialogue system server 1 or connecting the vehicle 200 to the remote dialogue system server 1.
The mobile device 400 may represent an electronic device that is portable and capable of transmitting and receiving data to and from an external server and a vehicle by communicating with the external server and the vehicle, wherein the mobile device 400 may include a smart phone, a smart watch, smart glasses, a PDA, and a tablet computer.
The mobile device 400 may include: a voice input device 410 that receives the user's voice; an input device 420 other than the voice input device 410 that receives inputs other than the user's voice; an output device 430 that outputs a response in a visual, auditory, or tactile manner; a communication device 480 that transmits and receives data to and from the remote dialogue system server 1 and the vehicle 200 through communication; and a dialogue system client 470 that collects input data from the vehicle 200 and the user and transmits the data to the remote dialogue system server 1 via the communication device 480.
The voice input device 410 may include a microphone that receives sound, converts the sound into an electrical signal, and outputs the electrical signal.
The input device 420 other than the voice input device 410 may include an input button, a touch screen, or a camera provided in the mobile device 400.
The output device 430 may include a display, a speaker, or a vibrator provided in the mobile device 400.
The voice input device 410, the input device 420 other than the voice input device 410, and the output device 430 provided in the mobile device 400 may serve as the input and output interface for the user. In addition, the voice input device 210, the information input device 220 other than voice, and the dialogue output device 230 provided in the vehicle 200 may serve as the input and output interface for the user.
When the vehicle 200 transmits data and user input detected by the vehicle detector 260 to the mobile device 400, the dialogue system client 470 of the mobile device 400 may transmit the data and user input to the remote dialogue system server 1.
The dialogue system client 470 may transmit a response or a command transmitted from the remote dialogue system server 1 to the vehicle 200. When the dialogue system client 470 uses the dialogue output device 230 provided in the vehicle 200 as an input and output interface for the user, an utterance of the dialogue system 100 or a response to the utterance of the user can be output via the dialogue output device 230. When the conversation system client 470 uses the output device 430 provided in the mobile device 400, an utterance of the conversation system 100 or a response to the utterance of the user can be output via the output device 430.
A command for vehicle control may be transmitted to the vehicle 200, and the vehicle controller 240 may perform control corresponding to the transmitted command, thereby providing a service required by the user.
The dialog system client 470 may collect input data and transmit the input data to the remote dialogue system server 1. The dialog system client 470 may also perform all or some of the functions of the input processor 110 and the result processor 130 of the dialog system 100.
Referring to fig. 19, the communication device 480 of the mobile device 400 may include at least one communication module configured to communicate with an external device. For example, the communication device 480 may include at least one of a short-range communication module 481, a wired communication module 482, and a wireless communication module 483.
The short-range communication module 481 may include various short-range communication modules (e.g., a bluetooth module, an infrared communication module, a Radio Frequency Identification (RFID) communication module, a Wireless Local Area Network (WLAN) communication module, an NFC communication module, and a ZigBee communication module) configured to transmit and receive signals using the short-range wireless communication module.
The wired communication module 482 may include various wired network communication modules, for example, a Local Area Network (LAN) module, a Wide Area Network (WAN) module, or a Value Added Network (VAN) module, and various cable communication modules, for example, a Universal Serial Bus (USB), a High Definition Multimedia Interface (HDMI), a Digital Visual Interface (DVI), Recommended Standard 232 (RS-232), power line communication, or a Plain Old Telephone Service (POTS).
The wireless communication module 483 may include wireless communication modules supporting various wireless communication methods, for example, a Wi-Fi module, a wireless broadband module, Global System for Mobile communication (GSM), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Universal Mobile Telecommunications System (UMTS), Time Division Multiple Access (TDMA), Long Term Evolution (LTE), 4G, and 5G.
For example, the mobile device 400 may be connected to the vehicle 200 via the short range communication module 481 or the wired communication module 482, and the mobile device 400 may be connected to the remote conversation system server 1 or the external content server 300 via the wireless communication module 483.
Fig. 20 is a control block diagram showing a mobile individual manner in which the dialogue system is provided in a mobile device.
As shown in fig. 20, the dialogue system 100 may be provided in the mobile device 400 according to a mobile individual manner.
Therefore, without connecting to the remote dialogue system server 1 for dialogue processing, the mobile device 400 can process a dialogue with a user by itself and provide a service required by the user. However, the mobile device 400 may acquire a part of information for conversation processing and service provision from the external content server 300.
The components forming the dialog system 100 may be physically separated from each other, or some components may be omitted, according to any of the above-described approaches. For example, even when the dialogue system 100 is provided in the remote dialogue system server 1, some components forming the dialogue system 100 may be provided in a separate server or vehicle. The operator or administrator of the individual server may be the same or different from the operator or administrator of the remote dialog system server 1. For example, a speech recognizer or a natural language understanding part described later may be provided in a separate server, and the dialogue system 100 may receive a result of the speech recognition or a result of the natural language understanding about the utterance of the user from the separate server. Alternatively, the storage 140 may be provided in a separate server.
The detailed configuration and operation of each component of the dialogue system 100 will now be described. In the embodiments described below, it is assumed for convenience of explanation that the dialogue system 100 is provided in the vehicle 200. The specific components of the dialogue system 100 described below are classified according to their operations, and there is no limitation as to whether the components are implemented by the same processor and memory or as to the physical locations of the processor and memory.
Fig. 21, 22A, and 22B are control block diagrams showing in detail the configuration of an input processor in the configuration of the dialog system.
Referring to fig. 21, the input processor 110 may include a voice input processor 111 processing voice input and a context information processor 112 processing context information.
The user's voice input through the voice input device 210 may be transmitted to the voice input processor 111, and input other than the user's voice input through the information input device 220 other than the voice may be transmitted to the contextual information processor 112.
The vehicle controller 240 may transmit the vehicle state information, the driving environment information, and the user information to the context information processor 112. The driving environment information and the user information may be provided from the external content server 300 or the mobile device 400 connected to the vehicle 200.
Inputs other than speech may be included in the context information. That is, the context information may include vehicle state information, driving environment information, user information, passenger boarding vehicle information, stop point arrival information, and stop point departure information. The passenger boarding vehicle information may correspond to contextual information indicating whether the passenger boards the vehicle, the stop point arrival information may correspond to contextual information indicating whether the vehicle arrives at the stop point, and the stop point departure information may correspond to contextual information indicating whether the vehicle departs from the stop point after arriving at the stop point.
The dialogue system 100 may acquire passenger boarding information based on information about passengers acquired by recognizing a dialogue between passengers in the vehicle through the voice input device 210 and information of passengers acquired through the information input device 220 other than voice.
In addition, the dialogue system 100 may determine whether the vehicle reaches or departs from the stop point based on the vehicle state information such as the current position and the vehicle speed detected by the vehicle detector 260, and the dialogue system 100 may acquire the stop point arrival information and the stop point departure information.
The vehicle state information may include information indicating a vehicle state and acquired by sensors provided in the vehicle 200, and information related to the vehicle and stored in the vehicle, such as a fuel type of the vehicle.
The running environment information may be information acquired by a sensor provided in the vehicle 200. The running environment information may include image information acquired by a front camera, a rear camera, or a stereo camera, obstacle information acquired by a sensor (e.g., radar, lidar, ultrasonic sensor), and information on rainfall and rain speed information acquired by a rain sensor.
The running environment information may also include traffic state information, traffic light information, and adjacent vehicle approach or adjacent vehicle collision risk information acquired via V2X.
The user information may include information related to a user state measured by a camera or a biometric reader provided in the vehicle, information related to the user directly input by the user using an input device provided in the vehicle, information related to the user stored in the external content server 300, and information stored in the mobile device 400 connected to the vehicle.
The voice input processor 111 may include: a speech recognizer 111a that outputs an utterance of a text type by recognizing an inputted speech of the user; a natural language understanding section 111b that recognizes an intention of the user contained in the utterance by applying a natural language understanding technique to the utterance of the user; and a dialog input manager 111c which transmits a result of understanding the natural language and context information to the dialog manager 120.
The speech recognizer 111a may include a speech recognition engine, and the speech recognition engine may recognize a speech spoken by a user and generate a recognition result by applying a speech recognition algorithm to the input speech.
To convert the input speech into a form more useful for speech recognition, the speech recognizer 111a may detect the actual speech portion included in the speech by detecting its start point and end point from the speech signal. This is called end point detection (EPD).
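A minimal energy-based EPD sketch is shown below; it assumes mono PCM audio in a NumPy array and an illustrative energy threshold, and is far simpler than production detectors.

```python
# Sketch: energy-based end point detection (EPD) over fixed-size frames.
import numpy as np

def detect_endpoints(signal, sr=16000, frame_ms=20, threshold=0.02):
    frame = int(sr * frame_ms / 1000)
    n_frames = len(signal) // frame
    energy = np.array([np.sqrt(np.mean(signal[i * frame:(i + 1) * frame] ** 2))
                       for i in range(n_frames)])    # per-frame RMS energy
    voiced = np.where(energy > threshold)[0]
    if voiced.size == 0:
        return None                                  # no speech detected
    return voiced[0] * frame, (voiced[-1] + 1) * frame   # start, end (samples)
```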
The speech recognizer 111a may acquire a feature vector of the input speech from the detected portion by applying a feature vector extraction technique, such as Cepstrum (Cepstrum), Linear Prediction Coefficient (LPC), mel-frequency cepstral coefficient (MFCC), or filter bank energy (filter bank energy).
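As one concrete instance of the feature extraction techniques named above, MFCCs can be computed with the librosa library (assuming it is installed; `audio` is a mono float waveform):

```python
# Sketch: extract one MFCC feature vector per analysis frame with librosa.
import librosa

def extract_features(audio, sr=16000, n_mfcc=13):
    # Returns an array of shape (n_frames, n_mfcc).
    return librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=n_mfcc).T
```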
The speech recognizer 111a may obtain a recognition result by comparing the extracted feature vectors with trained reference patterns. For this, the speech recognizer 111a may use an acoustic model that models and compares the signal characteristics of speech, and a language model that models the sequential relationships of linguistic units, such as words or syllables, in the recognition vocabulary. To this end, the storage 140 may store an acoustic model and a language model DB.
The acoustic model may be classified into a direct comparison method of setting a recognition object as a feature vector model and comparing the feature vector model with a feature vector of a speech signal, and a statistical method of statistically processing the feature vector of the recognition object.
The direct comparison method is to set units such as words or phonemes, which are recognition objects, to the feature vector model and compare the received speech with the feature vector model to determine the similarity therebetween. A representative example of the direct comparison method is vector quantization. The vector quantization is to map feature vectors of a received speech signal to a codebook as a reference model to encode the mapped result into representative values and to compare the representative values with each other.
The statistical model method is to configure units of an identification object as state sequences and use the relationship between the state sequences. Each state sequence may be configured with multiple nodes. Methods using the relationship between state sequences may be classified into Dynamic Time Warping (DTW), Hidden Markov Models (HMM), and methods using neural networks.
DTW is a method that compensates for differences along the time axis by comparing the speech with a reference model, taking into account the dynamic characteristics of speech, whose signal length varies over time even when the same person utters the same words. The HMM is a recognition method that models speech as a Markov process having state transition probabilities and observation probabilities of nodes (output symbols) in each state; it estimates these probabilities from learning data and calculates the probability that the received speech was generated by the estimated model.
Meanwhile, a language model that models the sequential relationships of linguistic units such as words and syllables can reduce acoustic ambiguity and recognition errors by applying those sequential relationships to the units acquired through speech recognition. Language models include statistical language models and models based on Finite State Automata (FSA). Statistical language models use chain probabilities of words, such as unigram, bigram, and trigram probabilities.
The speech recognizer 111a may perform speech recognition using any of the methods described above. For example, the speech recognizer 111a may use an acoustic model to which an HMM is applied, or an N-best search method in which an acoustic model is combined with a language model. The N-best search method can improve recognition performance by selecting up to N recognition result candidates using an acoustic model and a language model and then re-evaluating the ranking of those candidates.
The speech recognizer 111a may calculate a confidence value to ensure the reliability of the recognition result. The confidence value is a criterion indicating how reliable the speech recognition result is. For example, for a phoneme or word given as a recognition result, the confidence value may be defined as a relative value of the probability that the corresponding phoneme or word was uttered rather than any other phoneme or word. The confidence value may thus be expressed as a value between 0 and 1 or between 1 and 100.
When the confidence value is greater than the predetermined threshold, the speech recognizer 111a may output a recognition result to allow an operation corresponding to the recognition result to be performed. When the confidence value is equal to or less than the threshold, the speech recognizer 111a may reject the recognition result.
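The N-best search and the confidence threshold can be combined in one small routine, sketched below; the score normalization, weighting, and threshold are illustrative assumptions rather than the patent's method.

```python
# Sketch: rescore N-best candidates with acoustic and language model scores,
# then accept or reject the top result by a confidence threshold.
def recognize(candidates, threshold=0.6, lm_weight=0.3):
    """candidates: list of (text, acoustic_score, lm_score), scores in [0, 1]."""
    rescored = sorted(((1 - lm_weight) * ac + lm_weight * lm, text)
                      for text, ac, lm in candidates)
    confidence, best = rescored[-1]          # highest combined score
    if confidence > threshold:
        return best, confidence
    return None, confidence                  # reject the recognition result

result, conf = recognize([("let us go to Seoul Station", 0.8, 0.9),
                          ("lettuce go to Seoul Station", 0.7, 0.2)])
```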
In addition, the voice recognizer 111a can distinguish the voice inputs of the passengers received through the voice input device 210. Specifically, the voice recognizer 111a may distinguish the voice of each passenger by comparing the non-verbal and verbal features of the passengers' voices input through the voice input device 210. The non-verbal voice features may include the pitch, intensity, breathing, and speed of a passenger's speech. The verbal features may include the dialect, slang, and accent of a passenger's voice.
In addition, the voice recognizer 111a may determine whether a new passenger gets on the vehicle by distinguishing the passenger voice input through the voice input device 210.
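One simple way to realize this is nearest-neighbor matching of a per-utterance feature vector (e.g., averaged pitch, intensity, and speaking rate) against enrolled passengers, flagging a new passenger when nothing matches. The sketch below is an illustrative assumption, not the patent's implementation.

```python
# Sketch: match an utterance's non-verbal feature vector to known passengers;
# an unmatched utterance suggests a new passenger has boarded.
import numpy as np

def identify_speaker(features, known_speakers, max_distance=1.0):
    """known_speakers: dict mapping name -> reference feature vector."""
    best, best_dist = None, float("inf")
    for name, ref in known_speakers.items():
        dist = np.linalg.norm(np.asarray(features) - np.asarray(ref))
        if dist < best_dist:
            best, best_dist = name, dist
    if best_dist > max_distance:
        return None          # no match: possibly a new passenger
    return best
```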
An utterance in a text form as a recognition result of the speech recognizer 111a may be input to the natural language understanding section 111 b.
The natural language understanding section 111b can recognize the intention contained in the user's utterance by applying a natural language understanding technique. Accordingly, the user can input a control command through a natural dialogue, and the dialogue system 100 can likewise induce the input of a control command and provide the service desired by the user via the dialogue.
The natural language understanding section 111b may perform morpheme analysis on the utterance in the text form. A morpheme is the smallest unit of meaning and represents the smallest semantic element that cannot be subdivided. Thus, morpheme analysis is the first step in natural language understanding and converts an input string to a morpheme string.
The natural language understanding section 111b may acquire a domain from the utterance based on the morpheme analysis result. The fields may be used to recognize a subject of a user utterance language, and fields indicating various subjects (e.g., route guidance, weather search, traffic search, schedule management, fuel management and air conditioning control, boarding of passengers in a vehicle, and change in the number of passengers) may be stored as a database.
The natural language understanding section 111b can recognize the entity name from the utterance. The entity name may be a proper noun, such as a person name, a place name, an organization name, a time, a date, and a currency, and the entity name recognition may be configured to recognize the entity name in the sentence and determine a type of the recognized entity name. The natural language understanding section 111b may acquire important keywords from the sentence using the entity name recognition and recognize the meaning of the sentence.
The natural language understanding section 111b can analyze the speech behavior contained in the utterance. The voice behavior analysis may be configured to recognize the intent of the user utterance, e.g., whether the user asks a question, whether the user makes a request, whether the user responds, or whether the user simply expresses an emotion.
The natural language understanding section 111b extracts an action corresponding to the intention of the user's utterance. The natural language understanding section 111b may recognize the intention of the user's utterance based on information such as the domain, entity name, and voice behavior, and extract an action corresponding to the utterance. An action may be defined by an object and an operator.
The natural language understanding section 111b may acquire parameters related to the execution of the action. The parameter related to the action execution may be a valid parameter directly required for the action execution or an invalid parameter for extracting a valid parameter.
For example, when the utterance of the user is "Let us go to Seoul Station", the natural language understanding section 111b may acquire "navigation" as the domain corresponding to the utterance and "route guidance" as the action, where the voice behavior corresponds to "request".
The entity name "Seoul Station" may correspond to [parameter: destination], but a specific exit number of the station or GPS information may be required to actually guide a route via the navigation system. In this case, [parameter: destination: Seoul Station] may serve as a candidate parameter for searching for the "Seoul Station" actually desired by the user among a plurality of Seoul Station POIs.
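For illustration, the NLU result for this example could be represented by a structure like the following; the field names are assumptions, not the patent's data format.

```python
# Sketch: a possible representation of the NLU output for the example above.
nlu_result = {
    "utterance": "Let us go to Seoul Station",
    "domain": "navigation",
    "action": {"object": "route", "operator": "guide"},   # action = object + operator
    "voice_behavior": "request",
    "parameters": {"destination": "Seoul Station"},
    # "Seoul Station" is only a candidate parameter: the exact POI
    # (exit number or GPS coordinates) must still be resolved.
}
```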
The natural language understanding part 111b may also acquire a means of expressing the relationships between words or sentences, for example, a parse tree.
Morpheme analysis results, domain information, action information, voice behavior information, extracted parameter information, entity name information, and parse trees, which are the processing results of the natural language understanding section 111b, may be transmitted to the dialog input manager 111c.
In addition, these processing results of the natural language understanding section 111b may be transmitted to the dialog input manager 111c through the passenger determiner 111d.
The passenger determiner 111d estimates the variation in the number of passengers in the vehicle based on the output of the natural language understanding section 111b. Specifically, the passenger determiner 111d may estimate the likelihood that each passenger leaves the vehicle and the likelihood that each passenger boards the vehicle again after leaving, based on the dialogue between the occupants in the vehicle, and may estimate the number of potential passengers based on the call session in the vehicle.
The passenger determiner 111d may generate the passenger number information based on the estimation result of the variation of the passenger number.
The passenger number information may include a likelihood that each passenger leaves the vehicle at the stopping point, a likelihood that each passenger boards the vehicle again after leaving at the stopping point, and a likelihood that a potential passenger boards the vehicle at the stopping point.
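The passenger number information described above could be modeled, purely as an illustrative assumption, like this:

```python
# Sketch: a container for the passenger number information.
from dataclasses import dataclass, field

@dataclass
class PassengerNumberInfo:
    p_leave: dict = field(default_factory=dict)    # name -> likelihood of leaving at the stop point
    p_reboard: dict = field(default_factory=dict)  # name -> likelihood of boarding again after leaving
    p_board: dict = field(default_factory=dict)    # potential passenger -> likelihood of boarding
```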
The context information processor 112 may include: a contextual information collector 112a collecting information from the information input device 220 and the vehicle controller 240 except for voice; a context information collection manager 112b that manages collection of context information; and a context understanding section 112c for understanding a context based on the result of the natural language understanding and the collected context information.
The input processor 110 may include: a memory in which a program for performing the above-described operation and an operation described later is stored; and a processor for executing the stored program. At least one memory and one processor may be provided, and when a plurality of memories and processors are provided, they may be integrated on one chip or physically separated.
The speech input processor 111 and the context information processor 112 included in the input processor 110 may be implemented by the same processor and memory or separate processors and memories.
Hereinafter, a method in which components of the input processor 110 process input data using information stored in the storage device 140 will be described in detail with reference to fig. 22A and 22B.
Referring to fig. 22A, the natural language understanding part 111b may perform domain extraction, entity recognition, voice behavior analysis, and action extraction using the domain/action inference rule DB 141.
In the domain/action inference rule DB 141, domain extraction rules, voice behavior analysis rules, entity name conversion rules, and action extraction rules may be stored.
Other information such as user input other than voice, vehicle state information, driving environment information, and user information may be input to the context information collector 112a and then stored in the context information DB142, the long term memory 143, or the short term memory 144.
For example, the raw data detected by the vehicle detector 260 may be classified by sensor type and sensor value and then stored in the context information DB 142.
In the short term memory 144 and the long term memory 143, data meaningful to the user may be stored, which may include the current user state, the user's preferences and orientation, or data used to determine the user's preferences and orientation.
As described above, information that ensures persistence and is thus available for a long period of time may be stored in long-term storage 143, and may include information about the user's phone book, schedule, preferences, academic calendar, personality, work, and family members. In addition, the long-term memory 143 may store travel-related information related to travel and passenger information of passengers boarding the vehicle while traveling.
Information that is not guaranteed to be persistent or uncertain and therefore available in the short term may be stored in the short term memory 144 and may include current and previous locations, today's schedule, previous conversation content, conversation participants, environment, domain, and driver status. The data may be repeatedly stored in at least two storage devices among the context information DB142, the short term memory 144, and the long term memory 143 according to the data type.
In addition, among the information stored in the short-term memory 144, data determined to have guaranteed persistence may be transmitted to the long-term memory 143.
The information to be stored in the long-term memory 143 can be acquired using the information stored in the short-term memory 144 or the context information DB 142. For example, the user's preference may be acquired by analyzing destination information or conversation contents stored for a specific duration, and the acquired user's preference may be stored in the long-term memory 143.
The obtaining of the information to be stored in the long-term memory 143 by using the information stored in the short-term memory 144 or the context information DB 142 may be performed in the dialogue system 100 or in another external system.
In the former case, it may be executed in the memory manager 135 of the result processor 130. In this case, data used in acquiring meaningful information or persistent information (e.g., user's preference or orientation) among data stored in the short-term memory 144 or the context information DB142 may be stored in the long-term memory 143 in a log file type.
The memory manager 135 may retrieve persistent data by analyzing data stored for more than a particular duration and re-enter the data into the long term memory 143. In long-term storage 143, the locations where persistent data is stored may be different from the locations where data stored in a log file type is stored.
The memory manager 135 may determine persistent data among the data stored in the short-term memory 144 and move and store the determined data into the long-term memory 143.
As shown in fig. 22B, when information to be stored in the long-term memory 143 is obtained in an additional external system using information stored in the short-term memory 144 or the context information DB142, a data management system 800 provided with a communicator 810, a storage 820, and a controller 830 may be used.
The communicator 810 can receive data stored in the context information DB142 or the short term memory 144. All of the stored data may be sent to the communicator 810 or data used in obtaining meaningful or persistent information (e.g., user's preferences or orientations) may be selected and then sent. The received data may be stored in storage 820.
The controller 830 may acquire the persistent data by analyzing the stored data and then transmit the acquired data to the dialog system 100 via the communicator 810. The transmitted data may be stored in long-term memory 143 of dialog system 100.
In addition, the dialog input manager 111c can acquire the context information related to the execution of the action by transmitting the output result of the natural language understanding section 111b to the context understanding section 112c.
The context understanding part 112c can determine which context information is related to the action execution corresponding to the intention of the utterance of the user by referring to the context information stored by the action in the context understanding table 145.
Fig. 23A and 23B are views showing examples of information stored in the context understanding table.
Referring to the example of fig. 23A, contextual information and types of contextual information related to the execution of an action may be stored in the contextual understanding table 145 according to each action.
For example, when the action is route guidance, the current location may be required as context information, and the type of the context information may be GPS information. When the action is a vehicle state check, the travel distance may be required as context information, and its type may be an integer. When the action is a gas station recommendation, the remaining fuel amount and the distance to empty (DTE) may be required as context information, and their type may be an integer.
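In code, the context understanding table 145 can be pictured as a mapping from each action to its required context information and type; the representation below is an illustrative assumption.

```python
# Sketch: the context understanding table as an action -> requirements mapping.
CONTEXT_UNDERSTANDING_TABLE = {
    "route_guidance":             [("current_location", "GPS")],
    "vehicle_state_check":        [("travel_distance", "integer")],
    "gas_station_recommendation": [("remaining_fuel", "integer"),
                                   ("distance_to_empty", "integer")],
}
```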
When context information related to the execution of an action corresponding to the intention of the user utterance is pre-stored in the context information DB 142, the long-term memory 143, or the short-term memory 144, the context understanding part 112c may acquire the corresponding information from the context information DB 142, the long-term memory 143, or the short-term memory 144 and transmit the corresponding information to the dialog input manager 111c.
When the context information related to the execution of the action corresponding to the intention of the user utterance is not stored in the context information DB 142, the long-term memory 143, or the short-term memory 144, the context understanding part 112c may request the required information from the context information collection manager 112b. The context information collection manager 112b may then cause the context information collector 112a to collect the required information.
The contextual information collector 112a may collect data periodically or only when a particular event occurs. In addition, the context information collector 112a may collect data periodically and then additionally collect data when a specific event occurs. Further, the contextual information collector 112a may collect data when receiving a data collection request from the contextual information collection manager 112 b.
The context information collector 112a may collect the required information and then store the information in the context information DB142 or the short term memory 144. The contextual information collector 112a may send a confirmation signal to the contextual information collection manager 112 b.
The context information collection manager 112b can send a confirmation signal to the context understanding part 112c, and the context understanding part 112c can retrieve the required information from the long term memory 143 or the short term memory 144 and then send the required information to the dialog input manager 111 c.
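A schematic sketch of the request/collect/confirm flow just described appears below; the class and method names are assumptions made for this illustration and do not reflect the patent's actual implementation.

```python
# A schematic sketch, under assumed names, of the flow in which the
# context understanding component asks the collection manager for missing
# information, the collector stores it, and the value is then retrieved.
class ContextInformationCollector:
    def collect(self, key):
        # In the real system this would query e.g. the vehicle controller 240.
        return {"current_location": (37.56, 126.97)}.get(key)

class ContextInformationCollectionManager:
    def __init__(self, collector, short_term_memory):
        self.collector = collector
        self.stm = short_term_memory

    def request(self, key):
        self.stm[key] = self.collector.collect(key)  # store, then confirm
        return True  # stands in for the confirmation signal

class ContextUnderstanding:
    def __init__(self, manager, short_term_memory):
        self.manager = manager
        self.stm = short_term_memory

    def get_context(self, key):
        if key not in self.stm:        # information not yet collected
            self.manager.request(key)  # triggers collection and storage
        return self.stm[key]           # retrieved for the dialog input manager

stm = {}
understanding = ContextUnderstanding(
    ContextInformationCollectionManager(ContextInformationCollector(), stm), stm)
print(understanding.get_context("current_location"))  # (37.56, 126.97)
```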
Specifically, when the action corresponding to the intention of the user's utterance is route guidance, the context understanding section 112c may search the context understanding table 145 and identify that the context information related to route guidance is the current location.
When the current position is pre-stored in the short-term memory 144, the context understanding part 112c can acquire the current position and transmit the current position to the dialog input manager 111 c.
When the current location is not stored in the short-term memory 144, the context understanding portion 112c can request the current location from the context information collection manager 112b, and the context information collection manager 112b can allow the context information collector 112a to obtain the current location from the vehicle controller 240.
The contextual information collector 112a may obtain the current location and then store the current location in the short-term memory 144. The contextual information collector 112a may send a confirmation signal to the contextual information collection manager 112 b. The context information collection manager 112b can send a confirmation signal to the context understanding part 112c, and the context understanding part 112c can retrieve the current location information from the short-term memory 144 and then send the information to the dialog input manager 111 c.
The dialog input manager 111c may transmit the output of the natural language understanding section 111b and the output of the context understanding section 112c to the dialog manager 120, while preventing duplicate input from reaching the dialog manager 120. At this time, the output of the natural language understanding section 111b and the output of the context understanding section 112c may be combined into one output and then transmitted to the dialog manager 120, or may be transmitted to the dialog manager 120 independently of each other.
When the context information collection manager 112b determines that a certain event has occurred because the data collected by the context information collector 112a satisfies a predetermined condition, the context information collection manager 112b can send an action trigger signal to the context understanding part 112c. The context understanding part 112c can search the context understanding table 145 for the context information related to the corresponding event, and when the required context information has not yet been collected, it can again transmit a context information request signal to the context information collection manager 112b.
For example, when the context information collection manager 112b determines that a passenger has boarded the vehicle because the vehicle operation information of the information input device 220 other than voice, collected by the context information collector 112a, satisfies a predetermined condition, the context information collection manager 112b may transmit an action trigger signal to the context understanding part 112c. The context understanding part 112c can search the context understanding table 145 for the context information related to the corresponding event, and when the required context information has not yet been collected, it can again transmit a context information request signal to the context information collection manager 112b. As shown in fig. 23B, the context information and the type of context information related to each event may be stored in the context understanding table 145.
For example, when the generated event is an engine temperature warning, the engine temperature in integer form may be stored as the context information associated with the event. When the generated event is drowsy driving detection, the driver's drowsy driving state in integer form may be stored as the context information related to the event. When the generated event is insufficient tire pressure, the tire pressure in integer form may be stored as the context information associated with the event. When the generated event is a fuel warning, the distance to empty (DTE) in integer form may be stored as the context information associated with the event. When the generated event is a sensor error, the name of the sensor in text form may be stored as the context information associated with the event.
In addition, as shown in fig. 23C, context information and types of context information related to the event may be stored in the contextual understanding table 145 according to each event. When the generated event is a window adjustment button manipulation, the window adjustment information in the form of text may be stored as the context information. When the generated event is a seat adjustment button manipulation, seat adjustment information in the form of text may be stored as context information. When the generated event is an air conditioning button manipulation, the air conditioning adjustment information in the form of text may be stored as the context information. In addition, an event related to the passenger boarding the vehicle may occur, and at this time, the passenger boarding vehicle information in the form of text may be stored as the context information.
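The per-event entries of figs. 23B and 23C might be represented roughly as follows; the event identifiers and type tags here are assumptions made for this sketch, not the patent's identifiers.

```python
# Illustrative per-event entries of figs. 23B and 23C.
EVENT_CONTEXT_TABLE = {
    # fig. 23B: warning events
    "engine_temperature_warning": ("engine_temperature", "integer"),
    "drowsy_driving_detection": ("drowsy_driving_state", "integer"),
    "tire_pressure_insufficient": ("tire_pressure", "integer"),
    "fuel_warning": ("distance_to_empty", "integer"),
    "sensor_error": ("sensor_name", "text"),
    # fig. 23C: operation and boarding events
    "window_button_manipulation": ("window_adjustment_info", "text"),
    "seat_button_manipulation": ("seat_adjustment_info", "text"),
    "air_conditioning_button_manipulation": ("ac_adjustment_info", "text"),
    "passenger_boarding": ("passenger_boarding_info", "text"),
}

info, info_type = EVENT_CONTEXT_TABLE["fuel_warning"]
print(info, info_type)  # distance_to_empty integer
```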
The context information collection manager 112b can collect the required context information via the context information collector 112a and send a confirmation signal to the context understanding part 112 c. The context understanding part 112c can acquire required context information from the context information DB 142, the long term memory 143, or the short term memory 144 and then transmit the context information together with the action information to the dialog input manager 111 c.
The dialog input manager 111c can input the output of the context understanding portion 112c to the dialog manager 120.
Hereinafter, a case where the dialogue system 100 outputs a pre-utterance by itself before a user utterance is input will be described.
Fig. 24 is a control block diagram showing a dialogue system suitable for a case where the dialogue system outputs an utterance first, before receiving a user input, and figs. 25A, 25B, 25C, and 25D are views showing examples of information stored in a pre-utterance condition table.
Referring to fig. 24, the input processor 110 of the dialog system 100 may further include: a pre-utterance determiner 151 that determines whether a pre-utterance context exists; and a repetitive task processor 152. The storage 140 may further include a pre-utterance condition table 145a, which stores pre-utterance conditions, and a task processing DB 145b.
The data stored in the context information DB 142, the long term memory 143, and the short term memory 144 may be transmitted to the pre-utterance determiner 151. The pre-utterance determiner 151 may analyze the transmitted data and determine whether the transmitted data satisfies the pre-utterance condition stored in the pre-utterance condition table 145 a.
In addition, the voice input processor 111 and the context information processor 112 of the input processor 110 may generate context information indicating whether the passenger gets on the vehicle, and transmit the generated context information to the pre-utterance determiner 151.
The passenger determiner 111d of the input processor 110 may generate passenger number information based on the estimation result of the variation of the passenger number and transmit the generated passenger number information to the pre-utterance determiner 151.
In addition, when it is determined that the vehicle has left the stop point, the passenger determiner 111d may compare the estimation result of the change in the number of passengers acquired before reaching the stop point with the result of the change in the number of passengers after leaving the stop point based on the passenger number information, and transmit the comparison result to the pre-utterance determiner 151.
The pre-utterance determiner 151 may analyze the passenger number information and the comparison result, and determine whether the transmitted data satisfies the pre-utterance condition stored in the pre-utterance condition table 145 a. Referring to the example of fig. 25A, in the pre-utterance condition table 145A, a pre-utterance condition related to context information for each context information and a pre-utterance message output when the corresponding pre-utterance condition is satisfied may be stored.
When the context information transmitted from the context information DB 142 satisfies a pre-utterance condition, the pre-utterance determiner 151 may determine that a pre-utterance context exists and generate a pre-utterance trigger signal.
The pre-utterance determiner 151 can transmit the pre-utterance trigger signal to the context understanding section 112c together with the pre-utterance message corresponding to the corresponding pre-utterance context. Further, the pre-utterance determiner 151 may transmit information related to a corresponding pre-utterance context. The information related to the corresponding pre-utterance context may include a pre-utterance condition corresponding to the corresponding pre-utterance context or an action corresponding to the pre-utterance context, which is described later.
For example, the pre-utterance condition may be satisfied when the context information relates to tire air pressure and the tire air pressure is equal to or less than a predetermined reference value. When the pre-utterance condition for tire air pressure is satisfied, the pre-utterance determiner 151 may determine that the pre-utterance context is caused by insufficient tire air pressure and generate a pre-utterance trigger signal.
The pre-utterance determiner 151 may transmit the pre-utterance trigger signal to the context understanding section 112c together with the pre-utterance message. For example, in a pre-spoken context caused by insufficient tire air pressure, a pre-spoken message indicating that tire air pressure is low (such as "tire pressure is too low") may be sent to the context understanding portion 112 c.
In addition, the pre-utterance condition may be satisfied when the context information relates to the engine temperature and the engine temperature is equal to or higher than a predetermined reference value. When the pre-utterance condition for the engine temperature is satisfied, the pre-utterance determiner 151 may determine that the pre-utterance context is caused by an abnormality of the engine temperature and generate a pre-utterance trigger signal.
The pre-utterance determiner 151 may transmit the pre-utterance trigger signal to the context understanding section 112c together with the pre-utterance message. For example, in a pre-utterance context caused by an abnormality of the engine temperature, a pre-utterance message indicating that the engine is overheated (such as "engine temperature too high") may be sent to the context understanding section 112c.
In addition, the pre-utterance condition may be satisfied when the context information relates to the remaining amount of gasoline and the remaining amount of gasoline is equal to or less than a predetermined reference value. When the user sets a destination using the navigation service of the vehicle, the reference value may be set based on the distance from the current location to the destination. When no destination is set, a default value may be used as the reference value. For example, a value smaller than the reference value for turning on the low-fuel warning lamp may be set as the reference value of the pre-utterance condition for the shortage of the remaining amount of gasoline. When the pre-utterance condition for the remaining amount of gasoline is satisfied, the pre-utterance determiner 151 may determine that the pre-utterance context is caused by a shortage of the remaining amount of gasoline and generate a pre-utterance trigger signal.
The pre-utterance determiner 151 may transmit the pre-utterance trigger signal to the context understanding section 112c together with the pre-utterance message. For example, in the pre-utterance context caused by a shortage of the remaining amount of gasoline, a pre-utterance message indicating that the remaining amount of gasoline is insufficient (such as "the remaining amount of gasoline is insufficient to reach the destination") may be sent to the context understanding section 112c.
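A condition table of the kind shown in fig. 25A, together with the determiner's check, might be sketched as follows; the threshold values, key names, and the generator-style helper are assumptions made only for this illustration.

```python
# A minimal sketch of the pre-utterance condition table of fig. 25A;
# thresholds and field names are illustrative assumptions.
PRE_UTTERANCE_CONDITIONS = [
    ("tire_pressure", lambda v: v <= 30,
     "Tire pressure is too low."),
    ("engine_temperature", lambda v: v >= 120,
     "Engine temperature is too high."),
    ("remaining_gasoline_km", lambda v: v <= 10,
     "The remaining amount of gasoline is insufficient to reach the destination."),
]

def pre_utterance_messages(context_info: dict):
    """Yield the message of every satisfied pre-utterance condition."""
    for key, condition, message in PRE_UTTERANCE_CONDITIONS:
        value = context_info.get(key)
        if value is not None and condition(value):
            # In the system, a pre-utterance trigger signal would be
            # generated and sent together with this message.
            yield message

for msg in pre_utterance_messages({"tire_pressure": 28, "engine_temperature": 90}):
    print(msg)  # Tire pressure is too low.
```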
However, the pre-utterance conditions and pre-utterance messages shown in fig. 25A are only examples that can be applied to the dialogue system 100. In the above-described example, the case where the pre-utterance message corresponding to the pre-utterance context notifies the user of the current situation has been described. However, the dialog system 100 may also proactively suggest a specific function or service needed in the pre-utterance context.
Referring to fig. 25B, when the pre-utterance context is caused by insufficient tire air pressure or an abnormal engine temperature, the corresponding pre-utterance message may be stored as content that actively suggests a repair shop reservation service, such as "Do you want to reserve a repair shop?".
In addition, when the pre-utterance context is caused by an insufficient remaining amount of gasoline, the corresponding pre-utterance message may be stored as content that actively suggests a gas station guidance service, such as "Do you want guidance to a gas station?".
In addition, the pre-utterance condition may be satisfied when the context information relates to the interior temperature of the vehicle and the interior temperature is outside a predetermined reference range. When the pre-utterance condition for the vehicle interior temperature is satisfied, the context understanding section 112c may determine that the pre-utterance context is caused by an abnormality of the vehicle interior temperature and generate the pre-utterance trigger signal.
In the case where the pre-utterance context is caused by an abnormality of the vehicle interior temperature, the corresponding pre-utterance message may be stored as content that actively suggests an interior temperature control function, such as "Do you want to operate the air conditioner?".
When the pre-utterance condition related to the microphone input is satisfied, the context understanding section 112c may determine that it is a pre-utterance context for changing the mood and generate a pre-utterance trigger signal.
In addition, the pre-utterance condition may be satisfied when the context information relates to the opening and closing of a window and to whether it is raining, and the window is open while it is raining. In that case, the context understanding section 112c may determine that the pre-utterance context is caused by the open window and generate a pre-utterance trigger signal.
In the case where the pre-utterance context is caused by an open window, the corresponding pre-utterance message may be stored as content that actively suggests a window closing function, such as "Do you want to close the window?".
In the above-described examples of figs. 25A and 25B, the case where the pre-utterance message corresponding to the pre-utterance context is pre-stored in the pre-utterance condition table 145a has been described. However, examples of the dialog system 100 are not limited thereto, and the action corresponding to the pre-utterance context may be pre-stored instead.
As described above, when the utterance of the user is input, the natural language understanding part 111b may acquire an action corresponding to the utterance of the user with reference to the domain/action inference rule DB 141. As shown in fig. 25C, when the dialog system 100 outputs the pre-utterance, the action corresponding to the pre-utterance context may be pre-stored for each pre-utterance context.
For example, when the pre-utterance context is caused by an abnormality in tire air pressure or engine temperature, "repair shop guide" may be stored as the corresponding action, and when the pre-utterance context is caused by a shortage of the remaining amount of gasoline, "gas station guide" may be stored as the corresponding action.
In addition, when the pre-utterance context is caused by an abnormality of the vehicle interior temperature, "air-conditioner operation" may be stored as the corresponding action, and when the pre-utterance context is for changing the mood, "multimedia play" may be stored as the corresponding action. When the pre-utterance context is caused by an open window, "window opening and closing" may be stored as the corresponding action.
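The per-context action entries of fig. 25C could be sketched as a plain mapping like the one below; the context and action identifiers are assumptions made for this sketch. As the following paragraph describes, the looked-up action would then accompany the pre-utterance trigger signal into the dialog manager 120.

```python
# Sketch of the per-context action entries of fig. 25C; identifiers
# are illustrative assumptions, not the patent's stored values.
PRE_UTTERANCE_ACTIONS = {
    "tire_pressure_abnormal": "repair_shop_guide",
    "engine_temperature_abnormal": "repair_shop_guide",
    "gasoline_shortage": "gas_station_guide",
    "interior_temperature_abnormal": "air_conditioner_operation",
    "mood_change": "multimedia_play",
    "window_open_in_rain": "window_opening_and_closing",
}

print(PRE_UTTERANCE_ACTIONS["gasoline_shortage"])  # gas_station_guide
```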
As described above, when the action corresponding to the pre-utterance context is pre-stored, the pre-utterance trigger signal may be transmitted to the context understanding part 112c together with the action corresponding to the pre-utterance context, and the dialog input manager 111c may input the pre-utterance trigger signal to the dialog manager 120 together with the action corresponding to the pre-utterance context. In this case, the same operation as in the case of inputting the user utterance may be performed in the dialog manager 120.
For another example, the pre-utterance context may be stored in the pre-utterance condition table 145a so that it matches a virtual user utterance corresponding to each pre-utterance context, or the pre-utterance determiner 151 may generate a virtual user utterance corresponding to the pre-utterance context. The pre-utterance determiner 151 may transmit the virtual user utterance, whether stored in the pre-utterance condition table 145a or generated by the pre-utterance determiner 151, to the natural language understanding section 111b in text form. For example, when the pre-utterance context is caused by an abnormality in tire air pressure, a virtual user utterance such as "check tire pressure" or "guide to a repair shop" may be stored or generated. In addition, when the pre-utterance context is caused by an abnormality of the vehicle interior temperature, a virtual user utterance such as "turn on the air conditioner" may be stored or generated.
In addition, the dialogue system client 470 of the mobile device 400 may perform some operations of the pre-utterance determiner 151 when the mobile device 400 acts as a gateway between the vehicle and the dialogue system 100 (the mobile gateway manner). In this case, the dialog system client 470 can generate a virtual user utterance corresponding to the pre-utterance context and send the virtual user utterance to the natural language understanding section 111b.
The natural language understanding part 111b may acquire a domain and an action corresponding to the transmitted virtual user utterance and transmit the domain and the action to the dialog input manager 111c. The action extracted by the natural language understanding section 111b may become the action corresponding to the pre-utterance context. The process performed after the action corresponding to the pre-utterance context is transmitted to the dialog manager 120 may be performed in the same manner as in the case where a user utterance is input.
The above-described contextual information, pre-utterance conditions, pre-utterance messages, and actions are merely examples of embodiments that apply to the dialog system 100, but embodiments of the dialog system 100 are not limited in this regard. In addition, various contextual information, pre-utterance conditions, pre-utterance messages, and actions may be stored.
When the pre-utterance determiner 151 transmits the information related to the pre-utterance trigger signal and the pre-utterance context to the context understanding section 112c, the context understanding section 112c may transmit the information related to the pre-utterance context to the repetitive task processor 152.
The repetitive task processor 152 may determine, based on the information stored in the task processing DB 145b, whether a task related to the currently occurring pre-utterance context has already been processed, i.e., whether it is a repetitive task.
In the task processing DB 145b, information about tasks that have already been processed or are currently being processed may be stored. For example, the dialogue history (including the dialogue contents and each dialogue time), the vehicle state at the dialogue time, and whether each task has been completed may be stored. In addition, the procedures and processing results of tasks performed regardless of a dialogue, such as route guidance using the navigation function, may be stored.
Specifically, when the pre-utterance context is caused by a shortage of the remaining amount of gasoline, the repetitive task processor 152 may determine whether a gas station guidance task is currently being processed based on the information stored in the task processing DB 145b. When a dialogue for gas station guidance is currently being performed or a gas station guidance action is currently being executed, the repetitive task processor 152 may determine the task related to the current pre-utterance context to be a repetitive task and terminate the pre-utterance context.
Further, when an utterance for gas station guidance was previously output and there is a dialogue history in which the user rejected gas station guidance, the repetitive task processor 152 may determine the task related to the current pre-utterance context to be a repetitive task and terminate the pre-utterance context.
Further, when a gas station guidance task using the navigation function is currently being processed, the repetitive task processor 152 may also determine the task related to the current pre-utterance context to be a repetitive task and terminate the pre-utterance context, regardless of the dialogue history of gas station guidance. The repetitive task processor 152 can recognize that a gas station guidance task using the navigation function is currently being processed based on the information stored in the task processing DB 145b.
Further, when a reference period of time has not elapsed since a dialogue related to the remaining amount of gasoline was performed, it can be assumed that the user is driving to a gas station on his or her own, even though gas station guidance is not currently being executed. Accordingly, the repetitive task processor 152 can determine the task related to the current pre-utterance context to be a repetitive task and terminate the pre-utterance context.
Further, when the pre-utterance context is used to announce a schedule based on information stored in the long-term memory 143 (such as the user's birthday or a family member's birthday), and there is a dialogue history in which the same schedule was previously announced and a reference period of time has not elapsed since that dialogue, the repetitive task processor 152 may determine the task related to the current pre-utterance context to be a repetitive task and terminate the pre-utterance context.
That is, the repetitive task processor 152 may determine, based on the dialogue history stored in the task processing DB 145b, whether a pre-utterance regarding the current pre-utterance context has previously been output and what the user's intention was. The repetitive task processor 152 may determine whether the task is repetitive based on the stored dialogue time, the user's intention, the vehicle state, or whether the task was completed.
In the repetitive task processor 152, a policy for determining whether a task is repetitive, i.e., whether to terminate the pre-utterance context, based on the information stored in the task processing DB 145b may be stored. The repetitive task processor 152 may determine whether the task related to the current pre-utterance context is repetitive according to the stored policy and, when it is determined to be a repetitive task, terminate the pre-utterance context.
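One possible form of such a policy is sketched below; the record fields, status values, and the 30-minute reference period are assumptions made for this illustration rather than the stored schema of the task processing DB 145b.

```python
# A hedged sketch of one possible repetition policy.
import time

REFERENCE_PERIOD_S = 30 * 60  # assumed reference period

def is_repetitive_task(task_db, context, now):
    """Return True if the pre-utterance context should be terminated."""
    for record in task_db:
        if record["context"] != context:
            continue
        if record["status"] == "in_progress":  # guidance dialog or task running
            return True
        if record["status"] == "rejected":     # user previously declined
            return True
        if now - record["time"] < REFERENCE_PERIOD_S:
            return True                        # too soon to repeat the guidance
    return False

task_db = [{"context": "gasoline_shortage", "status": "rejected",
            "time": time.time() - 600}]
print(is_repetitive_task(task_db, "gasoline_shortage", time.time()))  # True
```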
In the above example, the case where the dialogue system 100 includes the pre-utterance determiner 151, the repetitive task processor 152, the pre-utterance condition table 145a, and the task processing DB145b has been described.
However, the example of the dialog system 100 is not limited thereto, and thus the components shown in fig. 22A and 22B may also perform the operations of the above-described components.
For example, the context understanding section 112c may perform an operation corresponding to the pre-utterance determiner 151 that determines whether a pre-utterance condition is satisfied, and an operation corresponding to the repetitive task processor 152 that processes the repetitive task.
The information stored in the pre-utterance condition table 145a may be stored in the context understanding table 145, and the information stored in the task processing DB 145b may be stored in a dialog and action state DB 147, which will be described later.
Referring to the example of fig. 25D, in the pre-utterance condition table 145a, a pre-utterance condition related to context information for each piece of context information and a pre-utterance message output when the corresponding pre-utterance condition is satisfied may be stored.
That is, the dialogue system 100 can acquire the pre-utterance message stored in the pre-utterance condition table based on the context information and the pre-utterance condition.
When the context information relates to whether a passenger boards the vehicle, the pre-utterance condition may be satisfied when it is determined that a passenger has boarded the vehicle. When the pre-utterance condition is satisfied because it is determined that a passenger has boarded the vehicle, the pre-utterance determiner 151 may determine that the pre-utterance context is caused by the passenger boarding the vehicle and may generate a pre-utterance trigger signal.
The pre-utterance determiner 151 may send the pre-utterance trigger signal to the dialog input manager 111c together with the stored pre-utterance message. For example, in the case of a pre-utterance context related to a passenger boarding the vehicle, a pre-utterance message requesting the passenger's identification information, such as "Who are you? Tell me your name.", may be sent to the dialog input manager 111c.
In addition, for the context information related to whether a passenger boards the vehicle, the pre-utterance condition may also be satisfied when it is determined that no passenger has boarded the vehicle. When the pre-utterance condition is satisfied because it is determined that no passenger has boarded, the pre-utterance determiner 151 may determine that the pre-utterance context is caused by the absence of a boarding passenger and may generate a pre-utterance trigger signal.
The pre-utterance determiner 151 may transmit the pre-utterance trigger signal to the dialog input manager 111c along with the stored pre-utterance message. For example, in a pre-utterance context related to no passenger boarding the vehicle, a pre-utterance message for verifying whether a passenger is present, such as "Is another passenger boarding the vehicle?", may be transmitted to the dialog input manager 111c.
In addition, when the context information relates to whether a potential passenger will board the vehicle, the pre-utterance condition may be satisfied when the dialogue system 100 estimates that a potential passenger is likely to board the vehicle. When the pre-utterance condition is satisfied due to the estimated likelihood that a potential passenger will board, the pre-utterance determiner 151 may determine that the pre-utterance context is caused by the potential passenger boarding the vehicle and may generate a pre-utterance trigger signal.
The pre-utterance determiner 151 may send the pre-utterance trigger signal to the dialog input manager 111c along with the stored pre-utterance message. For example, in the case of a pre-utterance context in which a potential passenger will board the vehicle, a pre-utterance message verifying whether the passenger is present, such as "Who will board the vehicle on the way? Tell me his/her name.", may be sent to the dialog input manager 111c.
In addition, when the context information relates to the period before reaching a stop point, the pre-utterance condition may be satisfied when a change in the number of passengers is estimated. When the pre-utterance condition is satisfied due to the estimated change in the number of passengers, the pre-utterance determiner 151 may determine that the pre-utterance context relates to the period before reaching the stop point and may generate a pre-utterance trigger signal.
The pre-utterance determiner 151 may transmit the pre-utterance trigger signal to the dialog input manager 111c together with the stored pre-utterance message. For example, in a pre-utterance context related to the period before reaching the stop point, pre-utterances related to the estimation result of the change in the number of passengers, such as "A will leave at the stop point", "B will leave the vehicle at the stop point and then board the vehicle again", "C will not leave at the stop point", and "D will board the vehicle at the stop point", may be transmitted to the dialogue input manager 111c.
In addition, when the context information relates to the period after leaving a stop point, the pre-utterance condition may be satisfied when a change in the number of passengers was estimated. When the pre-utterance condition is satisfied due to the estimated change in the number of passengers, the pre-utterance determiner 151 may determine that the pre-utterance context relates to the period after leaving the stop point and may generate a pre-utterance trigger signal.
The pre-utterance determiner 151 may transmit the pre-utterance trigger signal to the dialog input manager 111c together with the stored pre-utterance message. For example, in the pre-utterance context after leaving the stop point, pre-utterances verifying whether the estimation result of the change in the number of passengers is correct, such as "Did A leave?", "Did B board the vehicle again?", "Is C still on board?", and "Did D board the vehicle?", may be transmitted to the dialog input manager 111c to verify the actual change in the number of passengers after leaving the stop point.
In addition, when the context information relates to the state after leaving a stop point, the pre-utterance condition may be satisfied when the estimation result of the change in the number of passengers is compared with the actual change in the number of passengers after leaving the stop point. When the pre-utterance condition is satisfied as a result of this comparison, the pre-utterance determiner 151 may determine that the pre-utterance context relates to the period after leaving the stop point and may generate a pre-utterance trigger signal.
The pre-utterance determiner 151 may transmit the pre-utterance trigger signal to the dialog input manager 111c together with the stored pre-utterance message. For example, in the pre-utterance context after leaving the stop point, a pre-utterance indicating that the estimated change in the number of passengers differs from the actual change after leaving the stop point, such as "The number of present passengers is different from the estimation result of the change in the number of passengers", or a pre-utterance indicating that the estimated change matches the actual change, such as "The number of present passengers is the same as the estimation result of the change in the number of passengers", may be transmitted to the dialog input manager 111c.
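The comparison just described might look roughly like the following; the set-based data shapes and the message strings are assumptions made for this sketch.

```python
# Illustrative comparison of the estimated and observed sets of passengers
# after a stop point.
def passenger_change_pre_utterance(estimated_onboard, observed_onboard):
    if estimated_onboard == observed_onboard:
        return ("The number of present passengers is the same as the "
                "estimation result of the change in the number of passengers.")
    return ("The number of present passengers is different from the "
            "estimation result of the change in the number of passengers.")

estimated = {"B", "C", "D"}  # estimated before the stop point
observed = {"B", "C"}        # determined after leaving the stop point
print(passenger_change_pre_utterance(estimated, observed))
```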
In addition, when the context information relates to whether a passenger boards the vehicle, the pre-utterance condition may be satisfied when it is determined that the characteristics of the passenger boarding the vehicle match the stored passenger information. When the pre-utterance condition is satisfied because the characteristics of the boarding passenger match the stored passenger information, the pre-utterance determiner 151 may determine that the pre-utterance context is caused by the passenger boarding the vehicle and may generate a pre-utterance trigger signal.
The pre-utterance determiner 151 may transmit the pre-utterance trigger signal to the dialog input manager 111c together with the stored pre-utterance message. For example, in a pre-utterance context caused by a passenger boarding the vehicle, a pre-utterance message for verifying whether the current passenger also boarded on a previous trip, such as "Are you 00?", may be transmitted to the dialogue input manager 111c.
As described above, the dialogue system 100 can determine whether a passenger boards the vehicle and output a pre-utterance for recognizing the passenger by using the pre-utterance condition table 145a.
Specifically, the dialogue system 100 may determine that the passenger boards the vehicle based on at least one of dialogue between occupants in the vehicle and vehicle operation information. For example, the dialogue system 100 may determine that a passenger boards the vehicle based on a dialogue between occupants in the vehicle input through the voice input processor 111. The occupants in the vehicle may include a driver and at least one passenger, and the vehicle operation information may include operation information of the information input device 220 other than voice.
The determination by the dialogue system 100 that the passenger gets on the vehicle may be performed within a certain period of time from when the vehicle 200 starts running or within a certain period of time from when the vehicle 200 stops running.
The voice input processor 111 may distinguish the voice of each passenger based on the dialogue between occupants in the vehicle input through the voice input device 210 provided in the vehicle 200 and the voice input device 410 provided in the mobile device 400. Specifically, the voice input processor 111 may distinguish the voice of each passenger by comparing the non-verbal and verbal speech features of each passenger's voice input through the voice input device 210. The non-verbal speech features may include the pitch, intensity, breathing, and speed of the passenger's speech. The verbal speech features may include dialects, slang, and accents in the passenger's speech.
Accordingly, the voice input processor 111 may determine whether each passenger gets on the vehicle by distinguishing the voice of each passenger input through the voice input device 210 and the voice input device 410 provided in the mobile device 400 based on one or more voice features.
The voice features may include at least one of non-verbal features and verbal features.
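A deliberately simplified sketch of such feature-based speaker discrimination follows; real systems would use much richer features, and the feature values, scaling factor, and distance threshold here are assumptions made only for illustration.

```python
# A simplified sketch of distinguishing passengers by non-verbal voice
# features (pitch and speech rate here).
import math

KNOWN_VOICES = {
    "driver": {"pitch_hz": 110.0, "rate_wps": 2.5},
    "passenger_1": {"pitch_hz": 210.0, "rate_wps": 3.1},
}

def identify_speaker(features, threshold=30.0):
    """Return the closest known speaker, or None for a new passenger."""
    best_name, best_dist = None, float("inf")
    for name, ref in KNOWN_VOICES.items():
        dist = math.hypot(features["pitch_hz"] - ref["pitch_hz"],
                          10.0 * (features["rate_wps"] - ref["rate_wps"]))
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= threshold else None

print(identify_speaker({"pitch_hz": 205.0, "rate_wps": 3.0}))  # passenger_1
```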
The dialogue between occupants in the vehicle, which is input through the voice input processor 111 to determine whether a passenger boards the vehicle, refers not to an utterance directed at the vehicle 200 to convey an intention, but to a conversation between the occupants in the vehicle, including the driver.
The contextual information processor 112 of the dialog system 100 can determine that a passenger has boarded the vehicle based on the vehicle operation information. That is, the dialogue system 100 may determine boarding based on the vehicle operation information in order to detect a passenger who could not be determined to have boarded through the voice input processor 111 because the passenger did not participate in the dialogue.
Specifically, the contextual information processor 112 may detect operation of the information input device 220 by the passenger other than speech. The information input device 220 other than voice may include a window adjustment button, a seat adjustment button, and an air conditioning adjustment button provided at the side of the passenger seat 254b and the rear seats 254c and 254d, respectively, to determine the passenger boarding vehicle and the seat position of the passenger. When the operation of the information input device 220 other than voice is detected, the contextual information processor 112 may acquire the vehicle operation information based on the operation of the information input device 220 other than voice.
The vehicle operation information may include at least one of window adjustment button operation information, seat adjustment button operation information, or air conditioning adjustment button operation information related to the passenger seat 254b and the rear seats 254c and 254 d.
The contextual information processor 112 may determine that the passenger is boarding the vehicle based on the vehicle operation information.
That is, the input processor 110 may collect passenger boarding information, indicating a context in which a passenger boards the vehicle or a context in which no passenger boards the vehicle, through at least one of the voice input processor 111 and the context information processor 112.
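Inferring boarding and seat position from operation of the information input device 220 other than voice might be sketched as below; the event names and seat identifiers are assumptions made for this illustration.

```python
# Sketch of inferring boarding and seat position from non-voice button
# operations; event names and seat identifiers are assumptions.
BUTTON_TO_SEAT = {
    "window_button_passenger": "passenger_seat_254b",
    "seat_button_rear_left": "rear_seat_254c",
    "ac_button_rear_right": "rear_seat_254d",
}

def on_button_operation(event, occupancy):
    """Mark the seat associated with the operated button as occupied."""
    seat = BUTTON_TO_SEAT.get(event)
    if seat is not None:
        occupancy[seat] = True  # passenger boarding information
    return occupancy

print(on_button_operation("seat_button_rear_left", {}))  # {'rear_seat_254c': True}
```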
When the dialogue system 100 determines that a passenger has boarded the vehicle within a certain period of time from when the vehicle 200 starts running or from when the vehicle 200 stops running, the dialogue system 100 may output a pre-utterance requesting identification information. Specifically, when it is determined that a passenger has boarded, the dialogue system 100 may output a pre-utterance requesting the passenger's identification information.
For example, when it is determined that a passenger has boarded the vehicle, the dialog system 100 may output a pre-utterance requesting the passenger's identification information, such as "Who are you? Tell me your name."
The pre-utterance determiner 151 of the input processor 110 may determine whether to output a pre-utterance by checking the pre-utterance condition for passenger boarding against the context information related to whether a passenger boards the vehicle. When the context information satisfies the pre-utterance condition corresponding to the determination that a passenger has boarded, the pre-utterance determiner 151 may determine that a pre-utterance context exists and generate a pre-utterance trigger signal.
The pre-utterance determiner 151 may acquire the pre-utterance message corresponding to the pre-utterance context associated with the passenger boarding the vehicle, such as "Who are you? Tell me your name." When the pre-utterance determiner 151 sends the pre-utterance trigger signal and the pre-utterance message to the dialog input manager 111c, the dialog input manager 111c may send the pre-utterance message to the dialog manager 120.
The dialog manager 120 can generate a dialog task for outputting the transmitted pre-utterance message and transmit the pre-utterance message to the result processor 130. The results processor 130 can output the input pre-speech message via speaker 232.
The dialog system 100 can identify the passenger by receiving the passenger's utterance. Specifically, the dialog system 100 can identify the passenger by receiving an utterance of the passenger with respect to the pre-utterance message.
For example, the passenger may say "I am 00" in response to the pre-utterance message requesting the passenger's identification information. That is, the passenger may utter a message including his/her name in response to the pre-utterance message.
When the passenger's utterance is input, the voice input processor 111 recognizes it. The passenger's utterance may be input through the voice input device 210 provided in the vehicle 200 or the voice input device 410 provided in the mobile device 400.
The speech recognizer 111a may recognize an input utterance of the user and output the utterance in a text form. The natural language understanding section 111b may apply a natural language understanding technique to the utterance in text form and output a result of the natural language understanding.
The natural language understanding process may include performing morpheme analysis on an utterance in text form and recognizing a name based on a result of the morpheme analysis.
In addition, the natural language understanding section 111b may increase the recognition rate of names using the driver's phone book stored in the long-term memory 143. Specifically, the natural language understanding section 111b may increase the recognition rate of the name by comparing the name contained in the passenger utterance with the name contained in the phonebook.
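A hedged sketch of this phonebook-assisted name matching follows, using simple similarity matching from the Python standard library; the patent does not specify the matching criterion, and the names and cutoff used here are placeholders.

```python
# Raising the name recognition rate with the driver's phonebook via
# simple string-similarity matching (an assumed criterion).
import difflib

def match_against_phonebook(recognized_name, phonebook, cutoff=0.6):
    """Return the closest phonebook entry, or the raw name if none is close."""
    matches = difflib.get_close_matches(recognized_name, phonebook,
                                        n=1, cutoff=cutoff)
    return matches[0] if matches else recognized_name

phonebook = ["Jungmi", "Donghee", "Dongsu"]
print(match_against_phonebook("Donghi", phonebook))  # Donghee
```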
The passenger determiner 111d may verify the name of the passenger based on the output of the natural language understanding section 111b to identify the identity of the passenger.
Thus, based on the occupant's utterance, the dialog system 100 can identify the identity of the occupant who uttered the message.
In addition, the dialog system 100 may determine the passenger's seat position in the vehicle by estimating the location of the utterance based on the direction and magnitude of the occupant's utterance. The dialogue system 100 may determine the seat position of the passenger in the vehicle based on the vehicle operation information related to the window adjustment buttons, the seat adjustment buttons, and the air conditioning adjustment buttons provided on the passenger seat 254b and the rear seats 254c and 254 d. Thus, the dialog system 100 may generate seat position information for each passenger by mapping the passenger with the seat position in the vehicle.
When the seat position of the passenger is changed while driving, the dialogue system 100 may estimate a change in the seat position of the passenger based on the utterance of the passenger and the vehicle operation information, and determine the changed seat position of the passenger. In this case, the dialog system 100 may generate seat position information by applying the changed passenger seat position.
Passenger information about the identified passenger may be stored in the storage device 140 in real time, wherein the passenger information may include personal identification information, one or more voice characteristics of the passenger's voice, and seat position information.
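The per-passenger record just described might be represented as follows; the field names are illustrative assumptions, not the actual schema of the storage device 140.

```python
# A sketch of the per-passenger record kept in the storage device 140.
from dataclasses import dataclass, field

@dataclass
class PassengerInfo:
    name: str                              # personal identification information
    voice_features: dict = field(default_factory=dict)  # e.g., pitch, speed
    seat_position: str = "unknown"         # e.g., "rear_seat_254c"

record = PassengerInfo(name="00",
                       voice_features={"pitch_hz": 205.0},
                       seat_position="rear_seat_254c")
print(record)
```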
When it is determined that no passenger has boarded the vehicle, the dialog system 100 may output a pre-utterance to verify whether a passenger is present. Specifically, when the dialogue system 100 determines that no passenger has boarded within a certain period of time from when the vehicle 200 starts running or from when the vehicle 200 stops running, the dialogue system 100 may output a pre-utterance to verify whether a passenger is present.
For example, when the dialogue system 100 determines that no passenger has boarded the vehicle within a certain period of time from when the vehicle 200 starts running or from when the vehicle 200 stops running, the dialogue system 100 may output a pre-utterance to verify whether a passenger is present, such as "Are there any other passengers boarding the vehicle?"
The pre-utterance determiner 151 may determine whether to output a pre-utterance by checking the pre-utterance condition for the absence of a boarding passenger against the context information related to whether a passenger boards the vehicle. When the context information satisfies that pre-utterance condition, the pre-utterance determiner 151 may determine that a pre-utterance context exists and generate a pre-utterance trigger signal.
The pre-utterance determiner 151 may acquire the pre-utterance message corresponding to the pre-utterance context in which no passenger has boarded the vehicle, such as "Are there any other passengers boarding the vehicle?" When the pre-utterance determiner 151 transmits the pre-utterance trigger signal and the pre-utterance message to the dialog input manager 111c, the dialog input manager 111c may transmit the pre-utterance message to the dialog manager 120.
The dialog manager 120 can generate a dialog task for outputting the transmitted pre-utterance message and transmit the pre-utterance message to the result processor 130. The results processor 130 can output the input pre-speech message via speaker 232.
The dialog system 100 may verify whether a passenger is present by receiving the driver's utterance. Specifically, the dialog system 100 can verify whether a passenger is present by receiving the driver's utterance regarding the pre-utterance message.
For example, the driver may answer "no" or "yes" in response to the pre-utterance message verifying whether a passenger is present. That is, the driver may utter a message including a response indicating whether a passenger is present.
When the speech of the driver is input, the voice input processor 111 recognizes the input speech of the driver. The driver's speech may be input through the voice input device 210 provided in the vehicle 200 or the voice input device 410 provided in the mobile device 400.
The speech recognizer 111a may recognize an input utterance of the user and output the utterance in a text form. The natural language understanding section 111b may apply a natural language understanding technique to the utterance in text form and output a result of the natural language understanding.
The natural language understanding section 111b can recognize the entity name from the utterance. The entity name may be a proper noun, such as a person name, a place name, an organization name, a time, a date, and a currency, and the entity name recognition may be configured to recognize the entity name in the sentence and determine a type of the recognized entity name. The natural language understanding section 111b can acquire important keywords from the sentence using the entity name recognition and recognize the meaning of the sentence.
Specifically, the natural language understanding process may include performing morpheme analysis on an utterance in text form, and recognizing an entity name based on a result of the morpheme analysis.
The output of the natural language understanding section 111b as a result of the natural language understanding may include the entity name and the result of the morpheme analysis corresponding to the utterance of the passenger.
The passenger determiner 111d can recognize the presence of the passenger based on the output of the natural language understanding section 111 b.
Thus, the voice input processor 111 may verify the presence of the passenger based on the driver's utterance.
By identifying the passengers, the dialogue system 100 may estimate, and provide guidance on, the possibility of each passenger leaving the vehicle at a stop point and the possibility of each passenger boarding the vehicle again after leaving at the stop point.
In addition, the dialogue system 100 may perform recognition of an occupant based on a dialogue between occupants in the vehicle and vehicle operation information for a period of time from when the vehicle 200 starts traveling or for a period of time from when the vehicle 200 stops traveling. Thus, the dialog system 100 can continuously identify new passengers.
In addition, when the dialogue system 100 erroneously recognizes a passenger who is already in the vehicle as a new passenger and outputs a pre-utterance, the passenger can indicate that he/she is an existing passenger by uttering a message including his/her name.
The dialogue system 100 can estimate the variation in the number of passengers by using the pre-utterance condition table 145a and output a pre-utterance regarding the estimation result of the variation in the number of passengers.
Specifically, as described above, the dialog system 100 may identify a passenger. The dialogue system 100 may determine that a passenger boards the vehicle based on at least one of dialogue between occupants in the vehicle and vehicle operation information, and identify the passenger by using the pre-speech.
The dialogue system 100 may generate passenger number information based on the dialogue between occupants in the vehicle. Specifically, the dialogue system 100 may determine the possibility of each passenger leaving the vehicle at a specific stop point and the possibility of each passenger boarding the vehicle again after leaving at that stop point by continuously receiving the dialogue between occupants in the vehicle.
For example, the voice input processor 111 of the dialogue system 100 may continuously receive dialogs between occupants in the vehicle through the voice input device 210 provided in the vehicle 200 or the voice input device 410 provided in the mobile device 400.
The speech recognizer 111a may recognize an input utterance of the user and output the utterance in a text form. The natural language understanding section 111b may apply a natural language understanding technique to the utterance in text form and output a result of the natural language understanding.
The natural language understanding section 111b can recognize the entity name from the utterance. The entity name may be a proper noun, such as a person name, a place name, an organization name, a time, a date, and a currency, and the entity name recognition may be configured to recognize the entity name in the sentence and determine a type of the recognized entity name. The natural language understanding section 111b may acquire important keywords from the sentence using the entity name recognition and recognize the meaning of the sentence.
Specifically, the natural language understanding process may include performing morpheme analysis on an utterance in text form, and recognizing an entity name based on a result of the morpheme analysis.
The output of the natural language understanding section 111b as a result of the natural language understanding may include the entity name and the result of the morpheme analysis corresponding to the utterance of the passenger.
The passenger determiner 111d may estimate the variation in the number of passengers based on the output of the natural language understanding section 111 b. Specifically, the passenger determiner 111d may estimate the variation in the number of passengers at a specific stop point by analyzing the words of the passengers.
For example, when a passenger utters a message indicating that he/she will leave at a specific stop point, such as "I will leave at Seoul station soon", the speech recognizer 111a may output the passenger's utterance in text form, and the natural language understanding section 111b may apply a natural language understanding technique to the utterance in text form and output the result of natural language understanding.
Specifically, the natural language understanding section 111b may output morpheme analysis results such as "I", "soon", "Seoul station" (corresponding to an entity name), and "leave".
The passenger determiner 111d may estimate that a specific passenger will exit at a specific stopping point based on the entity name and morpheme analysis result of the natural language understanding section 111 b.
Specifically, when a passenger utters a message indicating that he/she will leave at a specific stop point, such as "I will leave at Seoul station soon", the passenger determiner 111d may estimate that the passenger will leave at Seoul station in the near future. That is, the passenger determiner 111d may estimate the possibility of a specific passenger leaving at a specific stop point and the time at which this will occur.
Further, when a passenger utters a message indicating that he/she will leave at a specific stop point and then board the vehicle again, such as "I will leave at Seoul station and board the vehicle again", the natural language understanding section 111b may output morpheme analysis results such as "I", "leave", "Seoul station" (corresponding to an entity name), "again", and "board".
The passenger determiner 111d may estimate that a specific passenger will leave and board the vehicle again at a specific stopping point based on the entity name and morpheme analysis result of the natural language understanding section 111 b.
Specifically, when a passenger utters a message indicating that he/she will leave and board the vehicle again at a specific stop point, such as "I will leave and board the vehicle again at Seoul station", the passenger determiner 111d may estimate that the passenger will leave and board the vehicle again at Seoul station. That is, the passenger determiner 111d may estimate the possibility that a specific passenger boards the vehicle again after leaving.
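A very rough sketch of this estimation step follows; plain keyword matching stands in for the natural language understanding section 111b, and the labels and function name are assumptions made only for illustration.

```python
# Estimating a passenger-count change from morpheme analysis results;
# keyword matching is a placeholder for real natural language understanding.
def estimate_change(speaker, morphemes, stop_point):
    """Classify the speaker's plan at the given stop point."""
    if "leave" in morphemes and "board" in morphemes:
        return (speaker, stop_point, "leave_and_reboard")
    if "leave" in morphemes:
        return (speaker, stop_point, "leave")
    return (speaker, stop_point, "no_change")

# From "I will leave at Seoul station and board the vehicle again"
print(estimate_change("A", ["I", "leave", "Seoul station", "again", "board"],
                      "Seoul station"))
# ('A', 'Seoul station', 'leave_and_reboard')
```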
Additionally, by receiving a call conversation in the vehicle, the dialog system 100 may determine the likelihood of the called party boarding the vehicle and thereby estimate the number of potential passengers.
Specifically, when a passenger in the vehicle utters a message indicating that a particular potential passenger will board the vehicle at a particular stop point, such as "We will meet soon at Seoul station", the natural language understanding section 111b may output morpheme analysis results such as "soon", "Seoul station" (corresponding to an entity name), and "meet".
The passenger determiner 111d may estimate that the potential passenger will board the vehicle at a specific stopping point based on the entity name and the morpheme analysis result of the natural language understanding section 111 b.
In particular, when an occupant in the vehicle utters a message indicating that a particular potential passenger will board the vehicle at a particular stop point, such as "We will meet soon at Seoul station", the passenger determiner 111d can estimate that the potential passenger will board the vehicle at Seoul station. That is, the passenger determiner 111d may estimate the likelihood of a potential passenger boarding the vehicle.
When estimating the likelihood of the potential passenger boarding the vehicle, the dialog system 100 may output a pre-utterance for verifying the likelihood of the potential passenger boarding the vehicle.
For example, when estimating the likelihood of a potential passenger boarding the vehicle, the dialog system 100 may output a pre-utterance for verifying that likelihood, such as "Who boards the vehicle midway? Tell me his/her name."
The pre-utterance determiner 151 may determine whether to output a pre-utterance according to a pre-utterance condition related to the estimation of the likelihood of boarding the vehicle, based on context information related to whether a potential passenger will board the vehicle. In addition, when the context information related to whether the potential passenger will board the vehicle satisfies the pre-utterance condition related to the estimation of the likelihood that the potential passenger boards the vehicle, the pre-utterance determiner 151 may determine that it is a pre-utterance context and generate a pre-utterance trigger signal.
The pre-utterance determiner 151 may obtain a pre-utterance message corresponding to the pre-utterance context in which a potential passenger will board the vehicle, e.g., "Who boards the vehicle midway? Tell me his/her name."
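A minimal sketch of this condition check, assuming a simple dictionary of context information and a hypothetical condition table entry; the names and layout below are not taken from the patent:

```python
# Illustrative sketch of the pre-utterance determiner 151 checking a
# pre-utterance condition against context information and generating a
# trigger signal. All names and the table layout are assumptions.
from dataclasses import dataclass

@dataclass
class PreUtteranceTrigger:
    context: str   # which pre-utterance context was detected
    message: str   # message forwarded to the dialog input manager 111c

# Hypothetical entry of a pre-utterance condition table:
PRE_UTTERANCE_MESSAGES = {
    "potential_passenger_boarding": "Who boards the vehicle midway? Tell me his/her name.",
}

def check_pre_utterance(context_info):
    """Return a trigger when the context satisfies a pre-utterance condition."""
    if context_info.get("potential_passenger_estimated"):
        key = "potential_passenger_boarding"
        return PreUtteranceTrigger(context=key, message=PRE_UTTERANCE_MESSAGES[key])
    return None  # no pre-utterance context

trigger = check_pre_utterance({"potential_passenger_estimated": True})
if trigger:
    print(trigger.message)
```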
The dialog manager 120 can generate a dialog task for outputting the transmitted pre-utterance message and transmit the pre-utterance message to the result processor 130. The result processor 130 can output the received pre-utterance message via the speaker 232.
The dialog system 100 can verify the likelihood of a potential passenger boarding the vehicle by receiving occupant utterances in the vehicle. In particular, the dialog system 100 can verify whether a potential passenger is present by receiving an utterance of an occupant regarding the pre-utterance message.
The passenger determiner 111d may estimate the variation in the number of passengers in the vehicle based on the output of the natural language understanding section 111b. Specifically, the passenger determiner 111d may estimate the number of potential passengers based on a call conversation, and may also estimate the likelihood of each passenger leaving the vehicle and the likelihood of each passenger boarding the vehicle again after leaving, based on the dialogue between the occupants in the vehicle.
The passenger determiner 111d may generate the passenger number information based on the estimation result of the variation of the passenger number.
That is, the passenger determiner 111d may generate the passenger number information based on the possibility that each passenger leaves the vehicle at the stopping point, the possibility that each passenger gets on the vehicle again after leaving at the stopping point, and the possibility that a potential passenger gets on the vehicle at the stopping point.
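One way to picture the passenger number information is as a small data structure over per-passenger likelihoods; the record layout and the net-change helper below are illustrative assumptions:

```python
# A minimal data-structure sketch for the "passenger number information"
# generated by the passenger determiner 111d; field names are assumptions.
from dataclasses import dataclass, field

@dataclass
class PassengerEstimate:
    name: str
    leaves_at_stop: bool = False           # likely to leave at the stop point
    reboards_after_leaving: bool = False   # leaves and boards again
    boards_at_stop: bool = False           # potential passenger boarding there

@dataclass
class PassengerNumberInfo:
    stop_point: str
    estimates: list = field(default_factory=list)

    def expected_change(self):
        """Net change in occupants after the stop point, under the estimates."""
        change = 0
        for e in self.estimates:
            if e.leaves_at_stop and not e.reboards_after_leaving:
                change -= 1
            if e.boards_at_stop:
                change += 1
        return change

info = PassengerNumberInfo("Seoul Station", [
    PassengerEstimate("A", leaves_at_stop=True),
    PassengerEstimate("B", leaves_at_stop=True, reboards_after_leaving=True),
    PassengerEstimate("D", boards_at_stop=True),
])
print(info.expected_change())  # -> 0 (A leaves, B returns, D boards)
```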
Before reaching the stop point, the dialogue system 100 may output a pre-utterance related to the estimation result of the change in the number of passengers, based on the passenger number information.
For example, when it is determined that the vehicle 200 is just before reaching the stop point, the dialogue system 100 may output a pre-utterance related to the estimation result of the change in the number of passengers, such as "A will leave at the stop point", "B will leave the vehicle at the stop point and then board the vehicle again", "C will not leave at the stop point", and "D will board the vehicle at the stop point".
That is, before reaching the stop point, the dialogue system 100 may output a pre-utterance based on the passenger number information, relating to the likelihood that each passenger leaves the vehicle at the stop point, the likelihood that each passenger boards the vehicle again after leaving at the stop point, and the likelihood that a potential passenger boards the vehicle at the stop point.
However, the dialogue system 100 may output the pre-utterance relating to the estimation result of the change in the number of passengers based on the passenger number information not only before reaching the stop point but also after reaching the stop point.
Further, the content related to the likelihood of each passenger leaving the vehicle at the stop point, boarding the vehicle again at the stop point, or boarding the vehicle at the stop point may include a message for passengers leaving the vehicle, such as "Goodbye, see you next time", and a message for passengers boarding the vehicle again after leaving, such as "Have a pleasant trip and come back safely".
The dialogue system 100 may determine whether the vehicle is just before or just after the stop point is reached based on vehicle state information such as the vehicle position and the vehicle speed detected by the vehicle detector 260.
Specifically, when the gear is placed in the P range, the dialogue system 100 may determine that the vehicle 200 has reached the stop point, and when the speed is equal to or less than 10 km/h, the dialogue system 100 may determine that the vehicle 200 is just before reaching the stop point.
The pre-utterance determiner 151 may determine whether to output a pre-utterance according to a pre-utterance condition related to the estimation of the change in the number of passengers, based on context information indicating that the vehicle is about to reach the stop point. The pre-utterance determiner 151 may determine that the pre-utterance condition related to the estimation of the change in the number of passengers is satisfied based on the passenger number information transmitted from the passenger determiner 111d. In addition, when the context information satisfies that pre-utterance condition, the pre-utterance determiner 151 may determine that it is a pre-utterance context and generate a pre-utterance trigger signal.
The pre-utterance determiner 151 may acquire a pre-utterance message corresponding to a pre-utterance context in which the number of passengers is estimated to vary, such as "a will leave at a stop point", "B will leave the vehicle at the stop point and then board the vehicle again", "C will not leave at the stop point", and "D will board the vehicle at the stop point", based on the passenger number information. When the pre-utterance determiner 151 transmits the pre-utterance trigger signal and the pre-utterance message to the dialog input manager 111c, the dialog input manager 111c may transmit the pre-utterance message to the dialog manager 120. At this time, a pre-utterance trigger signal or a signal indicating a pre-utterance context may be transmitted together with the pre-utterance message.
The dialog manager 120 can generate a dialog task for outputting the transmitted pre-utterance message and transmit the pre-utterance message to the result processor 130. The result processor 130 can output the received pre-utterance message via the speaker 232.
The dialogue system 100 may compare the estimation result of the change in the number of passengers with the result of the change in the number of passengers after leaving from the stop point based on the passenger number information.
The dialogue system 100 may determine whether the vehicle has departed from the stop point based on vehicle state information such as the vehicle position and the vehicle speed detected by the vehicle detector 260.
Specifically, the dialogue system 100 may determine that the vehicle has departed from the stop point based on signals such as the parking brake being released, the ignition being turned on, or the brake pedal being pressed.
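A minimal sketch combining the arrival and departure checks described above, with the thresholds taken from the text; the function and field names are assumptions:

```python
# Sketch of the vehicle-state checks: gear in P means the vehicle has reached
# the stop point, a speed of 10 km/h or less means it is just before the stop
# point, and (once stopped) a released parking brake, ignition on, or brake
# pedal input means it has departed. The classification labels are assumptions.

def vehicle_phase(state, was_at_stop=False):
    """Classify the vehicle relative to a stop point from detector values."""
    if was_at_stop and (state.get("parking_brake_released")
                        or state.get("ignition_on")
                        or state.get("brake_pedal_on")):
        return "departed_stop_point"
    if state.get("gear") == "P":
        return "at_stop_point"
    if state.get("speed_kph", float("inf")) <= 10:
        return "just_before_stop_point"
    return "driving"

print(vehicle_phase({"gear": "D", "speed_kph": 8}))                       # just_before_stop_point
print(vehicle_phase({"gear": "P"}))                                       # at_stop_point
print(vehicle_phase({"parking_brake_released": True}, was_at_stop=True))  # departed_stop_point
```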
The dialogue system 100 may detect passengers through the voice input processor 111 and the contextual information processor 112, and recognize them through the passenger determiner 111d, to determine the result of the change in the number of passengers after departing from the stop point.
Therefore, when it is determined that the vehicle 200 departs from the stop point, the passenger determiner 111d of the dialogue system 100 may compare the estimation result of the change in the number of passengers based on the passenger number information with the result of the change in the number of passengers after departing from the stop point.
In addition, the dialogue system 100 may output a pre-utterance to compare the result of the estimation of the change in the number of passengers with the result of the change in the number of passengers after leaving from the stop point.
Specifically, the dialogue system 100 may output a pre-utterance determining whether the passenger who was going to leave at the stop point actually left, such as "Did A leave?", and a pre-utterance determining whether the passenger who was determined to board the vehicle again after leaving actually boarded again, such as "Did B board the vehicle again?".
In addition, the dialog system 100 may output a pre-utterance determining whether the passenger who was determined not to leave at the stop point indeed did not leave, such as "Is C still here?", and a pre-utterance determining whether the potential passenger who was to board the vehicle at the stop point actually boarded, such as "Did D board the vehicle?".
The pre-utterance determiner 151 may determine whether to output a pre-utterance according to a pre-utterance condition related to the estimation of the change in the number of passengers, based on context information related to the period after departing from the stop point. The pre-utterance determiner 151 may determine that the pre-utterance condition is satisfied based on the passenger number information transmitted from the passenger determiner 111d. In addition, when the context information after departing from the stop point satisfies the pre-utterance condition, the pre-utterance determiner 151 may determine that it is a pre-utterance context and generate a pre-utterance trigger signal.
The pre-utterance determiner 151 may acquire a pre-utterance message corresponding to a variation in the number of passengers. When the pre-utterance determiner 151 transmits the pre-utterance trigger signal and the pre-utterance message to the dialog input manager 111c, the dialog input manager 111c may transmit the pre-utterance message to the dialog manager 120. At this time, a pre-utterance trigger signal or a signal indicating that it is a pre-utterance context may be transmitted together with the pre-utterance message.
The dialog manager 120 can generate a dialog task for outputting the transmitted pre-utterance message and transmit the pre-utterance message to the result processor 130. The result processor 130 can output the received pre-utterance message via the speaker 232.
The dialog system 100 may verify the result of the change in the number of passengers by receiving occupant utterances in the vehicle. Specifically, the passenger determiner 111d of the dialogue system 100 may verify the result of the variation in the number of passengers by receiving the utterance of the occupant corresponding to the pre-utterance message.
For example, in response to a pre-utterance message asking whether the passenger who was going to leave at the stop point actually left, such as "Did A leave?", the occupant may say "Yes, he/she left" or "No, he/she boarded the vehicle again".
When an occupant's utterance is input in the vehicle, the voice input processor 111 recognizes it. The occupant's utterance may be input through the voice input device 210 provided in the vehicle 200 or the voice input device 410 provided in the mobile device 400.
The speech recognizer 111a may recognize an input utterance of the user and output the utterance in a text form. The natural language understanding section 111b may apply a natural language understanding technique to the utterance in text form and output a result of the natural language understanding.
Specifically, the natural language understanding process may include performing morpheme analysis on the utterance in text form, and recognizing a result of the variation in the number of passengers based on a result of the morpheme analysis.
Therefore, the passenger determiner 111d of the dialogue system 100 can verify the result of the change in the number of passengers based on the dialogue between the passengers in the vehicle.
The dialogue system 100 may output a pre-utterance related to a result of comparison between a result of estimation of a change in the number of passengers and a result of the change in the number of passengers after leaving from the stop point.
For example, the dialogue system 100 may output a pre-utterance message indicating that the estimation result differs from the actual result, such as "The result of the estimation of the change in the number of passengers is different from the current passengers", or a pre-utterance message indicating that the estimation result matches the actual result, such as "The result of the estimation of the change in the number of passengers is the same as the current passengers".
Specifically, the pre-utterance determiner 151 may determine whether to output the pre-utterance according to a pre-utterance condition related to the comparison between the estimation result of the change in the number of passengers and the result of the change in the number of passengers after departing from the stop point, based on context information related to the period after departing from the stop point. The pre-utterance determiner 151 may determine that the pre-utterance condition is satisfied based on the result of the comparison transmitted from the passenger determiner 111d. In addition, when the context information after departing from the stop point satisfies this pre-utterance condition, the pre-utterance determiner 151 may determine that it is a pre-utterance context and generate a pre-utterance trigger signal.
The pre-utterance determiner 151 may acquire a pre-utterance message indicating a result of the comparison based on a result of the comparison between the result of the estimation of the variation in the number of passengers and the result of the variation in the number of passengers after leaving the stop point. When the pre-utterance determiner 151 transmits the pre-utterance trigger signal and the pre-utterance message to the dialog input manager 111c, the dialog input manager 111c may transmit the pre-utterance message to the dialog manager 120. At this time, a pre-utterance trigger signal or a signal indicating a pre-utterance context may be transmitted together with the pre-utterance message.
The dialog manager 120 can generate a dialog task for outputting the transmitted pre-utterance message and transmit the pre-utterance message to the result processor 130. The result processor 130 can output the received pre-utterance message via the speaker 232.
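The comparison step itself can be sketched as follows, assuming the sets of passengers expected and actually recognized aboard are available; the helper and its name are illustrative:

```python
# Illustrative comparison between the estimated change in the number of
# passengers and the passengers actually detected after departing the stop
# point; the message wording follows the examples above but the helper is an
# assumption, not the patent's implementation.

def compare_passenger_change(expected_aboard, detected_aboard):
    """Both arguments are sets of passenger names expected/recognized aboard."""
    if expected_aboard == detected_aboard:
        return ("The result of the estimation of the change in the number of "
                "passengers is the same as the current passengers.")
    return ("The result of the estimation of the change in the number of "
            "passengers is different from the current passengers.")

# Estimate: A leaves, B reboards, C stays, D boards -> {B, C, D} expected aboard.
print(compare_passenger_change({"B", "C", "D"}, {"B", "C"}))  # D did not board
```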
Therefore, the driver can verify whether each passenger has left or boarded the vehicle based on the result of the comparison between the estimation result of the change in the number of passengers and the actual result after departing from the stop point, and can concentrate on vehicle management, such as driving and parking, without having to keep track of passengers leaving or boarding.
In addition, it is possible to prevent a situation in which a passenger is left behind at the stop point because he/she could not board the vehicle again, or a situation in which a passenger fails to get off when the vehicle reaches the stop point.
When the traveling of the vehicle is terminated, the dialogue system 100 may store traveling-related information and passenger information about each passenger.
For example, when the travel of the vehicle is terminated, the storage device 140 of the dialogue system 100 may store information related to the travel of the vehicle and passenger information of each passenger boarding the vehicle while traveling.
Specifically, the storage device 140 of the dialogue system 100 may store travel-related information, such as the departure point, stop points, and destination of the trip, and passenger information, such as personal identification information, voice feature information, seat position information, boarding time information, leaving time information, boarding position information, and leaving position information.
That is, the storage device 140 of the dialogue system 100 may store travel-related information related to travel of the vehicle, such as a departure point, a stop point, and a destination of the travel, by collecting GPS values from the vehicle controller 240.
In addition, the storage device 140 of the dialogue system 100 may collect passenger identification information, voice feature information, seat position information, and passenger number information, and store passenger information such as personal identification information, voice feature information, seat position information, boarding time information, leaving time information, boarding position information, and leaving position information.
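The stored records might be organized as follows; the field names mirror the information listed above, but the layout itself is an assumption:

```python
# A minimal sketch of the travel-related information and per-passenger
# information the storage device 140 might keep; the record layout is an
# assumption based on the fields listed in the text.
from dataclasses import dataclass

@dataclass
class TravelRecord:
    departure_point: str
    stop_points: list
    destination: str

@dataclass
class PassengerRecord:
    identification: str
    voice_features: list      # e.g. an embedding of the passenger's voice
    seat_position: str
    boarding_time: str
    leaving_time: str
    boarding_position: str
    leaving_position: str

trip = TravelRecord("Home", ["Seoul Station"], "Busan")
passenger_a = PassengerRecord("A", [0.12, 0.83], "rear-left",
                              "09:00", "10:30", "Home", "Seoul Station")
```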
Further, the dialogue system 100 may determine whether a passenger boards the vehicle and output a pre-utterance asking whether the current passenger is the same passenger as in a previous trip, by using the pre-utterance condition table 145a.
Specifically, the dialogue system 100 may determine that a passenger boards the vehicle based on the dialogue between occupants in the vehicle and on vehicle operation information. The voice input processor 111 of the input processor 110 may determine that the passenger boards the vehicle by receiving the dialogue between passengers in the vehicle, and may acquire features such as each passenger's voice feature, seat position, boarding time, and boarding position.
The dialog system 100 may determine whether the passenger characteristics are the same as the stored passenger characteristics. Specifically, the voice input processor 111 of the dialogue system 100 may compare the characteristics of the passenger acquired from the storage device 140 with the passenger characteristics (such as the voice characteristics, the seat position, the boarding time, and the boarding vehicle position) of the passenger determined to board the vehicle.
For example, the voice input processor 111 may compare the voice feature, seat position, boarding time, and boarding position contained in the stored passenger information with the features of the passenger determined to board the vehicle.
When at least two of the voice feature, seat position, boarding time, and boarding position included in the passenger information are the same as the detected passenger features, the voice input processor 111 may determine that the features of the passenger determined to board the vehicle match the stored passenger information.
When comparing the voice feature, seat position, boarding time, and boarding position included in the passenger information with the detected features of the passenger, the voice input processor 111 may treat features that are similar within a certain range as the same.
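The matching rule described above, at least two features the same, with similarity within a certain range counting as the same, can be sketched as follows; the tolerances and field names are illustrative assumptions:

```python
# Sketch of the "at least two features match" rule; the thresholds are
# assumptions, and a single scalar stands in for the real voice-feature
# comparison.

def same_within(a, b, tolerance):
    """Numeric features similar within the tolerance count as 'the same'."""
    return abs(a - b) <= tolerance

def is_same_passenger(stored, detected):
    matches = 0
    if same_within(stored["voice_feature"], detected["voice_feature"], 0.05):
        matches += 1
    if stored["seat_position"] == detected["seat_position"]:
        matches += 1
    if same_within(stored["boarding_minutes"], detected["boarding_minutes"], 15):
        matches += 1
    if stored["boarding_position"] == detected["boarding_position"]:
        matches += 1
    return matches >= 2  # the "at least two" rule from the text

stored = {"voice_feature": 0.81, "seat_position": "rear-left",
          "boarding_minutes": 540, "boarding_position": "Home"}
detected = {"voice_feature": 0.83, "seat_position": "rear-left",
            "boarding_minutes": 700, "boarding_position": "Office"}
print(is_same_passenger(stored, detected))  # True: voice and seat match
```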
When it is determined that the features of the passenger determined to board the vehicle are the same as the stored passenger information, the dialogue system 100 may output a pre-utterance verifying whether the passenger participated in the previous trip.
For example, when the features of the passenger determined to board the vehicle are the same as the stored passenger information, the dialogue system 100 may output a pre-utterance verifying whether the passenger is the same as the passenger in the previous trip, such as "Are you 00?".
The pre-utterance determiner 151 may determine whether to output a pre-utterance according to a pre-utterance condition related to whether the features of the passenger determined to board the vehicle are the same as the stored passenger information, based on context information related to whether a passenger boards the vehicle. In addition, when the context information related to whether the passenger boards the vehicle satisfies that pre-utterance condition, the pre-utterance determiner 151 may determine that it is a pre-utterance context and generate a pre-utterance trigger signal.
The pre-utterance determiner 151 may acquire a pre-utterance message, such as "Are you 00?", corresponding to the pre-utterance context in which the passenger boards the vehicle. When the pre-utterance determiner 151 transmits the pre-utterance trigger signal and the pre-utterance message to the dialog input manager 111c, the dialog input manager 111c may transmit the pre-utterance message to the dialog manager 120.
The dialog manager 120 can generate a dialog task for outputting the transmitted pre-utterance message and transmit the pre-utterance message to the result processor 130. The result processor 130 can output the received pre-utterance message via the speaker 232.
The dialog system 100 may verify whether the passenger was engaged in a previous trip by receiving the words of the passenger in the vehicle. Specifically, the dialogue system 100 can verify whether the passenger is engaged in a previous trip by receiving the passenger utterance corresponding to the pre-utterance message.
For example, the passenger may say "yes" or "no" in response to a pre-verbal message asking the passenger whether or not to participate in a previous trip. That is, the passenger may speak a message including a response indicating whether the passenger participated in the previous trip in response to the pre-speech message asking whether the passenger participated in the previous trip.
When the passenger's utterance is input, the voice input processor 111 recognizes it. The passenger's utterance may be input through the voice input device 210 provided in the vehicle 200 or the voice input device 410 provided in the mobile device 400.
The speech recognizer 111a may recognize an input utterance of the user and output the utterance in a text form. The natural language understanding section 111b may apply a natural language understanding technique to the utterance in text form and output a result of the natural language understanding.
Specifically, the natural language understanding process may include performing morpheme analysis on the utterance in the text form, and recognizing whether the passenger is involved in the previous travel based on the result of the morpheme analysis.
Thus, the passenger determiner 111d of the dialogue system 100 can verify whether the passenger participated in the previous trip based on the utterance of the occupant in the vehicle.
When it is determined that the passenger is involved in the previous trip, the dialogue system 100 may generate the passenger number information based on the dialogue between the passengers in the vehicle and the stored passenger information. That is, the dialogue system 100 may additionally consider stored passenger information when generating passenger quantity information based on dialogue between passengers in a vehicle.
For example, the passenger determiner 111d of the dialogue system 100 may estimate the change in the number of passengers at the stop point based on the output of the natural language understanding section 111 b. Specifically, the passenger determiner 111d may estimate the possibility of each passenger leaving the vehicle and the possibility of each passenger getting on the vehicle again after leaving based on the dialogue between the occupants in the vehicle, and the passenger determiner 111d may estimate the number of potential passengers getting on the vehicle based on the call session in the vehicle.
When the change in the number of passengers is estimated based on the dialogue between the occupants in the vehicle, the passenger determiner 111d may increase the accuracy of the estimation result of the change in the number of passengers by using the departure time information and the information related to the position of departure from the vehicle among the stored passenger information.
Specifically, when it is estimated that a passenger will leave at a specific stop point based on the dialogue between the passengers in the vehicle, the passenger determiner 111d may verify the leaving time and leaving position in the previous trip by using the leaving time information and leaving position information among the stored passenger information.
The passenger determiner 111d may determine whether the specific stop point at which the passenger is estimated to leave, based on the dialogue between the occupants in the vehicle, is the same as the leaving position in the previous trip.
When the specific stop point at which the passenger is estimated to leave is the same as the leaving position in the previous trip, the passenger determiner 111d may generate the passenger number information from the change in the number of passengers estimated based on the dialogue between the occupants in the vehicle.
When the specific stop point at which the passenger is estimated to leave differs from the leaving position in the previous trip, the passenger determiner 111d may verify whether the leaving position is indeed the specific stop point by outputting a pre-utterance to the passenger, and may generate the passenger number information by using the passenger's response.
Fig. 26 is a control block diagram showing the configuration of the dialog manager in detail, fig. 27 is a view showing an example of information stored in the relational action DB, fig. 28 is a view showing an example of information stored in the action execution condition DB, and fig. 29 is a view showing an example of information stored in the action parameter DB.
Referring to fig. 26, the dialog manager 120 may include: a dialog flow manager 121 that requests generation, deletion, and update of a dialog or action; a dialog action manager 122 generating, deleting, and updating a dialog or action according to a request of the dialog flow manager 121; an ambiguity resolver 123 clarifying the user's intention by resolving ambiguities of context and dialogs; a parameter manager 124 that manages parameters required for action execution; an action priority determiner 125 that determines whether an action is executable with respect to a plurality of candidate actions; and an external information manager 126 managing an external content list and related information, and managing parameter information of the external content query.
Dialog manager 120 may include: a memory in which a program for performing the above-described operation and an operation described later is stored; and a processor for executing the stored program. At least one memory and one processor may be provided, and when a plurality of memories and processors are provided, they may be integrated on one chip or physically separated.
Each of the components contained in the dialog manager 120 may be implemented by the same processor or by separate processors.
In addition, the dialog manager 120 and the input processor 110 may be implemented by the same processor or separate processors.
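The component breakdown can be summarized in a structural skeleton; the class names below mirror the subcomponents listed above, while the bodies are placeholders rather than the patent's implementation:

```python
# A structural skeleton of the dialog manager 120 and its subcomponents as
# enumerated above; the class bodies are placeholders.

class DialogFlowManager:            # requests generation/deletion/update of tasks
    ...

class DialogActionManager:          # generates, deletes, and updates tasks on request
    ...

class AmbiguityResolver:            # clarifies intention from context and dialog
    ...

class ParameterManager:             # manages parameters required for action execution
    ...

class ActionPriorityDeterminer:     # decides executability/priority of candidates
    ...

class ExternalInformationManager:   # manages external content lists and queries
    ...

class DialogManager:
    def __init__(self):
        self.flow = DialogFlowManager()
        self.action = DialogActionManager()
        self.ambiguity = AmbiguityResolver()
        self.parameters = ParameterManager()
        self.priority = ActionPriorityDeterminer()
        self.external = ExternalInformationManager()
```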
When a user utterance is input, or when a user utterance matching a pre-utterance context is transmitted to the natural language understanding section 111b, the dialog input manager 111c may transmit the result of natural language understanding (the output of the natural language understanding section) and context information (the output of the context understanding section) to the dialog flow manager 121. In addition, the dialog input manager 111c can send a pre-utterance trigger signal when a pre-utterance context occurs.
The output of the natural language understanding section 111b may include information related to the content of the utterance of the user, such as a morpheme analysis result, and information such as a domain and an action. The output of the context understanding portion 112c can include the events determined by the context information collection manager 112b as well as the context information.
The dialog flow manager 121 can search whether a dialog task or an action task corresponding to an input by the dialog input manager 111c exists in the dialog and action state DB 147.
The dialog and action state DB147 may be a storage space for managing dialog states and action states, and thus the dialog and action state DB147 may store dialogs and actions currently in progress, and dialog states and action states related to preliminary actions to be processed. For example, the dialog and action state DB147 may store states related to completed dialogs and actions, stopped dialogs and actions, ongoing dialogs and actions, and pending dialogs and actions.
The dialog and action state DB 147 may also store, for each dialog, whether it is switched or nested, the switched action index, the action change time, and the last output state of screen/voice/command.
For example, when a domain and an action corresponding to a user utterance are extracted and the most recently stored dialog contains a dialog and an action corresponding to that domain and action, it may be determined as a dialog task or action task corresponding to the input from the dialog input manager 111c.
When the domain and the action corresponding to the user utterance are not extracted, the dialog action manager 122 may be requested to generate a random task or to refer to the most recently stored task.
When there is no dialog task or action task corresponding to the input of the input processor 110 in the dialog and action state DB147, the dialog flow manager 121 may request the dialog action manager 122 to generate a new dialog task or action task.
Further, when the pre-utterance trigger signal is transmitted from the input processor 110, although there is a dialog task or an action task currently performed, the dialog task or the action task may be temporarily stopped, and a dialog task or an action task corresponding to a pre-utterance context may be first generated. In addition, the priority may be selected according to established rules.
When the pre-utterance trigger signal and the action corresponding to the pre-utterance trigger signal are input from the dialog input manager 111c, the dialog flow manager 121 can request the dialog action manager 122 to generate a new dialog task or action task in the same manner as in the case of obtaining an action from a user utterance.
Further, when a pre-utterance trigger and a pre-utterance message corresponding to the pre-utterance trigger are input from the dialog input manager 111c, the dialog flow manager 121 may request the dialog action manager 122 to generate a new dialog task or action task for outputting the input pre-utterance message.
When the dialog flow manager 121 manages the dialog flow, the dialog flow manager 121 may refer to the dialog policy DB 148. The dialog policy DB 148 may store policies for continuing a dialog, wherein the policies may represent policies for selecting, starting, suggesting, stopping, and terminating a dialog.
In addition, the dialog policy DB 148 may store policies regarding the time point and method at which the system outputs a response. The conversation policy DB 148 may store a policy for generating a response by linking a plurality of services and a policy for deleting a previous action and replacing the action with another action.
For example, two policies may be allowed: a policy that generates responses to two actions at once, e.g., "Do you want to perform B action after performing A action?", and a policy that generates a separate response for one action after generating a response for the other, e.g., "A action will be performed." → "Do you want to perform B action?".
The dialog and action state DB147 may store policies for determining priorities among candidate actions. The priority determination policy will be described later.
The dialog action manager 122 may designate a memory space in the dialog and action state DB 147 and generate a dialog task and an action task corresponding to the output of the input processor 110.
The dialog action manager 122 may generate a random dialog state when domains and actions cannot be extracted from the user's utterance. In this case, as described later, the ambiguity resolver 123 may recognize the user's intention based on the content of the utterance of the user, environmental conditions, vehicle states, and user information, and determine an action suitable for the user's intention.
When there is a dialog task or an action task corresponding to the output of the input processor 110 in the dialog and action state DB147, the dialog flow manager 121 may request the dialog action manager 122 to refer to the corresponding dialog task or action task.
The action priority determiner 125 may search the relational action DB 146b for an action list related to the actions or events included in the output of the input processor 110, and may thereby acquire candidate actions. As shown in fig. 27, the relational action DB 146b may indicate actions related to each other, the relationships between the actions, actions related to events, and the relationships between the events. For example, route guidance, vehicle status checks, and gas station recommendations may be classified as relational actions, and the relationships therein may correspond to associations.
Therefore, when performing route guidance, the vehicle status check and the gas station recommendation can be performed together. In this case, "to be executed together" may include a case where the vehicle state check and the gas station recommendation are executed before or after the route guidance and a case where the vehicle state check and the gas station recommendation are executed during the route guidance (e.g., added as a stop point).
The warning light output event may be stored as an event action related to the repair shop guidance action, and the relationship between them may correspond to the association.
When a warning lamp output event occurs, a repair shop guide action may be performed according to the type of warning lamp or whether repair is required.
When the input processor 110 transmits an action corresponding to the utterance of the user together with the event determined by the context information collection manager 112b, the action related to the action corresponding to the utterance of the user and the action related to the event may become candidate actions.
The extracted candidate action list may be sent to the dialog action manager 122, and the dialog action manager 122 may update the action state of the dialog and action state DB147 by adding the candidate action list.
The action priority determiner 125 may search the action execution condition DB 146c for a condition for executing each candidate action.
As shown in fig. 28, the action execution condition DB 146c may store, according to each action, conditions required to execute the action, and parameters for determining whether the respective conditions are satisfied.
For example, the execution condition for the vehicle state check may be a case where the destination distance is equal to or greater than 100 km, wherein the parameter for determining the condition may correspond to the destination distance. The gas station recommendation condition may be a case where the destination distance is greater than the distance to empty (DTE), wherein the parameters for determining the condition may correspond to the destination distance and the distance to empty (DTE).
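The entries of the action execution condition DB 146c and their evaluation can be sketched as follows; representing conditions as predicates is an assumption about form, while the two example conditions follow the text:

```python
# Sketch of action execution condition entries (cf. Fig. 28) and their
# evaluation; the lambda-based representation and names are assumptions.

ACTION_EXECUTION_CONDITIONS = {
    # action: (condition over parameters, parameters the condition needs)
    "vehicle_state_check": (lambda p: p["destination_distance_km"] >= 100,
                            ["destination_distance_km"]),
    "gas_station_recommendation": (lambda p: p["destination_distance_km"] > p["dte_km"],
                                   ["destination_distance_km", "dte_km"]),
}

def executable_actions(params):
    """Return candidate actions whose execution conditions hold."""
    result = []
    for action, (condition, needed) in ACTION_EXECUTION_CONDITIONS.items():
        if all(k in params for k in needed) and condition(params):
            result.append(action)
    return result

print(executable_actions({"destination_distance_km": 120, "dte_km": 90}))
# -> ['vehicle_state_check', 'gas_station_recommendation']
```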
The action priority determiner 125 may send the execution conditions of the candidate actions to the dialog action manager 122, and the dialog action manager 122 may add the execution conditions and update the action state of the dialog and action state DB147 per each candidate action.
The action priority determiner 125 may search for parameters (hereinafter, referred to as condition determination parameters) required to determine action execution conditions from the context information DB142, the long term memory 143, the short term memory 144, or the dialog and action state DB147 and determine whether a candidate action can be executed using the searched parameters.
When the parameters for determining the action execution condition are not stored in the context information DB 142, the long term memory 143, the short term memory 144, or the dialog and action state DB147, the action priority determiner 125 may acquire the required parameters from the external content server 300 via the external information manager 126.
The action priority determiner 125 may use parameters for determining action execution conditions to determine whether a candidate action may be executed. In addition, the action priority determiner 125 may determine the priority of the candidate action based on whether the candidate action may be executed and the priority determination rule stored in the dialog policy DB 148.
The score for each candidate action may be calculated based on the current situation. Candidate actions with higher calculated scores may be given higher priority. For example, an action corresponding to a user utterance, a safety score, a convenience score, a processing time, a processing time point (whether immediate processing is required), a user preference (the user's degree of acceptance when a service is suggested, or a preference predetermined by the user), an administrator score, a score related to the vehicle state, and an action success rate (dialogue success rate) may be used as parameters for calculating the score, as shown in Equation 1 below. w1, w2, w3, w4, w5, w6, w7, w8, and w9 represent the weight values of each parameter.
[Equation 1]
Priority score = w1 × user utterance action + w2 × safety score + w3 × convenience score + w4 × processing time + w5 × processing time point + w6 × user preference + w7 × administrator score + w8 × score related to vehicle state + w9 × action success rate × possibility of action execution (1: possible or not yet known, 0: impossible) × action completion status (complete: 1, not completed: 0)
As described above, the action priority determiner 125 may provide the user with the most needed services by searching for the actions directly connected to the user's utterance and context information, as well as the action lists related thereto, and by determining the priorities among them.
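Equation 1 can be transcribed into code as follows. The weight values are illustrative assumptions, and the two 0/1 factors are applied here to the whole weighted sum, which is one plausible reading of the equation:

```python
# A sketch transcribing Equation 1; weights are assumed equal for
# illustration, and the two 0/1 factors gate the whole weighted sum.

WEIGHTS = {  # w1..w9
    "user_utterance_action": 1.0, "safety_score": 1.0, "convenience_score": 1.0,
    "processing_time": 1.0, "processing_time_point": 1.0, "user_preference": 1.0,
    "administrator_score": 1.0, "vehicle_state_score": 1.0, "action_success_rate": 1.0,
}

def priority_score(scores, executable, completed):
    """scores: per-parameter values in [0, 1]; executable/completed: 0/1 gates."""
    weighted = sum(WEIGHTS[name] * scores.get(name, 0.0) for name in WEIGHTS)
    execution_factor = 1.0 if executable else 0.0   # 1: possible or not yet known
    completion_factor = 1.0 if completed else 0.0   # per Equation 1: complete -> 1
    return weighted * execution_factor * completion_factor

print(priority_score({"safety_score": 0.9, "action_success_rate": 0.7},
                     executable=True, completed=True))  # -> 1.6
```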
The action priority determiner 125 may transmit the likelihood and priority of candidate action execution to the dialog action manager 122, and the dialog action manager 122 may update the action state of the dialog and action state DB147 by adding the transmitted information.
The parameter manager 124 may search the action parameter DB 146a for a parameter (hereinafter, referred to as an action parameter) for performing each candidate action.
As shown in fig. 29, the action parameter DB 146a may store necessary parameters, selective parameters, initial values of the parameters, and reference positions for acquiring the parameters, per action. In a state where initial values of parameters are stored, when parameter values corresponding to the corresponding parameters do not exist in the utterance and context information of the user output from the input processor 110 and parameter values do not exist in the context information DB142, an action may be performed according to the stored initial values, or whether to perform the action may be confirmed to the user according to the stored initial values.
For example, the necessary parameters for route guidance may include the current location and destination, and the selective parameters may include the type of route. The initial value of the selectivity parameter may be stored as a fast route. The current location and destination can be acquired by searching the dialog and action state DB147, the context information DB 142, the short-term memory 144, or the long-term memory 143 in order.
The necessary parameters for the vehicle state check may include vehicle state information, and the selective parameters may include the part to be checked (hereinafter referred to as the "check part"). The entire vehicle may be stored as the initial value of the selective parameter. The vehicle state information may be acquired from the context information DB 142.
The selective parameters for the gas station recommendation may include a favorite gas station, and "A gas station" may be stored as the initial value of the selective parameter. The favorite gas station may be retrieved from the long-term memory 143. The selective parameters may also include the fuel type and fuel price of the vehicle.
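The action parameter DB 146a entries of Fig. 29, together with a merge of found values and stored initial values, can be sketched as follows; the dictionary layout and helper are assumptions:

```python
# Sketch of action parameter entries: necessary and selective parameters,
# initial values, and reference locations searched for values. The layout
# and function name are assumptions.

ACTION_PARAMETERS = {
    "route_guidance": {
        "necessary": ["current_location", "destination"],
        "selective": {"route_type": "fast"},   # initial value: fast route
        "reference_locations": ["dialog_and_action_state_db", "context_info_db",
                                "short_term_memory", "long_term_memory"],
    },
    "vehicle_state_check": {
        "necessary": ["vehicle_state_info"],
        "selective": {"check_part": "whole"},
        "reference_locations": ["context_info_db"],
    },
    "gas_station_recommendation": {
        "necessary": [],
        "selective": {"favorite_gas_station": "A gas station"},
        "reference_locations": ["long_term_memory"],
    },
}

def fill_parameters(action, found):
    """Merge values found in reference locations with stored initial values."""
    spec = ACTION_PARAMETERS[action]
    params = dict(spec["selective"])   # start from initial values
    params.update(found)               # found values take precedence
    missing = [p for p in spec["necessary"] if p not in params]
    if missing:
        raise ValueError(f"missing necessary parameters: {missing}")
    return params

print(fill_parameters("route_guidance",
                      {"current_location": "Home", "destination": "Seoul Station"}))
```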
As described above, the parameter manager 124 may acquire the parameter values of the parameters searched in the action parameter DB 146a from the corresponding reference positions. The reference location for acquiring the parameter values may be at least one of the context information DB142, the short term memory 144 or the long term memory 143, the dialog and action state DB147, and the external content server 300.
The parameter manager 124 may obtain the parameter values from the external content server 300 via the external information manager 126. The external information manager 126 can determine where to acquire the information by referring to the external service set DB 146d.
The external service set DB 146d may store information on an external content server connected to the dialog system 100. For example, the external service set DB 146d may store an external service name, a description about the external service, a type of information provided from the external service, an external service using method, and a subject providing the external service.
The parameter values obtained by the parameter manager 124 may be sent to the dialog action manager 122, and the dialog action manager 122 may update the dialog and action state DB 147 by adding the acquired parameter values to the action state of each candidate action.
The parameter manager 124 may obtain parameter values for all candidate actions, or the parameter manager 124 may obtain only parameter values for candidate actions determined to be executable by the action priority determiner 125.
The parameter manager 124 may selectively use parameter values among different types of parameter values indicating the same information. For example, by using the destination search service of a navigation system, "Seoul Station" indicating a destination in text form may be converted into "Seoul Station" in the form of a POI (point of interest).
When there is no ambiguity in the dialog and context, it is possible to acquire required information and manage the dialog and action according to the above-described operations of the action priority determiner 125, the parameter manager 124, and the external information manager 126. When there is ambiguity in the dialog and context, it may be difficult to provide a service desired by the user using only the operations of the action prioritizer 125, the parameter manager 124, and the external information manager 126.
In this case, the ambiguity resolver 123 can handle ambiguities in the dialog or in the context. For example, when an anaphoric expression is included in the dialog (e.g., "that person", "yesterday's place", "father", "mother", "grandmother", or "daughter"), there may be ambiguity because it is unclear whom or what the expression refers to. In this case, the ambiguity resolver 123 may resolve the ambiguity by referring to the context information DB 142, the long-term memory 143, or the short-term memory 144, or may provide guidance for resolving the ambiguity.
For example, ambiguous words contained in "yesterday's place", "the market near home", and "the place I first stopped at yesterday" may correspond to parameter values of action parameters or of condition determination parameters. However, in this case, due to the ambiguity of the words, an actual action cannot be performed and an action execution condition cannot be determined by using the corresponding words.
The ambiguity resolver 123 can resolve the ambiguity of the parameter values by referring to the information stored in the context information DB 142, the long-term memory 143, or the short-term memory 144. As needed, the ambiguity resolver 123 may acquire the required information from the external content server 300 by using the external information manager 126.
For example, the ambiguity resolver 123 can search for where the user went yesterday by referring to the short-term memory 144, so as to convert "yesterday's place" into information that can be used as the destination of a route guidance action. The ambiguity resolver 123 can also search for the user's home address by referring to the long-term memory 143 and acquire location information related to the A market near the user's home address from the external content server 300, thereby converting "the market near home" into information that can be used as the destination of a route guidance action.
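This resolution step can be sketched as a lookup against the memories; the memory dictionaries, the example home address, and the helper are illustrative assumptions:

```python
# Illustrative resolution of an anaphoric destination in the spirit of the
# ambiguity resolver 123; the memory contents below are stand-in values.

SHORT_TERM_MEMORY = {"yesterday_destinations": ["Seoul Station"]}
LONG_TERM_MEMORY = {"home_address": "12 Example-ro, Seoul"}  # hypothetical address

def resolve_destination(phrase):
    """Map an ambiguous destination phrase to a usable route-guidance value."""
    if phrase == "yesterday's place":
        visits = SHORT_TERM_MEMORY.get("yesterday_destinations", [])
        return visits[-1] if visits else None
    if phrase == "the market near home":
        home = LONG_TERM_MEMORY.get("home_address")
        # A real system would query the external content server 300 with `home`;
        # the returned POI below is a stand-in value.
        return f"A market near {home}" if home else None
    return None  # unresolved; guidance for disambiguation may follow

print(resolve_destination("yesterday's place"))  # -> 'Seoul Station'
```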
When the input processor 110 does not clearly extract the action (object and operator) or when the user's intention is unclear, the ambiguity resolver 123 can recognize the user's intention by referring to the ambiguity resolution information DB 146e and determine an action corresponding to the recognized intention.
Fig. 30 is a table showing an example of the information stored in the ambiguity resolution information DB.
The ambiguity resolution information DB 146e may match an utterance with an action corresponding to the utterance, based on vehicle state information and surrounding environment information, and then store the utterance and the action together. The utterances stored in the ambiguity resolution information DB 146e may be utterances from which an action cannot be extracted by natural language understanding. Fig. 30 shows the case where the morpheme analysis result indicates that the user's hands are freezing or cold.
The surrounding environment information may include an outdoor temperature of the vehicle and whether it is raining, and the vehicle state information may include on/off of an air conditioner and a heater, and an air volume and a wind direction of the air conditioner and on/off of a steering wheel heater.
Specifically, in a state where the outdoor temperature exceeds 20 degrees and it is raining, when the air conditioner is turned on, it can be recognized that the user's hands are cold because the air conditioner temperature is set low, and thus "increase the air conditioner temperature by 3 degrees" can be stored as a vehicle control action corresponding thereto.
In a state where the outdoor temperature exceeds 20 degrees and it is raining, when the air conditioner is turned off, it can be recognized that the user feels cold due to the rain, and thus "turn on the heater" can be stored as a vehicle control action corresponding thereto.
In a state where the outdoor temperature exceeds 20 degrees and it is not raining, when the air conditioner is turned on and its wind direction is toward the upper side, it can be recognized that the user's hands are cold because the wind of the air conditioner blows directly onto them, and thus "change the wind direction of the air conditioner to the lower side" can be stored as a vehicle control action corresponding thereto.
In a state where the outdoor temperature exceeds 20 degrees and it is not raining, when the air conditioner is turned on, its wind direction is toward the lower side, and the air volume is set above the intermediate level, it can be recognized that the user feels cold due to the excessive air volume of the air conditioner, and thus "decrease the air volume of the air conditioner" can be stored as a vehicle control action corresponding thereto.
In a state where the outdoor temperature exceeds 20 degrees and it is not raining, when the air conditioner is turned on, its wind direction is toward the lower side, and the air volume is set to weak, "increase the air conditioner temperature by 3 degrees" may be stored as a vehicle control action corresponding thereto.
In a state where the outdoor temperature is lower than 20 degrees, when the heater is turned off, it may be recognized that the user's hands are cold due to the cold weather, and thus "turn on the heater" may be stored as a vehicle control action corresponding thereto.
In a state where the outdoor temperature is lower than 20 degrees, when the heater is turned on but the steering wheel heater is turned off, it may be recognized that the user's hands are cold because the hot air does not reach them, and thus "turn on the steering wheel heater" may be stored as a vehicle control action corresponding thereto.
In a state where the outdoor temperature is lower than 20 degrees, when the heater and the steering wheel heater are turned on and the wind direction of the heater is toward the lower side, it may be recognized that the user's hands are cold because the wind of the heater does not reach them, and thus "change the wind direction of the heater to bi-directional" may be stored as a vehicle control action corresponding thereto.
In a state where the outdoor temperature is lower than 20 degrees, the heater and the steering wheel heater are turned on, and the wind direction of the heater is toward the upper side, when the heater temperature is set lower than the maximum, "increase the temperature of the heater" may be stored as a vehicle control action corresponding thereto.
In a state where the outdoor temperature is lower than 20 degrees, the heater and the steering wheel heater are turned on, the wind direction of the heater is toward the upper side, and the heater temperature is set to the highest, when the air volume of the heater is not set to the highest, "increase the air volume of the heater" may be stored as a vehicle control action corresponding thereto.
In a state where the outdoor temperature is lower than 20 degrees, the heater and the steering wheel heater are turned on, the wind direction of the heater is toward the upper side, and the heater temperature and air volume are set to the highest, when the seat heater is turned off, "turn on the seat heater" may be stored as a vehicle control action corresponding thereto.
In the same state, when the seat heater is already turned on, "notify the user to wait for a moment because the heater is now fully operating" may be stored as a vehicle control action corresponding thereto.
Fig. 31A and 31B are tables showing various examples of vehicle control performed as a result of the ambiguity resolver 123 resolving the ambiguity and extracting an action by referring to the ambiguity resolution information DB 146e.
For example, as shown in fig. 31A and 31B, when the content of the utterance according to the morpheme analysis result is that the user's hands are freezing or cold, the surrounding environment is summer, the wind direction of the air conditioner is toward the passenger's head (upper side), the air conditioner set temperature is 19 degrees, and the air volume of the air conditioner is set to high, it can be recognized that the hands are cold because the wind of the air conditioner blows onto them. An air conditioner control action that reduces the air volume while changing the wind direction toward the feet (lower side) may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.
For an utterance with the same content, when the surrounding environment is winter, the wind direction of the air conditioner is toward the passengers' feet, the air conditioner set temperature is 25 degrees, and the air volume is set to high, it can be recognized that the hands are cold because the hot air does not reach them. The "turn on the steering wheel heater" action may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.
In a state where the content of the utterance according to the morpheme analysis result is "I feel frustrated", when the vehicle speed is 30 km/h or less and the gap to the vehicle ahead is less than 30 cm, it can be recognized that the frustration is caused by heavy traffic. Accordingly, "change a route option in the route guidance action (fast route guidance)", "play multimedia content such as music", or "turn on the chat function" may be extracted as actions corresponding to the utterance, and the vehicle may be controlled according to the extracted action.
In a state where the content of the utterance according to the morpheme analysis result is "drowsy", when the vehicle state is the interior air mode, it can be recognized that the drowsiness is caused by a lack of air circulation. Therefore, "change to the outside air mode" can be extracted as the action corresponding to the utterance, and the vehicle can be controlled according to the extracted action.
For an utterance with the same content, when the vehicle state is the outside air mode and the heater is turned on, it can be recognized that the drowsiness is caused by the hot air emitted from the heater. "Open the window" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.
In a state where the content of the utterance according to the morpheme analysis result is "sweating" or "hot", when the surrounding environment is winter and the heater is turned on, it can be recognized that the heat is caused by the hot air emitted from the heater. Thus, "lower the heater temperature" or "decrease the air volume" may be stored as the action corresponding to the utterance.
For an utterance with the same content, when the surrounding environment is winter and the heater is turned off, it can be recognized that the heat is caused by the user's body heat. Thus, "open the window" or "suggest opening the window" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.
For an utterance with the same content, when the surrounding environment is summer and the air conditioner is turned off, it can be recognized that the heat is caused by the increased internal temperature of the vehicle. Therefore, "turn on the air conditioner" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.
For an utterance with the same content, when the surrounding environment is summer and the air conditioner is turned on, it can be recognized that the heat is caused by the air conditioner temperature being set high. Therefore, "lower the air conditioner temperature" or "increase the air volume of the air conditioner" can be extracted as the action corresponding to the utterance, and the vehicle can be controlled according to the extracted action.
When the content of the utterance according to the morpheme analysis result is "cold", and the surrounding environment is summer and the air conditioner is turned ON, it can be recognized that the cold is caused by the air conditioner temperature being set too low or by the wind of the air conditioner being too strong. Therefore, "increase the air conditioner temperature" or "reduce the air volume" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.
For an utterance with the same content, when the surrounding environment is summer and the air conditioner is turned OFF, it can be recognized that the cold is caused by the user's physical condition. "Operate the heater" or "check the user's biorhythm" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.
For an utterance with the same content, when the surrounding environment is winter and the heater is turned ON, it can be recognized that the cold is caused by the heater temperature being set low or the air volume being weak. Therefore, "increase the heater temperature" or "increase the air volume" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.
For an utterance with the same content, when the surrounding environment is winter and the heater is turned OFF, it can be recognized that the cold is caused by the heater not being in operation. "Operate the heater" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.
When the content of the utterance according to the morpheme analysis result is "headache", and the surrounding environment is winter and the heater is turned ON, it can be recognized that the headache is caused by a lack of air circulation. Accordingly, "change to the outside air mode" or "open the window" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.
For an utterance with the same content, when the surrounding environment is winter and the heater is turned OFF, it can be recognized that the headache is caused by the cold. "Operate the heater" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.
For an utterance with the same content, when the surrounding environment is summer and the air conditioner is turned OFF, it can be recognized that the headache is caused by the heat. "Operate the air conditioner" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.
For an utterance with the same content, when the surrounding environment is summer and the air conditioner is turned ON, it can be recognized that the headache is caused by the air conditioning. "Change the wind direction or the air volume of the air conditioner" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.
When the content of the utterance according to the morpheme analysis result is "uncomfortable", and the surrounding environment is winter and it is raining, it can be recognized that the discomfort is caused by high humidity. Therefore, "operate the defogging function" or "operate the dehumidification function" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.
For an utterance with the same content, when the surrounding environment is summer and it is not raining, it can be recognized that the discomfort is caused by seasonal characteristics and heat. Therefore, "operate the air conditioner at the lowest temperature" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.
For an utterance with the same content, when the surrounding environment is summer and it is raining, it can be recognized that the discomfort is caused by heat and high humidity. Accordingly, "operate the air conditioner in the dehumidification mode" may be extracted as the action corresponding to the utterance, and the vehicle may be controlled according to the extracted action.
As described above, even when there is ambiguity in the user's utterance or situation, the ambiguity resolver 123 can accurately recognize the action actually desired or required by the user by considering the user's utterance together with the surrounding environment information and the vehicle state information as a whole, and can provide that action.
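To make the lookups above concrete, the following is a minimal Python sketch of such a rule-based resolution step. The rule keys, state labels, and the resolve helper are hypothetical illustrations, not the disclosed implementation.

```python
# Minimal sketch of the ambiguity resolver's rule lookup described above.
# The rule keys and action strings are hypothetical examples, not the
# patent's actual data model.

RULES = {
    # (utterance content, season or None, vehicle-state summary) -> action
    ("hand cold", "summer", "ac_wind_on_hands"): "redirect wind to feet, lower air volume",
    ("hand cold", "winter", "heater_wind_on_feet"): "turn on steering wheel heater",
    ("drowsy", None, "inside_air_mode"): "change to outside air mode",
    ("drowsy", None, "outside_air_heater_on"): "open window",
    ("hot", "winter", "heater_on"): "lower heater temperature or reduce air volume",
    ("hot", "summer", "ac_off"): "turn on air conditioner",
}

def resolve(content: str, season: str | None, vehicle_state: str) -> str | None:
    """Return the action matching the utterance content plus context, if any."""
    for (c, s, v), action in RULES.items():
        if c == content and s in (season, None) and v == vehicle_state:
            return action
    return None  # ambiguity not resolved; fall back to asking the user

print(resolve("drowsy", "winter", "inside_air_mode"))
# -> change to outside air mode
```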
Information about the action determined by the ambiguity resolver 123 may be transmitted to the dialog action manager 122, and the dialog action manager 122 may update the dialog and action state DB 147 based on the transmitted information.
As described above, the action priority determiner 125 and the parameter manager 124 may determine action execution conditions with respect to the action determined by the ambiguity resolver 123, determine priorities thereof, and acquire parameter values.
When all parameter values obtainable from the current context and dialog for performing each action have been acquired, the dialog action manager 122 may send a signal to the dialog flow manager 121.
When necessary parameter values for action execution and condition determination do not exist in the dialogue and action state DB147, the external content server 300, the long term memory 143, the short term memory 144, and the context information DB 142, and can only be acquired by the user, the result processor 130 may generate a dialogue response for inquiring the user about the parameter values.
The dialog flow manager 121 may send information about the first-priority action and the dialog state to the result processor 130. According to the dialog policy, the dialog flow manager 121 may also send information regarding a plurality of candidate actions.
When the dialog system 100 outputs a pre-utterance, i.e., when a pre-utterance trigger signal has been generated by the input processor 110, the dialog state sent to the result processor 130 may include the pre-utterance trigger signal. However, the pre-utterance trigger signal need not be included in the dialog state; any type of information may be included as long as it indicates the pre-utterance context. When information indicating the pre-utterance context is included in the dialog state, the result processor 130 may output the dialog response in preference to other types of responses, or output the dialog response together with other types of responses.
In a state where the dialog system 100 outputs a pre-utterance, when a pre-utterance message corresponding to a pre-utterance context is input from the dialog input manager 111c, the pre-utterance message can be transmitted to the result processor 130 without the above-described procedures of ambiguity resolution, parameter management, and action priority determination.
In a state where the dialog system 100 outputs a pre-utterance, when an action corresponding to a pre-utterance context is input from the dialog input manager 111c, a pre-utterance message may be transmitted to the result processor 130 with or without going through the above-described process of ambiguity resolution, parameter management, and action priority determination.
Fig. 32 is a control block diagram showing the configuration of the result processor in detail.
Referring to fig. 32, the result processor 130 may include: a response generation manager 131, which manages generation of the responses necessary to perform the action input from the dialog manager 120; a dialog response generator 132, which generates a text, image, or audio type response according to the request of the response generation manager 131; a command generator 136, which generates a command for vehicle control or for use of an external content providing service according to the request of the response generation manager 131; a service editor 134, which executes a plurality of services sequentially or sporadically and collects their results to provide the service desired by the user; an output manager 133, which outputs the generated text, image, or audio type response, outputs the command generated by the command generator 136, and determines the output order when there are multiple outputs; and a memory manager 135, which manages the long-term memory 143 and the short-term memory 144 based on the outputs of the response generation manager 131 and the output manager 133.
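As an illustration of this composition, the sketch below wires the named roles together as plain callables. All class, field, and function names are hypothetical; only the division of roles follows the description above.

```python
from dataclasses import dataclass

# Hypothetical sketch of the result processor's composition; each field
# mirrors one subcomponent described above, with callables as placeholders.

@dataclass
class ResultProcessor:
    dialog_response_generator: callable  # text/image/audio responses (132)
    command_generator: callable          # vehicle-control / external-content commands (136)
    service_editor: callable             # orders and aggregates multiple services (134)
    output_manager: callable             # decides output timing, order, position (133)
    memory_manager: callable             # updates long/short-term memory (135)

    def handle(self, action: str) -> None:
        # Response generation manager (131) role: fan the action out to the
        # generators, then hand the results to the output and memory managers.
        response = self.dialog_response_generator(action)
        command = self.command_generator(action)
        self.output_manager(response, command)
        self.memory_manager(response, command)

rp = ResultProcessor(
    dialog_response_generator=lambda a: f"dialog response for {a}",
    command_generator=lambda a: f"command for {a}",
    service_editor=lambda commands: commands,
    output_manager=lambda r, c: print(r, "|", c),
    memory_manager=lambda r, c: None,
)
rp.handle("route guidance")
```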
The result processor 130 may include: a memory in which a program for performing the above-described operation and an operation described later is stored; and a processor for executing the stored program. At least one memory and one processor may be provided, and when multiple memories and processors are provided, they may be integrated on a single chip or physically separated.
Each of the components included in the result processor 130 may be implemented by the same processor or separate processors.
In addition, the result processor 130, the dialog manager 120, and the input processor 110 may be implemented by the same processor or separate processors.
Responses output in correspondence to the user's utterance or context may include a dialog response, vehicle control, and external content provision. The dialog response may include an initial dialog, a query, and an answer containing information. Dialog responses may be stored as a database in the response template 149.
The response generation manager 131 may request the dialog response generator 132 and the command generator 136 to generate the responses required to perform the actions determined by the dialog manager 120. To this end, the response generation manager 131 may transmit information about the action to be performed, which may include an action name and a parameter value, to the dialog response generator 132 and the command generator 136. When generating a response, dialog response generator 132 and command generator 136 may reference the current dialog state and action state.
The dialog response generator 132 may extract a dialog response template by searching the response template 149 and generate a dialog response by filling the extracted template with parameter values. The generated dialog response may be transmitted to the response generation manager 131. When the parameter values required to generate the dialog response are not transmitted from the dialog manager 120, or when an instruction to use external content is transmitted, the dialog response generator 132 may receive the parameter values from the external content server 300 or search the long-term memory 143, the short-term memory 144, or the context information DB 142.
For example, when the action determined by the dialog manager 120 corresponds to route guidance, the dialog response generator 132 may search the response template 149 and extract a dialog response template such as "It is expected to take [duration:-] from [current location:-] to [destination:-]. Do you want to start guidance?".
Among the parameters to be filled in the dialog response template, [current location] and [destination] may be sent from the dialog manager 120, while the parameter value of [duration] may not be sent. In this case, the dialog response generator 132 may request the duration from [current location] to [destination] from the external content server 300.
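A minimal sketch of this template-filling step, assuming a bracketed placeholder syntax: parameters already supplied are substituted directly, and a missing [duration] is fetched from a stand-in for the external content server 300. The template text and fetch function are illustrative.

```python
import re

TEMPLATE = ("It is expected to take [duration] from [current location] "
            "to [destination]. Do you want to start guidance?")

def fetch_duration(origin: str, destination: str) -> str:
    # Stand-in for a request to the external content server 300.
    return "30 minutes"

def fill_template(template: str, params: dict) -> str:
    def substitute(match):
        key = match.group(1)
        if key not in params:                          # missing parameter:
            params[key] = fetch_duration(              # fetch it externally
                params["current location"], params["destination"])
        return params[key]
    return re.sub(r"\[([^\]]+)\]", substitute, template)

print(fill_template(TEMPLATE, {
    "current location": "Uiwang station",
    "destination": "Seoul station exit 4",
}))
```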
When the response to the user utterance or context includes vehicle control or external content provision, the command generator 136 may generate a command for performing the vehicle control or external content provision. For example, when the action determined by the dialog manager 120 is control of the air conditioner, a window, a seat, or the AVN, the command generator 136 may generate a command to perform the control and transmit the command to the response generation manager 131.
When the action determined by the dialog manager 120 requires external content provision, the command generator 136 may generate a command for receiving corresponding content from the external content server 300 and then transmit the command to the response generation manager 131.
When a plurality of commands are provided by the command generator 136, the service editor 134 may determine a method and an order of executing the plurality of commands and transmit the method and the order to the response generation manager 131.
The response generation manager 131 may transmit the response transmitted from the dialog response generator 132, the command generator 136, or the service editor 134 to the output manager 133.
The output manager 133 can determine the output timing, output order, and output position of the dialog response generated by the dialog response generator 132 and the command generated by the command generator 136.
The output manager 133 can output the responses by transmitting the dialog responses generated by the dialog response generator 132 and the commands generated by the command generator 136 to appropriate output positions in an appropriate order at appropriate timing. The output manager 133 may output a text-to-speech (TTS) response via the speaker 232 and a text response via the display 231. When outputting a dialog response of TTS type, the output manager 133 may use a TTS module provided in the vehicle 200, or alternatively, the output manager 133 may include a TTS module.
According to the control target, a command may be transmitted to the vehicle controller 240 or the communication device 280 for communicating with the external content server 300.
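The routing just described can be sketched as a simple dispatch on response type; the type tags and the payload layout below are invented for illustration.

```python
def route_output(responses):
    # Hypothetical dispatch mirroring the description above: TTS responses go
    # to the speaker, text to the display, and commands to the vehicle
    # controller or the communication device depending on the control target.
    for kind, payload in responses:
        if kind == "tts":
            print(f"speaker 232 <- {payload}")
        elif kind == "text":
            print(f"display 231 <- {payload}")
        elif kind == "command":
            target = ("vehicle controller 240" if payload.get("local")
                      else "communication device 280")
            print(f"{target} <- {payload['command']}")

route_output([
    ("tts", "Start route guidance."),
    ("text", "Start route guidance."),
    ("command", {"local": True, "command": "route guidance"}),
])
```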
The response generation manager 131 may also transmit a response transmitted from the dialog response generator 132, the command generator 136, or the service editor 134 to the memory manager 135.
The output manager 133 may transmit a response output by itself to the memory manager 135.
The memory manager 135 may manage the long term memory 143 or the short term memory 144 based on the contents transmitted from the response generation manager 131 and the output manager 133. For example, the memory manager 135 may update the short-term memory 144 by storing dialog content between the user and the system based on the generated and output dialog responses. The memory manager 135 may update the long term memory 143 by storing information related to the user acquired through a dialog with the user.
Among the information stored in the short-term memory 144, persistent information (e.g., a user's preferences or tendencies) or information needed to obtain persistent information may be stored in the long-term memory 143.
Based on the vehicle control and external content requests corresponding to the generated and output commands, the user preferences or vehicle control history stored in long-term memory 143 may be updated.
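A minimal sketch of this update rule, assuming a fixed set of keys marks information as persistent; the key names and data layout are hypothetical.

```python
# Hypothetical sketch of the memory manager's update rule: dialog turns go
# to short-term memory; information judged persistent is promoted to
# long-term memory.

short_term: list[dict] = []   # recent dialog contents (cf. short-term memory 144)
long_term: dict = {}          # preferences, control history (cf. long-term memory 143)

PERSISTENT_KEYS = {"favorite_gas_station", "preferred_temperature"}

def store_turn(turn: dict) -> None:
    short_term.append(turn)
    for key, value in turn.get("facts", {}).items():
        if key in PERSISTENT_KEYS:       # e.g. a preference learned in dialog
            long_term[key] = value

store_turn({"user": "Add A gas station as a stopover.",
            "facts": {"favorite_gas_station": "A gas station"}})
print(long_term)   # {'favorite_gas_station': 'A gas station'}
```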
Meanwhile, when the dialog system 100 outputs a pre-utterance before the user inputs an utterance, and an action corresponding to the pre-utterance context is input from the dialog input manager 111c, the dialog response generator 132, having received information related to the action, may extract a dialog response template by searching the response template 149 and generate a dialog response by filling the extracted template with parameter values. The generated dialog response may be transmitted to the response generation manager 131. The dialog response may become the pre-utterance of the dialog system 100.
The response generation manager 131 may transmit the dialog response transmitted from the dialog response generator 132 to the output manager 133.
The output manager 133 may output the dialog response generated by the dialog response generator 132 via the speaker 232.
When the result processor 130 receives the pre-utterance message itself corresponding to the pre-utterance context from the dialog flow manager 121, the input pre-utterance message may become the dialog response and may be transmitted to the output manager 133.
The output manager 133 can output the transmitted pre-speech message via the speaker 232.
When the user utterance is input after the dialog system 100 outputs the pre-utterance, the same operation as the above-described operation for processing the user utterance may be performed.
According to the above-described embodiment, the dialogue system 100 can provide a service that is most suitable for a user by considering various situations occurring inside the vehicle. The dialogue system 100 can also determine the services required by the user from the beginning based on the context information or the driver information collected by itself and provide the services actively, without inputting the utterance of the user.
For example, the evaluation criterion for the vehicle state may be changed according to the situation at the start of travel, so that feedback can be provided proactively. The travel start time may be defined as the vehicle start time, the time point at which the electronic parking brake (EPB) is released, or the time point at which a navigation destination is set. The vehicle condition evaluation system that calculates a driving availability score may assign a weight to each device and change the variable weight applied to each device according to situation factors. When it is determined that there is a problem with the vehicle state, a solution for each device, such as guidance to a repair shop, may be provided.
By considering the destination when the vehicle is started, it can be determined whether the fuel is insufficient. When the fuel is insufficient, as feedback, the user's favorite gas station may be added as an automatic stopover on the route to the destination, and the user may be notified of the change of stopover. In addition, the gas station added as the automatic stopover may be changed according to the user's response.
Even when the current vehicle state does not indicate a fuel shortage, a gas station or a refueling time can be proactively suggested by integrating the user's next schedule, main movement history, and remaining fuel amount.
By acquiring information related to the driver's physical condition and sleep record, the vehicle may be conditionally allowed to start based on the acquired information. For example, when a risk of drowsy driving is recognized from the physical condition and the sleep record acquired from outside the vehicle, the user may be advised not to drive. Alternatively, information related to a recommended driving time may be provided according to the physical condition or the sleep record.
When a trigger indicating a risk of drowsy driving occurs repeatedly, the risk of drowsy driving may be detected, and a warning may be output or feedback such as automatically changing the route (i.e., rerouting to a rest area) may be provided according to the degree of risk. The trigger indicating the drowsy driving risk may be obtained by passively measuring the driver's state and the vehicle state (for example, when the heart rate decreases, when the front-rear gap is at or above a reference distance, or when the vehicle speed is at or below a reference speed), or by active measurement via dialogue (for example, by speaking a question to the driver and measuring the speed of the driver's response).
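The passive trigger reduces to a conjunction of threshold checks, as in the sketch below; all reference values are invented for illustration and are not taken from this disclosure.

```python
# Illustrative check for the passively measured drowsy-driving trigger.
# The reference values are hypothetical, not taken from the patent.

HEART_RATE_MIN = 55     # bpm: "heart rate decreases"
GAP_REFERENCE = 50.0    # m: "front-rear gap is a reference distance or more"
SPEED_REFERENCE = 40.0  # km/h: "vehicle speed is a reference speed or less"

def drowsy_trigger(heart_rate: float, gap_m: float, speed_kmh: float) -> bool:
    return (heart_rate < HEART_RATE_MIN
            and gap_m >= GAP_REFERENCE
            and speed_kmh <= SPEED_REFERENCE)

if drowsy_trigger(heart_rate=52, gap_m=80.0, speed_kmh=35.0):
    print("warning: risk of drowsy driving; suggest rerouting to a rest area")
```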
When the user inputs an utterance indicating an emotion, the dialog system 100 may not extract a certain domain or action from the utterance of the user. However, the dialogue system 100 may recognize the user's intention by using the surrounding environment information, the vehicle state information, and the user state information, and then continue the dialogue. As described above, this embodiment may be performed by the ambiguity resolver 123 resolving the ambiguity of the user utterance.
Specifically, when a passenger boards the vehicle, the dialog system 100 can detect the boarding through voice input or input other than voice, and ask a question for identifying the passenger (e.g., "Who are you? Tell me your name.").
In addition, when a change in the number of passengers in the vehicle is estimated, the dialog system 100 may output a pre-utterance related to the estimation result. Specifically, the dialog system 100 may receive the dialog between occupants in the vehicle through voice input, and estimate the possibility of each passenger leaving the vehicle and the possibility of each passenger boarding again after leaving, in order to output a pre-utterance related to the estimated change in the number of passengers.
Hereinafter, an example of a dialog process using the dialog system 100 will be described in detail.
Fig. 33 to 45 are views showing a specific example in which the dialogue system 100 processes an input, manages a dialogue, and outputs a result when a user inputs an utterance related to route guidance.
As shown in fig. 33, when the user inputs the utterance "Let's go to the Seoul station we went to yesterday", the speech recognizer 111a can output the user's speech in the form of a text utterance (Let's go to the Seoul station we went to yesterday).
The natural language understanding section 111b can perform morpheme analysis by referring to the domain/action inference rule DB 141 and output [domain: navigation], [action: route guidance], [voice behavior: request], and [parameter: NLU: destination: Seoul station], and then input them to the dialog input manager 111c.
Referring to fig. 34, the dialog input manager 111c may transmit the natural language understanding result of the natural language understanding section 111b to the context understanding part 112c, and may request the context understanding part 112c to send additional information if any exists.
The context understanding part 112c can search the context understanding table 145 and extract the fact that the context information related to [domain: navigation] and [action: route guidance] is the current location, and that the type of the context information is a GPS value.
The context understanding part 112c can acquire the GPS value of the current location by searching the context information DB 142. When the GPS value of the current location is not stored in the context information DB142, the context understanding part 112c may request the GPS value of the current location from the context information collection manager 112 b.
The context information collection manager 112b can send a signal to the context information collector 112a to cause the context information collector 112a to collect the GPS value of the current location. The context information collector 112a may collect the GPS value of the current location from the vehicle controller 240 and then store the GPS value of the current location in the context information DB142 while transmitting a GPS value collection confirmation signal to the context information collection manager 112 b. When the context information collection manager 112b transmits a GPS value collection confirmation signal to the context understanding part 112c, the context understanding part 112c may acquire the GPS value of the current location from the context information DB142 and then transmit the GPS value of the current location to the dialog input manager 111 c.
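This collect-on-miss round trip can be sketched as follows; the store and collector stand-ins below are hypothetical simplifications of the context information DB 142 and the context information collector 112a.

```python
# Hypothetical sketch of the GPS-value collection round trip described above.

context_db: dict = {}   # stands in for the context information DB 142

def collect_from_vehicle(key: str):
    # Stand-in for the context information collector 112a querying the
    # vehicle controller 240.
    return {"gps_current_location": (37.37, 126.97)}[key]

def get_context(key: str):
    if key not in context_db:                         # not yet collected:
        context_db[key] = collect_from_vehicle(key)   # collect and store it
    return context_db[key]                            # then read from the DB

print(get_context("gps_current_location"))
```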
The dialog input manager 111c may combine [domain: navigation], [action: route guidance], [voice behavior: request], [parameter: NLU: destination: Seoul station], and [context information: current position: Uiwang station (GPS value)] into the natural language understanding result, and then transmit the combined information to the dialog manager 120.
Referring to fig. 35, the dialog flow manager 121 may search the dialog and action state DB 147 and determine whether there is a dialog task or an action task currently in progress. At this time, the dialog flow manager 121 may refer to the dialog policy DB 148. In this embodiment, it is assumed that there is no currently ongoing dialog task or action task.
The dialog flow manager 121 may request the dialog action manager 122 to generate an action task and a dialog task corresponding to the output of the input processor 110. Generating the action task and the dialog task means designating a storage space for storing and managing information related to the action state and the dialog state.
Accordingly, the dialog action manager 122 may specify a storage space in the dialog and action state DB147 to store information about the action state and the dialog state.
The dialog action manager 122 may send the action state and the dialog state to the action priority determiner 125.
The action priority determiner 125 may search the relation action DB 146b for vehicle state check and gas station recommendation actions related to route guidance. The route guidance action and its related actions may become the candidate actions.
The action priority determiner 125 may determine the priority of the candidate action according to pre-stored rules. The priority may be determined before determining the execution condition of the candidate action, or alternatively, only the priority may be determined with respect to the candidate action satisfying the execution condition after determining the execution condition of the candidate action.
The candidate action list may be sent back to the dialog action manager 122, and the dialog action manager 122 may update the action state by adding the searched related actions.
Referring to fig. 36, the action priority determiner 125 may search the action execution condition DB 146c for the execution condition of each candidate action and the parameters required to determine the condition. The action priority determiner 125 may also determine the priority among the candidate actions.
For example, the condition for the vehicle state check may be that the destination distance is equal to or greater than 100 km, where the parameter for determining the condition corresponds to the destination distance.
The condition for the gas station recommendation may be that the destination distance is greater than the distance to empty (DTE), where the parameters for determining the condition correspond to the destination distance and the DTE.
The dialog action manager 122 may update the action state by adding the condition for executing each candidate action and the parameters required to determine the condition to the dialog and action state DB 147.
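Once the condition-determination parameters are available, the two conditions above can be evaluated mechanically, as in the sketch below; the thresholds follow the text, while the data layout is invented.

```python
# Illustrative evaluation of the candidate-action execution conditions
# described above (thresholds follow the text; the data layout is invented).

def executable(action: str, params: dict) -> bool:
    if action == "vehicle state check":
        return params["destination_distance_km"] >= 100
    if action == "gas station recommendation":
        return params["destination_distance_km"] > params["dte_km"]
    if action == "route guidance":
        return "current_location" in params and "destination" in params
    return False

params = {"destination_distance_km": 80, "dte_km": 60,
          "current_location": "Uiwang station",
          "destination": "Seoul station exit 4"}
for a in ("route guidance", "vehicle state check", "gas station recommendation"):
    print(a, "->", executable(a, params))
# route guidance -> True; vehicle state check -> False (80 < 100);
# gas station recommendation -> True (80 > 60)
```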
The action priority determiner 125 may search the dialog and action state DB147, the context information DB142, the long term memory 143, or the short term memory 144 for parameter values required to determine whether the candidate action satisfies the execution condition, and acquire the parameter values from the dialog and action state DB147, the context information DB142, the long term memory 143, or the short term memory 144.
The action priority determiner 125 may obtain the parameter values from the dialog and action state DB147 when the parameter values are included in the previous dialog contents, in the context information related to the dialog contents, or in the context information related to the generated event.
When the action priority determiner 125 fails to retrieve parameter values from the dialog and action state DB147, the context information DB142, the long term memory 143, or the short term memory 144, the action priority determiner 125 may request the parameter values from the external information manager 126.
For example, the destination distance may be acquired from the external content server 300 providing the navigation service via the external information manager 126, and the DTE may be acquired from the context information DB 142. Meanwhile, in order to search for the destination distance, correct destination information for the navigation service may be required. In this embodiment, the destination entered from the user's utterance corresponds to "Seoul station", which may refer to any of various places having names beginning with "Seoul station", as well as the specific "Seoul station" the user means. Therefore, it may be difficult to search for the correct destination distance using only "Seoul station".
The parameter values may be obtained from a mobile device 400 connected to the vehicle 200 as needed. For example, when user information (e.g., contacts and schedules) that is not stored in the long-term memory 143 is required as parameter values, the external information manager 126 may request the mobile device 400 for required information and then acquire the required parameter values.
When it is not possible to acquire the parameter values via the storage device 140, the external content server 300, or the mobile device 400, the required parameter values may be acquired by asking the user.
The action priority determiner 125 may determine the execution conditions of the candidate actions by using the parameter values. Since the destination distance has not yet been obtained, the determination of the execution conditions for the vehicle state check action and the gas station recommendation can be postponed.
As shown in fig. 37, the dialog action manager 122 may update the action state by adding to the dialog and action state DB 147 the acquired parameter values and whether each action execution condition is satisfied, as determined using those values.
The dialog action manager 122 may request a parameter list from the parameter manager 124 for performing the candidate action.
The parameter manager 124 may acquire the current location and the destination from the action parameter DB 146a as necessary parameters for performing the route guidance action, and extract the route type (initial value: express route) as the selection parameter.
The parameter manager 124 may acquire a check part (initial value: entire part) for performing the vehicle state check action as the selective parameter, and extract a favorite gas station (initial value: a-gas station) as the selective parameter for performing the gas station recommendation action.
The extracted parameter list may be sent to the dialog action manager 122 and used to update the action state.
The parameter manager 124 may search the reference location of each parameter in the dialog and action state DB 147, the context information DB 142, the long-term memory 143, and the short-term memory 144 for the corresponding parameter value, in order to obtain the parameter values corresponding to the necessary and selective parameters of the candidate actions. When a parameter value needs to be provided via an external service, the parameter manager 124 may request the required parameter value from the external content server 300 via the external information manager 126.
The parameters for determining the execution condition of a candidate action and the parameters for executing the candidate action may overlap. When, among the parameter values acquired by the action priority determiner 125 and stored in the dialog and action state DB 147, there are values corresponding to the parameters (necessary and selective) for performing a candidate action, those values may be used.
Referring to fig. 38, the dialog action manager 122 may update the action state by adding parameter values acquired by the parameter manager 124.
As described above, when the destination (Seoul station) extracted from the user's utterance is used as a parameter of the route guidance action, there may be ambiguity. Therefore, the parameter of the route guidance action (destination), the parameter of the vehicle state check action (destination distance), and the parameter of the gas station recommendation (destination distance) may not have been acquired yet.
When [parameter: NLU: destination: Seoul station] is converted into a destination parameter suitable for the route guidance action, the ambiguity resolver 123 may check whether ambiguity exists. As mentioned above, "Seoul station" may refer to various places having names beginning with "Seoul station", as well as the "Seoul station" the user specifically means.
The ambiguity resolver 123 can confirm, by referring to the morpheme analysis result, that a modifier of "Seoul station" is present in the user utterance. The ambiguity resolver 123 can search the long-term memory 143 or the short-term memory 144 for schedules, movement positions, and contacts to identify the location of the "Seoul station we went to yesterday".
For example, the ambiguity resolver 123 can confirm from the user's movement history of yesterday that the "Seoul station we went to yesterday" is "Seoul station exit 4". After confirming the existence of the POI "Seoul station exit 4", the ambiguity resolver 123 may obtain the corresponding value.
The destination information obtained by the ambiguity resolver 123 may be sent to the dialogue action manager 122, and the dialogue action manager 122 may update the action state by adding "seoul station exit 4" to the destination parameter of the candidate action.
The parameter manager 124 may acquire the destination information (Seoul station exit 4) from the dialog and action state DB 147 and request the destination distance value from the external content server 300 providing the navigation service via the external information manager 126.
Referring to fig. 39, when the external information manager 126 acquires a destination distance value (80km) from the external content server 300 and then transmits the destination distance value to the parameter manager 124, the parameter manager 124 may transmit the destination distance value to the dialogue action manager 122 to allow the action state to be updated.
The action priority determiner 125 may determine whether the candidate actions are executable by referring to the action state, and adjust their priorities. Since the parameter values of the current location and the destination, which are the necessary parameters, have been acquired, it can be determined that the route guidance action is executable. Since the destination distance (80 km) is less than 100 km, it can be determined that the vehicle state check action is not executable. Since the destination distance (80 km) is greater than the DTE, it can be determined that the gas station recommendation action is executable.
Since the vehicle state checking action is not executable, the vehicle state checking action may be excluded from the priority determination. Thus, the route guidance action may be ranked first and the gas station recommendation action may be ranked second.
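A sketch of this ordering step: non-executable candidates drop out, and the remainder keep their pre-stored rank. The rank table below is a hypothetical example.

```python
# Hypothetical priority adjustment: exclude non-executable candidates,
# then order the rest by a pre-stored rank (smaller = higher priority).

PRIORITY_RANK = {"route guidance": 1,
                 "vehicle state check": 2,
                 "gas station recommendation": 3}

candidates = {"route guidance": True,          # executable?
              "vehicle state check": False,    # 80 km < 100 km
              "gas station recommendation": True}

ordered = sorted((a for a, ok in candidates.items() if ok),
                 key=PRIORITY_RANK.get)
print(ordered)   # ['route guidance', 'gas station recommendation']
```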
The dialog action manager 122 may update the action state based on whether the candidate action is executable and the priority of the modification.
The dialog flow manager 121 can check the dialog state and the action state stored in the dialog and action state DB147 and can develop a dialog policy by referring to the dialog policy DB 148 to continue the dialog. For example, dialog flow manager 121 may select the highest priority action among the executable actions, and dialog flow manager 121 may request response generation manager 131 to generate a response for conducting a dialog according to dialog policy DB 148.
The dialog state and action state stored in the dialog and action state DB147 may be updated to [ state: confirm the start of route guidance ].
Referring to fig. 40, the response generation manager 131 may request the dialog response generator 132 to generate a response in response to the request of the dialog flow manager 121.
For example, the dialog response generator 132 may generate a dialog response configured to output, in TTS and text form, "It is expected to take 30 minutes from Uiwang station to Seoul station exit 4. Do you want to start guidance?".
As another example, the dialog flow manager 121 may send a pre-utterance event to the dialog response generator 132. The dialog response generator 132 may then generate a dialog response configured to output, in TTS and text form, "Who are you? Tell me your name." Alternatively, the dialog response generator 132 may generate a dialog response configured to output, in TTS and text form, "Passenger A will leave at the stopover." In other words, the dialog response generator 132 may generate a question for identifying a passenger when the passenger boards the vehicle, and may determine the likelihood of each passenger leaving the vehicle and the likelihood of each passenger boarding again after leaving, and generate a dialog related thereto.
The response generation manager 131 may transmit the TTS response and the text response generated by the dialog response generator 132 to the output manager 133 and the memory manager 135, and the output manager 133 may transmit the TTS response to the speaker 232 and the text response to the display 231. At this point, the output manager 133 may send the TTS response to the speaker 232 after passing it through a TTS module configured to synthesize speech from text.
The memory manager 135 may store the user-requested route guidance in the short-term memory 144 or the long-term memory 143.
As shown in fig. 41, when the user says "yes", the user's utterance may be input to the speech recognizer 111a and output as [text: yes], and the natural language understanding section 111b may output [domain: -], [action: -], [voice behavior: -], and [morpheme analysis result: yes/IC].
The natural language understanding result may be transmitted to the dialog input manager 111c, and the dialog input manager 111c may transmit the natural language understanding result to the dialog manager 120.
Referring to fig. 42, the dialog flow manager 121 may search the dialog and action state DB 147 and analyze the previous dialog state. The dialog flow manager 121 may request the dialog action manager 122 to update the dialog/action related to the currently executed route guidance.
The dialog action manager 122 may update the dialog state and the action state to [ state: route guidance starts ].
Dialog flow manager 121 may request that result processor 130 generate a response for initiating route guidance.
Referring to fig. 43, the dialog action manager 122 may update the dialog state to [state: proceed next dialog] and update the action state to [state: execute].
The dialog flow manager 121 may request the response generation manager 131 to generate a response to the route guidance.
The dialog response generator 132 may generate a dialog response configured to output "start route guidance" as TTS and a text form, and then transmit the dialog response to the response generation manager 131.
The command generator 136 may generate a command to perform route guidance [target: navigation, command: route guidance, destination: Seoul station exit 4, start: Uiwang station], and then send the command to the response generation manager 131.
The response generation manager 131 may transmit the generated dialog response and command to the output manager 133. The output manager 133 may output the dialog response via the display 231 and the speaker 232. The output manager 133 may transmit a route guidance command to the AVN 230 of the vehicle 200 or to the external content server 300 providing the navigation service via the vehicle controller 240.
Referring to fig. 44, the dialog flow manager 121 may select a gas station recommendation as the next executable action and request the response generation manager 131 to generate a response configured to ask the user whether to recommend a gas station.
The dialog state and the action state may be updated to [ state: check for relevant service recommendations ].
The response generation manager 131 may request the dialog response generator 132 to generate a TTS response and a text response, and the dialog response generator 132 may generate a dialog response configured to output, in TTS and text form, "There is not enough fuel to reach the destination. Do you want to add A gas station as a stopover?".
Response generation manager 131 may transmit the TTS response and the text response generated by dialog response generator 132 to output manager 133 and memory manager 135, and output manager 133 may transmit the TTS response to speaker 232 and the text response to display 231.
A dialog response configured to ask "There is not enough fuel to reach the destination. Do you want to add A gas station as a stopover?" may be output via the display 231 and the speaker 232. As shown in fig. 45, when the user says "no", the user's utterance may be input to the speech recognizer 111a and output as [text: no], and the natural language understanding section 111b may output [domain: -], [action: -], [voice behavior: -], and [morpheme analysis result: no/IC].
The dialog flow manager 121 may request the dialog action manager 122 to update the dialog state and the action state.
The dialog action manager 122 may update the dialog state to [ state: proceed next dialog ] and update the action state to [ state: cancel ].
The dialog flow manager 121 can request the response generation manager 131 to generate a response indicating that the gas station recommendation service is cancelled, and the dialog flow manager 121 can check whether there is a dialog to be continued. When a dialog to be continued does not exist, the dialog flow manager 121 may update the dialog state to [ state: idle ] and wait for user input.
The flow of data processing described above is merely an example applied to the dialogue system 100. Accordingly, the order in which each component of the dialog system 100 processes data is not limited to the above-described example, and thus a plurality of components may process data at the same time, or a plurality of components may process data in an order different from the above-described example.
Hereinafter, according to an embodiment, a dialogue processing method will be described. According to the embodiment, the dialogue processing method may be applied to the dialogue system 100 described above or the vehicle 200 provided with the dialogue system 100. Therefore, the description of fig. 1 to 45 will be applied to the dialogue processing method in the same manner.
Fig. 46 is a flowchart illustrating a method of processing a user input in a dialog processing method according to an embodiment. The method of processing user input may be performed in the input processor 110 of the dialog system 100.
Referring to fig. 46, when an utterance of a user is input (yes in 500), the speech recognizer 111a may recognize the input utterance of the user (510). The user's utterance may be input to the voice input device 210 provided in the vehicle 200 or the voice input device 410 provided in the mobile device 400.
The speech recognizer 111a may recognize an input utterance of the user and output the utterance in a text form.
The natural language understanding section 111b may apply a natural language understanding technique to the utterance in text form (520), and output a result of the natural language understanding.
Specifically, the natural language understanding process (520) may include performing morpheme analysis (521) on an utterance in text form, extracting a domain from the utterance based on morpheme analysis results (522), recognizing an entity name (523), analyzing speech behavior (524), and extracting an action (525).
The extraction of the domain, the identification of the entity name, and the extraction of the action may be performed by referring to the domain/action inference rule DB 141.
The output of the natural language understanding section 111b, i.e., the result of natural language understanding, may include the domain, the action, the voice behavior, and the morpheme analysis result corresponding to the user's utterance.
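The natural language understanding result enumerated above maps naturally onto a small record type, as sketched below; the field names and sample values are illustrative.

```python
from dataclasses import dataclass

# Illustrative record for the natural language understanding result
# described above (field names and sample values are hypothetical).

@dataclass
class NLUResult:
    domain: str | None          # e.g. "navigation"
    action: str | None          # e.g. "route guidance"
    voice_behavior: str | None  # e.g. "request"
    morphemes: list[str]        # morpheme analysis result
    parameters: dict            # e.g. {"destination": "Seoul station"}

result = NLUResult(
    domain="navigation",
    action="route guidance",
    voice_behavior="request",
    morphemes=["go", "Seoul station", "yesterday"],
    parameters={"destination": "Seoul station"},
)
print(result.action)
```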
Contextual information related to the extracted action may be searched (530). Contextual information related to the extracted action may be stored in the context understanding table 145. The context understanding part 112c may search the context understanding table 145 for context information related to the extracted action, and the context information processor 112 may acquire the information value of the searched context information from the context information DB 142, the long-term memory 143, or the short-term memory 144.
When additional context information is needed (yes in 540), i.e., when the context information cannot be acquired from the context information DB 142, the long-term memory 143, or the short-term memory 144, the context understanding part 112c may request collection of the corresponding context information (550). Inputs other than voice, such as vehicle state information, surrounding environment information, and driver information, may be input via the context information collector 112a separately from the input of the user's utterance.
The information may be entered periodically or only upon the occurrence of a particular event. In addition, information may be input periodically and then additionally input when a specific event occurs. In any case, when information collection is requested, the corresponding information may be actively collected.
Accordingly, when the context information related to the action has been collected, the corresponding information may be acquired from the context information DB142, the long term memory 143, or the short term memory 144, and otherwise, the corresponding information may be collected via the context information collector 112 a.
When the context information collector 112a, which has received the request for collecting the context information, collects the corresponding context information and stores the information in the context information DB142, the context understanding part 112c can acquire the corresponding context information from the context information DB 142.
When the context information collection manager 112b determines that a certain event occurs because the data collected by the context information collector 112a satisfies a predetermined condition, the context information collection manager 112b can send an action trigger signal to the context understanding part 112 c.
The context understanding part 112c can search the context understanding table 145 for the context information related to the corresponding event, and when the searched context information is not stored in the context understanding table 145, the context understanding part 112c can again transmit the context information request signal to the context information collection manager 112 b.
When the collection of the required context information is completed, the result of the natural language understanding and the context information may be transmitted to the dialog manager 120 (560). When an event occurs, information about the event (which event occurred) and context information about the event occurred may also be transmitted.
Fig. 47 is a flowchart illustrating a method of managing a dialog using an output of an input processor in a dialog processing method according to an embodiment. The dialog processing method may be performed by the dialog manager 120 of the dialog system 100.
Referring to fig. 47, the dialog flow manager 121 may search the dialog and action state DB 147 for a relevant dialog history (600).
In this embodiment, the case of extracting a domain and an action from the user's utterance is described as an example, but there may be cases where it is impossible to extract a domain or an action from the user's utterance because of ambiguity in the content or context of the utterance. In this case, the dialog action manager 122 may generate a random dialog state, and the ambiguity resolver 123 may recognize the user's intention based on the content of the user's utterance, the environmental conditions, the vehicle state, and the user information, and determine an action suited to the user's intention.
When there is a related dialog history (yes in 600), the related dialog history may be referred to (690). When there is no relevant conversation history (NO in 600), new conversation tasks and action tasks may be generated (610).
A related action list related to an action (hereinafter, referred to as an input action) extracted from the utterance of the user may be searched in the relation action DB 146b, and a candidate action list may be generated (620). The input action and the action associated with the input action may correspond to a list of candidate actions.
The action execution condition DB 146c may be searched for the execution condition of each candidate action (630). The execution condition represents a necessary condition for executing the action. Thus, when the respective condition is satisfied, the action may be determined to be executable, but when the respective condition is not satisfied, the action may be determined not to be executable. In the action execution condition DB 146c, information on the type of parameter used to determine the action execution condition may also be stored.
Parameter values for determining action execution conditions may be obtained (640). The parameter for determining the action execution condition may be referred to as a condition determination parameter. Parameter values of the condition determining parameters can be acquired by searching the context information DB142, the long term memory 143, the short term memory 144, or the dialogue and action state DB 147. When a parameter value of a parameter needs to be determined via an external service providing condition, the required parameter value may be provided from the external content server 300 via the external information manager 126.
When it is impossible to acquire the required parameter values due to ambiguities in the context and utterances, the required parameter values can be acquired by solving the ambiguities using the ambiguity resolver 123.
Even when an acquired parameter is an ineffective parameter with which it is difficult to determine the action execution condition, the ambiguity resolver 123 may obtain an effective parameter from the ineffective parameter.
Based on the obtained condition determination parameters, it may be determined whether each candidate action is executable (650), and a priority of the candidate action may be determined (660). Rules for determining the priority of candidate actions may be pre-stored. The action priority determiner 125 may determine the priority of the candidate actions by considering only executable candidate actions after determining whether each candidate action is executable. Alternatively, after determining the priority of the candidate action regardless of whether each candidate action is executable, the priority of the candidate action may be modified based on whether each candidate action is executable.
The action parameter DB 146a may be searched for a parameter list for performing a candidate action (670). The parameters for performing the candidate action may correspond to action parameters. The action parameters may include a necessary parameter and a selective parameter.
Parameter values for performing the candidate action may be obtained (680). Parameter values of the action parameters can be acquired by searching the context information DB142, the long term memory 143, the short term memory 144, or the dialogue and action state DB 147. When the parameter value of the action parameter needs to be provided via the external service, the required parameter value may be provided from the external content server 300 via the external information manager 126.
When the required parameter values cannot be acquired due to ambiguities in context and utterances, the required parameter values can be acquired by resolving the ambiguities using the ambiguity resolver 123.
Even when an acquired parameter is an ineffective parameter with which it is difficult to determine the action execution condition, the ambiguity resolver 123 may obtain an effective parameter from the ineffective parameter.
Through the above-described steps, the dialog action manager 122 manages the dialog state and the action state, which may be updated whenever the state changes.
When all available parameter values are obtained, the dialog flow manager 121 may send information about the candidate actions and dialog states to the results processor 130. According to the dialog policy, the dialog flow manager 121 may transmit information on an action corresponding to the first priority or information on a plurality of candidate actions.
When the required parameter values can be acquired only by the user because the required parameter values do not exist in the external content server 300, the long term memory 143, the short term memory 144, and the context information DB 142, a dialog response for inquiring about the parameter values may be output to the user.
Fig. 48 is a flowchart illustrating a result processing method for generating a response corresponding to a result of dialog management in the dialog processing method according to the embodiment. The result processing method may be performed by the result processor 130 of the dialog system 100.
Referring to fig. 48, when a dialog response needs to be generated (yes in 700), the dialog response generator 132 may search the response template 149 (710). The dialog response generator 132 may retrieve a dialog response template corresponding to the current dialog state and action state and populate the response template with the required parameter values to generate a dialog response (720).
When the parameter values required to generate a dialog response are not transmitted from the dialog manager 120, or when an instruction to use external content is transmitted, the required parameter values may be provided from the external content server 300 or searched for in the long-term memory 143, the short-term memory 144, or the context information DB 142. When the required parameter values can be acquired only from the user because they do not exist in the external content server 300, the long-term memory 143, the short-term memory 144, and the context information DB 142, a dialog response asking the user for the parameter values may be generated.
When a command needs to be generated (760), the command generator 136 may generate a command for vehicle control or external content (770).
The generated dialog responses or commands may be input to the output manager 133, and the output manager 133 may determine an output order between the dialog responses and the commands or among the commands (730).
The memory may be updated based on the generated dialog response or command (740). The memory manager 135 may update the short term memory 144 by storing contents of a dialog between a user and a system based on the generated dialog response or command, and update the long term memory 143 by storing information about the user acquired through the dialog with the user. The memory manager 135 may update the user's preferences and vehicle control history stored in the long-term memory 143 based on the generated and outputted vehicle control and external content requests.
The output manager 133 may output the response by sending the dialog response and the command to the appropriate output location (750). TTS responses may be output via speaker 232 and text responses may be output on display 231. The command may be transmitted to the vehicle controller 240 according to the control object or transmitted to the external content server 300. In addition, the command may be transmitted to a communication device 280 configured to communicate with the external content server 300.
Fig. 49 to 51 are flowcharts showing a case where the dialogue system according to the embodiment outputs a pre-utterance before an utterance is input in the dialogue processing method.
Referring to fig. 49, the context information collector 112a and the context information collection manager 112b collect context information (810). Specifically, the vehicle controller 240 may input vehicle state information and driving environment information acquired by sensors provided in the vehicle, such as the remaining fuel amount, rainfall speed, surrounding obstacle information, speed, engine temperature, tire pressure, and current position, to the contextual information processor 112. User information input via the information input device 220 other than voice and information acquired from the external content server 300 or an external device may also be input to the contextual information processor 112. The collected context information may be stored in the context information DB 142, the long-term memory 143, or the short-term memory 144.
The pre-utterance determiner 151 determines whether a pre-utterance condition is satisfied based on the context information (811). The pre-utterance conditions may be stored in the pre-utterance condition table 145a. As shown in fig. 25A to 25D, a pre-utterance condition related to context information may be stored in the pre-utterance condition table 145a for each piece of context information.

When the context information transmitted from the context information DB 142, the long-term memory 143, or the short-term memory 144 satisfies the pre-utterance condition (yes in 812), the pre-utterance determiner 151 determines the current context to be a pre-utterance context and generates a pre-utterance trigger signal (813).

The pre-utterance determiner 151 extracts an action corresponding to the pre-utterance context (814). As shown in fig. 25C, the action corresponding to the pre-utterance context may be pre-stored in the pre-utterance condition table 145a. The pre-utterance determiner 151 may acquire the action corresponding to the pre-utterance context from the pre-utterance condition table 145a. In addition, the pre-utterance determiner 151 may generate an action corresponding to the pre-utterance context according to established rules.
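For illustration, a condition table of this kind could be sketched as follows; the predicates, thresholds, and action names are hypothetical, not the stored contents of the table 145a.

```python
# Hypothetical pre-utterance condition table and check (steps 811-814).
PRE_UTTERANCE_TABLE = [
    {"condition": lambda ctx: ctx.get("fuel_level", 100) < 10,
     "action": "route_to_gas_station"},
    {"condition": lambda ctx: ctx.get("tire_pressure", 35) < 25,
     "action": "warn_tire_pressure"},
]

def check_pre_utterance(context):
    for entry in PRE_UTTERANCE_TABLE:
        if entry["condition"](context):
            # Trigger signal plus the action matching the pre-utterance context.
            return {"trigger": True, "action": entry["action"]}
    return {"trigger": False}

print(check_pre_utterance({"fuel_level": 7}))  # -> route_to_gas_station
```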
When the pre-utterance determiner 151 transmits the pre-utterance trigger signal together with the action corresponding to the pre-utterance context to the dialog input manager 111c, the dialog input manager 111c transmits the action corresponding to the pre-utterance context to the dialog manager 120 (815). In this case, a signal indicating the pre-utterance context may be transmitted together with the pre-utterance trigger signal.

As shown in fig. 47, after the action corresponding to the pre-utterance context is transmitted to the dialog manager 120, a series of processes, such as generation of a dialog task and an action task and acquisition of action parameters, may be performed. When other dialog tasks or action tasks are being performed, the dialog flow manager 121 may generate and process the tasks related to the pre-utterance context first, or may determine their priority according to established rules.
When the dialog manager 120 sends information about the first performed action to the results processor 130, the dialog response generator 132 may retrieve a dialog response template by searching the response template 149 and generate a dialog response by populating the extracted dialog response template with parameter values. The generated dialog response may be sent to the output manager 133 via the response generation manager 131. The output manager 133 may output the generated dialogue response via a speaker provided in the vehicle 200 or the mobile device 400.
Further, the pre-utterance message itself, corresponding to the pre-utterance context, may be acquired or generated. Referring to fig. 50, the context information collector 112a and the context information collection manager 112b collect context information (820), and the pre-utterance determiner 151 determines whether a pre-utterance condition is satisfied based on the context information (821).

When the context information transmitted from the context information DB 142, the long-term memory 143, or the short-term memory 144 satisfies the pre-utterance condition (yes in 822), the pre-utterance determiner 151 determines the current context to be a pre-utterance context and generates a pre-utterance trigger signal (823).

The pre-utterance determiner 151 extracts a pre-utterance message corresponding to the pre-utterance context (824). As shown in fig. 25A to 25D, the pre-utterance message corresponding to the pre-utterance context may be pre-stored in the pre-utterance condition table 145a. The pre-stored pre-utterance message may be content describing the current context or content proactively suggesting a specific function or service needed in the pre-utterance context. In addition, the pre-utterance determiner 151 may generate a pre-utterance message according to established rules.
When the pre-utterance determiner 151 transmits the pre-utterance trigger signal to the dialog input manager 111c together with the pre-utterance message, the dialog input manager 111c may transmit the pre-utterance message to the dialog manager 120 (825). In this case, the pre-utterance trigger signal may be transmitted together with a signal indicating a pre-utterance context.
The dialog manager 120 may generate a dialog task for outputting the transmitted pre-utterance message and transmit the dialog task to the result processor 130. The result processor 130 may output the input pre-utterance message via the speaker 232.

Further, a virtual user utterance corresponding to the pre-utterance context may be extracted. Referring to fig. 51, the context information collector 112a and the context information collection manager 112b collect context information (830), and the pre-utterance determiner 151 determines whether a pre-utterance condition is satisfied based on the context information (831).

When the context information transmitted from the context information DB 142, the long-term memory 143, or the short-term memory 144 satisfies the pre-utterance condition (yes in 832), the pre-utterance determiner 151 determines the current context to be a pre-utterance context and generates a pre-utterance trigger signal (833).

The pre-utterance determiner 151 extracts a virtual user utterance corresponding to the pre-utterance context (834). Although not shown in the drawings, the virtual user utterance corresponding to the pre-utterance context may be pre-stored in the pre-utterance condition table 145a. The pre-utterance determiner 151 may acquire the virtual user utterance from the pre-utterance condition table 145a, or may generate a virtual user utterance corresponding to the pre-utterance context according to established rules.

When the pre-utterance determiner 151 transmits the virtual user utterance in text form to the natural language understanding section 111b (835), the natural language understanding section 111b may acquire a domain and an action from the virtual user utterance in the same manner as when the user actually speaks.

The dialog input manager 111c sends the pre-utterance trigger signal together with the natural language understanding result to the dialog manager 120 (836). The result of the natural language understanding may include a domain and an action extracted from the virtual user utterance, and the extracted domain and action become the domain and action corresponding to the pre-utterance context.
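As an illustration of this virtual-utterance variant, the sketch below feeds a stored virtual utterance through a stand-in NLU function; the rule-based "NLU" is a placeholder, not the natural language understanding section 111b itself.

```python
# Hypothetical flow for steps 834-836: the virtual utterance takes the same
# path as real speech, yielding a domain and an action.
def understand(utterance_text):
    # Stand-in for the natural language understanding step.
    if "gas station" in utterance_text:
        return {"domain": "navigation", "action": "route_to_gas_station"}
    return {"domain": "unknown", "action": "none"}

def trigger_pre_utterance(virtual_utterance):
    result = understand(virtual_utterance)  # same path as spoken input
    # The dialog input manager would forward this with the trigger signal.
    return {"pre_utterance_trigger": True, **result}

print(trigger_pre_utterance("Find a gas station nearby"))
```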
For example, when the mobile device 400 acts as a gateway connecting the vehicle to the dialog system 100, the dialog system client 470 of the mobile device 400 may perform some of the operations of the pre-utterance determiner 151. In this case, the dialog system client 470 may generate a virtual user utterance corresponding to the pre-utterance context and send the virtual user utterance to the natural language understanding section 111b.

As shown in fig. 47, after the pre-utterance trigger signal is transmitted to the dialog manager 120 together with the natural language understanding result, a series of processes, such as generation of a dialog task and an action task and acquisition of action parameters, may be performed. When other dialog tasks or action tasks are being performed, the dialog flow manager 121 may generate and process the tasks related to the pre-utterance context first, or may determine their priority according to established rules.
When the dialog manager 120 sends information about the first performed action to the results processor 130, the dialog response generator 132 may retrieve a dialog response template by searching the response template 149 and generate a dialog response by populating the extracted dialog response template with parameter values. The generated dialog response may be sent to the output manager 133 via the response generation manager 131. The output manager 133 may output the generated dialogue response via a speaker provided in the vehicle 200 or the mobile device 400.
Fig. 52 is a flowchart illustrating a process of processing a repetitive task when the dialog system outputs a pre-utterance before a user inputs an utterance in the dialog processing method according to the embodiment.
Referring to fig. 52, the context information collector 112a and the context information collection manager 112b collect context information (840), and the pre-utterance determiner 151 determines whether a pre-utterance condition is satisfied based on the context information (841).

The pre-utterance determiner 151 determines whether the context information transmitted from the context information DB 142, the long-term memory 143, or the short-term memory 144 satisfies a pre-utterance condition. When the condition is satisfied (yes in 842), the repetitive task processor 152 determines whether a task related to the currently occurring pre-utterance context is a repetitive task (843).

Specifically, the repetitive task processor 152 may determine whether a task related to the currently occurring pre-utterance context, such as a dialog or an action, has already been performed or is currently being performed, based on the information about previously or currently performed tasks stored in the task processing DB 145b.

For example, when a dialog related to the currently occurring pre-utterance context has already been performed and a reference time period has not yet elapsed since that dialog, the repetitive task processor 152 may determine the task related to the current pre-utterance context to be a repetitive task. Likewise, when a dialog and an action related to the current pre-utterance context are currently being performed, the repetitive task processor 152 may determine the task to be a repetitive task.

That is, based on the dialog history stored in the task processing DB 145b, the repetitive task processor 152 may determine whether a pre-utterance for the current pre-utterance context has already been output, what the user's intention regarding it was, and whether the corresponding task has been performed. The repetitive task processor 152 may then judge whether the task is repetitive based on the stored dialog time, the user's intention, or whether the task has been processed.
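For illustration only, the duplicate check could look like the following sketch; the one-hour reference period and the record fields are assumptions.

```python
# Hypothetical duplicate-task check (step 843): suppress the pre-utterance if
# the same context was handled recently or is currently in progress.
import time

REFERENCE_PERIOD_S = 3600  # assumed one-hour window

def is_repetitive(task_db, pre_utterance_context):
    record = task_db.get(pre_utterance_context)  # stand-in for DB 145b
    if record is None:
        return False
    if record["status"] == "in_progress":
        return True
    return time.time() - record["dialog_time"] < REFERENCE_PERIOD_S
```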
When the task related to the current pre-utterance context is recognized as the repetitive task (yes in 843), the repetitive task processor 152 terminates the pre-utterance context.
When it is determined that the task related to the current pre-utterance context is not a repetitive task (no in 843), the pre-utterance operation (844) described in the above embodiments may be performed. For example, the pre-utterance trigger signal and the action or pre-utterance message corresponding to the pre-utterance context may be sent to the dialog manager 120. Alternatively, a virtual user utterance corresponding to the pre-utterance context may be transmitted to the natural language understanding section 111b, and the result of the natural language understanding together with the pre-utterance trigger signal may be transmitted to the dialog manager 120.

In the above-described embodiments, it is assumed that separate components, such as the pre-utterance determiner 151 and the repetitive task processor 152, and separate storage, such as the pre-utterance condition table 145a and the task processing DB 145b, are used to perform the dialogue processing method with a pre-utterance. However, the embodiments of the dialogue processing method are not limited thereto: the context understanding part 112c may perform the operations of the pre-utterance determiner 151 and the repetitive task processor 152, and the information stored in the pre-utterance condition table 145a and the task processing DB 145b may instead be stored in the context understanding table 145.
The dialogue processing method according to the embodiment is not limited to the order in the above-described flowcharts. The flow according to the flowcharts of fig. 44 to 52 may be only an example applied to the dialogue processing method. Therefore, a plurality of steps can be performed simultaneously, and the order of each step can also be changed.
Fig. 53 is a flowchart illustrating a method of determining that a passenger gets on a vehicle and outputting a pre-utterance in a dialogue processing method according to an embodiment.
Referring to fig. 53, the dialogue system 100 may determine whether a passenger has boarded the vehicle based on at least one of a dialogue between occupants in the vehicle and vehicle operation information (5300). For example, the dialogue system 100 may determine that a passenger has boarded the vehicle based on a dialogue between occupants in the vehicle input through the voice input processor 111. The occupants in the vehicle may include the driver and at least one passenger, and the vehicle operation information may include operation information of the information input device 220 other than voice.
The determination by the dialogue system 100 that the passenger gets on the vehicle may be performed within a certain period of time from when the vehicle 200 starts running or within a certain period of time from when the vehicle 200 stops running.
The voice input processor 111 may distinguish the voice of each passenger based on the dialogue between occupants in the vehicle input through the voice input device 210 provided in the vehicle 200 and the voice input device 410 provided in the mobile device 400.

The voice input processor 111 may detect each passenger by distinguishing, based on voice feature information, the voice of each passenger input through the voice input device 210 and the voice input device 410 provided in the mobile device 400.
The voice features may include at least one of verbal features and non-verbal features.
The dialogue between occupants in the vehicle, which is input through the voice input processor 111 to determine the boarding of passengers, refers not to an utterance directed at the vehicle 200 to convey an intention, but to a conversation among the occupants in the vehicle, including the driver.

In addition, the contextual information processor 112 of the dialog system 100 may determine that a passenger has boarded the vehicle based on the vehicle operation information. That is, the dialogue system 100 may use the vehicle operation information to detect a passenger who could not be detected through the voice input processor 111 because that passenger did not take part in the dialogue.

The vehicle operation information may include at least one of window adjustment button operation information, seat adjustment button operation information, or air conditioning adjustment button operation information related to the passenger seat 254b and the rear seats 254c and 254d.
The contextual information processor 112 may determine that the passenger is boarding the vehicle based on the vehicle operation information.
That is, the input processor 110 may collect passenger boarding information, indicating a context in which a passenger has boarded the vehicle or a context in which no passenger has boarded the vehicle, through at least one of the voice input processor 111 and the contextual information processor 112.
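A minimal sketch of fusing these two boarding cues follows; the speaker embeddings and seat identifiers are made up for the example.

```python
# Hypothetical fusion of boarding cues (5300): distinct voices heard in the
# cabin dialogue plus seats whose controls were operated.
def collect_boarding_info(speaker_features, operated_seats, driver_seat="driver"):
    # Each distinct voice feature vector counts as one detected occupant.
    occupants = {f"speaker_{i}" for i, _ in enumerate(speaker_features)}
    # Button operations reveal silent passengers the dialogue missed.
    occupants |= {seat for seat in operated_seats if seat != driver_seat}
    return occupants

boarded = collect_boarding_info(
    speaker_features=[[0.12, 0.87], [0.55, 0.20]],   # assumed voice embeddings
    operated_seats={"passenger_254b", "rear_254c"})
```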
When the dialogue system 100 determines that a passenger has boarded the vehicle within a period of time from when the vehicle 200 starts traveling or within a period of time from when the vehicle 200 stops traveling (yes in 5300), the dialogue system 100 may output a pre-utterance requesting identification information (5310). Specifically, when it is determined that a passenger has boarded the vehicle, the dialogue system 100 may output a pre-utterance requesting the passenger's identification information.

For example, when it is determined that a passenger has boarded the vehicle, the dialog system 100 may output a pre-utterance requesting the passenger's identification information, such as "Who are you? Tell me your name."

Based on the context information related to whether a passenger has boarded the vehicle, the pre-utterance determiner 151 of the input processor 110 may determine whether to output a pre-utterance according to a pre-utterance condition related to the determination that a passenger has boarded. When the context information satisfies this pre-utterance condition, the pre-utterance determiner 151 may determine the current context to be a pre-utterance context and generate a pre-utterance trigger signal.

The pre-utterance determiner 151 may acquire a pre-utterance message corresponding to the pre-utterance context in which a passenger has boarded the vehicle, such as "Who are you? Tell me your name." When the pre-utterance determiner 151 transmits the pre-utterance trigger signal and the pre-utterance message to the dialog input manager 111c, the dialog input manager 111c may transmit the pre-utterance message to the dialog manager 120.

The dialog manager 120 may generate a dialog task for outputting the transmitted pre-utterance message and transmit the pre-utterance message to the result processor 130. The result processor 130 may output the input pre-utterance message via the speaker 232.
The dialog system 100 can identify the passenger by receiving the passenger's utterance (5320). Specifically, the dialog system 100 can identify the passenger by receiving the passenger's utterance regarding the pre-utterance message.
For example, the passenger may say "I am 00" in response to the pre-utterance message requesting the passenger's identification information. That is, the passenger may speak a message including his/her name in response to the pre-utterance message.

When the passenger's utterance is input, the voice input processor 111 recognizes the input utterance. The passenger's utterance may be input through the voice input device 210 provided in the vehicle 200 or the voice input device 410 provided in the mobile device 400.
The speech recognizer 111a may recognize an input utterance of the user and output the utterance in a text form. The natural language understanding section 111b may apply a natural language understanding technique to the utterance in text form and output a result of the natural language understanding.
The natural language understanding process may include performing morpheme analysis on an utterance in text form and recognizing a name based on a result of the morpheme analysis.
In addition, the natural language understanding section 111b may increase the recognition rate of names by using the driver's phonebook stored in the long-term memory 143. Specifically, the natural language understanding section 111b may increase the recognition rate by comparing the name contained in the passenger's utterance with the names contained in the phonebook.
The passenger determiner 111d may verify the name of the passenger based on the output of the natural language understanding section 111b to identify the identity of the passenger.
Thus, based on the occupant's utterance, the dialog system 100 may identify the identity of the occupant who spoke.
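For illustration, the phonebook-assisted name match could be sketched as below; the fuzzy-matching cutoff is an assumption.

```python
# Hypothetical name verification (5320): match the recognized name against
# the driver's phonebook to boost the recognition rate.
import difflib

def identify_passenger(recognized_name, phonebook_names):
    # Prefer an exact phonebook entry; otherwise accept a close fuzzy match.
    if recognized_name in phonebook_names:
        return recognized_name
    close = difflib.get_close_matches(recognized_name, phonebook_names,
                                      n=1, cutoff=0.8)
    return close[0] if close else recognized_name
```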
Passenger information about the identified passenger may be stored in the storage device 140 in real time, wherein the passenger information may include personal identification information, one or more voice characteristics of the passenger's voice, and seat position information.
When it is determined that no passenger has boarded the vehicle (no in 5300), the dialogue system 100 may output a pre-utterance verifying whether a passenger is present (5330). Specifically, when the dialogue system 100 determines that no passenger has boarded the vehicle within a period of time from when the vehicle 200 starts traveling or within a period of time from when the vehicle 200 stops traveling, the dialogue system 100 may output a pre-utterance verifying whether a passenger is present.

For example, in that case the dialogue system 100 may output a pre-utterance verifying whether a passenger is present, such as "Is there any other passenger boarding the vehicle?"

Based on the context information about whether a passenger has boarded the vehicle, the pre-utterance determiner 151 may determine whether to output a pre-utterance according to a pre-utterance condition related to no passenger having boarded. When the context information satisfies this pre-utterance condition, the pre-utterance determiner 151 may determine the current context to be a pre-utterance context and generate a pre-utterance trigger signal.

The pre-utterance determiner 151 may acquire a pre-utterance message corresponding to the pre-utterance context in which no passenger has boarded the vehicle, such as "Is there any other passenger boarding the vehicle?" When the pre-utterance determiner 151 transmits the pre-utterance trigger signal and the pre-utterance message to the dialog input manager 111c, the dialog input manager 111c may transmit the pre-utterance message to the dialog manager 120.

The dialog manager 120 may generate a dialog task for outputting the transmitted pre-utterance message and transmit the pre-utterance message to the result processor 130. The result processor 130 may output the input pre-utterance message via the speaker 232.
The dialog system 100 can verify whether the passenger is present by receiving the driver's utterance (5340). Specifically, the dialog system 100 can verify whether a passenger is present by receiving an utterance of the driver regarding the pre-utterance message.
For example, in response to the pre-utterance message, the driver may say "No" or "Yes, there is" to indicate whether a passenger is present. That is, the driver may speak a message including a response indicating the presence or absence of a passenger in response to the pre-utterance message.

When the driver's utterance is input, the voice input processor 111 recognizes the input utterance. The driver's utterance may be input through the voice input device 210 provided in the vehicle 200 or the voice input device 410 provided in the mobile device 400.
The speech recognizer 111a may recognize an input utterance of the user and output the utterance in a text form. The natural language understanding section 111b may apply a natural language understanding technique to the utterance in text form and output a result of the natural language understanding.
The natural language understanding section 111b can recognize the entity name from the utterance. The entity name may be a proper noun, such as a person name, a place name, an organization name, a time, a date, and a currency, and the entity name recognition may be configured to recognize the entity name in the sentence and determine a type of the recognized entity name. The natural language understanding section 111b may acquire important keywords from the sentence using the entity name recognition and recognize the meaning of the sentence.
Specifically, the natural language understanding process may include performing morpheme analysis on an utterance in text form, and recognizing an entity name based on a result of the morpheme analysis.
The output of the natural language understanding section 111b as a result of the natural language understanding may include the entity name and the result of the morpheme analysis corresponding to the driver's utterance.
The passenger determiner 111d can recognize the presence of the passenger based on the output of the natural language understanding section 111 b.
Thus, the voice input processor 111 may verify the presence of the passenger based on the driver's utterance.
Fig. 54 is a flowchart illustrating a method of estimating a variation in the number of passengers and outputting a pre-utterance in a dialogue processing method according to an embodiment.
Referring to fig. 54, the dialog system 100 can identify the passenger as described in fig. 53 (5400). Specifically, the dialogue system 100 may determine that the passenger boards the vehicle based on at least one of dialogue between occupants in the vehicle and vehicle operation information, and identify the passenger by using the pre-speech.
The dialog system 100 may generate passenger quantity information based on the dialog between occupants in the vehicle (5410). Specifically, the dialogue system 100 may determine the possibility of each passenger leaving the vehicle at a specific stop point and the possibility of each passenger getting on the vehicle again after leaving at the specific stop point by continuously receiving dialogs between occupants in the vehicle.
For example, the voice input processor 111 of the dialogue system 100 may continuously receive dialogs between occupants in the vehicle through the voice input device 210 provided in the vehicle 200 or the voice input device 410 provided in the mobile device 400.
The speech recognizer 111a may recognize an input utterance of the user and output the utterance in a text form. The natural language understanding section 111b may apply a natural language understanding technique to the utterance in text form and output a result of the natural language understanding.
The natural language understanding section 111b can recognize the entity name from the utterance. The entity name may be a proper noun, such as a person name, a place name, an organization name, a time, a date, and a currency, and the entity name recognition may be configured to recognize the entity name in the sentence and determine a type of the recognized entity name. The natural language understanding section 111b may acquire important keywords from the sentence using the entity name recognition and recognize the meaning of the sentence.
Specifically, the natural language understanding process may include performing morpheme analysis on an utterance in text form, and recognizing an entity name based on a result of the morpheme analysis.
The output of the natural language understanding section 111b as a result of the natural language understanding may include the entity name and the result of the morpheme analysis corresponding to the utterance of the passenger.
The passenger determiner 111d may estimate the variation in the number of passengers based on the output of the natural language understanding section 111 b. Specifically, the passenger determiner 111d may estimate the variation in the number of passengers at a specific stop point by analyzing the words of the passengers.
The passenger determiner 111d may estimate that a specific passenger will exit at a specific stopping point based on the entity name and morpheme analysis result of the natural language understanding section 111 b.
The passenger determiner 111d may estimate that a specific passenger will leave and board the vehicle again at a specific stopping point based on the entity name and morpheme analysis result of the natural language understanding section 111 b.
Additionally, the dialog system 100 may estimate the number of potential passengers by receiving a phone-call conversation in the vehicle and determining the likelihood of the called party boarding the vehicle.
When estimating the likelihood of the potential passenger boarding the vehicle, the dialog system 100 may output a pre-utterance for verifying the likelihood of the potential passenger boarding the vehicle.
For example, when the likelihood of a potential passenger boarding the vehicle is estimated, the dialog system 100 may output a pre-utterance verifying that likelihood, such as "Who will board the vehicle on the way? Tell me his/her name."

Based on the context information related to whether the potential passenger will board the vehicle, the pre-utterance determiner 151 may determine whether to output a pre-utterance according to a pre-utterance condition related to the estimated likelihood of boarding. When the context information satisfies this pre-utterance condition, the pre-utterance determiner 151 may determine the current context to be a pre-utterance context and generate a pre-utterance trigger signal.

The pre-utterance determiner 151 may acquire a pre-utterance message corresponding to the pre-utterance context in which a potential passenger will board the vehicle, such as "Who will board the vehicle on the way? Tell me his/her name." When the pre-utterance determiner 151 transmits the pre-utterance trigger signal and the pre-utterance message to the dialog input manager 111c, the dialog input manager 111c may transmit the pre-utterance message to the dialog manager 120.

The dialog manager 120 may generate a dialog task for outputting the transmitted pre-utterance message and transmit the pre-utterance message to the result processor 130. The result processor 130 may output the input pre-utterance message via the speaker 232.
The dialog system 100 can verify the likelihood of a potential passenger boarding the vehicle by receiving an utterance of an occupant in the vehicle. In particular, the dialog system 100 can verify whether a potential passenger is present by receiving an utterance of an occupant regarding the pre-utterance message.
The passenger determiner 111d may estimate the variation in the number of passengers in the vehicle based on the output of the natural language understanding section 111 b. Specifically, the passenger determiner 111d may estimate the number of potential passengers based on the call session, and the passenger determiner 111d may also estimate the possibility of each passenger leaving the vehicle and the possibility of each passenger getting on the vehicle again after leaving based on the dialogue between the occupants in the vehicle.
The passenger determiner 111d may generate the passenger number information based on the estimation result of the variation of the passenger number.
That is, the passenger determiner 111d may generate the passenger number information based on the possibility that each passenger leaves the vehicle at the stopping point, the possibility that each passenger gets on the vehicle again after leaving at the stopping point, and the possibility that a potential passenger gets on the vehicle at the stopping point.
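A sketch of what such a passenger-number record might look like follows; the field names and the 0.5 decision threshold are illustrative assumptions.

```python
# Hypothetical passenger-number information (5410) built from the estimated
# likelihoods for each passenger.
def build_passenger_count_info(passengers):
    return [{"name": p["name"],
             "leave_at_stop": p.get("leave_prob", 0.0) > 0.5,
             "reboard_after_stop": p.get("reboard_prob", 0.0) > 0.5,
             "potential_boarder": p.get("boards_midway", False)}
            for p in passengers]

info = build_passenger_count_info(
    [{"name": "A", "leave_prob": 0.9},
     {"name": "B", "leave_prob": 0.8, "reboard_prob": 0.7}])
```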
Before reaching the stop point, the dialogue system 100 may output a pre-utterance related to the estimation result of the change in the number of passengers based on the passenger number information (5420).
For example, when it is determined that the vehicle 200 is before reaching the stopping point, the dialogue system 100 may output a pre-utterance related to the estimation result of the change in the number of passengers, such as "a will leave at the stopping point", "B will leave the vehicle at the stopping point and then board the vehicle again", "C will not leave at the stopping point", and "D will board the vehicle at the stopping point".
That is, before reaching the stopping point, the dialogue system 100 may output a pre-utterance related to the possibility that each passenger leaves the vehicle at the stopping point, the possibility that each passenger boards the vehicle again at the stopping point, and the possibility that a potential passenger boards the vehicle at the stopping point, all of which are included in the passenger number information.
However, the dialogue system 100 may output the pre-utterance relating to the estimation result of the change in the number of passengers based on the passenger number information not only before reaching the stop point but also after reaching the stop point.
Further, the content related to these possibilities may include a farewell message for a passenger leaving the vehicle, such as "See you next time", and a message for a passenger who will board the vehicle again after leaving, such as "Have a pleasant trip and come back safely".

The dialogue system 100 may determine whether the vehicle has reached, or is just before reaching, the stop point based on vehicle state information, such as the vehicle position and the vehicle speed detected by the vehicle detector 260.

Specifically, when the gear is placed in the P range, the dialogue system 100 may determine that the vehicle 200 has reached the stop point, and when the speed is equal to or less than 10 kph, the dialogue system 100 may determine that the vehicle 200 is just before reaching the stop point.
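For illustration, the gear and speed thresholds just described map directly onto a small state function; the state names are hypothetical.

```python
# Hypothetical stop-point detection from vehicle state, using the thresholds
# described above (gear P -> reached; speed <= 10 kph -> just before).
def stop_point_state(gear, speed_kph):
    if gear == "P":
        return "reached_stop_point"
    if speed_kph <= 10:
        return "just_before_stop_point"
    return "driving"

assert stop_point_state("D", 8) == "just_before_stop_point"
```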
Based on the context information related to being just before the stop point, the pre-utterance determiner 151 may determine whether to output a pre-utterance according to a pre-utterance condition related to the estimated change in the number of passengers. The pre-utterance determiner 151 may determine that this pre-utterance condition is satisfied based on the passenger number information transmitted from the passenger determiner 111d. When the context information satisfies the pre-utterance condition, the pre-utterance determiner 151 may determine the current context to be a pre-utterance context and generate a pre-utterance trigger signal.
The pre-utterance determiner 151 may acquire a pre-utterance message corresponding to a pre-utterance context in which the number of passengers is estimated to vary, such as "a will leave at a stop point", "B will leave the vehicle at the stop point and then board the vehicle again", "C will not leave at the stop point", and "D will board the vehicle at the stop point", based on the passenger number information. When the pre-utterance determiner 151 transmits the pre-utterance trigger signal and the pre-utterance message to the dialog input manager 111c, the dialog input manager 111c may transmit the pre-utterance message to the dialog manager 120. At this time, a pre-utterance trigger signal or a signal indicating a pre-utterance context may be transmitted together with the pre-utterance message.
The dialog manager 120 may generate a dialog task for outputting the transmitted pre-utterance message and transmit the pre-utterance message to the result processor 130. The result processor 130 may output the input pre-utterance message via the speaker 232.
The dialogue system 100 may compare the estimation result of the change in the number of passengers with the result of the change in the number of passengers after departing from the stop point based on the passenger number information (5430).
The dialogue system 100 may determine whether the vehicle has left the stop point based on vehicle state information, such as the vehicle position and the vehicle speed detected by the vehicle detector 260.

Specifically, the dialogue system 100 may determine that the vehicle has left the stop point based on events such as the parking brake being released, the ignition being turned on, or the brake pedal being depressed.

The dialogue system 100 may detect passengers boarding the vehicle through the voice input processor 111 and the contextual information processor 112 to determine the result of the change in the number of passengers after leaving the stop point, and may identify the passengers through the passenger determiner 111d.

Therefore, when it is determined that the vehicle 200 has left the stop point, the passenger determiner 111d of the dialogue system 100 may compare the estimation result of the change in the number of passengers based on the passenger number information with the actual change in the number of passengers after leaving the stop point.
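A minimal sketch of that comparison follows; representing each estimate as a name-to-onboard mapping is an assumption made for the example.

```python
# Hypothetical comparison (5430) between the estimated occupancy after the
# stop point and the occupancy actually observed.
def compare_passenger_change(estimated, observed):
    mismatches = []
    for name, expected_onboard in estimated.items():
        if observed.get(name, False) != expected_onboard:
            mismatches.append(name)
    return mismatches  # empty list: the estimation matched reality

# A was expected to leave and did; B was expected onboard but is missing.
diff = compare_passenger_change({"A": False, "B": True},
                                {"A": False, "B": False})
```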
In addition, the dialogue system 100 may output a pre-utterance in order to compare the estimated change in the number of passengers with the actual change after leaving the stop point.

Specifically, the dialogue system 100 may output a pre-utterance verifying whether a passenger determined to leave at the stopping point actually left, such as "Did A leave?", and a pre-utterance verifying whether a passenger determined to board the vehicle again after leaving actually boarded again, such as "Did B board the vehicle again?"

In addition, the dialog system 100 may output a pre-utterance verifying whether a passenger determined not to leave at the stop point indeed did not leave, such as "Is C still here?", and a pre-utterance verifying whether a potential passenger determined to board the vehicle at the stop point actually boarded, such as "Did D board the vehicle?"

Based on the context information related to after leaving the stop point, the pre-utterance determiner 151 may determine whether to output a pre-utterance according to a pre-utterance condition related to the estimated change in the number of passengers. The pre-utterance determiner 151 may determine that this pre-utterance condition is satisfied based on the passenger number information transmitted from the passenger determiner 111d. When the context information satisfies the pre-utterance condition, the pre-utterance determiner 151 may determine the current context to be a pre-utterance context and generate a pre-utterance trigger signal.
The pre-utterance determiner 151 may acquire a pre-utterance message corresponding to a variation in the number of passengers. When the pre-utterance determiner 151 transmits the pre-utterance trigger signal and the pre-utterance message to the dialog input manager 111c, the dialog input manager 111c may transmit the pre-utterance message to the dialog manager 120. At this time, a pre-utterance trigger signal or a signal indicating a pre-utterance context may be transmitted together with the pre-utterance message.
The dialog manager 120 may generate a dialog task for outputting the transmitted pre-utterance message and transmit the pre-utterance message to the result processor 130. The result processor 130 may output the input pre-utterance message via the speaker 232.
The dialog system 100 may verify the result of the change in the number of passengers by receiving an utterance of a passenger in the vehicle. Specifically, the passenger determiner 111d of the dialogue system 100 may verify the result of the variation in the number of passengers by receiving the utterance of the passenger corresponding to the pre-utterance message.
For example, in response to a pre-utterance message asking whether a passenger determined to leave at the stop point actually left, such as "Did A leave?", the passenger may say "He/she left" or "No, he/she boarded the vehicle again".

When a passenger's utterance in the vehicle is input, the voice input processor 111 recognizes the input utterance. The passenger's utterance may be input through the voice input device 210 provided in the vehicle 200 or the voice input device 410 provided in the mobile device 400.

The speech recognizer 111a may recognize an input utterance of the user and output the utterance in a text form. The natural language understanding section 111b may apply a natural language understanding technique to the utterance in text form and output a result of the natural language understanding.
Specifically, the natural language understanding process may include performing morpheme analysis on the utterance in text form, and recognizing a result of the variation in the number of passengers based on a result of the morpheme analysis.
Therefore, the passenger determiner 111d of the dialogue system 100 can verify the result of the change in the number of passengers based on the dialogue between the passengers in the vehicle.
The dialogue system 100 may output a pre-utterance related to a result of comparison between the result of estimation of the change in the number of passengers and the result of the change in the number of passengers after leaving from the stop point (5440).
For example, the dialogue system 100 may output a pre-utterance message indicating that the estimated change in the number of passengers differs from the actual change after leaving the stop point, such as "The estimated change in the number of passengers differs from the current passengers", or a pre-utterance message indicating that the estimated change matches the actual change, such as "The estimated change in the number of passengers matches the current passengers".

Specifically, based on the context information related to after leaving the stop point, the pre-utterance determiner 151 may determine whether to output a pre-utterance according to a pre-utterance condition related to comparing the estimated change in the number of passengers with the actual change after leaving the stop point. The pre-utterance determiner 151 may determine that this pre-utterance condition is satisfied based on the result of the comparison between the actual change in the number of passengers after leaving the stop point and the estimated change transmitted from the passenger determiner 111d. When the context information satisfies the pre-utterance condition, the pre-utterance determiner 151 may determine the current context to be a pre-utterance context and generate a pre-utterance trigger signal.
The pre-utterance determiner 151 may acquire a pre-utterance message indicating a result of the comparison based on a result of the comparison between the result of the estimation of the variation in the number of passengers and the result of the variation in the number of passengers after leaving the stop point. When the pre-utterance determiner 151 transmits the pre-utterance trigger signal and the pre-utterance message to the dialog input manager 111c, the dialog input manager 111c may transmit the pre-utterance message to the dialog manager 120. At this time, a pre-utterance trigger signal or a signal indicating a pre-utterance context may be transmitted together with the pre-utterance message.
The dialog manager 120 may generate a dialog task for outputting the transmitted pre-utterance message and transmit the pre-utterance message to the result processor 130. The result processor 130 may output the input pre-utterance message via the speaker 232.

Therefore, the driver can verify whether each passenger has left or boarded the vehicle based on the result of the comparison between the estimated change in the number of passengers and the actual change after leaving the stop point, and can concentrate on vehicle operation, such as driving and parking, without having to keep track of whether each passenger has left or boarded.

In addition, it is possible to prevent a situation in which a passenger is left behind at the stop point because he/she could not board the vehicle again, or a situation in which a passenger fails to leave when the vehicle reaches the stop point.
When the traveling of the vehicle is terminated, the dialogue system 100 may store traveling-related information and passenger information about each passenger (5450).
For example, when the travel of the vehicle is terminated, the storage device 140 of the dialogue system 100 may store information related to the travel of the vehicle and passenger information of each passenger boarding the vehicle while traveling.
Specifically, the storage device 140 of the dialogue system 100 may store travel-related information about the travel of the vehicle, such as the departure point, stop points, and destination, and passenger information about each passenger, such as personal identification information, voice feature information, seat position information, boarding time information, leaving time information, boarding position information, and leaving position information.
That is, the storage device 140 of the dialogue system 100 may store travel-related information related to travel of the vehicle, such as a departure point, a stop point, and a destination of the travel, by collecting GPS values from the vehicle controller 240.
In addition, the storage device 140 of the dialogue system 100 may collect passenger identification information, voice feature information, seat position information, and passenger number information, and store passenger information about each passenger, such as identification information, voice feature information, seat position information, boarding time information, leaving time information, boarding position information, and leaving position information.
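For illustration, a stored travel record covering the fields listed above might look like the following; the schema is an assumption.

```python
# Hypothetical travel record persisted at the end of a trip (5450).
travel_record = {
    "departure": "home",
    "stops": ["school"],
    "destination": "office",
    "passengers": [{
        "id": "passenger_1",
        "voice_features": [0.12, 0.87],  # assumed voice embedding
        "seat": "rear_254c",
        "boarded_at": {"time": "08:10", "location": "home"},
        "left_at": {"time": "08:35", "location": "school"},
    }],
}
```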
Fig. 55 is a flowchart illustrating a method of determining that a passenger participating in a previous trip boards a vehicle and outputting a pre-utterance in a dialogue processing method according to an embodiment.
The dialogue system 100 may determine whether a passenger has boarded the vehicle based on the dialogue between occupants in the vehicle and the vehicle operation information (5500). Specifically, the voice input processor 111 of the input processor 110 may determine whether a passenger has boarded the vehicle by receiving the dialogue between occupants in the vehicle, and acquire characteristics of each passenger, such as voice features, seat position, boarding time, and boarding position.

The dialog system 100 may determine whether the passenger's characteristics are the same as the stored passenger characteristics (5510). Specifically, the voice input processor 111 of the dialogue system 100 may compare the passenger characteristics acquired from the storage device 140 with the characteristics (such as voice features, seat position, boarding time, and boarding position) of the passenger determined to board the vehicle.

For example, the voice input processor 111 may compare the voice features, the seat position, the boarding time, and the boarding position included in the passenger information with the characteristics of the passenger determined to board the vehicle in step 5510.

When at least two of the voice features, the seat position, the boarding time, and the boarding position included in the passenger information are the same as the characteristics of the passenger determined to board the vehicle at step 5510, the voice input processor 111 may determine that the characteristics of that passenger are the same as the passenger information.

When comparing the voice features, the seat position, the boarding time, and the boarding position included in the passenger information with the characteristics of the passenger determined to board the vehicle at step 5510, the voice input processor 111 may treat values that are similar within a certain range as the same.
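The "at least two of four features match, with similar values within a range treated as the same" rule can be sketched as follows; the tolerances are illustrative assumptions.

```python
# Hypothetical returning-passenger match (5510): at least two of the four
# stored features must agree, counting similarity within a tolerance.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def is_same_passenger(stored, current, time_tol_min=15, cosine_tol=0.9):
    matches = 0
    matches += cosine(stored["voice"], current["voice"]) >= cosine_tol
    matches += stored["seat"] == current["seat"]
    matches += abs(stored["board_min"] - current["board_min"]) <= time_tol_min
    matches += stored["board_loc"] == current["board_loc"]
    return matches >= 2
```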
When it is determined that the characteristics of the passenger determined to board the vehicle are the same as the stored passenger information (yes in 5510), the dialogue system 100 may output a pre-utterance verifying whether the passenger participates in the previous travel (5520).
For example, when the characteristics of the passenger determined to board the vehicle are the same as the stored passenger information, the dialogue system 100 may output a pre-utterance verifying whether the passenger is the same person as in the previous travel, such as "Are you 00?"

Based on the context information related to whether the passenger has boarded the vehicle, the pre-utterance determiner 151 may determine whether to output a pre-utterance according to a pre-utterance condition related to the characteristics of the boarding passenger being the same as the stored passenger information. When the context information satisfies this pre-utterance condition, the pre-utterance determiner 151 may determine the current context to be a pre-utterance context and generate a pre-utterance trigger signal.

The pre-utterance determiner 151 may acquire a pre-utterance message corresponding to this pre-utterance context, such as "Are you 00?" When the pre-utterance determiner 151 transmits the pre-utterance trigger signal and the pre-utterance message to the dialog input manager 111c, the dialog input manager 111c may transmit the pre-utterance message to the dialog manager 120.

The dialog manager 120 may generate a dialog task for outputting the transmitted pre-utterance message and transmit the pre-utterance message to the result processor 130. The result processor 130 may output the input pre-utterance message via the speaker 232.
The dialog system 100 may verify whether the passenger is engaged in a previous trip by receiving the words of the passenger in the vehicle (5530). Specifically, the dialogue system 100 can verify whether the passenger is engaged in a previous trip by receiving the passenger utterance corresponding to the pre-utterance message.
For example, the passenger may say "Yes" or "No" in response to the pre-utterance message asking whether the passenger participated in the previous travel. That is, the passenger may speak a message including a response indicating whether he/she participated in the previous travel.

When the passenger's utterance is input, the voice input processor 111 recognizes the input utterance. The passenger's utterance may be input through the voice input device 210 provided in the vehicle 200 or the voice input device 410 provided in the mobile device 400.
The speech recognizer 111a may recognize an input utterance of the user and output the utterance in a text form. The natural language understanding section 111b may apply a natural language understanding technique to the utterance in text form and output a result of the natural language understanding.
Specifically, the natural language understanding process may include performing morpheme analysis on the utterance in text form, and recognizing whether the passenger participated in the previous travel based on the result of the morpheme analysis.
Accordingly, the passenger determiner 111d of the dialogue system 100 may verify whether the passenger participates in the previous travel based on the dialogue between the passengers in the vehicle.
When it is determined that the passenger participates in the previous trip, the dialogue system 100 may generate passenger number information based on a dialogue between passengers in the vehicle and the stored passenger information (5540). That is, as described in step 5540, the dialogue system 100 may additionally consider stored occupant information when generating occupant count information based on dialogue between occupants in the vehicle.
For example, the passenger determiner 111d of the dialogue system 100 may estimate the change in the number of passengers at the stop point based on the output of the natural language understanding section 111 b. Specifically, the passenger determiner 111d may estimate the possibility of each passenger leaving the vehicle and the possibility of each passenger getting on the vehicle again after leaving based on the dialogue between the occupants in the vehicle, and the passenger determiner 111d may estimate the number of potential passengers getting on the vehicle based on the call session in the vehicle.
When the change in the number of passengers is estimated based on the dialogue between the occupants in the vehicle, the passenger determiner 111d may increase the accuracy of the estimation result of the change in the number of passengers by using the departure time information and the information related to the position of departure from the vehicle among the stored passenger information.
Specifically, when it is estimated that the passenger will exit at a specific stopping point based on the dialogue between the passengers in the vehicle, the passenger determiner 111d may verify the departure time and the location of the departing vehicle in the previous travel by using the departure time information and the information on the location of the departing vehicle among the stored passenger information.
The passenger determiner 111d may determine whether the estimated specific stopping point at which the passenger departs is the same as the departure position in the previous travel based on the dialogue between the occupants in the vehicle.
When the specific stopping point at which the passenger leaves estimated based on the dialogue between the occupants in the vehicle is the same as the leaving position in the previous travel, the passenger determiner 111d may generate the passenger information by using the estimate of the change in the number of passengers estimated based on the dialogue between the occupants in the vehicle.
When the specific stopping point at which the passenger is estimated to leave, based on the dialogue between occupants in the vehicle, differs from the leaving position in the previous travel, the passenger determiner 111d may verify whether the leaving position is the specific stopping point by outputting a pre-utterance to the passenger, and generate the passenger information by using the passenger's answering utterance.
As is apparent from the above description, according to the proposed dialogue system, a vehicle and a method for controlling a vehicle may provide a service suitable for a user's intention or a user's desire by accurately recognizing the user's intention based on various information, such as dialogue with the user and vehicle state information, driving environment information, and user information during driving of the vehicle.
In addition, the possibility of each passenger leaving the vehicle and the possibility of each passenger boarding the vehicle again after leaving can be estimated based on the dialogue between occupants in the vehicle during traveling, so as to help each passenger leave the vehicle at his/her desired position and to prevent the driver from being distracted while driving.
Although a few embodiments of the present disclosure have been shown and described, it would be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the disclosure, the scope of which is defined in the claims and their equivalents.
According to the proposed dialogue processing apparatus, the vehicle having the dialogue processing apparatus, and the dialogue processing method, services suited to the user's intention or needs can be provided by using a dialogue processing method specialized for the vehicle.

In addition, a service desired by the user can be provided by considering the various contexts that arise in the vehicle. In particular, the service required by the user may be determined and provided proactively based on the context information or driver information collected by the dialogue system 100, even without an utterance from the user.
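A hedged sketch of this proactive behavior is given below: a service is selected from collected context without waiting for a user utterance. The context keys and rules are invented for illustration only and do not reflect the disclosed implementation.

```python
def select_proactive_service(context):
    # rules below are illustrative; a real system would use the collected
    # context information and driver information described above
    if context.get("passenger_change_expected") and context.get("approaching_stop"):
        return "announce_stop_and_confirm_exit"
    if context.get("driver_fatigue_detected"):
        return "suggest_rest_area"
    return None  # no proactive service warranted by the current context
```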

Claims (24)

1. A dialogue system for a vehicle, comprising:
An input processor configured to receive a dialogue between occupants of the vehicle including a driver and at least one passenger, detect vehicle operation information, identify the at least one passenger based on the dialogue between the occupants or the vehicle operation information, generate, based on the dialogue between the occupants, passenger number information estimating a change in the number of passengers in the vehicle when the vehicle reaches a stopping point, and acquire a pre-utterance message according to the passenger number information; and
A result processor configured to output a pre-utterance based on the pre-utterance message.
2. The dialogue system of claim 1, wherein:
The pre-utterance message indicates at least one of: a likelihood of each of the at least one passenger exiting the vehicle at the stopping point, a likelihood of each of the at least one passenger boarding the vehicle again after exiting the vehicle at the stopping point, and a likelihood of a potential passenger boarding the vehicle at the stopping point.
3. The dialogue system of claim 1, wherein the input processor comprises:
A voice input processor configured to determine whether the at least one passenger boards the vehicle based on one or more voice features of the voice of the at least one passenger included in the dialogue between the occupants; and
A contextual information processor configured to determine whether the at least one passenger boards the vehicle based on the vehicle operation information.
4. The dialogue system of claim 3, wherein:
When it is determined that the at least one passenger is boarding the vehicle, the input processor is configured to acquire a pre-utterance message corresponding to the at least one passenger boarding the vehicle, receive an utterance of the at least one passenger related to the pre-utterance message, and identify the at least one passenger by applying a natural language understanding algorithm to the utterance of the at least one passenger; and
When it is determined that the at least one passenger is not boarding the vehicle, the input processor is configured to acquire a pre-utterance message corresponding to the at least one passenger not boarding the vehicle, receive an utterance of the driver related to the pre-utterance message, and verify the presence of the at least one passenger by applying a natural language understanding algorithm to the utterance of the driver.
5. The dialogue system of claim 2, wherein:
The input processor is configured to determine a likelihood of each of the at least one passenger exiting the vehicle at the stopping point and a likelihood of each of the at least one passenger boarding the vehicle again after exiting the vehicle at the stopping point by applying a natural language understanding algorithm to the dialogue between the occupants, and to generate the passenger number information based on the determined likelihood of the at least one passenger exiting the vehicle at the stopping point and the determined likelihood of the at least one passenger boarding the vehicle again after exiting the vehicle at the stopping point.
6. The dialogue system of claim 2, wherein:
The input processor is configured to receive a call session in the vehicle, determine a likelihood of a potential passenger boarding the vehicle at the stopping point by applying a natural language understanding algorithm to the received call session, and generate the passenger number information based on the likelihood of the potential passenger boarding the vehicle at the stopping point.
7. The dialogue system of claim 5, wherein:
After the vehicle leaves the stopping point, the input processor is configured to determine a change in the number of passengers in the vehicle based on the dialogue between the occupants and the vehicle operation information, compare the change in the number of passengers in the vehicle estimated based on the passenger number information with the determined change in the number of passengers, and acquire the pre-utterance message based on a result of the comparison.
8. The dialogue system of claim 7, wherein:
After the vehicle leaves the stopping point, the input processor is configured to acquire the pre-utterance message for determining the change in the number of passengers in the vehicle, receive an utterance of the at least one passenger related to the pre-utterance message, and determine the change in the number of passengers in the vehicle by applying a natural language understanding algorithm to the utterance of the at least one passenger.
9. The dialogue system of claim 1, further comprising:
A storage device configured to store travel-related information of the vehicle and passenger information of each of the at least one passenger when the vehicle stops traveling.
10. The dialogue system of claim 9, wherein:
The passenger information includes at least one of passenger identification information, voice feature information, seat position information, boarding time information, boarding position information, departure time information, or information relating to the position of departure from the vehicle.
11. The dialogue system of claim 9, wherein:
The input processor is configured to receive the dialogue between the occupants and the vehicle operation information, determine whether the at least one passenger boards the vehicle based on the dialogue between the occupants and the vehicle operation information, determine whether a feature of each of the at least one passenger corresponds to the passenger information, and acquire the pre-utterance message by verifying whether a first passenger having a feature corresponding to the passenger information participated in a previous trip.
12. The dialogue system of claim 11, wherein:
The input processor is configured to receive an utterance of the at least one passenger related to the pre-utterance message, verify whether the first passenger participated in the previous trip by applying a natural language understanding algorithm to the utterance of the at least one passenger, and generate the passenger number information based on the dialogue between the occupants and the passenger information when the first passenger participated in the previous trip.
13. A dialogue processing method for a vehicle, comprising:
Receiving a dialogue between occupants of the vehicle including a driver and at least one passenger;
Detecting vehicle operation information;
Identifying the at least one passenger based on the dialogue between the occupants or the vehicle operation information;
Generating, based on the dialogue between the occupants, passenger number information estimating a change in the number of passengers in the vehicle when the vehicle reaches a stopping point;
Acquiring a pre-utterance message according to the passenger number information; and
Outputting a pre-utterance based on the pre-utterance message.
14. The dialogue processing method of claim 13, wherein:
The pre-utterance message indicates at least one of: a likelihood of each of the at least one passenger exiting the vehicle at the stopping point, a likelihood of each of the at least one passenger boarding the vehicle again after exiting the vehicle at the stopping point, and a likelihood of a potential passenger boarding the vehicle at the stopping point.
15. The dialogue processing method of claim 13, further comprising:
Determining whether the at least one passenger boards the vehicle based on one or more voice features of the voice of the at least one passenger included in the dialogue between the occupants; and
Determining whether the at least one passenger boards the vehicle based on the vehicle operation information.
16. The dialogue processing method of claim 15, further comprising:
When it is determined that the at least one passenger is boarding the vehicle, acquiring a pre-utterance message corresponding to the at least one passenger boarding the vehicle;
Receiving an utterance of the at least one passenger related to the pre-utterance message;
Identifying the at least one passenger by applying a natural language understanding algorithm to the utterance of the at least one passenger;
When it is determined that the at least one passenger is not boarding the vehicle, acquiring a pre-utterance message corresponding to the at least one passenger not boarding the vehicle;
Receiving an utterance of the driver related to the pre-utterance message; and
Verifying the presence of the at least one passenger by applying a natural language understanding algorithm to the utterance of the driver.
17. The dialogue processing method of claim 15, further comprising:
Determining a likelihood of each of the at least one passenger exiting the vehicle at the stopping point and a likelihood of each of the at least one passenger boarding the vehicle again after exiting the vehicle at the stopping point by applying a natural language understanding algorithm to the dialogue between the occupants; and
Generating the passenger number information based on the determined likelihood of the at least one passenger exiting the vehicle at the stopping point and the determined likelihood of the at least one passenger boarding the vehicle again after exiting the vehicle at the stopping point.
18. The dialogue processing method of claim 15, further comprising:
Receiving a call session in the vehicle;
Determining a likelihood of a potential passenger boarding the vehicle at the stopping point by applying a natural language understanding algorithm to the received call session; and
Generating the passenger number information based on the likelihood of the potential passenger boarding the vehicle at the stopping point.
19. The dialogue processing method of claim 17, further comprising:
Determining a change in the number of passengers in the vehicle based on the dialogue between the occupants and the vehicle operation information after the vehicle leaves the stopping point;
Comparing the change in the number of passengers in the vehicle estimated based on the passenger number information with the determined change in the number of passengers; and
Acquiring the pre-utterance message based on a result of the comparison.
20. The dialogue processing method of claim 19, further comprising:
After the vehicle leaves the stopping point, acquiring the pre-utterance message for determining the change in the number of passengers in the vehicle;
Receiving an utterance of the at least one passenger related to the pre-utterance message; and
Determining a change in the number of passengers in the vehicle by applying a natural language understanding algorithm to the utterance of the at least one passenger.
21. The dialogue processing method of claim 13, further comprising:
When the vehicle stops traveling, storing travel-related information of the vehicle and passenger information of each of the at least one passenger.
22. The dialogue processing method of claim 21, wherein:
The passenger information includes at least one of passenger identification information, voice feature information, seat position information, boarding time information, boarding position information, departure time information, or information relating to the position of departure from the vehicle.
23. The dialogue processing method of claim 21, further comprising:
Receiving the dialogue between the occupants and the vehicle operation information;
Determining whether the at least one passenger boards the vehicle based on the dialogue between the occupants and the vehicle operation information;
Determining whether a characteristic of each of the at least one passenger corresponds to the passenger information; and
Acquiring the pre-utterance message by verifying whether a first passenger having a characteristic corresponding to the passenger information participated in a previous trip.
24. The dialogue processing method of claim 23, further comprising:
Receiving an utterance of the at least one passenger related to the pre-utterance message;
Verifying whether the first passenger participated in the previous trip by applying a natural language understanding algorithm to the utterance of the at least one passenger; and
Generating the passenger number information based on the dialogue between the occupants and the passenger information when the first passenger participated in the previous trip.
CN201811497854.XA 2018-05-17 2018-12-07 Dialogue system and dialogue processing method Pending CN110562260A (en)

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
KR1020180056497A KR20190131741A (en) 2018-05-17 2018-05-17 Dialogue system, and dialogue processing method
KR10-2018-0056497 2018-05-17
KR1020180067127A KR102562227B1 (en) 2018-06-12 2018-06-12 Dialogue system, Vehicle and method for controlling the vehicle
KR10-2018-0067127 2018-06-12
KR1020180073824A KR20200001188A (en) 2018-06-27 2018-06-27 Dialogue system, Vehicle and method for controlling the vehicle
KR10-2018-0073824 2018-06-27
KR1020180077027A KR20200004054A (en) 2018-07-03 2018-07-03 Dialogue system, and dialogue processing method
KR10-2018-0077027 2018-07-03

Publications (1)

Publication Number Publication Date
CN110562260A (en) 2019-12-13

Family

ID=68772430

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811497854.XA Pending CN110562260A (en) 2018-05-17 2018-12-07 Dialogue system and dialogue processing method

Country Status (1)

Country Link
CN (1) CN110562260A (en)

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000118260A (en) * 1998-10-09 2000-04-25 Honda Motor Co Ltd Vehicular occupant dialoging device
JP2003308079A (en) * 2002-04-15 2003-10-31 Nissan Motor Co Ltd Voice input device
US20050240324A1 (en) * 2004-04-26 2005-10-27 Robert Boman Enhanced automotive monitoring system using sound
US20080269958A1 (en) * 2007-04-26 2008-10-30 Ford Global Technologies, Llc Emotive advisory system and method
US20080319602A1 (en) * 2007-06-25 2008-12-25 Mcclellan Scott System and Method for Monitoring and Improving Driver Behavior
CN101711206A (en) * 2007-06-27 2010-05-19 福特汽车公司 The emergency information method and system
CN103678456A (en) * 2012-09-11 2014-03-26 通用汽车环球科技运作有限责任公司 Voice stamp-driven in-vehicle functions
CN104442622A (en) * 2013-09-25 2015-03-25 现代自动车株式会社 Sound control system and method for vehicle
US20160221583A1 (en) * 2015-01-29 2016-08-04 GM Global Technology Operations LLC Method and apparatus for monitoring a rear passenger seating area of a vehicle
JP2016206469A (en) * 2015-04-24 2016-12-08 マツダ株式会社 Voice interaction system for vehicle
CN107817714A (en) * 2016-09-13 2018-03-20 福特全球技术公司 Passenger's monitoring system and method
CN107818788A (en) * 2016-09-14 2018-03-20 通用汽车环球科技运作有限责任公司 Remote speech identification on vehicle
JP6145210B1 (en) * 2016-12-26 2017-06-07 株式会社スバルカーベル Passenger management device and passenger management method
CN107424613A (en) * 2017-05-16 2017-12-01 鄂尔多斯市普渡科技有限公司 The Phonetically door-opening Verification System and its method of a kind of unmanned taxi

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
NIU BAO; CAO JIWEI; XIAO XIAO: "Design of a long-distance bus monitoring system with fatigue early warning" (疲劳预警的长途客车监控系统的设计), Information & Communications (信息通信), no. 06, 15 June 2016 (2016-06-15) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113870448A (en) * 2021-09-28 2021-12-31 深圳市卡联科技股份有限公司 Safe and quick response communication method and system of intelligent vehicle-mounted terminal
WO2024078460A1 (en) * 2022-10-13 2024-04-18 广州小鹏汽车科技有限公司 Speech processing method, speech interaction method, server, and storage medium

Similar Documents

Publication Publication Date Title
KR102562227B1 (en) Dialogue system, Vehicle and method for controlling the vehicle
US10839797B2 (en) Dialogue system, vehicle having the same and dialogue processing method
US10733994B2 (en) Dialogue system, vehicle and method for controlling the vehicle
KR102426171B1 (en) Dialogue processing apparatus, vehicle having the same and dialogue service processing method
US10991368B2 (en) Dialogue system and dialogue processing method
US10997974B2 (en) Dialogue system, and dialogue processing method
US10937424B2 (en) Dialogue system and vehicle using the same
US10861460B2 (en) Dialogue system, vehicle having the same and dialogue processing method
US10950233B2 (en) Dialogue system, vehicle having the same and dialogue processing method
US11004450B2 (en) Dialogue system and dialogue processing method
CN110503947B (en) Dialogue system, vehicle including the same, and dialogue processing method
KR102403355B1 (en) Vehicle, mobile for communicate with the vehicle and method for controlling the vehicle
KR102487669B1 (en) Dialogue processing apparatus, vehicle having the same and dialogue processing method
KR20200006738A (en) Dialogue system, and dialogue processing method
CN110562260A (en) Dialogue system and dialogue processing method
KR102448719B1 (en) Dialogue processing apparatus, vehicle and mobile device having the same, and dialogue processing method
KR20200000621A (en) Dialogue processing apparatus, vehicle having the same and dialogue processing method
KR20190036018A (en) Dialogue processing apparatus, vehicle having the same and dialogue processing method
KR20190135676A (en) Dialogue system, vehicle having the same and dialogue processing method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination