US11055495B2 - Utterance sentence generation system and utterance sentence generation program
- Publication number
- US11055495B2
- Authority
- United States (US)
- Prior art keywords
- utterance
- utterance sentence
- sentences
- sentence
- user
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
- G06F40/35—Discourse or dialogue representation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
Definitions
- the present invention relates to an utterance sentence generation system and an utterance sentence generation program.
- Patent Literature 1 Japanese Unexamined Patent Publication No. 2014-219872
- the invention has been made in view of such circumstances, and an object thereof is to provide an utterance sentence generation device and an utterance sentence generation program which are capable of outputting an utterance sentence of appropriate length when an utterance sentence for responding to a user is generated by connecting a plurality of sentences.
- an utterance sentence generation device, which is a system that outputs an utterance sentence for responding to an utterance which is input by a user, includes: a focus extraction unit that extracts focus information representing at least a portion of subject matter in the user's utterance which is input by the user, on the basis of the user's utterance; an interest state estimation unit that estimates an interest state indicating a degree of the user's interest in the subject matter represented by the focus information; a number of connected sentences determination unit that determines the number of utterance sentences to be connected, on the basis of the interest state; a connected utterance sentence generation unit that generates a connected utterance sentence by connecting utterance sentences corresponding to the number determined by the number of connected sentences determination unit; and an utterance sentence output unit that outputs the connected utterance sentence.
- an utterance sentence generation program, which is a program for causing a computer to function as an utterance sentence generation system that outputs an utterance sentence for responding to an utterance which is input by a user, causes the computer to realize: a focus extraction function of extracting focus information representing at least a portion of subject matter in the user's utterance which is input by the user, on the basis of the user's utterance; an interest state estimation function of estimating an interest state indicating a degree of the user's interest in the subject matter represented by the focus information; a number of connected sentences determination function of determining the number of utterance sentences to be connected, on the basis of the interest state; a connected utterance sentence generation function of generating a connected utterance sentence by connecting utterance sentences corresponding to the number determined by the number of connected sentences determination function; and an utterance sentence output function of outputting the connected utterance sentence.
- focus information representing subject matter of the user's utterance is extracted, and the number of sentences to be connected is determined in accordance with the degree of the user's interest in the focus information.
- a connected utterance sentence with an appropriate length in which the degree of the user's interest is reflected is output.
- an utterance sentence generation device and an utterance sentence generation program which are capable of outputting an utterance sentence having an appropriate length when an utterance sentence for responding to a user is generated by connecting a plurality of sentences.
- FIG. 1 is a block diagram illustrating a functional configuration of an utterance sentence generation system including an utterance sentence generation device of the present embodiment.
- FIG. 2 is a hardware block diagram of the utterance sentence generation device.
- FIG. 3 is a diagram illustrating an example of a configuration of a number of connected sentences table and data stored therein.
- FIG. 4(a) is a diagram illustrating an example of data stored in an utterance sentence DB.
- FIG. 4(b) is a diagram illustrating an example of data stored in the utterance sentence DB.
- FIG. 5 is a diagram illustrating an example of data stored in the utterance sentence DB.
- FIG. 6 is a diagram illustrating an example of correction of a connected utterance sentence.
- FIG. 7 is a diagram illustrating an example of correction of a connected utterance sentence.
- FIG. 8 is a flowchart illustrating processing contents of an utterance sentence generation method of the present embodiment.
- FIG. 9 is a diagram illustrating a configuration of an utterance sentence generation program.
- FIG. 1 is a diagram illustrating a functional configuration of an utterance sentence generation system 1 including an utterance sentence generation device 10 according to the present embodiment.
- the utterance sentence generation device 10 is a device that outputs an utterance sentence for responding to an utterance which is input by a user.
- the utterance sentence generation system 1 of the present embodiment outputs an utterance sentence using sound, text, and the like in response to a user's utterance for which no particular assumptions are made regarding its contents, such as a chat, rather than being used for a specific purpose such as presenting a route to a destination.
- the device constituting the utterance sentence generation system 1 or the utterance sentence generation device 10 is not limited; the system or the device may be constituted by a device such as a portable terminal or a personal computer, or by a robot in which a computer is embedded.
- the utterance sentence generation system 1 includes an utterance sentence generation device 10 , a user state acquisition unit 30 , a number of connected sentences table 40 and an utterance sentence DB 50 .
- the utterance sentence generation system 1 may be configured as one device, or one or two or more of the utterance sentence generation device 10 , the user state acquisition unit 30 , the number of connected sentences table 40 and utterance sentence DB 50 may constitute one device.
- the user state acquisition unit 30 may be configured as one terminal, and the utterance sentence generation device 10 , the number of connected sentences table 40 and the utterance sentence DB 50 may be constituted by a server.
- the utterance sentence generation device 10 and the user state acquisition unit 30 may be configured as one terminal.
- Storage means of each of the number of connected sentences table 40 and the utterance sentence DB 50 may be configured as a device having any configuration as long as the storage means is configured to be able to be accessed by the utterance sentence generation device 10 .
- a terminal constituting the user state acquisition unit 30 or a terminal constituting the utterance sentence generation device 10 and the user state acquisition unit 30 is configured as a portable terminal such as a high performance mobile phone (smartphone) or a mobile phone.
- the utterance sentence generation device 10 functionally includes a user utterance acquisition unit 11 , a focus extraction unit 12 , an interest state estimation unit 13 , a number of connected sentences determination unit 14 , a connected utterance sentence generation unit 15 , an ungrammatical sentence determination unit 16 , a sentence establishment determination unit 17 , an output information control unit 18 , a connected sentence correction unit 19 and an utterance sentence output unit 20 .
- the user state acquisition unit 30 includes a sound acquisition unit 31 and an image acquisition unit 32 . These functional units will be described later.
- each functional block represents blocks in units of functions. These functional blocks (constituent elements) are realized by any combination of hardware and/or software.
- means for realizing each functional block is not particularly limited. That is, each functional block may be realized by one device which is physically and/or logically coupled, or may be realized by two or more devices which are physically and/or logically separated from each other by accessing the plurality of devices directly and/or indirectly (for example, in a wired manner and/or wirelessly).
- the utterance sentence generation device 10 in the embodiment of the invention may function as a computer.
- FIG. 2 is a diagram illustrating an example of a hardware configuration of the utterance sentence generation device 10 according to the present embodiment.
- the utterance sentence generation device 10 may be physically configured as a computer device including a processor 1001 , a memory 1002 , a storage 1003 , a communication device 1004 , an input device 1005 , an output device 1006 , a bus 1007 , and the like.
- the wording “device” may be replaced by a circuit, a device, a unit, or the like.
- the hardware configuration of the utterance sentence generation device 10 may be configured to include one or a plurality of the devices illustrated in FIG. 2 , or may be configured without including some of these devices.
- the processor 1001 performs arithmetic operations by reading predetermined software (a program) onto hardware such as the processor 1001 and the memory 1002; each function in the utterance sentence generation device 10 is thereby realized by controlling communication in the communication device 1004 and reading and/or writing of data in the memory 1002 and the storage 1003.
- the processor 1001 controls the whole computer, for example, by operating an operating system.
- the processor 1001 may be constituted by a central processing unit (CPU) including an interface with a peripheral device, a control device, an arithmetic operation device, a register, and the like.
- the functional units 11 to 20 illustrated in FIG. 1 may be realized by the processor 1001 .
- the processor 1001 reads out a program (program code), a software module or data from the storage 1003 and/or the communication device 1004 into the memory 1002 , and executes various types of processes in accordance therewith.
- An example of the program which is used includes a program causing a computer to execute at least some of the operations described in the above-described embodiment.
- the functional units 11 to 20 of the utterance sentence generation device 10 may be stored in the memory 1002 and realized by a control program operated by the processor 1001.
- the execution of various types of processes described above by one processor 1001 has been described, but these processes may be simultaneously or sequentially executed by two or more processors 1001 .
- the processor 1001 may be realized using one or more chips.
- the program may be transmitted from a network through an electrical communication line.
- the memory 1002 is a computer readable recording medium, and may be constituted by at least one of, for example, a read only memory (ROM), an erasable programmable ROM (EPROM), an electrically erasable programmable ROM (EEPROM), a random access memory (RAM), and the like.
- the memory 1002 may be referred to as a register, a cache, a main memory (main storage device), or the like.
- the memory 1002 can store a program (program code), a software module, or the like that can be executed in order to carry out an utterance sentence generation method according to the embodiment of the invention.
- the storage 1003 is a computer readable recording medium, and may be constituted by at least one of, for example, an optical disc such as a compact disc ROM (CD-ROM), a hard disk drive, a flexible disk, a magneto-optic disc (for example, a compact disc, a digital versatile disc, or a Blu-ray (registered trademark) disc), a smart card, a flash memory (for example, a card, a stick, or a key drive), a floppy (registered trademark) disk, a magnetic strip, and the like.
- the storage 1003 may be referred to as an auxiliary storage device.
- the foregoing storage medium may be, for example, a database including the memory 1002 and/or the storage 1003 , a server, or other suitable mediums.
- the communication device 1004 is hardware (transmitting and receiving device) for performing communication between computers through a wired and/or wireless network, and is also referred to as, for example, a network device, a network controller, a network card, a communication module, or the like.
- the input device 1005 is an input device (such as, for example, a keyboard, a mouse, a microphone, a switch, a button, or a sensor) that receives an input from the outside.
- the output device 1006 is an output device (such as, for example, a display, a speaker, or an LED lamp) that executes an output to the outside. Meanwhile, the input device 1005 and the output device 1006 may be an integrated component (for example, a touch panel).
- each device of the processor 1001 , the memory 1002 , and the like is accessed through the bus 1007 for communicating information.
- the bus 1007 may be constituted by a single bus, or may be constituted by different buses between devices.
- the utterance sentence generation device 10 may be configured to include hardware such as a microprocessor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a programmable logic device (PLD), or a field programmable gate array (FPGA), or some or all of the respective functional blocks may be realized by this hardware.
- the processor 1001 may be realized using at least one of these pieces of hardware.
- the user state acquisition unit 30 includes the sound acquisition unit 31 and the image acquisition unit 32 .
- the sound acquisition unit 31 acquires sound. Specifically, the sound acquisition unit 31 acquires sound uttered by a user, picked up by a device such as a microphone.
- the image acquisition unit 32 acquires an image. Specifically, the image acquisition unit 32 acquires an image showing the appearance of a user, captured by an imaging device such as a camera.
- the user utterance acquisition unit 11 acquires a user's utterance.
- the user's utterance is an utterance which is input by the user.
- the input of the utterance is performed using, for example, sound, text, or the like.
- the user utterance acquisition unit 11 acquires sound uttered by a user through, for example, the sound acquisition unit 31 .
- the user utterance acquisition unit 11 may acquire a user's utterance as text through the input device 1005 such as a keyboard, input keys created on a touch panel, or the like.
- a user's utterance can be one for which no particular assumptions are made regarding its contents, such as a chat.
- the focus extraction unit 12 extracts focus information representing at least a portion of subject matter in the user's utterance on the basis of the user's utterance acquired by the user utterance acquisition unit 11 .
- the focus information is, for example, the words which play a leading role in the subject matter in the user's utterance.
- Various well-known techniques can be applied to the extraction of the focus information from the user's utterance, and a machine learning method such as deep learning and SVM can be used.
- the focus extraction unit 12 may extract words obtained through morphological analysis of the user's utterance as candidates for the focus information, calculate a score for each candidate using an estimation model of the focus information, obtained in advance through predetermined machine learning, on the basis of a predetermined feature amount extracted from the candidate word, and extract the focus information on the basis of the calculated scores.
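The extraction step just described can be sketched as follows. This is a minimal illustration, not the patent's implementation: `tokenize` and `score` are placeholder stand-ins for a real morphological analyzer and a model trained by deep learning or SVM, and `toy_weights` is invented for the example.

```python
# Minimal sketch of focus extraction: tokenize the utterance, score each
# candidate word with an estimation model, and return the top candidate.
# tokenize and score are placeholders, NOT the patent's actual components.
def extract_focus(utterance, tokenize, score):
    """Return the candidate word with the highest focus score, or None."""
    best_word, best_score = None, float("-inf")
    for word in tokenize(utterance):   # stand-in for morphological analysis
        s = score(word, utterance)     # stand-in for the learned estimation model
        if s > best_score:
            best_word, best_score = word, s
    return best_word

# Toy stand-ins: whitespace tokenization and a fixed weight table.
toy_weights = {"meal": 0.9, "want": 0.2, "I": 0.1}
focus = extract_focus("I want a meal",
                      tokenize=str.split,
                      score=lambda w, u: toy_weights.get(w, 0.0))
print(focus)  # meal
```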
- the interest state estimation unit 13 estimates an interest state indicating the degree of the user's interest in the subject matter represented by the focus information. Specifically, the interest state estimation unit 13 estimates the interest state on the basis of, for example, predetermined detection information on the state of the user. More specifically, the interest state estimation unit 13 can acquire at least one of an acoustic feature in the user's utterance, the user's gaze, the user's facial expression and contents of the user's utterance as detection information.
- the interest state estimation unit 13 may calculate a score indicating the degree of the user's interest using an estimation model of an interest state based on a predetermined feature amount obtained in advance through predetermined machine learning, on the basis of a predetermined feature amount extracted from the detection information on the state of the user.
- the interest state estimation unit 13 extracts a predetermined feature amount on the basis of the detection information. For example, the interest state estimation unit 13 can use the height (frequency) of a voice, the strength of a voice (sound volume) of the user and the like, which are acoustic features in the user's utterance, as feature amounts. In addition, the interest state estimation unit 13 can acquire the direction of the user's gaze, a gaze time and the like from an image of the user's eyes which is acquired through the image acquisition unit 32 , and can use them as feature amounts.
- the interest state estimation unit 13 can determine a facial expression on the basis of an image of the user's face which is acquired through the image acquisition unit 32 and can use the determined facial expression as a feature amount.
- the interest state estimation unit 13 can use contents of the user's utterance acquired through the sound acquisition unit 31 as a feature amount.
- the interest state estimation unit 13 can extract a word by performing morphological analysis on the contents of the user's utterance and can use the positive/negative degree of meaning represented by the extracted word as a feature amount.
- the interest state estimation unit 13 can use information capable of being acquired from a device used by the user as a feature amount. Specifically, for example, the interest state estimation unit 13 may estimate an interest state on the basis of profile information of the user. Examples of the profile information of the user include a Web browsing history of the user, a point of interest (POI) visit history, and the like.
- the interest state estimation unit 13 can extract a word by performing morphological analysis on a Web browsing history, a POI visit history, and the like and can use the degree of association between the extracted word and subject matter shown in the focus information and the frequency of the word as feature amounts.
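As an illustrative sketch only, the feature amounts named above (voice pitch and volume, gaze time, utterance sentiment) could be combined into a single interest score with a linear model. The feature names and weights below are invented; in the described system the model would be obtained in advance by machine learning.

```python
# Toy linear combination of detection-information features into one score.
# Weights are invented; a trained estimation model would supply them.
def estimate_interest(features, weights):
    """Weighted sum of detection-information features -> interest score."""
    return sum(weights[name] * value for name, value in features.items())

features = {"pitch": 0.7, "volume": 0.5, "gaze_time": 0.8, "sentiment": 0.6}
weights = {"pitch": 0.2, "volume": 0.1, "gaze_time": 0.4, "sentiment": 0.3}
score = estimate_interest(features, weights)
print(round(score, 2))  # 0.69
```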
- the number of connected sentences determination unit 14 determines the number of utterance sentences to be connected, on the basis of the estimated interest state. As an example, specifically, the number of connected sentences determination unit 14 determines the number of utterance sentences to be connected, with reference to the number of connected sentences table 40 .
- the number of connected sentences table 40 is a table in which information indicating the interest state and the number of utterance sentences to be connected are stored in association with each other.
- FIG. 3 is a diagram illustrating an example of a configuration of the number of connected sentences table 40 and data stored therein.
- interest states is1 to is5 represent scores indicating the degree of the user's interest.
- the number of connected sentences determination unit 14 determines that the number of utterance sentences to be connected is “2”.
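A hypothetical stand-in for the number of connected sentences table 40 might map ranges of the interest score to a sentence count. The thresholds and counts below are invented for illustration and are not the table shown in FIG. 3.

```python
# Invented lookup table: higher interest -> more sentences to connect.
CONNECTED_SENTENCES_TABLE = [
    (0.8, 3),  # strong interest   -> connect three sentences
    (0.5, 2),  # moderate interest -> connect two sentences
    (0.0, 1),  # weak interest     -> a single sentence
]

def number_of_connected_sentences(interest_score):
    """Look up how many utterance sentences to connect for a given score."""
    for threshold, count in CONNECTED_SENTENCES_TABLE:
        if interest_score >= threshold:
            return count
    return 1

print(number_of_connected_sentences(0.69))  # 2
```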
- the connected utterance sentence generation unit 15 generates a connected utterance sentence by connecting utterance sentences corresponding to the number determined by the number of connected sentences determination unit 14 .
- the connected utterance sentence generation unit 15 acquires utterance sentences from the utterance sentence DB 50 and connects the acquired utterance sentences.
- the utterance sentence DB 50 is a database in which utterance sentences are stored.
- FIGS. 4(a) and 4(b) are diagrams illustrating an example of a configuration of the utterance sentence DB 50 and data stored therein.
- the utterance sentence DB 50 stores utterance sentence data 50 A including a predicate argument structure pair associated with the focus information.
- the predicate argument structure pair is a pair of a predicate, as exemplified by a verb, and arguments serving as the subject and object of the predicate.
- the connected utterance sentence generation unit 15 can generate utterance sentences of various forms such as “I have a meal” and “I want to have a meal” by a well-known method on the basis of a predicate argument structure pair “I have_a meal” shown in the utterance sentence data 50 A.
- the utterance sentence DB 50 may store utterance sentence data 50 B having a configuration in which one utterance sentence is associated with focus information.
- the connected utterance sentence generation unit 15 may connect a plurality of utterance sentences having subject matter represented by focus information extracted by the focus extraction unit 12 .
- An example of connection of utterance sentences by the connected utterance sentence generation unit 15 will be described below.
- focus information “meal” is extracted by the focus extraction unit 12 and the number of sentences to be connected “2” is determined by the number of connected sentences determination unit 14 .
- the connected utterance sentence generation unit 15 acquires two utterance sentences associated with the focus information “meal” with reference to the utterance sentence DB 50 .
- FIG. 5 is a diagram illustrating an example of a configuration of the utterance sentence DB 50 and utterance sentence data stored therein.
- the connected utterance sentence generation unit 15 acquires, for example, an utterance sentence “I am hungry” and an utterance sentence “What do you want to have for dinner?” which are associated with the focus information “meal” among utterance sentences shown in utterance sentence data 50 C of FIG. 5 .
- the connected utterance sentence generation unit 15 generates a connected utterance sentence “I am hungry. What do you want to have for dinner?” by connecting the two utterance sentences acquired from the utterance sentence DB 50 .
- the connected utterance sentence generation unit 15 acquires three utterance sentences associated with the focus information “meal” with reference to the utterance sentence DB 50 .
- the connected utterance sentence generation unit 15 acquires an utterance sentence "I am hungry", an utterance sentence "What do you want to have for dinner?" and an utterance sentence "Now is the best season for the bamboo shoots" which are associated with the focus information "meal" among the utterance sentences shown in the utterance sentence data 50 C.
- the connected utterance sentence generation unit 15 generates a connected utterance sentence “I am hungry. What do you want to have for dinner? Now is the best season for the bamboo shoots.” by connecting the three utterance sentences acquired from the utterance sentence DB 50 .
- naturalness as an utterance sentence may be determined for each of the utterance sentences acquired from the utterance sentence DB 50 .
- the ungrammatical sentence determination unit 16 determines naturalness as an utterance sentence of an utterance sentence acquired from the utterance sentence DB 50 or an utterance sentence generated from a predicate argument structure pair acquired from the utterance sentence DB 50 , prior to the generation of a connected utterance sentence by the connected utterance sentence generation unit 15 .
- the ungrammatical sentence determination unit 16 determines naturalness for each utterance sentence using a naturalness determination model for an utterance sentence based on a predetermined feature amount, obtained in advance by predetermined machine learning, on the basis of feature amounts of utterance sentences. For example, a vector expression of an utterance sentence may be used for the feature amounts of the utterance sentences, and a known method such as Bag of words or Word2Vec can be applied.
- the connected utterance sentence generation unit 15 may use only utterance sentences which have been determined by the ungrammatical sentence determination unit 16 to have a naturalness equal to or higher than a predetermined level, for the generation of the connected utterance sentence.
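The filtering just described reduces to thresholding a per-sentence naturalness score before connection. In this sketch the scoring function is a toy stand-in for a model trained over, e.g., Bag-of-words or Word2Vec features; the scores are invented.

```python
# Keep only candidate sentences whose naturalness score meets a threshold.
# The naturalness callable is a placeholder for the trained determination model.
def filter_natural(sentences, naturalness, threshold=0.5):
    """Return the sentences whose naturalness meets the threshold."""
    return [s for s in sentences if naturalness(s) >= threshold]

toy_scores = {"I am hungry.": 0.9, "Hungry am meal I.": 0.1}
kept = filter_natural(toy_scores, naturalness=toy_scores.get)
print(kept)  # ['I am hungry.']
```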
- the ungrammatical sentence determination unit 16 is not an essential component.
- the sentence establishment determination unit 17 determines the degree of establishment of a connected utterance sentence generated by the connected utterance sentence generation unit 15 as an utterance sentence.
- Various well-known techniques can be applied to the determination, and a machine learning method such as deep learning and SVM can be used.
- a plurality of (a large number of) sentences, each accompanied by an establishment label indicating whether or not a connected utterance sentence is established as a sentence, may be prepared.
- Those sentences are vectorized by a known technique such as Bag of words or Word2Vec, and the above-described predetermined machine learning is performed on pairs of the vectorized sentences and the establishment labels, thereby generating a model for determination.
- the sentence establishment determination unit 17 outputs a score indicating the degree of establishment as a connected utterance sentence using this model.
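The scoring path can be illustrated with a Bag-of-words vectorization and a linear scorer standing in for the model trained on (vector, establishment-label) pairs. The vocabulary and weights below are invented for the sketch, not learned values.

```python
# Bag-of-words vectorization plus a toy linear establishment scorer.
VOCAB = ["hungry", "dinner", "season", "the", "xyzzy"]

def bag_of_words(sentence):
    """Count occurrences of each vocabulary word in the sentence."""
    words = sentence.lower().replace("?", "").replace(".", "").split()
    return [words.count(v) for v in VOCAB]

def establishment_score(vector, weights, bias=0.0):
    """Linear stand-in for the learned establishment model."""
    return sum(w * x for w, x in zip(weights, vector)) + bias

vec = bag_of_words("I am hungry. What do you want to have for dinner?")
est = establishment_score(vec, weights=[0.4, 0.4, 0.2, 0.1, -1.0])
print(round(est, 2))  # 0.8
```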
- the output information control unit 18 causes the utterance sentence output unit 20 to output a connected utterance sentence for which the sentence establishment determination unit 17 determines that the degree of establishment as a sentence is equal to or higher than a predetermined degree. That is, the output information control unit 18 performs control such that the utterance sentence output unit 20 outputs only a connected utterance sentence for which the score determined by the sentence establishment determination unit 17 is equal to or greater than a predetermined value, and does not output a connected utterance sentence for which the score is less than the predetermined value.
- the sentence establishment determination unit 17 and the output information control unit 18 are not essential components.
- the connected sentence correction unit 19 unifies styles of utterance sentences included in a connected utterance sentence into a predetermined style. Specifically, for example, the connected sentence correction unit 19 analyzes the styles (for example, an informal style, a formal style, and the like) of the utterance sentences included in the connected utterance sentence by a well-known method. Further, in a case where the connected utterance sentence includes utterance sentences of different styles, the connected sentence correction unit 19 corrects the connected utterance sentence so as to unify the styles.
- a style into which the styles are to be unified may be, for example, a style of an utterance sentence at the beginning or end of the connected utterance sentence or the most frequent style among the styles of the plurality of utterance sentences included in the connected utterance sentence.
- the connected sentence correction unit 19 may determine a style into which the styles are to be unified, on the basis of attribute information of the user.
- FIG. 6 is a diagram illustrating an example of correction of a connected utterance sentence.
- a connected utterance sentence CS 1 before correction includes an utterance sentence “I am hungry.” of a formal style and an utterance sentence “What do you want to have for dinner?” of an informal style.
- the connected sentence correction unit 19 can correct the utterance sentence of the informal style in the connected utterance sentence CS 1 before correction to the utterance sentence “What would you like to have for dinner?” of the formal style to generate a connected utterance sentence CS 2 after correction.
- FIG. 7 is a diagram illustrating an example of correction of a connected utterance sentence in English.
- a connected utterance sentence CS 21 before correction includes an utterance sentence “May I help you?” of a formal style and an utterance sentence “Open the window?” of an informal style.
- the connected sentence correction unit 19 can correct the utterance sentence of the informal style in the connected utterance sentence CS 21 before correction to the utterance sentence “Could you open the window?” of the formal style to generate a connected utterance sentence CS 22 after correction.
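The style-unification correction can be sketched as: classify each sentence's style, take the target style, and rewrite mismatching sentences. A real system would use learned style classification and paraphrasing; here every table is invented, with the sentences taken from the FIG. 7 example.

```python
# Toy style unification: rewrite informal sentences to a formal target style.
FORMAL_REWRITES = {"Open the window?": "Could you open the window?"}

def classify_style(sentence):
    """Placeholder style detector: informal iff a formal rewrite is known."""
    return "informal" if sentence in FORMAL_REWRITES else "formal"

def unify_styles(sentences, target="formal"):
    """Rewrite any sentence whose style differs from the target style."""
    return [FORMAL_REWRITES.get(s, s) if classify_style(s) != target else s
            for s in sentences]

print(unify_styles(["May I help you?", "Open the window?"]))
# ['May I help you?', 'Could you open the window?']
```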
- the connected sentence correction unit 19 may impart a predetermined conjunction between utterance sentences included in a connected utterance sentence. Specifically, the connected sentence correction unit 19 performs morphological analysis, syntax analysis, semantic analysis, context analysis and the like using a well-known language processing technique on the utterance sentences included in the connected utterance sentence. For example, the connected sentence correction unit 19 imparts a conjunction between two connected utterance sentences in accordance with a difference in meaning between the two utterance sentences (for example, a difference in positive/negative degree between meanings of the sentences). In addition, the connected sentence correction unit 19 may vectorize each of the connected two utterance sentences and impart a conjunction between the two utterance sentences in accordance with the degree of similarity between the vectors thereof. Meanwhile, in the utterance sentence generation device 10 of the present embodiment, the connected sentence correction unit 19 is not an essential component.
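Conjunction imparting based on a difference in positive/negative degree can be sketched with a crude sentiment lexicon: when two adjacent utterance sentences disagree in polarity, insert a contrastive conjunction, otherwise an additive one. The lexicon and the conjunction choices are invented for illustration.

```python
# Toy sentiment lexicon; a real system would use semantic analysis.
POSITIVE = {"great", "delicious", "happy"}
NEGATIVE = {"tired", "hungry", "sad"}

def polarity(sentence):
    """Return +1, 0 or -1 from the tiny sentiment lexicon."""
    words = {w.strip(".,!?").lower() for w in sentence.split()}
    pos, neg = len(words & POSITIVE), len(words & NEGATIVE)
    return (pos > neg) - (pos < neg)

def join_with_conjunction(first, second):
    """Join two sentences, choosing the conjunction from their polarities."""
    conj = "However," if polarity(first) * polarity(second) < 0 else "Also,"
    return f"{first} {conj} {second}"

joined = join_with_conjunction("I am hungry.", "Dinner will be delicious.")
print(joined)
```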
- the utterance sentence output unit 20 outputs a connected utterance sentence. Specifically, the utterance sentence output unit 20 outputs an utterance sentence using a sound, text and the like in accordance with the user's utterance acquired by the user utterance acquisition unit 11 .
- FIG. 8 is a flowchart illustrating processing contents of the utterance sentence generation method of the present embodiment.
- In step S1, the user utterance acquisition unit 11 acquires the user's utterance.
- In step S2, the focus extraction unit 12 extracts focus information on the user's utterance on the basis of the user's utterance acquired in step S1.
- In step S3, the interest state estimation unit 13 estimates an interest state indicating the degree of the user's interest in the subject matter represented by the focus information extracted in step S2, on the basis of predetermined detection information on the state of the user.
- In step S4, the number of connected sentences determination unit 14 determines the number of utterance sentences to be connected, on the basis of the interest state estimated in step S3.
- In step S5, the connected utterance sentence generation unit 15 generates a connected utterance sentence by connecting utterance sentences corresponding to the number determined in step S4.
- Here, the ungrammatical sentence determination unit 16 may determine the naturalness, as an utterance sentence, of each utterance sentence acquired from the utterance sentence DB 50.
- In step S6, the sentence establishment determination unit 17 determines a score indicating the degree to which the connected utterance sentence generated in step S5 is established as an utterance sentence.
- In step S7, the output information control unit 18 determines whether or not the score determined in step S6 is equal to or greater than a predetermined value. If it is, the processing proceeds to step S8; otherwise, the processing is terminated. Note that steps S6 and S7 are not essential processing steps in this flowchart.
- In step S8, the utterance sentence output unit 20 outputs the connected utterance sentence generated in step S5.
- the connected sentence correction unit 19 may unify styles of utterance sentences included in the connected utterance sentence into a predetermined style or may impart a predetermined conjunction between the utterance sentences included in the connected utterance sentence.
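The flow of steps S1 through S8 can be wired together as in the sketch below. Every stub body is a placeholder assumption standing in for a unit of the device 10; only the control flow (extract focus, estimate interest, choose a count, connect, score, and gate on a threshold) mirrors FIG. 8, and the 0.5 threshold is an assumed "predetermined value".

```python
# Hypothetical end-to-end sketch of FIG. 8 (steps S1-S8). All stub
# logic below is assumed for illustration; it is not the embodiment.

SCORE_THRESHOLD = 0.5  # assumed "predetermined value" of step S7

def extract_focus(utterance: str) -> str:
    """S2 stub: take the last word as the focus of the user's utterance."""
    return utterance.rstrip("?.!").split()[-1].lower()

def estimate_interest(focus: str, detection_info: dict) -> str:
    """S3 stub: estimate the interest state from detection information."""
    return "high" if detection_info.get("gaze_on_topic") else "low"

def determine_number_of_sentences(interest: str) -> int:
    """S4 stub: more interest, more sentences to connect."""
    return {"high": 3, "low": 1}.get(interest, 2)

def establishment_score(sentence: str) -> float:
    """S6 stub: degree of establishment as an utterance sentence."""
    return 1.0 if sentence else 0.0

def generate_response(user_utterance, detection_info, utterance_db):
    focus = extract_focus(user_utterance)                  # S2
    interest = estimate_interest(focus, detection_info)    # S3
    n = determine_number_of_sentences(interest)            # S4
    candidates = [s for s in utterance_db
                  if focus in s.lower()][:n]
    connected = " ".join(candidates)                       # S5
    score = establishment_score(connected)                 # S6
    return connected if score >= SCORE_THRESHOLD else None  # S7, S8
```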
- FIG. 9 is a diagram illustrating a configuration of an utterance sentence generation program P 1 .
- the utterance sentence generation program P 1 is configured to include a main module m 10 that controls the overall utterance sentence generation process in the utterance sentence generation device 10 , a user utterance acquisition module m 11 , a focus extraction module m 12 , an interest state estimation module m 13 , a number of connected sentences determination module m 14 , a connected utterance sentence generation module m 15 , an ungrammatical sentence determination module m 16 , a sentence establishment determination module m 17 , an output information control module m 18 , a connected sentence correction module m 19 and an utterance sentence output module m 20 .
- the utterance sentence generation program P 1 may be configured to be transmitted through a transmission medium such as a communication line, or may be configured to be stored in a storage medium M 1 as illustrated in FIG. 9 .
- the ungrammatical sentence determination module m 16 , the sentence establishment determination module m 17 , the output information control module m 18 and the connected sentence correction module m 19 are not essential components in the utterance sentence generation program P 1 .
- According to the utterance sentence generation device 10, the utterance sentence generation method, and the utterance sentence generation program P1 of the present embodiment, focus information representing the subject matter of the user's utterance is extracted, and the number of utterance sentences to be connected is determined in accordance with the degree of the user's interest in that focus information. As a result, a connected utterance sentence of an appropriate length, reflecting the degree of the user's interest, is output.
- the interest state estimation unit may estimate the interest state on the basis of predetermined detection information on the state of the user.
- the degree of interest is estimated in accordance with the state of the user. Therefore, an interest state for the focus information is estimated appropriately.
- the interest state estimation unit may acquire, as the detection information, at least one of an acoustic feature of the user's utterance, the user's gaze, the user's facial expression, and the contents of the user's utterance.
- the degree of interest is estimated on the basis of various pieces of detection information in which interest states of the user are shown. Therefore, an interest state for the focus information is estimated appropriately.
- the connected utterance sentence generation unit may connect a plurality of utterance sentences having subject matter represented by the focus information extracted by the focus extraction unit.
- a connected utterance sentence is constituted by a plurality of utterance sentences having subject matter represented by the focus information extracted on the basis of the user's utterance, and thus an appropriate utterance sentence is generated as a response to the user's utterance.
- the number of connected sentences determination unit may determine the number of utterance sentences to be connected with reference to a number of connected sentences table in which information indicating an interest state and the number of utterance sentences to be connected are stored in association with each other.
- a number assumed to be preferable as the number of utterance sentences to be connected in accordance with information indicating an interest state is set in the table in advance, and thus it is possible to connect an appropriate number of utterance sentences.
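Such a number of connected sentences table reduces to a simple lookup. The interest states and counts below are assumed values for illustration only; the patent does not specify the table's contents.

```python
# Sketch of a "number of connected sentences table": interest states
# mapped in advance to the number of utterance sentences to connect.
# The states and counts are assumed, illustrative values.
NUMBER_OF_CONNECTED_SENTENCES_TABLE = {
    "strong interest": 3,
    "interest": 2,
    "weak interest": 1,
}

def number_to_connect(interest_state: str, default: int = 1) -> int:
    """Look up the number of utterance sentences to be connected."""
    return NUMBER_OF_CONNECTED_SENTENCES_TABLE.get(interest_state, default)
```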
- the utterance sentence generation device may further include a sentence establishment determination unit that determines the degree of establishment of a connected utterance sentence as an utterance sentence, and an output information control unit that causes the utterance sentence output unit to output a connected utterance sentence for which the sentence establishment determination unit determines that the degree of establishment as an utterance sentence is equal to or more than a predetermined degree.
- a connected utterance sentence for which the degree of establishment as an utterance sentence is less than the predetermined degree is not output. Therefore, a connected utterance sentence which is not appropriate as an utterance sentence is prevented from being used for a response to the user's utterance.
- the utterance sentence generation device may further include a connected sentence correction unit that unifies styles of utterance sentences included in a connected utterance sentence into a predetermined style or imparts a predetermined conjunction between utterance sentences.
- a connected utterance sentence constituted by a plurality of utterance sentences connected to each other can be configured to be natural as a whole.
- The aspects and embodiments described in this specification may be applied to systems using LTE (Long Term Evolution), LTE-A (LTE-Advanced), SUPER 3G, IMT-Advanced, 4G, 5G, FRA (Future Radio Access), W-CDMA (registered trademark), GSM (registered trademark), CDMA2000, UMB (Ultra Mobile Broadband), IEEE 802.11 (Wi-Fi), IEEE 802.16 (WiMAX), IEEE 802.20, UWB (Ultra-WideBand), Bluetooth (registered trademark), or other appropriate systems, and/or next-generation systems extended on the basis of these systems.
- Information or the like can be output from an upper layer (or a lower layer) to a lower layer (or an upper layer). Information or the like may be input or output via a plurality of network nodes.
- the input or output information or the like may be stored in a specific place (for example, a memory) or may be managed in a management table.
- the input or output information or the like may be overwritten, updated, or added.
- the output information or the like may be deleted.
- the input information or the like may be transmitted to another device.
- Determination may be performed using a value (0 or 1) which is expressed by one bit, may be performed using a Boolean value (true or false), or may be performed by comparison of numerical values (for example, comparison thereof with a predetermined value).
- notification of predetermined information is not limited to explicit transmission, and may be performed by implicit transmission (for example, the notification of the predetermined information is not performed).
- software can be widely construed to refer to commands, a command set, codes, code segments, program codes, a program, a sub-program, a software module, an application, a software application, a software package, a routine, a sub-routine, an object, an executable file, an execution thread, a procedure, a function, or the like.
- Software, a command, and the like may be transmitted and received via a transmission medium.
- For example, when software is transmitted from a web site, a server, or another remote source using wired technology such as a coaxial cable, an optical fiber cable, a twisted-pair wire, or a digital subscriber line (DSL) and/or wireless technology such as infrared rays, radio waves, or microwaves, the wired technology and/or the wireless technology are included in the definition of a transmission medium.
- Information, a signal or the like described in this specification may be expressed using any of various different techniques.
- data, an instruction, a command, information, a signal, a bit, a symbol, and a chip which can be mentioned in the overall description may be expressed by a voltage, a current, an electromagnetic wave, a magnetic field or magnetic particles, an optical field or photons, or any combination thereof.
- system and “network” which are used in this specification are used interchangeably.
- information, parameters, and the like described in this specification may be expressed as absolute values, may be expressed by values relative to a predetermined value, or may be expressed by other corresponding information.
- any reference to elements having names such as “first” and “second” which are used in this specification does not generally limit amounts or an order of the elements. The terms can be conveniently used to distinguish two or more elements in this specification. Accordingly, reference to first and second elements does not mean that only two elements are employed or that the first element has to precede the second element in any form.
- in this specification, reference to a single device is assumed to include a plurality of devices, unless the context or the technique makes clear that only one device can be present.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Claims (15)
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2017220731 | 2017-11-16 | ||
| JP2017-220731 | 2017-11-16 | ||
| JPJP2017-220731 | 2017-11-16 | ||
| PCT/JP2018/041958 WO2019098185A1 (en) | 2017-11-16 | 2018-11-13 | Dialog text generation system and dialog text generation program |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US20210133398A1 US20210133398A1 (en) | 2021-05-06 |
| US11055495B2 true US11055495B2 (en) | 2021-07-06 |
Family
ID=66539513
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US16/640,104 Expired - Fee Related US11055495B2 (en) | 2017-11-16 | 2018-11-13 | Utterance sentence generation system and utterance sentence generation program |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US11055495B2 (en) |
| JP (1) | JP6840862B2 (en) |
| WO (1) | WO2019098185A1 (en) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2022012300A (en) * | 2020-07-01 | 2022-01-17 | トヨタ自動車株式会社 | Information processor, program, and information processing method |
| CN113157894A (en) * | 2021-05-25 | 2021-07-23 | 中国平安人寿保险股份有限公司 | Dialog method, device, terminal and storage medium based on artificial intelligence |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2014219872A (en) | 2013-05-09 | 2014-11-20 | 日本電信電話株式会社 | Utterance selecting device, method and program, and dialog device and method |
Family Cites Families (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH0981632A (en) * | 1995-09-13 | 1997-03-28 | Toshiba Corp | Information disclosure device |
| JP4465730B2 (en) * | 1999-01-20 | 2010-05-19 | 日本ビクター株式会社 | Dialogue device |
| JP4185500B2 (en) * | 2005-03-14 | 2008-11-26 | 株式会社東芝 | Document search system, document search method and program |
| US9177318B2 (en) * | 2013-04-22 | 2015-11-03 | Palo Alto Research Center Incorporated | Method and apparatus for customizing conversation agents based on user characteristics using a relevance score for automatic statements, and a response prediction function |
| JP6034459B1 (en) * | 2015-08-14 | 2016-11-30 | Psソリューションズ株式会社 | Interactive interface |
| CN106599998B (en) * | 2016-12-01 | 2019-02-01 | 竹间智能科技(上海)有限公司 | Method and system for adjusting robot answer based on emotional characteristics |
2018
- 2018-11-13 WO PCT/JP2018/041958 patent/WO2019098185A1/en not_active Ceased
- 2018-11-13 US US16/640,104 patent/US11055495B2/en not_active Expired - Fee Related
- 2018-11-13 JP JP2019554224A patent/JP6840862B2/en not_active Expired - Fee Related
Patent Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2014219872A (en) | 2013-05-09 | 2014-11-20 | 日本電信電話株式会社 | Utterance selecting device, method and program, and dialog device and method |
Non-Patent Citations (2)
| Title |
|---|
| International Preliminary Report on Patentability and Written Opinion dated May 28, 2020 in PCT/JP2018/041958 (English Translation only), 6 pages. |
| International Search Report dated Jan. 29, 2019 in PCT/JP2018/041958 filed on Nov. 13, 2018. |
Also Published As
| Publication number | Publication date |
|---|---|
| US20210133398A1 (en) | 2021-05-06 |
| JP6840862B2 (en) | 2021-03-10 |
| WO2019098185A1 (en) | 2019-05-23 |
| JPWO2019098185A1 (en) | 2020-07-09 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11664027B2 (en) | Method of providing voice command and electronic device supporting the same | |
| US12164872B2 (en) | Electronic apparatus for recommending words corresponding to user interaction and controlling method thereof | |
| CN108829235B (en) | Voice data processing method and electronic device supporting the same | |
| CN109427333B (en) | Method for activating voice recognition service and electronic device for implementing the method | |
| CN108304846B (en) | Image recognition method, device and storage medium | |
| US10909982B2 (en) | Electronic apparatus for processing user utterance and controlling method thereof | |
| US9520127B2 (en) | Shared hidden layer combination for speech recognition systems | |
| US11137978B2 (en) | Method for operating speech recognition service and electronic device supporting the same | |
| US10217477B2 (en) | Electronic device and speech recognition method thereof | |
| US12307757B2 (en) | Recognition error correction device and correction model | |
| EP3444811B1 (en) | Speech recognition method and device | |
| US11398228B2 (en) | Voice recognition method, device and server | |
| CN110674314A (en) | Sentence recognition method and device | |
| US20160062983A1 (en) | Electronic device and method for recognizing named entities in electronic device | |
| KR20210043894A (en) | Electronic apparatus and method of providing sentence thereof | |
| CN111460117B (en) | Method, device, medium and electronic equipment for generating conversational robot intent corpus | |
| US12008988B2 (en) | Electronic apparatus and controlling method thereof | |
| KR20200084260A (en) | Electronic apparatus and controlling method thereof | |
| EP4310837A1 (en) | Methods for constructing speech recognition model and processing speech, and system | |
| KR20230014802A (en) | Method for recommending designated items | |
| US20220293103A1 (en) | Method of processing voice for vehicle, electronic device and medium | |
| KR20200095947A (en) | Electronic device and Method for controlling the electronic device thereof | |
| US11055495B2 (en) | Utterance sentence generation system and utterance sentence generation program | |
| JP6782329B1 (en) | Emotion estimation device, emotion estimation system, and emotion estimation method | |
| US11862167B2 (en) | Voice dialogue system, model generation device, barge-in speech determination model, and voice dialogue program |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| AS | Assignment |
Owner name: NTT DOCOMO, INC., JAPAN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:TSUNOMORI, YUIKO;REEL/FRAME:051855/0415 Effective date: 20200117 |
|
| FEPP | Fee payment procedure |
Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
|
| STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
| FEPP | Fee payment procedure |
Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
| LAPS | Lapse for failure to pay maintenance fees |
Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
| STCH | Information on status: patent discontinuation |
Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
|
| FP | Lapsed due to failure to pay maintenance fee |
Effective date: 20250706 |