WO2020174537A1 - Information processing system and information processing method - Google Patents
Information processing system and information processing method
- Publication number
- WO2020174537A1 (PCT/JP2019/007075; JP2019007075W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- information
- unit
- utterance
- information processing
- person
- Prior art date
Links
- 230000010365 information processing Effects 0.000 title claims abstract description 76
- 238000003672 processing method Methods 0.000 title claims description 13
- 238000004458 analytical method Methods 0.000 claims abstract description 46
- 230000002787 reinforcement Effects 0.000 claims description 61
- 238000003384 imaging method Methods 0.000 claims description 10
- 238000000034 method Methods 0.000 claims description 10
- 238000010304 firing Methods 0.000 claims description 9
- 230000008569 process Effects 0.000 claims description 9
- 230000007613 environmental effect Effects 0.000 abstract description 9
- 230000033001 locomotion Effects 0.000 description 48
- 230000009471 action Effects 0.000 description 43
- 230000008921 facial expression Effects 0.000 description 26
- 238000006243 chemical reaction Methods 0.000 description 18
- 238000010586 diagram Methods 0.000 description 14
- 238000010411 cooking Methods 0.000 description 13
- 230000006870 function Effects 0.000 description 13
- 238000004364 calculation method Methods 0.000 description 8
- 239000000463 material Substances 0.000 description 8
- 230000008859 change Effects 0.000 description 5
- 230000014509 gene expression Effects 0.000 description 5
- 230000006872 improvement Effects 0.000 description 5
- 238000011156 evaluation Methods 0.000 description 4
- 238000001514 detection method Methods 0.000 description 3
- 235000013305 food Nutrition 0.000 description 3
- 238000013473 artificial intelligence Methods 0.000 description 2
- 230000000694 effects Effects 0.000 description 2
- 238000013528 artificial neural network Methods 0.000 description 1
- 230000006399 behavior Effects 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- 239000000470 constituent Substances 0.000 description 1
- 230000007423 decrease Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 230000001815 facial effect Effects 0.000 description 1
- 239000000203 mixture Substances 0.000 description 1
- 235000011888 snacks Nutrition 0.000 description 1
- 238000005728 strengthening Methods 0.000 description 1
- 238000003786 synthesis reaction Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1694—Programme controls characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors, perception control, multi-sensor controlled systems, sensor fusion
- B25J9/1697—Vision controlled systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J13/00—Controls for manipulators
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J19/00—Accessories fitted to manipulators, e.g. for monitoring, for viewing; Safety devices combined with or specially adapted for use in connection with manipulators
- B25J19/02—Sensing devices
- B25J19/021—Optical sensing devices
- B25J19/023—Optical sensing devices including video camera means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/167—Audio in a user interface, e.g. using voice commands for navigating, audio feedback
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/12—Hotels or restaurants
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/01—Customer relationship services
- G06Q30/015—Providing customer assistance, e.g. assisting a customer within a business location or via helpdesk
- G06Q30/016—After-sales
Definitions
- the present invention relates to an information processing system and an information processing method.
- in a conventional system, the robot only performs a fixed customer service operation for the customer and grasps the customer's request from the customer's reaction to that operation. There is therefore a problem in that the robot cannot perform an operation suited to the individual customer and situation, and cannot serve customers flexibly.
- An object of the present invention is to provide an information processing system and an information processing method capable of flexible customer service.
- the information processing system of the present invention has: an imaging unit; an analysis unit that analyzes person information about a person included in the image captured by the imaging unit; a database that stores the person information and environment information indicating the environment in which the information processing system is installed; an utterance unit that utters utterance content according to the person information and the environment information; and a reinforcement learning unit that reads the person information and the environment information from the database and, based on result information indicating the result of the utterance made by the utterance unit, learns and updates a first score for each utterance content for each combination of the read person information and environment information. The utterance unit utters the utterance content associated with the largest first score for the combination.
- the information processing system has a camera, a robot, and an information processing device.
- the information processing device has: an analysis unit that analyzes person information about a person included in the image captured by the camera;
- a database that stores the person information and environment information indicating the environment in which the information processing system is installed;
- an utterance control unit that instructs the robot to utter utterance content according to the person information and the environment information; and
- a reinforcement learning unit that reads the person information and the environment information from the database and, based on result information indicating the result of the utterance instructed by the utterance control unit, learns and updates a first score for each utterance content for each combination of the read person information and environment information.
- the utterance control unit instructs the robot to utter the utterance content associated with the largest first score for the combination.
- the robot has a voice output unit that outputs the voice indicated by the utterance content instructed by the utterance control unit.
- the information processing method of the present invention is an information processing method in an information processing system, and includes: a process of analyzing person information about a person included in an image captured by a camera; a process of reading the person information and environment information from a database that stores the person information and environment information indicating the environment in which the information processing system is installed; a process of uttering the utterance content associated with the largest first score for the combination of the read person information and environment information; and a process of performing learning based on result information indicating the result of the utterance and updating the first score.
- FIG. 6 is a diagram for explaining an example of processing for identifying the position of a person imaged by the camera shown in FIG. 5.
- a diagram showing an example of the association of events and execution tasks that the execution task selection unit 171 shown in FIG. 5 refers to, which can be grasped from a combination of person information and environment information stored in the database.
- a diagram showing an example of a software configuration in the information processing system shown in FIG. 5, and a diagram showing an example of the correspondence stored there.
- FIG. 9 is a diagram showing an example of the types of utterance data shown in FIG. 8 and the index to be improved by each type of utterance data. Further figures show examples of the information registered as the utterance data shown in FIG. 8, and a flowchart for explaining an example of an information processing method in the information processing system shown in FIG. 5.
- FIG. 1 is a diagram showing a first embodiment of an information processing system of the present invention.
- the information processing system includes an imaging unit 110, an analysis unit 120, a database 130, an utterance unit 140, and a reinforcement learning unit 150.
- the image capturing unit 110 captures an image of a target person.
- the analysis unit 120 analyzes person information about a person included in the image captured by the image capturing unit 110.
- the database 130 stores the person information and environment information indicating the environment in which the information processing system is installed.
- the person information is information about a person included in the image captured by the image capturing unit 110 and includes, for example, the position of the person in the image, gender, age group, facial expression (for example, smiling face, surprised face, sad face, angry face, etc.), height, clothes, race, relationships between people, and the like.
- the person information also includes a usage language indicating a language used by the person included in the image captured by the image capturing unit 110, and an order content indicating the content ordered by the person.
- the usage language is information analyzed by the analysis unit 120 based on sound collected by a sound collection member (not shown), such as a microphone, installed near the imaging unit 110.
- the order content is the content of the order received by an input unit (not shown) for placing an order.
- the person information may also include personal identification information (for example, a customer ID number) given to the person included in the image captured by the image capturing unit 110 when that person is authenticated (identified) as a previously registered customer.
- the personal identification information can be associated with the customer's past order details (ordered products, the number of orders, etc.).
- the environment information is information indicating, for example, the number of people present, the current date, the time of day, the weather, the operating status of this system (processing load status), the location classification, the number of remaining orders, the status of the store, and the like.
- the environment information may be at least one of the above-mentioned information.
- the operating status of this system is information indicating the current state of the system, for example, “a customer has ordered a product”, “the store is crowded”, “there are no people around the store”, “food is being cooked”, “the number of remaining orders is zero”, or “the cooking robot has placed the product at the provision position”.
- the utterance content indicates a specific phrase of the utterance performed by the utterance unit 140.
- the utterance content indicates the content of utterance around the utterance unit 140, the content of talking to a person included in the image captured by the imaging unit 110, and the like.
- the utterance content may be, for example, content that calls out to attract customers, content that attracts the attention of people in the surrounding area, content that urges a customer who has ordered a product to place an additional order, a soliloquy, current affairs news, or a description of a product; these are similar to the utterances that a clerk of an ordinary store makes according to the situation.
- the score is a value (first score) learned by the reinforcement learning unit 150 based on result information indicating the result of the utterance made by the utterance unit 140.
- This score is updated by the reinforcement learning unit 150 as the reinforcement learning unit 150 performs reinforcement learning.
- the result information is information including at least one of, for example, the reaction of the customer after the utterance by the utterance unit 140 and sales information indicating changes in sales content and sales amount.
- the customer reaction is acquired by the analysis unit 120 analyzing changes in facial expressions regarding a person included in the image captured by the image capturing unit 110.
- the utterance unit 140 utters the utterance content according to the person information and the environment information.
- according to the person information and the environment information, the utterance unit 140 utters the utterance content associated with the largest first score for that combination of person information and environment information.
- the reinforcement learning unit 150 reads out person information and environment information from the database 130.
- based on result information indicating the result of the utterance performed by the utterance unit 140, the reinforcement learning unit 150 learns and updates the first score for each utterance content, for each combination of the read person information and environment information.
- FIG. 2 is a diagram showing an example of a score included in the reinforcement learning unit 150 shown in FIG.
- in the reinforcement learning unit 150 shown in FIG. 1, a task to be started is set according to the operating status of the system in the environment information, together with the utterance category corresponding to the started task and the utterance contents belonging to that category.
- personal information is shown as “a1”, “a2”, “b1”, “b2”, and “b3”.
- the environmental information is shown as “c1”, “c2”, and “d1”.
- the utterance categories are shown as “Cat1” and “Cat2”.
- in FIG. 2, the utterance contents corresponding to the utterance category “Cat1” are shown as “Con11”, “Con12”, and “Con13”, and the utterance contents corresponding to the utterance category “Cat2” are shown as “Con21”, “Con22”, and “Con23”. In FIG. 2, if the person information “a” is gender, “a1” may represent male and “a2” female. The same applies to the other person information and environment information.
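- as an illustration only (the patent defines no concrete data structures), the score table of FIG. 2 could be held as a mapping from a combination of person information and environment information to the first score of each utterance content; the identifiers follow FIG. 2 and all numeric values are placeholders:

```python
# Hypothetical in-memory representation of the score table of FIG. 2.
# A key combines person information ("a1", "b2", ...) with environment
# information ("c1", "d1"); the value maps each utterance content to its
# first score. The numeric scores are placeholders, not values from the patent.
score_table = {
    ("a1", "b2", "c1", "d1"): {
        "Con11": 0.30, "Con12": 0.10, "Con13": 0.25,  # utterance category Cat1
        "Con21": 0.05, "Con22": 0.40, "Con23": 0.15,  # utterance category Cat2
    },
    ("a2", "b1", "c2", "d1"): {
        "Con11": 0.20, "Con12": 0.35, "Con13": 0.05,
        "Con21": 0.10, "Con22": 0.15, "Con23": 0.50,
    },
}

CATEGORIES = {
    "Cat1": ["Con11", "Con12", "Con13"],
    "Cat2": ["Con21", "Con22", "Con23"],
}

def scores_in_category(combination, category):
    """First scores of the utterance contents belonging to one category."""
    row = score_table[combination]
    return {content: row[content] for content in CATEGORIES[category]}
```

For the combination ("a1", "b2", "c1", "d1") and category "Cat2", the content with the largest first score (here "Con22") would be the one uttered.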
- FIG. 3 is a diagram showing an example of input/output of reinforcement learning performed in the reinforcement learning unit 150 shown in FIG.
- the reinforcement learning unit 150 shown in FIG. 1 has a reward calculating unit 1501, an updating unit 1502, and a value function calculating unit 1503.
- in the reinforcement learning unit 150, a reward is calculated and an update is performed based on the result information and the sales data (product, quantity, amount, etc.) after the utterance, and the update is input to the value function calculation unit 1503. The value function calculation unit 1503 then outputs the value (score) of each utterance content based on the person information and the environment information.
- the value function calculation unit 1503 can be realized by using a neural network, but the analysis method performed by the value function calculation unit 1503 is not particularly specified.
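- a minimal tabular stand-in for the value function calculation unit 1503 might look as follows; this is a sketch under the assumption of a simple incremental update with a hypothetical learning rate, not the neural-network realization mentioned above:

```python
# Minimal tabular stand-in for the value function calculation unit 1503.
# A dictionary keyed by (combination, utterance content) holds the scores,
# and `update` plays the role of the updating unit 1502. The learning rate
# and the incremental-update rule are illustrative assumptions.
class TabularValueFunction:
    def __init__(self, lr=0.1):
        self.values = {}  # (combination, utterance_content) -> score
        self.lr = lr

    def score(self, combination, content):
        return self.values.get((combination, content), 0.0)

    def update(self, combination, content, reward):
        """Move the stored score toward the reward computed from result information."""
        old = self.score(combination, content)
        self.values[(combination, content)] = old + self.lr * (reward - old)

vf = TabularValueFunction()
vf.update(("a1", "c1"), "Con11", reward=1.0)  # score becomes 0.1
vf.update(("a1", "c1"), "Con11", reward=1.0)  # score becomes 0.19
```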
- FIG. 4 is a flowchart for explaining an example of an information processing method in the information processing system shown in FIG.
- the analyzing unit 120 analyzes person information about a person included in the image captured by the image capturing unit 110 (step S2).
- the analysis unit 120 writes the analysis result in the database 130.
- the reinforcement learning unit 150 reads the person information and the environment information from the database 130, and calculates the adequacy of each utterance content based on the read person information, environment information, and utterance contents.
- the utterance unit 140 then selects the most suitable utterance content (step S3). Specifically, the utterance unit 140 selects the utterance content associated with the maximum score for the combination of the person information and environment information stored in the database 130.
- the utterance unit 140 utters the selected utterance content (step S4).
- the reinforcement learning unit 150 performs learning based on the result information after the utterance made by the utterance unit 140 and updates the score (step S5).
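- steps S2 to S5 can be strung together as one loop iteration; every collaborator below is a hypothetical stub standing in for the imaging unit 110, analysis unit 120, utterance unit 140, and the collection of result information, and the learning rate is an assumption:

```python
# One illustrative iteration of steps S2-S5. `capture`, `analyze`, `speak`,
# and `get_result` are stand-ins; the score update is a simple incremental
# rule assumed for the sketch, not the claimed implementation.
def service_step(capture, analyze, score_table, speak, get_result, lr=0.1):
    image = capture()
    combination = analyze(image)           # S2: person/environment information
    scores = score_table[combination]
    content = max(scores, key=scores.get)  # S3: select most suitable content
    speak(content)                         # S4: utter the selected content
    reward = get_result()                  # S5: learn from result information
    scores[content] += lr * (reward - scores[content])
    return content

table = {"combo": {"greeting": 0.3, "upsell": 0.6}}
said = service_step(lambda: "image", lambda img: "combo",
                    table, lambda c: None, lambda: 1.0)
```

After this iteration the score of the uttered content ("upsell") has moved toward the observed reward, so future selections reflect the result of the utterance.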
- as initial values, all the scores may be set to the same value, or each score may be set in advance according to the predicted effect of the utterance content on the corresponding combination of person information and environment information.
- in this way, an utterance according to the imaged person and environment is performed, learning is performed based on its result, and the score of the utterance content is updated accordingly. Therefore, flexible customer service can be provided.
- FIG. 5 is a diagram showing a second embodiment of the information processing system of the present invention.
- the information processing system in this embodiment includes a camera 111, an information processing device 101, and a robot 201.
- the information processing system shown in FIG. 5 is installed in, for example, a store that provides food and drink such as coffee and snacks, and the robot 201 serves customers.
- the camera 111 captures an image of the surroundings of the store, and the robot 201 performs an utterance or an operation with the person included in the captured image as a customer candidate or customer.
- the camera 111 is an image capturing unit that captures an image of a target person.
- the camera 111 may be a camera that captures a still image or a moving image, or may be a camera with a built-in depth sensor that can acquire depth information. The timing at which the camera 111 captures an image is not specified. In addition, the camera 111 is installed at a position where the relative position of the position where the customer exists to the position where the product is provided can be recognized based on the captured image. The number of cameras 111 is not limited to one. Further, the camera 111 may be capable of freely changing the imaging direction based on control from the outside.
- the information processing device 101 is a device that is connected to the camera 111 and the robot 201 and controls the camera 111 and the robot 201.
- the information processing apparatus 101 may be a PC (Personal Computer) capable of executing software.
- the robot 201 outputs a predetermined sound or performs a predetermined operation based on an instruction from the information processing device 101.
- the robot 201 can perform, for example, cooking and dancing as a predetermined operation.
- the information processing apparatus 101 includes an analysis unit 121, a database 131, an utterance control unit 141, an utterance system reinforcement learning unit 1511, a motion control unit 161, a motion system reinforcement learning unit 1512, execution tasks 191, an execution task selection unit 171, and an input unit 181. It should be noted that FIG. 5 shows only the main components related to the present embodiment among the components included in the information processing apparatus 101.
- the analysis unit 121 analyzes person information about a person included in an image captured by the camera 111.
- the person information is, for example, the position of the person, gender, age group, facial expression, height, clothes, race, usage language, relationships between people, order contents, and the like, as in the first embodiment.
- an image recognition method generally used for image recognition may be used, and the analysis method is not particularly specified.
- the person information may also include personal identification information (for example, a customer ID number) given to the person included in the image captured by the camera 111 when that person is authenticated (identified) as a previously registered customer.
- the analysis unit 121 calculates the relative position of the position where the customer is present with respect to the position where the product is provided, based on the position where the camera 111 is installed and the position of the person imaged by the camera 111. In addition, the analysis unit 121 recognizes, as an orderer, a person located in front of an ordering terminal for a customer to input an order at the time of ordering.
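- the relative-position calculation could be sketched as simple coordinate arithmetic on the store floor; the 2-D coordinates, function names, and values below are illustrative assumptions, not the method claimed:

```python
# Illustrative sketch: from the camera's installation position and an offset
# estimated from the captured image, compute the person's position in store
# coordinates, then express it relative to the product-provision position.
# 2-D floor coordinates and all numeric values are assumptions.
def person_position(camera_pos, offset_from_camera):
    """Person's position in store coordinates (camera position + image-derived offset)."""
    return (camera_pos[0] + offset_from_camera[0],
            camera_pos[1] + offset_from_camera[1])

def relative_to_provision_point(person_pos, provision_pos):
    """Vector from the product-provision position to the person."""
    return (person_pos[0] - provision_pos[0],
            person_pos[1] - provision_pos[1])

pos = person_position(camera_pos=(0.0, 0.0), offset_from_camera=(2.0, 1.5))
rel = relative_to_provision_point(pos, provision_pos=(1.0, 0.0))  # (1.0, 1.5)
```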
- the database 131 stores personal information and environment information indicating the environment of the information processing system.
- the database 131 also stores the execution task information selected by the execution task selection unit 171 according to the task firing condition. A specific example of the stored information will be described later.
- the execution task selection unit 171 selects a task to be executed by the information processing apparatus 101 from a plurality of execution tasks 191, and activates the task, based on the task firing condition.
- the utterance system reinforcement learning unit 1511 updates and manages the score of the utterance category corresponding to the selected and activated execution task 191 and the scores of the utterance contents included in that utterance category.
- based on result information indicating the result of the utterance output by the voice output unit 211, the utterance system reinforcement learning unit 1511 learns and updates the score for each utterance content, for each combination of the person information and environment information read from the database 131.
- the learning performed here is the same as that of the first embodiment.
- the score here is a value (first score) learned based on the result information indicating the result of the utterance made by the voice output unit 211.
- This score is updated by the utterance system reinforcement learning unit 1511 by performing reinforcement learning.
- the result information is information including at least one of, for example, the customer's reaction (for example, a smile rate) after the voice output unit 211 speaks, and sales information (for example, an upsell rate or a sales improvement rate) indicating a change in sales content or sales amount.
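- such result information could, for example, be reduced to a scalar learning signal by weighting the smile rate and the sales figures; the linear form and the weights below are illustrative assumptions only:

```python
# Hedged sketch of turning result information into a scalar reward for the
# utterance system reinforcement learning unit 1511. The linear combination
# and the weight values are assumptions; the patent does not specify them.
def reward(smile_rate, upsell_rate, sales_improvement_rate,
           w_smile=0.3, w_upsell=0.4, w_sales=0.3):
    """Combine customer reaction and sales information into one reward value."""
    return (w_smile * smile_rate
            + w_upsell * upsell_rate
            + w_sales * sales_improvement_rate)

r = reward(smile_rate=0.8, upsell_rate=0.5, sales_improvement_rate=0.2)  # 0.5
```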
- This sales information may indicate the content of sales of the product sold based on the input to the input unit 181.
- the above-described customer reaction is acquired by the analysis unit 121 analyzing based on the person information regarding the person included in the image captured by the camera 111.
- the action-based reinforcement learning unit 1512 updates and manages the score of the action category corresponding to the selected and activated execution task 191 and the scores of the action information included in that action category.
- based on result information indicating the result of the motion performed by the motion execution unit 221, the motion system reinforcement learning unit 1512 learns and updates the score for each piece of motion information, for each combination of the person information and environment information read from the database 131.
- the score here is a value (second score) learned based on the result information indicating the result of the operation performed by the operation execution unit 221. This score is updated by the action-based reinforcement learning unit 1512 by performing reinforcement learning.
- the result information is information including at least one of, for example, the customer's reaction (for example, a smile rate) after the operation execution unit 221 performs an operation, and sales information (for example, an upsell rate or a sales improvement rate) indicating a change in sales content or sales amount.
- the above-described customer reaction is acquired by the analysis unit 121 analyzing based on the person information regarding the person included in the image captured by the camera 111.
- the utterance control unit 141 identifies the utterance content associated with the first score having the largest value among the first scores output by the utterance system reinforcement learning unit 1511, and instructs the voice output unit 211 of the robot 201 to utter the identified utterance content.
- the operation control unit 161 acquires the operation information associated with the second score having the largest value among the second scores output by the operation system reinforcement learning unit 1512, and instructs the operation execution unit 221 of the robot 201 to perform the operation indicated by the acquired operation information.
- the input unit 181 inputs information.
- the input unit 181 may be one that inputs information based on an operation received from the outside, or one that inputs a numerical value calculated inside or outside the information processing apparatus 101.
- the input unit 181 may be used for ordering, and in this case, the ordered product is input based on an operation received from the outside.
- the robot 201 has a voice output unit 211 and an action execution unit 221. It should be noted that FIG. 5 shows only the main components related to the present embodiment among the components included in the robot 201.
- the voice output unit 211 outputs a voice based on an instruction from the utterance control unit 141.
- the audio output unit 211 may be a general speaker. It is preferable that the voice output unit 211 be attached at a position such that the output voice sounds, from outside, as if the robot 201 were talking. Note that the number of voice output units 211 is not limited to one, and a voice output unit may be installed at a position other than inside the robot 201.
- the operation execution unit 221 performs an operation based on the instruction from the operation control unit 161.
- the operation execution unit 221 may be, for example, an arm portion that operates using the motor of the robot 201 or the like.
- the utterance control unit 141 and the voice output unit 211 are collectively referred to as an utterance unit
- the operation control unit 161 and the operation execution unit 221 are collectively referred to as an operation unit.
- the personal information, environment information, and utterance content in this embodiment may be the same as those described in the first embodiment.
- the motion information is information for performing a predetermined motion such as cooking and dancing.
- FIG. 6 is a diagram for explaining an example of processing for specifying the position of a person imaged by the camera 111 shown in FIG.
- cameras 111-1 to 111-3 are installed in a store, and the analysis unit 121 specifies the position of a person based on the images captured by the cameras 111-1 to 111-3.
- the analysis unit 121 identifies that a person who is included in the image captured by the camera 111-1 and is present in a specific area as viewed from the camera 111-1 is located in area 1 (Zone 1).
- the analysis unit 121 specifies that a person who is included in an image captured by the camera 111-2 and is present in a specific area as viewed from the camera 111-2 is located in area 2 (Zone 2). Similarly, the analysis unit 121 specifies that a person who is included in an image captured by the camera 111-3 and is present in a specific area as viewed from the camera 111-3 is located in area 3 (Zone 3). Further, the analysis unit 121 specifies that a person who is included in an image captured by any of the cameras 111-1 to 111-3 but is present in an area far from the camera that captured the image is located in area 0 (Zone 0).
- Zone 0: Area around the store. Passing customers and interested customers are mixed.
- Zone 1: Ordering place. Many customers order products here.
- Zone 2: Area adjacent to the store. Many customers wait here for their products to be completed after ordering.
- Zone 3: Product serving place. Many customers pick up their finished products here.
- the correspondence between these defined areas and actions (utterance, motion) is registered in the database 131 in advance.
- For example, by associating Zone 0 with utterance content that calls people into the store, it can be determined that, for a person present in Zone 0, the system takes the action of uttering a call-in or performing a customer-attracting motion.
- Similarly, by associating Zone 1 with utterance content that asks which product to order, the system can take the action of uttering or moving to ask a person present in Zone 1 which product to order. In this way, an appropriate action can be prepared for the target person according to the area.
- the boundary of each area is specified using, for example, the coordinates of its four vertices.
- the cameras 111-1 to 111-3 and the Zone 0 to Zone 3 do not necessarily have to be associated with each other.
- the camera 111-2 and the camera 111-3 may capture the customer existing in the Zone 2 and analyze the person information such as the position of the customer captured by the two cameras.
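The zone identification described above, with each area bounded by four vertex coordinates, can be sketched as a point-in-polygon test. This is a minimal illustration under assumed zone names and coordinates, not the patent's actual implementation:

```python
# Hypothetical sketch: mapping a detected person's floor coordinates to one of
# the zones defined by four vertex coordinates (as in FIG. 6). Zone layouts
# and coordinates here are illustrative assumptions.

def point_in_polygon(x, y, vertices):
    """Ray-casting test: returns True if (x, y) lies inside the polygon."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            if x < (x2 - x1) * (y - y1) / (y2 - y1) + x1:
                inside = not inside
    return inside

# Each zone is a quadrilateral given by its four vertex coordinates.
ZONES = {
    "Zone1": [(0, 0), (4, 0), (4, 3), (0, 3)],    # ordering place
    "Zone2": [(4, 0), (8, 0), (8, 3), (4, 3)],    # adjacent waiting area
    "Zone3": [(8, 0), (12, 0), (12, 3), (8, 3)],  # product serving place
}

def locate_person(x, y):
    """Return the zone containing the person, or Zone0 (store surroundings)."""
    for name, poly in ZONES.items():
        if point_in_polygon(x, y, poly):
            return name
    return "Zone0"
```

Because Zone 0 is defined as "everywhere far from the cameras", it is the fall-through case rather than an explicit polygon.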
- FIG. 7 is a diagram showing an example of the correspondence, referred to by the execution task selection unit 171 shown in FIG., between events that can be grasped from combinations of the person information and environment information stored in the database 131 and the tasks to be executed. This correspondence may be stored in advance in the database 131 shown in FIG. As shown in FIG. 7, tasks are associated with events according to the position of a person and the environment information indicating the operating status of the system.
- the cooking task is associated with the event "order is placed.”
- the execution task selection unit 171 selects the cooking task when the event at that time is “order is placed”.
- detailed utterance contents and motion information are associated with the cooking task; when the cooking task is executed, the robot 201 cooks and performs actions according to these utterance contents and motion information.
- the utterance contents at this time are, for example, utterances for improving the smile rate and utterances for improving the Repeat rate. These utterance contents are stored in the database 131 in advance.
- the order promotion task is associated with the events "a person enters a specific area" and "the person is in the order area".
- the execution task selection unit 171 selects the order promotion task when the events at that time are "a person enters a specific area" and "the person is in the order area".
- the analysis unit 121 determines whether "the person is in the order area" by using the information indicating the position of the person. For example, if a person is in Zone 1 shown in FIG. 6, the analysis unit 121 determines that "the person is in the order area".
- detailed utterance contents and operation information are associated with the order promotion task, and when the order promotion task is executed, an action is performed according to these utterance contents and operation information.
- the utterance content at this time is, for example, an utterance prompting the customer to order a product or an utterance recommending an additional product. These utterance contents are stored in the database 131 in advance.
- the customer satisfaction improvement task is associated with the events "a person enters a specific area" and "the person is outside the order area".
- the execution task selection unit 171 selects the customer satisfaction improvement task when the events at that time are "a person enters a specific area" and "the person is outside the order area".
- the analysis unit 121 determines whether "the person is outside the order area" by using the information indicating the position of the person. For example, if a person is in Zone 2 or Zone 3 shown in FIG. 6, the analysis unit 121 determines that the person is outside the order area.
- detailed utterance content and operation information are associated with the customer satisfaction improvement task, and when the customer satisfaction improvement task is executed, actions are performed according to these utterance content and operation information.
- the utterance contents at this time are, for example, utterances for improving the smile rate and utterances for improving the Repeat rate. These utterance contents are stored in the database 131 in advance.
- the customer attraction task is associated with the events "the number of remaining orders has become zero" and "there are no people around, nor people with a high reaction rate".
- the analysis unit 121 determines whether "the number of remaining orders is zero" by using the information indicating the operating status of the system in the environment information. The analysis unit 121 also determines whether "there are no people around", for example, based on whether any people are present in Zones 0 to 3 shown in FIG. 6. Further, detailed utterance contents and operation information are associated with the customer attraction task, and when the customer attraction task is executed, actions are performed according to these utterance contents and operation information.
- the motion information at this time is, for example, information for executing a flashy robot operation for attracting customers in accordance with music. This operation information is stored in the database 131 in advance.
- the pinpoint calling task is associated with the events "the number of remaining orders has become zero" and "there are people with a high reaction rate".
- the execution task selection unit 171 selects the pinpoint calling task when the events at that time are "the number of remaining orders has become zero" and "there are people with a high reaction rate".
- the analysis unit 121 determines whether "the number of remaining orders is zero" by using the information indicating the operating status of the system in the environment information.
- the event "there is a person with a high reaction rate" means, for example, that a person included in Zones 0 to 3 shown in FIG. 6 shows a facial expression or movement indicating interest in ordering, as determined by the analysis unit 121. Further, detailed utterance contents and operation information are associated with the pinpoint calling task, and when the pinpoint calling task is executed, actions are performed according to these utterance contents and operation information.
- the utterance content and motion information at this time are, for example, for executing utterances and motions that easily attract a specific person.
- the utterance content and motion information are stored in the database 131 in advance.
- a priority is also assigned to each execution task, as shown in FIG. 7.
- when a task with a priority higher than that of the currently executing task is selected, the current process is interrupted by the higher-priority task. This is similar to interrupt processing when processes are executed sequentially.
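The event-to-task selection with priority interruption can be sketched as a table lookup. This is an illustrative assumption, not the patent's implementation; the task names, event labels, and priority values are invented for the example:

```python
# Illustrative sketch of execution task selection (FIG. 7 style):
# each task fires when all of its triggering events hold, and a running
# task is interrupted only by a strictly higher-priority task.

# (task name, triggering events, priority) — lower number = higher priority.
TASK_TABLE = [
    ("cooking",               {"order is placed"},                                          1),
    ("order promotion",       {"person enters specific area", "person in order area"},      2),
    ("customer satisfaction", {"person enters specific area", "person outside order area"}, 3),
    ("pinpoint calling",      {"remaining orders zero", "high-reaction person present"},    4),
    ("customer attraction",   {"remaining orders zero", "no people around"},                5),
]

def select_task(events, current=None):
    """Pick the highest-priority task whose events all hold; interrupt the
    current task only if the new task has strictly higher priority."""
    candidates = [t for t in TASK_TABLE if t[1] <= events]  # all events satisfied
    if not candidates:
        return current
    best = min(candidates, key=lambda t: t[2])
    if current is None:
        return best
    return best if best[2] < current[2] else current
```

For example, if the customer attraction task is running and the event "order is placed" occurs, `select_task` returns the cooking task, interrupting the lower-priority task.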
- FIG. 8 is a diagram showing an example of a software configuration in the information processing system shown in FIG.
- the information processing apparatus 101 shown in FIG. 5 can realize its operation using software having the configuration shown in FIG.
- the image recognition unit performs person recognition, person position detection, and facial expression recognition on the image captured by the camera. Further, the image recognition unit stores information regarding the recognized person in the person position/expression/relationship/attribute database.
- the relationship is a relationship between a plurality of persons included in the image captured by the camera, and is information indicating, for example, a parent and child, a friend, and the like.
- the attribute is information indicating the characteristics of the person, such as the sex and age group of the person, height, clothes, race, and language used.
- This image recognition unit can be realized by the analysis unit 121 shown in FIG.
- the image recognition unit uses the area definition data to detect the position of the person.
- the area definition data may be, for example, the data described with reference to FIG. 6 or data that defines coordinates at each position in the area and uses the image captured by the camera and the defined coordinates.
- the order management unit, which manages information input from the order terminal where a user inputs an order, receives the order, associates the person (the user who made the input) with the order content, and confirms the order status.
- the order management unit manages orders by reading out necessary information from the person position/facial expression/relationship/attribute database and writing necessary data in the person position/facial expression/relationship/attribute database.
- the event detection unit detects an event that triggers processing, based on the person information and environment information stored in the person position/facial expression/relationship/attribute database and on the orders accepted by the order management unit, and selects and activates the task to be executed. When an execution task is selected, the task is switched among the cooking task, the customer attraction task, the order task, and the customer satisfaction improvement task.
- the utterance system reinforcement learning unit performs state observation, reward calculation, utterance value function updating, and selection of the utterance target person and utterance content, based on the person information and environment information stored in the person position/facial expression/relationship/attribute database.
- the utterance system reinforcement learning unit selects the utterance content from the utterance data stored in advance. Further, the utterance system reinforcement learning unit performs the above-described processing using a database that stores the utterance system learning result.
- the utterance content/target determination unit determines, as the utterance content and the target person, the utterance target/utterance content selected by the utterance system reinforcement learning unit according to the task to be executed.
- the voice synthesis unit synthesizes the utterance content determined by the utterance content/target determination unit as a voice and outputs the voice to the speaker.
- the action system reinforcement learning unit uses the action system learning result data to perform state observation, reward calculation, action value function update, and action selection.
- the action system reinforcement learning unit selects an action from action data stored in advance.
- the motion determination unit determines, as the motion to be executed, one of the motions selected by the motion system reinforcement learning unit according to the task to be executed.
- the motion instruction unit instructs the robot to perform the motion determined by the motion determination unit.
- FIG. 9 is a diagram showing an example of associations stored in the person position/facial expression/relationship/attribute database shown in FIG.
- a person number is given to each person included in the image captured by the camera, and each data is registered for each person number.
- the items person position area type, person position coordinates, and certainty of person position are information related to the position of the person.
- the person position area type is, for example, an area type such as the ordering place, the product serving place, the area adjacent to the store, or the area around the store, as described with reference to FIG. 6.
- the certainty of the person position is calculated based on the position of the camera, the characteristics of the camera, the position specifying algorithm, and the like.
- the customer status is information indicating whether the person included in the image captured by the camera is a customer, a prospective customer, a potential customer, an onlooker, or a passerby.
- This is the result of the analysis unit 121 performing face authentication of the person, analysis of facial expressions, and analysis based on positions and movements.
- the relationship with another person's number is information indicating the relationship, such as parent, child, friend, or lover, with another person included in the image together with the person.
- the customer past order count and the customer past order content are information indicating what the person ordered in the past when, as a result of the analysis unit 121 analyzing the person in the image captured by the camera, the person is found to be a customer with a registered ID. Alternatively, they may be obtained based on information read from a membership card when the person has the system read the card. These are registered in the database at the time of ordering.
- by assigning a customer ID to a customer who ordered a product and registering the ordered products and the number of orders, it is possible to learn the customer's preferences and to make utterances and actions that recommend products or guide the customer when the customer visits the store again.
- FIG. 10 is a diagram showing an example of the types of utterance data shown in FIG. 8 and indexes to be improved by the utterance data.
- FIGS. 11A and 11B are diagrams showing an example of information registered as the utterance data shown in FIG. 8. These pieces of information are stored in the database 131 shown in FIG.
- the utterance data is composed of the following items:
- Utterance content number
- Utterance content type (greetings, soliloquy, current topics, product conversation, personal identification, customer compliments, topics for foreigners, upsell topics)
- Utterance firing conditions (store chain, ordered product type, cooking stage, location category, congestion status, time zone, season, weather/temperature/humidity, special event, person position area, customer status, relationship with others, race/language, gender, age group, facial expression, clothing, height)
- Utterance content body (replacement words can be described as variables within the content)
- Facial expression during utterance
- Motion during utterance
- FIG. 10 shows an arrangement of which utterance should be made to which utterance target, according to the result of the analysis unit 121 analyzing the image captured by the camera 111 and the current operating status of the system. FIG. 10 also shows which evaluation result each piece of utterance data improves.
- the evaluation result indicates the result of the utterance, and indicates the degree of change in the call-in rate, order rate, Upsell rate, smile rate, and Repeat rate. For example, when the robot puts a product at the serving position, when a customer takes a product away from the serving position, or when the robot talks to an individual person, the guideline is that an utterance that improves the Repeat rate should be made (indicated by a circle in FIG. 10).
- More specifically, as shown in FIGS. 11A and 11B, the utterance data stores a plurality of specific utterance contents to be uttered based on the result of the analysis unit 121 analyzing the image captured by the camera 111 and the current operating status of the system.
- Some of the utterance contents may be stored in foreign languages other than Japanese, such as English, Chinese, or Korean.
- the plurality of utterance contents are stored so that they can be selected according to various attributes of the target person. One of these plural utterance contents is selected according to the evaluation result; that is, the one with the highest evaluation is selected.
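The selection rule above — for each combination of person information and environment information, choose the utterance content with the highest score — can be sketched as an argmax over a score table. The table contents and keys are illustrative assumptions, not data from the patent:

```python
# Hypothetical sketch of utterance selection: the utterance content associated
# with the largest first score for a (person information, environment
# information) combination is chosen. The example entries are invented.

from collections import defaultdict

# score_table[(person_info, env_info)][utterance_content] = first score
score_table = defaultdict(dict)
score_table[("adult, Zone1", "no queue")] = {
    "Welcome! What would you like to order?": 0.7,
    "Our seasonal special is very popular today!": 0.9,
}

def select_utterance(person_info, env_info):
    """Return the utterance content with the largest first score for this
    combination, or None if the combination is unknown."""
    scores = score_table.get((person_info, env_info))
    if not scores:
        return None
    return max(scores, key=scores.get)
```

The same argmax applies to motion information via the second score.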
- the utterance system learning result data output as a result of utterance includes the following items.
- Utterance value function learning result data
- Data required for batch learning at multiple stores (utterance firing conditions, utterance content number, utterance content body, utterance content replacement words, utterance reaction result)
- These undergo reinforcement learning based on changes in the sales content and the facial expression of the target person after the utterance is made.
- the operation data is composed of the following items:
- Operation content number
- Operation type (cooking operation, customer attraction operation, customer service operation)
- Operation firing conditions (store chain, ordered product type, cooking stage, location category, congestion status, time zone, season, weather/temperature/humidity, special event)
- Operation content body
- Music content body
- Total playback time
- Maximum non-interruptible time
- Facial expression during operation
- the motion content number is, for example, a number relating to data for moving the arm of the robot 201.
- if the robot 201 has a function of expressing a facial expression on its face (for example, a display that shows a face image), the facial expression during operation is information indicating the facial expression displayed on that display.
- the facial expression during operation may display, for example, the following facial expressions:
- A piercing gaze when making an utterance addressed to a specific person
- An excited expression while waiting for an order
- An expression of gratitude when receiving an order
- A brisk, energetic expression while cooking
- A nihilistic expression when uttering a soliloquy
- A soothing expression when calling a customer to hand over a product
- the motion system learning result data output as a result of the motion is composed of the following items.
- Motion value function learning result data
- Data required for batch learning at multiple stores (motion firing conditions, motion content number, motion content body, music content body, motion reaction result)
- These undergo reinforcement learning based on changes in the sales content and the facial expression of the target person after the motion is performed.
- FIG. 12 is a flowchart for explaining an example of the information processing method in the information processing system shown in FIG.
- the camera 111 captures an image and transmits the captured image to the information processing apparatus 101.
- the analysis unit 121 analyzes person information, including the person position, regarding the person included in the image transmitted from the camera 111, and stores the analyzed person information in the database 130 (step S21). For example, the analysis unit 121 analyzes which of a plurality of areas as shown in FIG. 6 the target person is in, and from which area to which area the person is moving.
- the execution task selection unit 171 selects the execution task 191 based on the personal information and the environment information (task firing condition) stored in the database 130 (step S22).
- the operating status of the system included in the environment information at this time indicates, for example, a state where cooking is in progress, a state where the store is crowded, a state where the number of remaining orders is zero, or a state where no customer is being served.
- the execution task selection unit 171 selects a task to be executed.
- the utterance control unit 141 selects the utterance content according to the score output by the utterance system reinforcement learning unit 1511 according to the execution task 191 selected and activated by the execution task selection unit 171.
- the utterance control unit 141 selects the utterance content having the highest score output from the utterance system reinforcement learning unit 1511 for the combination of the person information and the environment information stored in the database 130.
- the motion control unit 161 selects the motion information according to the score output by the motion system reinforcement learning unit 1512 according to the execution task 191 selected and activated by the execution task selection unit 171.
- the motion control unit 161 selects, for the combination of the person information and the environment information stored in the database 130, the motion information having the largest score output by the motion system reinforcement learning unit 1512 (step S23).
- the utterance control unit 141 transmits the selected utterance content to the voice output unit 211, and instructs the voice output unit 211 to utter. Further, the operation control unit 161 transmits the selected operation information to the operation execution unit 221, and instructs the operation execution unit 221 to perform an operation. Then, the voice output unit 211 performs the instructed utterance, and the action execution unit 221 performs the instructed action (step S24).
- the utterance system reinforcement learning unit 1511 and the motion system reinforcement learning unit 1512 perform reinforcement learning based on changes in the ordered products, the sales content, the facial expression of the target person, and the like observed in response to the utterance and the motion, and update the scores (step S25). For example, when the sales amount increases, the utterance system reinforcement learning unit 1511 and the motion system reinforcement learning unit 1512 raise the scores of the utterance content and motion information that were performed. When the sales amount decreases, they lower those scores. This score may be called a "reward" in reinforcement learning.
- the store status, the customer status, the attributes of the utterance target, and the utterance content and motion content at the time the utterance and motion were performed are set as the learning state, and the value function relating to the utterance content and motion information is updated according to a reward value calculated based on the target person's reaction to the utterance, the target person's reaction to the motion, and the change in product sales.
- the reinforcement learning may be carried out across multiple stores. That is, results learned based on customer reactions may be shared among a plurality of stores.
- the management system shown in FIG. 8 may manage the results learned in a plurality of stores as a data group and share the managed learning results in a plurality of stores.
- the unit of learning described above may be each product, each store, or each location area.
- as described above, utterances and motions are performed according to the person information and environment information of the person included in the image captured by the camera, learning is performed based on the results, and the scores of the utterance contents and motion information are updated. In other words, the system learns what kind of utterance and motion should be performed for what kind of person in what kind of situation in order to serve customers efficiently, depending on the person imaged and the environment. Therefore, flexible customer service can be provided.
- although each function (process) is assigned to a component as described above, this assignment is not limited to the above. The configuration of the components described above is merely an example, and the present invention is not limited to this.
- 101 information processing apparatus
- 110 imaging unit
- 111, 111-1 to 111-3 camera
- 120, 121 analysis unit
- 130, 131 database
- 140 utterance unit
- 141 utterance control unit
- 150 reinforcement learning unit
- 161 operation control unit
- 171 execution task selection unit
- 181 input unit
- 191 execution task
- 201 robot
- 211 voice output unit
- 221 operation execution unit
- 1501 reward calculation unit
- 1502 update unit
- 1503 value function calculation unit
- 1511 utterance system reinforcement learning unit
- 1512 operation system reinforcement learning unit
Abstract
Description
an imaging unit;
an analysis unit that analyzes person information regarding a person included in an image captured by the imaging unit;
a database that stores the person information and environment information indicating the environment in which the information processing system is installed;
an utterance unit that utters utterance content corresponding to the person information and the environment information; and
a reinforcement learning unit that reads the person information and the environment information from the database and updates, for each combination of the read person information and environment information, a first score corresponding to the utterance content by performing learning based on result information indicating the result of the utterance performed by the utterance unit,
wherein the utterance unit utters the utterance content associated with the first score having the largest value among the first scores for the combination.
The information processing apparatus includes:
an analysis unit that analyzes person information regarding a person included in an image captured by the camera;
a database that stores the person information and environment information indicating the environment in which the information processing system is installed;
an utterance control unit that instructs the robot to utter utterance content corresponding to the person information and the environment information; and
a reinforcement learning unit that reads the person information and the environment information from the database and updates, for each combination of the read person information and environment information, a first score corresponding to the utterance content by performing learning based on result information indicating the result of the utterance instructed by the utterance control unit,
wherein the utterance control unit instructs the robot to utter the utterance content associated with the first score having the largest value among the first scores for the combination, and
the robot includes
a voice output unit that outputs a voice indicated by the utterance content instructed by the utterance control unit.
An information processing method in an information processing system, the method comprising:
a process of analyzing person information regarding a person included in an image captured by a camera;
a process of reading the person information and environment information from a database that stores the person information and the environment information indicating the environment in which the information processing system is installed;
a process of uttering the utterance content associated with the first score having the largest value among first scores for the combination of the read person information and environment information; and
a process of performing learning based on result information indicating the result of the performed utterance and updating the first score.
(First Embodiment)
The imaging unit 110 images a target person.
The analysis unit 120 analyzes person information regarding a person included in the image captured by the imaging unit 110.
The database 130 stores the person information and environment information indicating the environment in which the information processing system is installed.
The reinforcement learning unit 150 reads the person information and the environment information from the database 130. For each combination of the read person information and environment information, the reinforcement learning unit 150 updates a first score corresponding to the utterance content by performing learning based on result information indicating the result of the utterance performed by the utterance unit 140.
(Second Embodiment)
The camera 111 is an imaging unit that images a target person. The camera 111 may capture still images or moving images, or may be a camera with a built-in depth sensor capable of acquiring depth information. The timing at which the camera 111 captures images is not particularly specified. The camera 111 is installed at a position from which the position of a customer relative to the position where products are provided can be recognized from the captured image. The number of cameras 111 is not limited to one. The camera 111 may also be one whose imaging direction can be freely changed under external control.
The information processing apparatus 101 is connected to the camera 111 and the robot 201 and controls them. For example, the information processing apparatus 101 may be a PC (Personal Computer) capable of executing software.
The robot 201 outputs a predetermined voice or performs a predetermined operation based on an instruction from the information processing apparatus 101. As the predetermined operation, the robot 201 can, for example, cook or dance.
Zone 0: Area around the store. Passing customers and interested customers are mixed.
Zone 1: Ordering place. Many customers order products here.
Zone 2: Area adjacent to the store. Many customers wait here for their products to be completed after ordering.
Zone 3: Product serving place. Many customers pick up their finished products here.
These areas are defined, and the correspondence between the defined areas and actions (utterance, motion) is registered in the database 131 in advance. For example, by associating Zone 0 with utterance content that calls people into the store, it can be determined that, for a person present in Zone 0, the system takes the action of uttering a call-in or performing a customer-attracting motion. Similarly, by associating Zone 1 with utterance content that asks which product to order, the system can take the action of uttering or moving to ask a person present in Zone 1 which product to order. In this way, an appropriate action can be prepared for the target person according to the area. The boundary of each area is specified using, for example, the coordinates of its four vertices. Note that the cameras 111-1 to 111-3 and Zones 0 to 3 do not necessarily have to be associated with each other. For example, the cameras 111-2 and 111-3 may both image a customer present in Zone 2, and person information such as the position of the customer may be analyzed from the images captured by the two cameras.
- Utterance content number
- Utterance content type (greetings, soliloquy, current topics, product conversation, personal identification, customer compliments, topics for foreigners, upsell topics)
- Utterance firing conditions (store chain, ordered product type, cooking stage, location category, congestion status, time zone, season, weather/temperature/humidity, special event, person position area, customer status, relationship with others, race/language, gender, age group, facial expression, clothing, height)
- Utterance content body (replacement words can be described as variables within the content)
- Facial expression during utterance
- Motion during utterance
- Utterance value function learning result data
- Data required for batch learning at multiple stores (utterance firing conditions, utterance content number, utterance content body, utterance content replacement words, utterance reaction result)
These undergo reinforcement learning based on changes in the sales content and the facial expression of the target person after the utterance is made.
- Motion content number
- Motion type (cooking motion, customer attraction motion, customer service motion)
- Motion firing conditions (store chain, ordered product type, cooking stage, location category, congestion status, time zone, season, weather/temperature/humidity, special event)
- Motion content body
- Music content body
- Total playback time
- Maximum non-interruptible time
- Facial expression during motion
The motion content number is, for example, a number relating to data for moving the arm of the robot 201. If the robot 201 has a function for expressing a facial expression on its face (for example, a display that shows a face image), the facial expression during motion is information indicating the facial expression to be shown on that display. The facial expression during motion may, for example, display the following facial expressions.
- A piercing gaze when making an utterance addressed to a specific person
- An excited expression while waiting for an order
- An expression of gratitude when receiving an order
- A brisk, energetic expression while cooking
- A nihilistic expression when uttering a soliloquy
- A soothing expression when calling a customer to hand over a product
- Motion value function learning result data
- Data required for batch learning at multiple stores (motion firing conditions, motion content number, motion content body, music content body, motion reaction result)
These undergo reinforcement learning based on changes in the sales content and the facial expression of the target person after the motion is performed.
110 imaging unit
111, 111-1 to 111-3 camera
120, 121 analysis unit
130, 131 database
140 utterance unit
141 utterance control unit
150 reinforcement learning unit
161 operation control unit
171 execution task selection unit
181 input unit
191 execution task
201 robot
211 voice output unit
221 operation execution unit
1501 reward calculation unit
1502 update unit
1503 value function calculation unit
1511 utterance system reinforcement learning unit
1512 operation system reinforcement learning unit
Claims (8)
- An information processing system comprising:
an imaging unit;
an analysis unit that analyzes person information regarding a person included in an image captured by the imaging unit;
a database that stores the person information and environment information indicating the environment in which the information processing system is installed;
an utterance unit that utters utterance content corresponding to the person information and the environment information; and
a reinforcement learning unit that reads the person information and the environment information from the database and updates, for each combination of the read person information and environment information, a first score corresponding to the utterance content by performing learning based on result information indicating the result of the utterance performed by the utterance unit,
wherein the utterance unit utters the utterance content associated with the first score having the largest value among the first scores for the combination. - The information processing system according to claim 1, comprising
an execution task selection unit that selects and activates a task to be executed based on a task firing condition using the person information and the operating status of the information processing system among the environment information,
wherein the utterance unit operates according to the task activated by the execution task selection unit. - The information processing system according to claim 1, comprising
an operation unit that performs a predetermined operation,
wherein the database further stores, in association, operation information indicating the content of the operation,
the operation unit performs the operation indicated by the operation information associated with the second score having the largest value among second scores for the combination, and
the reinforcement learning unit performs learning based on result information indicating the result of the operation performed by the operation unit and updates the second score. - The information processing system according to claim 3, comprising
an execution task selection unit that selects and activates a task to be executed based on a task firing condition indicating the operating status of the information processing system among the environment information,
wherein the operation unit operates according to the task activated by the execution task selection unit. - The information processing system according to any one of claims 1 to 4, comprising
an input unit that inputs information,
wherein the result information includes at least one of person information regarding a person included in an image captured by the imaging unit after the utterance is made, and sales information indicating the content of sales of a product sold based on input to the input unit. - The information processing system according to any one of claims 1 to 5,
wherein the environment information includes at least one of the date and time at which the imaging unit captured the image and the processing load status of the information processing system. - An information processing system comprising a camera, a robot, and an information processing apparatus,
wherein the information processing apparatus includes:
an analysis unit that analyzes person information regarding a person included in an image captured by the camera;
a database that stores the person information and environment information indicating the environment in which the information processing system is installed;
an utterance control unit that instructs the robot to utter utterance content corresponding to the person information and the environment information; and
a reinforcement learning unit that reads the person information and the environment information from the database and updates, for each combination of the read person information and environment information, a first score corresponding to the utterance content by performing learning based on result information indicating the result of the utterance instructed by the utterance control unit,
wherein the utterance control unit instructs the robot to utter the utterance content associated with the first score having the largest value among the first scores for the combination, and
the robot includes
a voice output unit that outputs a voice indicated by the utterance content instructed by the utterance control unit. - An information processing method in an information processing system, the method comprising:
a process of analyzing person information regarding a person included in an image captured by a camera;
a process of reading the person information and environment information from a database that stores the person information and the environment information indicating the environment in which the information processing system is installed;
a process of uttering the utterance content associated with the first score having the largest value among first scores for the combination of the read person information and environment information; and
a process of performing learning based on result information indicating the result of the performed utterance and updating the first score.
Priority Applications (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020217002686A KR20210027396A (ko) | 2019-02-25 | 2019-02-25 | Information processing system and information processing method |
PCT/JP2019/007075 WO2020174537A1 (ja) | 2019-02-25 | 2019-02-25 | Information processing system and information processing method |
CN201980052330.6A CN112585642A (zh) | 2019-02-25 | 2019-02-25 | Information processing system and information processing method |
JP2019536236A JP6667766B1 (ja) | 2019-02-25 | 2019-02-25 | Information processing system and information processing method |
EP19916903.8A EP3806022A4 (en) | 2019-02-25 | 2019-02-25 | Information processing system and information processing method |
US17/257,425 US20210402611A1 (en) | 2019-02-25 | 2019-02-25 | Information processing system and information processing method |
TW108134405A TWI717030B (zh) | 2019-02-25 | 2019-09-24 | Information processing system and information processing method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2019/007075 WO2020174537A1 (ja) | 2019-02-25 | 2019-02-25 | Information processing system and information processing method |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020174537A1 (ja) | 2020-09-03 |
Family
ID=70000623
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2019/007075 WO2020174537A1 (ja) | 2019-02-25 | 2019-02-25 | Information processing system and information processing method |
Country Status (7)
Country | Link |
---|---|
US (1) | US20210402611A1 (ja) |
EP (1) | EP3806022A4 (ja) |
JP (1) | JP6667766B1 (ja) |
KR (1) | KR20210027396A (ja) |
CN (1) | CN112585642A (ja) |
TW (1) | TWI717030B (ja) |
WO (1) | WO2020174537A1 (ja) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11645498B2 (en) * | 2019-09-25 | 2023-05-09 | International Business Machines Corporation | Semi-supervised reinforcement learning |
DE102022121132A1 (de) | 2022-08-22 | 2024-02-22 | Dr. Ing. H.C. F. Porsche Aktiengesellschaft | Method for developing a technical component |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2006012171A (ja) * | 2004-06-24 | 2006-01-12 | Hitachi Ltd | Review management system and management method using biometric recognition |
WO2016194173A1 (ja) * | 2015-06-03 | 2016-12-08 | 株式会社日立システムズ | Support assistance system, support assistance method, and support assistance program |
JP2018027613A (ja) * | 2016-08-10 | 2018-02-22 | パナソニックIpマネジメント株式会社 | Customer service device, customer service method, and customer service system |
JP2018084998A (ja) | 2016-11-24 | 2018-05-31 | 田嶋 雅美 | Customer service system and customer service method |
JP2019018265A (ja) * | 2017-07-13 | 2019-02-07 | 田嶋 雅美 | Customer service system |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7319780B2 (en) * | 2002-11-25 | 2008-01-15 | Eastman Kodak Company | Imaging method and system for health monitoring and personal security |
JP2005157494A (ja) * | 2003-11-20 | 2005-06-16 | Aruze Corp | Conversation control device and conversation control method |
US7949529B2 (en) * | 2005-08-29 | 2011-05-24 | Voicebox Technologies, Inc. | Mobile systems and methods of supporting natural language human-machine interactions |
US8793119B2 (en) * | 2009-07-13 | 2014-07-29 | At&T Intellectual Property I, L.P. | System and method for generating manually designed and automatically optimized spoken dialog systems |
JP2011186351A (ja) * | 2010-03-11 | 2011-09-22 | Sony Corp | Information processing apparatus, information processing method, and program |
US9956687B2 (en) * | 2013-03-04 | 2018-05-01 | Microsoft Technology Licensing, Llc | Adapting robot behavior based upon human-robot interaction |
JP5704279B1 (ja) * | 2014-10-14 | 2015-04-22 | 富士ゼロックス株式会社 | Association program and information processing apparatus |
US9818126B1 (en) * | 2016-04-20 | 2017-11-14 | Deep Labs Inc. | Systems and methods for sensor data analysis through machine learning |
JP2018067100A (ja) * | 2016-10-18 | 2018-04-26 | 株式会社日立製作所 | Robot dialogue system |
US10289076B2 (en) * | 2016-11-15 | 2019-05-14 | Roborus Co., Ltd. | Concierge robot system, concierge service method, and concierge robot |
JP6642401B2 (ja) * | 2016-12-09 | 2020-02-05 | トヨタ自動車株式会社 | Information providing system |
US11170768B2 (en) * | 2017-04-17 | 2021-11-09 | Samsung Electronics Co., Ltd | Device for performing task corresponding to user utterance |
2019
- 2019-02-25 KR KR1020217002686A patent/KR20210027396A/ko unknown
- 2019-02-25 CN CN201980052330.6A patent/CN112585642A/zh active Pending
- 2019-02-25 EP EP19916903.8A patent/EP3806022A4/en not_active Withdrawn
- 2019-02-25 JP JP2019536236A patent/JP6667766B1/ja active Active
- 2019-02-25 WO PCT/JP2019/007075 patent/WO2020174537A1/ja unknown
- 2019-02-25 US US17/257,425 patent/US20210402611A1/en not_active Abandoned
- 2019-09-24 TW TW108134405A patent/TWI717030B/zh not_active IP Right Cessation
Non-Patent Citations (1)
Title |
---|
See also references of EP3806022A4 |
Also Published As
Publication number | Publication date |
---|---|
EP3806022A1 (en) | 2021-04-14 |
CN112585642A (zh) | 2021-03-30 |
EP3806022A4 (en) | 2022-01-12 |
TW202032491A (zh) | 2020-09-01 |
JP6667766B1 (ja) | 2020-03-18 |
KR20210027396A (ko) | 2021-03-10 |
JPWO2020174537A1 (ja) | 2021-03-11 |
TWI717030B (zh) | 2021-01-21 |
US20210402611A1 (en) | 2021-12-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11089985B2 (en) | Systems and methods for using mobile and wearable video capture and feedback platforms for therapy of mental disorders | |
CN110249360B (zh) | Apparatus and method for recommending products | |
CN110447232B (zh) | Electronic device for determining a user's emotion and control method thereof | |
US20190236361A1 (en) | Systems and methods for using persistent, passive, electronic information capturing devices | |
US11954150B2 (en) | Electronic device and method for controlling the electronic device thereof | |
CN110249304A (zh) | Visual intelligence management of electronic devices | |
EP3820369B1 (en) | Electronic device and method of obtaining emotion information | |
US20030039379A1 (en) | Method and apparatus for automatically assessing interest in a displayed product | |
JP7151959B2 (ja) | Video alignment method and apparatus | |
JP7057077B2 (ja) | Purchasing support system | |
CN109933782A (zh) | User emotion prediction method and apparatus | |
WO2020174537A1 (ja) | Information processing system and information processing method | |
KR20190076870A (ko) | Method and device for recommending contact information | |
JP2020091736A (ja) | Program, information processing apparatus, and information processing method | |
KR102586170B1 (ko) | Electronic device and method for providing search results thereof | |
JP2019045978A (ja) | Dialogue control device, learning device, dialogue control method, learning method, control program, and recording medium | |
JP5910249B2 (ja) | Interaction device and interaction control program | |
CN106951433A (zh) | Retrieval method and apparatus | |
JP2020091824A (ja) | Program, information processing apparatus, and information processing method | |
KR102290855B1 (ko) | Digital signage system | |
JP6972526B2 (ja) | Content providing device, content providing method, and program | |
WO2021095473A1 (ja) | Information processing device, information processing method, and computer program | |
Siqueira | An adaptive neural approach based on ensemble and multitask learning for affect recognition | |
KR20200069251A (ko) | Electronic device providing an interactive game and operating method thereof | |
KR102643720B1 (ко) | Artificial intelligence interface system for robots | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
ENP | Entry into the national phase |
Ref document number: 2019536236 Country of ref document: JP Kind code of ref document: A |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19916903 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 20217002686 Country of ref document: KR Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 2019916903 Country of ref document: EP Effective date: 20210106 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |