WO2024018545A1 - Generation program, generation method, and information processing device - Google Patents
Generation program, generation method, and information processing device
- Publication number
- WO2024018545A1 (PCT/JP2022/028127)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- class
- questionnaire
- video data
- person
- relationship
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
- G06Q30/0203—Market surveys; Market polls
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/776—Validation; Performance evaluation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/23—Recognition of whole body movements, e.g. for sport training
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Definitions
- the present invention relates to a generation program, a generation method, and an information processing device.
- Conventionally, questionnaire responses are compiled into a database by placing questionnaire forms on a table or the like in the store, or by sending the questionnaire to users at a later date.
- One aspect of the present invention is to provide a generation program, a generation method, and an information processing device that can reduce the amount of processing required to construct a database.
- According to one aspect, the generation program causes a computer to execute a process of acquiring video data, inputting the acquired video data into a machine learning model to identify the class of human behavior included in the video data and the confidence level of that class, and generating question information related to the identified class based on the identified confidence level.
- the amount of processing required for database construction can be reduced.
- FIG. 1 is a diagram showing an example of the overall configuration of an information processing system according to a first embodiment.
- FIG. 2 is a diagram illustrating the reference technology.
- FIG. 3 is a diagram illustrating the information processing apparatus according to the first embodiment.
- FIG. 4 is a functional block diagram showing the functional configuration of the information processing apparatus according to the first embodiment.
- FIG. 5 is a diagram illustrating the customer DB.
- FIG. 6 is a diagram illustrating the questionnaire DB.
- FIG. 7 is a diagram illustrating the analysis result DB.
- FIG. 8 is a diagram illustrating training data.
- FIG. 9 is a diagram illustrating machine learning of a relationship model.
- FIG. 10 is a diagram illustrating generation of an action recognition model.
- FIG. 11 is a diagram illustrating the identification of relationships.
- FIG. 12 is a diagram illustrating the identification of relationships using HOID.
- FIG. 13 is a diagram illustrating behavior recognition.
- FIG. 14 is a diagram illustrating generation and transmission of a questionnaire.
- FIG. 15 is a diagram illustrating registration of analysis results.
- FIG. 16 is a flowchart showing the flow of processing according to the first embodiment.
- FIG. 17 is a diagram showing an example of a scene graph.
- FIG. 18 is a diagram illustrating an example of generating a scene graph showing relationships between people and objects.
- FIG. 19 is a diagram illustrating specifying relationships using a scene graph.
- FIG. 20 is a diagram illustrating the behavior recognition model according to the third embodiment.
- FIG. 21 is a diagram illustrating machine learning of the behavior recognition model according to the third embodiment.
- FIG. 22 is a diagram illustrating sending a questionnaire using the behavior recognition model according to the third embodiment.
- FIG. 23 is a diagram illustrating questionnaire transmission according to the fourth embodiment.
- FIG. 24 is a diagram illustrating a specific example of sending a questionnaire according to the fourth embodiment.
- FIG. 25 is a flowchart showing the flow of processing according to the fourth embodiment.
- FIG. 26 is a diagram illustrating a signage questionnaire display example according to the fourth embodiment.
- FIG. 27 is a diagram illustrating an example of the hardware configuration of the information processing device.
- FIG. 28 is a diagram illustrating an example of the hardware configuration of signage.
- FIG. 1 is a diagram showing an example of the overall configuration of an information processing system according to a first embodiment.
- As shown in FIG. 1, this information processing system includes a store 1, which is an example of a space having an area where products (examples of objects) are placed, a plurality of cameras 2, each installed at a different location within the store 1, and an information processing device 10 that analyzes video data, all connected via the network N.
- the network N can be any of various communication networks, such as the Internet or a dedicated line, regardless of whether it is wired or wireless.
- the store 1 is, for example, a supermarket or a convenience store, where products purchased by customers 5 are displayed, and a self-checkout system using electronic payment is used, for example.
- An example of the store 1 is assumed to be an unmanned store where customers 5 are registered in advance and only registered customers 5 can use the store.
- For example, the customer 5 accesses the home page of the operator of the store 1 and registers a name, age, contact information (e.g., an e-mail address), and payment method (e.g., a credit card number).
- the customer 5 can enter the store 1 using the user ID and password issued after registration and the store entry card, and make a purchase by paying using the registered payment method.
- Each of the plurality of cameras 2 is an example of a surveillance camera that images a predetermined area within the store 1, and transmits data of the captured video to the information processing device 10.
- In the following description, the data of the captured video may be referred to as "video data".
- the video data includes a plurality of time-series frames. A frame number is assigned to each frame in ascending chronological order.
- One frame is image data of a still image captured by the camera 2 at a certain timing.
- the information processing device 10 is an example of a computer device that has a customer DB storing information on the customers 5 permitted to enter the store 1, receives video data from the plurality of cameras 2, and collects various information for improving services for the customers 5. Name, age, contact information (e.g., an e-mail address), payment method (e.g., a credit card number), and the like are registered in the customer DB.
- FIG. 2 is a diagram illustrating the reference technology.
- In the reference technology, a store clerk 6 hands a questionnaire form to a customer 5 who purchased a product at the store 1, or to a customer 5 who entered the store 1 but left without purchasing a product, when the customer leaves the store.
- The customer 5 fills out the questionnaire sheet and returns it by mail or the like.
- A clerk 7 then tallies the questionnaire forms returned by each customer 5 and compiles them into a database. Based on the information compiled in the database in this way, the store considers the timing of sales staff greetings, the arrangement of products, the expansion of product lines, and so on.
- That is, the reference technology requires a great deal of processing, such as collecting, examining, and inputting questionnaire results, so a large amount of processing is needed to construct the database. In addition, as users want to compile more useful information into a database, the number of questionnaire items tends to increase, which increases the burden on the responding users, and many users do not respond to questionnaires.
- In contrast, the information processing device 10 recognizes relationships among people, objects, and the environment, as well as the actions and attributes of people, from video of the inside of the store 1, digitizes and analyzes the situation (context) of the sales floor, and thereby reduces the processing needed to compile such information into a database. Specifically, the information processing device 10 inputs video data of an area in the store 1 where products are placed into a machine learning model, thereby identifying the relationship between a specific user (customer 5) and a product in the behavior of the customer 5 toward the product included in the video data. Subsequently, the information processing device 10 acquires the psychological evaluation of the customer 5 regarding the product for which the relationship has been identified. Thereafter, the information processing device 10 registers the results related to the identified relationship and the psychological evaluation of the customer 5 in association with each other in a database of product analysis results stored in the storage unit.
- FIG. 3 is a diagram illustrating the information processing device 10 according to the first embodiment.
- For example, the information processing device 10 acquires video data captured inside the store 1, inputs each frame of the video data into the trained machine learning model, and identifies relationships between the customer 5 and products.
- For example, the information processing device 10 specifies whether or not a product was purchased, the time, the location, the behavior toward the product (for example, grasping), and the like.
- Next, based on the relationship between the customer 5 and the product, the information processing device 10 identifies items that could not be determined from the video as psychological evaluations, generates a questionnaire regarding those psychological evaluations, and sends it to the customer's terminal or the like. For example, the information processing device 10 sends a questionnaire to a customer 5 who did not purchase a product, asking why they did not purchase it.
- The information processing device 10 then associates the results identified from the video with the questionnaire results and compiles them into a database. For example, the information processing device 10 stores "age, gender, and whether or not the product was purchased" identified from the video in association with the questionnaire result "reason for not purchasing the product".
- In this way, the information processing device 10 can recognize customer behavior in real time from in-store video and the like, narrow down the target customers and the transmission timing, and send questionnaires automatically. The information processing device 10 can therefore acquire only effective questionnaire results and reduce the amount of processing required for database construction.
- FIG. 4 is a functional block diagram showing the functional configuration of the information processing device 10 according to the first embodiment.
- the information processing device 10 includes a communication unit 11, a storage unit 12, and a control unit 20.
- the communication unit 11 is a processing unit that controls communication with other devices, and is, for example, a communication interface.
- the communication unit 11 receives video data and the like from each camera 2, and outputs the processing results of the information processing device 10 to a pre-designated device or the like.
- the storage unit 12 stores various data and the programs executed by the control unit 20, and is realized by, for example, a memory or a hard disk.
- This storage unit 12 stores a customer DB 13, a questionnaire DB 14, a video data DB 15, a training data DB 16, a relationship model 17, a behavior recognition model 18, and an analysis result DB 19.
- the customer DB 13 is a database that stores information regarding the customer 5.
- the information stored here is information on the customer (user) 5 who visits the store 1 and wishes to purchase products, and is collected and registered by performing user registration before visiting the store.
- FIG. 5 is a diagram explaining the customer DB 13.
- the customer DB 13 stores "customer ID, name, age, gender, family structure, notification destination, number of visits to the store, card information" and the like.
- Customer ID is an identifier that identifies customer 5.
- "Name", "age", "gender", "family composition", and "card information" are information input by the customer 5 at the time of user registration, and the "number of visits" is counted each time the customer enters the store.
- the questionnaire DB 14 is a database that stores questionnaires to be sent to the customers 5.
- FIG. 6 is a diagram illustrating the questionnaire DB 14. As shown in FIG. 6, the questionnaire to be sent can include a plurality of question items in which questions (Q) are associated with selection items.
- For example, question 1 is a question item that inquires about the customer's age and gender, and "Female/Male" and "20s/30s/40s/50s/60s/70s or older" are prepared as answer options.
- question 3 is a question item that inquires about the type of purchased product, and "food/daily necessities/other" is prepared as an answer option.
- Each question can also be associated with one of the 5W1H (When, Where, Who, What, Why, How) indicating the intent of the question. For example, Q1 "Please tell us your age and gender" can be associated with "Who", and Q6 "Please tell us why you are dissatisfied with the service" can be associated with "Why".
- the video data DB 15 is a database that stores video data captured by each of the plurality of cameras 2 installed in the store 1. For example, the video data DB 15 stores video data for each camera 2 or for each captured time period.
- the training data DB 16 is a database that stores various training data used to generate various machine learning models described in the embodiments, including the relationship model 17, the behavior recognition model 18, and the like.
- the training data stored here can include supervised training data to which correct answer information is added and unsupervised training data to which correct answer information is not added.
- the relationship model 17 is an example of a machine learning model that identifies the relationship between a person and an object in the behavior of a specific user toward an object included in the video data.
- For example, the relationship model 17 is a model for HOID (Human Object Interaction Detection) generated by machine learning to identify relationships between people or between people and objects.
- When identifying the relationship between people, the relationship model 17 is a model for HOID that, in accordance with the input of frames in the video data, specifies and outputs a first class indicating a first person, first area information indicating the area where the first person appears, a second class indicating a second person, second area information indicating the area where the second person appears, and the relationship between the first class and the second class.
- When identifying the relationship between a person and an object, the relationship model 17 is a model for HOID that specifies and outputs a first class indicating a person, first area information indicating the area where the person appears, a second class indicating an object, second area information indicating the area where the object appears, and the relationship between the first class and the second class.
- the relationships shown here are just examples, and are not limited to simple relationships such as "holding"; they also include complex relationships such as "holding product A in the right hand", "putting product B back on the shelf", and "putting the product in the shopping cart".
- The two HOID models described above may be used separately as the relationship model 17, or one HOID model generated to identify both relationships between people and relationships between people and objects may be used as the relationship model 17.
- Note that although the relationship model 17 is generated by the control unit 20, which will be described later, a model generated in advance may be used.
- the behavior recognition model 18 is an example of a machine learning model that performs human skeleton estimation and behavior recognition from video data. Specifically, the behavior recognition model 18 outputs two-dimensional skeletal information and a behavior recognition result in response to the input of image data. For example, the behavior recognition model 18 is an example of a deep learning model that estimates the two-dimensional joint positions (skeletal coordinates) of the head, wrists, hips, ankles, and so on from two-dimensional image data of a person, and that recognizes basic movements and user-defined rules.
- the basic movements of a person can be recognized, and the position of the ankle, the direction of the face, and the direction of the body can be acquired.
- Basic movements include, for example, walking, running, and stopping.
- For example, the rules defined by the user include transitions of skeletal information corresponding to each action leading up to picking up a product. Note that although the behavior recognition model 18 is generated by the control unit 20, which will be described later, a model generated in advance may be used.
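As a rough illustration, the per-frame output of such a model could be held in a structure like the following Python sketch (the class and field names are assumptions for illustration, not part of this publication):

```python
from dataclasses import dataclass
from typing import Dict, Tuple

@dataclass
class BehaviorRecognitionResult:
    """Illustrative per-frame output of the behavior recognition model 18."""
    # joint name -> 2D position, e.g. {"head": (320.0, 90.5), "right_wrist": (412.0, 233.5)}
    joints: Dict[str, Tuple[float, float]]
    # motion of each body part, e.g. {"face": "facing forward", "arms": "raised", "legs": "walking"}
    part_motions: Dict[str, str]
    # recognized basic movement, e.g. "walk", "run", "stop"
    basic_movement: str
```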
- the analysis result DB 19 is a database that stores information regarding analysis results collected by the information processing device 10.
- FIG. 7 is a diagram illustrating the analysis result DB 19. As shown in FIG. 7, the analysis result DB 19 stores "ID, name, user information, product, purchase or not, questionnaire results", and the like.
- ID is an identifier that identifies the analysis result.
- Name is the name of the customer 5, and is specified using the customer DB 13 when entering the store or purchasing a product.
- the "user information” includes the age, gender, family structure, etc. of the customer 5, and is specified using the customer DB 13.
- Product is information on a product purchased by the customer 5, and is specified using the customer DB 13 at the time of product purchase.
- Purchase presence/absence is information indicating whether or not a product was purchased when visiting the store, and is specified using the customer DB 13 at the time of product purchase.
- "Questionnaire results" are the answers to the questionnaire sent by the control unit 20, which will be described later.
- control unit 20 is a processing unit that controls the entire information processing device 10, and is realized by, for example, a processor.
- This control unit 20 has a preprocessing unit 30 and an operation processing unit 40.
- the preprocessing unit 30 and the operation processing unit 40 are realized by an electronic circuit included in a processor, a process executed by the processor, or the like.
- the pre-processing unit 30 is a processing unit that generates each model, rule, and the like using the training data stored in the storage unit 12 before the operation processing unit 40 starts operating behavior prediction and questionnaire collection.
- the pre-processing unit 30 is a processing unit that generates the relationship model 17 using training data stored in the training data DB 16.
- a model for HOID using a neural network or the like is generated as the relationship model 17.
- generation of a HOID model that specifies the relationship between a person and an object will be described, but a HOID model that specifies the relationship between people can be generated in the same way.
- FIG. 8 is a diagram illustrating training data. As shown in FIG. 8, each training data includes image data (explanatory variables) serving as input data and correct answer information (objective variables) set for the image data.
- the correct answer information includes the class of the person to be detected (first class), the class of the object to be purchased or operated by the person (second class), and the relationship class indicating the interaction between the person and the object.
- The correct answer information also includes a Bbox (Bounding Box: object area information) indicating the area of each class.
- the interaction between a person and an object is an example of a relationship between a person and an object.
- Note that, when generating a HOID model that identifies relationships between people, a class indicating the other person is used as the second class, area information of the other person is used as the area information of the second class, and a relationship between the persons is used as the relationship class.
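One possible in-memory representation of such a training sample is sketched below; the field names are illustrative assumptions, and a real dataset format would follow the HOID implementation in use:

```python
from dataclasses import dataclass
from typing import Tuple

BBox = Tuple[int, int, int, int]  # (x, y, width, height): area information of a class

@dataclass
class HoidTrainingSample:
    """One supervised sample per Fig. 8: an input image plus correct answer information."""
    image_path: str     # image data serving as the explanatory variable
    person_class: str   # first class, e.g. "person"
    person_bbox: BBox   # area information of the first class
    object_class: str   # second class, e.g. "product" (or a class of another person)
    object_bbox: BBox   # area information of the second class
    interaction: str    # relationship class, e.g. "hold"
```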
- FIG. 9 is a diagram illustrating machine learning of the relationship model 17.
- the pre-processing unit 30 inputs the training data to the HOID model and obtains the output result of the HOID model.
- This output result includes the class of the person detected by the HOID model, the class of the object, the relationship (interaction) between the person and the object, and the like.
- The preprocessing unit 30 then calculates error information between the correct answer information of the training data and the output result of the HOID model, and executes machine learning that updates the parameters of the HOID model by error backpropagation so that the error becomes smaller.
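The update step of FIG. 9 could look roughly like the following PyTorch-style sketch. It assumes a simplified HOID model that returns one detection per image as a dict of tensors; an actual HOID detector, which matches multiple predicted pairs against the ground truth, is considerably more involved:

```python
import torch
import torch.nn as nn

def train_hoid_model(model: nn.Module, loader, epochs: int = 10) -> None:
    """Update the HOID model's parameters by error backpropagation so that the
    error between its output and the correct answer information becomes smaller."""
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    class_loss = nn.CrossEntropyLoss()  # for the person/object/relationship classes
    bbox_loss = nn.SmoothL1Loss()       # for the Bbox (area information) regression
    for _ in range(epochs):
        for images, targets in loader:
            out = model(images)  # assumed to return a dict of prediction tensors
            loss = (class_loss(out["person_logits"], targets["person_class"])
                    + class_loss(out["object_logits"], targets["object_class"])
                    + class_loss(out["interaction_logits"], targets["interaction"])
                    + bbox_loss(out["person_bbox"], targets["person_bbox"])
                    + bbox_loss(out["object_bbox"], targets["object_bbox"]))
            optimizer.zero_grad()
            loss.backward()   # error backpropagation
            optimizer.step()  # parameter update so the error becomes smaller
```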
- the pre-processing unit 30 is a processing unit that generates the behavior recognition model 18 using training data. Specifically, the pre-processing unit 30 generates the behavior recognition model 18 through supervised learning using training data with correct answer information (labels).
- FIG. 10 is a diagram illustrating generation of the behavior recognition model 18.
- Specifically, the pre-processing unit 30 inputs image data of basic motions, to which labels of the basic motions are attached, into the behavior recognition model 18, and executes machine learning of the behavior recognition model 18 so that the error between the output result of the behavior recognition model 18 and the label becomes smaller.
- the behavior recognition model 18 is a neural network.
- the preprocessing unit 30 changes the parameters of the neural network by executing machine learning of the behavior recognition model 18.
- For example, in machine learning of the behavior recognition model 18, explanatory variables that are image data (for example, image data of a person performing a basic motion) are input into the neural network. A machine learning model is then generated by changing the parameters of the neural network so that the error between the output result of the neural network and the correct answer data (objective variable), which is the label of the basic motion, becomes smaller.
- For the training data, it is possible to use image data to which labels such as "walk", "run", "stop", "stand", "stand in front of the shelf", "pick up an item", "turn the head to the right", "turn the head to the left", "look up", and "tilt the head downward" are attached.
- the generation of the behavior recognition model 18 is just an example, and other methods can be used.
- As the behavior recognition model 18, the behavior recognition techniques disclosed in Japanese Patent Application Publication No. 2020-71665 and Japanese Patent Application Publication No. 2020-77343 can also be used.
- the operation processing unit 40 includes an acquisition unit 41, a relationship identification unit 42, a behavior recognition unit 43, an evaluation acquisition unit 44, and a registration unit 45, and is a processing unit that uses the models prepared in advance by the pre-processing unit 30 to send questionnaires to people appearing in video data.
- the acquisition unit 41 is a processing unit that acquires video data from each camera 2 and stores it in the video data DB 15.
- the acquisition unit 41 may acquire information from each camera 2 at any time or periodically.
- the acquisition unit 41 acquires customer information when the customer 5 enters the store, and outputs it to each processing unit of the operation processing unit 40.
- For example, the acquisition unit 41 acquires a "customer ID" by having the user present a user card, perform fingerprint authentication, or enter an ID and password upon entering the store.
- the acquisition unit 41 then refers to the customer DB 13 and acquires the name, age, etc. associated with the "customer ID.”
- the relationship identification unit 42 is a processing unit that uses the relationship model 17 to execute relationship identification processing that identifies relationships between people appearing in video data or between a person and an object. Specifically, the relationship identification unit 42 inputs each frame included in the video data into the relationship model 17 and identifies the relationship according to the output result of the relationship model 17. The relationship identification unit 42 then outputs the identified relationship to the evaluation acquisition unit 44, the registration unit 45, and the like.
- FIG. 11 is a diagram illustrating the identification of relationships.
- For example, when identifying a relationship between people, the relationship identification unit 42 inputs frame 1 into the machine-learned relationship model 17 and identifies the class of the first person, the class of the second person, and the relationship between the people.
- the relationship identifying unit 42 inputs the frame to the machine-learned relationship model 17 to identify the person class, the object class, and the relationship between the person and the object. In this way, the relationship identifying unit 42 uses the relationship model 17 to identify relationships between people or relationships between people and objects for each frame.
- FIG. 12 is a diagram illustrating the identification of relationships using HOID.
- the relationship specifying unit 42 inputs each frame (image data) included in the video data to the HOID (relationship model 17) and obtains the output result of the HOID.
- the relationship identification unit 42 acquires the person's Bbox, the person's class name, the object's Bbox, the object's class name, the probability value of the interaction between the person and the object, and the class name of the interaction between the person and the object.
- For example, the relationship identification unit 42 identifies "person (customer)" as the class of the person and "product (object)" as the class of the object, and identifies the relationship "the customer has the product" between the "person (customer)" and the "product (object)".
- Thereafter, the relationship identification unit 42 executes the relationship identification processing described above on each subsequent frame, such as frame 2 and frame 3, thereby identifying a relationship for each frame, such as "has product A" or "passes product A".
- the relationship specifying unit 42 can also obtain information about whether or not a product has been purchased from a self-checkout register or from information at the time of leaving the store.
- the relationship identifying unit 42 can also identify information related to the time, place, and relationship of the behavior from the customer's behavior toward the object included in the video data. For example, the relationship identifying unit 42 identifies the time of the frame in the video data for which the relationship has been identified, the location of the camera 2 that captured the video data, and the like.
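Taken together, the per-frame relationship identification could be consumed roughly as follows. This is a sketch under the assumption that the HOID model returns a list of detection dicts per frame; the function and field names are illustrative:

```python
def identify_relationships(video_frames, hoid_model, prob_threshold: float = 0.5,
                           camera_location: str = "unknown"):
    """Run the relationship model 17 (HOID) on each frame and collect the
    identified person/object classes, Bboxes, and relationships with metadata."""
    results = []
    for frame_no, frame in enumerate(video_frames):
        for det in hoid_model(frame):
            if det["interaction_prob"] < prob_threshold:
                continue  # keep only sufficiently reliable interactions
            results.append({
                "frame": frame_no,                   # time information of the frame
                "place": camera_location,            # location of the camera 2
                "person_class": det["person_class"],
                "person_bbox": det["person_bbox"],
                "object_class": det["object_class"],
                "object_bbox": det["object_bbox"],
                "relationship": det["interaction"],  # e.g. "has product A"
            })
    return results
```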
- the behavior recognition unit 43 is a processing unit that uses the behavior recognition model 18 to recognize a person's behavior and gestures from video data. Specifically, the behavior recognition unit 43 inputs each frame of the video data into the behavior recognition model 18, identifies the person's behavior and gestures using the skeletal information and basic movements of each part of the person obtained from the behavior recognition model 18, and outputs them to the evaluation acquisition unit 44, the registration unit 45, and the like.
- FIG. 13 is a diagram explaining behavior recognition.
- the behavior recognition unit 43 inputs frame 1, which is image data, to the behavior recognition model 18.
- the action recognition model 18 generates skeletal information of each part in response to the input of frame 1, and outputs the motion of each part according to the skeletal information of each part.
- the behavior recognition unit 43 can acquire motion information of each body part, such as "face: facing forward, arms: raised, legs: walking, . . .”.
- The behavior recognition unit 43 also executes recognition processing using the behavior recognition model 18 on each subsequent frame, such as frame 2 and frame 3, and identifies the motion information of each part of the person in each frame. Then, by referring to correspondences between representative gestures and changes in behavior stored in association with each other in advance, the behavior recognition unit 43 can also identify more specific actions and gestures from changes in the behavior recognition results (that is, the motion information of each body part).
- For example, if the behavior recognition unit 43 detects pre-specified "dissatisfied behavior", such as the direction of the face moving left and right within 5 frames, or the product being returned to its original position 15 frames or more after it was picked up, it can recognize the gesture as "dissatisfied". In addition, if the behavior recognition unit 43 detects pre-specified "satisfied behavior", such as the product being put into the cart less than 3 frames after it was picked up, it can recognize the gesture as "satisfied".
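One possible reading of these frame-count rules as code is sketched below; the event names and the exact rule forms are assumptions:

```python
def recognize_gesture(events):
    """Classify a per-person event log into a gesture.
    `events` is a list of (frame_no, action) pairs derived from the behavior
    recognition results, with actions such as "face_left", "face_right",
    "pick_up", "put_back", and "into_cart"."""
    pick_frame = None
    last_face_turn = None
    for frame_no, action in events:
        if action in ("face_left", "face_right"):
            # face direction moved left and right within 5 frames
            if last_face_turn is not None and frame_no - last_face_turn <= 5:
                return "dissatisfied"
            last_face_turn = frame_no
        elif action == "pick_up":
            pick_frame = frame_no
        elif action == "put_back" and pick_frame is not None:
            if frame_no - pick_frame >= 15:  # held the product a long time, then returned it
                return "dissatisfied"
        elif action == "into_cart" and pick_frame is not None:
            if frame_no - pick_frame < 3:    # decided quickly
                return "satisfied"
    return "neutral"
```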
- the evaluation acquisition unit 44 is a processing unit that acquires the psychological evaluation of the customer 5 regarding the product whose relationship has been identified by the relationship identification unit 42. Specifically, the evaluation acquisition unit 44 can also employ the “gesture” recognized by the behavior recognition unit 43 as a psychological evaluation.
- For example, the evaluation acquisition unit 44 can transmit a questionnaire regarding psychological indices about the product to the terminal associated with the customer 5, and acquire the questionnaire response results received from the terminal as the psychological evaluation of the customer.
- Furthermore, the evaluation acquisition unit 44 generates a partial questionnaire that inquires about those items, among the plurality of items included in the questionnaire stored in the questionnaire DB 14, that cannot be specified from the behavior of the customer 5 toward the product.
- the evaluation acquisition unit 44 can also transmit the partial questionnaire to the customer's terminal and acquire the response results of the questionnaire received from the terminal as the customer's psychological evaluation.
- FIG. 14 is a diagram illustrating generation and transmission of a questionnaire.
- For example, the evaluation acquisition unit 44 uses the customer information (30s, female, 10th store visit) acquired by the acquisition unit 41 to automatically enter "30s, female" in "age, gender" of questionnaire Q1, and automatically enters "second time or more" for questionnaire Q2 "Is this your first visit?".
- The evaluation acquisition unit 44 also uses the relationship "product A, not purchased" between the customer and the product identified by the relationship identification unit 42 to exclude from the questionnaire both Q3, which inquires whether a product was purchased, and Q4, which inquires about satisfaction with the purchased product.
- the evaluation acquisition unit 44 uses the behavior and gesture “dissatisfied” identified by the behavior recognition unit 43 to automatically input “dissatisfied” in the question “Are you satisfied with the service?” of the questionnaire Q5.
- Then, the evaluation acquisition unit 44 uses the relationship "product A, not purchased" between the customer and the product identified by the relationship identification unit 42 and the behavior and gesture "dissatisfied" identified by the behavior recognition unit 43 to determine that the reason the customer did not purchase the product and the reason for the dissatisfaction could not be identified from the video. In other words, the evaluation acquisition unit 44 determines that "Why" corresponds to the customer's psychological evaluation. As a result, the evaluation acquisition unit 44 selects Q6 "Please tell us the reason why you are dissatisfied with the service", which corresponds to "Why", from among the items included in the questionnaire as the partial questionnaire 61, and sends it to the "notification destination" stored in the customer DB 13.
- Thereafter, upon receiving the response to the partial questionnaire 61, the evaluation acquisition unit 44 determines the psychological evaluation of the customer to be "The clerk is unfriendly". Note that the evaluation acquisition unit 44 can also determine which questionnaire items to select as the partial questionnaire using management data in which at least one of the 5W1H is associated with each combination of relationship identification results and behavior recognition results. Furthermore, since questionnaires asking "why" generally yield the most desired information, the evaluation acquisition unit 44 can also transmit only the questionnaire items corresponding to "Why" as the partial questionnaire.
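The selection of the partial questionnaire could be sketched as follows, assuming management data that associates each question item with one of the 5W1H; the mapping and item IDs are illustrative:

```python
QUESTION_INTENT = {  # assumed management data: question item -> 5W1H intent
    "Q1": "Who",   # "Please tell us your age and gender"
    "Q2": "When",  # "Is this your first visit?"
    "Q3": "What",  # whether a product was purchased
    "Q4": "How",   # satisfaction with the purchased product
    "Q5": "How",   # satisfaction with the service
    "Q6": "Why",   # reason for dissatisfaction with the service
}

def build_partial_questionnaire(items, auto_answers, wanted=("Why",)):
    """Auto-fill items already specified from the video; keep as the partial
    questionnaire only the remaining items whose intent is in `wanted`."""
    filled, partial = {}, []
    for item in items:  # item: {"id": "Q6", "text": "Please tell us why ..."}
        if item["id"] in auto_answers:
            filled[item["id"]] = auto_answers[item["id"]]  # specified from video
        elif QUESTION_INTENT.get(item["id"]) in wanted:
            partial.append(item)  # ask the customer only this item
    return filled, partial

# Example: with auto_answers = {"Q1": "30s, female", "Q2": "second time or more",
# "Q5": "dissatisfied"}, only Q6 (intent "Why") remains in the partial questionnaire.
```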
- the registration unit 45 is a processing unit that registers information related to the relationship between the customer 5 and the product identified by the relationship identification unit 42 and the psychological evaluation of the customer 5 acquired by the evaluation acquisition unit 44 in association with each other in the analysis result DB 19. Specifically, the registration unit 45 associates the information related to the identified time, place, and relationship with the response results of the partial questionnaire and registers them in the analysis result DB 19.
- FIG. 15 is a diagram illustrating registration of analysis results.
- For example, the registration unit 45 acquires "female, 30s, visited the store more than once, dissatisfied with the service", which was automatically entered among the questionnaire items by the evaluation acquisition unit 44, and also acquires the result of the partial questionnaire 61, "The clerk is unfriendly". Then, the registration unit 45 registers the acquired "female, 30s, visited the store twice or more, dissatisfied with the service, the clerk is unfriendly" in the analysis result DB 19.
- the registration unit 45 can also register in the analysis result DB 19 various information such as the time of the frame in the video data for which the relationship was identified and the location of the camera 2 that captured the video data. For example, the registration unit 45 can register in the analysis result DB 19 the time "13:00", the location "product shelf YY", and relationship information such as "held product A in hand" and "stopped at product shelf YY". Further, the registration unit 45 can also register only the customer information and the response results of the partial questionnaire in the analysis result DB 19. In other words, the registration unit 45 can register whatever analysis items the user desires.
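A registration step along the lines of FIG. 15 could then be sketched as follows; the row layout is an assumption for illustration:

```python
def register_analysis_result(analysis_db, customer, relation_info, auto_answers, responses):
    """Associate the video-derived results with the questionnaire responses
    and store them as one row of the analysis result DB 19."""
    analysis_db.append({
        "name": customer.get("name"),
        "user_info": f'{customer.get("age")}, {customer.get("gender")}',
        "product": relation_info.get("product"),      # e.g. "product A"
        "purchased": relation_info.get("purchased"),  # e.g. False
        "time": relation_info.get("time"),            # e.g. "13:00"
        "place": relation_info.get("place"),          # e.g. "product shelf YY"
        "auto_answers": auto_answers,                 # e.g. {"Q5": "dissatisfied"}
        "questionnaire_results": responses,           # e.g. {"Q6": "The clerk is unfriendly"}
    })
```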
- FIG. 16 is a flowchart showing the flow of processing according to the first embodiment. Note that although the processing from when one customer enters the store to when the customer leaves will be described as an example, the operation processing unit 40 is not required to follow a single customer and can execute the above processing on the video data captured by each camera 2. In that case, the operation processing unit 40 can distinguish customers by recognizing each person shown in the video data at the time of entering the store and assigning an identifier or the like. It is also assumed that the preliminary processing has been completed.
- The operation processing unit 40 uses the video data and the relationship model 17 to identify the relationship between the customer and a product (S104), and uses the video data and the behavior recognition model 18 to identify the customer's behavior and gestures toward the product (S105).
- The steps from S103 onward are repeated until the customer's departure from the store is detected (S106: No). When departure is detected (S106: Yes), the operation processing unit 40 determines the content of the questionnaire using the identified relationships, actions, and gestures (S107).
- The operation processing unit 40 then transmits a questionnaire (partial questionnaire 61) inquiring about the determined content (S108), and upon receiving the questionnaire results (S109: Yes), generates an analysis result (S110) and registers it in the analysis result DB 19 (S111).
- As described above, the information processing device 10 can automatically enter most questionnaire items from the video data and transmit only the questionnaire items that cannot be specified from the video data. Therefore, the information processing device 10 can reduce the burden on customers, increase the number of customers who respond to questionnaires, collect more useful information, and reduce the amount of processing required to construct a database.
- Further, since the information processing device 10 can realize pinpoint questionnaire transmission, it can reduce respondents' aversion to questionnaires and improve the response rate.
- In Example 1, an example was described in which a model for HOID was used to identify the relationship between a customer and a product. However, the present invention is not limited to this, and it is also possible to use a scene graph, which is an example of graph data showing the relationships between the objects included in the video data.
- a scene graph is graph data that describes each object (person, product, etc.) included in each image data in video data and the relationship between each object.
- FIG. 17 is a diagram showing an example of a scene graph.
- a scene graph is a directed graph in which the objects in image data are nodes, each node has an attribute (for example, the type of object), and the relationships between nodes are directed edges.
- the relationship from the node "person" of the attribute "clerk” to the node “person” of the attribute "customer” is "speak.” In other words, it is defined that there is a relationship such as "a store clerk talks to a customer.” Further, it is shown that the relationship from the node "person” with the attribute "customer” to the node "product” with the attribute "large” is “standing”. In other words, it is defined that there is a relationship such as "a customer stands in front of a shelf of large products.”
- the relationships shown here are just examples; they include not only simple relationships such as "holding" but also complex relationships such as "holding product A in the right hand". Note that a scene graph corresponding to relationships between people and a scene graph corresponding to relationships between people and objects may be stored separately, or one scene graph including both kinds of relationships may be stored. Further, although the scene graph is generated by the control unit 20 described later, data generated in advance may be used.
- FIG. 18 is a diagram illustrating an example of generating a scene graph showing relationships between people and things.
- For example, the pre-processing unit 30 inputs the image data to a trained recognition model and acquires the label "person (male)", the label "drink (green)", and the relationship "have".
- That is, the pre-processing unit 30 acquires that "the man has a green drink". As a result, the pre-processing unit 30 generates a scene graph that associates the relationship "have" from the node "person" having the attribute "male" to the node "drink" having the attribute "green". Note that this method of generating the scene graph is just an example; other methods may be used, and the scene graph may also be generated manually by an administrator or the like.
- the relationship specifying unit 42 executes a relationship specifying process that specifies the relationship between people appearing in the video data or the relationship between people and objects according to the scene graph. Specifically, for each frame included in the video data, the relationship identifying unit 42 identifies the type of person or object that appears in the frame, searches the scene graph using the identified information, and determines the relationship. Identify. Then, the relationship specifying unit 42 outputs the specified relationship to each processing unit.
- FIG. 19 is a diagram illustrating the identification of relationships using a scene graph.
- For example, the relationship identification unit 42 identifies the types of people, the types of objects, the number of people, and the like in frame 1 by inputting frame 1 into a machine-learned model or by applying known image analysis to frame 1. For example, the relationship identification unit 42 identifies "person (customer)" as the type of person and "product (product A)" as the type of object. Thereafter, according to the scene graph, the relationship identification unit 42 identifies the relationship "person (customer) has the product (product A)" between the node "person" with the attribute "customer" and the node "product A" with the attribute "food". The relationship identification unit 42 also executes this relationship identification processing on each subsequent frame, such as frame 2 and frame 3, thereby identifying a relationship for each frame.
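As an illustration, the scene graph and the lookup described above could be sketched as a toy in-memory graph; a production system would likely use a dedicated graph representation:

```python
class SceneGraph:
    """Directed graph: nodes are objects with attributes, edges are relationships."""
    def __init__(self):
        self.attrs = {}  # node id -> attribute, e.g. "person1" -> "customer"
        self.edges = {}  # (source node, destination node) -> relationship

    def add(self, src, src_attr, relation, dst, dst_attr):
        self.attrs[src] = src_attr
        self.attrs[dst] = dst_attr
        self.edges[(src, dst)] = relation

    def lookup(self, src_attr, dst_attr):
        """Return the relationship between nodes having the given attributes."""
        for (src, dst), rel in self.edges.items():
            if self.attrs[src] == src_attr and self.attrs[dst] == dst_attr:
                return rel
        return None

graph = SceneGraph()
graph.add("person2", "clerk", "speak", "person1", "customer")
graph.add("person1", "customer", "have", "product1", "product A")
print(graph.lookup("customer", "product A"))  # -> "have"
```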
- In this way, the information processing device 10 according to the second embodiment can, by using a scene graph generated for each store, for example, determine relationships suited to each store without the per-store retraining that a machine learning model would require. Therefore, the information processing device 10 according to the second embodiment can be introduced easily.
- As the behavior recognition model 18, a machine learning model that performs binary class classification can also be used. That is, as the behavior recognition model 18, a model can be used that detects the behavior or gesture that triggers the sending of a questionnaire, such as behavior indicating that the customer "hesitated".
- FIG. 20 is a diagram illustrating the behavior recognition model 18 according to the third embodiment. As shown in FIG. 20, in response to the input of image data, the behavior recognition model 18 determines between the binary classes of class 1 "hesitated about purchasing the product" and class 2 "did not hesitate about purchasing the product". Note that the output result of the behavior recognition model 18 includes the reliability (for example, a probability value) of each class.
- FIG. 21 is a diagram illustrating machine learning of the behavior recognition model 18 according to the third embodiment.
- As shown in FIG. 21, the pre-processing unit 30 inputs training data that has "image data" showing a person selecting a product as the explanatory variable and "hesitated" or "did not hesitate" as the correct answer label (objective variable) into the behavior recognition model 18, and acquires the output result of the behavior recognition model 18.
- Then, the pre-processing unit 30 updates the parameters of the behavior recognition model 18 so that the error between the output result of the behavior recognition model 18 and the correct answer label becomes smaller. In this way, the pre-processing unit 30 executes training of the behavior recognition model 18 and generates the behavior recognition model 18.
- FIG. 22 is a diagram illustrating sending a questionnaire using the behavior recognition model 18 according to the third embodiment.
- the operation processing unit 40 inputs each frame in the video data captured by the camera 2 to the behavior recognition model 18, and obtains the output result of the behavior recognition model 18.
- For example, when the operation processing unit 40 obtains class 1 "hesitated" as the output result of the behavior recognition model 18 and the difference between the reliability of class 1 "hesitated" and the reliability of class 2 "did not hesitate" is greater than or equal to a threshold, that is, when the output result has high reliability, the operation processing unit 40 suppresses sending the questionnaire.
- On the other hand, when the operation processing unit 40 obtains class 1 "hesitated" as the output result of the behavior recognition model 18 and the difference between the reliability of class 1 "hesitated" and the reliability of class 2 "did not hesitate" is less than the threshold, that is, when the output result has low reliability, the operation processing unit 40 sends the questionnaire. Note that if class 2 "did not hesitate" is acquired as the output result of the behavior recognition model 18, the operation processing unit 40 executes the questionnaire transmission regardless of the difference in reliability.
- In other words, the operation processing unit 40 controls questionnaire transmission according to the reliability when class 1 "hesitated" is identified.
- The operation processing unit 40 can also generate retraining data using the questionnaire results. For example, suppose that the output result obtained by inputting image data AA into the behavior recognition model 18 is class 1 "hesitated" with low reliability, so the operation processing unit 40 sends a questionnaire and receives the response "I did not hesitate". In this case, the operation processing unit 40 can generate retraining data that has "image data AA" as the explanatory variable and "did not hesitate" as the objective variable. The pre-processing unit 30 can improve the recognition accuracy of the behavior recognition model 18 by retraining the behavior recognition model 18 using this retraining data.
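The reliability-based gating and the generation of retraining data could be sketched as follows; the threshold value and class names are placeholders:

```python
def decide_questionnaire(class_probs, threshold: float = 0.3) -> str:
    """Gate questionnaire transmission on the reliability of the binary result.
    `class_probs` maps class name to probability, e.g.
    {"hesitated": 0.55, "did_not_hesitate": 0.45}."""
    (top, p1), (_, p2) = sorted(class_probs.items(), key=lambda kv: -kv[1])
    if top == "hesitated" and (p1 - p2) >= threshold:
        return "register_automatically"  # high reliability: no questionnaire needed
    return "send_questionnaire"          # low reliability (or class 2): ask the customer

def make_retraining_sample(image_id, predicted, questionnaire_answer):
    """Turn a questionnaire response that contradicts a low-reliability
    prediction into a new labeled training sample."""
    if questionnaire_answer != predicted:
        return {"image": image_id, "label": questionnaire_answer}
    return None
```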
- the questionnaire sent here may be the partial questionnaire 61 described above.
- That is, when the recognition result is class 1 "hesitated" with high reliability, which is an example of a first condition, the operation processing unit 40 registers the analysis result using the automatic acquisition described in the first embodiment.
- When the recognition result is class 1 "hesitated" with low reliability, which is an example of a second condition, the operation processing unit 40 sends a questionnaire; the questionnaire to be sent may be the entire questionnaire described in Example 1, or may be other question information prepared in advance.
- Note that the operation processing unit 40 can also omit the relationship identification processing and the behavior and gesture identification processing of the first embodiment, and send the questionnaire 60 only when class 1 "hesitated" is detected with high reliability using the behavior recognition model 18 of the third embodiment.
- As the behavior recognition model 18, a model that performs not only binary classification but also multi-class classification can be used.
- For example, the behavior recognition model 18 can perform multi-class classification with class 1 "hesitated greatly", class 2 "hesitated", class 3 "did not hesitate", and class 4 "neutral".
- In this case, for example, when a class is identified with high reliability, the operation processing unit 40 registers the analysis result using the automatic acquisition described in Example 1.
- On the other hand, when a class is identified with low reliability, the operation processing unit 40 registers the analysis result using the automatic acquisition described in Example 1 together with the response results of the partial questionnaire.
- As described above, the information processing device 10 according to the third embodiment can control the sending of questionnaires according to the reliability of the recognition results of the behavior recognition model 18. Therefore, even when the customer's psychological evaluation cannot be recognized with high confidence, the information processing device 10 can obtain user evaluations through questionnaires. As a result, the information processing device 10 can collect accurate analysis results.
- the information processing device 10 can send the questionnaire to any location, not just the customer's terminal.
- FIG. 23 is a diagram illustrating questionnaire transmission according to the fourth embodiment.
- For example, the operation processing unit 40 of the information processing device 10 can transmit the questionnaire 60 or the partial questionnaire 61 to the display of the self-checkout register 70 or the signage 80 of the store 1.
- At this time, the information processing device 10 uses the positional relationship among the signage for questionnaire responses, the questionnaire target person, and surrounding people other than the target person, together with information on the posture of each person, to display a questionnaire response screen on the signage only in situations where only the target person can respond, and prompts the target person to respond to the questionnaire.
- the information processing device 10 identifies the state of the customer with respect to the product among the plurality of people included in the video by analyzing a video shot of a first area including the customer or the product.
- the information processing device 10 generates a questionnaire related to the customer or the product based on the customer's status regarding the product.
- Further, the information processing device 10 specifies the position and orientation of each of a plurality of customers with respect to the signage by analyzing video shot of a second area including the signage. Thereafter, based on the specified positions and orientations, the information processing device 10 causes the signage to display a questionnaire for a specific customer when the specific customer is closest to the signage and facing the signage, and the other customers are farther from the signage and not looking at the signage.
- FIG. 24 is a diagram illustrating a specific example of sending a questionnaire according to the fourth embodiment.
- As shown in FIG. 24, the operation processing unit 40 of the information processing device 10 inputs each image data (each frame) of the video data into the behavior recognition model 18 and identifies the position and orientation of each person in each image data.
- Furthermore, based on the processing results of the relationship identification unit 42, the operation processing unit 40 determines customers who held a product in their hands, customers who made a payment, and customers who stayed in front of a product shelf for a certain period of time or longer to be questionnaire targets (specific customers).
- Then, when it is specified from the position and orientation of each person depicted in the image data that the questionnaire target is facing the signage 80 and is in a position where the signage can be operated, and that the people who are not questionnaire targets are not facing the signage 80 and are not in a position where they can operate it, the operation processing unit 40 displays the questionnaire on the signage 80.
- On the other hand, when it is specified from the position and orientation of each person depicted in the image data that the questionnaire target is not facing the signage 80 and that a person who is not a questionnaire target is facing the signage 80 and is in an operable position, the operation processing unit 40 does not display the questionnaire on the signage 80.
- Further, when it is specified from the position and orientation of each person depicted in the image data that the questionnaire target is facing the signage 80 but is not yet in an operable position, and that the people who are not questionnaire targets are not facing the signage 80, the operation processing unit 40 displays a message prompting the questionnaire target to approach the signage 80.
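The display decision could be sketched as follows; the geometry, threshold, and field names are illustrative, and the "please approach the signage" message case is omitted for brevity:

```python
import math

def should_display_questionnaire(signage_pos, people, reach: float = 1.5) -> bool:
    """Display the questionnaire only when the target is the person closest to
    the signage 80, is facing it within operating distance, and no non-target
    person is facing it. Each person is a dict: {"pos": (x, y),
    "facing_signage": bool, "is_target": bool}."""
    def dist(p):
        return math.hypot(p["pos"][0] - signage_pos[0], p["pos"][1] - signage_pos[1])

    targets = [p for p in people if p["is_target"]]
    others = [p for p in people if not p["is_target"]]
    if not targets:
        return False
    target = min(targets, key=dist)
    if not (target["facing_signage"] and dist(target) <= reach):
        return False  # the target cannot operate the signage
    if any(o["facing_signage"] for o in others):
        return False  # a non-target person could read or answer it
    if any(dist(o) < dist(target) for o in others):
        return False  # the target must be the closest person
    return True
```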
- FIG. 25 is a flowchart showing the flow of processing according to the fourth embodiment.
- the operation processing unit 40 acquires on-site video data (S201) and analyzes the video data (S202). For example, the operation processing unit 40 identifies relationships, positions and orientations of people, and actions and gestures.
- the operation processing unit 40 executes determination of questionnaire subjects and questionnaire display conditions (S203). For example, the operation processing unit 40 reads predetermined questionnaire contents and subject conditions, and uses the analysis results to determine whether the display conditions are met.
- If the operation processing unit 40 determines not to display the questionnaire (S204: No), it repeats S201 and the subsequent steps. On the other hand, if the operation processing unit 40 determines that the questionnaire should be displayed (S204: Yes), it displays the questionnaire on a display device such as the signage 80 and accepts responses (S205).
- Then, when the operation processing unit 40 receives the input of a questionnaire response (S206: Yes), it records the response (S207) and hides the questionnaire (S209). On the other hand, while no response has been input (S206: No) and a timeout has not been reached (S208: No), the operation processing unit 40 keeps the questionnaire displayed on the display device such as the signage 80 and continues to accept responses (S205). If no response is input (S206: No) and a timeout is reached (S208: Yes), the operation processing unit 40 hides the questionnaire (S209).
- FIG. 26 is a diagram illustrating a signage questionnaire display example according to the fourth embodiment.
- the operation processing unit 40 identifies the positions and orientations of the questionnaire subject and of the persons who are not questionnaire subjects from the people depicted in each image data. Then, the operation processing unit 40 displays the questionnaire 62 in the area of the signage 80 that faces the questionnaire subject, and displays the dummy questionnaire 63 in the area of the signage 80 that faces the persons who are not questionnaire subjects.
- the operation processing unit 40 registers the response results of the questionnaire 62 as analysis results and discards the response results of the dummy questionnaire 63. Note that the response results of the dummy questionnaire 63 may instead be managed as information about the accompanying person.
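The routing of answers in FIG. 26 can be sketched as follows, assuming the screen has already been divided into regions and each region has been assigned to a subject or a non-subject from the position estimates; `analysis_db` and its methods are illustrative names, not part of the disclosure.

```python
def route_responses(regions, analysis_db, keep_companion_info=False):
    """regions: mapping of screen region -> (is_subject, response or None)."""
    for region, (is_subject, response) in regions.items():
        if response is None:
            continue  # no answer entered in this region
        if is_subject:
            analysis_db.register(region, response)  # questionnaire 62: kept
        elif keep_companion_info:
            # Optionally retain dummy answers as companion information.
            analysis_db.register_companion(region, response)
        # Otherwise the dummy questionnaire 63 response is discarded.
```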
- the information processing device 10 uses images from a surveillance camera or the like to identify the position of the signage 80 used for answering the questionnaire, and the positions and postures of the questionnaire subject and the people around them.
- the information processing device 10 displays the screen for answering the questionnaire on the signage 80 only when the conditions are satisfied that the person closest to the signage 80 for answering the questionnaire is the questionnaire subject, that this person is facing the signage 80, and that no one other than this person is facing the signage 80.
- the information processing device 10 can prevent a situation in which a person who is not the subject of the questionnaire answers the questionnaire and the quality of the answers deteriorates.
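The conditions summarized above (the nearest person is the subject, the subject faces the signage, and nobody else faces it) can be checked directly from estimated 2-D floor positions and headings. A sketch under those assumptions follows; the angular field-of-view test for "facing" is one plausible implementation choice, not something the embodiment specifies.

```python
import math

def facing(pos, heading_rad, target, fov_deg=40.0):
    """True if the direction from pos toward target lies within fov of heading."""
    angle = math.atan2(target[1] - pos[1], target[0] - pos[0])
    diff = abs((angle - heading_rad + math.pi) % (2 * math.pi) - math.pi)
    return math.degrees(diff) <= fov_deg / 2

def may_show(signage_pos, people):
    """people: list of (pos, heading_rad, is_subject) tuples."""
    if not people:
        return False
    nearest = min(people, key=lambda p: math.dist(p[0], signage_pos))
    pos, heading, is_subject = nearest
    others_facing = any(
        facing(q[0], q[1], signage_pos) for q in people if q is not nearest
    )
    # Show only if the nearest person is the subject, faces the signage,
    # and nobody else is facing the signage.
    return is_subject and facing(pos, heading, signage_pos) and not others_facing
```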
- each component of each device shown in the drawings is functionally conceptual and does not necessarily need to be physically configured as shown. That is, the specific form of distribution and integration of each device is not limited to what is shown in the drawings; all or part of the components can be functionally or physically distributed or integrated in arbitrary units depending on various loads and usage conditions.
- each processing function performed by each device can be realized by a CPU and a program that is analyzed and executed by the CPU, or can be realized as hardware using wired logic.
- FIG. 27 is a diagram illustrating an example of the hardware configuration of the information processing device 10.
- the information processing device 10 includes a communication device 10a, an HDD (Hard Disk Drive) 10b, a memory 10c, and a processor 10d. Furthermore, the parts shown in FIG. 27 are interconnected by a bus or the like.
- the communication device 10a is a network interface card or the like, and communicates with other devices.
- the HDD 10b stores the programs and DBs for operating the functions shown in FIG. 4.
- the processor 10d reads a program that executes the same processing as each processing unit shown in FIG. 4 from the HDD 10b or the like and loads it into the memory 10c, thereby running a process that executes each function described with reference to FIG. 4 and elsewhere. That is, this process executes the same functions as each processing unit included in the information processing device 10. Specifically, the processor 10d reads a program having the same functions as the preprocessing unit 30, the operation processing unit 40, and the like from the HDD 10b, and then executes a process that performs the same processing as the preprocessing unit 30, the operation processing unit 40, and the like.
- the information processing device 10 operates as an information processing device that executes an information processing method by reading and executing a program. The information processing device 10 can also realize the same functions as in the above-described embodiments by reading the program from a recording medium using a medium reading device and executing the read program. Note that the program in the other embodiments is not limited to being executed by the information processing device 10. For example, the above embodiments may be applied in the same way when another computer or server executes the program, or when these devices cooperate to execute it.
- This program may be distributed via a network such as the Internet. It may also be recorded on a computer-readable recording medium such as a hard disk, flexible disk (FD), CD-ROM, MO (Magneto-Optical disk), or DVD (Digital Versatile Disc), and executed by being read from the recording medium by a computer.
- FIG. 28 is a diagram illustrating an example of the hardware configuration of the signage 80.
- the signage 80 includes a communication device 80a, a touch panel 80b, an HDD 80c, a memory 80d, and a processor 80e. Furthermore, the parts shown in FIG. 28 are interconnected by a bus or the like.
- the communication device 80a is a network interface card or the like, and communicates with other devices.
- the touch panel 80b displays a questionnaire and accepts responses to the questionnaire.
- the HDD 80c stores various programs and DB.
- the processor 80e reads a program that executes the same processing as described in the fourth embodiment from the HDD 80c or the like and loads it into the memory 80d, thereby running a process that executes each step. For example, this process performs functions such as receiving a questionnaire, displaying the questionnaire, and accepting responses to the questionnaire.
- the signage 80 operates as an information processing device that executes a display method by reading and executing a program. Further, the signage 80 can also realize the same functions as those in the above-described embodiments by reading the program from a recording medium using a medium reading device and executing the read program. Note that the programs in other embodiments are not limited to being executed by the signage 80. For example, the above embodiments may be applied in the same way when another computer or server executes the program, or when these computers or servers cooperate to execute the program.
- This program may be distributed via a network such as the Internet. Further, this program may be recorded on a computer-readable recording medium such as a hard disk, a flexible disk, a CD-ROM, an MO, or a DVD, and may be executed by being read from the recording medium by the computer.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Business, Economics & Management (AREA)
- Multimedia (AREA)
- Strategic Management (AREA)
- Finance (AREA)
- Development Economics (AREA)
- Accounting & Taxation (AREA)
- Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- Entrepreneurship & Innovation (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Artificial Intelligence (AREA)
- Medical Informatics (AREA)
- Evolutionary Computation (AREA)
- Game Theory and Decision Science (AREA)
- Data Mining & Analysis (AREA)
- Economics (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Social Psychology (AREA)
- Human Computer Interaction (AREA)
- Computational Linguistics (AREA)
- Psychiatry (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Description
FIG. 1 is a diagram showing an example of the overall configuration of the information processing system according to the first embodiment. As shown in FIG. 1, in this information processing system, a store 1, which is an example of a space having an area in which products (an example of objects) are arranged, a plurality of cameras 2 each installed at a different location in the store 1, and an information processing device 10 that analyzes video data are connected via a network N. Various communication networks, whether wired or wireless, such as the Internet or dedicated lines, can be adopted as the network N.
A questionnaire for customers 5 is used as a measure to improve service for customers 5. FIG. 2 is a diagram explaining a reference technique. As shown in FIG. 2, when a customer 5 who purchased a product in the store 1, or a customer 5 who entered the store 1 but did not purchase a product, leaves the store, a store clerk 6 hands a questionnaire sheet to the customer 5. The customer 5 fills in the questionnaire sheet and returns it by mail or the like. A store clerk 7 then tallies the questionnaire sheets sent by each customer 5 and compiles them into a DB. Based on the information compiled into the DB in this way, matters such as the timing at which store clerks approach customers, the arrangement of products, and the expansion of product lines are considered.
Therefore, the information processing device 10 according to the first embodiment recognizes the relationships among people, objects, the environment, and actions, as well as the attributes of people, from video of the inside of the store 1, digitizes the situation (context) of the sales floor, and reduces the processing required to compile analyzable information into a DB. Specifically, the information processing device 10 inputs video data capturing an area in the store 1 where products are arranged into a machine learning model, thereby identifying the relationship between a customer 5 and a product in the behavior of the specific user (customer 5) included in the video data toward the product. Subsequently, the information processing device 10 acquires the customer 5's psychological evaluation of the product for which the relationship was identified. After that, the information processing device 10 registers, in a database indicating product analysis results stored in the storage unit, the result related to the identified relationship in association with the psychological evaluation of the customer 5.
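The overall flow, which identifies the person-product relationship from video, obtains the customer's psychological evaluation, and registers both into the analysis-results database, can be sketched as follows. All names here (`relationship_model`, `get_evaluation`, `analysis_db`, the attributes of `relation`) are illustrative stand-ins for the units of FIG. 4, assumed rather than taken from the disclosure.

```python
def analyze_and_register(video_frames, relationship_model, get_evaluation, analysis_db):
    for frame in video_frames:
        # Identify the customer-product relationship (e.g. "holds", "looks at").
        relation = relationship_model.predict(frame)
        if relation is None:
            continue  # no person-product interaction in this frame
        # Obtain the customer's psychological evaluation of that product,
        # e.g. via a short questionnaire shown on the signage.
        evaluation = get_evaluation(relation.person_id, relation.product_id)
        # Register the relationship result and the evaluation together.
        analysis_db.register(
            person=relation.person_id,
            product=relation.product_id,
            relation=relation.label,
            evaluation=evaluation,
        )
```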
FIG. 4 is a functional block diagram showing the functional configuration of the information processing device 10 according to the first embodiment. As shown in FIG. 4, the information processing device 10 includes a communication unit 11, a storage unit 12, and a control unit 20.
The preprocessing unit 30 is a processing unit that generates the models, rules, and the like using the training data stored in the storage unit 12, prior to the operation of behavior prediction and questionnaire tallying by the operation processing unit 40.
The preprocessing unit 30 is a processing unit that generates the relationship model 17 using the training data stored in the training data DB 16. Here, as an example, a case is described in which a model for HOID (Human-Object Interaction Detection) using a neural network or the like is generated as the relationship model 17. Note that, although the generation of an HOID model that identifies the relationship between a person and an object is described merely as an example, an HOID model that identifies the relationship between a person and another person can be generated in the same way.
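An HOID model typically outputs, per frame, a human bounding box, an object bounding box with its class, and an interaction class linking the two. A minimal sketch of one plausible shape for what the relationship model 17 could return is shown below; the field names are assumptions for illustration, not taken from the disclosure.

```python
from dataclasses import dataclass

BBox = tuple[float, float, float, float]  # (x, y, width, height)

@dataclass
class HoidDetection:
    human_bbox: BBox          # detected person region
    human_conf: float         # confidence of the person detection
    object_bbox: BBox         # detected object region
    object_class: str         # e.g. "product"
    object_conf: float        # confidence of the object detection
    interaction: str          # relationship class, e.g. "hold", "touch", "gaze"
    interaction_conf: float   # confidence of the relationship class
```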
The preprocessing unit 30 is also a processing unit that generates the action recognition model 18 using training data. Specifically, the preprocessing unit 30 generates the action recognition model 18 by supervised learning using training data with correct answer information (labels).
Returning to FIG. 4, the operation processing unit 40 includes an acquisition unit 41, a relationship identification unit 42, an action recognition unit 43, an evaluation acquisition unit 44, and a registration unit 45, and is a processing unit that sends questionnaires to persons depicted in the video data using the models prepared in advance by the preprocessing unit 30.
The relationship identification unit 42 is a processing unit that executes relationship identification processing for identifying the relationship between persons, or between a person and an object, depicted in the video data, using the relationship model 17. Specifically, the relationship identification unit 42 inputs each frame included in the video data into the relationship model 17 and identifies the relationship according to the output of the relationship model 17. The relationship identification unit 42 then outputs the identified relationship to the evaluation acquisition unit 44, the registration unit 45, and the like.
The action recognition unit 43 is a processing unit that recognizes a person's actions and gestures from the video data using the action recognition model 18. Specifically, the action recognition unit 43 inputs each frame in the video data into the action recognition model 18, identifies the person's actions and gestures using the skeleton information of each body part and the basic movements obtained from the action recognition model 18, and outputs them to the evaluation acquisition unit 44, the registration unit 45, and the like.
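Units 42 and 43 run frame by frame over the same stream. A compact sketch of that loop follows, with `relationship_model` and `action_model` as stand-ins for models 17 and 18; the downstream `consumers` stand in for the evaluation acquisition unit 44 and the registration unit 45, and the skeleton-to-action step is simplified to two calls.

```python
def process_stream(frames, relationship_model, action_model, consumers):
    for t, frame in enumerate(frames):
        relations = relationship_model.predict(frame)    # relationship model 17
        skeleton = action_model.extract_skeleton(frame)  # per-part keypoints
        actions = action_model.classify(skeleton)        # actions and gestures
        for consumer in consumers:  # e.g. evaluation acquisition, registration
            consumer.update(t, relations, actions)
```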
FIG. 16 is a flowchart showing the flow of processing according to the first embodiment. Although the processing from when one customer enters the store until the customer leaves is described here as an example, the operation processing unit 40 is not required to track a single customer and can execute the above processing using the video data captured by each camera 2. In that case, the operation processing unit 40 can distinguish each customer by recognizing each person depicted in the video data at the time of entering the store and assigning an identifier or the like. It is also assumed here that the preprocessing has been completed.
As described above, the information processing device 10 can automatically fill in most of the questionnaire items from the video data and send only the questionnaire items that cannot be identified from the video data. The information processing device 10 can therefore reduce the burden on customers, increase the number of customers who answer the questionnaire, enable the collection of more useful information, and reduce the amount of processing required to build the database.
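This selective questioning follows directly from the per-class confidences described in the claims: when the model's top class is clearly separated from the runner-up, the item can be auto-filled; otherwise the item is sent as a question. A minimal sketch, with the gap threshold as an assumed parameter:

```python
def split_items(item_confidences, gap_threshold=0.3):
    """item_confidences: {item: [(class_label, confidence), ...]}.

    Returns (auto_filled, to_ask): items whose top-1/top-2 confidence gap
    meets the threshold are filled automatically; the rest are asked.
    """
    auto_filled, to_ask = {}, []
    for item, scored in item_confidences.items():
        ranked = sorted(scored, key=lambda s: s[1], reverse=True)
        top1 = ranked[0]
        top2 = ranked[1] if len(ranked) > 1 else (None, 0.0)
        if top1[1] - top2[1] >= gap_threshold:
            auto_filled[item] = top1[0]   # confident: register from video
        else:
            to_ask.append(item)           # ambiguous: ask the customer
    return auto_filled, to_ask
```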
The numerical examples, number of cameras, label names, rule examples, action examples, state examples, and the like used in the above embodiments are merely examples and can be changed arbitrarily. The flow of processing described in each flowchart can also be changed as appropriate within a consistent range. Although a store was used as an example in the above embodiments, the embodiments are not limited to this and can also be applied to, for example, warehouses, factories, classrooms, train cars, and airplane cabins.
The processing procedures, control procedures, specific names, and information including various data and parameters shown in the above document and drawings can be changed arbitrarily unless otherwise specified.
FIG. 27 is a diagram explaining an example of the hardware configuration of the information processing device 10. As shown in FIG. 27, the information processing device 10 includes a communication device 10a, an HDD (Hard Disk Drive) 10b, a memory 10c, and a processor 10d. The parts shown in FIG. 27 are interconnected by a bus or the like.
FIG. 28 is a diagram explaining an example of the hardware configuration of the signage 80. As shown in FIG. 28, the signage 80 includes a communication device 80a, a touch panel 80b, an HDD 80c, a memory 80d, and a processor 80e. The parts shown in FIG. 28 are interconnected by a bus or the like.
11 Communication unit
12 Storage unit
13 Customer DB
14 Questionnaire DB
15 Video data DB
16 Training data DB
17 Relationship model
18 Action recognition model
19 Analysis result DB
20 Control unit
30 Preprocessing unit
40 Operation processing unit
41 Acquisition unit
42 Relationship identification unit
43 Action recognition unit
44 Evaluation acquisition unit
45 Registration unit
80 Signage
Claims (6)
- A generation program for causing a computer to execute processing comprising:
acquiring video data;
identifying, by inputting the acquired video data into a machine learning model, a class of an action of a person included in the video data and a confidence of the class; and
generating question information related to the identified class based on the identified confidence.
- The generation program according to claim 1, wherein the identifying processing includes:
acquiring the machine learning model, which has been machine-learned to classify into a plurality of classes based on training data that includes image data containing a person and an object and a class indicating the person's action toward the object;
identifying the confidence of each of the plurality of classes by inputting the person included in the video data into the acquired machine learning model;
calculating the difference in confidence between a first class having the highest confidence and a second class having the second highest confidence; and
outputting, when the calculated difference in confidence satisfies a predetermined condition, a questionnaire that associates the person included in the video data with question information related to the object.
- The generation program according to claim 2, wherein the identifying processing includes:
calculating the difference in confidence between the first class having the highest confidence and the second class having the second highest confidence;
registering, when the calculated difference in confidence satisfies a first condition, information related to the identified class in an item related to the identified class among a plurality of items of a database stored in a storage unit; and
registering, when the calculated difference in confidence satisfies a second condition, a result of an answer to the questionnaire input by the person through a terminal in the item related to the identified class among the plurality of items of the database stored in the storage unit.
- The generation program according to claim 3, causing the computer to further execute processing comprising:
generating training data for retraining in which the video data is an explanatory variable and the answer result of the questionnaire is an objective variable; and
retraining the machine learning model using the training data for retraining.
- A generation method in which a computer executes processing comprising:
acquiring video data;
identifying, by inputting the acquired video data into a machine learning model, a class of an action of a person included in the video data and a confidence of the class; and
generating question information related to the identified class based on the identified confidence.
- An information processing device comprising a control unit that:
acquires video data;
identifies, by inputting the acquired video data into a machine learning model, a class of an action of a person included in the video data and a confidence of the class; and
generates question information related to the identified class based on the identified confidence.
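Claims 2 to 4 describe a feedback loop: high-confidence classifications are registered directly, ambiguous ones are resolved by questionnaire, and the answers become labels for retraining. The following is a hedged sketch of the retraining-data step of claim 4 only; the function and parameter names are assumptions, and the trailing usage line presumes a generic fit-style model interface rather than the claimed implementation.

```python
def build_retraining_set(clips, answers, feature_extractor):
    """Pair video-derived features (explanatory variable) with
    questionnaire answers (objective variable), as in claim 4."""
    X, y = [], []
    for clip_id, clip in clips.items():
        if clip_id not in answers:
            continue  # keep only clips whose action was confirmed by an answer
        X.append(feature_extractor(clip))   # features from the video data
        y.append(answers[clip_id])          # customer-confirmed action class
    return X, y

# Usage sketch: model.fit(*build_retraining_set(clips, answers, extract))
```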
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2024534822A JPWO2024018545A1 (ja) | 2022-07-19 | 2022-07-19 | |
PCT/JP2022/028127 WO2024018545A1 (ja) | 2022-07-19 | 2022-07-19 | Generation program, generation method, and information processing device |
EP22951934.3A EP4560559A4 (en) | 2022-07-19 | 2022-07-19 | GENERATION PROGRAM, GENERATION METHOD AND INFORMATION PROCESSING DEVICE |
US18/999,692 US20250124710A1 (en) | 2022-07-19 | 2024-12-23 | Non-transitory computer-readable recording medium storing generation program, generation method, and information processing device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2022/028127 WO2024018545A1 (ja) | 2022-07-19 | 2022-07-19 | Generation program, generation method, and information processing device |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/999,692 Continuation US20250124710A1 (en) | 2022-07-19 | 2024-12-23 | Non-transitory computer-readable recording medium storing generation program, generation method, and information processing device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024018545A1 true WO2024018545A1 (ja) | 2024-01-25 |
Family
ID=89617472
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2022/028127 WO2024018545A1 (ja) | 2022-07-19 | 2022-07-19 | Generation program, generation method, and information processing device |
Country Status (4)
Country | Link |
---|---|
US (1) | US20250124710A1 (ja) |
EP (1) | EP4560559A4 (ja) |
JP (1) | JPWO2024018545A1 (ja) |
WO (1) | WO2024018545A1 (ja) |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2062206A4 (en) * | 2006-09-07 | 2011-09-21 | Procter & Gamble | METHODS OF MEASURING EMOTIONAL RESPONSE AND PREFERENCE OF CHOICE |
WO2018084577A1 (en) * | 2016-11-03 | 2018-05-11 | Samsung Electronics Co., Ltd. | Data recognition model construction apparatus and method for constructing data recognition model thereof, and data recognition apparatus and method for recognizing data thereof |
WO2019072195A1 (en) * | 2017-10-13 | 2019-04-18 | Midea Group Co., Ltd. | METHOD AND SYSTEM FOR PROVIDING AN EXCHANGE OF CUSTOMIZED INFORMATION ON THE LOCATION |
- 2022
- 2022-07-19 JP JP2024534822A patent/JPWO2024018545A1/ja active Pending
- 2022-07-19 WO PCT/JP2022/028127 patent/WO2024018545A1/ja active Application Filing
- 2022-07-19 EP EP22951934.3A patent/EP4560559A4/en active Pending
- 2024
- 2024-12-23 US US18/999,692 patent/US20250124710A1/en active Pending
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
- JP2014106628A (ja) * | 2012-11-26 | 2014-06-09 | Hitachi Systems Ltd | Consumer needs analysis system and consumer needs analysis method |
- JP2017033401A (ja) * | 2015-08-04 | 2017-02-09 | 株式会社 impactTV | Customer information collection device, customer information collection system, and customer information collection method |
- JP2018151963A (ja) * | 2017-03-14 | 2018-09-27 | オムロン株式会社 | Person trend recording device |
- WO2019049216A1 (ja) | 2017-09-05 | 2019-03-14 | 富士通株式会社 | Scoring method, scoring program, and scoring device |
- JP2019121133A (ja) * | 2017-12-28 | 2019-07-22 | 株式会社Epark | Useful information providing device, control method of useful information providing device, and useful information providing program |
- JP2020071665A (ja) | 2018-10-31 | 2020-05-07 | 富士通株式会社 | Action recognition method, action recognition program, and action recognition device |
- JP2020077343A (ja) | 2018-11-09 | 2020-05-21 | 富士通株式会社 | Rule generation device, rule generation method, and rule generation program |
- JP2021189701A (ja) * | 2020-05-29 | 2021-12-13 | 富士通株式会社 | Information processing program, information processing device, and information processing method |
Non-Patent Citations (1)
Title |
---|
See also references of EP4560559A4 |
Also Published As
Publication number | Publication date |
---|---|
EP4560559A1 (en) | 2025-05-28 |
JPWO2024018545A1 (ja) | 2024-01-25 |
US20250124710A1 (en) | 2025-04-17 |
EP4560559A4 (en) | 2025-06-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20230206633A1 (en) | Computer-readable recording medium, information processing method, and information processing apparatus | |
- JP2003271084A (ja) | Information providing device and information providing method | |
JP2023505455A (ja) | 人間の社会的行動の分類を決定するための方法およびシステム | |
CN110503457A (zh) | 用户满意度的分析方法及装置、存储介质、计算机设备 | |
EP4386652A1 (en) | Information processing program, information processing method, and information processing device | |
- JP2024089580A (ja) | Information output program, information output method, and information processing device | |
EP4231252A1 (en) | Information processing program, information processing method, and information processing apparatus | |
CN109074498A (zh) | 用于pos区域的访问者跟踪方法和系统 | |
US20230206693A1 (en) | Non-transitory computer-readable recording medium, information processing method, and information processing apparatus | |
KR20230015272A (ko) | 인공지능을 이용한 무인 정보 단말기, 주문 관리 서버 및 주문 정보 제공방법 | |
JP2020067720A (ja) | 人属性推定システム、それを利用する情報処理装置及び情報処理方法 | |
EP4231222A1 (en) | Information processing program, information processing method, and information processing apparatus | |
EP4231250A1 (en) | Information processing program, information processing method, and information processing apparatus | |
- WO2024018545A1 (ja) | Generation program, generation method, and information processing device | |
- WO2024018548A1 (ja) | Generation program, generation method, and information processing device | |
- JP2024013129A (ja) | Display control program, display control method, and information processing device | |
EP4125067A1 (en) | Generating program, generation method, and information processing device | |
- JP7647427B2 (ja) | Customer service detection program, customer service detection method, and information processing device | |
WO2022175935A1 (en) | Method and system for visual analysis and assessment of customer interaction at a scene | |
US20230206694A1 (en) | Non-transitory computer-readable recording medium, information processing method, and information processing apparatus | |
US12361715B2 (en) | Non-transitory computer-readable recording medium, generation method, and information processing device | |
US20240020596A1 (en) | Customer service management apparatus, customer service management system, customer service management method, and computer program | |
- JP7457049B2 (ja) | Support system, support processing device, and support method | |
- CN113722605A (zh) | Calculation method and system for real-time interest information | |
- JP2023160057A (ja) | Behavior tracking device and behavior tracking system | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22951934; Country of ref document: EP; Kind code of ref document: A1 |
WWE | Wipo information: entry into national phase | Ref document number: 2024534822; Country of ref document: JP |
WWE | Wipo information: entry into national phase | Ref document number: 2022951934; Country of ref document: EP |
NENP | Non-entry into the national phase | Ref country code: DE |
ENP | Entry into the national phase | Ref document number: 2022951934; Country of ref document: EP; Effective date: 20250219 |
WWP | Wipo information: published in national office | Ref document number: 2022951934; Country of ref document: EP |