WO2024009623A1 - Evaluation system, evaluation device, and evaluation method - Google Patents

Evaluation system, evaluation device, and evaluation method

Info

Publication number
WO2024009623A1
Authority
WO
WIPO (PCT)
Prior art keywords
person
satisfaction
feature amount
unit
satisfaction level
Prior art date
Application number
PCT/JP2023/018500
Other languages
English (en)
Japanese (ja)
Inventor
孝治 堀内
武志 安慶
裕人 冨田
純子 上田
義照 田中
毅 吉原
康 岡田
Original Assignee
パナソニックIpマネジメント株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by パナソニックIpマネジメント株式会社
Publication of WO2024009623A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 - Administration; Management
    • G06Q10/06 - Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 - Operations research, analysis or management
    • G06Q10/0639 - Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 - Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 - Services
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/63 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state

Definitions

  • the present disclosure relates to an evaluation system, an evaluation device, and an evaluation method.
  • Patent Document 1 discloses a store management system that calculates the employee satisfaction level of a store clerk based on a conversation between the store clerk and a conversation partner.
  • the store management system stores calculation algorithms for calculating employee satisfaction for each type of person who can be a conversation partner.
  • the store management system acquires a conversation between a store employee and a conversation partner, recognizes the store employee's emotion based on the store employee's voice included in the conversation, and determines the type of conversation partner.
  • the store management system calculates the employee's satisfaction level based on the recognition result of the employee's emotion and a calculation algorithm corresponding to the determined type of conversation partner.
  • In Patent Document 1, emotional information is calculated from voice data of a conversation between people (for example, a store clerk and a customer), and the store clerk's satisfaction level (that is, the employee satisfaction level) is calculated from the calculated emotional information.
  • However, evaluating a person's satisfaction level based only on the voice data included in a conversation may not be accurate enough, and a more accurate satisfaction evaluation is required.
  • the present disclosure was devised in view of the conventional situation described above, and aims to perform highly accurate satisfaction evaluation using multiple pieces of information included in conversations between people.
  • The present disclosure provides an evaluation system comprising: an acquisition unit that acquires audio data related to a conversation between a first person and a second person; an imaging unit that images the first person and the second person; an extraction unit that extracts, based on imaging data, a first feature amount related to the line of sight or face direction of each of the first person and the second person, and a second feature amount of the audio data; and a satisfaction level calculation unit that calculates the satisfaction level of the first person based on the first feature amount, the second feature amount, and a calculation algorithm for calculating the satisfaction level of the first person.
  • The present disclosure also provides an evaluation device comprising: an acquisition unit that acquires audio data related to a conversation between a first person and a second person, and imaging data obtained by imaging the first person and the second person; an extraction unit that extracts, based on the imaging data, a first feature amount related to the line of sight or face direction of each of the first person and the second person, and a second feature amount of the audio data; and a satisfaction level calculation unit that calculates the satisfaction level of the first person based on the first feature amount, the second feature amount, and a calculation algorithm for calculating the satisfaction level of the first person.
  • The present disclosure further provides an evaluation method that acquires audio data related to a conversation between a first person and a second person, images the first person and the second person, extracts, based on the imaging data, a first feature amount related to the line of sight or face direction of each of the first person and the second person and a second feature amount of the audio data, and calculates the satisfaction level of the first person based on the first feature amount, the second feature amount, and a calculation algorithm for calculating the satisfaction level of the first person.
  • FIG. 1 is a diagram showing an overview of this embodiment; FIG. 2 is a diagram showing an example of feature amounts; FIG. 3 is a block diagram showing an example of the internal configuration of a terminal device and a server according to Embodiment 1; FIG. 4 is a diagram showing how the satisfaction level is calculated by a logic-based algorithm; FIG. 5 is a diagram showing an example of calculating the satisfaction level at predetermined time intervals; FIG. 6 is a sequence diagram of satisfaction evaluation processing according to Embodiment 1; FIG. 7 is a diagram showing an example of the internal configuration of a terminal device and a server according to Embodiment 2;
  • FIG. 8 is a sequence diagram of satisfaction evaluation processing according to Embodiment 2; FIG. 9 is a diagram showing an example of the internal configuration of a terminal device according to Embodiment 3; FIG. 10 is a flowchart showing the process of calculating the satisfaction level on a terminal device; FIG. 11 is a sequence diagram showing the process in which the server calculates the satisfaction level from previously captured image and audio data; FIG. 12 is a diagram showing an example of a screen displayed on a terminal device; and FIG. 13 is a diagram showing an example of a screen on which a message is displayed depending on the satisfaction result.
  • FIG. 1 is a diagram showing an overview of this embodiment.
  • FIG. 1 shows a case where a person A is having a conversation with a person B using a terminal device 1 that is connected via a network to the terminal device used by the person B.
  • Person A and Person B have an interpersonal relationship; for example, Person A is Person B's subordinate, and Person B is Person A's superior.
  • the relationship between Person A and Person B is not limited to that between a boss and a subordinate, but may be between employees and customers, between colleagues, between an interviewer and an interviewee, or in any other relationship (for example, between a teacher and a student).
  • In the following, a case where person B, who is a boss, interviews person A, who is a subordinate, online will be described.
  • Person A may be read as a first person, and person B may be read as a second person.
  • The audio acquisition device 10 is, for example, a microphone, and picks up an utterance CO of the person A.
  • the audio acquisition device 10 may be installed in the terminal device 1 or may be an external device communicably connected to the terminal device 1.
  • the data collected by the audio acquisition device 10 will be referred to as audio data.
  • the imaging device 11 is, for example, a camera, and images the person A.
  • the imaging device 11 may be installed in the terminal device 1 or may be an external device communicably connected to the terminal device 1.
  • Hereinafter, the data of the person A captured by the imaging device 11 will be referred to as imaging data.
  • the terminal device 1 transmits the audio data acquired by the audio acquisition device 10 and the imaging data acquired by the imaging device 11 to a device that extracts feature amounts.
  • the device that extracts the feature amount is, for example, a server. Note that the terminal device 1 may extract the feature amount without transmitting the audio data and the imaging data to the server.
  • the feature amounts extracted from the imaging data and audio data are, for example, facial expressions, line of sight, speech, or actions. Note that the feature amounts extracted from the imaging data and audio data are not limited to these.
  • Information on facial expression or line of sight is extracted from image FR1 representing the face of person A in the captured image data.
  • Information regarding the behavior is extracted from the image FR2 representing the upper body of the person A in the image data.
  • Information related to the utterance is extracted from the audio data.
  • Information related to facial expressions, line of sight, speech, or actions is extracted by the terminal device 1 or the server 2 (see FIG. 7), and will be described in detail later.
  • the degree of satisfaction is calculated using the extracted feature data (hereinafter referred to as feature amount data) and an algorithm for estimating the degree of satisfaction (hereinafter referred to as satisfaction degree estimation algorithm).
  • Satisfaction is an index representing the degree of satisfaction of person A with the conversation with person B, which is estimated by a satisfaction estimation algorithm based on feature data.
  • Satisfaction estimation algorithms include algorithms based on predetermined logic (hereinafter referred to as logic-based algorithms) and algorithms based on machine learning (hereinafter referred to as machine learning-based algorithms).
  • a logic-based algorithm is an algorithm that defines a procedure for calculating satisfaction by repeatedly adding and subtracting points based on predetermined logic.
  • A machine learning-based algorithm is, for example, an algorithm that uses a multilayer perceptron, a random forest, or deep learning based on a convolutional neural network, and directly outputs the satisfaction level from the feature amount data.
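  • The following is a minimal, illustrative sketch of such a machine-learning-based estimator, not the patent's implementation: it assumes each conversation interval has already been reduced to a fixed-length numeric feature vector and that labelled satisfaction scores are available for training; all feature names and values are hypothetical.

    # Minimal sketch of a machine-learning-based satisfaction estimator.
    # Feature columns (speech rate, positive rate, negative rate, gaze ratio)
    # and the training values are illustrative assumptions only.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    # Hypothetical training data: rows = conversation intervals.
    X_train = np.array([
        [0.55, 0.6, 0.1, 0.9],
        [0.30, 0.2, 0.5, 0.4],
        [0.50, 0.4, 0.2, 0.7],
    ])
    y_train = np.array([4.5, 1.0, 3.0])  # satisfaction scores on a 0-5 scale

    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X_train, y_train)

    # Directly output a satisfaction level from new feature amount data.
    x_new = np.array([[0.48, 0.5, 0.15, 0.8]])
    print(model.predict(x_new))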
  • Person B can confirm whether person A is satisfied by checking the satisfaction level calculated from the audio data and imaging data, and can thereby communicate with person A smoothly. Note that the conversation between person A and person B is not limited to an online interview or the like, and may be a face-to-face conversation.
  • FIG. 2 is a diagram showing an example of feature amounts.
  • facial expressions include a smiling face, a straight face, or a crying face.
  • Actions include nodding, standing still, or tilting one's head.
  • Embodiment 1: The evaluation system 100 according to Embodiment 1 extracts feature amount data of the users (for example, person A and person B) on a terminal device while the users are having an online conversation, transmits the extracted feature amount data to a server, and has the server calculate the satisfaction level.
  • FIG. 3 is a block diagram showing an example of the internal configuration of the terminal device and the server according to the first embodiment.
  • the evaluation system 100A includes at least a terminal device 1A and a server 2A.
  • The number of terminal devices is not limited to one, and may be two or more.
  • The terminal device 1A is an example of a terminal used by a user. Note that when the evaluation systems, terminal devices, and servers are distinguished from one another, a letter is appended after the numeral. When they are not distinguished, only the numeral is used in the description.
  • the terminal device 1A and the server 2A are communicably connected via the network NW.
  • the terminal device 1A and the server 2A may be communicably connected via a wired LAN (Local Area Network).
  • the terminal device 1A and the server 2A may perform wireless communication (for example, wireless LAN such as Wi-Fi (registered trademark)) without going through the network NW.
  • the terminal device 1A includes at least a communication I/F 13, a memory 14, an input device 15, a display device 16, an I/F 17, an audio acquisition device 10, an imaging device 11, and a processor 12.
  • the terminal device 1A is a PC (Personal Computer), a tablet, a mobile terminal, a housing including the audio acquisition device 10 and the imaging device 11, or the like.
  • the communication I/F 13 is a network interface circuit that performs wireless or wired communication with the network NW.
  • I/F represents an interface.
  • the terminal device 1A is communicably connected to the server 2A via the communication I/F 13 and the network NW.
  • the communication I/F 13 transmits the feature amount data extracted by the feature amount extraction unit 12A (see below) to the server 2A.
  • Communication methods used by the communication I/F 13 include, for example, a WAN (Wide Area Network), a LAN (Local Area Network), LTE (Long Term Evolution), mobile communication such as 5G, power line communication, short-range wireless communication (for example, Bluetooth (registered trademark) communication), and communication for mobile phones.
  • The memory 14 includes, for example, a RAM (Random Access Memory) used as a work memory when each process of the processor 12 is executed, and a ROM (Read Only Memory) that stores programs and data defining the operations of the processor 12. Data or information generated or acquired by the processor 12 is temporarily stored in the RAM. A program that defines the operation of the processor 12 is written in the ROM.
  • the input device 15 receives input from a user (for example, person A or person B).
  • the input device 15 is, for example, a touch panel display or a keyboard.
  • the input device 15 accepts operations in response to instructions displayed on the display device 16.
  • the display device 16 displays a screen (see below) created by the drawing screen creation unit 24B of the server 2.
  • the display device 16 is, for example, a display or a notebook PC monitor.
  • the I/F 17 is a software interface.
  • The I/F 17 is communicably connected to the communication I/F 13, the memory 14, the input device 15, the display device 16, the audio acquisition device 10, the imaging device 11, and the processor 12, and exchanges data with each device.
  • the I/F 17 may be omitted from the terminal device 1A, and data may be exchanged between the devices of the terminal device 1A.
  • the audio acquisition device 10 picks up the utterances of a user (for example, person A or person B).
  • the audio acquisition device 10 is configured with a microphone device that can collect audio generated based on a user's utterance (that is, detect an audio signal).
  • The audio acquisition device 10 collects audio generated based on a user's utterance, converts it into an electrical signal as audio data, and outputs the electrical signal to the I/F 17.
  • the imaging device 11 is a camera that images a user (for example, person A or person B).
  • the imaging device 11 includes at least a lens (not shown) as an optical element and an image sensor (not shown).
  • the lens receives light reflected by the object from within the angle of view of the imaged area of the imaging device 11 and forms an optical image of the object on the light receiving surface (in other words, the imaging surface) of the image sensor.
  • The image sensor is, for example, a solid-state imaging device such as a CCD (Charge Coupled Device) or a CMOS (Complementary Metal Oxide Semiconductor).
  • the image sensor converts an optical image formed on an imaging surface through a lens into an electrical signal and sends it to the I/F 17 at predetermined time intervals (for example, 1/30 seconds).
  • the audio acquisition device 10 and the imaging device 11 may be external devices that are communicably connected to the terminal device 1A.
  • The processor 12 is a semiconductor chip on which at least one of electronic devices such as a CPU (Central Processing Unit), a DSP (Digital Signal Processor), a GPU (Graphics Processing Unit), and an FPGA (Field Programmable Gate Array) is mounted.
  • The processor 12 functions as a controller that controls the overall operation of the terminal device 1A, and performs control processing for supervising the operation of each part of the terminal device 1A, data input/output processing with the I/F 17, data arithmetic processing, and data storage processing.
  • the processor 12 realizes the function of the feature extraction unit 12A.
  • the processor 12 uses the RAM of the memory 14 during operation, and temporarily stores data generated or acquired by the processor 12 in the RAM of the memory 14.
  • the feature amount extraction unit 12A extracts feature amounts (see FIG. 2) based on the audio data acquired from the audio acquisition device 10 and the image data acquired from the imaging device 11.
  • The feature amount extraction unit 12A may extract each feature amount from the audio data and the imaging data using, for example, trained model data for AI (Artificial Intelligence) processing stored in the memory 14 (in other words, based on AI).
  • the feature amount extraction unit 12A detects the face part of the person A from the image data, and also detects the direction (in other words, the line of sight) of both eyes (that is, the left eye and the right eye) of the detected face part.
  • the feature extracting unit 12A detects the line of sight of the person A who is viewing the screen displayed on the display device 16 (for example, the captured video of the person B).
  • The line-of-sight detection method can be realized using publicly known techniques; for example, the line of sight may be detected based on the difference in the orientation of both eyes reflected in each of a plurality of captured images (frames), or it may be detected using other methods.
  • the feature extracting unit 12A detects the facial part of the person A from the image data and also detects the direction of the face.
  • the direction of the face is the angle of the face with respect to a specific location on the display device 16 (for example, the center position of the panel of the display device 16).
  • The angle of the face is expressed as a vector composed of an azimuth angle and an elevation angle that indicate the three-dimensional direction, as viewed from the specific location on the display device 16 (see above), in which the face of the person A looking at that location exists. Note that the specific location is not limited to the center position of the panel.
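  • As an illustrative sketch only, the azimuth and elevation angles can be derived from a 3D direction vector as follows; the coordinate convention (x: right, y: up, z: from the reference point on the display toward the person) is an assumption for this example and is not defined by the patent.

    # Converting a 3D direction vector into azimuth and elevation angles.
    import math

    def azimuth_elevation(direction):
        """Return (azimuth, elevation) in degrees for a 3D direction vector."""
        x, y, z = direction
        azimuth = math.degrees(math.atan2(x, z))                    # left/right angle
        elevation = math.degrees(math.atan2(y, math.hypot(x, z)))   # up/down angle
        return azimuth, elevation

    # Example: face located slightly to the right of and above the reference point.
    print(azimuth_elevation((0.2, 0.1, 1.0)))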
  • the face direction detection method can be realized using known techniques.
  • the feature extracting unit 12A detects the face part of the person A from the image data and also detects the facial expression of the person A.
  • the facial expression detection method can be realized using known techniques.
  • the feature extraction unit 12A detects the motion of person A from the image data.
  • the motion detection method can be realized using known techniques.
  • the feature extraction unit 12A detects the speaking time of person A from the voice data.
  • The speaking time may be detected by, for example, integrating the time of the portions of the voice data in which the voice signal of person A is detected. Note that the speaking time detection method may be implemented using other known techniques. Furthermore, the feature amount extraction unit 12A calculates the rate at which each of person A and person B is speaking, based on the detected speaking times.
  • the feature extraction unit 12A detects the emotion of person A from the voice data.
  • the feature extraction unit 12A detects the emotion by detecting, for example, the intensity of the voice, the number of moras per unit time, the intensity of each word, the volume, or the spectrum of the voice, etc. from the voice data.
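  • The following is a deliberately simplified stand-in for such voice features, not the patent's method: it computes only per-frame RMS volume and the fraction of "active" frames from raw samples; the frame length and activity threshold are arbitrary assumptions, and real emotion detection (mora rate, pitch, per-word intensity, spectrum) is far richer.

    # Simplified, illustrative volume features from raw audio samples.
    import numpy as np

    def volume_features(samples, sample_rate=16000, frame_ms=25, active_db=-35.0):
        frame_len = int(sample_rate * frame_ms / 1000)
        n_frames = len(samples) // frame_len
        frames = samples[:n_frames * frame_len].reshape(n_frames, frame_len)
        rms = np.sqrt(np.mean(frames ** 2, axis=1) + 1e-12)
        rms_db = 20 * np.log10(rms)
        return {
            "mean_volume_db": float(rms_db.mean()),
            "active_frame_ratio": float(np.mean(rms_db > active_db)),
        }

    # Example with one second of synthetic audio.
    rng = np.random.default_rng(0)
    print(volume_features(rng.normal(0, 0.05, 16000)))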
  • the emotion detection method is not limited to this, and may be realized by other known techniques.
  • the server 2A includes a communication I/F 21, a memory 22, an input device 23, an I/F 26, and a processor 24.
  • the communication I/F 21 transmits and receives data to and from each of the one or more terminal devices 1 via the network NW.
  • the communication I/F 21 transmits data of a screen output from the I/F 26 to be displayed on the display device 16 to the terminal device 1A.
  • the memory 22 includes, for example, a RAM as a work memory used when the processor 24 executes each process, and a ROM that stores programs and data that define the operations of the processor 24. Data or information generated or acquired by the processor 24 is temporarily stored in the RAM. A program that defines the operation of the processor 24 is written in the ROM.
  • the memory 22 also stores a satisfaction level estimation algorithm.
  • the input device 23 receives input from a user (for example, an administrator of the evaluation system 100).
  • The input device 23 is, for example, a touch panel display or a keyboard.
  • The input device 23 accepts the setting of threshold values (see below) for the logic-based algorithm.
  • the I/F 26 is a software interface.
  • the I/F 26 is communicably connected to the communication I/F 21, the memory 22, the input device 23, and the processor 24, and exchanges data with each device. Note that the I/F 26 may be omitted from the server 2A, and data may be exchanged between the devices of the server 2A.
  • the processor 24 is a semiconductor chip on which at least one of electronic devices such as a CPU, a DSP, a GPU, and an FPGA is mounted.
  • The processor 24 functions as a controller that governs the overall operation of the server 2A, and performs control processing for supervising the operation of each part of the server 2A, data input/output processing with the I/F 26, data arithmetic processing, and data storage processing.
  • the processor 24 implements the functions of the satisfaction level estimation section 24A and the drawing screen creation section 24B.
  • the processor 24 uses the RAM of the memory 22 during operation, and temporarily stores data generated or obtained by the processor 24 in the RAM of the memory 22.
  • the satisfaction estimation unit 24A calculates the satisfaction of the person A using the feature amount data acquired from the terminal device 1A and the satisfaction estimation algorithm recorded in the memory 22.
  • the satisfaction level estimation unit 24A may calculate the satisfaction level using a logic-based algorithm, or may calculate the satisfaction level using a machine learning-based algorithm.
  • the satisfaction estimation unit 24A outputs information regarding the calculated satisfaction level to the drawing screen creation unit 24B.
  • the drawing screen creation unit 24B creates a screen to be displayed on the display device 16 of the terminal device 1A using the satisfaction level acquired from the satisfaction level estimation unit 24A.
  • the screen includes, for example, a captured video of person A, information regarding satisfaction, a button for controlling the start of satisfaction evaluation, and the like. Note that the items included on the screen are not limited to these.
  • Methods of displaying information regarding the satisfaction level include, for example, displaying the satisfaction values calculated at predetermined time intervals as numbers or plotting them on a graph each time they are calculated, or displaying the satisfaction value during the meeting or at the end of the meeting.
  • the graph regarding the satisfaction value is a graph in which values are plotted, a bar graph, a meter, or the like.
  • the drawing screen creation unit 24B outputs the created screen to the I/F 26.
  • FIG. 4 is a diagram showing a method of calculating the satisfaction level using the logic-based algorithm.
  • In the logic-based algorithm, the satisfaction level is calculated by adding and subtracting points according to predetermined rules (hereinafter referred to as a determination method).
  • the feature amounts used in the determination method are referred to as determination elements.
  • Points are added and subtracted, for example, at predetermined time intervals, over the entire conversation, from the start of the conversation to the current time, or over the last 30% of the conversation. Note that the time range over which points are added or subtracted is not limited to these and may be arbitrarily determined by the user.
  • the speech rate represents the percentage of time that a user (for example, person A or person B) speaks within a predetermined period of time. For example, if the user speaks for a total of 1.0 seconds out of 2.5 seconds, the speaking rate is 1.0/2.5, which is 0.4 (that is, 40%).
  • the speech rate can be calculated, for example, by extracting the user's speech time at specific time intervals and dividing the total of the extracted speech times by the extracted total time. Note that the method of calculating the speech rate is one example and is not limited to this. Calculation of the utterance rate may be performed by the feature amount extraction section 12A, or may be performed by the satisfaction level estimation section 24A based on the feature amount data acquired from the feature amount extraction section 12A.
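  • The following is a minimal sketch of that speech-rate calculation, under the assumption that the feature extraction step already yields speech intervals (start, end) in seconds within an observation window; the interval values are illustrative and chosen to reproduce the 1.0 s out of 2.5 s example above.

    # Minimal sketch of the speech-rate calculation.
    def speech_rate(speech_intervals, window_seconds):
        spoken = sum(end - start for start, end in speech_intervals)
        return spoken / window_seconds

    # Person speaks for a total of 1.0 s out of a 2.5 s window -> 0.4 (40%).
    intervals_a = [(0.2, 0.8), (1.5, 1.9)]   # 0.6 s + 0.4 s = 1.0 s
    print(speech_rate(intervals_a, 2.5))     # 0.4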
  • If the speech rate of person A is equal to or greater than the speech rate of person B, the satisfaction estimation algorithm adds 0.5 points to the satisfaction level. Note that the numerical values added and subtracted below are merely examples; they are not limited to 0.5 and may be any predetermined value. If the speech rate of person A is less than the speech rate of person B, the satisfaction estimation algorithm subtracts 0.5 points from the satisfaction level.
  • points may be added or subtracted by taking into consideration not only the speech rate of person A relative to the speech rate of person B, but also whether the speech rate of person A is equal to or higher than a preset threshold. That is, when the speech rate of person A is equal to or greater than the speech rate of person B and the speech rate of person A is equal to or greater than the first threshold value, the satisfaction level estimation algorithm adds 0.5 points to the satisfaction level. If the speech rate of person A is less than the speech rate of person B and the speech rate of person A is less than a second threshold that is less than or equal to the first threshold, the satisfaction level estimation algorithm subtracts 0.5 points from the satisfaction level.
  • the first threshold is, for example, 50%
  • the second threshold is, for example, 40%. Note that the values of the first threshold value and the second threshold value are merely examples, and may be changed as appropriate by the user (for example, person B).
  • Note that when the speech rate of person A is equal to or greater than the speech rate of person B but is less than the first threshold, points may be subtracted, or no points need be added or subtracted.
  • Similarly, when the speech rate of person A is less than the speech rate of person B but is equal to or greater than the second threshold, points may be added, points may not be added, or points may be subtracted.
  • Further, when the speech rate of person A is equal to or greater than the first threshold, points may be added regardless of the speech rate of person B.
  • When the speech rate of person A is less than the second threshold, points may be deducted regardless of the speech rate of person B.
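  • As an illustrative sketch only, the speech-rate rule described above can be expressed as follows. The 0.5-point step and the 50% and 40% thresholds are the example values given in the text; borderline cases are left unchanged here, which is one of the permitted options.

    # Sketch of the speech-rate point rule of the logic-based algorithm.
    def speech_rate_delta(rate_a, rate_b, first_threshold=0.5, second_threshold=0.4,
                          point=0.5):
        if rate_a >= rate_b and rate_a >= first_threshold:
            return +point
        if rate_a < rate_b and rate_a < second_threshold:
            return -point
        return 0.0  # borderline cases: no change in this sketch

    print(speech_rate_delta(0.55, 0.45))  # +0.5
    print(speech_rate_delta(0.30, 0.70))  # -0.5
    print(speech_rate_delta(0.45, 0.40))  # 0.0 (borderline case)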
  • Emotion is an index calculated from conversation audio data.
  • A positive rate, a neutral rate, and a negative rate are calculated based on the emotion, facial expression, or action.
  • the satisfaction estimation algorithm uses the positive rate and negative rate to add and subtract points.
  • the positive rate indicates the rate at which the user's (for example, person A) emotion is determined to be positive within a predetermined period of time.
  • Examples of feature amounts that are determined to be positive include person A's voice becoming louder, person A's voice becoming higher in pitch, person A nodding, or person A smiling. Note that the feature amounts that are determined to be positive are merely examples and are not limited to these.
  • the neutral rate indicates the rate at which the emotions of the user (for example, person A) are determined to be neutral within a predetermined period of time.
  • the neutral state is a state in which it is assumed that person A's emotions are neither positive nor negative.
  • a neutral state is a state in which person A is calm.
  • the feature amount that is determined to be neutral is, for example, that person A has a straight face or that person A is standing still. Note that the feature amounts that are determined to be neutral are merely examples, and are not limited to these.
  • the negative rate indicates the rate at which the emotions of the user (for example, person A) are determined to be negative within a predetermined period of time.
  • Features that are determined to be negative include, for example, person A having a crying face, person A's voice becoming quieter, person A's voice falling in pitch, or person A tilting his or her head. Note that the feature amounts that are determined to be negative are merely examples and are not limited to these.
  • For example, suppose the evaluation system 100 makes two positive determinations, two negative determinations, and one neutral determination within 2.5 seconds.
  • the positive rate is (1+1)/5, which is 0.4 (that is, 40%).
  • the negative rate is (1+1)/5, which is 0.4 (that is, 40%).
  • the neutral rate is 1/5, which is 0.2 (that is, 20%).
  • the calculation of the positive rate, neutral rate, and negative rate may be performed by the feature amount extraction unit 12A, or may be performed by the satisfaction level estimation unit 24A based on the feature amount data acquired from the feature amount extraction unit 12A.
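  • A minimal sketch of that rate calculation follows: each determination within the window carries a label, and each rate is simply that label's share of all determinations. The labels reproduce the example above of two positive, two negative, and one neutral determination within 2.5 seconds.

    # Sketch of the positive/neutral/negative rate calculation.
    from collections import Counter

    def emotion_rates(labels):
        counts = Counter(labels)
        total = len(labels)
        return {k: counts.get(k, 0) / total
                for k in ("positive", "neutral", "negative")}

    labels = ["positive", "negative", "neutral", "positive", "negative"]
    print(emotion_rates(labels))  # {'positive': 0.4, 'neutral': 0.2, 'negative': 0.4}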
  • If the positive rate of person A is equal to or greater than the threshold for adding points, the satisfaction estimation algorithm adds 0.5 points to the satisfaction level. If the negative rate of person A is equal to or greater than the threshold for deducting points, the satisfaction estimation algorithm subtracts 0.5 points from the satisfaction level.
  • the threshold for adding points is 50%. In this case, if the positive rate is 50% or more, the satisfaction estimation algorithm adds 0.5 points to the satisfaction. Note that the threshold value for adding points is not limited to 50% and may be changed as appropriate by the user.
  • The threshold for deducting points is, for example, 50%. In this case, if the negative rate is 50% or more, the satisfaction estimation algorithm subtracts 0.5 points from the satisfaction level. Note that the threshold for deducting points is not limited to 50% and may be changed as appropriate by the user.
  • If the time during which person A looks in the direction of the display is equal to or greater than a third threshold, the satisfaction estimation algorithm adds 0.5 points to the satisfaction level. If the time during which person A looks in the direction of the display is less than a fourth threshold, which is equal to or less than the third threshold, 0.5 points are deducted from the satisfaction level.
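  • As an illustrative sketch only, the emotion rule and the gaze rule can be expressed as follows. The 0.5-point step and the 50% thresholds come from the examples above; the gaze thresholds are placeholder values, since the text gives no numbers for the third and fourth thresholds.

    # Sketch of the emotion-rate and gaze-time point rules.
    def emotion_delta(positive_rate, negative_rate, add_threshold=0.5,
                      deduct_threshold=0.5, point=0.5):
        delta = 0.0
        if positive_rate >= add_threshold:
            delta += point
        if negative_rate >= deduct_threshold:
            delta -= point
        return delta

    def gaze_delta(look_seconds, third_threshold=20.0, fourth_threshold=10.0,
                   point=0.5):
        if look_seconds >= third_threshold:
            return +point
        if look_seconds < fourth_threshold:
            return -point
        return 0.0

    print(emotion_delta(0.6, 0.2))  # +0.5
    print(gaze_delta(5.0))          # -0.5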
  • FIG. 5 is a diagram illustrating an example of calculating satisfaction levels at predetermined time intervals.
  • the satisfaction value at the start of evaluation is set to 3, and the satisfaction estimation algorithm repeatedly adds and subtracts satisfaction points.
  • the satisfaction value at the start of the evaluation is not limited to 3 and may be any value.
  • the satisfaction level is assumed to take a value between 0 and 5 points. Note that the range of values that the satisfaction level can take is not limited to 0 to 5 points, but may be in other ranges, and the range does not need to be set.
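  • A minimal sketch of this per-interval update follows: starting from 3 points, a delta is added every 30 seconds and the running value is clamped to the 0-5 range. The sequence of deltas is an arbitrary example standing in for the combined result of the speech-rate, emotion, and gaze rules at each interval.

    # Sketch of the repeated add/subtract update with clamping to 0-5.
    def run_evaluation(deltas, start=3.0, low=0.0, high=5.0):
        satisfaction = start
        history = []
        for delta in deltas:            # one delta per 30-second interval
            satisfaction = min(high, max(low, satisfaction + delta))
            history.append(satisfaction)
        return history

    # Ten intervals (5 minutes) trending upward, as in case CA of FIG. 5.
    print(run_evaluation([+0.5, +0.5, -0.5, +0.5, +0.5, +0.5, 0.0, +0.5, +0.5, +0.5]))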
  • the graphs for Case CA and Case CB are plots of satisfaction values calculated every 30 seconds.
  • the horizontal axis of the graphs for case CA and case CB represents elapsed time, and the vertical axis represents satisfaction level.
  • Case CA and case CB are, for example, cases in which the conversation ends in 5 minutes.
  • In case CA, the satisfaction estimation algorithm repeatedly adds or subtracts points every 30 seconds; when 5 minutes have elapsed, the satisfaction level is 5 points, indicating that the user (for example, person A) ended the conversation with a high level of satisfaction.
  • In case CB, the satisfaction estimation algorithm repeatedly adds or subtracts points every 30 seconds; when 5 minutes have elapsed, the satisfaction level is 0 points, indicating that the user (for example, person A) ended the conversation with a low level of satisfaction.
  • FIG. 6 is a sequence diagram of satisfaction evaluation processing according to the first embodiment.
  • the satisfaction level is evaluated by two terminal devices (terminal device 1AA, terminal device 1AB) and server 2A.
  • the evaluation system 100A calculates the satisfaction level of person A from a conversation between person A and person B.
  • person A who is the person to be evaluated, uses terminal device 1AA, and person B uses terminal device 1AB.
  • The number of terminal devices is not limited to two, and may be one, or may be three or more.
  • the terminal device 1AA sets the values of each threshold value related to addition and deduction of satisfaction points by the satisfaction estimation algorithm (St100). Note that the setting of the threshold value in the terminal device 1AA may be omitted from the process related to FIG. 6.
  • the terminal device 1AA starts evaluating the satisfaction level of person A (St101).
  • the start of the satisfaction evaluation is executed, for example, by the user (for example, person B) pressing a button to start evaluation displayed on the display device 16.
  • the terminal device 1AA acquires image data and audio data of person A (St102).
  • the terminal device 1AA extracts feature amounts based on the imaging data and audio data acquired in the process of step St102 (St103).
  • the terminal device 1AB sets the values of each threshold regarding addition and deduction of satisfaction points by the satisfaction estimation algorithm (St104).
  • the threshold value may be set arbitrarily by the person B, or may be automatically set based on a set value stored in the memory 14 in advance. Further, the setting of the threshold value may be performed not in the terminal device 1AB but in the server 2A.
  • the terminal device 1AB starts evaluating the satisfaction level of person A (St105).
  • the terminal device 1AB acquires image data and audio data of person B (St106).
  • the terminal device 1AB extracts feature amounts based on the imaging data and audio data acquired in the process of step St106 (St107).
  • the terminal device 1AA transmits the threshold value set in the process of step St100 and the feature amount extracted in the process of step St103 to the server 2A.
  • the terminal device 1AB transmits the threshold setting value set in the process of step St104 and the feature amount extracted in the process of step St107 to the server 2A (St108).
  • the server 2A calculates the satisfaction level based on the threshold setting value, the feature amount, and the satisfaction level estimation algorithm obtained in the process of step St108 (St109).
  • the terminal device 1AA requests the server 2A to transmit the satisfaction results (St110). Note that the process of step St110 may be omitted from the process related to FIG. 6.
  • the terminal device 1AB requests the server 2A to send the satisfaction results (St111).
  • the server 2A draws a screen related to the satisfaction level results.
  • the server 2A transmits a screen on which the satisfaction level results are drawn to the terminal device 1AB (St112).
  • the server 2A transmits a screen on which the satisfaction level results are drawn to the terminal device 1AA (St113).
  • the process of step St113 may be omitted from the process related to FIG. 6.
  • the terminal device 1AA displays the screen acquired in the process of step St113 on the display of the terminal device 1AA (St114).
  • the process of step St114 may be omitted from the process related to FIG. 6.
  • the terminal device 1AB displays the screen acquired in the process of step St112 on the display of the terminal device 1AB (St115).
  • the server 2A transmits a signal to end the evaluation to the terminal device 1AA and the terminal device 1AB (St116).
  • the terminal device 1AA ends the satisfaction evaluation based on the signal acquired in the process of step St116 (St117).
  • the terminal device 1AB ends the satisfaction evaluation based on the signal acquired in the process of step St116 (St118).
  • the terminal device 1AA transmits a request to send the final satisfaction result to the server 2A (St119).
  • The process of step St119 may be omitted from the process related to FIG. 6.
  • the terminal device 1AB transmits a request to transmit the final satisfaction result to the server 2A (St120).
  • the server 2A draws a screen related to the final satisfaction result based on the request obtained in the process of step St120.
  • the server 2A transmits a screen showing the final result of the satisfaction level to the terminal device 1AB (St121).
  • the server 2A draws a screen showing the final result of the satisfaction level.
  • the server 2A transmits a screen showing the final result of the satisfaction level to the terminal device 1AA (St122).
  • the process of step St122 may be omitted from the process of FIG. 6.
  • the terminal device 1AA displays the screen acquired in the process of step St122 on the display of the terminal device 1AA (St123).
  • The process of step St123 may be omitted from the process related to FIG. 6.
  • the terminal device 1AB displays the screen acquired in the process of step St121 on the display of the terminal device 1AB (St124).
  • Embodiment 2: In the evaluation system according to Embodiment 2, the server performs everything from the extraction of feature amounts to the calculation of the satisfaction level, based on the imaging data and audio data acquired by the terminal devices.
  • the same reference numerals will be used for the same components as in Embodiment 1, and the description thereof will be omitted.
  • FIG. 7 is a diagram showing an example of the internal configuration of a terminal device and a server according to the second embodiment. Only the parts that are different from the hardware block diagram according to the first embodiment shown in FIG. 3 will be explained.
  • the feature extraction unit 12A is incorporated into the processor 24 of the server 2B. That is, the terminal device 1B includes a communication I/F 13, a memory 14, an input device 15, a display device 16, an I/F 17, an audio acquisition device 10, and an imaging device 11.
  • the server 2B includes a communication I/F 21, a memory 22, an input device 23, an I/F 26, and a processor 24.
  • the processor 24 realizes the functions of the feature amount extraction section 12A, the satisfaction estimation section 24A, and the drawing screen creation section 24B.
  • the feature amount extraction unit 12A extracts feature amounts based on the audio data and image data acquired from the terminal device 1B.
  • FIG. 8 is a sequence diagram of satisfaction evaluation processing according to the second embodiment. Processes similar to those in the sequence diagram of FIG. 6 of the first embodiment are given the same reference numerals, and only different processes will be described.
  • the terminal device 1BA transmits the threshold value set in the process of step St100 and the imaging data and audio data acquired in the process of step St102 to the server 2B (St200).
  • the terminal device 1BB transmits the threshold value set in the process of step St104 and the imaging data and audio data acquired in the process of step St106 to the server 2B (St200).
  • the server 2B extracts feature amounts based on the imaging data and audio data acquired in the process of step St200 (St201).
  • the server 2B calculates the degree of satisfaction based on the feature amount extracted in the process of step St201 (St202).
  • the following processing is the same as each processing related to the sequence diagram of FIG. 6, so the explanation will be omitted.
  • Embodiment 3: In the evaluation system according to Embodiment 3, the satisfaction level is calculated on the terminal device or the server based on imaging data and audio data previously acquired by the terminal device (that is, video and audio recorded in the past).
  • the same reference numerals will be used for the same components as in Embodiment 1, and the description thereof will be omitted.
  • FIG. 9 is a diagram showing an example of the internal configuration of a terminal device according to the third embodiment. Only the parts that are different from the hardware block diagram according to the first embodiment shown in FIG. 3 will be explained.
  • the terminal device 1C includes a communication I/F 13, a memory 14, an input device 15, a display device 16, an audio acquisition device 10, an imaging device 11, and a processor 12.
  • the audio acquisition device 10 and the imaging device 11 may be omitted.
  • the communication I/F 13 may transmit the screen drawn by the drawing screen creation unit 24B of the processor 12 to another terminal device or the like. Further, when the audio acquisition device 10 and the imaging device 11 are external devices, the communication I/F 13 acquires image data captured in the past and audio data captured in the past from the external devices.
  • the feature amount extraction unit 12A of the processor 12 extracts feature amounts based on image data captured in the past and audio data captured in the past.
  • the feature extraction unit 12A outputs the extracted feature data to the satisfaction estimation unit 24A.
  • the feature extraction unit 12A obtains one file that includes the imaging data and audio data of both person A and person B.
  • The feature amount extraction unit 12A separates the single file into four pieces of data, namely the image data and voice data of person A and the image data and voice data of person B, using known techniques such as image recognition or voice recognition.
  • Alternatively, the feature amount extraction unit 12A may obtain two files: a file containing the image data and audio data of person A, and a file containing the image data and audio data of person B.
  • the feature extraction unit 12A separates each file into image data and audio data using a known technique.
  • the input device 15 may obtain an input from a user (for example, person A or person B) regarding whether each of the two files is associated with person A or person B.
  • Alternatively, the feature amount extraction unit 12A may obtain four files: a file of the image data of person A, a file of the voice data of person A, a file of the image data of person B, and a file of the voice data of person B.
  • the input device 15 may obtain input from the user (for example, person A or person B) regarding whether each of the four files is associated with person A or person B.
  • the hardware block diagram of the third embodiment is similar to FIG. 7 of the second embodiment.
  • the server 2B acquires audio data previously acquired by the audio acquisition device 10 of the terminal device 1B and imaging data previously acquired by the imaging device 11 of the terminal device 1B.
  • the feature amount extraction unit 12A of the server 2B extracts the feature amount based on the acquired audio data and image data, and the satisfaction level estimation unit 24A calculates the satisfaction level based on the extracted feature amount.
  • FIG. 10 is a flowchart illustrating the process of calculating the degree of satisfaction on the terminal device. Each process related to FIG. 10 is executed by the processor 12.
  • the processor 12 sets the values of each threshold regarding addition and deduction of satisfaction points by the satisfaction estimation algorithm (St300).
  • The processor 12 may set the threshold values by obtaining an input signal from the user (for example, person B) via the input device 15, or may set them automatically based on setting values stored in advance in the memory 14.
  • the processor 12 acquires previously captured image data and captured audio data stored in the memory 14 (St301). Note that the processor 12 is not limited to past data, and may acquire data currently being acquired by the audio acquisition device 10 and the imaging device 11 of the terminal device 1C.
  • the processor 12 extracts feature amounts from the imaging data and audio data acquired in the process of step St301 (St302).
  • the processor 12 calculates the satisfaction level of the user (for example, person A) based on the feature amount extracted in the process of step St302 (St303).
  • the processor 12 draws a screen showing the satisfaction level calculated in the process of step St303 (St304).
  • FIG. 11 is a sequence diagram showing a process in which the server calculates the degree of satisfaction from data captured in images and sounds in the past.
  • the terminal device 1B transmits to the server 2B threshold setting information regarding addition and deduction of satisfaction points based on the satisfaction estimation algorithm (St400).
  • the terminal device 1B transmits image data captured in the past and audio data captured in the past to the server 2B (St401).
  • the server 2B extracts feature amounts based on the imaging data and audio data acquired in the process of step St401 (St402).
  • the server 2B calculates the degree of satisfaction based on the feature amount acquired in the process of step St402 (St403).
  • the terminal device 1B requests the server 2B to send the final result of the satisfaction level (St404).
  • the server 2B draws a screen including the final satisfaction result based on the request received from the terminal device 1B in the process of step St404.
  • the server 2B transmits the drawn screen to the terminal device 1B (St405).
  • the terminal device 1B displays a screen including the final result of the satisfaction level obtained in the process of step St405 (St406).
  • FIG. 12 is a diagram showing an example of a screen displayed on a terminal device.
  • Screen MN1 is an example of a screen displayed on terminal device 1 at a certain moment during a meeting. For example, if Person A and Person B are having a meeting and Person A is the person to be evaluated, screen MN1 is the screen that Person B refers to. Screen MN1 includes display areas IT1, IT2 and buttons BT1, BT2, BT3, BT4, BT5, and BT6.
  • the display area IT2 is an area where the captured video of the person A is displayed in real time.
  • the drawing screen creation unit 24B displays the captured video of the person A acquired from the imaging device 11 in the display area IT2.
  • the display area IT1 is an area where the satisfaction results are displayed.
  • the display area IT1 displays a graph in which satisfaction values calculated at predetermined time intervals are plotted.
  • the drawing screen creation unit 24B may display the satisfaction level in the display area IT1 at the timing when the satisfaction level is acquired from the satisfaction level estimation unit 24A.
  • The display area IT1 is not limited to graphs, and may numerically display the satisfaction value calculated from the start of the meeting up to the present, or may numerically display the satisfaction value calculated at predetermined time intervals each time it is calculated.
  • The display area IT1 may display the current satisfaction level as text such as "high", "medium", or "low" based on the calculated satisfaction value, or may display an emoticon or pictogram corresponding to the satisfaction level.
  • the button BT1 is a button that turns on the display of the captured image of the user on the other party's terminal device 1.
  • the button BT2 is a button for turning off the display of the user's captured video on the other party's terminal device 1.
  • The button BT3 is a button for turning on the output of the user's own voice to the terminal device 1 of the other party.
  • The button BT4 is a button for turning off the output of the user's own voice to the terminal device 1 of the other party.
  • the button BT5 is a button for starting or ending satisfaction evaluation. Button BT5 may be omitted from screen MN1.
  • the button BT6 is a button for starting or ending a conference.
  • Screen MN2 is an example of a screen displayed on the terminal device 1 at a certain moment during the meeting.
  • Screen MN2 is a screen displayed on terminal device 1 when one minute has passed since screen MN1 was displayed on terminal device 1.
  • the display area IT3 is an area where the satisfaction results are displayed.
  • the display area IT3 displays a graph in which satisfaction values calculated at predetermined time intervals are plotted.
  • the display area IT3 displays a graph in which two satisfaction results are additionally plotted on the graph displayed in the display area IT1 as the conversation between person A and person B progresses for one minute. In this way, in the display area IT3, satisfaction results are additionally plotted in real time according to the elapsed time.
  • FIG. 13 is a diagram showing an example of a screen on which a message is displayed according to the satisfaction level result.
  • elements that overlap with those in FIG. 12 are given the same reference numerals to simplify or omit the description, and different contents will be described.
  • Screen MN3 is an example of a screen displayed on terminal device 1 at a certain moment during a meeting.
  • Screen MN3 is a screen that is displayed on terminal device 1 when one minute has elapsed since screen MN1 was displayed on terminal device 1.
  • the display area IT4 is an area where the satisfaction results are displayed.
  • the display area IT4 displays a graph in which satisfaction values calculated at predetermined time intervals are plotted.
  • the display area IT4 displays a graph in which two satisfaction results are additionally plotted on the graph displayed in the display area IT1 as the conversation between person A and person B progresses for one minute.
  • the message Mes is a message displayed according to the satisfaction level. For example, the message Mes is displayed according to person A's speaking rate.
  • When determining that the speech rate of person A is less than the speech rate of person B, the satisfaction level estimation unit 24A outputs a signal indicating that the speech rate of person A is less than the speech rate of person B to the drawing screen creation unit 24B.
  • the satisfaction level estimation unit 24A when determining that the speech rate of person A is less than the second threshold, the satisfaction level estimation unit 24A outputs a signal indicating that the speech rate of person A is less than the second threshold to the drawing screen creation unit 24B.
  • Note that the determination as to whether or not the speech rate of person A is less than the speech rate of person B, and the determination as to whether or not the speech rate of person A is less than the second threshold, may be performed by the feature amount extraction unit 12A.
  • the drawing screen creation unit 24B creates a message for the person B to refrain from speaking based on the signal acquired from the satisfaction level estimation unit 24A, and causes the message to be displayed on the screen MN3.
  • the message to refrain from speaking is, for example, "Let's listen to what Person A has to say.” Note that the message to refrain from speaking is one example and is not limited to this.
  • the terminal device 1 or the server 2 may calculate the satisfaction level of each of a plurality of people and calculate the average value of the satisfaction levels of all the people. In this way, the terminal device 1 or the server 2 may aggregate the satisfaction level and notify the user without identifying the individual.
  • the terminal device 1 may display something that attracts the viewer's attention, such as an avatar, on the screen that person B is viewing.
  • the avatar and the like may also be displayed on the screen of person A.
  • the evaluation system can improve the satisfaction level of person A by displaying an avatar on the screen to attract the attention of person A and person B to the screen.
  • the terminal device 1 may display a notification that the person A is currently thinking.
  • the terminal device 1 or the server 2 may calculate the degree of satisfaction without displaying the captured video of the person having the conversation on the display device 16 (that is, with the display of the captured video turned off).
  • As described above, the evaluation system according to the present embodiment includes an acquisition unit (for example, the audio acquisition device 10) that acquires audio data related to a conversation between a first person and a second person, and an imaging unit (for example, the imaging device 11) that images the first person and the second person.
  • The evaluation system further includes an extraction unit (for example, the feature amount extraction unit 12A) that extracts, based on the imaging data of the imaging unit, a first feature amount related to the line of sight or face direction of each of the first person and the second person, and a second feature amount of the audio data.
  • The evaluation system further includes a satisfaction level calculation unit (for example, the satisfaction level estimation unit 24A) that calculates the satisfaction level of the first person based on the first feature amount, the second feature amount, and a calculation algorithm for calculating the satisfaction level of the first person.
  • the evaluation system can calculate the satisfaction level based on two pieces of information: information related to the first person's line of sight or face direction, and information related to the first person's voice data. Thereby, the evaluation system can perform highly accurate satisfaction evaluation using a plurality of pieces of information included in conversations between people.
  • the satisfaction calculation unit of the evaluation system of this embodiment calculates the satisfaction at predetermined time intervals from the start to the end of the conversation. Thereby, the evaluation system can calculate the satisfaction level of the first person at each time from the start of the conversation until the end of the conversation, and can perform a flexible evaluation of the satisfaction level.
  • The extraction unit of the evaluation system of the present embodiment calculates, as second feature amounts, a first ratio indicating the proportion of the conversation in which the first person speaks and a second ratio indicating the proportion of the conversation in which the second person speaks.
  • The calculation algorithm adds a predetermined value to the satisfaction value when the first ratio is equal to or greater than the second ratio, and subtracts a predetermined value from the satisfaction value when the first ratio is less than the second ratio.
  • the evaluation system can evaluate the degree of satisfaction according to the speech rate of the first person relative to the speech rate of the second person.
  • the calculation algorithm of the evaluation system of the present embodiment adds a predetermined value to the satisfaction value when the first ratio is equal to or higher than the second ratio and the first ratio is equal to or higher than the first threshold value.
  • the calculation algorithm subtracts a predetermined value from the satisfaction value when the first ratio is less than the second ratio and the first ratio is less than the second threshold, which is less than or equal to the first threshold.
  • the extraction unit of the evaluation system of the present embodiment detects the emotion of the first person from the voice data as a second feature amount, and calculates from the detected emotion a positive rate, which is the proportion of time the first person felt positive, and a negative rate, which is the proportion of time the first person felt negative.
  • the calculation algorithm adds a predetermined value to the satisfaction value when the positive rate is greater than or equal to a threshold for adding points, and subtracts a predetermined value from the satisfaction value when the negative rate is greater than or equal to a threshold for subtracting points.
  • the evaluation system can evaluate the degree of satisfaction based on the emotion detected from the voice data of the first person.
  • the evaluation system of this embodiment further includes a first display unit (for example, the display device 16) on which the second person is displayed when the first person has a conversation.
  • the extraction unit calculates a time period during which the first person looks at the first display unit as the first feature amount.
  • the calculation algorithm adds a predetermined value to the satisfaction value when the time is equal to or greater than a third threshold, and subtracts a predetermined value from the satisfaction value when the time is less than a fourth threshold that is equal to or less than the third threshold.
  • the evaluation system can evaluate the degree of satisfaction based on the time the first person looks at the first display section.
  • the calculation algorithm of the evaluation system of this embodiment calculates the satisfaction level based on machine learning (see the machine-learning sketch after this list).
  • the evaluation system can calculate the degree of satisfaction from the feature amount data using a calculation algorithm based on machine learning.
  • the evaluation system of the present embodiment further includes a second display unit (for example, the display device 16) that displays a screen that the second person refers to when having a conversation, and a screen creation unit (for example, the drawing screen creation unit 24B) that creates the screen.
  • the screen creation unit creates a screen including the satisfaction level result calculated by the satisfaction level calculation unit and causes the second display unit to display the screen. This allows the second person, who is the evaluator, to confirm the satisfaction level result of the first person. Thereby, by notifying the second person of the first person's satisfaction level, the evaluation system can support a conversation in which the first person's satisfaction level is high.
  • the evaluation system further includes a second display unit that displays a screen that the second person refers to when having a conversation, and a screen creation unit that creates the screen.
  • the screen creation unit displays the satisfaction level on the second display unit while acquiring, from the satisfaction level calculation unit, the satisfaction level calculated as the conversation between the first person and the second person proceeds.
  • This allows the second person to check the first person's current satisfaction level while conversing with the first person.
  • the evaluation system can thus support the first person in having a conversation with a high degree of satisfaction.
  • the evaluation system further includes a second display unit that displays a screen that the second person refers to when having a conversation, and a screen creation unit that creates the screen.
  • the screen creation unit displays on the screen a message to the effect that the second person should refrain from speaking.
  • the evaluation system can display a message that helps increase the satisfaction level of the first person based on the speech rates of the first person and the second person.
  • the screen created by the screen creation unit of the evaluation system includes a display area in which the captured video of the first person is displayed, a display area in which the satisfaction level result is displayed, an area in which the second person's captured image is displayed, a button for displaying the captured video on the screen referred to by the first person, a button for outputting the second person's voice data from the terminal device used by the first person, and a button for controlling the start or end of the conference.
  • the evaluation system can display a screen including the satisfaction result to the second person.
  • the evaluation system can support the second person to have a smooth conversation with the first person by notifying the second person of the satisfaction level of the first person.
  • the extraction unit of the evaluation system extracts, based on the voice data acquired in advance by the acquisition unit and the imaging data of the first person and the second person captured in advance by the imaging unit, a first feature amount related to the line of sight or face direction of each of the first person and the second person, and a second feature amount of the voice data. Thereby, the evaluation system can extract the feature amounts from imaging data and voice data recorded in the past and evaluate the satisfaction level of the first person.
  • the extraction unit of the evaluation system extracts a third feature amount related to the facial expression of each of the first person and the second person based on the imaging data of the imaging unit, and the satisfaction level calculation unit calculates the satisfaction level of the first person based on the third feature amount and the calculation algorithm. Thereby, the evaluation system can evaluate the satisfaction level from a feature amount based on the facial expression of the first person.
  • the extraction unit of the evaluation system extracts a fourth feature amount related to the behavior of each of the first person and the second person based on the imaging data of the imaging unit, and the satisfaction level calculation unit calculates the satisfaction level of the first person based on the fourth feature amount and the calculation algorithm. Thereby, the evaluation system can evaluate the satisfaction level from a feature amount based on the behavior of the first person.
  • the second feature amount used in the evaluation system according to the present embodiment is at least one of the voice intensity, the number of moras per unit time, the intensity of each word, the volume, or the voice spectrum. Thereby, the evaluation system can calculate the first person's emotion from the second feature amount.
  • the second person in this embodiment has an interpersonal relationship with the first person, and the interpersonal relationship includes at least one of the following: boss and subordinate, employee and customer, colleagues, or interviewer and interviewee.
  • the evaluation system can evaluate the satisfaction level of the first person in a situation where the second person has a conversation with the first person with whom he or she has an interpersonal relationship.
  • the evaluation system further includes a calculation algorithm storage unit (for example, the memory 14 or the memory 22) that stores the calculation algorithm.
  • the evaluation system can evaluate the satisfaction level of the first person based on the calculation algorithm stored in the calculation algorithm storage unit.
  • the technology of the present disclosure is useful as an evaluation system, an evaluation device, and an evaluation method that perform highly accurate satisfaction evaluation using multiple pieces of information included in conversations between people.
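
To make the feature amounts above concrete, the following is a minimal sketch, not taken from the disclosure: the segment format, frame rate, and helper names are assumptions. The first and second speech ratios are computed from per-person utterance segments, and the gaze time is computed from per-frame flags indicating whether the first person is looking at the first display unit.

```python
# Illustrative sketch only; all names and data formats are hypothetical.
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_sec, end_sec) of one utterance


def speech_ratios(first_segments: List[Segment],
                  second_segments: List[Segment],
                  conversation_sec: float) -> Tuple[float, float]:
    """Second feature amounts: proportion of the conversation in which each person speaks."""
    first_speech = sum(end - start for start, end in first_segments)
    second_speech = sum(end - start for start, end in second_segments)
    return first_speech / conversation_sec, second_speech / conversation_sec


def gaze_time_sec(gaze_on_screen: List[bool], frame_rate: float) -> float:
    """First feature amount: seconds the first person spends looking at the first display unit."""
    return sum(gaze_on_screen) / frame_rate


# Example with made-up values: a 60-second conversation sampled at 30 fps.
first_ratio, second_ratio = speech_ratios([(0.0, 12.0), (20.0, 35.0)], [(12.0, 20.0)], 60.0)
gaze_sec = gaze_time_sec([True] * 900 + [False] * 300, frame_rate=30.0)
```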
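
The next sketch combines the rule-based scoring described above (the speech-ratio rule with first and second thresholds, the positive/negative emotion-rate rule, and the gaze-time rule). The base value, step, and every threshold value are placeholder assumptions, and clamping the score to 0–100 is only for illustration.

```python
# Illustrative rule-based calculation algorithm; all numeric defaults are placeholders.
from dataclasses import dataclass


@dataclass
class Features:
    first_ratio: float    # proportion of the conversation in which the first person speaks
    second_ratio: float   # proportion in which the second person speaks
    positive_rate: float  # proportion of time the first person's voice emotion is positive
    negative_rate: float  # proportion of time it is negative
    gaze_sec: float       # time the first person looks at the first display unit


def calculate_satisfaction(f: Features,
                           base: float = 50.0,
                           step: float = 10.0,
                           first_threshold: float = 0.6,
                           second_threshold: float = 0.4,   # <= first_threshold
                           add_threshold: float = 0.5,
                           subtract_threshold: float = 0.5,
                           third_threshold: float = 60.0,
                           fourth_threshold: float = 30.0) -> float:  # <= third_threshold
    score = base

    # Speech-ratio rule: add when the first person speaks at least as much as the second
    # person and at least the first threshold; subtract in the opposite case.
    if f.first_ratio >= f.second_ratio and f.first_ratio >= first_threshold:
        score += step
    elif f.first_ratio < f.second_ratio and f.first_ratio < second_threshold:
        score -= step

    # Emotion rule: add when the positive rate reaches the point-adding threshold,
    # subtract when the negative rate reaches the point-subtracting threshold.
    if f.positive_rate >= add_threshold:
        score += step
    if f.negative_rate >= subtract_threshold:
        score -= step

    # Gaze rule: add when the first person looks at the first display unit long enough,
    # subtract when the gaze time falls below the fourth threshold.
    if f.gaze_sec >= third_threshold:
        score += step
    elif f.gaze_sec < fourth_threshold:
        score -= step

    return max(0.0, min(100.0, score))


features = Features(first_ratio=0.65, second_ratio=0.35,
                    positive_rate=0.6, negative_rate=0.1, gaze_sec=75.0)
print(calculate_satisfaction(features))  # 80.0 with the placeholder defaults
```

With these placeholder defaults, a conversation in which the first person speaks more, sounds positive, and keeps looking at the screen scores above the base value, which mirrors the add/subtract behavior described in the list above.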
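
A sketch of calculating the satisfaction level at predetermined time intervals from the start to the end of the conversation follows. The 10-second interval and the callables (`extract_features`, `conversation_active`, `notify_screen`) are hypothetical stand-ins for the extraction unit, the conversation state, and the screen creation unit.

```python
# Illustrative periodic-evaluation loop; interval and callables are assumptions.
import time


def run_periodic_evaluation(extract_features, calculate_satisfaction,
                            conversation_active, notify_screen,
                            interval_sec: float = 10.0) -> list:
    """Calculate the satisfaction level at fixed intervals while the conversation lasts,
    and pass each result to the screen-creation side."""
    history = []
    while conversation_active():
        features = extract_features()              # feature amounts observed so far
        satisfaction = calculate_satisfaction(features)
        history.append(satisfaction)
        notify_screen(satisfaction)                # e.g. update the second display unit
        time.sleep(interval_sec)
    return history
```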
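
Finally, a sketch of a calculation algorithm based on machine learning, as mentioned above. The use of scikit-learn, the feature layout, and the toy training rows and labels are all illustrative assumptions, not data from the disclosure.

```python
# Illustrative machine-learning calculation algorithm; training data is made-up toy data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Each row: [first_ratio, second_ratio, positive_rate, negative_rate, gaze_sec]
X_train = np.array([
    [0.65, 0.35, 0.6, 0.1, 80.0],
    [0.30, 0.70, 0.2, 0.5, 20.0],
    [0.55, 0.45, 0.4, 0.2, 50.0],
])
y_train = np.array([80.0, 30.0, 60.0])  # satisfaction labels, e.g. from questionnaires

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

new_features = np.array([[0.6, 0.4, 0.5, 0.1, 70.0]])
predicted_satisfaction = model.predict(new_features)[0]
print(f"estimated satisfaction: {predicted_satisfaction:.1f}")
```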

Abstract

This evaluation system comprises: an acquisition unit that acquires speech data relating to a conversation between a first person and a second person; an imaging unit that images the first person and the second person; an extraction unit that extracts, on the basis of the imaging data from the imaging unit, a first feature amount relating to the line of sight or face direction of each of the first person and the second person, and a second feature amount relating to the speech data; and a satisfaction level calculation unit that calculates the satisfaction level of the first person on the basis of the first feature amount, the second feature amount, and a calculation algorithm for calculating the satisfaction level of the first person.
PCT/JP2023/018500 2022-07-04 2023-05-17 Système d'évaluation, dispositif d'évaluation et procédé d'évaluation WO2024009623A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2022107706A JP2024006627A (ja) 2022-07-04 2022-07-04 評価システム、評価装置および評価方法
JP2022-107706 2022-07-04

Publications (1)

Publication Number Publication Date
WO2024009623A1 true WO2024009623A1 (fr) 2024-01-11

Family

ID=89453003

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/018500 WO2024009623A1 (fr) 2022-07-04 2023-05-17 Système d'évaluation, dispositif d'évaluation et procédé d'évaluation

Country Status (2)

Country Link
JP (1) JP2024006627A (fr)
WO (1) WO2024009623A1 (fr)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011210133A (ja) * 2010-03-30 2011-10-20 Seiko Epson Corp 満足度算出方法、満足度算出装置およびプログラム
JP2011237957A (ja) * 2010-05-10 2011-11-24 Seiko Epson Corp 満足度算出装置、満足度算出方法およびプログラム
JP2018041120A (ja) * 2016-09-05 2018-03-15 富士通株式会社 業務評価方法、業務評価装置および業務評価プログラム
JP2018124604A (ja) * 2017-01-30 2018-08-09 グローリー株式会社 接客支援システム、接客支援装置及び接客支援方法
JP2020113197A (ja) * 2019-01-16 2020-07-27 オムロン株式会社 情報処理装置、情報処理方法、及び情報処理プログラム
JP2020160425A (ja) * 2019-09-24 2020-10-01 株式会社博報堂Dyホールディングス 評価システム、評価方法、及びコンピュータプログラム。
JP2021072497A (ja) * 2019-10-29 2021-05-06 株式会社Zenkigen 分析装置及びプログラム
WO2022064621A1 (fr) * 2020-09-24 2022-03-31 株式会社I’mbesideyou Système d'évaluation de réunion vidéo et serveur d'évaluation de réunion vidéo
JP2022075662A (ja) * 2020-10-27 2022-05-18 株式会社I’mbesideyou 情報抽出装置
WO2022137547A1 (fr) * 2020-12-25 2022-06-30 株式会社日立製作所 Système d'aide à la communication

Also Published As

Publication number Publication date
JP2024006627A (ja) 2024-01-17

Similar Documents

Publication Publication Date Title
US9674485B1 (en) System and method for image processing
JP2016149063A (ja) 感情推定装置及び感情推定方法
JP2019058625A (ja) 感情読み取り装置及び感情解析方法
WO2019137147A1 (fr) Procédé d'identification d'identité dans une vidéoconférence, et appareil associé
US20200058302A1 (en) Lip-language identification method and apparatus, and augmented reality device and storage medium
JP2016103081A (ja) 会話分析装置、会話分析システム、会話分析方法及び会話分析プログラム
JP2019144917A (ja) 滞在状況表示システムおよび滞在状況表示方法
CN110569726A (zh) 一种服务机器人的交互方法及系统
JP7153888B2 (ja) 双方向映像通信システム及びそのオペレータの管理方法
WO2024009623A1 (fr) Système d'évaluation, dispositif d'évaluation et procédé d'évaluation
JP6547290B2 (ja) 画像センシングシステム
US11100944B2 (en) Information processing apparatus, information processing method, and program
JP6598227B1 (ja) 猫型会話ロボット
JP7206741B2 (ja) 健康状態判定システム、健康状態判定装置、サーバ、健康状態判定方法、及びプログラム
US11935140B2 (en) Initiating communication between first and second users
JP6711621B2 (ja) ロボット、ロボット制御方法およびロボットプログラム
JP6550951B2 (ja) 端末、ビデオ会議システム、及びプログラム
US10440183B1 (en) Cognitive routing of calls based on derived employee activity
KR101878155B1 (ko) 휴대 단말기의 제어방법
JP2013239991A (ja) テレビ制御装置、テレビ制御方法及びテレビ制御プログラム
JP7253371B2 (ja) 抽出プログラム、抽出方法、および、抽出装置
EP3956748A1 (fr) Signaux de casque pour déterminer des états émotionnels
WO2024038699A1 (fr) Dispositif de traitement d'expression, procédé de traitement d'expression et programme de traitement d'expression
EP4242943A1 (fr) Système de traitement d'informations, procédé de traitement d'informations et moyens de support
US11928253B2 (en) Virtual space control system, method for controlling the same, and control program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23835160

Country of ref document: EP

Kind code of ref document: A1