WO2024111121A1 - Interview support device, interview support method, and program - Google Patents

Info

Publication number
WO2024111121A1
Authority
WO
WIPO (PCT)
Prior art keywords
subject
case information
conversation
voice
interview
Application number
PCT/JP2022/043607
Other languages
French (fr)
Japanese (ja)
Inventor
香央里 藤村
大河 佐野
妙 佐藤
康雄 石榑
麻美 宮島
Original Assignee
日本電信電話株式会社
Application filed by 日本電信電話株式会社
Priority to PCT/JP2022/043607
Publication of WO2024111121A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services

Definitions

  • This disclosure relates to an interview support device, an interview support method, and a program.
  • When an interviewer interviews an interviewee (the person being interviewed; hereafter also referred to as the "subject"), the interviewer picks up on the subject's emotional changes from facial expressions, movements, gaze, backchannel responses, and the like, and from these emotional changes grasps the subject's interest in or concern about a particular topic or matter.
  • For example, in an interview regarding health guidance, the interviewer grasps, from emotional changes picked up through changes in the subject's tone of voice, volume of speech, and the like, whether a rapport has been established with the subject, whether the subject has become interested in health, whether the subject feels motivated to improve their lifestyle, and whether the subject is sufficiently motivated to take action.
  • As described in Non-Patent Document 1, one known study aimed at clarifying the motivational factors that lead people who receive specific health guidance toward health-promotion behavior conducts semi-structured interviews with such people and extracts motivational factors from the interview content.
  • However, in various interviews (including job interviews, conversations, and the like) including health guidance, the know-how of conversations that induce emotional changes in the subject is accumulated by each individual interviewer and is not shared.
  • Furthermore, conversations that induce emotional changes in a subject may differ depending on the characteristics of that subject (for example, the subject's personality, lifestyle, level of health awareness, and so on).
  • The present disclosure has been made in consideration of the above points, and provides technology that can present to the interviewer a conversation that induces an emotional change in the subject according to the subject's characteristics.
  • the interview support device includes a determination unit configured to determine whether or not an emotional change has occurred in the first subject using at least one of the voice of the first subject during an interview and a first voice recognition result representing the result of voice recognition of the voice of the first subject; a conversation extraction unit configured to extract a conversation consisting of a plurality of utterances including an utterance in which the emotional change has occurred from the first voice recognition result and a second voice recognition result representing the result of voice recognition of the voice of the first interviewer who is interviewing the first subject when it is determined that an emotional change has occurred in the first subject; a case creation unit configured to create case information that associates the characteristics of the first subject with the conversation and store the case information in a storage unit; and a similar case presentation unit configured to present case information stored in the storage unit that includes characteristics similar to the characteristics of the second subject to be interviewed as similar case information to the second interviewer who is interviewing the second subject.
  • According to the present disclosure, technology is provided that can present to the interviewer a conversation that induces an emotional change in the subject according to the subject's characteristics.
  • FIG. 1 is a diagram illustrating an example of the hardware configuration of an interview support device according to an embodiment of the present invention.
  • FIG. 2 is a diagram illustrating an example of the functional configuration of the interview support device according to the embodiment.
  • FIG. 3 is a diagram illustrating an example of case information.
  • FIG. 4 is a flowchart illustrating an example of the case creation process according to the present embodiment.
  • FIG. 5 is a flowchart illustrating an example of the similar case presentation process according to the present embodiment.
  • FIG. 6 is a diagram showing an example of a presentation result of subject characteristics included in a similar case.
  • FIG. 7 is a diagram showing an example of a presentation result of a conversation included in a similar case.
  • An embodiment of the present invention is described below. In the following embodiment, an interview support device 10 is described that can present to the interviewer, for various interviews (including job interviews, conversations, and the like) including health guidance, a conversation that will cause an emotional change in the subject of the interview according to the characteristics of that subject.
  • The interview support device 10 of this embodiment allows the interviewer to know which conversations cause an emotional change in a subject with given characteristics, making it possible to conduct the interview with that subject effectively.
  • For example, in an interview regarding health guidance, the interviewer can refer to the conversation presented by the interview support device 10 and effectively have a conversation that encourages the subject to change their behavior, such as improving their lifestyle habits.
  • Note that an emotional change may also be called a "mental change," and a conversation that causes an emotional change in a subject means a conversation that resonates with or strikes a chord with the subject.
  • In the following, an interview regarding health guidance is assumed. However, this is only one example, and the interviews to which the interview support device 10 according to this embodiment can be applied are not limited to interviews regarding health guidance.
  • Besides interviews regarding health guidance, the device can be applied to various interviews (including job interviews), such as interviews regarding career guidance at schools, interviews regarding learning guidance at cram schools, personnel interviews and business interviews at companies, and employment interviews. More generally, the device can be applied to cases in which a certain person (the interviewer) has some kind of conversation with one or more other people (the subjects).
  • The format of the interview (which here includes job interviews, conversations, and the like) may be an online interview (including web interviews and the like) or a face-to-face interview.
  • the interview support device 10 executes two processes, a "case creation process” and a “similar case presentation process.”
  • The "case creation process" is a process for creating case information that associates the characteristics of a subject with a conversation that caused an emotional change in that subject, using the results of speech recognition of the voices of the interviewer and the subject, the results of a preliminary interview, and the like.
  • The "similar case presentation process" is a process for acquiring, from the case information created in the case creation process, case information that includes characteristics similar to the characteristics of the current subject as similar case information, and presenting the conversation included in this similar case information to the interviewer.
  • the case creation process is executed before the similar case presentation process, but after a certain amount of case information has been created, for example, the case creation process may be executed in the background of the similar case presentation process, or the case creation process may be executed periodically or non-periodically.
  • the interviewer and the subject in the case creation process are also referred to as the "first interviewer” and the “first subject”, respectively, and the interviewer and the subject in the similar case presentation process are also referred to as the “second interviewer” and the “second subject”.
  • the characteristics of the first subject are also referred to as the “first subject characteristics”
  • the characteristics of the second subject are also referred to as the "second subject characteristics”.
  • Here, a subject's characteristics are properties that represent the features of that subject.
  • For example, the characteristics of a subject in an interview regarding health guidance include test values from a health checkup as physical characteristics, personality tendencies and health consciousness as psychological characteristics, and occupation, family structure, and the subject's lifestyle habits as social characteristics.
  • the subject's characteristics are not limited to these, and various characteristics can be used as the subject's characteristics depending on the type of interview.
  • An example of the hardware configuration of the interview support device 10 according to this embodiment is shown in FIG. 1.
  • the interview support device 10 according to this embodiment is realized with the hardware configuration of a general computer, and has, for example, an input device 101, a display device 102, an external I/F 103, a communication I/F 104, a RAM (Random Access Memory) 105, a ROM (Read Only Memory) 106, an auxiliary storage device 107, and a processor 108.
  • Each of these pieces of hardware is connected so as to be able to communicate with the others via a bus 109.
  • the input device 101 is, for example, a keyboard, a mouse, a touch panel, a physical button, etc.
  • the display device 102 is, for example, a display, a display panel, etc. Note that the interview support device 10 does not have to have at least one of the input device 101 and the display device 102, for example.
  • the external I/F 103 is an interface with external devices such as a recording medium 103a.
  • recording media 103a include a CD (Compact Disc), a DVD (Digital Versatile Disk), an SD memory card (Secure Digital memory card), and a USB (Universal Serial Bus) memory card.
  • the communication I/F 104 is an interface for connecting the interview support device 10 to a communication network.
  • the RAM 105 is a volatile semiconductor memory (storage device) that temporarily stores programs and data.
  • the ROM 106 is a non-volatile semiconductor memory (storage device) that can store programs and data even when the power is turned off.
  • the auxiliary storage device 107 is a non-volatile storage device such as a HDD (Hard Disk Drive), SSD (Solid State Drive), flash memory, etc.
  • the processor 108 is, for example, various types of arithmetic devices such as a CPU (Central Processing Unit).
  • the hardware configuration shown in FIG. 1 is an example, and the hardware configuration of the interview support device 10 is not limited to this.
  • the interview support device 10 may have multiple auxiliary storage devices 107 or multiple processors 108, may not have some of the hardware shown in the figure, or may have various hardware other than the hardware shown in the figure (e.g., a microphone, a speaker, a camera, etc.).
  • As shown in FIG. 2, the interview support device 10 according to this embodiment has a case creation processing unit 210 and a similar case presentation processing unit 220. Each of these units is realized, for example, by processing that one or more programs installed in the interview support device 10 cause the processor 108 or the like to execute. The interview support device 10 according to this embodiment also has a conversation DB 230 and a case information DB 240. Each of these DBs (databases) is realized, for example, by the auxiliary storage device 107 or the like. However, either or both of the conversation DB 230 and the case information DB 240 may instead be realized by a storage device provided in a database server or the like connected to the interview support device 10 via a communication network.
  • the case creation processing unit 210, the similar case presentation processing unit 220, the conversation DB 230, and the case information DB 240 are included in the interview support device 10 realized by a single computer, but the case creation processing unit 210, the similar case presentation processing unit 220, the conversation DB 230, and the case information DB 240 may be distributed among multiple computers. In this case, the system realized by these multiple computers may be called an "interview support system" or the like.
  • the case creation processing unit 210 executes the case creation process.
  • The case creation processing unit 210 includes a first interviewer voice recording unit 211, a first interviewer voice recognition unit 212, a first subject voice recording unit 213, a first subject voice recognition unit 214, an emotion change analysis unit 215, a conversation extraction unit 216, a first subject characteristic acquisition unit 217, and a case information creation unit 218.
  • The first interviewer voice recording unit 211 records the voice of the first interviewer input to the microphone for the first interviewer to create voice data (hereinafter also referred to as first interviewer voice data).
  • The microphone for the first interviewer may be directly connected to or built into the interview support device 10, may be connected to the interview support device 10 via a communication network, or, in the case of an online interview, may be directly connected to or built into a terminal (a PC (personal computer), a smartphone, a tablet terminal, or the like) used by the first interviewer.
  • the first interviewer voice recognition unit 212 performs voice recognition on the voice represented by the first interviewer voice data, and creates text data with time information (hereinafter also referred to as first interviewer text data) in which text representing the content of the voice is associated with the time of speech.
  • the first interviewer voice recognition unit 212 also stores this first interviewer text data in the conversation DB 230.
  • the first interviewer voice recognition unit 212 may create the first interviewer text data using existing voice recognition technology.
  • the first subject voice recording unit 213 records the voice of the first subject input to the microphone for the first subject to create voice data (hereinafter also referred to as first subject voice data).
  • the microphone for the first subject may be directly connected to or built into the interview support device 10, or may be connected to the interview support device 10 via a communication network, or in the case of an online interview, may be directly connected to or built into a terminal (PC (personal computer), smartphone, tablet terminal) used by the first subject.
  • the first subject voice recognition unit 214 performs voice recognition on the voice represented by the first subject voice data, and creates text data with time information (hereinafter also referred to as first subject text data) in which text representing the spoken content of the voice is associated with the time of utterance.
  • the first subject voice recognition unit 214 also stores this first subject text data in the conversation DB 230.
  • the first subject voice recognition unit 214 may create the first subject text data using existing voice recognition technology.
  • the microphone for the first interviewer and the microphone for the first subject may be common.
  • the first interviewer voice recording unit 211 and the first subject voice recording unit 213, and the first interviewer voice recognition unit 212 and the first subject voice recognition unit 214 may each be common, and the voice recognition technology used may be one that is capable of voice recognition for each speaker.
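  • For reference, a minimal sketch (in Python; not part of the disclosure) of what one entry of the text data with time information stored in the conversation DB 230 might look like. The field names and time representation are assumptions, since the disclosure only states that the text of each utterance is associated with its utterance time for each speaker.

```python
from dataclasses import dataclass

@dataclass
class UtteranceRecord:
    """One speech-recognized utterance as it might be stored in the conversation DB 230."""
    speaker: str       # e.g. "first_interviewer" or "first_subject" (illustrative labels)
    start_time: float  # utterance start time in seconds from the start of the interview
    end_time: float    # utterance end time in seconds
    text: str          # voice recognition result for this utterance

# Example entries for one short exchange (content is illustrative only).
conversation_db: list[UtteranceRecord] = [
    UtteranceRecord("first_interviewer", 10.0, 14.5,
                    "How has your exercise been going recently?"),
    UtteranceRecord("first_subject", 15.0, 19.0,
                    "I try to walk on weekends, but weekdays are difficult."),
]
```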
  • the emotion change analysis unit 215 performs emotion analysis or emotion recognition (hereinafter collectively referred to as "emotion analysis”) using the first subject's voice data and the first subject's text data to determine whether or not there has been an emotional change in the first subject, and if an emotional change has occurred, identifies the time of the change.
  • The conversation extraction unit 216 uses the first interviewer text data and the first subject text data stored in the conversation DB 230 to extract a conversation including the utterance at which the emotional change occurred and the utterances before and after it.
  • The first subject characteristic acquisition unit 217 acquires the first subject characteristic. For example, in an interview regarding health guidance, the first subject characteristic acquisition unit 217 may acquire the first subject characteristic from data showing the results of a health checkup or from medical questionnaire data. Also, for example, the first subject characteristic acquisition unit 217 may acquire the first subject characteristic from at least one of the first interviewer text data and the first subject text data.
  • the case information creation unit 218 creates case information that associates a subject ID indicating identification information for identifying the first subject, the first subject characteristics acquired by the first subject characteristics acquisition unit 217, the time at which an emotional change occurred in the first subject, and the conversation extracted by the conversation extraction unit 216.
  • the case information creation unit 218 also stores the case information in the case information DB 240.
  • the similar case presentation processing unit 220 executes the similar case presentation process.
  • the similar case presentation processing unit 220 includes a second subject characteristic acquisition unit 221, a similar case acquisition unit 222, and a similar case presentation unit 223.
  • the second subject characteristic acquisition unit 221 acquires the second subject characteristic.
  • the second subject characteristic acquisition unit 221 may acquire the second subject characteristic in the same manner as the first subject characteristic acquisition unit 217. That is, for example, in an interview regarding health guidance, the second subject characteristic acquisition unit 221 may acquire the second subject characteristic from data representing the results of a health check or data from a medical questionnaire.
  • The second subject characteristic acquisition unit 221 may also acquire the second subject characteristic from at least one of the second interviewer text data and the second subject text data.
  • the similar case acquisition unit 222 acquires case information including a first subject characteristic that is similar to a second subject characteristic from the case information DB 240 as similar case information.
  • The similar case presentation unit 223 presents the first subject characteristics and the conversation contained in the similar case information to the second interviewer. As a result, a conversation that caused an emotional change in a subject having characteristics similar to those of the second subject whom the second interviewer is currently interviewing is presented to the second interviewer. This allows the second interviewer to refer to this conversation and conduct the interview with the second subject effectively.
  • The conversation DB 230 stores the first interviewer text data and the first subject text data.
  • The case information DB 240 stores case information. An example of case information will be described later.
  • the functional configuration of the interview support device 10 shown in FIG. 2 is an example and is not limited to this.
  • An example of case information for a health guidance interview is shown in FIG. 3.
  • the case information includes a "subject ID” indicating identification information for identifying a first subject, a "first subject characteristic” indicating the characteristic of the first subject, an "emotion change time” indicating the time when an emotion change occurred, and a “conversation” indicating the conversation extracted by the conversation extraction unit 216.
  • the case information shown in FIG. 3 includes a subject ID "B1".
  • the case information shown in FIG. 3 also includes items representing characteristics classified as “physical”, “psychological”, “social”, “lifestyle”, etc. as the first subject characteristic, and the physical aspect includes items such as "Test value: BMI”, “Test value: high blood pressure”, and “Test value: hyperglycemia”.
  • The psychological aspect includes items such as "Personality tendency: meticulous", "Personality tendency: lazy", "Personality tendency: logical", "Health consciousness: high", and "Health consciousness: low".
  • The social aspect includes items such as "Family composition: spouse" and "Family composition: children".
  • The lifestyle habits include items such as "Exercise/golf", "Exercise/running", "Drinking", "Smoking", and "Sleep".
  • the items representing each of these characteristics are expressed as two values (for example, "1” if a certain condition corresponding to the item is met, and "0” if it is not met).
  • For example, "Test value: BMI" is expressed as "1" if the BMI value is outside a predetermined standard range, and as "0" otherwise.
  • "Personality tendency: meticulous" is represented as "1" if the subject's personality is meticulous, and as "0" if not.
  • "Drinking" is represented as "1" if the subject has a drinking habit, and as "0" if not. The same applies to the items representing the other characteristics.
  • The case information shown in FIG. 3 includes the emotion change time "2022/9/20 11:40:22". Furthermore, as a conversation including utterances before and after the emotion change occurred, the case information shown in FIG. 3 includes the utterance at the time when the emotion change occurred together with the utterances by the first interviewer and the first subject before and after that time. For example, in the example shown in FIG. 3, an emotion change occurred at the first subject's utterance beginning "I see. ..." (the utterance underlined in FIG. 3).
  • In this case, the case information includes two utterances by the first interviewer before and after the utterance at which the emotion change occurred, and two utterances by the first subject before and after that utterance.
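  • As a reference sketch (an assumption, not part of the disclosure), one piece of case information following the layout of FIG. 3 could be represented as follows; the item names and the 0/1 encoding follow the example described above, while the container itself and the conversation representation are illustrative.

```python
from dataclasses import dataclass

@dataclass
class CaseInformation:
    """One piece of case information as outlined for FIG. 3 (illustrative structure)."""
    subject_id: str                      # identification information of the first subject, e.g. "B1"
    characteristics: dict[str, int]      # item name -> 1 if the condition is met, 0 otherwise
    emotion_change_time: str             # time at which the emotion change occurred
    conversation: list[tuple[str, str]]  # (speaker, utterance text) pairs in time order

case = CaseInformation(
    subject_id="B1",
    characteristics={
        "Test value: BMI": 1,
        "Test value: high blood pressure": 1,
        "Personality tendency: meticulous": 0,
        "Health consciousness: high": 0,
        "Family composition: spouse": 1,
        "Drinking": 1,
        "Smoking": 0,
    },
    emotion_change_time="2022/9/20 11:40:22",
    conversation=[
        ("first_interviewer", "..."),     # interviewer utterances before the change (content omitted)
        ("first_subject", "I see. ..."),  # utterance at which the emotion change occurred
        ("first_interviewer", "..."),     # utterances after the change (content omitted)
    ],
)
```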
  • <Case creation process> The case creation process according to this embodiment will be described below with reference to FIG. 4. Note that the following steps S101 to S109 are repeatedly executed, for example, each time an interview is conducted between the first interviewer and the first subject.
  • The first interviewer voice recording unit 211 records the voice of the first interviewer to create first interviewer voice data (step S101).
  • The first interviewer voice recognition unit 212 performs voice recognition on the voice represented by the first interviewer voice data created in step S101 above, and creates first interviewer text data (step S102).
  • The first interviewer text data is stored in the conversation DB 230.
  • the first subject voice recording unit 213 records the voice of the first subject and creates first subject voice data (step S103).
  • the first subject voice recognition unit 214 performs voice recognition on the voice represented by the first subject voice data created in step S103 above, and creates first subject text data (step S104).
  • the first subject text data is stored in the conversation DB 230.
  • Steps S101 to S102 and steps S103 to S104 are repeatedly executed, for example, every unit time (for example, a predetermined time span of several tens of seconds to about one minute), and step S105 and the subsequent steps are executed after the end of the interview. That is, the first interviewer text data and the first subject text data are stored in the conversation DB 230 for each unit time. However, the processing does not necessarily have to be performed every unit time; for example, the voice from the start to the end of the interview may be recorded in steps S101 and S103, and that voice may be subjected to voice recognition in steps S102 and S104 to create the first interviewer text data and the first subject text data.
  • The next step S105 does not necessarily have to be executed after the end of the interview, and may be executed during the interview, for example, after a certain amount of the first interviewer text data and the first subject text data has been stored in the conversation DB 230.
  • the emotion change analysis unit 215 performs emotion analysis using the first subject's voice data and the first subject's text data to determine whether or not there has been an emotion change in the first subject, and if an emotion change has occurred, to identify the time of the emotion change (step S105).
  • Specific examples of the emotion analysis performed by the emotion change analysis unit 215 are described below.
  • <Emotion analysis example 1> In emotion analysis example 1, the emotion change analysis unit 215 analyzes emotion changes from the pitch of the first subject's voice, the number of backchannels, and the number of utterances. This is because it is considered that the higher the pitch of the voice, the more backchannels, and the more utterances there are, the more interested the first subject is in the approach from the first interviewer (for example, health guidance), and hence the more likely it is that an emotion change is occurring.
  • the emotion change analysis unit 215 analyzes emotion changes for each unit time by following steps 11 to 18 below.
  • the unit time is represented as t, and an explanation will be given below of the case where emotion changes are analyzed in a certain unit time t.
  • (Step 11) The emotion change analysis unit 215 uses the first subject's voice data in unit time t to obtain a fundamental frequency x_t that indicates the pitch of the voice in unit time t.
  • (Step 12) The emotion change analysis unit 215 obtains the number of backchannels y_t in unit time t by using the first subject text data in unit time t and a dictionary in which words used for backchannels are registered. Examples of words used for backchannels include "Yes," "Yeah," "I see," "That's right," and "Uh-huh."
  • (Step 13) The emotion change analysis unit 215 obtains the number of utterances z_t in unit time t by using at least one of the first subject's voice data in unit time t and the first subject's text data in unit time t.
  • Note that an utterance here is a unit of speech that has syntactic and interactional coherence (Reference 1).
  • (Step 14) The emotion change analysis unit 215 determines whether or not at least one of x_t > th_x, y_t > th_y, and z_t > th_z is satisfied, where th_x is the threshold for the fundamental frequency, th_y is the threshold for the number of backchannels, and th_z is the threshold for the number of utterances per unit time. Note that the values of these thresholds th_x, th_y, and th_z are set in advance.
  • (Step 15) If it is determined in step 14 above that none of x_t > th_x, y_t > th_y, and z_t > th_z is satisfied, the emotion change analysis unit 215 determines that no emotion change has occurred in the first subject in unit time t. On the other hand, if it is determined in step 14 above that at least one of x_t > th_x, y_t > th_y, and z_t > th_z is satisfied, the emotion change analysis unit 215 calculates an index value S_t from x_t, y_t, and z_t using the weights described below.
  • a is a weight for the fundamental frequency
  • b is a weight for the number of backchannels
  • c is a weight for the number of utterances.
  • the values of a, b, and c are set in advance.
  • The index value S_t indicates the degree to which the emotion of the first subject has changed in unit time t, and the higher the value, the greater the change in the emotion of the first subject in unit time t. This index value S_t may be called, for example, the "degree of sensitivity."
  • (Step 16) The emotion change analysis unit 215 determines whether S_t > th_S is satisfied, where th_S is the threshold for the index value.
  • The value of the threshold th_S is set in advance. There are various methods for determining the value of the threshold th_S. For example, interviews regarding health guidance may be conducted with multiple subjects in the same environment, and, at moments when the interviewer judges that an emotional change has occurred in a subject (for example, the subject has become more health conscious, has become interested in health behavior, or has felt the need for behavioral change), the pitch of the voice (fundamental frequency), the number of backchannels, and the number of utterances per unit time are measured, and the value of the threshold th_S is determined using these measurements as training data.
  • (Step 17) If it is determined in step 16 above that S_t > th_S is not satisfied, the emotion change analysis unit 215 determines that no emotion change has occurred in the first subject in unit time t. On the other hand, if it is determined in step 16 above that S_t > th_S is satisfied, the emotion change analysis unit 215 determines that an emotion change has occurred in the first subject in unit time t.
  • (Step 18) If it is determined in step 17 above that an emotional change has occurred in the first subject, the emotion change analysis unit 215 identifies the time when the emotional change occurred in the first subject (hereinafter also referred to as the emotion change time). For example, the emotion change analysis unit 215 may identify the start time, end time, or middle of the unit time t as the emotion change time.
  • the emotional change analysis unit 215 may find the amount of change in the pitch of the first subject's voice (fundamental frequency) or the amount of change in the number of utterances within that unit time t, and identify the time when the amount of change in the pitch of the voice becomes equal to or exceeds a predetermined threshold or the time when the amount of change in the number of utterances becomes equal to or exceeds a predetermined threshold as the emotional change time.
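  • A minimal sketch of the per-unit-time judgment in steps 11 to 18 above (an illustration, not the disclosed implementation): the fundamental frequency x_t is assumed to be supplied by an external speech-processing step, and the index value S_t is assumed here to be a weighted sum of x_t, y_t, and z_t with the weights a, b, and c, since the exact formula is not reproduced in this text.

```python
from dataclasses import dataclass

# Dictionary of backchannel words for Step 12 (illustrative subset).
BACKCHANNEL_WORDS = {"yes", "yeah", "i see", "that's right", "uh-huh"}

@dataclass
class Thresholds:
    th_x: float  # threshold for the fundamental frequency
    th_y: float  # threshold for the number of backchannels
    th_z: float  # threshold for the number of utterances
    th_s: float  # threshold for the index value S_t

def emotion_change_in_unit_time(x_t: float, utterances: list[str],
                                a: float, b: float, c: float,
                                th: Thresholds) -> bool:
    """Return True if an emotion change is judged to have occurred in unit time t."""
    # Step 12: count backchannels among the first subject's utterances in unit time t.
    y_t = sum(1 for u in utterances
              if u.strip().lower().rstrip(".,!") in BACKCHANNEL_WORDS)
    # Step 13: number of utterances in unit time t.
    z_t = len(utterances)

    # Steps 14-15: if no per-feature threshold is exceeded, judge that no change occurred.
    if not (x_t > th.th_x or y_t > th.th_y or z_t > th.th_z):
        return False

    # Step 15 (assumed linear form): index value from the weighted features.
    s_t = a * x_t + b * y_t + c * z_t

    # Steps 16-17: an emotion change is judged to have occurred if S_t exceeds its threshold.
    # (Step 18, identifying the emotion change time within unit time t, is omitted here.)
    return s_t > th.th_s
```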
  • <Emotion analysis example 2> When an emotion change occurs, for example, the first subject's arm movements, upper-body movements, head movements, and up-and-down head movements (i.e., nodding and the like) may increase. For this reason, when calculating the index value S_t in step 15 of emotion analysis example 1 above, the amount of movement u_t of the first subject in unit time t may also be taken into consideration. That is, in step 15 of emotion analysis example 1 above, the emotion change analysis unit 215 may calculate the index value S_t by additionally using the amount of movement u_t together with the weight d described below.
  • d is a weight for the amount of movement of the first subject, and its value is set in advance.
  • The amount of movement u_t can be calculated from image data of the first subject when such image data is available.
  • Alternatively, the amount of movement u_t can be calculated from sensor values when an acceleration sensor or motion sensor is attached to the head or arm of the first subject and those sensor values are available.
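  • Under the same assumed linear form as in the sketch after example 1, emotion analysis example 2 would simply extend the index value with the weighted amount of movement, for example:

```python
def index_value_with_movement(x_t: float, y_t: float, z_t: float, u_t: float,
                              a: float, b: float, c: float, d: float) -> float:
    """Assumed form of S_t when the amount of movement u_t is also taken into account."""
    return a * x_t + b * y_t + c * z_t + d * u_t
```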
  • <Emotion analysis example 3> The amount of speech of the first subject may be used instead of, or in addition to, the pitch of the voice, the number of backchannels, and the number of utterances.
  • the amount of speech refers to the speech time, speech frequency, and speech length.
  • The speech time indicates, as a percentage, the proportion of the data length (i.e., the unit time) that is occupied by speech.
  • the speech frequency refers to the number of utterances per unit time.
  • the speech length refers to the average time from the start to the end of one utterance.
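  • The three speech-amount measures above could be computed, for example, from the start and end times of each utterance within one unit time; the interval representation below is an assumption.

```python
def speech_amount(utterance_spans: list[tuple[float, float]],
                  unit_time: float) -> dict[str, float]:
    """Compute speech time, speech frequency, and speech length for one unit time.

    utterance_spans : (start, end) times in seconds of each utterance by the first subject
    unit_time       : length of the unit time (the data length) in seconds
    """
    total_speech = sum(end - start for start, end in utterance_spans)
    n = len(utterance_spans)
    return {
        # Speech time: proportion of the data length occupied by speech, as a percentage.
        "speech_time_percent": 100.0 * total_speech / unit_time,
        # Speech frequency: number of utterances in the unit time.
        "speech_frequency": float(n),
        # Speech length: average time from the start to the end of one utterance.
        "speech_length_sec": total_speech / n if n else 0.0,
    }

# Example: three utterances within a 60-second unit time.
print(speech_amount([(2.0, 6.5), (10.0, 11.0), (30.0, 38.0)], unit_time=60.0))
```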
  • <Emotion analysis example 4> The presence or absence of an emotion change and the time of the change may also be determined by using an existing emotion analysis technique.
  • Examples of existing emotion analysis techniques include those described in References 3 to 8.
  • References 3 and 8 are techniques for analyzing emotion from audio
  • References 4 and 6 are techniques for analyzing emotion from video
  • Reference 7 is a technique for analyzing emotion from text
  • Reference 5 is a technique for analyzing emotion from text, audio, and video.
  • If it is not determined in step S105 that an emotional change has occurred in the first subject in any unit time t (NO in step S106), the case creation processing unit 210 ends the case creation process. On the other hand, if it is determined in step S105 that an emotional change has occurred in the first subject in a certain unit time t (YES in step S106), the conversation extraction unit 216 extracts, from the conversation DB 230, a conversation including the utterance at which the emotional change occurred and the utterances before and after it (step S107).
  • In this case, the conversation extraction unit 216 extracts from the conversation DB 230, for example, the utterance of the first subject whose utterance time is closest to the emotion change time of the first subject, the N utterances of the first subject before and after that utterance time, and the utterances of the first interviewer before and after that utterance time.
  • Alternatively, the conversation extraction unit 216 may extract from the conversation DB 230 a conversation that includes, for example, the utterance of the first subject (or of the first interviewer) whose utterance time is closest to the emotion change time of the first subject, and the utterances made within a predetermined time span before and after that utterance time.
  • Note that the utterance of the first interviewer whose utterance time is closest to the emotion change time of the first subject may also be extracted.
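  • A sketch of the extraction in step S107, reusing the UtteranceRecord structure assumed earlier: the utterance of the first subject closest to the emotion change time is used as the pivot, and up to N utterances of each speaker around it are kept. The details (the value of N, tie-breaking, or using the interviewer's utterance as the pivot) are assumptions.

```python
def extract_conversation(records: list, emotion_change_time: float, n: int = 2) -> list:
    """Sketch of step S107: extract the conversation around the emotion change time.

    records : UtteranceRecord-like objects (speaker, start_time, text) in time order
    """
    subject = [r for r in records if r.speaker == "first_subject"]
    interviewer = [r for r in records if r.speaker == "first_interviewer"]

    # Utterance of the first subject whose utterance time is closest to the emotion change time.
    pivot = min(subject, key=lambda r: abs(r.start_time - emotion_change_time))
    i = subject.index(pivot)

    # n utterances of the first subject before and after the pivot (pivot included).
    kept = subject[max(0, i - n): i + n + 1]

    # Utterances of the first interviewer before and after the pivot's utterance time.
    kept += [r for r in interviewer if r.start_time < pivot.start_time][-n:]
    kept += [r for r in interviewer if r.start_time >= pivot.start_time][:n]

    return sorted(kept, key=lambda r: r.start_time)
```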
  • the first subject characteristic acquisition unit 217 acquires the first subject characteristic (step S108).
  • the case information creation unit 218 creates case information that associates the subject ID of the first subject, the first subject characteristic acquired in step S108 above, the emotion change time identified in step S105 above, and the conversation extracted in step S107 above, and stores the case information in the case information DB 240 (step S109). This results in obtaining case information that includes the first subject characteristic and the conversation that causes an emotional change in the first subject having that characteristic.
  • <Similar case presentation process> The similar case presentation process according to this embodiment will be described below with reference to FIG. 5. First, the second subject characteristic acquisition unit 221 acquires the second subject characteristic (step S201).
  • the similar case acquisition unit 222 acquires case information including a first subject characteristic similar to the second subject characteristic acquired in step S201 above from the case information DB 240 as similar case information (step S202). Specifically, the similar case acquisition unit 222 acquires similar case information by steps 21 to 23 below.
  • (Step 21) The similar case acquisition unit 222 obtains the similarity between the second subject characteristic acquired in step S201 above and the first subject characteristic included in each piece of case information stored in the case information DB 240. If the subject characteristic consists of n items representing the characteristics of the subject, the first subject characteristic and the second subject characteristic are each expressed as an n-dimensional vector. Denoting the vector representing the first subject characteristic included in the m-th piece of case information by V^(m) and the vector representing the second subject characteristic by W, the similar case acquisition unit 222 obtains the similarity Sim(V^(m), W) for each m.
  • As the similarity Sim(·,·), for example, cosine similarity or the like may be used; however, the similarity is not limited to this, and any measure that can quantify the similarity between vectors can be used.
  • (Step 22) The similar case acquisition unit 222 obtains the index m' for which Sim(V^(m), W) obtained in step 21 above is maximum.
  • (Step 23) The similar case acquisition unit 222 acquires the m'-th piece of case information identified in step 22 above from the case information DB 240. This allows the case information containing the first subject characteristic that is most similar to the second subject characteristic to be obtained as the similar case information.
  • In the above, the case information containing the first subject characteristic most similar to the second subject characteristic is used as the similar case information, but the present embodiment is not limited to this.
  • For example, the top M' pieces of case information (where M' is an integer equal to or greater than 2) in descending order of similarity to the second subject characteristic may be acquired as the similar case information.
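  • A sketch of steps 21 to 23 above, assuming cosine similarity over the 0/1 characteristic vectors; returning the top M' cases corresponds to the variant mentioned just above.

```python
import math

def cosine_similarity(v: list[float], w: list[float]) -> float:
    """Sim(V, W): cosine similarity between two n-dimensional characteristic vectors."""
    dot = sum(a * b for a, b in zip(v, w))
    norm_v = math.sqrt(sum(a * a for a in v))
    norm_w = math.sqrt(sum(b * b for b in w))
    return dot / (norm_v * norm_w) if norm_v and norm_w else 0.0

def find_similar_cases(second_subject_vec: list[float], case_db: list, top_m: int = 1) -> list:
    """Steps 21-23: return the case information whose first subject characteristic
    is most similar to the second subject characteristic (or the top M' cases).

    case_db : list of (case_information, first_subject_characteristic_vector) pairs
    """
    scored = [(cosine_similarity(vec, second_subject_vec), case) for case, vec in case_db]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # descending order of similarity
    return [case for _, case in scored[:top_m]]
```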
  • the similar case presentation unit 223 presents the first subject characteristic and conversation contained in the similar case information acquired in step S202 above to the second interviewer (step S203).
  • the first subject characteristic and conversation may be displayed on the display of a terminal used by the second interviewer.
  • the first subject characteristic and conversation may be displayed on the display device 102 of the interview support device 10.
  • An example of the first subject characteristic presented to the second interviewer in step S203 above is shown in FIG. 6.
  • In the example shown in FIG. 6, the items of the presented first subject characteristic that have the same values as the corresponding items of the second subject characteristic are highlighted in a manner different from the other items. That is, in the example shown in FIG. 6, "Test value: high blood pressure", "Test value: high blood sugar", "Personality tendency: logical", "Drinking", and "Smoking" are highlighted. This allows the second interviewer to easily see which items represent the same characteristics as those of the second subject he or she is interviewing.
  • FIG. 7 shows an example of the conversation presented to the second interviewer in step S203 above.
  • In the conversation 2200 shown in FIG. 7, utterances 2201 to 2205 of the first subject and utterances 2211 to 2214 of the first interviewer are displayed, and utterance 2201 of the first subject, whose utterance time is closest to the emotion change time, is highlighted in a manner different from the other utterances.
  • This allows the second interviewer to know the utterance when the emotion change occurred (utterance 2201) and the utterances before and after it (utterances 2202-2205 and utterances 2211-2214), and he or she can use this as a reference to effectively conduct the interview.
  • In the above step S203, both the first subject characteristics and the conversation contained in the similar case information are presented to the second interviewer, but the present embodiment is not limited to this; for example, only the conversation contained in the similar case information may be presented to the second interviewer.
  • As described above, the interview support device 10 according to the present embodiment stores case information in which a conversation that occurred when an emotional change (mental change) arose in a subject during an interview with an interviewer is associated with the characteristics of that subject. Then, when another interview is conducted, the interview support device 10 extracts from the past case information, according to the characteristics of the subject of that interview, case information that includes characteristics similar to those of the subject, and presents it to the interviewer of that interview. In this way, the interview support device 10 according to the present embodiment makes it possible for a plurality of interviewers to share conversations that cause a mental change in subjects having similar characteristics. Therefore, by referring to the information presented by the interview support device 10, each interviewer can effectively and efficiently have a conversation that encourages a behavioral change in the subject.
  • In particular, in an interview regarding health guidance, because the level of interest in health and enthusiasm for the interview vary from person to person, the interviewer must proceed with the conversation while adjusting the dialogue process according to the personality traits and reactions of the subject picked up during the conversation, and must lead the subject to an action plan such as improving lifestyle habits while drawing out the subject's motivation (Reference 9). For this reason, by using the interview support device 10 of this embodiment, conversations that can draw out a subject's motivation become clear, and as a result the interviewer can provide the subject with effective and efficient health guidance.
  • Reference 1: Corpus of Japanese Daily Conversation
  • Reference 2: Hirai, Yuki and Inoue, Tomoo: State estimation in pair programming learning - Differences in conversation between success and failure in resolving stumbling blocks, Transactions of the Information Processing Society of Japan, Vol. 53, No. 1, pp. 72-80 (2012).
  • Reference 3: Emotion recognition technology, Internet <URL: https://www.docomo.ne.jp/corporate/technology/rd/tech/term/21/index.html>
  • Reference 4: Com Analyzer, Internet <URL: https://www.nttdata.com/jp/ja/news/release/2019/052700/>
  • Reference 5: AI suite, Internet <URL: https://cloud.watch.impress.co.jp/docs/news/1364523.html>
  • Reference 6: Heart Sensor for Communication, Internet <URL: https://service.cac.co.jp/hctech/ks4c>
  • Reference 7: Tone Analyzer, Internet <URL: https://cloud.ibm.com/docs/tone-analyzer/getting-started.html>
  • Reference 9: Tae Sato,

Landscapes

  • Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

An interview support device according to an embodiment of the present disclosure comprises: a determination unit configured to determine whether or not an emotional change has occurred in a first subject during an interview using at least one of the voice of the first subject and a first voice recognition result that represents the result of voice recognition of the voice of the first subject; a conversation extraction unit configured to, if it is determined that an emotional change has occurred in the first subject, extract a conversation comprising a plurality of utterances including an utterance that caused the emotional change, from the first voice recognition result and a second voice recognition result that represents the result of voice recognition of the voice of a first interviewer who is interviewing the first subject; a case creation unit configured to create case information in which the characteristics of the first subject are associated with the conversation, and store the created case information in a storage unit; and a similar case presentation unit configured to present, as similar case information, case information that is included in the case information stored in the storage unit, and that includes characteristics similar to the characteristics of a second subject to be interviewed, to a second interviewer who will interview the second subject.

Description

面談支援装置、面談支援方法及びプログラムInterview support device, interview support method, and program
 本開示は、面談支援装置、面談支援方法及びプログラムに関する。 This disclosure relates to an interview support device, an interview support method, and a program.
 面談者が被面談者(面談対象となる者。以下、「対象者」ともいう。)と面談する場合、面談者は対象者の感情変化を表情の変化、動作、視線、相槌等から捉え、その感情の変化によって或る特定の話題や事項等に対する対象者の興味や関心等を把握している。例えば、保健指導に関する面談の場では、面談者は、対象者との関係性が構築できたか、対象者は健康に関心を抱いたか、生活習慣を改善しようという気持ちになったか、行動を起こす動機づけが十分か等を対象者の声のトーンの変化や発話量の変化等から捉えた感情変化によって把握している。 When an interviewer interviews an interviewee (the person being interviewed; hereafter also referred to as the "subject"), the interviewer picks up on the subject's emotional changes from facial expressions, movements, gaze, responses, etc., and from these emotional changes he or she is able to grasp the subject's interest or concern in a particular topic or matter. For example, in an interview regarding health guidance, the interviewer will grasp from emotional changes picked up from changes in the subject's tone of voice, volume of speech, etc., whether a rapport has been established with the subject, whether the subject has become interested in health, whether they feel motivated to improve their lifestyle, whether they are sufficiently motivated to take action, etc.
 なお、特定保健指導を受ける人がヘルスプロモーション行動へ向かう動機づけの要素を明らかにする研究として、特定保健指導を受ける人に対して半構成的インタビューを実施し、そのインタビューの内容から動機づけの要素を抽出する研究が知られている(非特許文献1)。 In addition, one known study to clarify the motivational factors that motivate people who receive specific health guidance to engage in health promotion behavior is to conduct semi-structured interviews with people who receive specific health guidance and extract motivational factors from the content of the interviews (Non-Patent Document 1).
 しかしながら、保健指導を含む様々な面談(面接、会話等も含む。)において、対象者に感情変化を生じさせる会話のノウハウは面談者個々人に蓄積され、その共有はなされていない。また、対象者に感情変化を生じさせる会話は、その対象者の特性(例えば、対象者の性格や生活習慣、健康意識の高さ等)に応じて異なり得る。 However, in various consultations (including interviews, conversations, etc.) including health guidance, the know-how of conversations that induce emotional changes in the subject is accumulated by each individual interviewer and is not shared. Furthermore, conversations that induce emotional changes in the subject may differ depending on the characteristics of the subject (for example, the subject's personality, lifestyle, level of health awareness, etc.).
 本開示は、上記の点に鑑みてなされたもので、対象者の特性に応じてその対象者に感情変化を生じさせる会話を面談者に提示できる技術を提供する。 The present disclosure has been made in consideration of the above points, and provides technology that can present an interviewee with a conversation that induces an emotional change in the subject according to the subject's characteristics.
 本開示の一態様による面談支援装置は、面談中の第1の対象者の音声と、前記第1の対象者の音声を音声認識した結果を表す第1の音声認識結果との少なくとも一方を用いて、前記第1の対象者に感情変化が生じたか否かを判定するように構成されている判定部と、前記第1の対象者に感情変化が生じたと判定された場合、前記第1の音声認識結果と、前記第1の対象者と面談を行っている第1の面談者の音声を音声認識した結果を表す第2の音声認識結果との中から、前記感情変化が生じた発話を含む複数の発話で構成される会話を抽出するように構成されている会話抽出部と、前記第1の対象者の特性と、前記会話とを対応付けた事例情報を作成して記憶部に記憶させるように構成されている事例作成部と、前記記憶部に記憶されている事例情報のうち、面談対象の第2の対象者の特性と類似する特性が含まれる事例情報を類似事例情報として前記第2の対象者と面談する第2の面談者に提示するように構成されている類似事例提示部と、を有する。 The interview support device according to one aspect of the present disclosure includes a determination unit configured to determine whether or not an emotional change has occurred in the first subject using at least one of the voice of the first subject during an interview and a first voice recognition result representing the result of voice recognition of the voice of the first subject; a conversation extraction unit configured to extract a conversation consisting of a plurality of utterances including an utterance in which the emotional change has occurred from the first voice recognition result and a second voice recognition result representing the result of voice recognition of the voice of the first interviewer who is interviewing the first subject when it is determined that an emotional change has occurred in the first subject; a case creation unit configured to create case information that associates the characteristics of the first subject with the conversation and store the case information in a storage unit; and a similar case presentation unit configured to present case information stored in the storage unit that includes characteristics similar to the characteristics of the second subject to be interviewed as similar case information to the second interviewer who is interviewing the second subject.
 対象者の特性に応じてその対象者に感情変化を生じさせる会話を面談者に提示できる技術が提供される。 Technology is provided that can present an interviewee with a conversation that induces emotional changes in the subject based on the subject's characteristics.
本実施形態に係る面談支援装置のハードウェア構成の一例を示す図である。1 is a diagram illustrating an example of a hardware configuration of an interview support device according to an embodiment of the present invention. 本実施形態に係る面談支援装置の機能構成の一例を示す図である。1 is a diagram illustrating an example of a functional configuration of an interview support device according to an embodiment of the present invention. 事例情報の一例を示す図である。FIG. 11 is a diagram illustrating an example of case information. 本実施形態に係る事例作成処理の一例を示すフローチャートである。10 is a flowchart illustrating an example of a case creation process according to the present embodiment. 本実施形態に係る類似事例提示処理の一例を示すフローチャートである。10 is a flowchart illustrating an example of a similar case presentation process according to the present embodiment. 類似事例に含まれる対象者特性の提示結果の一例を示す図である。FIG. 13 is a diagram showing an example of a presentation result of subject characteristics included in similar cases. 類似事例に含まれる会話の提示結果の一例を示す図である。FIG. 13 is a diagram showing an example of a presentation result of conversations included in similar cases.
 以下、本発明の一実施形態について説明する。以下の実施形態では、保健指導を含む様々な面談(面接、会話等も含む。)を対象として、その面談の対象者の特性に応じて当該対象者に感情変化を生じさせる会話を面談に提示できる面談支援装置10について説明する。本実施形態に係る面談支援装置10により、面談者は、対象者の特性に応じてその対象者に感情変化を生じさせる会話を知ることができるため、その対象者との面談を効果的に進めることが可能となる。例えば、保健指導に関する面談では、面談者は、面談支援装置10から提示された会話を参考に、生活習慣の改善等といった対象者に行動変容を促す会話を効果的に行うことが可能となる。 An embodiment of the present invention will be described below. In the following embodiment, an interview support device 10 will be described that can present a conversation that will cause an emotional change in the subject of the interview according to the characteristics of the subject of the interview, for various interviews (including interviews, conversations, etc.) including health guidance. The interview support device 10 of this embodiment allows the interviewer to know the conversation that will cause an emotional change in the subject according to the characteristics of the subject, making it possible to effectively conduct an interview with the subject. For example, in an interview regarding health guidance, the interviewer can refer to the conversation presented by the interview support device 10 and effectively have a conversation that encourages the subject to change their behavior, such as improving their lifestyle habits.
 なお、感情変化は「心的変化」等と呼ばれてもよく、対象者に感情変化を生じさせる会話とは対象者の心に響いた又は心に刺さった会話のことを意味する。 In addition, emotional changes may also be called "mental changes," and a conversation that causes an emotional change in a subject refers to a conversation that touches or penetrates the subject's heart.
 以下では、面談として保健指導に関する面談を想定する。ただし、これは一例であって、本実施形態に係る面談支援装置10が適用可能な面談は保健指導に関する面談に限られるものではない。保健指導に関する面談以外にも、例えば、学校等における進路指導に関する面談、塾等における学習指導に関する面談、会社等における人事面談や業務面談、採用面接等いった様々な面談(面接も含む。)に適用可能である。より一般には、或る者(面談者)が他の1人以上の者(対象者)との間で何等かの会話を行う場合にも適用可能である。また、面談(面接、会話等を含む。)の形式はオンライン面談(Web面談等を含む。)であってもよいし、対面形式の面談であってもよい。 In the following, an interview regarding health guidance is assumed as the interview. However, this is only one example, and interviews to which the interview support device 10 according to this embodiment can be applied are not limited to interviews regarding health guidance. In addition to interviews regarding health guidance, the device can be applied to various interviews (including interviews) such as interviews regarding career guidance at schools, interviews regarding learning guidance at cram schools, personnel interviews and business interviews at companies, and employment interviews. More generally, the device can be applied to cases in which a certain person (interviewer) has some kind of conversation with one or more other people (subjects). The interview (including interviews, conversations, etc.) may be in the form of an online interview (including web interviews, etc.) or a face-to-face interview.
 ここで、本実施形態に係る面談支援装置10は、「事例作成処理」と「類似事例提示処理」という2つの処理を実行する。「事例作成処理」とは、面談者と対象者の音声を音声認識した結果や事前の問診結果等を用いて、対象者の特性とその対象者に感情変化を生じさせた会話とを対応付けた事例情報を作成する処理のことである。一方で、「類似事例提示処理」とは、事例作成処理で作成された事例情報のうち、対象者の特性に類似する特性が含まれる事例情報を類似事例情報として取得し、この類似事例情報に含まれる会話を面談者に提示する処理のことである。事例作成処理は類似事例提示処理によりも前に実行されるが、或る程度の事例情報が作成された後は、例えば、事例作成処理が類似事例提示処理のバックグラウンドで実行されたり、定期的又は非定期的に事例作成処理が実行されたりしてもよい。 Here, the interview support device 10 according to this embodiment executes two processes, a "case creation process" and a "similar case presentation process." The "case creation process" is a process for creating case information that associates the characteristics of the subject with a conversation that caused an emotional change in the subject, using the results of speech recognition of the voices of the interviewee and the subject, the results of a preliminary interview, etc. On the other hand, the "similar case presentation process" is a process for acquiring case information that includes characteristics similar to the characteristics of the subject from the case information created in the case creation process as similar case information, and presenting the conversation included in this similar case information to the interviewee. The case creation process is executed before the similar case presentation process, but after a certain amount of case information has been created, for example, the case creation process may be executed in the background of the similar case presentation process, or the case creation process may be executed periodically or non-periodically.
 なお、以下では、事例作成処理における面談者及び対象者のことをそれぞれ「第1の面談者」及び「第1の対象者」ともいい、類似事例提示処理における面談者及び対象者のことをそれぞれ「第2の面談者」及び「第2の対象者」ともいう。また、第1の対象者の特性のことを「第1の対象者特性」、第2の対象者の特性のことを「第2の対象者特性」ともいう。ここで、対象者の特性とは、その対象者の特徴を表す性質のことである。例えば、保健指導に関する面談における対象者の特性としては、その対象者の身体面の特性として健康診断における検査値、心理面の特性として性格的傾向や健康意識、社会面の特性として職業や家族構成、その対象者の生活習慣等が挙げられる。ただし、これらはいずれも一例であって、対象者の特性はこれらに限られるものではなく、面談の種類に応じて様々な特性を対象者の特性として利用することが可能である。 In the following, the interviewer and the subject in the case creation process are also referred to as the "first interviewer" and the "first subject", respectively, and the interviewer and the subject in the similar case presentation process are also referred to as the "second interviewer" and the "second subject". The characteristics of the first subject are also referred to as the "first subject characteristics", and the characteristics of the second subject are also referred to as the "second subject characteristics". Here, the subject's characteristics are the nature that represents the characteristics of the subject. For example, the characteristics of a subject in an interview regarding health guidance include the test results in a health check as the subject's physical characteristics, personality tendencies and health consciousness as psychological characteristics, and occupation, family structure, and the subject's lifestyle as social characteristics. However, these are all examples, and the subject's characteristics are not limited to these, and various characteristics can be used as the subject's characteristics depending on the type of interview.
 <面談支援装置10のハードウェア構成例>
 本実施形態に係る面談支援装置10のハードウェア構成例を図1に示す。図1に示すように、本実施形態に係る面談支援装置10は一般的なコンピュータのハードウェア構成で実現され、例えば、入力装置101と、表示装置102と、外部I/F103と、通信I/F104と、RAM(Random Access Memory)105と、ROM(Read Only Memory)106と、補助記憶装置107と、プロセッサ108とを有する。これらの各ハードウェアは、それぞれがバス109を介して通信可能に接続される。
<Example of Hardware Configuration of Interview Support Device 10>
An example of the hardware configuration of an interview support device 10 according to this embodiment is shown in Fig. 1. As shown in Fig. 1, the interview support device 10 according to this embodiment is realized with the hardware configuration of a general computer, and has, for example, an input device 101, a display device 102, an external I/F 103, a communication I/F 104, a RAM (Random Access Memory) 105, a ROM (Read Only Memory) 106, an auxiliary storage device 107, and a processor 108. Each of these pieces of hardware are connected to each other so as to be able to communicate with each other via a bus 109.
The input device 101 is, for example, a keyboard, a mouse, a touch panel, physical buttons, or the like. The display device 102 is, for example, a display, a display panel, or the like. Note that the interview support device 10 does not have to include at least one of the input device 101 and the display device 102.
The external I/F 103 is an interface to external devices such as a recording medium 103a. Examples of the recording medium 103a include a CD (Compact Disc), a DVD (Digital Versatile Disk), an SD memory card (Secure Digital memory card), and a USB (Universal Serial Bus) memory card.
The communication I/F 104 is an interface for connecting the interview support device 10 to a communication network. The RAM 105 is a volatile semiconductor memory (storage device) that temporarily holds programs and data. The ROM 106 is a non-volatile semiconductor memory (storage device) that can retain programs and data even when the power is turned off. The auxiliary storage device 107 is a non-volatile storage device such as an HDD (Hard Disk Drive), an SSD (Solid State Drive), or a flash memory. The processor 108 is, for example, an arithmetic device such as a CPU (Central Processing Unit).
Note that the hardware configuration shown in Fig. 1 is an example, and the hardware configuration of the interview support device 10 is not limited to it. For example, the interview support device 10 may have multiple auxiliary storage devices 107 or multiple processors 108, may lack some of the illustrated hardware, or may have various hardware other than the illustrated hardware (e.g., a microphone, a speaker, a camera, etc.).
<Example of Functional Configuration of Interview Support Device 10>
An example of the functional configuration of the interview support device 10 according to this embodiment is shown in Fig. 2. As shown in Fig. 2, the interview support device 10 according to this embodiment has a case creation processing unit 210 and a similar case presentation processing unit 220. Each of these units is realized, for example, by processing that one or more programs installed in the interview support device 10 cause the processor 108 or the like to execute. The interview support device 10 according to this embodiment also has a conversation DB 230 and a case information DB 240. Each of these DBs (databases) is realized, for example, by the auxiliary storage device 107 or the like. However, either or both of the conversation DB 230 and the case information DB 240 may be realized by a storage device provided in, for example, a database server connected to the interview support device 10 via a communication network.
In the example shown in Fig. 2, the case creation processing unit 210, the similar case presentation processing unit 220, the conversation DB 230, and the case information DB 240 are held by the interview support device 10 realized by a single computer, but they may also be distributed over multiple computers. In that case, the system realized by these multiple computers may be called an "interview support system" or the like.
The case creation processing unit 210 executes the case creation process. The case creation processing unit 210 includes a first interviewer voice recording unit 211, a first interviewer voice recognition unit 212, a first subject voice recording unit 213, a first subject voice recognition unit 214, an emotion change analysis unit 215, a conversation extraction unit 216, a first subject characteristic acquisition unit 217, and a case information creation unit 218.
The first interviewer voice recording unit 211 records the voice of the first interviewer input to the microphone for the first interviewer and creates voice data (hereinafter also referred to as first interviewer voice data). The microphone for the first interviewer may be directly connected to or built into the interview support device 10, may be connected to the interview support device 10 via a communication network, or, in the case of an online interview, may be directly connected to or built into a terminal (a PC (personal computer), a smartphone, a tablet terminal, etc.) used by the first interviewer.
The first interviewer voice recognition unit 212 performs speech recognition on the voice represented by the first interviewer voice data and creates text data with time information (hereinafter also referred to as first interviewer text data) in which text representing the utterance content of the voice is associated with the utterance time. The first interviewer voice recognition unit 212 also stores this first interviewer text data in the conversation DB 230. The first interviewer voice recognition unit 212 may create the first interviewer text data using an existing speech recognition technique.
The first subject voice recording unit 213 records the voice of the first subject input to the microphone for the first subject and creates voice data (hereinafter also referred to as first subject voice data). As with the first interviewer, the microphone for the first subject may be directly connected to or built into the interview support device 10, may be connected to the interview support device 10 via a communication network, or, in the case of an online interview, may be directly connected to or built into a terminal (a PC, a smartphone, a tablet terminal, etc.) used by the first subject.
The first subject voice recognition unit 214 performs speech recognition on the voice represented by the first subject voice data and creates text data with time information (hereinafter also referred to as first subject text data) in which text representing the utterance content of the voice is associated with the utterance time. The first subject voice recognition unit 214 also stores this first subject text data in the conversation DB 230. The first subject voice recognition unit 214 may create the first subject text data using an existing speech recognition technique.
In the case of a face-to-face interview, the microphone for the first interviewer and the microphone for the first subject may be shared. In this case, the first interviewer voice recording unit 211 and the first subject voice recording unit 213 may be a single unit, as may the first interviewer voice recognition unit 212 and the first subject voice recognition unit 214, and a speech recognition technique capable of recognizing speech separately for each speaker may be used.
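By way of illustration only (not part of the disclosure; the field names are assumptions), a time-stamped text record stored in the conversation DB 230 might be structured as in the following sketch:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class UtteranceRecord:
    """One time-stamped utterance as it might be stored in the conversation DB 230 (illustrative schema)."""
    speaker: str          # "interviewer" or "subject"
    text: str             # speech recognition result for this utterance
    start_time: datetime  # utterance time associated with the text

# Hypothetical examples of what the speech recognition units 212/214 might store.
conversation_db = [
    UtteranceRecord("interviewer", "Your blood pressure is a little high.", datetime(2022, 9, 20, 11, 40, 10)),
    UtteranceRecord("subject", "I see. What happens if my blood vessels are damaged?", datetime(2022, 9, 20, 11, 40, 22)),
]
```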
The emotion change analysis unit 215 performs emotion analysis or emotion recognition (hereinafter collectively referred to as "emotion analysis") using the first subject voice data and the first subject text data to determine whether or not an emotional change has occurred in the first subject and, if an emotional change has occurred, to identify the time at which it occurred.
When it is determined that an emotional change has occurred in the first subject, the conversation extraction unit 216 uses the first interviewer text data and the first subject text data stored in the conversation DB 230 to extract a conversation that includes the utterance at which the emotional change occurred and the utterances before and after it.
The first subject characteristic acquisition unit 217 acquires the first subject characteristics. For example, in an interview regarding health guidance, the first subject characteristic acquisition unit 217 may acquire the first subject characteristics from data representing the results of a health checkup, data from a medical questionnaire, or the like. The first subject characteristic acquisition unit 217 may also acquire the first subject characteristics from at least one of the first interviewer text data and the first subject text data.
The case information creation unit 218 creates case information that associates a subject ID, which is identification information identifying the first subject, the first subject characteristics acquired by the first subject characteristic acquisition unit 217, the time at which the emotional change occurred in the first subject, and the conversation extracted by the conversation extraction unit 216. The case information creation unit 218 also stores the case information in the case information DB 240.
The similar case presentation processing unit 220 executes the similar case presentation process. The similar case presentation processing unit 220 includes a second subject characteristic acquisition unit 221, a similar case acquisition unit 222, and a similar case presentation unit 223.
The second subject characteristic acquisition unit 221 acquires the second subject characteristics, and may do so in the same manner as the first subject characteristic acquisition unit 217. That is, for example, in an interview regarding health guidance, the second subject characteristic acquisition unit 221 may acquire the second subject characteristics from data representing the results of a health checkup, data from a medical questionnaire, or the like. In addition, when speech recognition results of the second interviewer's voice and the second subject's voice exist (that is, second interviewer text data and second subject text data), the second subject characteristic acquisition unit 221 may acquire the second subject characteristics from at least one of the second interviewer text data and the second subject text data.
The similar case acquisition unit 222 acquires, from the case information DB 240, case information that includes first subject characteristics similar to the second subject characteristics as similar case information.
The similar case presentation unit 223 presents the first subject characteristics and the conversation included in the similar case information to the second interviewer. As a result, a conversation that caused an emotional change in a subject having characteristics similar to those of the second subject currently being interviewed is presented to the second interviewer, who can then refer to this conversation to conduct the interview with the second subject effectively.
The conversation DB 230 stores the first interviewer text data and the first subject text data.
The case information DB 240 stores case information. An example of the case information will be described later.
Note that the functional configuration of the interview support device 10 shown in Fig. 2 is an example and is not limited to this. For example, there may also be a "second interviewer voice recording unit" that records the voice of the second interviewer, a "second interviewer voice recognition unit" that performs speech recognition on the voice of the second interviewer to create the second interviewer text data, a "second subject voice recording unit" that records the voice of the second subject, and a "second subject voice recognition unit" that performs speech recognition on the voice of the second subject to create the second subject text data.
<Case Information>
An example of case information for an interview regarding health guidance is shown in Fig. 3. As shown in Fig. 3, the case information includes a "subject ID" that is identification information identifying the first subject, "first subject characteristics" indicating the characteristics of the first subject, an "emotion change time" indicating the time at which the emotional change occurred, and a "conversation" indicating the conversation extracted by the conversation extraction unit 216.
For example, the case information shown in Fig. 3 includes the subject ID "B1". The case information shown in Fig. 3 also includes, as the first subject characteristics, items representing characteristics classified into "physical", "psychological", "social", "lifestyle", and so on. The physical characteristics include items such as "test value: BMI", "test value: high blood pressure", and "test value: high blood sugar". Similarly, the psychological characteristics include "personality tendency: meticulous", "personality tendency: lazy", "personality tendency: logical", "health consciousness: high", and "health consciousness: low"; the social characteristics include "family structure: spouse" and "family structure: children"; and the lifestyle characteristics include items such as "exercise: golf", "exercise: running", "drinking", "smoking", and "sleep". In the example shown in Fig. 3, each of these items is represented by a binary value (for example, "1" if a certain condition corresponding to the item is satisfied and "0" if it is not). For example, "test value: BMI" is "1" if the BMI value is outside a predetermined standard range and "0" otherwise. Similarly, "personality tendency: meticulous" is "1" if the subject has a meticulous personality and "0" otherwise, and "drinking" is "1" if the subject has a drinking habit and "0" otherwise. The same applies to the items representing the other characteristics.
The case information shown in Fig. 3 also includes the emotion change time "2022/9/20 11:40:22". Furthermore, as the conversation including the utterances before and after the emotional change, the case information shown in Fig. 3 includes the utterance at the time the emotional change occurred together with the two first interviewer utterances and the two first subject utterances before and after that time. In the example shown in Fig. 3, the emotional change occurred at the first subject's utterance "I see. What happens if your blood vessels are damaged?" (the underlined utterance in Fig. 3), and the case information includes the two first interviewer utterances before and after that utterance and the two first subject utterances before and after it.
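As a minimal illustrative sketch (the field names and the particular 0/1 values are assumptions chosen to mirror Fig. 3, not a definitive schema), one case information record could be encoded as follows, with the subject characteristics held as binary features:

```python
# Illustrative sketch of one case information record as it might be stored in the case information DB 240.
case_info = {
    "subject_id": "B1",
    # Binary encoding of the first subject characteristics (1 = condition satisfied, 0 = not satisfied).
    "characteristics": {
        "test_bmi": 0, "test_high_blood_pressure": 1, "test_high_blood_sugar": 1,
        "personality_meticulous": 0, "personality_lazy": 0, "personality_logical": 1,
        "health_consciousness_high": 0, "health_consciousness_low": 1,
        "family_spouse": 1, "family_children": 0,
        "exercise_golf": 0, "exercise_running": 0,
        "drinking": 1, "smoking": 1, "sleep": 0,
    },
    "emotion_change_time": "2022-09-20 11:40:22",
    # Conversation: the utterance at the emotional change plus the surrounding utterances.
    "conversation": [
        {"speaker": "interviewer", "text": "..."},
        {"speaker": "subject", "text": "I see. What happens if your blood vessels are damaged?"},
        {"speaker": "interviewer", "text": "..."},
    ],
}
```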
<Case Creation Process>
The case creation process according to this embodiment will be described below with reference to Fig. 4. Note that the following steps S101 to S109 are repeatedly executed, for example, each time an interview is conducted between a first interviewer and a first subject.
The first interviewer voice recording unit 211 records the voice of the first interviewer and creates first interviewer voice data (step S101).
The first interviewer voice recognition unit 212 performs speech recognition on the voice represented by the first interviewer voice data created in step S101 and creates first interviewer text data (step S102). The first interviewer text data is stored in the conversation DB 230.
The first subject voice recording unit 213 records the voice of the first subject and creates first subject voice data (step S103).
The first subject voice recognition unit 214 performs speech recognition on the voice represented by the first subject voice data created in step S103 and creates first subject text data (step S104). The first subject text data is stored in the conversation DB 230.
Steps S101 to S102 and steps S103 to S104 are repeatedly executed, for example, every unit time (a predetermined time span of, for example, several tens of seconds to about one minute), and step S105 and the subsequent steps are executed after the interview ends. That is, the conversation DB 230 stores first interviewer text data and first subject text data for each unit time. However, processing does not necessarily have to be per unit time; for example, the voice from the start to the end of the interview may be recorded in steps S101 and S103, and that voice may then be recognized in steps S102 and S104 to create the first interviewer text data and the first subject text data. Likewise, step S105 does not necessarily have to be executed after the interview ends; it may be executed during the interview, for example, after a certain amount of first interviewer text data and first subject text data has been stored in the conversation DB 230.
Next, the emotion change analysis unit 215 performs emotion analysis using the first subject voice data and the first subject text data to determine whether or not an emotional change has occurred in the first subject and, if an emotional change has occurred, to identify the time at which it occurred (step S105). Several examples of emotion analysis performed by the emotion change analysis unit 215 are described below.
・Emotion analysis example 1
In this example, the emotion change analysis unit 215 analyzes emotional change from the tone (pitch) of the first subject's voice, the number of backchannels, and the number of utterances. This is because the higher the tone of the voice, the more frequent the backchannels, and the greater the number of utterances, the more the first subject is considered to be interested in the first interviewer's approach (e.g., health guidance) and to be undergoing an emotional change.
Specifically, the emotion change analysis unit 215 analyzes emotional change for each unit time by the following procedures 11 to 18. For simplicity, the unit time is denoted by t, and the case of analyzing the emotional change in a certain unit time t is described below.
Procedure 11: The emotion change analysis unit 215 uses the first subject voice data in the unit time t to obtain the fundamental frequency x_t, which represents the pitch of the voice in the unit time t.
Procedure 12: The emotion change analysis unit 215 obtains the number of backchannels y_t in the unit time t using the first subject text data in the unit time t and a dictionary in which words used as backchannels are registered. Examples of words used as backchannels include "yes", "yeah", "I see", "that's right", and "uh-huh".
Procedure 13: The emotion change analysis unit 215 obtains the number of utterances z_t in the unit time t using at least one of the first subject voice data and the first subject text data in the unit time t. An utterance is a unit of speech having syntactic, discursive, and interactional coherence (Reference 1).
Procedure 14: The emotion change analysis unit 215 determines whether at least one of x_t > th_x, y_t > th_y, and z_t > th_z is satisfied, where th_x is the threshold for the fundamental frequency, th_y is the threshold for the number of backchannels, and th_z is the threshold for the number of utterances per unit time. The values of the thresholds th_x, th_y, and th_z are set in advance.
Procedure 15: If it is determined in procedure 14 that none of x_t > th_x, y_t > th_y, and z_t > th_z is satisfied, the emotion change analysis unit 215 determines that no emotional change has occurred in the first subject in the unit time t. On the other hand, if at least one of x_t > th_x, y_t > th_y, and z_t > th_z is satisfied, the emotion change analysis unit 215 calculates the following index value S_t:
S_t = (a×x_t + b×y_t + c×z_t) / (a + b + c)
Here, a is the weight for the fundamental frequency, b is the weight for the number of backchannels, and c is the weight for the number of utterances; the values of a, b, and c are set in advance. The index value S_t represents the degree to which the first subject's emotion changed in the unit time t, and the higher the value, the greater the emotional change of the first subject in the unit time t. This index value S_t may be called, for example, the "degree of sensitivity".
Procedure 16: The emotion change analysis unit 215 determines whether S_t > th_S is satisfied, where th_S is the threshold for the index value. The value of th_S is set in advance. Various methods are conceivable for determining th_S. For example, interviews regarding health guidance may be conducted with multiple subjects in the same environment, the voice pitch (fundamental frequency), the number of backchannels, and the number of utterances per unit time may be measured at times when the interviewer judges that an emotional change has occurred in the subject (for example, the subject has become more health conscious, has become interested in health behavior, or has felt the need for behavioral change), and th_S may be determined using those measurements as supervision.
Procedure 17: If it is determined in procedure 16 that S_t > th_S is not satisfied, the emotion change analysis unit 215 determines that no emotional change has occurred in the first subject in the unit time t. On the other hand, if S_t > th_S is satisfied, the emotion change analysis unit 215 determines that an emotional change has occurred in the first subject in the unit time t.
Procedure 18: If it is determined in procedure 17 that an emotional change has occurred in the first subject, the emotion change analysis unit 215 identifies the time at which the emotional change occurred in the first subject (hereinafter also referred to as the emotion change time). For example, the emotion change analysis unit 215 may identify the start time, the end time, or an intermediate time of the unit time t as the emotion change time. Alternatively, when the unit time t is a relatively long time span (for example, about one minute or more), the emotion change analysis unit 215 may obtain the amount of change in the pitch (fundamental frequency) of the first subject's voice or in the number of utterances within the unit time t, and identify as the emotion change time the time at which the amount of change in pitch or in the number of utterances becomes equal to or greater than a predetermined threshold.
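The following is a minimal sketch of procedures 11 to 18 under stated assumptions (per-utterance records carrying an estimated fundamental frequency are assumed to be available, and the function name, record format, thresholds, and weights are illustrative, not part of the disclosure):

```python
from statistics import mean

# Dictionary of backchannel words (illustrative; procedure 12 assumes such a dictionary exists).
BACKCHANNEL_WORDS = {"yes", "yeah", "i see", "that's right", "uh-huh"}

def analyze_emotion_change(utterances, th_x, th_y, th_z, th_s, a=1.0, b=1.0, c=1.0):
    """Sketch of procedures 11-18 for one unit time t.

    utterances: first-subject utterances in unit time t, each a dict with
    "text", "start_time", and "mean_f0" (an estimated fundamental frequency).
    Thresholds th_x, th_y, th_z, th_s and weights a, b, c are set in advance.
    Returns (changed, emotion_change_time)."""
    if not utterances:
        return False, None
    x_t = mean(u["mean_f0"] for u in utterances)                 # procedure 11: voice pitch
    y_t = sum(u["text"].strip().lower() in BACKCHANNEL_WORDS
              for u in utterances)                               # procedure 12: backchannel count
    z_t = len(utterances)                                        # procedure 13: utterance count
    if not (x_t > th_x or y_t > th_y or z_t > th_z):             # procedure 14
        return False, None                                       # procedure 15: no change
    s_t = (a * x_t + b * y_t + c * z_t) / (a + b + c)            # procedure 15: index value S_t
    if s_t <= th_s:                                              # procedures 16-17
        return False, None
    # Procedure 18: here the start time of the unit time (first utterance) is used as the
    # emotion change time; the other choices described in the text are equally valid.
    return True, utterances[0]["start_time"]
```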
・Emotion analysis example 2
When an emotional change occurs, for example, the movement of the first subject's arms or upper body, or the movement of the head including up-and-down head movement (i.e., nodding), is considered to become larger. For this reason, the amount of movement u_t of the first subject in the unit time t may be taken into account when calculating the index value S_t in procedure 15 of emotion analysis example 1. That is, in procedure 15 of emotion analysis example 1, the emotion change analysis unit 215 may calculate the index value S_t as follows:
S_t = (a×x_t + b×y_t + c×z_t + d×u_t) / (a + b + c + d)
Here, d is the weight for the first subject's amount of movement, and its value is set in advance.
The amount of movement u_t can be obtained, for example, from video data of the first subject when such video data is available. Alternatively, when an acceleration sensor, a motion sensor, or the like is attached to the first subject's head, arm, or the like and its sensor values are available, u_t may be obtained from those sensor values.
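As an illustration only (the frame format and keypoint names are assumptions; the patent does not prescribe how u_t is computed), the amount of movement could be approximated from the frame-to-frame displacement of tracked keypoints:

```python
import math

def movement_amount(frames):
    """Sketch: approximate u_t as the summed frame-to-frame displacement of tracked keypoints.

    frames: list of dicts mapping keypoint names (e.g. "head", "left_wrist") to (x, y)
    positions for each video frame within unit time t. Keypoint extraction itself
    (e.g. by a pose estimator) is assumed to happen elsewhere."""
    total = 0.0
    for prev, cur in zip(frames, frames[1:]):
        for name, (x1, y1) in prev.items():
            if name in cur:
                x2, y2 = cur[name]
                total += math.hypot(x2 - x1, y2 - y1)  # displacement of this keypoint between frames
    return total
```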
・Emotion analysis example 3
In emotion analysis example 1 or 2, the first subject's amount of speech (Reference 2) may be used instead of, or in addition to, the voice pitch, the number of backchannels, and the number of utterances. The amount of speech refers to the speech time, the speech frequency, and the speech length. The speech time is the proportion of the data length (i.e., the unit time) during which speech occurred. The speech frequency is the number of utterances per unit time. The speech length is the average time from the start to the end of a single utterance.
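A minimal sketch of these three speech-amount measures, assuming each utterance record carries start and end times (the record format is an assumption):

```python
def speech_amount(utterances, unit_time_seconds):
    """Sketch: speech time (ratio), speech frequency, and speech length for one unit time.

    utterances: list of dicts with "start" and "end" given in seconds within the unit time."""
    durations = [u["end"] - u["start"] for u in utterances]
    speech_time = sum(durations) / unit_time_seconds if unit_time_seconds > 0 else 0.0  # ratio of time spent speaking
    speech_frequency = len(utterances)                                                  # utterances per unit time
    speech_length = sum(durations) / len(durations) if durations else 0.0               # average utterance duration
    return speech_time, speech_frequency, speech_length
```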
・Emotion analysis example 4
The presence or absence of an emotional change and its time may also be determined using an existing emotion analysis technique. Examples of existing emotion analysis techniques include those described in References 3 to 8. References 3 and 8 are techniques for analyzing emotion from speech, References 4 and 6 are techniques for analyzing emotion from video, Reference 7 is a technique for analyzing emotion from text, and Reference 5 is a technique for analyzing emotion from text, speech, and video.
If it is not determined in step S105 that an emotional change occurred in the first subject in any unit time t (NO in step S106), the case creation processing unit 210 ends the case creation process. On the other hand, if it is determined in step S105 that an emotional change occurred in the first subject in some unit time t (YES in step S106), the conversation extraction unit 216 extracts, from the conversation DB 230, a conversation that includes the utterance at which the emotional change occurred and the utterances before and after it (step S107). That is, the conversation extraction unit 216 extracts from the conversation DB 230, for example, the first subject's utterance whose utterance time is closest to the first subject's emotion change time, the N first subject utterances before and the N after that utterance time, and the first interviewer's utterances before and after that utterance time. As a result, a conversation consisting of 2N+1 first subject utterances and 2N first interviewer utterances is extracted. Here, N is a predetermined integer of 1 or more; for example, N = 2 or N = 3 is conceivable, but N is not limited to these values. Alternatively, the conversation extraction unit 216 may extract from the conversation DB 230 a conversation including, for example, the utterance of the first subject (or the first interviewer) whose utterance time is closest to the first subject's emotion change time and the utterances made within a predetermined time span before and after that utterance time.
When extracting the conversation in step S107, the utterance of the first interviewer whose utterance time is closest to the emotion change time may be extracted in addition to, or instead of, the utterance of the first subject whose utterance time is closest to the emotion change time.
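The following sketch illustrates step S107 under the record format assumed earlier (speaker-tagged, time-stamped utterances); N and the function name are illustrative:

```python
def extract_conversation(conversation_db, emotion_change_time, n=2):
    """Sketch of step S107: pick the subject utterance closest to the emotion change time,
    the n subject utterances before and after it, and the n interviewer utterances before
    and after it, then return them in chronological order.

    conversation_db: list of dicts {"speaker", "text", "start_time"} sorted by time."""
    subject = [u for u in conversation_db if u["speaker"] == "subject"]
    interviewer = [u for u in conversation_db if u["speaker"] == "interviewer"]
    if not subject:
        return []
    # Utterance of the first subject closest to the emotion change time.
    pivot = min(subject, key=lambda u: abs((u["start_time"] - emotion_change_time).total_seconds()))
    i = subject.index(pivot)
    picked = subject[max(0, i - n): i + n + 1]                                  # up to 2n+1 subject utterances
    picked += [u for u in interviewer if u["start_time"] < pivot["start_time"]][-n:]   # n interviewer utterances before
    picked += [u for u in interviewer if u["start_time"] >= pivot["start_time"]][:n]   # n interviewer utterances after
    return sorted(picked, key=lambda u: u["start_time"])
```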
The first subject characteristic acquisition unit 217 acquires the first subject characteristics (step S108).
The case information creation unit 218 creates case information that associates the subject ID of the first subject, the first subject characteristics acquired in step S108, the emotion change time identified in step S105, and the conversation extracted in step S107, and stores it in the case information DB 240 (step S109). As a result, case information is obtained that includes the first subject characteristics and a conversation that causes an emotional change in a first subject having those characteristics.
<Similar Case Presentation Process>
The similar case presentation process according to this embodiment will be described with reference to Fig. 5. Note that the following steps S201 to S203 are repeatedly executed, for example, each time an interview is conducted between a second interviewer and a second subject.
The second subject characteristic acquisition unit 221 acquires the second subject characteristics (step S201).
The similar case acquisition unit 222 acquires, from the case information DB 240, case information that includes first subject characteristics similar to the second subject characteristics acquired in step S201 as similar case information (step S202). Specifically, the similar case acquisition unit 222 acquires the similar case information by the following procedures 21 to 23.
Procedure 21: The similar case acquisition unit 222 calculates the similarity between the second subject characteristics acquired in step S201 and the first subject characteristics included in each piece of case information stored in the case information DB 240. Here, assuming that the subject characteristics include n items representing the characteristics of a subject, the first subject characteristics and the second subject characteristics are each represented by an n-dimensional vector. In the following, the first subject characteristics included in the m-th piece of case information stored in the case information DB 240 are denoted by V^(m) = (v_1^(m), ..., v_n^(m)), and the second subject characteristics by W = (w_1, ..., w_n). The similar case acquisition unit 222 then calculates the similarity Sim(V^(m), W) from the first subject characteristics V^(m) and the second subject characteristics W for each m = 1, ..., M (where M is the number of pieces of case information stored in the case information DB 240). As the similarity Sim(·,·), for example, the cosine similarity may be used, but the similarity is not limited to it; any measure of similarity between vectors can be used.
Procedure 22: The similar case acquisition unit 222 finds the m' that maximizes Sim(V^(m), W) obtained in procedure 21.
Procedure 23: The similar case acquisition unit 222 acquires the m'-th piece of case information identified in procedure 22 from the case information DB 240. As a result, the case information that includes the first subject characteristics most similar to the second subject characteristics is obtained as the similar case information.
In step S202 above, the case information including the first subject characteristics most similar to the second subject characteristics is used as the similar case information, but this is not limiting; for example, the top M' (M' is an integer of 2 or more) pieces of case information in descending order of similarity to the second subject characteristics may be acquired as the similar case information.
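A minimal sketch of procedures 21 to 23 with cosine similarity, assuming the characteristic items of each case have been flattened into a fixed-order vector (the field and function names are assumptions):

```python
import math

def cosine_similarity(v, w):
    """Cosine similarity between two equal-length characteristic vectors."""
    dot = sum(a * b for a, b in zip(v, w))
    norm = math.sqrt(sum(a * a for a in v)) * math.sqrt(sum(b * b for b in w))
    return dot / norm if norm else 0.0

def find_similar_cases(case_db, second_subject_vector, top_k=1):
    """Sketch of procedures 21-23: rank the stored case information by the similarity of the
    first subject characteristics to the second subject characteristics and return the top_k
    most similar cases (top_k > 1 corresponds to taking the top M' cases)."""
    ranked = sorted(case_db,
                    key=lambda case: cosine_similarity(case["characteristics_vector"],
                                                       second_subject_vector),
                    reverse=True)
    return ranked[:top_k]
```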
The similar case presentation unit 223 presents the first subject characteristics and the conversation included in the similar case information acquired in step S202 to the second interviewer (step S203). When presenting the first subject characteristics and the conversation included in the similar case information to the second interviewer, they may be displayed, for example, on the display of a terminal used by the second interviewer. Alternatively, when the second interviewer is using the interview support device 10 itself, the first subject characteristics and the conversation may be displayed on the display device 102 of the interview support device 10.
An example of the first subject characteristics presented to the second interviewer in step S203 is shown in Fig. 6. In the first subject characteristics 2100 shown in Fig. 6, the items having the same values as the corresponding items of the second subject characteristics are highlighted in a manner different from the other items. That is, in the example shown in Fig. 6, "test value: high blood pressure", "test value: high blood sugar", "personality tendency: logical", "drinking", and "smoking" are highlighted. This allows the second interviewer to easily see which items are the same characteristics as those of the second subject he or she is interviewing.
An example of the conversation presented to the second interviewer in step S203 is shown in Fig. 7. In the conversation 2200 shown in Fig. 7, the first subject's utterances 2201 to 2205 and the first interviewer's utterances 2211 to 2214 are displayed, and the first subject's utterance 2201, whose utterance time is closest to the emotion change time, is highlighted in a manner different from the other utterances. This allows the second interviewer to see the utterance at which the emotional change occurred (utterance 2201) and the utterances before and after it (utterances 2202 to 2205 and utterances 2211 to 2214), and to use them as a reference to conduct the interview effectively.
In step S203 above, both the first subject characteristics and the conversation included in the similar case information are presented to the second interviewer, but this is not limiting; for example, only the conversation included in the similar case information may be presented to the second interviewer.
<Summary>
As described above, the interview support device 10 according to this embodiment accumulates, as case information, a conversation at the time an emotional change (mental change) occurred in a subject during an interview with an interviewer, in association with the characteristics of that subject. When another interview is conducted, the interview support device 10 according to this embodiment extracts, from the past case information and according to the characteristics of the subject of that interview, case information that includes characteristics similar to those characteristics, and presents it to the interviewer of that interview. In this way, the interview support device 10 according to this embodiment makes it possible for multiple interviewers to share conversations that cause a mental change in subjects having at least mutually similar characteristics. Each interviewer can therefore refer to the information presented by the interview support device 10 and effectively and efficiently conduct conversations that encourage behavioral change and the like in the subject.
For example, in interviews regarding health guidance, the level of interest in health and the degree of engagement with the interview vary from person to person, so the interviewer must proceed with the conversation while adapting the dialogue process to the subject's personality traits and reactions observed during the dialogue, drawing out the subject's motivation and connecting it to an action plan such as improving lifestyle habits (Reference 9). By using the interview support device 10 according to this embodiment, conversations that can draw out a subject's motivation become clear, and as a result the interviewer can provide effective and efficient health guidance to the subject.
The present invention is not limited to the specifically disclosed embodiments described above, and various modifications, changes, and combinations with known techniques are possible without departing from the scope of the claims.
[References]
Reference 1: Corpus of Japanese Daily Conversation | Multifaceted research on spoken language based on a large-scale corpus of daily conversation, Internet <URL: https://www2.ninjal.ac.jp/conversation/cejc-monitor/transcript.html>
Reference 2: Hirai, Yuki, and Inoue, Tomoo: State estimation in pair programming learning - Differences in conversation between success and failure in resolving stumbling blocks, Transactions of the Information Processing Society of Japan, Vol. 53, No. 1, pp. 72-80 (2012).
Reference 3: Emotion recognition technology, Internet <URL: https://www.docomo.ne.jp/corporate/technology/rd/tech/term/21/index.html>
Reference 4: Com Analyzer, Internet <URL: https://www.nttdata.com/jp/ja/news/release/2019/052700/>
Reference 5: AI suite, Internet <URL: https://cloud.watch.impress.co.jp/docs/news/1364523.html>
Reference 6: Heart Sensor for Communication, Internet <URL: https://service.cac.co.jp/hctech/ks4c>
Reference 7: Tone Analyzer, Internet <URL: https://cloud.ibm.com/docs/tone-analyzer/getting-started.html>
Reference 8: Web Empath API, Internet <URL: https://www.apibank.jp/ApiBank/api/detail?api_no=555&api_type=I>
Reference 9: Tae Sato, Kaori Fujimura, Reiko Ariga, Yasuo Ishigure, Asami Miyajima, Tamae Ogata, Akina Mine, Emiko Kikuchi, Yasushi Nishizaki, "An Attempt at Analyzing Dialogue Processes in Health Guidance for the Prevention of Lifestyle-Related Diseases," IEICE Technical Report, vol. 122, no. 166, HCS2022-41, pp. 27-32, August 2022.
10 Interview support device
101 Input device
102 Display device
103 External I/F
103a Recording medium
104 Communication I/F
105 RAM
106 ROM
107 Auxiliary storage device
108 Processor
109 Bus
210 Case creation processing unit
211 First interviewer voice recording unit
212 First interviewer voice recognition unit
213 First subject voice recording unit
214 First subject voice recognition unit
215 Emotion change analysis unit
216 Conversation extraction unit
217 First subject characteristic acquisition unit
218 Case information creation unit
220 Similar case presentation processing unit
221 Second subject characteristic acquisition unit
222 Similar case acquisition unit
223 Similar case presentation unit
230 Conversation DB
240 Case information DB

Claims (8)

1. An interview support device comprising:
    a determination unit configured to determine whether or not an emotional change has occurred in a first subject, using at least one of a voice of the first subject during an interview and a first speech recognition result representing a result of speech recognition of the voice of the first subject;
    a conversation extraction unit configured to, when it is determined that an emotional change has occurred in the first subject, extract a conversation consisting of a plurality of utterances including the utterance at which the emotional change occurred from the first speech recognition result and a second speech recognition result representing a result of speech recognition of a voice of a first interviewer who is interviewing the first subject;
    a case creation unit configured to create case information in which characteristics of the first subject are associated with the conversation and store the case information in a storage unit; and
    a similar case presentation unit configured to present, as similar case information, case information that is stored in the storage unit and includes characteristics similar to characteristics of a second subject to be interviewed to a second interviewer who interviews the second subject.
2. The interview support device according to claim 1, wherein the similar case presentation unit is configured to highlight, among the plurality of utterances constituting the conversation included in the similar case information, the utterance at which the emotional change occurred and present it to the second interviewer.
3. The interview support device according to claim 2, wherein the similar case presentation unit is configured to highlight, among the characteristics included in the similar case information, the characteristics that are the same as those of the second subject and present them to the second interviewer.
4. The interview support device according to any one of claims 1 to 3, wherein the determination unit is configured to:
    calculate, using the voice of the first subject and the first speech recognition result, the pitch of the first subject's voice, the number of backchannels, and the number of utterances per unit time;
    calculate, using the pitch of the voice, the number of backchannels, and the number of utterances, an index value representing the degree to which the emotion of the first subject has changed; and
    determine from the index value whether or not the emotional change has occurred.
5. The interview support device according to claim 4, wherein the determination unit is configured to:
    further calculate an amount of movement of the head or arm of the first subject using video data of the first subject or sensor values of a sensor attached to the first subject; and
    calculate the index value further using the amount of movement.
6. The interview support device according to claim 1, wherein the characteristics include physical characteristics, psychological characteristics, social characteristics, and lifestyle characteristics of a subject.
7. An interview support method executed by a computer, the method comprising:
    a determination step of determining whether or not an emotional change has occurred in a first subject, using at least one of a voice of the first subject during an interview and a first speech recognition result representing a result of speech recognition of the voice of the first subject;
    a conversation extraction step of, when it is determined that an emotional change has occurred in the first subject, extracting a conversation consisting of a plurality of utterances including the utterance at which the emotional change occurred from the first speech recognition result and a second speech recognition result representing a result of speech recognition of a voice of a first interviewer who is interviewing the first subject;
    a case creation step of creating case information in which characteristics of the first subject are associated with the conversation and storing the case information in a storage unit; and
    a similar case presentation step of presenting, as similar case information, case information that is stored in the storage unit and includes characteristics similar to characteristics of a second subject to be interviewed to a second interviewer who interviews the second subject.
8. A program that causes a computer to execute:
    a determination step of determining whether or not an emotional change has occurred in a first subject, using at least one of a voice of the first subject during an interview and a first speech recognition result representing a result of speech recognition of the voice of the first subject;
    a conversation extraction step of, when it is determined that an emotional change has occurred in the first subject, extracting a conversation consisting of a plurality of utterances including the utterance at which the emotional change occurred from the first speech recognition result and a second speech recognition result representing a result of speech recognition of a voice of a first interviewer who is interviewing the first subject;
    a case creation step of creating case information in which characteristics of the first subject are associated with the conversation and storing the case information in a storage unit; and
    a similar case presentation step of presenting, as similar case information, case information that is stored in the storage unit and includes characteristics similar to characteristics of a second subject to be interviewed to a second interviewer who interviews the second subject.
PCT/JP2022/043607 2022-11-25 2022-11-25 Interview support device, interview support method, and program WO2024111121A1 (en)

Priority Applications (1)

Application Number: PCT/JP2022/043607 (published as WO2024111121A1); Priority Date: 2022-11-25; Filing Date: 2022-11-25; Title: Interview support device, interview support method, and program

Publications (1)

Publication Number: WO2024111121A1; Publication Date: 2024-05-30

Family

ID=91195861

Family Applications (1)

Application Number: PCT/JP2022/043607 (published as WO2024111121A1); Title: Interview support device, interview support method, and program; Priority Date: 2022-11-25; Filing Date: 2022-11-25

Country Status (1)

Country: WO; Publication: WO2024111121A1

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11306195A (en) * 1998-04-24 1999-11-05 Mitsubishi Electric Corp Information retrieval system and method therefor
WO2019163700A1 (en) * 2018-02-20 2019-08-29 日本電気株式会社 Customer service support device, customer service support method, recording medium with customer service support program stored therein
JP2020184216A (en) * 2019-05-08 2020-11-12 株式会社日立システムズ Proposal support system and proposal support method
WO2021255795A1 (en) * 2020-06-15 2021-12-23 日本電信電話株式会社 Information processing device, information processing method, and program

Legal Events

Code 121 Ep: The EPO has been informed by WIPO that EP was designated in this application
    Ref document number: 22966529
    Country of ref document: EP
    Kind code of ref document: A1