WO2022145040A1 - Video meeting evaluation terminal, video meeting evaluation system, and video meeting evaluation program - Google Patents

Video meeting evaluation terminal, video meeting evaluation system, and video meeting evaluation program

Info

Publication number
WO2022145040A1
Authority
WO
WIPO (PCT)
Prior art keywords
video meeting
user terminal
video
unit
moving image
Prior art date
Application number
PCT/JP2020/049295
Other languages
English (en)
Japanese (ja)
Inventor
渉三 神谷
Original Assignee
株式会社I’mbesideyou
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社I’mbesideyou filed Critical 株式会社I’mbesideyou
Priority to JP2022517944A priority Critical patent/JP7465012B2/ja
Priority to PCT/JP2020/049295 priority patent/WO2022145040A1/fr
Publication of WO2022145040A1 publication Critical patent/WO2022145040A1/fr

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]

Definitions

  • This disclosure relates to a video meeting evaluation terminal, a video meeting evaluation system, and a video meeting evaluation program.
  • Conventionally, a system for teaching some knowledge or giving explanations online is known (see, for example, Patent Document 1).
  • However, the method of measuring the effect by the above-mentioned questionnaire tends to be subjective, which makes it insufficient as a method of objectively measuring the effect of the content of a video meeting.
  • Therefore, an object of the present invention is, in particular, to objectively evaluate the content of a video meeting.
  • According to the present disclosure, there is provided a video meeting system including at least: a first user terminal having at least a first camera unit and a first display unit; a second user terminal having at least a second camera unit and a second display unit; and a video meeting service terminal that provides a video meeting to the first user terminal and the second user terminal and stores at least a moving image acquired by the first camera unit or the second camera unit. The first user terminal includes a display means for displaying at least a moving image acquired from the video meeting with the second user terminal. The second user terminal includes an acquisition means for acquiring the moving image of the second camera, an identification means for identifying at least a facial image contained in the moving image for each predetermined frame unit, and a providing means for providing face information regarding the identified face image to the first user terminal. A video meeting evaluation terminal is thereby obtained.
  • Since the acquired moving image is assumed to be stored in the terminal, it is analyzed and evaluated on the terminal, and the result is provided to the user of the terminal. Therefore, for example, even a video meeting containing personal information or confidential information can be analyzed and evaluated without providing the video itself to an external evaluation company or the like.
  • [Item 1] A video meeting system including at least: a first user terminal having at least a first camera unit and a first display unit; a second user terminal having at least a second camera unit and a second display unit; and a video meeting service terminal that provides a video meeting to the first user terminal and the second user terminal and stores at least a moving image acquired by the first camera unit or the second camera unit, wherein the first user terminal includes a display means for displaying at least a moving image acquired from the video meeting with the second user terminal, and the second user terminal includes an acquisition means for acquiring the moving image of the second camera, an identification means for identifying at least a facial image contained in the moving image for each predetermined frame unit, and a providing means for providing face information regarding the identified face image to the first user terminal.
  • [Item 2] The video meeting system according to Item 1, wherein the second user terminal further includes an evaluation means for calculating an evaluation value regarding emotions based on the identified face image, and the providing means provides only the evaluation value to the first user terminal as the information about the face obtained from the second camera.
  • [Item 3] The video meeting system according to Item 1 or Item 2, wherein the face information includes information regarding the presence or absence of the user's face acquired by the second camera.
  • [Item 4] The video meeting system, wherein the face information includes information regarding the orientation of the user's face acquired by the second camera.
  • [Item 5] The video meeting system, wherein the face information includes information about an object generated based on the user's emotion acquired by the second camera.
  • [Item 6] A video meeting terminal that allows video meetings with other user terminals, including an acquisition means for acquiring a self-moving image acquired by the camera unit, an identification means for identifying at least a facial image contained in the moving image for each predetermined frame unit, and a providing means for providing face information regarding the identified face image to the other user terminal.
  • [Item 7] A video meeting program that causes a video meeting terminal allowing video meetings with other user terminals to function as: an acquisition means for acquiring a self-moving image acquired by the camera unit; an identification means for identifying at least a facial image contained in the moving image for each predetermined frame unit; and a providing means for providing face information regarding the identified face image to the other user terminal.
  • [Item 8] A video meeting method using a video meeting terminal that allows video meetings with other user terminals, including an acquisition step of acquiring a self-moving image acquired by the camera unit, an identification step of identifying at least a facial image contained in the moving image for each predetermined frame unit, and a provision step of providing face information regarding the identified face image to the other user terminal.
  • The video meeting evaluation system of the present embodiment (hereinafter sometimes simply referred to as "this system") is used in an environment where a video meeting (hereinafter referred to as an online session, including both one-way and two-way sessions) is held by a plurality of people, and it analyzes and evaluates specific emotions of a person to be analyzed (feelings that arise in response to one's own or others' words and actions, such as comfort or discomfort, and their degree).
  • An online session is, for example, an online conference, an online class, an online chat, or the like, in which terminals installed in multiple locations are connected to a server via a communication network such as the Internet, and moving images can be exchanged between the terminals through the server.
  • the moving images handled in the online session include the face image and voice of the user who uses the terminal.
  • the moving image also includes an image such as a material shared and viewed by a plurality of users. It is possible to switch between the face image and the material image on the screen of each terminal to display only one of them, or to divide the display area and display the face image and the material image at the same time. Further, it is possible to display the image of one of a plurality of people on the full screen, or to display the image of a part or all of the users on a small screen.
  • In this system, the leader, facilitator, or administrator of the online session (hereinafter collectively referred to as the organizer) designates any user as the analysis target.
  • Organizers of online sessions include, for example, instructors of online classes, chairs and facilitators of online conferences, and coaches of sessions for coaching purposes.
  • the organizer of an online session is usually one of a plurality of users who participate in the online session, but may be another person who does not participate in the online session.
  • all the participants may be the analysis target without designating the analysis target person.
  • the video meeting evaluation system displays at least a moving image acquired from the video meeting when the video meeting session is established between a plurality of terminals.
  • the displayed moving image is acquired by the terminal, and at least the facial image contained in the moving image is identified for each predetermined frame unit. After that, the evaluation value for the identified facial image is calculated.
  • the evaluation value is shared as necessary.
  • In the present embodiment, the acquired moving image is stored in the terminal, analyzed and evaluated on the terminal, and the result is provided to the user of the terminal. Therefore, for example, even a video meeting containing personal information or confidential information can be analyzed and evaluated without providing the video itself to an external evaluation organization or the like. Further, by providing only the evaluation result (evaluation value) to an external terminal as needed, the results can be visualized, cross-analyzed, and so on.
  • The video meeting evaluation system includes user terminals 10 and 20 each having at least an input unit such as a camera unit and a microphone unit, a display unit such as a display, and an output unit such as a speaker. It also includes a video meeting service terminal 30 that provides bidirectional video meetings to the user terminals 10 and 20, and an evaluation terminal 40 that performs a part of the evaluation related to the video meeting.
  • FIG. 2 is a diagram showing a hardware configuration example of a computer that realizes each of the terminals 10 to 40 according to the present embodiment.
  • the computer includes at least a control unit 110, a memory 120, a storage 130, a communication unit 140, an input / output unit 150, and the like. These are electrically connected to each other through the bus 160.
  • the control unit 110 is an arithmetic unit that controls the operation of each terminal as a whole, controls the transmission and reception of data between each element, and performs information processing necessary for application execution and authentication processing.
  • the control unit 110 is a processor such as a CPU, and executes each information processing by executing a program or the like stored in the storage 130 and expanded in the memory 120.
  • the memory 120 includes a main storage configured by a volatile storage device such as a DRAM and an auxiliary storage configured by a non-volatile storage device such as a flash memory or an HDD.
  • the memory 120 is used as a work area or the like of the control unit 110, and also stores a BIOS executed when each terminal is started, various setting information, and the like.
  • the storage 130 stores various programs such as application programs.
  • a database storing data used for each process may be built in the storage 130.
  • the moving image in the online session is not recorded in the storage 130 of the video meeting service terminal 30, but is stored in the storage 130 of the user terminal 10.
  • The evaluation terminal 40 stores applications and other programs necessary for evaluating the moving image acquired on the user terminal 10, and provides them to the user terminal 10 as appropriate so that they can be used.
  • The storage 130 managed by the evaluation terminal 40 may share, for example, only the evaluation result produced by the analysis on the user terminal 10.
  • the communication unit 140 connects the terminal to the network.
  • The communication unit 140 communicates directly with an external device or via a network access point, using, for example, a wired LAN, a wireless LAN, Wi-Fi (registered trademark), infrared communication, Bluetooth (registered trademark), short-range or non-contact communication, or the like.
  • the input / output unit 150 is, for example, an information input device such as a keyboard, a mouse, and a touch panel, and an output device such as a display.
  • the bus 160 is commonly connected to each of the above elements and transmits, for example, an address signal, a data signal, and various control signals.
  • the evaluation terminal acquires a moving image from the video meeting service terminal, identifies at least the facial image contained in the moving image for each predetermined frame unit, and calculates the evaluation value for the facial image.
  • The video meeting service provided by the video meeting service terminal (hereinafter sometimes simply referred to as "this service") enables bidirectional image and voice communication between the user terminals 10 and 20.
  • This service displays a moving image acquired by the camera unit of the other user terminal on the display of the user terminal, and can output the sound acquired by the microphone unit of the other user terminal from the speaker.
  • This service is configured so that moving images and audio (collectively referred to as "moving images, etc.") can be recorded in a storage unit on at least one of the user terminals, by either or both user terminals.
  • The recorded moving image information Vs (hereinafter referred to as "recording information") is cached in the user terminal that started the recording and is stored only locally on one of the user terminals. If necessary, the user can view the recording information within the scope of this service, share it with others, and so on.
  • the user terminal 10 evaluates the moving image acquired as described above by the following analysis.
  • FIG. 4 is a block diagram showing a configuration example according to the present embodiment.
  • the video meeting evaluation system of the present embodiment is realized as a functional configuration of the user terminal 10. That is, the user terminal 10 includes a moving image acquisition unit 11, a biological reaction analysis unit 12, a peculiarity determination unit 13, a related event identification unit 14, a clustering unit 15, and an analysis result notification unit 16 as its functions.
  • Each of the above functional blocks 11 to 16 can be configured by any of hardware, DSP (Digital Signal Processor), and software provided in the user terminal 10, for example.
  • In practice, each of the above functional blocks 11 to 16 is configured to include a computer CPU, RAM, ROM, and the like, and is realized by the operation of a program stored in a recording medium such as a RAM, a ROM, a hard disk, or a semiconductor memory.
  • the moving image acquisition unit 11 acquires a moving image obtained by shooting a plurality of people (multiple users) with a camera provided in each terminal during an online session. It does not matter whether the moving image acquired from each terminal is set to be displayed on the screen of each terminal. That is, the moving image acquisition unit 11 acquires the moving image from each terminal, including the moving image being displayed on each terminal and the moving image being hidden.
  • the biological reaction analysis unit 12 analyzes changes in the biological reaction of each of a plurality of people based on the moving image (whether or not it is displayed on the screen) acquired by the moving image acquisition unit 11.
  • the biological reaction analysis unit 12 separates the moving image acquired by the moving image acquisition unit 11 into a set of images (a collection of frame images) and a voice, and analyzes changes in the biological reaction from each.
  • Specifically, the biological reaction analysis unit 12 analyzes the user's face image using the frame images separated from the moving image acquired by the moving image acquisition unit 11, and thereby analyzes changes in biological reactions related to at least one of facial expression, line of sight, pulse, and facial movement. In addition, the biological reaction analysis unit 12 analyzes the voice separated from the moving image acquired by the moving image acquisition unit 11, and thereby analyzes changes in the biological reaction related to at least one of the user's statement content and voice quality.
  • the biological reaction analysis unit 12 calculates the biological reaction index value reflecting the content of the change in the biological reaction by quantifying the change in the biological reaction according to a predetermined standard.
  • Analysis of changes in facial expressions is performed, for example, as follows. For each frame image, a face area is specified from the frame image, and the identified facial expression is classified into one of a plurality of types according to an image analysis model trained in advance by machine learning. Based on the classification results, it is analyzed whether a positive or a negative facial expression change occurs between consecutive frame images, and how large that change is; a facial expression change index value according to the analysis result is then output.
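  • A minimal sketch of such an expression-change analysis, assuming a pre-trained image-analysis model that yields per-frame class probabilities (the label set, valence signs, and example values below are illustrative assumptions):

```python
import numpy as np

EMOTIONS = ("happy", "neutral", "sad", "angry")   # assumed label set
VALENCE = np.array([1.0, 0.0, -1.0, -1.0])        # assumed +/- sign per label

def expression_change_index(frame_probs):
    """frame_probs: (n_frames, n_emotions) probabilities from a pre-trained
    expression classifier applied to the face area of each frame."""
    valence = frame_probs @ VALENCE   # signed expression score per frame
    return np.diff(valence)          # >0: positive change, <0: negative change

# Dummy probabilities for three consecutive frames:
probs = np.array([[0.70, 0.20, 0.05, 0.05],
                  [0.20, 0.60, 0.10, 0.10],
                  [0.10, 0.20, 0.40, 0.30]])
print(expression_change_index(probs))  # approximately [-0.6 -0.6]
```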
  • Analysis of changes in the line of sight is performed, for example, as follows. For each frame image, the eye area is specified from the frame image, and the orientation of both eyes is analyzed to determine where the user is looking: for example, whether the user is looking at the displayed speaker's face, the displayed shared material, or outside the screen. It may also be analyzed whether the movement of the line of sight is large or small, and frequent or infrequent. Changes in the line of sight are also related to the user's degree of concentration.
  • the biological reaction analysis unit 12 outputs the line-of-sight change index value according to the analysis result of the line-of-sight change.
  • Analysis of pulse changes is performed, for example, as follows. For each frame image, the face area is specified from the frame image. Then, using a trained image analysis model that captures the numerical color information of the face (G in RGB), the change of the G channel on the face surface is analyzed. Arranging the results along the time axis forms a waveform representing the change in color information, and the pulse is identified from this waveform. When a person is nervous the pulse becomes faster, and when he or she feels calm the pulse becomes slower.
  • the biological reaction analysis unit 12 outputs a pulse change index value according to the analysis result of the pulse change.
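  • A minimal sketch of the pulse estimation just described, assuming per-frame mean G values over the detected face area are already available; the frame rate, band edges, and peak spacing are illustrative assumptions:

```python
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

def estimate_pulse_bpm(face_green_means, fps=30.0):
    """face_green_means: mean G (of RGB) over the face area for each frame."""
    x = np.asarray(face_green_means, dtype=float) - np.mean(face_green_means)
    # Band-pass to plausible heart rates (0.7-3.0 Hz, i.e. 42-180 bpm).
    b, a = butter(3, [0.7 / (fps / 2), 3.0 / (fps / 2)], btype="band")
    filtered = filtfilt(b, a, x)
    peaks, _ = find_peaks(filtered, distance=fps / 3.0)  # peaks >= 1/3 s apart
    return len(peaks) / (len(x) / fps / 60.0)            # beats per minute

# Quick check with a synthetic 1.2 Hz (72 bpm) color oscillation, 20 s at 30 fps:
t = np.arange(0, 20, 1 / 30.0)
print(estimate_pulse_bpm(100 + np.sin(2 * np.pi * 1.2 * t)))  # close to 72
```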
  • Analysis of changes in facial movement is performed, for example, as follows. For each frame image, the face area is specified from the frame image, and the orientation of the face is analyzed to determine where the user is looking: for example, whether the user is looking at the displayed speaker's face, the displayed shared material, or outside the screen. It may also be analyzed whether the movement of the face is large or small, and frequent or infrequent. The movement of the face and the movement of the line of sight may also be analyzed in combination; for example, it may be analyzed whether the user looks straight at the displayed speaker's face, looks up or down at it, or looks at it from an angle.
  • the biological reaction analysis unit 12 outputs a face orientation change index value according to the analysis result of the face orientation change.
  • The content of statements is analyzed, for example, as follows. The biological reaction analysis unit 12 converts the voice into a character string by performing known voice recognition processing on the voice for a specified time (for example, about 30 to 150 seconds), and performs morphological analysis of the character string to remove words that are unnecessary for expressing the conversation, such as particles and interjections. The remaining words are then vectorized, and it is analyzed whether a positive or a negative emotional change is occurring and how large that change is; a statement content index value according to the analysis result is then output.
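  • A minimal sketch of the statement-content scoring, using whitespace tokenization as an English stand-in for morphological analysis; the stopword list and sentiment lexicon are illustrative assumptions:

```python
STOPWORDS = {"the", "a", "is", "um", "uh"}  # placeholder for removed words

def statement_content_index(transcript: str, lexicon: dict) -> float:
    """lexicon: word -> sentiment in [-1, 1]; unknown words count as 0."""
    words = [w for w in transcript.lower().split() if w not in STOPWORDS]
    if not words:
        return 0.0
    return sum(lexicon.get(w, 0.0) for w in words) / len(words)

print(statement_content_index("um I really like this plan",
                              {"like": 0.8, "really": 0.3}))  # about 0.22
```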
  • Voice quality analysis is performed, for example, as follows. The biological reaction analysis unit 12 identifies the acoustic characteristics of the voice by performing known voice analysis processing on the voice for a specified time (for example, about 30 to 150 seconds). Then, based on the acoustic characteristics, it is analyzed whether a positive or a negative voice quality change is occurring and how large that change is; a voice quality change index value according to the analysis result is then output.
  • The biological reaction analysis unit 12 calculates the biological reaction index value using at least one of the facial expression change index value, the line-of-sight change index value, the pulse change index value, the face orientation change index value, the statement content index value, and the voice quality change index value calculated as described above. For example, the biological reaction index value is calculated by weighting and combining these index values.
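  • A minimal sketch of the weighted combination; the weights are illustrative assumptions, since the disclosure does not give specific values:

```python
WEIGHTS = {"expression": 0.30, "gaze": 0.15, "pulse": 0.15,
           "face_orientation": 0.10, "statement": 0.15, "voice_quality": 0.15}

def biological_reaction_index(index_values: dict) -> float:
    """Weighted sum of the per-viewpoint index values listed above."""
    return sum(WEIGHTS[k] * index_values[k] for k in WEIGHTS)

print(biological_reaction_index({"expression": 0.4, "gaze": -0.2,
                                 "pulse": 0.1, "face_orientation": 0.0,
                                 "statement": 0.5, "voice_quality": -0.1}))
# about 0.165
```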
  • The peculiarity determination unit 13 determines whether or not the change in the biological reaction analyzed for the person to be analyzed is peculiar compared to the changes in the biological reaction analyzed for persons other than that person. In the present embodiment, the peculiarity determination unit 13 makes this determination based on the biological reaction index values calculated for each of the plurality of users by the biological reaction analysis unit 12.
  • Specifically, the peculiarity determination unit 13 calculates the variance of the biological reaction index values calculated for each of the plurality of persons by the biological reaction analysis unit 12, and determines whether the change in the biological reaction analyzed for the person to be analyzed is peculiar compared to the others by comparing that person's biological reaction index value with the variance.
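  • A minimal sketch of this variance-based test. The disclosure does not state whether the subject is included in the baseline; here the subject's value is compared against the spread of the other participants, and the z-score threshold is an illustrative assumption:

```python
import numpy as np

def is_peculiar(others_values, subject_value, z_threshold=2.0):
    """others_values: biological reaction index values of the other users."""
    values = np.asarray(others_values, dtype=float)
    std = values.std()
    if std == 0.0:
        return subject_value != values.mean()
    return abs(subject_value - values.mean()) / std > z_threshold

print(is_peculiar([0.10, 0.18, 0.12, 0.15], 0.60))  # True: far outside spread
```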
  • The following three patterns can be considered as cases where the change in the biological reaction analyzed for the person to be analyzed is peculiar compared to others.
  • The first is a case where a particularly large change in the biological reaction has not occurred in others, but a relatively large change has occurred in the person to be analyzed.
  • The second is a case where a particularly large change in the biological reaction has not occurred in the person to be analyzed, but a relatively large change has occurred in others.
  • The third is a case where a relatively large change in the biological reaction has occurred in both the person to be analyzed and others, but the content of the change differs between them.
  • The related event identification unit 14 identifies an event occurring with respect to at least one of the person to be analyzed, others, and the environment when a change in the biological reaction determined to be peculiar by the peculiarity determination unit 13 occurs. For example, the related event identification unit 14 identifies, from the moving image, the words and actions of the person to be analyzed when the peculiar change in the biological reaction occurred. It also identifies, from the moving image, the words and actions of others at that time, as well as the environment at that time. The environment is, for example, the shared material displayed on the screen, the background reflected behind the person to be analyzed, and the like.
  • The clustering unit 15 analyzes the degree of correlation between the change in the biological reaction determined to be peculiar by the peculiarity determination unit 13 (for example, one or a combination of line of sight, pulse, facial movement, statement content, and voice quality) and the event identified by the related event identification unit 14 as occurring when the peculiar change occurred, and determines whether the correlation is at or above a certain level.
  • When the correlation is determined to be at or above the certain level, the clustering unit 15 clusters the person to be analyzed or the event into one of a plurality of pre-defined classifications according to the content of the event, the degree of negativity or positivity, the magnitude of the correlation, and the like.
  • The analysis result notification unit 16 notifies the designated recipient (the person to be analyzed or the organizer of the online session) of at least one of: the change in the biological reaction determined to be peculiar by the peculiarity determination unit 13, the event identified by the related event identification unit 14, and the classification clustered by the clustering unit 15.
  • For example, the analysis result notification unit 16 notifies the person to be analyzed of his or her own words and actions as the event occurring when a peculiar change in the biological reaction occurred for that person (any of the above-mentioned three patterns; the same applies hereinafter). This allows the person to be analyzed to grasp that he or she has emotions different from those of others when performing certain words or actions. At this time, the peculiar change in the biological reaction identified for the person to be analyzed may also be notified to that person, and the change in the biological reaction of the others being compared may further be notified as well.
  • The analysis result notification unit 16 also notifies the organizer of the online session of the event occurring when the person to be analyzed showed a peculiar biological reaction different from others, together with the peculiar change in the biological reaction itself. This allows the organizer of the online session to know what kind of event influences what kind of emotional change as a phenomenon peculiar to the designated person, and to take appropriate measures for that person according to what has been grasped.
  • The analysis result notification unit 16 notifies the organizer of the online session of the event occurring when the person to be analyzed showed a peculiar change in biological reaction different from others, or of the clustering result for the person to be analyzed.
  • Depending on which classification the designated person to be analyzed has been clustered into, the organizer of the online session can grasp the tendency of behavior peculiar to that person and predict behaviors or states that may occur in the future, and can then take appropriate measures for that person.
  • In the above, an example was described in which the biological reaction index value is calculated by quantifying the change in the biological reaction according to a predetermined standard, and whether the change analyzed for the person to be analyzed is peculiar compared to others is determined based on the biological reaction index values calculated for each of the plurality of persons. However, the present invention is not limited to this example; for example, the following may be used.
  • the biological reaction analysis unit 12 analyzes the movement of the line of sight for each of a plurality of people and generates a heat map showing the direction of the line of sight.
  • The peculiarity determination unit 13 compares the heat map generated for the person to be analyzed with the heat maps generated for others by the biological reaction analysis unit 12, and thereby determines whether the change in the biological reaction analyzed for the person to be analyzed is peculiar compared to the changes analyzed for the others.
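  • A minimal sketch of such a heat-map comparison, binning normalized gaze coordinates into a 2-D histogram; the grid size and the total-variation distance are illustrative assumptions:

```python
import numpy as np

def gaze_heatmap(points, grid=(8, 8)):
    """points: (n, 2) gaze positions normalized to the [0, 1] x [0, 1] screen."""
    h, _, _ = np.histogram2d(points[:, 0], points[:, 1],
                             bins=grid, range=[[0, 1], [0, 1]])
    return h / h.sum()

def heatmap_distance(a, b):
    return 0.5 * np.abs(a - b).sum()  # total-variation distance, in [0, 1]

rng = np.random.default_rng(0)
subject = gaze_heatmap(rng.random((200, 2)))
others = gaze_heatmap(rng.random((200, 2)))
print(heatmap_distance(subject, others))  # small when gaze patterns are alike
```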
  • the moving image of the video meeting is stored in the local storage of the user terminal 10, and the above-mentioned analysis is performed on the user terminal 10. Although it may depend on the machine specifications of the user terminal 10, it is possible to analyze the moving image information without providing it to the outside.
  • The biological reaction analysis unit 12 may include an emotion evaluation unit that evaluates the degree of emotion of a subject according to evaluation criteria leveled among a plurality of subjects, based on the change in the biological reaction analyzed for the subject.
  • The emotion evaluation unit calculates an emotional response absolute value based on the evaluation criteria leveled among the plurality of subjects, from the change in the biological reaction (the biological reaction index value) analyzed for the subject by the biological reaction analysis unit 12.
  • The emotional response absolute value calculated by the emotion evaluation unit is, for example, a value obtained by adjusting the biological reaction index value calculated by the biological reaction analysis unit 12 according to how readily the subject experiences the same emotion.
  • For example, the emotion evaluation unit calculates the emotional response absolute value by multiplying the biological reaction index value calculated by the biological reaction analysis unit 12 by a weight value according to the frequency with which the subject experiences the same emotion, using a function such that the weight value becomes smaller the more readily the same emotion occurs and larger the less readily it occurs.
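  • A minimal sketch of this leveling step, assuming an inverse-frequency weight (any function that shrinks for frequent emotions and grows for rare ones fits the description):

```python
def emotional_response_absolute_value(index_value, emotion_frequency,
                                      eps=1e-3):
    """emotion_frequency: fraction of past observations in which this subject
    showed the same emotion, in [0, 1] (an assumed bookkeeping scheme)."""
    weight = 1.0 / (emotion_frequency + eps)  # rarer emotion -> larger weight
    return index_value * weight

# A habitual smiler's smile counts for less than a rare smiler's:
print(emotional_response_absolute_value(0.5, 0.8))  # about 0.62
print(emotional_response_absolute_value(0.5, 0.1))  # about 4.95
```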
  • The emotion evaluation unit may also evaluate the degree of emotion based on the magnitude of the difference between the current biological reaction and the biological reaction in normal times, adjusted according to how readily the subject experiences the same emotion. In that case, the emotion evaluation unit calculates the emotional response absolute value by adjusting the biological reaction index value calculated by the biological reaction analysis unit 12 according to both the magnitude of that difference and the subject's susceptibility to the same emotion. The emotional response absolute value calculated in this way represents the degree of emotion based on the magnitude of the difference between the current biological reaction and the normal biological reaction, adjusted according to how readily or rarely the subject experiences the same emotion.
  • In the above, the frequency of experiencing the same emotion is used as the measure of susceptibility to the same emotion, but the present invention is not limited to this. For example, the nature or personality of the subject may be used in place of, or in addition to, that frequency.
  • the reaction information presentation unit 13a presents information indicating changes in the biological reaction to the leader, facilitator, or manager of the online session (hereinafter collectively referred to as the organizer).
  • Organizers of online sessions include, for example, instructors of online classes, chairs and facilitators of online conferences, and coaches of sessions for coaching purposes.
  • the organizer of an online session is usually one of a plurality of users who participate in the online session, but may be another person who does not participate in the online session.
  • the organizer of the online session can also grasp the state of the participants who are not displayed on the screen in the environment where the online session is held by multiple people.
  • a first embodiment of the present system based on the above-described configuration will be described with reference to FIGS. 6 to 8.
  • the face image of the target person included in the moving image is recognized for each predetermined frame unit, and the voice of the target person is recognized.
  • Recognition may be performed for a plurality of subjects.
  • the emotions from the plurality of viewpoints of the subject are quantified and evaluated based on both the recognized facial image and the voice.
  • the evaluated emotions are plotted in a graph along with their degree.
  • the graph is plotted along the time series of the video.
  • For example, a numerical value evaluated from one viewpoint of happiness (a "Happy Score") may be plotted for one subject in a given moving image.
  • the degree of emotion from a plurality of viewpoints may be plotted for each subject.
  • For a plurality of moving images including a certain target person (for example, moving images of a plurality of classes a user has taken online, or moving images of a plurality of online meetings in which a user has participated), the average (or highest, lowest, or mode) degree of the emotions in each moving image may be plotted, with the titles of the moving images on the horizontal axis and the degree of emotion on the vertical axis. This makes it possible to visualize how the subject's emotions have changed as he or she has participated in multiple video meetings.
  • The illustrated graph is a line graph, but it may be of any kind, such as a bar graph or a heat map. It may also be displayed in different colors for each type of emotion.
  • Each graph may, for example, plot the degree of emotion of the subject for each type of emotion according to the evaluation criteria leveled among a plurality of subjects. This makes it possible to make an objective evaluation on a common scale (for example, 0 to 100).
  • Alternatively, the degree of emotion based on the magnitude of the difference between the subject's current biological reaction and the normal biological reaction, adjusted according to how readily the same emotion occurs, may be evaluated; plotting the adjusted degree of emotion for each emotion type achieves the same effect.
  • a search word box for accepting search words is displayed on the screen according to the present embodiment.
  • When a word is input in the search word box (for example, when "base" is input), a predetermined range of the moving image containing the sound corresponding to the input search word is extracted and displayed.
  • The system according to the present embodiment includes a face recognition means for recognizing, for each predetermined frame unit, at least the face image of the target person included in the moving image; a voice recognition means for recognizing at least the voice of the target person included in the moving image; and a search receiving means for accepting the input of a search word. With such a configuration, as shown in the figure, it becomes possible to display the part of the moving image corresponding to the range in which "base" is spoken in the moving image file "20201230_Biology_Tanaka", together with text information.
  • the word "base” is extracted in three places in the illustrated screen example. Selecting a displayed search word (eg, the very first "base”) will (partially) play a moving image containing the frame when the word was spoken.
  • A digest moving image may be generated by connecting a plurality of partial moving images containing the search word. This makes it possible to efficiently review, in a short time, the portions of the moving image around the searched word.
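  • A minimal sketch of the search-and-digest step, assuming word-level timestamps from the voice recognition stage; the window margin is an illustrative assumption:

```python
def find_clips(word_timestamps, query, margin=5.0):
    """word_timestamps: list of (word, start_sec, end_sec) from recognition.
    Returns (start, end) windows around each occurrence of the query word;
    concatenating the windows yields the digest moving image."""
    return [(max(0.0, s - margin), e + margin)
            for w, s, e in word_timestamps if w == query]

stamps = [("the", 1.0, 1.2), ("base", 1.2, 1.6), ("pairs", 1.6, 2.0),
          ("base", 62.0, 62.4)]
print(find_clips(stamps, "base"))  # [(0.0, 6.6), (57.0, 67.4)]
```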
  • The registration of search words may also be accepted in advance, so that an alert is issued when a registered search word is extracted from a moving image, or a digest moving image connecting the plurality of partial moving images containing the search word is generated automatically.
  • A word such as those shown in the alert value column may be registered and associated with information indicating the location in the moving image information where the word appears.
  • a playback link to the point in the moving image may be generated.
  • Registered words can be easily managed by tagging them with some kind of tag (alert pattern) in advance.
  • ⁇ Third embodiment> A system according to a third embodiment of the present invention will be described with reference to FIG.
  • In the present embodiment, it is possible to convey to the other party that the user is within the range of the camera (that is, present in front of the computer) and what his or her facial expression is at that time, while remaining in a state in which the visual information obtained from the camera is not provided.
  • Conventionally, when the camera is off, the host cannot see the other person's face and therefore cannot confirm whether he or she is properly attending the lecture or conference. A situation can thus occur in which the host wants to know whether the guest is properly participating in front of the camera, while the guest wants to convey that he or she is participating properly but does not want to turn the camera on.
  • In the present embodiment, the guest user's terminal acquires the moving image from the guest user's camera, identifies at least the facial image contained in the moving image for each predetermined frame unit, converts the identified facial image into face information, and provides the face information to the host user's terminal.
  • Examples of the face information include, but are not limited to, whether or not the person is in front of the camera, the orientation of the face, the emotion obtained from the facial expression captured by the camera, and object information generated based on that emotion.
  • This makes it possible to provide the host with information such as that the guest is detected by the camera and is looking straight at the screen, as well as the guest's facial expression, without sharing the guest user's private information (the visual information obtained by the camera) with the host side.
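  • A minimal sketch of the face information sent in place of the video; the field names and the emotion-to-object mapping are illustrative assumptions:

```python
from dataclasses import dataclass, asdict

@dataclass
class FaceInfo:
    present: bool    # a face is detected in front of the camera
    yaw_deg: float   # face orientation (0 = facing the screen)
    emotion: str     # dominant emotion label from the identified face image
    emoticon: str    # object generated based on the emotion

def to_face_info(face_detected, yaw_deg, emotion):
    icon = {"happy": ":)", "neutral": ":|", "sad": ":("}.get(emotion, ":|")
    return FaceInfo(face_detected, yaw_deg, emotion, icon)

# This compact record, not the camera video, is provided to the host terminal:
print(asdict(to_face_info(True, 3.5, "happy")))
```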
  • The system according to this embodiment outputs text in association with an evaluation value related to emotions. For example, as shown in FIG. 12, content spoken with a voice louder than a predetermined value is displayed in a larger font size, and words spoken while looking at the camera are underlined.
  • In the present embodiment, predetermined processing can thus be applied to the text based on the user's direct words and actions obtained from the moving image (voice intonation, volume, pitch, speed, and the like) or on the result of analyzing those words and actions. Examples of processing applied to the text include changing the font size, changing the weight, changing to italics, changing the character color, adding a shadow, and changing the font type.
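  • A minimal sketch of such text decoration, emitting HTML; the loudness threshold and markup choices are illustrative assumptions:

```python
def style_word(word, loudness_db, looking_at_camera, loud_threshold_db=70.0):
    """Enlarge words spoken loudly; underline words spoken at the camera."""
    if loudness_db > loud_threshold_db:
        word = f'<span style="font-size:1.5em">{word}</span>'
    if looking_at_camera:
        word = f"<u>{word}</u>"
    return word

print(style_word("important", 75.0, True))
# <u><span style="font-size:1.5em">important</span></u>
```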
  • In the present embodiment, latent emotion is analyzed from the contradiction (dissociation of evaluation values) between the analysis and evaluation information obtained from each of a plurality of viewpoints in the moving image.
  • This system calculates evaluation values from multiple viewpoints based on both the facial image and the voice, and when the evaluation values deviate from each other by a certain amount or more, a notification means gives a notification. Examples include the case where the evaluation values obtained from the face image and from the voice deviate by a certain amount or more, the case where the evaluation value obtained from the facial movement in the face image and the evaluation value obtained from the voice deviate by a certain amount or more, and the case where the degree of emotion evaluated from the facial image and the evaluation value obtained from the voice deviate from each other by a certain amount or more.
  • The dissociation may be determined based on a predetermined correlation, or may be determined by machine learning.
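  • A minimal sketch of the fixed-margin variant of this dissociation check; the normalization and margin are illustrative assumptions:

```python
def detect_dissociation(face_score, voice_score, margin=0.4):
    """Scores are assumed normalized to [-1, 1]; a large gap between the
    face-based and voice-based evaluation values flags a latent emotion."""
    return abs(face_score - voice_score) >= margin

print(detect_dissociation(0.8, -0.1))  # True: smiling face but flat voice
```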
  • This system includes a display means that displays, in a list, the moving images obtained from the cameras of the multiple users participating in a video meeting at the same time, and an association means that displays, together with each of the multiple moving images, an object associated with its evaluation value.
  • the associating means may generate a heat map according to the evaluation value and display the corresponding color as an overlay on each of the moving images. As shown in FIG. 14, users with a high degree of anger may be grayed out.
  • The association means may also generate icons expressing emotions according to the evaluation value and display the corresponding icon together with each of the moving images.
  • In the present embodiment, the above-mentioned peculiarity determination unit 13 is provided with a notification means that issues an alert to a predetermined terminal (screen or the like) when a peculiar reaction different from the same user's previous reactions is analyzed.
  • That is, the notification means gives a notification when a reaction exceeds a threshold range. For example, a notification may be given when the anger frequency of a user who does not normally get angry becomes extremely higher than before, or when a user who does not usually laugh laughs. What kind of notification is given under what conditions can be registered in advance.
  • The notification means may give the above notification when the user's peculiar reactions exceed a predetermined number of times within an online meeting held in the same time slot (one meeting, one lesson, or the like).
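  • A minimal sketch of the alert rule, combining a deviation-from-baseline test with a per-session count; the thresholds are illustrative assumptions:

```python
def should_alert(baseline_rate, current_rate, count_in_session,
                 ratio_threshold=3.0, max_count=5):
    """baseline_rate/current_rate: how often the reaction (e.g. anger) appears
    normally vs. now; count_in_session: peculiar reactions in this meeting."""
    spikes = baseline_rate > 0 and current_rate / baseline_rate > ratio_threshold
    return spikes or count_in_session > max_count

# A normally calm user (anger in 1% of frames) who is now angry in 10%:
print(should_alert(0.01, 0.10, 2))  # True
```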
  • the series of processes by the apparatus described herein may be implemented using software, hardware, or any combination of software and hardware. It is possible to create a computer program for realizing each function of the information sharing support device 10 according to the present embodiment and implement it on a PC or the like. It is also possible to provide a computer-readable recording medium in which such a computer program is stored.
  • the recording medium is, for example, a magnetic disk, an optical disk, a magneto-optical disk, a flash memory, or the like. Further, the above computer program may be distributed, for example, via a network without using a recording medium.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
  • Image Analysis (AREA)

Abstract

[Problem] To evaluate a video meeting by evaluating video acquired during the video meeting. [Solution] According to the present disclosure, a video meeting evaluation terminal belongs to a video meeting system comprising a first user terminal, a second user terminal, and a video meeting service terminal that provides a video meeting and stores at least video captured by a first camera unit or a second camera unit. The first user terminal comprises a display unit for displaying at least video acquired from the video meeting held with the second user terminal. The second user terminal comprises an acquisition unit for acquiring the video captured by the second camera, an identification unit for identifying at least a face image included in the video for each predetermined frame unit, and a providing unit for providing, to the first user terminal, face information concerning the identified face image.
PCT/JP2020/049295 2020-12-31 2020-12-31 Video meeting evaluation terminal, video meeting evaluation system, and video meeting evaluation program WO2022145040A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2022517944A JP7465012B2 (ja) 2020-12-31 2020-12-31 Video meeting evaluation terminal, video meeting evaluation system, and video meeting evaluation program
PCT/JP2020/049295 WO2022145040A1 (fr) 2020-12-31 2020-12-31 Video meeting evaluation terminal, video meeting evaluation system, and video meeting evaluation program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/049295 WO2022145040A1 (fr) 2020-12-31 2020-12-31 Video meeting evaluation terminal, video meeting evaluation system, and video meeting evaluation program

Publications (1)

Publication Number Publication Date
WO2022145040A1 true WO2022145040A1 (fr) 2022-07-07

Family

ID=82259229

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/049295 WO2022145040A1 (fr) 2020-12-31 2020-12-31 Video meeting evaluation terminal, video meeting evaluation system, and video meeting evaluation program

Country Status (2)

Country Link
JP (1) JP7465012B2 (fr)
WO (1) WO2022145040A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005165597A (ja) * 2003-12-02 2005-06-23 Nec Corp 仮想会話システム
JP2014060491A (ja) * 2012-09-14 2014-04-03 Nippon Telegr & Teleph Corp <Ntt> 視聴状況判定装置、識別器構築装置、視聴状況判定方法、識別器構築方法およびプログラム
JP2020525946A (ja) * 2017-07-05 2020-08-27 フランシスカ ジョーンズ,マリア 仮想会議の参加者の反応を示す方法及びシステム


Also Published As

Publication number Publication date
JP7465012B2 (ja) 2024-04-10
JPWO2022145040A1 (fr) 2022-07-07

Similar Documents

Publication Publication Date Title
WO2022064621A1 (fr) Video meeting evaluation system and video meeting evaluation server
WO2022168185A1 (fr) Video session evaluation terminal, video session evaluation system, and video session evaluation program
WO2022168180A1 (fr) Video session evaluation terminal, video session evaluation system, and video session evaluation program
WO2022145040A1 (fr) Video meeting evaluation terminal, video meeting evaluation system, and video meeting evaluation program
WO2022145042A1 (fr) Evaluation terminal, evaluation system, and video meeting evaluation program
WO2022145043A1 (fr) Video meeting evaluation terminal, video meeting evaluation system, and video meeting evaluation program
WO2022145038A1 (fr) Video meeting evaluation terminal, video meeting evaluation system, and video meeting evaluation program
WO2022145044A1 (fr) Reaction notification system
WO2022024956A1 (fr) Emotion analysis system and emotion analysis device
WO2022145041A1 (fr) Video meeting evaluation terminal, video meeting evaluation system, and video meeting evaluation program
WO2022145039A1 (fr) Video meeting evaluation terminal, video meeting evaluation system, and video meeting evaluation program
WO2022137502A1 (fr) Video meeting evaluation terminal, video meeting evaluation system, and video meeting evaluation program
WO2022113248A1 (fr) Video meeting evaluation terminal and video meeting evaluation method
WO2022074785A1 (fr) Video meeting evaluation terminal, video meeting evaluation system, and video meeting evaluation program
WO2022064617A1 (fr) Video meeting evaluation system and video meeting evaluation server
WO2022064620A1 (fr) Video meeting evaluation system and video meeting evaluation server
WO2022064618A1 (fr) Video meeting evaluation system and video meeting evaluation server
WO2022064619A1 (fr) Video meeting evaluation system and server
JP7100938B1 (ja) Moving image analysis program
WO2022168182A1 (fr) Evaluation terminal, evaluation system, and video session evaluation program
WO2022168177A1 (fr) Video session evaluation terminal, video session evaluation system, and video session evaluation program
WO2023032058A1 (fr) Video session evaluation terminal, video session evaluation system, and video session evaluation program
WO2022168175A1 (fr) Video session evaluation terminal, video session evaluation system, and video session evaluation program
WO2022168174A1 (fr) Video session evaluation terminal, video session evaluation system, and video session evaluation program
WO2022168179A1 (fr) Video session evaluation terminal, video session evaluation system, and video session evaluation program

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2022517944

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20968043

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20968043

Country of ref document: EP

Kind code of ref document: A1