WO2019130818A1 - Information processing device and group reconstruction method - Google Patents
Information processing device and group reconstruction method
- Publication number
- WO2019130818A1 (PCT/JP2018/040838)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- unit
- group
- person
- time
- utterance
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/10—Office automation; Time management
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/10—Speech classification or search using distance or distortion measures between unknown speech and reference templates
Definitions
- The present invention relates to an information processing apparatus and a group reconstruction method, and more particularly to a technology for analyzing the content of a person's utterances.
- This technology stores the utterance start time and utterance duration of each member of a group, arranges each member's utterance times and durations in time series for each group, calculates the ratio of each member's utterance duration to the total utterance duration of all members of the group, and generates, for each group, a graph plotting each member's utterance density contribution rate.
- The utterance density contribution rate, however, is merely the ratio of each member's utterance duration to the total utterance duration of all members of the group to which that member belongs. That is, the above technology detects the degree of contribution of each member in a group that has already been formed; it does not reconstruct new groups from which more effective discussion can be expected based on, for example, the degree of contribution to discussions held in the past.
- The present invention has been made in view of the above circumstances, and has as its object to reconstruct new groups from which more effective discussion can be expected, based on, for example, the degree of contribution to discussions held in the past.
- An information processing apparatus according to the present invention includes: a first detection unit that detects, from voice data in which the speech of each person in a plurality of groups each consisting of a predetermined number of people is recorded, an utterance time for each utterance included in the speech; a first counting unit that counts, for each person, the utterance times detected by the first detection unit; and a reconfiguration unit that reconstructs the groups based on the utterance time of each person counted by the first counting unit.
- A group reconfiguration method according to the present invention includes: an utterance time detection step of detecting, from voice data in which the speech of each person in a plurality of predetermined groups is recorded, an utterance time for each utterance; a per-person utterance time counting step of counting, for each person, the utterance times detected in the utterance time detection step; and a group reconstruction step of reconstructing the groups based on the utterance time of each person counted in the per-person utterance time counting step.
- An information processing apparatus according to the present invention includes: a voice input unit to which an electrical signal representing voice is input; a storage unit that stores, each time an electrical signal is input to the voice input unit, voice data based on the input electrical signal for each person who produced the voice; and a control unit including a processor which, by executing a group configuration program, functions as a first detection unit that extracts the portions corresponding to utterances from the voice data and detects the duration of each utterance as an utterance time, a first counting unit that counts the at least one utterance time for each person to calculate each person's utterance time, and a configuration unit that configures the group to which each person belongs based on the utterance time of each person.
- FIG. 1 is a diagram showing an information processing apparatus according to a first embodiment of the present invention and the persons evaluated by the information processing apparatus. FIG. 2 is a block diagram showing an outline of the internal configuration of the information processing apparatus according to the first embodiment. FIG. 3 is a diagram showing an example of voice data.
- FIG. 1 is a diagram showing an information processing apparatus according to a first embodiment of the present invention, and a target person who is evaluated by the information processing apparatus.
- The information processing apparatus 1 acquires, as voice data, the voice uttered by each person belonging to a plurality of predetermined conversation groups G1 to G3. For example, persons P11, P12, and P13 belong to the group G1, persons P21, P22, and P23 belong to the group G2, and persons P31, P32, and P33 belong to the group G3 (in this embodiment, a total of nine people in three groups is described, but the invention is not limited to this), and each group is assumed to be holding a meeting, discussion, class, conference, or the like (hereinafter simply referred to as a "meeting") on a group basis.
- Each person in the conversation group speaks using the headset 2 having a microphone function. That is, each headset 2 used by each person acquires the voice of the conversation of the person wearing the headset 2, converts the voice into an electrical signal indicating the voice, and outputs the electric signal to the information processing device 1.
- The information processing apparatus 1 and each headset 2 are connected by, for example, wired communication via a cable, or wireless communication such as Bluetooth (registered trademark) or a wireless LAN.
- The information processing apparatus 1 converts the electrical signal representing the voice output from each headset 2 into voice data consisting of a digital voice signal, and stores the voice data separately for each headset 2, that is, for each of the nine persons P11 to P33.
- FIG. 2 is a block diagram showing an outline of an internal configuration of the information processing apparatus 1 according to the first embodiment.
- The information processing apparatus 1 is, for example, a computer.
- The information processing apparatus 1 includes a control unit 10, a read only memory (ROM) 112, a random access memory (RAM) 113, a hard disk drive (HDD) 114, a display unit 115, a communication interface 118, and an instruction input unit 119. These units can transmit and receive data or signals to and from one another via a central processing unit (CPU) bus.
- The control unit 10 controls the operation of the entire information processing apparatus 1.
- The ROM 112 stores an operation program for the basic operations of the information processing apparatus 1.
- The RAM 113 is used as a work area of the control unit 10.
- The HDD 114 stores, in part of its storage area, an evaluation program according to the first embodiment of the present invention. The HDD 114 also stores the above-mentioned voice data of the nine persons P11 to P33.
- The HDD 114 is an example of the storage unit in the claims. However, a non-volatile ROM included in the information processing apparatus 1 (for example, built into the control unit 10) may also function as the storage unit.
- Identification information for specifying the headset 2 is attached in advance to the headset 2 connected to the information processing apparatus 1.
- The identification information is not particularly limited as long as it can identify the headset 2; an identification number is one example.
- The HDD 114 stores the identification information of each headset 2 in advance.
- The HDD 114 also stores each piece of identification information in association with group information specifying a group, in accordance with an instruction input by the user via the instruction input unit 119.
- That is, the HDD 114 stores identification information of the headsets 2 used by the persons P11, P12, and P13 in association with group information specifying the group G1.
- The HDD 114 also stores identification information of the headsets 2 used by the persons P21, P22, and P23 in association with group information specifying the group G2.
- The HDD 114 further stores identification information of the headsets 2 used by the persons P31, P32, and P33 in association with group information specifying the group G3.
- The display unit 115 is formed of a liquid crystal display (LCD) or the like, and displays operation guidance and the like for the operator of the information processing apparatus 1.
- The communication interface 118 includes a USB interface or a wireless LAN interface.
- The communication interface 118 functions as an interface for performing data communication with each headset 2.
- The communication interface 118 is an example of the voice input unit in the claims.
- The instruction input unit 119 includes a keyboard, a mouse, and the like, through which the operator inputs operation instructions.
- The control unit 10 is composed of a processor, a RAM, a ROM, and the like.
- The processor is a CPU, a micro processing unit (MPU), an application specific integrated circuit (ASIC), or the like.
- When the processor executes the evaluation program stored in the HDD 114, the control unit 10 functions as a control unit 100, a first detection unit 101, a first counting unit 102, a first calculation unit 103, a second calculation unit 104, and the further units described below.
- The control unit 100 controls the overall operation of the information processing apparatus 1.
- The first detection unit (utterance time detection unit) 101 detects, from each of the voice data of the nine persons P11 to P33 stored in the HDD 114, the utterance time of each utterance included in the speech recorded in the voice data.
- FIG. 3 is a diagram showing an example of audio data. The vertical axis in FIG. 3 represents the amplitude of sound (in dB), and the horizontal axis represents time.
- The first detection unit 101 analyzes the voice data and extracts, as an utterance, each portion in which an amplitude equal to or greater than a predetermined amplitude (for example, 20 dB) continues for a predetermined time (for example, 0.25 seconds) or longer.
- The first detection unit 101 detects the duration of each extracted utterance as its utterance time and stores it in the HDD 114. In the voice data shown in FIG. 3, the first detection unit 101 extracts the parts a, b, and c as utterances.
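- The extraction rule above amounts to a run-length scan over an amplitude envelope. The following is a minimal sketch of that reading (Python; the frame length and the idea of a precomputed per-frame dB envelope are assumptions, as the text does not specify how the amplitude is sampled):

```python
# Minimal sketch of the utterance extraction described above: find runs
# of frames whose amplitude stays at or above a threshold for at least
# the minimum duration. Values follow the 20 dB / 0.25 s examples.

FRAME_SEC = 0.05          # assumed envelope frame length
MIN_AMPLITUDE_DB = 20.0   # predetermined amplitude
MIN_UTTERANCE_SEC = 0.25  # predetermined time

def extract_utterances(envelope_db):
    """Return (start_sec, duration_sec) for each qualifying run."""
    utterances, run_start = [], None
    # A -inf sentinel closes any run that reaches the end of the data.
    for i, amp in enumerate(list(envelope_db) + [float("-inf")]):
        if amp >= MIN_AMPLITUDE_DB:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            duration = (i - run_start) * FRAME_SEC
            if duration >= MIN_UTTERANCE_SEC:
                utterances.append((run_start * FRAME_SEC, duration))
            run_start = None
    return utterances
```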
- The first counting unit (per-person utterance time counting unit) 102 counts, for each person, the utterance times of the utterances detected by the first detection unit 101. In this case, the utterance times are totaled for each piece of identification information stored in the HDD 114.
- The first calculation unit (total utterance time calculation unit) 103 adds up the utterance times of all the utterances detected by the first detection unit 101, and calculates the total utterance time of all persons in all the groups.
- The second calculation unit (per-person ratio calculation unit) 104 calculates, as a first ratio, the ratio of each person's utterance time counted by the first counting unit 102 to the total time calculated by the first calculation unit 103.
- The first granting unit (first evaluation point granting unit) 105 sets the evaluation points higher for persons with a larger first ratio, and grants each person a first evaluation point corresponding to that person's first ratio.
- The third calculation unit (per-group utterance time calculation unit) 106 totals, for each group, the utterance times of the utterances detected by the first detection unit 101, and calculates the total utterance time of the persons belonging to each group. In this case, the utterance times are totaled for each piece of group information stored in the HDD 114.
- The fourth calculation unit (in-group ratio calculation unit) 107 calculates, as a second ratio, the in-group ratio of each person's utterance time counted by the first counting unit 102 to the total utterance time, calculated by the third calculation unit 106, of the group to which that person belongs.
- The second granting unit (second evaluation point granting unit) 108 sets the evaluation points higher for persons with a larger second ratio, and further grants each person a second evaluation point corresponding to that person's in-group ratio.
- The determination unit (backchannel/opinion determination unit) 109 determines that an utterance is a backchannel response (a brief acknowledgement; part b in the example of FIG. 3) if the utterance time detected by the first detection unit 101 falls within a predetermined first time (a time from the prescribed minimum up to a predetermined longer time; for example, from 0.25 seconds, the prescribed minimum, up to 2.0 seconds). If the utterance time detected by the first detection unit 101 is a second time exceeding the first time, the determination unit 109 determines that the utterance made in this utterance time is an opinion (parts a and c in the example of FIG. 3).
- The determination unit 109 stores in the HDD 114 the result of each determination, that is, whether the utterance is a backchannel response or an opinion, together with the utterance time of each backchannel response and opinion.
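- On this reading, the backchannel/opinion decision is a pure function of utterance duration. A minimal sketch (Python; the constant names are mine):

```python
BACKCHANNEL_MIN_SEC = 0.25  # prescribed minimum utterance time
BACKCHANNEL_MAX_SEC = 2.0   # upper end of the "first time"

def classify_utterance(duration_sec):
    """Backchannel response if within the first time, otherwise opinion."""
    if BACKCHANNEL_MIN_SEC <= duration_sec <= BACKCHANNEL_MAX_SEC:
        return "backchannel"
    return "opinion"  # a second time exceeding the first time
```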
- The storage control unit (result storage unit) 110 causes the HDD 114 to store the results of the determinations made by the determination unit 109, that is, whether each utterance was a backchannel response or an opinion, separately for each of the persons P11 to P33.
- Based on the per-person results stored by the storage control unit 110, the third granting unit (third evaluation point granting unit) 121 further grants a third evaluation point to a person who voiced an opinion when it is determined that another person made a backchannel response at the timing immediately after that opinion.
- The text conversion unit 122 has a known voice recognition function, by which it converts the content of each person's utterances included in the voice data into text.
- The addition granting unit (additional point granting unit) 123 determines, based on the text data produced by the text conversion unit 122, whether the utterances of each of the persons P11 to P33 contain a predetermined keyword, and grants an additional point to each utterance determined to contain the keyword.
- The fourth granting unit (fourth evaluation point granting unit) 124 totals, for each person, the additional points granted to that person's utterances by the addition granting unit 123, and further grants the total as a fourth evaluation point to the person for whom it was calculated.
- The fifth calculation unit (total value calculation unit) 125 uses the additional points granted by the addition granting unit 123 to calculate the total additional points of all persons belonging to the plurality of groups G1 to G3.
- The fourth granting unit 124 calculates, for each of the persons P11 to P33, the ratio (third ratio) of that person's fourth evaluation points to the total calculated by the fifth calculation unit 125, and increases the fourth evaluation points as this ratio becomes higher.
- The sixth calculation unit (per-group point calculation unit) 126 calculates, for each of the groups G1 to G3, the total of the additional points, granted by the addition granting unit 123, of the persons belonging to the group.
- The fourth granting unit 124 calculates, for each person, the ratio (fourth ratio) of that person's fourth evaluation points to the group total calculated by the sixth calculation unit 126, and increases the fourth evaluation points as this ratio becomes higher.
- FIG. 4 is a flowchart showing an evaluation process of a conference participant by the information processing apparatus 1.
- The evaluation is performed in a scene in which each person belonging to the conversation groups G1 to G3 is holding a meeting within their group.
- Each of the persons P11 to P33 wears the headset 2, and the headsets 2 are communicably connected to the information processing apparatus 1 as described above.
- The persons P11 to P33 speak during the meeting in their respective groups.
- The voice uttered by each of the persons P11 to P33 is collected by that person's headset 2 and output to the information processing apparatus 1.
- The information processing apparatus 1 acquires voice data from each headset 2 via the communication interface 118 (step S1). That is, when the communication interface 118 receives the electrical signal representing the voice output from each headset 2, the first detection unit 101 converts the received electrical signal into voice data consisting of a digital voice signal and stores it in the HDD 114. The first detection unit 101 stores the voice data in the HDD 114 for each of the persons P11 to P33, that is, in association with the identification information stored in the HDD 114.
- The first detection unit 101 extracts, as described above, the utterances included in the speech indicated by the voice data from each of the voice data stored in the HDD 114 for the persons P11 to P33 (step S2). The first detection unit 101 then detects the utterance time of each extracted utterance (step S3).
- The first counting unit 102 totals, individually for each of the persons P11 to P33, the utterance times of the utterances detected by the first detection unit 101 (step S4).
- The first calculation unit 103 adds up the utterance times of all the utterances detected for every person by the first detection unit 101, and calculates the total utterance time of all the persons (step S5).
- The second calculation unit 104 calculates, as the first ratio, the ratio of each person's utterance time counted by the first counting unit 102 to the total time calculated by the first calculation unit 103 (step S6). That is, the second calculation unit 104 individually calculates the ratio for each person through P33, such as the ratio of person P11's utterance time to the total time, the ratio of person P12's utterance time to the total time, and so on.
- The first granting unit 105 sets the evaluation points higher for persons with a larger first ratio calculated by the second calculation unit 104, and grants each of the persons P11 to P33 a first evaluation point corresponding to that person's first ratio (step S7).
- For example, the first granting unit 105 grants, as the first evaluation point, 2 points when the first ratio is less than 20%, 4 points when it is 20% or more and less than 40%, 6 points when it is 40% or more and less than 60%, 8 points when it is 60% or more and less than 80%, and 10 points when it is 80% or more and up to 100%.
- The third calculation unit 106 totals the utterance times detected by the first detection unit 101 in step S3 for each of the groups G1 to G3, that is, for each piece of group information stored in the HDD 114, and calculates the total utterance time of the persons in the group G1, the total utterance time of the persons in the group G2, and the total utterance time of the persons in the group G3 (step S8).
- The fourth calculation unit 107 calculates, as the second ratio, the in-group ratio of the utterance time of each of the persons P11 to P33 counted by the first counting unit 102 in step S4 to the per-group total utterance time calculated by the third calculation unit 106 in step S8 (step S9). That is, the fourth calculation unit 107 calculates, for each of the persons P11 to P33, the ratio of that person's utterance time to the total utterance time of the group to which the person belongs as the in-group ratio.
- The second granting unit 108 sets the evaluation points higher as the in-group ratio calculated by the fourth calculation unit 107 in step S9 becomes larger, and further grants each of the persons P11 to P33 a second evaluation point corresponding to that person's in-group ratio (step S10).
- For example, the second granting unit 108 grants, as the second evaluation point, 4 points when the second ratio is less than 20%, 8 points when it is 20% or more and less than 40%, 12 points when it is 40% or more and less than 60%, 16 points when it is 60% or more and less than 80%, and 20 points when it is 80% or more and up to 100%.
- It is preferable that the second granting unit 108 set the second evaluation point higher than the first evaluation point granted by the first granting unit 105 (in the present embodiment, the second evaluation point is twice the first evaluation point). This is because a person whose utterance time occupies a high share of the total utterance time within that person's own group is considered to contribute highly to the meeting, more so than is indicated by that person's share of the total time summed over all the persons P11 to P33.
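- The first and second evaluation points use the same five 20% brackets with different step sizes (2 and 4 points), so both can be expressed with one step function. A minimal sketch (Python; the parameterization is mine):

```python
def bracket_points(ratio, step):
    """Map a ratio in [0, 1] to bracketed points: `step` points for the
    bracket below 20%, 2 * step for 20-40%, ... up to 5 * step."""
    bracket = min(int(ratio * 100) // 20, 4)  # bracket index 0..4
    return step * (bracket + 1)

first_point = bracket_points(0.35, 2)   # first ratio 35%  -> 4 points
second_point = bracket_points(0.35, 4)  # second ratio 35% -> 8 points
```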
- In step S7, the first granting unit 105 grants each person a first evaluation point corresponding to the ratio of that person's utterance time to the total utterance time of all the persons P11 to P33 across all the groups G1 to G3, while in step S10 the second granting unit 108 grants a second evaluation point corresponding to the in-group ratio, that is, the ratio of each person's utterance time to the total utterance time of that person's group. It is therefore possible to make a comprehensive evaluation that takes into account both each person's contribution within the group and each person's overall contribution among all the persons P11 to P33 across all the groups G1 to G3.
- The control unit 100 may be configured so that information indicating the first evaluation point and the second evaluation point granted to each person can be displayed on the display unit 115 in accordance with an instruction input by the user via the instruction input unit 119.
- FIG. 5 is a flowchart showing a first modified example of the evaluation processing of the conference participant by the information processing device 1.
- In the description of the first modification, the description of processing similar to that of the first embodiment is omitted.
- Following the detection of the utterance times, the determination unit 109 further determines whether the utterance time of each utterance is within the first time or is the second time (step S11). If the utterance time is within the first time ("first time" in step S11), the determination unit 109 determines that the utterance made in this utterance time is a backchannel response (step S12). If the utterance time is the second time ("second time" in step S11), the determination unit 109 determines that the utterance made in this utterance time is an opinion (step S16).
- The storage control unit 110 stores in the HDD 114 the results of the determinations made by the determination unit 109 in steps S12 and S16, that is, whether each utterance was a backchannel response or an opinion, together with the time at which the utterance indicating the backchannel response or the opinion was made, separately for each of the persons P11 to P33 (step S13).
- Based on the per-person results stored by the storage control unit 110, the third granting unit 121 determines, for each of the groups G1 to G3, whether a backchannel response was made by another person at the timing immediately after an opinion was voiced by a person in the group (step S14). When it is determined that there is such an opinion (YES in step S14), the third granting unit 121 further grants the third evaluation point to the person who voiced the opinion (step S15). For example, the third granting unit 121 grants 10 points as the third evaluation point. When it is determined that there is no such opinion (NO in step S14), the third granting unit 121 does not grant the third evaluation point.
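- One way to implement the step S14 check is a single pass over a time-ordered event stream. A minimal sketch (Python; the event layout and the width of the "immediately after" window are assumptions the text leaves open):

```python
IMMEDIATE_SEC = 1.0  # assumed maximum gap counting as "immediately after"
THIRD_POINT = 10

def third_evaluation_points(events):
    """events: (start_sec, duration_sec, speaker, kind) tuples sorted by
    start time, kind being "opinion" or "backchannel"."""
    points = {}
    for prev, nxt in zip(events, events[1:]):
        p_start, p_dur, p_speaker, p_kind = prev
        n_start, _, n_speaker, n_kind = nxt
        gap = n_start - (p_start + p_dur)
        if (p_kind == "opinion" and n_kind == "backchannel"
                and n_speaker != p_speaker and 0 <= gap <= IMMEDIATE_SEC):
            points[p_speaker] = points.get(p_speaker, 0) + THIRD_POINT
    return points
```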
- According to the first modification, an opinion that is followed up with a backchannel response by another person immediately after it is voiced is assumed to be a good opinion that attracted the other person's interest. Since such an opinion is given a higher evaluation than other opinions that drew no backchannel response, a high evaluation can be appropriately given to an opinion that is presumed to be good.
- FIG. 6 is a flow chart showing a second modified example of the evaluation processing of the conference participant by the information processing device 1.
- In the description of the second modification, the description of the same processes as those of the first embodiment and the first modification is omitted.
- The processing of the second modification is performed after steps S1 to S10 of the first embodiment, or after steps S11 to S16 of the first modification.
- The text conversion unit 122 converts the content of each person's utterances included in the voice data into characters, producing text data (step S20).
- The addition granting unit 123 determines whether the utterances of each of the persons P11 to P33 included in the text data contain a predetermined keyword (step S21).
- When the addition granting unit 123 determines that an utterance by one of the persons P11 to P33 included in the text data contains the predetermined keyword (YES in step S21), the addition granting unit 123 grants an additional point to the utterance determined to contain the keyword (step S22). Since the text data includes a plurality of utterances by each person, the addition granting unit 123 determines for every utterance whether it contains the keyword, and grants additional points to every utterance determined to contain the keyword.
- When the addition granting unit 123 determines that the utterances of the persons P11 to P33 do not contain the predetermined keyword (NO in step S21), the addition granting unit 123 does not grant additional points.
- The fourth granting unit 124 totals, for each of the persons P11 to P33, the additional points granted to the utterances by the addition granting unit 123 in step S22 (step S23), and further grants the total as a fourth evaluation point to each person for whom it was calculated (step S24).
- For example, the addition granting unit 123 grants the additional point (for example, 1 point) to an utterance each time the keyword appears in the utterance, and the points are totaled for each of the persons P11 to P33. Each person's total is then granted as the fourth evaluation point to the person who made the utterances containing the keyword.
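- A minimal sketch of steps S21 to S24 (Python; the keyword list and transcript structure are illustrative assumptions, and simple substring counting stands in for proper token matching):

```python
KEYWORDS = ["cost", "schedule"]  # assumed predetermined keywords
POINT_PER_OCCURRENCE = 1         # 1 point per keyword appearance

def fourth_evaluation_points(transcripts):
    """transcripts: {person: [utterance_text, ...]}. Returns each
    person's total additional points (the fourth evaluation point)."""
    return {
        person: POINT_PER_OCCURRENCE * sum(
            text.count(kw) for text in utterances for kw in KEYWORDS)
        for person, utterances in transcripts.items()
    }
```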
- According to the second modification, the fourth evaluation point is added to the above first to third evaluation points for a person who uttered a keyword that the organizer of the meeting or the like assumed to indicate a high degree of contribution. Based on the total of these points, it is therefore possible to accurately determine which of the persons P11 to P33 are truly contributing to the meeting.
- FIG. 7 is a flow chart showing a third modification of the evaluation processing of the conference participant by the information processing apparatus 1.
- In the description of the third modification, the description of the same processes as those of the first embodiment, the first modification, and the second modification is omitted.
- After the fourth granting unit 124 totals, for each of the persons P11 to P33, the additional points granted to the utterances by the addition granting unit 123 in step S22, the fifth calculation unit 125 further uses the per-person additional points granted by the addition granting unit 123 to calculate the total additional points of all persons belonging to the plurality of groups (step S31).
- The fourth granting unit 124 calculates the ratio of each person's total additional points to the total calculated by the fifth calculation unit 125 (step S32), sets the fourth evaluation point higher as the calculated ratio becomes higher, and grants it to the person concerned (step S33).
- For example, the fourth granting unit 124 adds 20% of the person's total additional points when the ratio is less than 20%, 40% of the total when it is 20% or more and less than 40%, 60% of the total when it is 40% or more and less than 60%, 80% of the total when it is 60% or more and less than 80%, and 100% of the total when it is 80% or more and up to 100%, and grants the resulting value as the fourth evaluation point.
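- On the reading above, a bracketed percentage of a person's own total is added back onto that total (this reading of the somewhat ambiguous passage is an assumption); the third and fourth modifications then differ only in the percentage table. A minimal sketch (Python):

```python
def boosted_fourth_point(person_total, overall_total, boost_percents):
    """Add a bracketed share of person_total to itself; the bracket is
    the person's share of overall_total. boost_percents lists the five
    per-bracket percentages: [20, 40, 60, 80, 100] against the total of
    all groups (third modification), [10, 20, 30, 40, 50] against the
    group total (fourth modification)."""
    ratio = person_total / overall_total if overall_total else 0.0
    bracket = min(int(ratio * 100) // 20, 4)
    return person_total + person_total * boost_percents[bracket] / 100
```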
- According to the third modification, a person who uttered the keyword can be evaluated with an added objective element that takes into account how actively the persons in the groups spoke and the degree to which that person uttered the keyword.
- FIG. 8 is a flow chart showing a fourth modification of the evaluation processing of the conference participant by the information processing apparatus 1.
- The description of the same processes as those of the first embodiment and the first to third modifications is omitted.
- The processing of the fourth modification is performed after the third modification.
- The sixth calculation unit 126 calculates a group total for each of the groups G1 to G3 by totaling the additional points of the persons belonging to the group (step S34).
- The fourth granting unit 124 calculates the ratio of each person's total additional points to the group total calculated by the sixth calculation unit 126 (step S35), sets the fourth evaluation point higher for persons with a higher calculated ratio, and further grants it to the person concerned (step S36).
- For example, the fourth granting unit 124 adds 10% of the person's total additional points when the ratio is less than 20%, 20% of the total when it is 20% or more and less than 40%, 30% of the total when it is 40% or more and less than 60%, 40% of the total when it is 60% or more and less than 80%, and 50% of the total when it is 80% or more and up to 100%, and grants the resulting value as the fourth evaluation point.
- It is preferable that the fourth granting unit 124 set the fourth evaluation point based on the ratio (third ratio) to the total of all the groups higher than the fourth evaluation point based on the ratio (fourth ratio) to the group total.
- This is because a person whose keyword occurrences account for a high share of the occurrences of all the persons P11 to P33 across all the groups is considered to contribute more highly than is indicated by that person's share of the occurrences within the individual group.
- According to the fourth modification, a person who uttered the keyword can be evaluated more objectively, taking into account how the keyword was uttered by all the members of all the groups.
- The information processing apparatus 1A according to the second embodiment likewise acquires, as voice data, the voice uttered by the persons belonging to the conversation groups G1 to G3.
- FIG. 9 is a block diagram showing an outline of an internal configuration of the information processing apparatus 1A according to the second embodiment. Descriptions of processes similar to those of the information processing apparatus 1 according to the first embodiment will be omitted.
- When the above processor executes the group reconfiguration program stored in the HDD 114, the control unit 10 of the information processing apparatus 1A functions as a control unit 100, a first detection unit 101, a first counting unit 102, a reconfiguration unit 130, a text conversion unit 122, a second detection unit 131, and a second counting unit 132.
- The control unit 100, the first detection unit 101, the first counting unit 102, the reconfiguration unit 130, the text conversion unit 122, the second detection unit 131, and the second counting unit 132 may instead be configured as hardware circuits.
- The reconfiguration unit 130 reconstructs the membership of each group based on the utterance time of each person counted by the first counting unit 102. That is, the reconfiguration unit 130 reconstructs the members of the groups G1 to G3 based on the utterance times of the persons P11 to P33 counted by the first counting unit 102.
- The text conversion unit 122 converts the content of each person's utterances included in the voice data into text.
- The second detection unit (keyword detection unit) 131 determines, based on the text data produced by the text conversion unit 122, whether the utterances of each person contain a predetermined keyword.
- The second counting unit (per-group counting unit) 132 totals the utterance times of the persons P11 to P33 counted by the first counting unit 102 for each group reconstructed by the reconfiguration unit 130.
- FIG. 10 is a flowchart showing group reconfiguration processing by the information processing apparatus 1A.
- The scene in which this processing is performed is the same as in the case of the information processing apparatus 1 according to the first embodiment: as described with reference to FIG. 1, each person belonging to the conversation groups G1 to G3 is holding a meeting within their group.
- Each of the persons P11 to P33 wears the headset 2, and each headset 2 is communicably connected to the information processing apparatus 1A as described above.
- The persons P11 to P33 speak during the meeting in their respective groups.
- The voice uttered by each of the persons P11 to P33 is collected by that person's headset 2 and output to the information processing apparatus 1A.
- The information processing apparatus 1A acquires voice data from each headset 2 via the communication interface 118 (step S101).
- The first detection unit 101 stores the voice data in the HDD 114 for each of the persons P11 to P33, that is, in association with the identification information stored in the HDD 114.
- The first detection unit 101 extracts, as described above, each utterance included in the speech indicated by the voice data from each of the voice data stored in the HDD 114 for the persons P11 to P33 (step S102). The first detection unit 101 then detects the utterance time of each extracted utterance (step S103).
- The first counting unit 102 totals, individually for each of the persons P11 to P33, the utterance times of the utterances detected by the first detection unit 101 (step S104).
- The reconfiguration unit 130 ranks the persons in descending order of the utterance time counted by the first counting unit 102 (step S105). Then, the reconfiguration unit 130 groups the persons into groups of a predetermined size from the top of the ranking, thereby reconstructing the groups (step S106).
- For example, suppose the reconfiguration unit 130 ranks the utterance times of the persons P11 to P33, from the top, as P31, P21, P11, P22, P23, P12, P13, P33, and P32.
- In this case, with the predetermined group size set to, for example, three, the reconfiguration unit 130 reconstructs the groups so that the members of the group G1 are P31, P21, and P11, the members of the group G2 are P22, P23, and P12, and the members of the group G3 are P13, P33, and P32.
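- Steps S105 and S106 amount to a descending sort followed by slicing. A minimal sketch (Python; the utterance times below are invented to reproduce the ranking in the example):

```python
GROUP_SIZE = 3  # predetermined number of people per group

def reconstruct_by_rank(utterance_time):
    """utterance_time: {person: seconds}. Returns groups of GROUP_SIZE
    sliced from the descending ranking."""
    ranked = sorted(utterance_time, key=utterance_time.get, reverse=True)
    return [ranked[i:i + GROUP_SIZE]
            for i in range(0, len(ranked), GROUP_SIZE)]

times = {"P31": 900, "P21": 850, "P11": 800, "P22": 700, "P23": 650,
         "P12": 600, "P13": 500, "P33": 450, "P32": 400}
print(reconstruct_by_rank(times))
# [['P31', 'P21', 'P11'], ['P22', 'P23', 'P12'], ['P13', 'P33', 'P32']]
```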
- The control unit 100 updates the identification information and the group information stored in association with each other in the HDD 114 so that the result of the reconfiguration by the reconfiguration unit 130 is reflected.
- That is, the HDD 114 stores identification information of the headsets 2 used by P31, P21, and P11 in association with group information specifying the group G1.
- The HDD 114 also stores identification information of the headsets 2 used by P22, P23, and P12 in association with group information specifying the group G2.
- The HDD 114 further stores identification information of the headsets 2 used by P13, P33, and P32 in association with group information specifying the group G3.
- The second counting unit 132 totals the utterance time of each person counted by the first counting unit 102 for each of the reconstructed groups G1 to G3 (step S107).
- Based on the identification information and the group information stored in the HDD 114 and the counting results from the second counting unit 132, the control unit 100 displays on the display unit 115 the reconstructed groups G1 to G3, the members of each group, and the utterance time totaled in step S107 for each of the groups G1 to G3 (step S108).
- Alternatively, the following processing may be performed in step S106. Grouping is performed for each predetermined number of people from the top of the ranking to reconstruct the groups, but the groups are reconstructed so that persons whose utterance time exceeds a predetermined time set in advance are not placed in the same group. That is, when the reconfiguration unit 130 groups the persons into groups of the predetermined size from the top of the ranking and a plurality of persons whose utterance time exceeds the specified time would end up in the same group, the person concerned is exchanged with a person who belongs to a group made up of lower-ranked persons and whose utterance time does not exceed the specified time. This prevents persons with long utterance times from concentrating in the same group, which can increase the likelihood of a meaningful meeting.
- FIG. 11 is a flowchart showing a first modified example of the group reconfiguration processing by the information processing apparatus 1A. The description of the same processing as that of the second embodiment will be omitted.
- The reconfiguration unit 130 combines the persons P11 to P33 according to a predetermined grouping rule and creates every group assignment that can be created (step S205).
- For example, the predetermined grouping rule is to create three groups of three members each.
- In this case, the reconfiguration unit 130 divides the persons P11 to P33 into the three groups G1 to G3, three people at a time, covers all the possible ways of distributing them, and creates the groups G1 to G3 for every possible distribution.
- The reconfiguration unit 130 totals, for every group assignment created as described above, the utterance times of the members belonging to each group (step S206).
- The reconfiguration unit 130 calculates the difference in total group utterance time between the groups, and extracts the group assignment G1 to G3 for which the difference is smallest (step S207).
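- A minimal sketch of steps S205 to S207 (Python; "the difference is the smallest" is read here as the spread between the largest and smallest group totals, which is an assumption):

```python
from itertools import combinations

def best_split(utterance_time):
    """Enumerate every split of nine people into three groups of three
    and return the split whose group utterance-time totals are closest."""
    people = sorted(utterance_time)
    best, best_spread = None, float("inf")
    for g1 in combinations(people, 3):
        rest = [p for p in people if p not in g1]
        for g2 in combinations(rest, 3):
            g3 = tuple(p for p in rest if p not in g2)
            totals = [sum(utterance_time[p] for p in g)
                      for g in (g1, g2, g3)]
            spread = max(totals) - min(totals)
            if spread < best_spread:
                best, best_spread = (g1, g2, g3), spread
    return best
```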
- The control unit 100 causes the display unit 115 to display the extracted groups G1 to G3 and the totaled utterance time of each of the extracted groups G1 to G3 (step S208).
- According to the first modification, the total utterance times of the groups are adjusted to be equal according to each person's utterance duration (contribution) in meetings held in the past, so there is no bias in the group composition, such as one group consisting only of people who speak a great deal and another consisting only of people who speak little.
- FIG. 12 is a flow chart showing a second modification of the group reconstruction processing by the information processing apparatus 1A. The description of the same processing as that of the second embodiment will be omitted.
- After the reconfiguration unit 130 ranks the persons in descending order of the utterance time counted by the first counting unit 102 (step S305), the reconfiguration unit 130 reconstructs groups of the predetermined size so as to satisfy the condition that persons from the top of the ranking down to a predetermined rank and persons from the bottom of the ranking up to a predetermined rank are placed in the same group (step S306).
- For example, suppose the reconfiguration unit 130 ranks the utterance times of the persons P11 to P33, from the top, as P31, P21, P11, P22, P23, P12, P13, P33, and P32.
- The predetermined group size is, for example, three.
- The reconfiguration unit 130 takes only the single highest-ranked person as the persons "from the top of the ranking down to the predetermined rank", takes the two lowest-ranked persons as the persons "from the bottom of the ranking up to the predetermined rank", and reconstructs each group accordingly.
- In this case, the reconfiguration unit 130 first makes the highest-ranked P31 and the two lowest-ranked P33 and P32 the members of the group G1.
- Excluding these three, the reconfiguration unit 130 makes the now highest-ranked P21 and the two now lowest-ranked P12 and P13 the members of the group G2. Similarly, excluding those three persons (P21, P12, and P13), the reconfiguration unit 130 makes the highest-ranked P11 and the two lowest-ranked P22 and P23 the members of the group G3.
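- Step S306 can be implemented as iterated head-and-tail slicing of the ranking. A minimal sketch (Python; assumes the head count divides evenly into groups of n_top + n_bottom, as in the nine-person example):

```python
def reconstruct_top_bottom(utterance_time, n_top=1, n_bottom=2):
    """Repeatedly group the current n_top top-ranked people with the
    current n_bottom bottom-ranked people."""
    ranked = sorted(utterance_time, key=utterance_time.get, reverse=True)
    groups = []
    while ranked:
        groups.append(ranked[:n_top] + ranked[-n_bottom:])
        ranked = ranked[n_top:len(ranked) - n_bottom]
    return groups
```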
- The second counting unit 132 totals the utterance time of each person counted by the first counting unit 102 for each of the reconstructed groups G1 to G3 (step S307).
- The control unit 100 causes the display unit 115 to display the reconstructed groups G1 to G3 and the total utterance time of each of the groups G1 to G3 (step S308).
- According to the second modification, each group can be composed by combining people who speak relatively actively with people who speak relatively little, according to each person's utterance duration (contribution) in meetings held in the past. This prevents bias in the group composition, such as one group consisting only of people who speak a great deal and another consisting only of people who speak little.
- FIG. 13 is a flowchart showing a third modification of the group reconfiguration process by the information processing apparatus 1A. The description of the same processing as that of the second embodiment will be omitted.
- The third modification is processing executed after group reconfiguration has been performed in the second embodiment or in the first or second modification of the second embodiment.
- The text conversion unit 122 converts the content of the persons' utterances included in the voice data into characters, producing text data (step S401).
- The second detection unit 131 identifies, based on the text data, the utterances containing a predetermined keyword and the persons who made those utterances (step S402).
- The reconfiguration unit 130 determines whether a plurality of the persons who made utterances containing the keyword are members of the same group in the reconstructed groups G1 to G3 (step S403).
- When the reconfiguration unit 130 determines that no plurality of persons who made utterances containing the keyword are members of the same group (NO in step S403), the processing ends with the groups G1 to G3 as configured at this point maintained.
- When a plurality of persons who made utterances containing the keyword are members of the same group (YES in step S403), the reconfiguration unit 130 determines whether any of the groups G1 to G3 consists only of persons who made no utterance containing the keyword (step S404).
- When the reconfiguration unit 130 determines that there is a group consisting only of persons who made no utterance containing the keyword (YES in step S404), it exchanges one of the persons who are members of the same group and made utterances containing the keyword with one person from the group consisting only of persons who made no such utterance, thereby reconstructing the groups (step S405).
- Otherwise (NO in step S404), the reconfiguration unit 130 ends the processing without performing a rearrangement, maintaining the groups G1 to G3 as configured at this point.
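- A minimal sketch of the steps S403 to S405 exchange (Python; groups are lists of person IDs, and since the text does not say which members to swap, the first candidates found are exchanged):

```python
def disperse_keyword_speakers(groups, keyword_speakers):
    """If two or more keyword-speakers share a group while some group
    has none, swap one keyword-speaker into the keyword-free group."""
    for g in groups:
        in_g = [p for p in g if p in keyword_speakers]
        if len(in_g) < 2:
            continue
        for other in groups:
            if other is not g and not any(
                    p in keyword_speakers for p in other):
                mover, swapped = in_g[0], other[0]
                g[g.index(mover)], other[0] = swapped, mover
                return groups  # one exchange, then stop (step S405)
    return groups  # nothing to do: groups kept as configured
```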
- According to the third modification, it is possible to avoid a grouping in which persons whose utterances have similar content belong to the same group, and to realize a grouping in which utterances of unbiased content are made.
- FIG. 14 is a block diagram showing an outline of an internal configuration of the information processing apparatus 1B according to the third embodiment. Descriptions of processes similar to those of the information processing apparatus 1 according to the first embodiment or the information processing apparatus 1A according to the second embodiment will be omitted.
- When the processor executes the evaluation program stored in the HDD 114, the control unit 10 functions as a control unit 100, a third detection unit 135, a creation unit 136, an evaluation unit 137, a text conversion unit 122, and a second detection unit 131.
- The control unit 100, the third detection unit 135, the creation unit 136, the evaluation unit 137, the text conversion unit 122, and the second detection unit 131 may instead be configured as hardware circuits.
- The third detection unit (utterance detection unit) 135 detects, from each of the voice data of the nine persons P11 to P33 stored in the HDD 114, each utterance included in the speech, together with the person who made the utterance and the time at which it was made.
- The creation unit (utterance distribution creation unit) 136 creates an utterance distribution map showing the change in the amount of utterances over time, using each utterance detected by the third detection unit 135 and its time.
- The evaluation unit 137 performs processing to determine that the evaluation level of the person who made the first utterance in a time zone in which the amount of utterances increases in the utterance distribution map created by the creation unit 136 is a predetermined first high level.
- The text conversion unit 122 converts the content of each person's utterances included in the voice data into text.
- The second detection unit 131 determines, based on the text data produced by the text conversion unit 122, whether the utterances of each person contain a predetermined keyword.
- FIG. 15 is a flowchart showing evaluation processing by the information processing apparatus 1B.
- The scene in which the evaluation processing is performed is the same as in the case of the information processing apparatus 1 according to the first embodiment: as described with reference to FIG. 1, each person belonging to the conversation groups G1 to G3 is holding a meeting within their group.
- The information processing apparatus 1B acquires voice data from each headset 2 via the communication interface 118 (step S501).
- The first detection unit 101 stores the voice data in the HDD 114 for each of the persons P11 to P33, that is, in association with the identification information stored in the HDD 114.
- The third detection unit 135 extracts, as described above, from the voice data stored in the HDD 114 for each of the persons P11 to P33, each utterance included in the speech indicated by the voice data, together with the time of each utterance (step S502).
- The time is the elapsed time from the start of the voice data.
- The creation unit 136 creates an utterance distribution map showing the change in the amount of utterances over time, using the utterances and the time of each utterance detected by the third detection unit 135 (step S503). For example, as shown in FIG. 16, a line graph indicating the number of utterances per minute over the elapsed time from the start of the meeting is created for each of the groups G1 to G3.
- The evaluation unit 137 extracts a time zone in which the amount of utterances shows an increasing tendency in the utterance distribution map created by the creation unit 136 (step S504). For example, the evaluation unit 137 extracts a time zone in which the slope of the line in the utterance distribution map is equal to or greater than a predetermined angle, that is, in which the rate of increase (differential value) within a predetermined time (for example, one minute) is equal to or greater than a predetermined value (for example, a 50% rate of increase in the number of utterances). In the case of the utterance distribution map shown in FIG. 16, the evaluation unit 137 extracts the relevant time zone because the slope (rate of increase, differential value) from point P1 to point P2 one minute later is equal to or greater than the predetermined value of 50%.
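- A minimal sketch of steps S503 and S504 with one-minute bins (Python; binning by utterance start time is an assumption):

```python
SURGE_RATE = 0.5  # predetermined value: a 50% increase per minute

def surge_minutes(utterance_times_sec):
    """utterance_times_sec: start times (seconds from the start of the
    meeting) of one group's utterances. Returns the minutes whose
    utterance count rose by at least SURGE_RATE over the minute before."""
    if not utterance_times_sec:
        return []
    counts = [0] * (int(max(utterance_times_sec) // 60) + 1)
    for t in utterance_times_sec:
        counts[int(t // 60)] += 1
    return [m for m in range(1, len(counts))
            if counts[m - 1] > 0
            and (counts[m] - counts[m - 1]) / counts[m - 1] >= SURGE_RATE]
```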
- In the extracted time zone, the evaluation unit 137 identifies the first utterance and the person who made it (step S505). For example, the evaluation unit 137 identifies the first utterance and its speaker after the time of point P1 (when 11 minutes have elapsed from the start).
- The evaluation unit 137 determines that the evaluation level of the speaker of this first utterance is the predetermined first high level (step S506).
- The first high level is a high evaluation level given to a person regarded as having contributed to the meeting, or something similar, by making an utterance that caused the other utterances to increase.
- By judging the person whose own utterance caused the other utterances to increase to be at the first high level and evaluating that person accordingly, the influence that a member's utterance had on the other utterances can be credited to that member.
- FIG. 17 is a flowchart showing a first modification of the evaluation process by the information processing apparatus 1B. The same processes as in the third embodiment will not be described.
- The text conversion unit 122 converts the content of each person's utterances included in the voice data into characters, producing text data (step S607).
- The evaluation unit 137 extracts, from the text data produced by the text conversion unit 122, each word included in the first utterance made in the extracted time zone in which the amount of utterances shows an increasing tendency (step S608).
- The control unit 100 causes the display unit 115 to display the words extracted by the evaluation unit 137 (step S609).
- FIG. 18 is a flowchart showing a second modified example of the evaluation process by the information processing apparatus 1B. The same processes as those of the third embodiment or the first modification of the third embodiment will not be described.
- The evaluation unit 137 determines whether each word included in the first utterance is included in the other utterances made after the first utterance during the extracted time zone with an increased number of appearances (step S710).
- That is, the evaluation unit 137 determines whether each word included in the first utterance appears in the text data of all the other utterances made after the first utterance during the extracted time zone. When a word appears, the evaluation unit 137 counts the number of its appearances and determines whether that number is greater than its number of appearances in the first utterance.
- When the evaluation unit 137 determines that a word included in the first utterance is included in the other utterances made after it during the extracted time zone and that its number of appearances has increased (YES in step S710), the evaluation unit 137 determines that the evaluation level of the person who made the first utterance is a second high level, higher than the first high level (step S711). For example, if the number of appearances of even one of the words increases, the evaluation unit 137 determines in step S710 that the number of appearances has "increased". The second high level is a high evaluation level given to a person regarded as having contributed to the meeting, or something similar, by uttering a word whose number of appearances then increased in the meeting.
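- A minimal sketch of the step S710 check (Python; whitespace tokenization is an assumption, since the text does not say how words are delimited):

```python
from collections import Counter

def word_usage_increased(first_utterance, later_utterances):
    """True if any word of the trigger utterance appears more often in
    the later utterances of the surge window than it did in the trigger
    utterance itself (the step S710 criterion)."""
    first_counts = Counter(first_utterance.lower().split())
    later_counts = Counter(
        w for u in later_utterances for w in u.lower().split())
    return any(later_counts[w] > c for w, c in first_counts.items())
```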
- According to this modification, a person whose words are later frequently used by others in the discussion, that is, a person who supplied the words that triggered the activation of the meeting, can be accurately found and evaluated.
- The control unit 100 may be configured so that the evaluation result by the evaluation unit 137 can be displayed on the display unit 115 in accordance with an instruction input by the user via the instruction input unit 119.
- The evaluation result is not particularly limited as long as it is information indicating the outcome of the evaluation; for example, it may be information indicating whether at least one of the first high level and the second high level has been given to each person.
- In the embodiments above, the person who made each utterance was identified based on the identification information attached to the headset 2, but the present invention is not limited to this. For example, a general speaker identification technique may be used to identify the person who made an utterance.
Abstract
An information processing device (1A) comprising: a first detection part (101) that detects, from voice data in which the speech of each person in a plurality of groups each consisting of a predetermined plural number of people is recorded, an utterance duration for each utterance included in the speech; a first totaling part (102) that totals, for each person, the utterance durations of the utterances detected by the first detection part (101); and a reconstruction part (130) that reconstructs the groups on the basis of the utterance durations totaled for each person by the first totaling part (102).
Description
The present invention relates to an information processing apparatus and a group reconstruction method, and more particularly to a technique for analyzing the content of utterances made by a person.
A technique has been proposed for estimating how active the discussion in each group is and presenting that state explicitly (see Patent Document 1 below). This technique stores the utterance time and utterance duration of each member of a group, divides them into time series for each group, calculates the ratio of each member's utterance duration to the utterance duration of all members of the group, and generates, for each group, a graph plotting each member's speech-density contribution rate.
The above technique provides each member's speech-density contribution rate for judging whether the group's discussion is active, but this contribution rate is merely the ratio of each member's utterance duration to the utterance duration of all members of the group. That is, the technique detects the degree of contribution of each member within a group that has already been formed; it does not reconstruct new groups, based on the contributions to discussions held in the past, from which more effective discussions can be expected.
The present invention has been made in view of the above circumstances, and its object is to reconstruct new groups from which more effective discussions can be expected, based on factors such as the degree of contribution to discussions held in the past.
An information processing apparatus according to one aspect of the present invention comprises: a first detection unit that detects, from voice data in which the speech of each person in a plurality of groups each consisting of a predetermined plural number of people is recorded, the utterance time of each utterance contained in that speech; a first aggregation unit that totals, for each person, the utterance times of the utterances detected by the first detection unit; and a reconstruction unit that reconstructs the groups based on the utterance times totaled for each person by the first aggregation unit.
A group reconstruction method according to another aspect of the present invention comprises: an utterance time detection step of detecting, from voice data in which the speech of each person in a plurality of groups each consisting of a predetermined plural number of people is recorded, the utterance time of each utterance contained in that speech; a per-person utterance time aggregation step of totaling, for each person, the utterance times of the utterances detected in the utterance time detection step; and a group reconstruction step of reconstructing the groups based on the utterance time of each person totaled in the per-person utterance time aggregation step.
An information processing apparatus according to still another aspect of the present invention comprises: a voice input unit to which an electrical signal representing voice is input; a storage unit that stores, for each person who produced the voice, voice data based on the electrical signal each time an electrical signal is input to the voice input unit; and a control unit including a processor which, by executing a group configuration program, functions as a first detection unit that extracts the portions corresponding to utterances from the voice data and detects the time during which each utterance continues as its utterance time, a first aggregation unit that totals at least one utterance time for each person to calculate each person's utterance time, and a configuration unit that configures the group to which each person belongs based on each person's utterance time.
According to the present invention, when a person speaks, the analysis can extend to the type of the person's utterance, and the result can be provided.
Hereinafter, an information processing apparatus, an evaluation method, and a group reconstruction method according to embodiments of the present invention will be described with reference to the drawings. FIG. 1 is a diagram showing an information processing apparatus according to a first embodiment of the present invention and the persons evaluated by that apparatus.
The information processing apparatus 1 acquires, as voice data, the voices uttered by the persons belonging to a plurality of conversation groups G1 to G3, each consisting of a predetermined plural number of people. For example, persons P11, P12, and P13 belonging to group G1, persons P21, P22, and P23 belonging to group G2, and persons P31, P32, and P33 belonging to group G3 (described in this embodiment as a total of nine people in three groups, although the invention is not limited to this) hold a meeting, discussion, class, conference, or the like (hereinafter collectively referred to simply as a "conference") on a group basis.
Each person in these conversation groups speaks while using a headset 2 equipped with a microphone function. That is, each headset 2 picks up the voice of the conversation of the person wearing it, converts that voice into an electrical signal, and outputs the signal to the information processing apparatus 1. The information processing apparatus 1 and each headset 2 are connected by, for example, wired communication via a cable, or wireless communication such as Bluetooth (registered trademark) or a wireless LAN. The information processing apparatus 1 converts the electrical signal representing the voice output from each headset 2 into voice data consisting of a digital audio signal, and accumulates the voice data separately for each headset 2, that is, for each of the nine persons P11 to P33.
Next, the configuration of the information processing apparatus 1 according to the first embodiment will be described. FIG. 2 is a block diagram showing an outline of the internal configuration of the information processing apparatus 1 according to the first embodiment.
The information processing apparatus 1 is, for example, a computer. The information processing apparatus 1 includes a control unit 10, a ROM (Read Only Memory) 112, a RAM (Random Access Memory) 113, an HDD (Hard Disk Drive) 114, a display unit 115, a communication interface 118, and an instruction input unit 119. These units can exchange data or signals with one another via a CPU (Central Processing Unit) bus.
The control unit 10 governs the overall operation of the information processing apparatus 1. The ROM 112 stores an operation program for the basic operation of the information processing apparatus 1. The RAM 113 is used as the working area of the control unit 10 and the like.
The HDD 114 stores, in part of its storage area, the evaluation program according to the first embodiment of the present invention. The HDD 114 also stores the voice data of each of the nine persons P11 to P33. The HDD 114 is an example of the storage unit in the claims. However, a non-volatile ROM provided in the information processing apparatus 1 (for example, built into the control unit 10) may function as the storage unit instead.
Each headset 2 connected to the information processing apparatus 1 is assigned in advance identification information for identifying that headset 2. The identification information is not particularly limited as long as it can identify the headset 2; an example is an identification number. The HDD 114 stores the identification information of each headset 2 in advance. The HDD 114 also stores each piece of identification information in association with group information for identifying a group, in accordance with instructions input by the user via the instruction input unit 119.
In the present embodiment, the HDD 114 stores the identification information of the headsets 2 used by persons P11, P12, and P13 in association with the group information identifying group G1. The HDD 114 also stores the identification information of the headsets 2 used by persons P21, P22, and P23 in association with the group information identifying group G2. The HDD 114 further stores the identification information of the headsets 2 used by persons P31, P32, and P33 in association with the group information identifying group G3.
The display unit 115 is formed of an LCD (Liquid Crystal Display) or the like, and displays operation guidance and the like for the operator of the information processing apparatus 1.
The communication interface 118 has a USB interface, a wireless LAN interface, or the like. The communication interface 118 functions as an interface for data communication with each headset 2. The communication interface 118 is an example of the voice input unit in the claims.
The instruction input unit 119 includes a keyboard, a mouse, and the like, through which the operator inputs operation instructions.
The control unit 10 is composed of a processor, a RAM, a ROM, and the like. The processor is a CPU, an MPU (Micro Processing Unit), an ASIC (Application Specific Integrated Circuit), or the like. When the evaluation program stored in the HDD 114 is executed by the processor, the control unit 10 functions as a control unit 100, a first detection unit 101, a first aggregation unit 102, a first calculation unit 103, a second calculation unit 104, a first granting unit 105, a third calculation unit 106, a fourth calculation unit 107, a second granting unit 108, a determination unit 109, a storage control unit 110, a third granting unit 121, a text conversion unit 122, an addition granting unit 123, a fourth granting unit 124, a fifth calculation unit 125, and a sixth calculation unit 126. Each of these units may instead be configured as a hardware circuit.
The control unit 100 has the function of governing the overall operation of the information processing apparatus 1.
The first detection unit (utterance time detection unit) 101 detects, from each of the voice data of the nine persons P11 to P33 stored in the HDD 114, the utterance time of each utterance contained in the speech recorded in that voice data. FIG. 3 is a diagram showing an example of voice data. The vertical axis in FIG. 3 represents the amplitude of the sound (in dB), and the horizontal axis represents time. The first detection unit 101 analyzes the voice data and extracts, as an utterance, each portion in which an amplitude of at least a predetermined magnitude (for example, 20 dB) continues for at least a predetermined prescribed time (for example, 0.25 seconds). The first detection unit 101 detects the time during which each extracted utterance continues as its utterance time and stores it in the HDD 114. In the voice data shown in FIG. 3, the first detection unit 101 extracts portions a, b, and c as utterances.
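As a rough illustration of this segmentation, the following minimal sketch assumes the voice data is available as a list of per-sample amplitudes in dB; the function name, parameter names, and sampling layout are illustrative rather than taken from the disclosure:

    def extract_utterances(amplitudes_db, sample_rate_hz, min_db=20.0, min_sec=0.25):
        """Return (start_sec, duration_sec) for each run of samples whose
        amplitude stays at or above min_db for at least min_sec."""
        min_samples = int(min_sec * sample_rate_hz)
        utterances, run_start = [], None
        for i, amp in enumerate(amplitudes_db):
            if amp >= min_db:
                if run_start is None:
                    run_start = i
            elif run_start is not None:
                if i - run_start >= min_samples:
                    utterances.append((run_start / sample_rate_hz,
                                       (i - run_start) / sample_rate_hz))
                run_start = None
        # Close a run that continues to the end of the data.
        if run_start is not None and len(amplitudes_db) - run_start >= min_samples:
            utterances.append((run_start / sample_rate_hz,
                               (len(amplitudes_db) - run_start) / sample_rate_hz))
        return utterances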
The first aggregation unit (per-person utterance time aggregation unit) 102 totals, for each person, the utterance times of the utterances detected by the first detection unit 101. The utterance times are totaled for each piece of identification information stored in the HDD 114.
The first calculation unit (total utterance time calculation unit) 103 sums the utterance times of all the utterances detected by the first detection unit 101 to calculate the total utterance time of all persons in all the groups.
The second calculation unit (per-person ratio calculation unit) 104 calculates, as a first ratio, the ratio of each person's utterance time totaled by the first aggregation unit 102 to the total time calculated by the first calculation unit 103.
The first granting unit (first evaluation point granting unit) 105 sets the evaluation points higher for a person with a larger ratio of utterance time (first ratio) calculated by the second calculation unit 104, and grants each person first evaluation points according to that first ratio.
The third calculation unit (per-group utterance time calculation unit) 106 totals the utterance times of the utterances detected by the first detection unit 101 for each group, and calculates, for each group, the total utterance time of the persons belonging to that group. The utterance times are totaled for each piece of group information stored in the HDD 114.
The fourth calculation unit (in-group ratio calculation unit) 107 calculates, as a second ratio, the in-group ratio of each person's utterance time totaled by the first aggregation unit 102 to the total utterance time, calculated by the third calculation unit 106, of the group to which that person belongs.
The second granting unit (second evaluation point granting unit) 108 sets the evaluation points higher for a person with a larger in-group ratio (second ratio) calculated by the fourth calculation unit 107, and further grants each person second evaluation points according to the in-group ratio.
The determination unit (backchannel/opinion determination unit) 109 determines that an utterance is a backchannel response (aizuchi) if the utterance time detected by the first detection unit 101 falls within a predetermined first time range (from the prescribed time up to a predetermined time longer than it; for example, from 0.25 seconds, the prescribed time, up to 2.0 seconds) (portion b in the example of FIG. 3). The determination unit 109 determines that an utterance is an opinion if its utterance time falls within a predetermined second time range longer than the first (a time exceeding the first time range) (portions a and c in the example of FIG. 3). The determination unit 109 stores in the HDD 114 the determination result, backchannel or opinion, together with the utterance time of each backchannel and opinion.
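A minimal sketch of this duration-based classification, using the example thresholds above (0.25 seconds as the prescribed time, 2.0 seconds as the upper bound of the first time range); the function and its names are illustrative:

    def classify_utterance(duration_sec, min_sec=0.25, backchannel_max_sec=2.0):
        """Classify an utterance by its duration: a short utterance is a
        backchannel response (aizuchi), a longer one is an opinion."""
        if duration_sec < min_sec:
            return None            # too short: not treated as an utterance
        if duration_sec <= backchannel_max_sec:
            return "backchannel"
        return "opinion"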
The storage control unit (result storage unit) 110 causes the HDD 114 to store, separately for each of the persons P11 to P33, the result determined by the determination unit 109, that is, whether each utterance is a backchannel or an opinion.
The third granting unit (third evaluation point granting unit) 121 further grants third evaluation points to a person who stated an opinion when, based on the results stored by the storage control unit 110 for the persons P11 to P33, it determines that another person produced a backchannel at the timing immediately after that opinion.
The text conversion unit 122 has a known speech recognition function, by which it converts the content of each person's utterances contained in the voice data into characters, producing text.
The addition granting unit (addition point granting unit) 123 determines, based on the text data produced by the text conversion unit 122, whether the utterances of each of the persons P11 to P33 contain a predetermined keyword, and grants addition points to each utterance determined to contain the keyword.
The fourth granting unit (fourth evaluation point granting unit) 124 totals, for each person, the addition points granted to utterances by the addition granting unit 123, and further grants the total of those addition points to the person concerned as fourth evaluation points.
The fifth calculation unit (total value calculation unit) 125 uses the addition points granted by the addition granting unit 123 to calculate the total of the addition points of all the persons belonging to the plurality of groups G1 to G3. The fourth granting unit 124 calculates, for each of the persons P11 to P33, the ratio (third ratio) of that person's fourth evaluation points to the total calculated by the fifth calculation unit 125, and increases the fourth evaluation points as that ratio becomes higher.
The sixth calculation unit (per-group point calculation unit) 126 calculates, for each of the groups G1 to G3, the total of the addition points granted by the addition granting unit 123 to the persons belonging to that group. The fourth granting unit 124 calculates, for each person, the ratio (fourth ratio) of that person's fourth evaluation points to the total, calculated by the sixth calculation unit 126, of the group to which the person belongs, and increases the fourth evaluation points as that ratio becomes higher.
Next, the evaluation process of conference participants by the information processing apparatus 1 according to the first embodiment will be described. FIG. 4 is a flowchart showing the evaluation process of conference participants by the information processing apparatus 1.
The evaluation is performed in a situation in which the persons belonging to the conversation groups G1 to G3 are holding a conference in their respective groups. Each of the persons P11 to P33 wears a headset 2, and each headset 2 is communicably connected to the information processing apparatus 1 as described above. In this state, the persons P11 to P33 speak during the conference in the groups to which they belong. The voices uttered by the persons P11 to P33 are picked up by their respective headsets 2 and output to the information processing apparatus 1.
The information processing apparatus 1 acquires voice data from each headset 2 via the communication interface 118 (step S1). That is, when the communication interface 118 receives the electrical signal representing the voice output from each headset 2, the first detection unit 101 converts the acquired electrical signal into voice data consisting of a digital audio signal and stores it in the HDD 114. The first detection unit 101 stores the voice data in the HDD 114 for each of the persons P11 to P33, that is, in association with the identification information stored in the HDD 114.
Subsequently, the first detection unit 101 extracts, from each of the voice data stored in the HDD 114 for the persons P11 to P33, the utterances contained in the speech represented by that voice data, in the manner described above (step S2). The first detection unit 101 then detects the utterance time of each extracted utterance (step S3).
Further, the first aggregation unit 102 totals the utterance times of the utterances detected by the first detection unit 101 individually for each of the persons P11 to P33 (step S4).
Subsequently, the first calculation unit 103 sums, for example, the utterance times of all the utterances detected per person by the first detection unit 101, and calculates the total utterance time of all the persons (step S5).
The second calculation unit 104 calculates, as the first ratio, the ratio of each person's utterance time totaled by the first aggregation unit 102 to the total time calculated by the first calculation unit 103 (step S6). That is, the second calculation unit 104 individually calculates the ratio for each person up to P33: the ratio of person P11's utterance time to the total time, the ratio of person P12's utterance time to the total time, and so on.
Subsequently, the first granting unit 105 sets the evaluation points higher for a person with a larger ratio of utterance time calculated by the second calculation unit 104, and grants each of the persons P11 to P33 first evaluation points according to the first ratio (step S7). For example, the first granting unit 105 grants 2 points as first evaluation points when the first ratio is from 0% to less than 20%, 4 points when it is from 20% to less than 40%, 6 points when it is from 40% to less than 60%, 8 points when it is from 60% to less than 80%, and 10 points when it is from 80% up to 100%.
Further, the third calculation unit 106 totals the utterance times of the utterances detected by the first detection unit 101 in step S3 for each of the groups G1 to G3, that is, for each piece of group information stored in the HDD 114, and calculates the total utterance time of the persons in group G1, the total utterance time of the persons in group G2, and the total utterance time of the persons in group G3 (step S8).
The fourth calculation unit 107 then calculates, as the second ratio, the in-group ratio of the utterance time of each of the persons P11 to P33 totaled by the first aggregation unit 102 in step S4 to the per-group total utterance time calculated by the third calculation unit 106 in step S8 (step S9). That is, for each of the persons P11 to P33, the fourth calculation unit 107 calculates the ratio of that person's utterance time to the total utterance time of the group to which the person belongs as the in-group ratio.
Subsequently, the second granting unit 108 sets the evaluation points higher for a person with a larger in-group ratio calculated by the fourth calculation unit 107 in step S9, and further grants each of the persons P11 to P33 second evaluation points according to the in-group ratio (step S10).
For example, the second granting unit 108 grants 4 points as second evaluation points when the second ratio is from 0% to less than 20%, 8 points when it is from 20% to less than 40%, 12 points when it is from 40% to less than 60%, 16 points when it is from 60% to less than 80%, and 20 points when it is from 80% up to 100%.
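A minimal sketch of the tiered granting in steps S7 and S10 (the helper and its weight argument are illustrative; as in the examples above, the second scale is simply double the first):

    def evaluation_points(ratio_percent, weight=1):
        """Map a ratio in [0, 100] to the tiered points of steps S7/S10:
        2/4/6/8/10 points for weight=1 (first points), doubled for weight=2."""
        for upper, points in [(20, 2), (40, 4), (60, 6), (80, 8)]:
            if ratio_percent < upper:
                return points * weight
        return 10 * weight  # 80% or more, up to 100%

    # e.g. evaluation_points(35) == 4 (first evaluation points),
    #      evaluation_points(35, weight=2) == 8 (second evaluation points)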
In this way, the second granting unit 108 preferably grants, as second evaluation points, points higher than the first evaluation points granted by the first granting unit 105 (in this embodiment, the second evaluation points are twice the first evaluation points). This is because a person whose utterance time represents a higher ratio of the total utterance time within his or her own group than of the total time, that is, the sum of the utterance times of all the persons P11 to P33, is considered to contribute more to the conference.
When a conference is held separately in each of a plurality of groups G1 to G3, each consisting of a plural number of people, the discussion within a group may or may not become active depending on the individual character of the persons belonging to that group. For this reason, even if a person has a long utterance time (a high degree of contribution) in one group, it is unknown whether that person would likewise speak a lot (show a high degree of contribution) when belonging to another group.
Therefore, if the degree of contribution to the conference is evaluated based only on the ratio of each member's utterance time to the total utterance time of the persons in that group, the overall contribution of each of the persons P11 to P33 across all the groups G1 to G3 remains unknown.
In the present embodiment, in step S7 the first granting unit 105 grants each person first evaluation points according to the ratio of that person's utterance time to the total utterance time of all the persons P11 to P33 across all the groups G1 to G3, while in step S10 the second granting unit 108 further grants second evaluation points according to the in-group ratio, that is, the ratio of each person's utterance time to the total utterance time of his or her group. A comprehensive evaluation is therefore possible that takes into account both each person's contribution within the group and the overall contribution of the persons P11 to P33 across all the groups G1 to G3.
The control unit 100 may be configured to be able to display, on the display unit 115, information indicating the first evaluation points and second evaluation points granted to each person, in accordance with an instruction input by the user via the instruction input unit 119.
Next, a first modification of the evaluation process of conference participants by the information processing apparatus 1 will be described. FIG. 5 is a flowchart showing the first modification of the evaluation process of conference participants by the information processing apparatus 1. In the description of this first modification, explanation of the processes that are the same as in the first embodiment is omitted.
In the first modification, after the processes of steps S1 to S10 are performed as in the first embodiment, the determination unit 109 further determines whether the utterance time of each utterance is within the first time range or the second time range (step S11). If the utterance time is within the first time range ("first time" in step S11), the determination unit 109 determines that the utterance made in that time is a backchannel (step S12). If the utterance time is within the second time range ("second time" in step S11), the determination unit 109 determines that the utterance made in that time is an opinion (step S16).
Further, the storage control unit 110 causes the HDD 114 to store, separately for each of the persons P11 to P33, the result determined by the determination unit 109 in steps S12 and S16, that is, whether each utterance is a backchannel or an opinion, together with the time at which the utterance representing that backchannel or opinion was made (step S13).
Subsequently, based on the results stored by the storage control unit 110 for the persons P11 to P33, the third granting unit 121 determines, for each of the groups G1 to G3, whether a backchannel was produced by another person at the timing immediately after an opinion was stated by some person within the group (step S14). When the third granting unit 121 determines that there is such an opinion (YES in step S14), it further grants third evaluation points to the person who stated the opinion (step S15). For example, the third granting unit 121 grants 10 points as the third evaluation points. When it determines that there is no such opinion (NO in step S14), the third granting unit 121 does not grant third evaluation points.
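A minimal sketch of the check in step S14, assuming the stored results for one group are available as chronologically ordered (speaker, kind) pairs; the record layout is an assumption for illustration:

    def opinions_followed_by_backchannel(records):
        """records: chronologically ordered (speaker, kind) pairs within one
        group, kind being "opinion" or "backchannel". Returns the speakers
        whose opinion was immediately followed by another person's backchannel."""
        awarded = set()
        for (spk, kind), (next_spk, next_kind) in zip(records, records[1:]):
            if kind == "opinion" and next_kind == "backchannel" and next_spk != spk:
                awarded.add(spk)
        return awarded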
According to this first modification, an opinion that is immediately followed by backchannels from other people, and can therefore be presumed to be a good opinion that attracted the interest of others, is given a higher evaluation than other opinions that drew no backchannel. This makes it possible to accurately give a high evaluation to opinions presumed to be of good quality.
Next, a second modification of the evaluation process of conference participants by the information processing apparatus 1 will be described. FIG. 6 is a flowchart showing the second modification of the evaluation process of conference participants by the information processing apparatus 1. In the description of this second modification, explanation of the processes that are the same as in the first embodiment and the first modification is omitted.
The second modification is performed after steps S1 to S10 of the first embodiment, or after steps S11 to S16 of the first modification.
The text conversion unit 122 converts the content of each person's utterances contained in the voice data into characters, producing text (step S20).
Based on the text data produced by the text conversion unit 122, the addition granting unit 123 determines whether the utterances of the persons P11 to P33 contained in that text data include a predetermined keyword (step S21).
When the addition granting unit 123 determines that an utterance by one of the persons P11 to P33 contained in the text data includes the predetermined keyword (YES in step S21), it grants addition points to the utterance determined to include the keyword (step S22). The text data contains a plurality of utterances by each person; the addition granting unit 123 makes the keyword determination for all of those utterances and grants addition points to every utterance determined to include the keyword.
When the addition granting unit 123 determines that the utterances of the persons P11 to P33 do not include the predetermined keyword (NO in step S21), it does not grant addition points.
Thereafter, the fourth granting unit 124 totals the addition points granted to the utterances by the addition granting unit 123 in step S22 for each of the persons P11 to P33 (step S23), and grants the total of those addition points to each person subject to the totaling as fourth evaluation points (step S24).
For example, the addition granting unit 123 grants an addition point (for example, 1 point) to an utterance each time the keyword contained in it appears, and totals the points for each of the persons P11 to P33. The addition granting unit 123 then grants each person who made utterances including the keyword the totaled value for that person as fourth evaluation points.
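A minimal sketch of steps S21 to S24, assuming the text data is available as a mapping from each person to a list of utterance strings and that utterances can be tokenized on whitespace; the names are illustrative:

    def fourth_evaluation_points(utterances_by_person, keywords):
        """Grant 1 addition point per keyword occurrence in each utterance
        and total the points per person (the fourth evaluation points)."""
        points = {}
        for person, utterances in utterances_by_person.items():
            total = 0
            for utterance in utterances:
                words = utterance.split()
                total += sum(words.count(k) for k in keywords)
            points[person] = total
        return points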
According to this second modification, a person who uttered a keyword that the organizer of the conference or the like regarded as indicating a high degree of contribution is granted fourth evaluation points in addition to the first to third evaluation points. Based on the sum of these points, it becomes possible to accurately determine which of the persons P11 to P33 are truly contributing to the conference.
Next, a third modification of the evaluation process of conference participants by the information processing apparatus 1 will be described. FIG. 7 is a flowchart showing the third modification of the evaluation process of conference participants by the information processing apparatus 1. In the description of this third modification, explanation of the processes that are the same as in the first embodiment, the first modification, and the second modification is omitted.
In the third modification, as in the second modification, after the fourth granting unit 124 totals, for each of the persons P11 to P33, the addition points granted to the utterances by the addition granting unit 123 in step S22 (step S23), the fifth calculation unit 125 further uses the per-person addition points granted by the addition granting unit 123 to calculate the total of the addition points of all the persons belonging to the plurality of groups (step S31).
Subsequently, the fourth granting unit 124 calculates the ratio of each person's totaled addition points to the total calculated by the fifth calculation unit 125 (step S32), and sets the fourth evaluation points higher for a person with a higher calculated ratio before granting them to the person concerned (step S33).
For example, the fourth granting unit 124 adds 20% of the totaled value when the ratio is from 0% to less than 20%, 40% of the totaled value when it is from 20% to less than 40%, 60% when it is from 40% to less than 60%, 80% when it is from 60% to less than 80%, and 100% when it is from 80% up to 100%, and grants the totaled value after this addition as the fourth evaluation points.
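A minimal sketch of the boost in steps S31 to S33 (the tier boundaries follow the example above; the helper name and the dictionary layout are assumptions):

    def boost_fourth_points(points_by_person):
        """Boost each person's addition-point total by 20/40/60/80/100%
        of itself according to that person's share of the all-group total."""
        grand_total = sum(points_by_person.values())
        boosted = {}
        for person, pts in points_by_person.items():
            share = 100.0 * pts / grand_total if grand_total else 0.0
            for upper, bonus in [(20, 0.2), (40, 0.4), (60, 0.6), (80, 0.8)]:
                if share < upper:
                    boosted[person] = pts * (1 + bonus)
                    break
            else:
                boosted[person] = pts * 2.0   # 80% or more, up to 100%
        return boosted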
According to this third modification, the degree to which a person uttered the keyword is given an objective element by taking into account the keyword utterance situation within the group, making it possible to evaluate the person who uttered the keyword on that basis.
Next, a fourth modification of the evaluation process of conference participants by the information processing apparatus 1 will be described. FIG. 8 is a flowchart showing the fourth modification of the evaluation process of conference participants by the information processing apparatus 1. In the description of this fourth modification, explanation of the processes that are the same as in the first embodiment and the first to third modifications is omitted.
The fourth modification is performed after the third modification. As in the third modification, after the fourth evaluation points are set high and granted to the persons concerned (step S33), the sixth calculation unit 126 totals the addition points of the persons belonging to each group, for each of the groups G1 to G3, to calculate a per-group total (step S34).
The fourth granting unit 124 then calculates the ratio of each person's totaled addition points to the total calculated by the sixth calculation unit 126 (step S35), and sets the fourth evaluation points higher for a person with a higher calculated ratio before granting them additionally to the person concerned (step S36).
For example, the fourth granting unit 124 adds 10% of the totaled value when the ratio is from 0% to less than 20%, 20% of the totaled value when it is from 20% to less than 40%, 30% when it is from 40% to less than 60%, 40% when it is from 60% to less than 80%, and 50% when it is from 80% up to 100%, and grants the totaled value after this addition as the fourth evaluation points.
The fourth granting unit 124 preferably sets the fourth evaluation points based on the ratio to the total of all groups combined (third ratio) higher than the fourth evaluation points based on the ratio to the per-group total (fourth ratio). This is because a person whose keyword occurrences represent a higher ratio of the total occurrences of all the persons P11 to P33 across all groups than of the total occurrences within his or her own group is considered to contribute more to the conference.
According to this fourth modification, the degree to which a person uttered the keyword can be evaluated with an even more objective element added, taking into account not only the utterance situation within the group but also the keyword utterance situation of all the members of all the groups.
Next, an information processing apparatus and a group reconstruction method according to a second embodiment will be described with reference to the drawings. Explanation of the configurations and processes that are the same as in the information processing apparatus and the like according to the first embodiment is omitted.
Like the information processing apparatus 1 according to the first embodiment, the information processing apparatus 1A according to the second embodiment acquires, as voice data, the voices uttered by the persons belonging to the conversation groups G1 to G3.
Next, the configuration of the information processing apparatus 1A according to the second embodiment will be described. FIG. 9 is a block diagram showing an outline of the internal configuration of the information processing apparatus 1A according to the second embodiment. Explanation of the processes that are the same as in the information processing apparatus 1 according to the first embodiment is omitted.
When the group reconstruction program stored in the HDD 114 is executed by the processor, the control unit 10 of the information processing apparatus 1A according to the second embodiment functions as a control unit 100, a first detection unit 101, a first aggregation unit 102, a reconstruction unit 130, a text conversion unit 122, a second detection unit 131, and a second aggregation unit 132. Each of these units may instead be configured as a hardware circuit.
The reconstruction unit (group reconstruction unit) 130 reconstructs the members of each group based on the utterance time of each person totaled by the first aggregation unit 102. In the present embodiment, the reconstruction unit 130 reconstructs the members of the groups G1 to G3 based on the utterance times of the persons P11 to P33 totaled by the first aggregation unit 102.
The text conversion unit 122 converts the content of each person's utterances contained in the voice data into text.
The second detection unit (keyword detection unit) 131 determines, based on the text data produced by the text conversion unit 122, whether the utterances of each person include a predetermined keyword.
The second aggregation unit (per-group aggregation unit) 132 totals the utterance times of the persons P11 to P33, totaled by the first aggregation unit 102, for each group reconstructed by the reconstruction unit 130.
Next, the group reconstruction process by the information processing apparatus 1A will be described. FIG. 10 is a flowchart showing the group reconstruction process by the information processing apparatus 1A.
As in the case of the information processing apparatus 1 according to the first embodiment, and as described with reference to FIG. 1, the process is performed in a situation in which the persons belonging to the conversation groups G1 to G3 are holding a conference in their respective groups. Each of the persons P11 to P33 wears a headset 2, and each headset 2 is communicably connected to the information processing apparatus 1A as described above. In this state, the persons P11 to P33 speak during the conference in the groups to which they belong. The voices uttered by the persons P11 to P33 are picked up by their respective headsets 2 and output to the information processing apparatus 1A.
The information processing apparatus 1A acquires voice data from each headset 2 via the communication interface 118 (step S101). The first detection unit 101 stores the voice data in the HDD 114 for each of the persons P11 to P33, that is, in association with the identification information stored in the HDD 114.
Subsequently, the first detection unit 101 extracts, from each of the voice data stored in the HDD 114 for the persons P11 to P33, the utterances contained in the speech represented by that voice data, in the manner described above (step S102). The first detection unit 101 then detects the utterance time of each extracted utterance (step S103).
Further, the first aggregation unit 102 totals the utterance times of the utterances detected by the first detection unit 101 individually for each of the persons P11 to P33 (step S104).
Subsequently, the reconstruction unit 130 ranks the persons in descending order of the utterance time totaled by the first aggregation unit 102 (step S105). The reconstruction unit 130 then divides the persons into groups of a predetermined number, starting from the top of the ranking, thereby reconstructing the groups (step S106).
For example, suppose the reconstruction unit 130 ranks the persons P11 to P33 by utterance time length as, from the top, P31, P21, P11, P22, P23, P12, P13, P33, P32. In this case, with the predetermined number set to, for example, three, the reconstruction unit 130 reconstructs the groups with P31, P21, and P11 as the members of group G1, P22, P23, and P12 as the members of group G2, and P13, P33, and P32 as the members of group G3.
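A minimal sketch of steps S105 and S106, assuming the per-person totals are available as a dictionary; the function name and the default group size are illustrative:

    def reconstruct_groups(utterance_time_by_person, group_size=3):
        """Rank people by total utterance time (descending) and chunk the
        ranking into groups of group_size, from the top down."""
        ranked = sorted(utterance_time_by_person,
                        key=utterance_time_by_person.get, reverse=True)
        return [ranked[i:i + group_size]
                for i in range(0, len(ranked), group_size)]

    # With the example ranking above, the result would be
    # [["P31", "P21", "P11"], ["P22", "P23", "P12"], ["P13", "P33", "P32"]].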
The control unit 100 updates the identification information and the group information stored in association with each other in the HDD 114 so that the result of the reconstruction by the reconstruction unit 130 is reflected. In this case, the HDD 114 stores the identification information of the headsets 2 used by P31, P21, and P11 in association with the group information specifying group G1. It likewise stores the identification information of the headsets 2 used by P22, P23, and P12 in association with the group information specifying group G2, and the identification information of the headsets 2 used by P13, P33, and P32 in association with the group information specifying group G3.
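The patent only states that headset identification information and group information are stored in association with each other; a dictionary keyed by headset ID is one plausible representation. A minimal sketch under that assumption:

```python
headset_to_group = {}  # headset identification info -> group information

def apply_reconstruction(reconstructed):
    """reconstructed: group name (e.g. 'G1') -> list of member headset IDs."""
    for group_name, members in reconstructed.items():
        for headset_id in members:
            headset_to_group[headset_id] = group_name  # overwrite the old mapping

apply_reconstruction({"G1": ["P31", "P21", "P11"],
                      "G2": ["P22", "P23", "P12"],
                      "G3": ["P13", "P33", "P32"]})
```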
The second totaling unit 132 then totals the speech time of each person, as totaled by the first totaling unit 102, for each of the reconstructed groups G1 to G3 (step S107).
Based on the identification information and group information stored in the HDD 114 and on the totals produced by the second totaling unit 132, the control unit 100 causes the display unit 115 to display the reconstructed groups G1 to G3, the members of each group, and the speech time totaled in step S107 for each of the groups G1 to G3 (step S108).
According to this second embodiment, the groups are rebuilt according to each person's speech-time length (degree of contribution) in a past meeting, so that people with long speech times end up in the same group and people with short speech times end up in the same group. It therefore becomes possible to form new groups from which more effective discussion can be expected, based on the results of meetings held in the past.
In addition, since the total speech time of the members is calculated and displayed for each reconstructed group, it becomes possible to predict what kind of result a meeting held with the newly reconstructed groups would produce.
In the present embodiment, the following processing may be performed in step S106. After the persons have been ranked in descending order of the speech time totaled by the first totaling unit 102 (step S105), the reconstruction unit 130 divides them, from the top of the ranking, into groups of a predetermined number, but does so without placing persons whose speech time exceeds a predetermined specified time into the same group. For example, when grouping from the top of the ranking produces a group containing two or more persons whose speech time exceeds the specified time, the reconstruction unit 130 swaps such a person with a person who belongs to a lower-ranked group and whose speech time does not exceed the specified time. This prevents long talkers from ending up in the same group, and so raises the likelihood that the meetings will be meaningful.
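A sketch of this variant of step S106, assuming a concrete specified time (LIMIT) and groups held as lists ordered from the highest-ranked group down. It resolves one surplus long talker at a time; a fuller implementation would re-examine groups that receive a swapped-in long talker.

```python
LIMIT = 250  # assumed "specified time" in seconds

def separate_long_talkers(groups, speech_time, limit=LIMIT):
    """groups: list of member lists, ordered from highest- to lowest-ranked group."""
    for gi, group in enumerate(groups):
        long_talkers = [p for p in group if speech_time[p] > limit]
        for surplus in long_talkers[1:]:       # keep at most one long talker here
            for lower in groups[gi + 1:]:      # search the lower-ranked groups
                for j, candidate in enumerate(lower):
                    if speech_time[candidate] <= limit:
                        # swap the surplus long talker with an under-limit member
                        group[group.index(surplus)], lower[j] = candidate, surplus
                        break
                else:
                    continue   # no under-limit member here; try the next group
                break
    return groups
```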
Next, a first modification of the group reconstruction processing by the information processing apparatus 1A according to the second embodiment will be described. FIG. 11 is a flowchart showing the first modification of the group reconstruction processing by the information processing apparatus 1A. Description of processing identical to that of the second embodiment is omitted.
In the first modification, after the first totaling unit 102 has totaled the speech time of each utterance individually for each of the persons P11 to P33 (step S204), the reconstruction unit 130 creates, in accordance with a predetermined grouping rule, every group that can be formed by combining the persons P11 to P33 (step S205). Suppose, for example, that the predetermined grouping rule is: with nine members in all, create three groups of three persons each. In this case, the reconstruction unit 130 covers every possible way of distributing the persons P11 to P33 three at a time, and creates the groups G1 to G3 for every such combination.
Subsequently, the reconstruction unit 130 totals the speech time of the members belonging to each of the groups created in this way (step S206).
Furthermore, for each set of groups G1 to G3 created as described above, the reconstruction unit 130 calculates the differences in group speech time between the groups, and extracts the set of groups G1 to G3 for which the difference is smallest (step S207).
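A sketch of steps S205 to S207 for the nine-person, three-groups-of-three rule: enumerate every partition, total each group's speech time, and keep the partition whose gap between the largest and smallest group total is least. Reading "the difference is smallest" as this max-min spread is an interpretation; the text leaves the exact difference measure open. The speech_time argument is a person-to-seconds dictionary like the hypothetical one shown earlier.

```python
from itertools import combinations

def partitions(people):
    """Yield every partition of `people` into groups of three (order ignored)."""
    if not people:
        yield []
        return
    first = people[0]
    for rest in combinations(people[1:], 2):   # two partners for the first person
        group = (first,) + rest
        remaining = [p for p in people[1:] if p not in rest]
        for tail in partitions(remaining):
            yield [group] + tail

def best_partition(speech_time):
    """Return the partition with the smallest max-min gap in group totals."""
    def spread(partition):
        totals = [sum(speech_time[p] for p in g) for g in partition]
        return max(totals) - min(totals)
    return min(partitions(list(speech_time)), key=spread)
```

For nine people this enumerates only 280 partitions, so exhaustive search is cheap; larger meetings would call for a heuristic instead.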
The control unit 100 causes the display unit 115 to display the groups G1 to G3 extracted in step S207 and the totaled speech time of each person in each of the extracted groups G1 to G3 (step S208).
According to this first modification, the total speech time of each group is adjusted to be even, according to each person's speech-time length (degree of contribution) in past meetings. This eliminates bias in group composition, such as one group consisting only of people who speak a lot while another consists only of people who speak little.
Next, a second modification of the group reconstruction processing by the information processing apparatus 1A according to the second embodiment will be described. FIG. 12 is a flowchart showing the second modification of the group reconstruction processing by the information processing apparatus 1A. Description of processing identical to that of the second embodiment is omitted.
In the second modification, as in the second embodiment, the reconstruction unit 130 ranks the persons in descending order of the speech time totaled by the first totaling unit 102 (step S305). The reconstruction unit 130 then reconstructs the groups, each having a predetermined number of members, under the condition that persons from the lowest rank up to a predetermined rank and persons from the highest rank down to a predetermined rank are placed in the same group (step S306).
For example, suppose that the reconstruction unit 130 ranks the speech-time lengths of the persons P11 to P33 as, from the top, P31, P21, P11, P22, P23, P12, P13, P33, P32, and that the predetermined number of members is three. Suppose further that "from the highest rank down to a predetermined rank" covers only the single highest-ranked person, and "from the lowest rank up to a predetermined rank" covers the two lowest-ranked persons. In this case, the reconstruction unit 130 first makes the highest-ranked person, P31, and the two lowest-ranked persons, P33 and P32, the members of group G1. Excluding these three (P31, P33, P32), it then makes the now highest-ranked person, P21, and the two lowest-ranked persons, P12 and P13, the members of group G2. Excluding these three as well (P21, P12, P13), it finally makes the highest-ranked remaining person, P11, and the two lowest-ranked, P22 and P23, the members of group G3.
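A sketch of the zigzag assignment in step S306 with the example parameters (one person from the top and two from the bottom per group); popping a deque from both ends reproduces the membership worked out above.

```python
from collections import deque

def zigzag_groups(ranked, top_n=1, bottom_n=2):
    """ranked: people ordered from longest to shortest speech time."""
    queue, groups = deque(ranked), []
    while queue:
        group = [queue.popleft() for _ in range(top_n)]    # highest-ranked left
        group += [queue.pop() for _ in range(bottom_n)]    # lowest-ranked left
        groups.append(group)
    return groups

zigzag_groups(["P31", "P21", "P11", "P22", "P23", "P12", "P13", "P33", "P32"])
# -> [['P31', 'P32', 'P33'], ['P21', 'P13', 'P12'], ['P11', 'P23', 'P22']]
```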
The second totaling unit 132 then totals the speech time of each person, as totaled by the first totaling unit 102, for each of the reconstructed groups G1 to G3 (step S307).
The control unit 100 causes the display unit 115 to display the reconstructed groups G1 to G3 and the totaled speech time of each of the groups G1 to G3 (step S308).
According to this second modification, each group can be composed by combining people who speak relatively actively with people who speak relatively passively, according to each person's speech-time length (degree of contribution) in past meetings. This eliminates bias in group composition, such as one group consisting only of people who speak a lot while another consists only of people who speak little.
Next, a third modification of the group reconstruction processing by the information processing apparatus 1A according to the second embodiment will be described. FIG. 13 is a flowchart showing the third modification of the group reconstruction processing by the information processing apparatus 1A. Description of processing identical to that of the second embodiment is omitted.
The third modification is processing executed after group reconstruction has been performed according to the second embodiment or its first or second modification.
When the group reconstruction processing is completed, the text conversion unit 122 converts the content of each person's utterances contained in the voice data into text (step S401).
Subsequently, based on the text data, the second detection unit 131 identifies utterances containing a predetermined keyword and the persons who made them (step S402).
The reconstruction unit 130 then determines whether, in the reconstructed groups G1 to G3, two or more persons who made utterances containing the keyword are members of the same group (step S403). If the reconstruction unit 130 determines that no two such persons are members of the same group (NO in step S403), it keeps the groups G1 to G3 as configured at this point and ends the processing.
If, on the other hand, the reconstruction unit 130 determines that two or more persons who made utterances containing the keyword are members of the same group (YES in step S403), it determines whether any of the groups G1 to G3 consists only of persons who made no utterance containing the keyword (step S404).
If the reconstruction unit 130 determines that such a group exists among the groups G1 to G3 (YES in step S404), it reconstructs the groups by swapping one of the persons who made utterances containing the keyword and share a group with another such person, with one member of the group consisting only of persons who made no utterance containing the keyword (step S405). If the reconstruction unit 130 determines that no group consists only of persons who made no utterance containing the keyword (NO in step S404), it keeps the groups G1 to G3 as configured at this point, without rearranging them, and ends the processing.
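A sketch of steps S402 to S405, assuming each person's utterances are available as one concatenated text string and that the keyword is a single illustrative word. Swapping with the first member of the first keyword-free group is one arbitrary choice among those the text allows.

```python
KEYWORD = "cost"  # hypothetical predetermined keyword

def rebalance_by_keyword(groups, transcripts, keyword=KEYWORD):
    """groups: lists of members; transcripts: person -> full text of their utterances."""
    said = {p for p, text in transcripts.items() if keyword in text}  # step S402
    for group in groups:
        speakers = [p for p in group if p in said]
        if len(speakers) < 2:
            continue                        # step S403: no shared keyword speakers
        for other in groups:                # step S404: find a keyword-free group
            if other is group or any(p in said for p in other):
                continue
            surplus, partner = speakers[1], other[0]
            group[group.index(surplus)] = partner          # step S405: swap
            other[0] = surplus
            break
    return groups
```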
According to this third modification, situations in which people who make similar statements belong to the same group are avoided, and a grouping can be achieved in which the members of each group make statements with varied content.
Next, the configuration of the information processing apparatus 1B according to the third embodiment will be described. FIG. 14 is a block diagram showing an outline of the internal configuration of the information processing apparatus 1B according to the third embodiment. Description of processing identical to that of the information processing apparatus 1 according to the first embodiment or the information processing apparatus 1A according to the second embodiment is omitted.
The control unit 10 of the information processing apparatus 1B according to the third embodiment functions as the control unit 100, the third detection unit 135, the creation unit 136, the evaluation unit 137, the text conversion unit 122, and the second detection unit 131 when the evaluation program stored in the HDD 114 is executed by the above-mentioned processor. The control unit 100, the third detection unit 135, the creation unit 136, the evaluation unit 137, the text conversion unit 122, and the second detection unit 131 may alternatively be configured as hardware circuits.
The third detection unit (utterance detection unit) 135 detects, from each of the voice data of the nine persons P11 to P33 stored in the HDD 114, each utterance made in the speech, together with the person who made it and the time at which it was made.
The creation unit (utterance distribution creation unit) 136 uses the utterances detected by the third detection unit 135 and their times to create an utterance distribution chart showing how the amount of utterance changes over time.
The evaluation unit 137 performs processing that judges the evaluation level of the person who made the first utterance in a time slot of the utterance distribution chart created by the creation unit 136 in which the amount of utterance shows an increasing tendency to be a predetermined first high level.
The text conversion unit 122 converts the content of each person's utterances contained in the voice data into text.
The second detection unit 131 determines, based on the text data produced by the text conversion unit 122, whether each person's utterances contain a predetermined keyword.
Next, the evaluation processing by the information processing apparatus 1B according to the third embodiment will be described. FIG. 15 is a flowchart showing the evaluation processing by the information processing apparatus 1B.
The scene in which the evaluation processing is performed is the same as for the information processing apparatus 1 according to the first embodiment: as described with reference to FIG. 1, the persons belonging to the conversation groups G1 to G3 are each holding a meeting in their respective groups. The information processing apparatus 1B acquires voice data from each headset 2 via the communication interface 118 (step S501). The first detection unit 101 stores the voice data in the HDD 114 for each of the persons P11 to P33, that is, in association with the identification information stored in the HDD 114.
Subsequently, the third detection unit 135 extracts, in the manner described above, each utterance made in the speech indicated by the voice data from each of the voice data stored in the HDD 114 for the persons P11 to P33, together with the time of each utterance (step S502). Here, "time" means the elapsed time from the start of the voice data.
The creation unit 136 then uses the utterances detected by the third detection unit 135 and their times to create an utterance distribution chart showing how the amount of utterance changes over time (step S503). For example, as shown in FIG. 16, it creates, for each of the groups G1 to G3, a line graph showing the number of utterances for each minute elapsed since the start of the meeting.
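A sketch of the binning behind step S503, assuming each detected utterance arrives as a (group, start-time-in-seconds) pair; the one-minute bins follow the FIG. 16 example.

```python
from collections import Counter

def utterance_distribution(utterances):
    """utterances: iterable of (group, start_sec); returns group -> Counter{minute: count}."""
    dist = {}
    for group, start_sec in utterances:
        dist.setdefault(group, Counter())[int(start_sec // 60)] += 1
    return dist
```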
Subsequently, the evaluation unit 137 extracts the time slots of the utterance distribution chart created by the creation unit 136 in which the amount of utterance shows an increasing tendency (step S504). For example, the evaluation unit 137 extracts time slots in which the rate of increase (differential value) over a predetermined time (for example, one minute) is equal to or greater than a predetermined value (for example, a 50% increase in the number of utterances); this corresponds to checking whether the slope of the polyline in the chart is at or above a predetermined angle. In the case of the utterance distribution chart shown in FIG. 16, the slope (rate of increase, differential value) from point P1 to point P2 one minute later meets or exceeds the predetermined 50% value, so the evaluation unit 137 extracts this time slot.
The evaluation unit 137 then determines the first utterance made in the extracted time slot and the person who made it (step S505). For example, the evaluation unit 137 identifies the first utterance made, and its speaker, after the time of point P1 (11 minutes after the start).
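A sketch of steps S504 and S505, reading the extraction rule as: a one-minute rise in the utterance count of 50% or more marks a surge, and the first utterance at or after the start of that window belongs to the trigger speaker. The per-minute counts are of the kind produced by the binning sketch above, and the utterance list is assumed to be sorted by start time.

```python
def find_surge_starter(counts, utterances, rate=0.5):
    """counts: minute -> utterance count for one group;
    utterances: that group's (start_sec, person, text), sorted by start_sec."""
    for minute in sorted(counts)[:-1]:
        before, after = counts[minute], counts.get(minute + 1, 0)
        if before > 0 and (after - before) / before >= rate:   # step S504
            for start_sec, person, text in utterances:         # step S505
                if start_sec >= minute * 60:
                    return person, text    # first utterance in the surge window
    return None
```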
Furthermore, the evaluation unit 137 judges the evaluation level of the speaker of that first utterance to be a predetermined first high level (step S506). This first high level is a high evaluation level given to a person who contributed to the meeting, or came close to doing so, in that other utterances increased after the person spoke.
According to this third embodiment, a person whose own utterance caused other utterances to increase is judged to be at the first high level. A member can thus be given an evaluation that reflects the influence of that member's utterances on the utterances of others.
Next, a first modification of the evaluation processing by the information processing apparatus 1B according to the third embodiment will be described. FIG. 17 is a flowchart showing the first modification of the evaluation processing by the information processing apparatus 1B. Description of processing identical to that of the third embodiment is omitted.
In the first modification, after the evaluation unit 137 has judged the person to be at the first high level (step S606), the text conversion unit 122 converts the content of each person's utterances contained in the voice data into text (step S607).
Subsequently, the evaluation unit 137 extracts, from the text data produced by the text conversion unit 122, each word contained in the first utterance made in the extracted time slot in which the amount of utterance shows an increasing tendency (step S608).
The control unit 100 causes the display unit 115 to display the words extracted by the evaluation unit 137 (step S609).
According to this first modification, the words contained in the utterance that caused other utterances to increase are extracted, so it becomes clear which words subsequently invigorated the meeting.
Next, a second modification of the evaluation processing by the information processing apparatus 1B according to the third embodiment will be described. FIG. 18 is a flowchart showing the second modification of the evaluation processing by the information processing apparatus 1B. Description of processing identical to that of the third embodiment or its first modification is omitted.
In the second modification, after the control unit 100 has caused the display unit 115 to display the words extracted by the evaluation unit 137 (step S709), the evaluation unit 137 determines whether each word contained in the first utterance also appears, with an increased number of occurrences, in the other utterances made after that first utterance within the extracted time slot (step S710). For example, the evaluation unit 137 determines whether each word contained in the first utterance appears in the text data, produced by the text conversion unit 122, of all the other utterances made after the first utterance within the extracted time slot. Where a word does appear, the evaluation unit 137 counts its occurrences and determines, for each word, whether that count exceeds the number of occurrences in the first utterance.
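A sketch of the comparison in step S710. Whitespace tokenisation is a simplifying assumption; Japanese text would in practice need morphological analysis to be split into words.

```python
def words_with_increase(first_utterance, later_utterances):
    """Words of the trigger utterance that occur more often in the later utterances."""
    first_words = first_utterance.split()
    later_words = " ".join(later_utterances).split()
    # a non-empty result corresponds to YES in step S710
    return [word for word in set(first_words)
            if later_words.count(word) > first_words.count(word)]
```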
If the evaluation unit 137 determines that a word contained in the first utterance also appears, with an increased number of occurrences, in the other utterances made after that utterance within the extracted time slot (YES in step S710), it judges the evaluation level of the person who made the first utterance to be a second high level, higher still than the first high level (step S711). For example, the evaluation unit 137 judges "increased" in step S710 if the occurrence count of even one of the words has increased. The second high level is a high evaluation level given to a person who contributed to the meeting, or came close to doing so, in that after the person uttered a certain word, other utterances increased and the word itself appeared more often.
According to this second modification, it becomes possible to accurately find and evaluate a person whose words were later taken up and used frequently by others in the discussion, and thus became a factor in invigorating the meeting.
The control unit 100 may be configured to be able to display the evaluation results of the evaluation unit 137 on the display unit 115 in response to an instruction input by the user via the instruction input unit 119. The evaluation results are not particularly limited as long as they are information indicating an evaluation result; one example is information indicating, for each person, whether at least one of the first high level and the second high level has been given.
In the first to third embodiments described above, the person who made an utterance is identified based on the identification information assigned to the headset 2, but the present invention is not limited to such embodiments. For example, the person who made an utterance may be identified using a general speaker identification technique.
The configurations and processing described in the above embodiments with reference to FIGS. 1 to 18 are merely one embodiment of the present invention, and the present invention is not intended to be limited to those configurations and processing.
Claims (11)

1. An information processing apparatus comprising:
a first detection unit that detects, from voice data in which the utterances of the persons in a plurality of groups each having a predetermined number of members are recorded, the speech time of each utterance made in the speech;
a first totaling unit that totals, for each person, the speech times of the utterances detected by the first detection unit; and
a reconstruction unit that reconstructs the groups based on the speech time of each person totaled by the first totaling unit.

2. The information processing apparatus according to claim 1, wherein the reconstruction unit reconstructs the groups by dividing the persons, in descending order of the speech time totaled by the first totaling unit, into groups of a predetermined number.

3. The information processing apparatus according to claim 1, wherein the reconstruction unit reconstructs the groups without placing persons whose speech time totaled by the first totaling unit is long into the same group.

4. The information processing apparatus according to claim 1, wherein the reconstruction unit reconstructs the groups, each having a predetermined number of members, such that the group totals of the speech times totaled by the first totaling unit fall within a predetermined difference of one another.

5. The information processing apparatus according to claim 1, wherein the reconstruction unit ranks the persons according to the speech time totaled by the first totaling unit, and reconstructs the groups, each having a predetermined number of members, under the condition that persons from the lowest rank up to a predetermined rank and persons from the highest rank down to a predetermined rank are placed in the same group.

6. The information processing apparatus according to claim 2, further comprising:
a text conversion unit that converts the content of each person's utterances contained in the voice data into text; and
a second detection unit that determines, based on the text data produced by the text conversion unit, whether each person's utterances contain a predetermined keyword,
wherein the reconstruction unit reconstructs the groups, each having a predetermined number of members, under the condition, based on the detection result of the second detection unit, that persons who made utterances containing the same keyword are not placed in the same group.

7. The information processing apparatus according to claim 1, wherein
the first detection unit detects, from voice data in which the utterances of the persons belonging to the groups reconstructed by the reconstruction unit are recorded, the speech time of each utterance made in the speech,
the first totaling unit totals, for each person, the speech times of the utterances detected by the first detection unit, and
the apparatus further comprises a second totaling unit that totals the speech time of each person totaled by the first totaling unit for each of the reconstructed groups.

8. A group reconstruction method comprising:
a speech time detection step of detecting, from voice data in which the utterances of the persons in a plurality of groups each having a predetermined number of members are recorded, the speech time of each utterance made in the speech;
a per-person speech time totaling step of totaling, for each person, the speech times of the utterances detected in the speech time detection step; and
a group reconstruction step of reconstructing the groups based on the speech time of each person totaled in the per-person speech time totaling step.

9. An information processing apparatus comprising:
a voice input unit to which an electrical signal representing voice is input;
a storage unit that, each time the electrical signal is input to the voice input unit, stores voice data based on the input electrical signal for each person who produced the voice; and
a control unit including a processor that, by executing a group configuration program, functions as:
a first detection unit that extracts the portions corresponding to utterances from the voice data and detects the duration of each utterance as its speech time,
a first totaling unit that totals the at least one speech time for each person to calculate each person's speech time, and
a configuration unit that configures the group to which each person belongs, based on each person's speech time.

10. The information processing apparatus according to claim 9, further comprising a display unit, wherein the control unit further functions as:
a second totaling unit that totals the speech time of each person for each of the groups configured by the configuration unit to calculate a speech time for each group; and
a control section that causes the display unit to display information indicating the groups configured by the configuration unit, information indicating the persons belonging to the groups, and information indicating the speech time of each group.

11. The information processing apparatus according to claim 9, wherein
the storage unit further stores identification information for identifying the persons and group information for identifying the groups in association with each other, and
the control unit further functions as a control section that, when the groups have been configured by the configuration unit, updates the identification information and the group information stored in the storage unit so that the result of the configuration is reflected.
Applications Claiming Priority (2)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
JP2017248458 | 2017-12-25 | |
JP2017-248458 | 2017-12-25 | |
Publications (1)

Publication Number | Publication Date
---|---
WO2019130818A1 | 2019-07-04
Family ID: 67063412
Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
PCT/JP2018/040838 | Information processing device and group reconstruction method | 2017-12-25 | 2018-11-02

Country Status (1)

Country | Link
---|---
WO | WO2019130818A1 (en)
Patent Citations (2)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
WO2016117070A1 | 2015-01-22 | 2016-07-28 | 楽天株式会社 | Information processing device, information processing method, program, and recording medium
JP2016162339A | 2015-03-04 | 2016-09-05 | KDDI株式会社 | Program, terminal, and system for estimating activation of debate for each group
Legal Events

Date | Code | Title | Description
---|---|---|---
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 18894192; Country of ref document: EP; Kind code of ref document: A1
| NENP | Non-entry into the national phase | Ref country code: DE
| 122 | Ep: pct application non-entry in european phase | Ref document number: 18894192; Country of ref document: EP; Kind code of ref document: A1
| NENP | Non-entry into the national phase | Ref country code: JP