WO2024023901A1 - Communication terminal, comment output method, and program - Google Patents

Communication terminal, comment output method, and program

Info

Publication number
WO2024023901A1
Authority
WO
WIPO (PCT)
Prior art keywords
comment
communication terminal
topic
data
dialogue
Prior art date
Application number
PCT/JP2022/028670
Other languages
French (fr)
Japanese (ja)
Inventor
陽子 石井
桃子 中谷
晴美 齋藤
Original Assignee
日本電信電話株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電信電話株式会社
Priority to PCT/JP2022/028670
Publication of WO2024023901A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types

Definitions

  • the present disclosure relates to a technology that automatically outputs comments such as answers to topics in a dialog between multiple people.
  • the present invention aims to output comments devised by a person at an appropriate timing in the flow of dialogue.
  • the invention according to claim 1 provides a communication terminal that outputs comments during a dialogue between a plurality of participants, the communication terminal having:
  • a topic input unit that converts audio data indicating the content of the dialogue, including a topic of the dialogue, into text data;
  • a topic determination unit that determines the content of the topic based on words that appear a predetermined number of times or more within a predetermined time in the text data;
  • a comment selection unit that selects a predetermined comment to be output by acquiring comment data related to the determined content of the topic from a storage unit in which comment data devised by a person has been stored in advance; and
  • an output unit that outputs data of the selected predetermined comment.
  • FIG. 1 is an overall configuration diagram of a communication system according to this embodiment.
  • FIG. 2 is an image diagram of the dialogue at the base α.
  • FIG. 3 is an electrical hardware configuration diagram of each communication terminal and server according to the present embodiment.
  • FIG. 4 is a functional configuration diagram of the communication terminal 5.
  • FIG. 5 is a diagram showing the processing or operation of the communication terminal 5.
  • the communication terminal 5 according to the present embodiment provides specific improvements over the conventional technology, and the present embodiment represents an improvement in the technical field of automatically outputting comments, such as answers to topics, in a dialogue between multiple people.
  • FIG. 1 is an overall configuration diagram of a communication system according to an embodiment.
  • the communication system 1 of this embodiment is made up of a display device 2, a communication terminal 5, an input/output device 6, a server 7, a communication terminal 8, and a communication terminal 9.
  • the display device 2, the communication terminal 5, and the input/output device 6 are used at the base α (first base) during the dialogue.
  • the communication terminal 8 is used at the base β (second base) during the dialogue at the base α.
  • the communication terminal 9 is used by an arbitrary person among an unspecified number of people before the dialogue at the base α.
  • the server 7 is installed on a cloud or the like, and stores comments sent from the communication terminal 9 before the dialogue at the base α, as well as comments sent from the communication terminal at the base β during the dialogue at the base α. Further, the server 7 transmits the stored comments to the communication terminal 5 during the dialogue at the base α.
  • the communication terminals 5, 8, and 9 are PCs (Personal Computers) or the like, and are capable of communicating via a communication network 100 such as the Internet.
  • the display device 2 is a display or the like.
  • the input/output device 6 is a device that transmits video data obtained by photographing the surroundings and sound data obtained by collecting surrounding sounds to the communication terminal 5. This transmission method may be wired or wireless.
  • the communication terminal 5 acquires the video data and sound data output from the input/output device 6, and transmits the video data and sound data to the communication terminal 8 at the base β.
  • the communication terminal 8 at the base β may acquire video data and sound data at the base β and transmit them to the communication terminal 5 at the base α. Note that the communication terminal 5 will be explained in detail later using FIG. 4.
  • the server 7 serves as a DB (database) server.
  • the server 7 acquires information such as comments from the communication terminal 9 and stores each comment in association with identification information and, in some cases, a priority. Identification information and priority will be explained later.
  • the server 7 likewise acquires information such as comments from the communication terminal 8 and stores each comment in association with identification information and, in some cases, a priority. Additionally, the server 7 sends information such as comments to the communication terminal 5 in response to a request from the communication terminal 5 during the dialogue.
  • the communication terminal 8 at the base β is a PC or the like used by an observer who does not speak directly in the dialogue at the base α, but who views the video and audio of the dialogue at the base α.
  • the observer at the base β can grasp the flow of the dialogue at the base α and the state of the participants in the current dialogue.
  • An output device and an input device are connected to the communication terminal 8.
  • Examples of the output device include a headphone-type speaker that reproduces the audio of the dialogue, a display that plays video of the dialogue, a display that shows the content of the dialogue as text information, and headphones or speakers that read out the dialogue after it has been converted to text.
  • the input device is, for example, a microphone that allows voice input, a keyboard that allows character input, or the like.
  • the communication terminal 8 converts the voice of the input comment into text information.
  • the communication terminal 8 assigns identification information for classifying the content of a comment to each comment obtained from the observer. For example, if the topic asks about a place, the observer is asked to input latitude and longitude information on a map, and the latitude and longitude are stored in the server 7 as identification information. Alternatively, the observer may be asked to input tags such as parks, commercial facilities, or cultural facilities based on the content of the comment, and these tags may be used as identification information. The observer may also be asked to input arbitrary information to be used as identification information. The identification information may be input by a third party who is not an observer.
  • the identification information may be coordinate values.
  • the server 7 converts the collected comments into high-dimensional coordinates using APIs (Application Programming Interfaces) such as doc2vec and fast2text, and performs principal component analysis on these coordinates to reduce them to two or three dimensions. The converted coordinates are then used as identification information.
  • the server 7 may calculate the Euclidean distances between the coordinates of the collected comments, cluster the comments that are close to one another, categorize them based on the clusters, and use each category as identification information.
  • the category in this case may be the coordinates of the center of the group after clustering.
  • a priority order (such as a numerical value) in which comments are used in dialogue may be added to the identification information.
  • Priority is assigned by an observer or third party. The higher the priority, the higher the possibility of appearing in the dialogue, so if the observer has a strong desire to convey the message, the priority is set to "1," for example. There is no particular upper limit to the numerical value of the priority order, and any numerical value can be set.
  • the server 7 stores the comment, identification information, and/or priority in association with each other.
  • the communication terminal 9 is a PC or the like used, before the dialogue is held at the base α, by any person (respondent) among an unspecified number of people who are interested in the content of the dialogue or who meet predetermined conditions.
  • This respondent uses the communication terminal 9 to transmit his or her comments to the server 7, where they are saved, before the dialogue at the base α.
  • the predetermined conditions include, for example, people living in a specific place, age, gender, family composition, hobbies, and the like.
  • For example, before the dialogue at the base α, the server 7 sends a topic to people who live in a certain area and have joined a mailing list, and each respondent sends comments on the topic from his or her communication terminal 9 to the server 7 using an Internet form.
  • the server 7 adds identification information to the comments collected from each respondent and stores them.
  • the identification information is the same as that explained for the communication terminal 8 above, so the explanation will be omitted.
  • a priority order (such as a numerical value) in which comments are used in dialogue may be added to the identification information.
  • the priority order is also the same as that explained for the communication terminal 8 above, so the explanation will be omitted.
  • the server 7 stores the comment, identification information, and/or priority in association with each other.
  • FIG. 2 is an image diagram of the dialogue at the base α.
  • the input/output device 6 includes input means such as a microphone and a camera, and output means such as a speaker.
  • the display device 2 displays materials, etc. (shared screen).
  • the communication terminal 5 in FIG. 1 is one of the communication terminals of the participants in FIG. 2. A display device 2 and an input/output device 6 are connected to the communication terminal 5. Further, the number of participants at the base α may be any number as long as there are two or more.
  • FIG. 3 is an electrical hardware configuration diagram of the communication terminal.
  • the communication terminal 5 is a computer that includes a CPU 501, ROM 502, RAM 503, SSD 504, external device connection I/F (Interface) 505, network I/F 506, display 507, input device 508, media I/F 509, and bus line 510.
  • the CPU 501 as a processor controls the operation of the communication terminal 5 as a whole.
  • the ROM 502 stores programs used to start up the CPU 501, such as an IPL (Initial Program Loader).
  • RAM 503 is used as a work area for CPU 501.
  • the SSD 504 reads or writes various data under the control of the CPU 501. Note that an HDD (Hard Disk Drive) may be used instead of the SSD 504.
  • the external device connection I/F 505 is an interface for connecting various external devices.
  • External devices in this case include a display, speaker, keyboard, mouse, USB memory, printer, and the like.
  • the network I/F 506 is an interface for data communication via the communication network 100.
  • the display 507 is a type of display means such as liquid crystal or organic EL (Electro Luminescence) that displays various images.
  • the input device 508 is a keyboard, pointing device, etc., and is a type of input means for inputting, selecting, executing, etc. various instructions. Note that the input device 508 can be used in combination with an external keyboard and mouse.
  • the media I/F 509 controls reading or writing (storage) of data to a recording medium 509m such as a flash memory.
  • the recording media 509m also include DVDs, Blu-ray Discs (registered trademark), and the like.
  • the bus line 510 is an address bus, a data bus, etc. for electrically connecting each component such as the CPU 501 shown in FIG. 3.
  • the communication terminal 5 may be provided with at least one of a microphone, a camera, and a speaker. Further, since the server 7 and the communication terminals 8 and 9 have basically the same hardware configuration as the communication terminal 5, the description thereof will be omitted.
  • the communication terminal 8 at the base β is equipped with a microphone, a camera, and a speaker, which may be used instead of the above-mentioned input device and output device.
  • FIG. 4 is a functional configuration diagram of the communication terminal 5.
  • As shown in FIG. 4, the communication terminal 5 includes an initial value setting unit 50, a topic input unit 51, a topic determination unit 53, a comment selection unit 55, a participant emotion determination unit 57, and an output unit 59.
  • Each of these units is a function realized by instructions from the CPU 501 in FIG. 3 based on a program.
  • the initial value setting unit 50 accepts, from participants or others before the dialogue, the input of the volume s (dB) and the time T (seconds) that the comment selection unit 55 uses as criteria for determining that silence has continued during the dialogue.
  • the topic input unit 51 receives as input audio data indicating the content of the dialogue, including the topic being discussed. In the case of voice input using the microphone of the input/output device 6, the topic input unit 51 converts the voice input into text (character) data. The topic input unit 51 also acquires text (character) data such as comments from the communication terminal 5 or the server 7. Text data indicating the content of the dialogue, including the input topic, is sent to the topic determination unit 53. This text data is also sent to the communication terminal 8 at the base β as the content of the dialogue at the base α.
  • the topic determination unit 53 determines the content of the topic being discussed in the dialogue. To do so, the topic determination unit 53 morphologically analyzes the text data received from the topic input unit 51 and extracts predetermined words (for example, only nouns). The topic determination unit 53 treats the frequently appearing words among the extracted words as the content of the "topic". A word is judged to appear frequently if it appears a predetermined number of times (for example, three times) or more within a predetermined period of time (for example, 60 seconds). Note that any part of speech can be specified for the words to be extracted.
  • Alternatively, the topic determination unit 53 may convert the topic data in the text into high-dimensional coordinates using APIs such as doc2vec and fast2text, and perform principal component analysis on these coordinates to compress them to two or three dimensions. In this case, the value of the resulting coordinates indicates the "topic".
  • the obtained data indicating the topic is sent to the comment selection unit 55.
  • the comment selection unit 55 acquires the audio data of all participants present at the base α from the input/output device 6. Furthermore, the comment selection unit 55 analyzes the acquired audio data and determines whether "silence has continued for a predetermined period of time". For example, the comment selection unit 55 determines that "silence has continued for a predetermined period of time" when the sound volume remains below s decibels for T seconds.
  • upon acquiring topic data from the topic determination unit 53, the comment selection unit 55 acquires comment data related to the content of the topic from the server 7. Further, the comment selection unit 55 acquires participant emotion information from the participant emotion determination unit 57. Emotion information will be explained later. Then, the comment selection unit 55 selects a predetermined comment from among the comments acquired from the server 7, depending on the emotion of the participants during the dialogue indicated by the emotion information. The predetermined comment is sent to the output unit 59. Note that, depending on the content of the emotion, the comment selection unit 55 either does not select a predetermined comment to be output, or does not send an already selected predetermined comment to the output unit 59. Detailed processing of the comment selection unit 55 will be explained with reference to FIG. 5.
  • the participant emotion determination unit 57 determines the emotions of the participants present in the dialog.
  • the participant emotion determination unit 57 acquires at least one of video and audio data of a plurality of participants having a dialogue at the base α from the input/output device 6.
  • a headset microphone, a lavalier microphone, a gooseneck microphone, or the like may be used to individually acquire the voices of each participant.
  • a first example of the processing performed by the participant emotion determination unit 57 is disclosed in Reference Document 1.
  • a second example of the processing performed by the participant emotion determination unit 57 is disclosed in Reference Document 2.
  • human emotions can be predicted as numerical values based on the quality of the voice and the content of the words, using input such as uttered audio.
  • <Reference 2> https://group.ntt/jp/newsrelease/2021/11/01/211101b.html
  • the numerical values indicating the type of emotion and the degree of emotion obtained by the participant emotion determination unit 57 are sent to the comment selection unit 55 as "emotion information", which includes time information t indicating when the participant's utterance occurred.
  • the output unit 59 outputs comments devised by a person from the communication terminal 8 at appropriate timings in the flow of the dialogue.
  • the comment selection unit 55 acquires the audio data of all participants present at the base α from the input/output device 6.
  • the comment selection unit 55 analyzes the acquired audio data and determines whether "silence has continued for a predetermined period of time". For example, the comment selection unit 55 determines that "silence has continued for a predetermined period of time" when the sound volume remains below s decibels for T seconds. If the comment selection unit 55 does not determine that silence has continued for a predetermined period of time (S12; NO), the process returns to S11. On the other hand, if the comment selection unit 55 determines that silence has continued for a predetermined period of time (S12; YES), the process proceeds to S13.
  • the topic input unit 51 converts audio data indicating the content of the topic during the dialogue into text data, and the topic determination unit 53 determines the content of the topic based on words that have appeared a predetermined number of times or more within a predetermined time in the text data, and sends topic data to the comment selection unit 55.
  • the comment selection unit 55 acquires topic data from the topic determination unit 53.
  • the comment selection unit 55 selects a corresponding predetermined comment by searching for comment data related to predetermined identification information related to the topic among the identification information stored in the server 7. If there is one or more comments related to the predetermined identification information, the comment selection unit 55 selects one predetermined comment with the highest priority among them. If there are multiple comments with the same priority, the comment selection unit 55 randomly selects one predetermined comment from among these comments. Further, the comment selection unit 55 outputs a numerical value regarding the degree of similarity with the selected predetermined comment.
  • the comment selection unit 55 acquires, from the participant emotion determination unit 57, the emotion information (type of emotion, numerical value indicating the degree of emotion, time of utterance, and so on) of each participant in the dialogue taking place at the base α.
  • the comment selection unit 55 estimates the emotion of the dialogue based on the emotion information of each participant, as shown below.
  • the comment selection unit 55 adds up the numerical values for each type of emotion across all the participants at the base α, and determines the emotion with the largest total as the predetermined emotion of the dialogue.
  • Instead of working per emotion type, the comment selection unit 55 may divide the characteristics of the emotion types and numerical distributions of all participants into several categories and define an emotion of the dialogue for each category. In this case, the comment selection unit 55 determines, as the predetermined emotion of the dialogue, the emotion associated with the category whose characteristics are closest to the obtained emotion types and numerical distribution.
  • the comment selection unit 55 controls whether or not to output comments depending on the emotion of the dialogue. For example, when the type of emotion is "concentrated", the comment selection unit 55 does not send data of the predetermined comment to the output unit 59. This is because the participants are concentrating, so even if the predetermined comment is output, the participants may not see it or may ignore it. On the other hand, if the type of emotion is "excited" or "distracted", the comment selection unit 55 sends the predetermined comment data to the output unit 59.
  • the output unit 59 outputs the comment data received from the comment selection unit 55.
  • the output unit 59 may acquire the degree of similarity (numeric value) with a predetermined comment from the comment selection unit 55, add this degree of similarity (numeric value), and output the result.
  • the input/output device 6 may output audio, or the display device 2 may display text information.
  • the output content may be the comment as is (in the case of audio, the comment is read aloud using speech synthesis), or predetermined sentences, words, or conjunctions such as "I think" may be added before or after the comment.
  • the process of S12 above may be performed between the process of S16 and S17.
  • the communication terminal 5 since the communication terminal 5 has already selected the predetermined comment to be output when silence continues for the predetermined period of time, the communication terminal 5 can quickly output the predetermined comment. This is effective when the communication speed in the communication network is slow.
  • the process of S15 may be omitted. That is, the communication terminal 5 may output comments related to the topic without considering the atmosphere (level of emotion) of the participants at the base α.
  • the communication terminal 5 selects, from among a plurality of comments devised not by a computer but by people (observers, respondents), a predetermined comment related to the topic being discussed by the participants at the base α, and outputs it. It is therefore possible to output a person's comment that fits the topic at an appropriate timing in the flow of the dialogue.
  • the present invention is not limited to the above-described embodiments, and may have the following configuration or processing (operation).
  • the communication terminal 5 can also be realized by a computer and a program, and this program can be recorded on a (non-transitory) recording medium or provided via the communication network 100.
  • a notebook computer is shown as an example of the communication terminals 5, 8, and 9, but the invention is not limited to this; a desktop computer, a tablet terminal, a smartphone, etc. may be used.
  • Each CPU 501 may be a single CPU or a plurality of CPUs.
  • comments are stored on the server, but the present invention is not limited to this.
  • the comment may be stored in the RAM 503 or SSD 504 in the communication terminal 5.
  • the comment selection unit 55 reads comment data stored in the communication terminal 5 and selects a predetermined comment.
  • the RAM 503 or SSD 504 in the communication terminal 5 and the server 7 are examples of storage units.
  • a communication terminal having a processor that outputs comments during a dialogue between multiple participants, wherein the processor: converts audio data indicating the content of the dialogue, including the topic of the dialogue, into text data; determines the content of the topic based on words that appear a predetermined number of times or more within a predetermined time in the text data; selects a predetermined comment to be output by acquiring comment data related to the determined content of the topic from a storage unit that stores comment data devised by a person in advance; and outputs data of the selected predetermined comment.
  • the processor acquires comment data related to the content of the topic from the storage unit as comment data corresponding to identification information, based on the identification information that is associated with each piece of comment data and used to classify the comment content.
  • the communication terminal determines the emotions of the plurality of participants based on at least one of the video and audio of the plurality of participants at the place of the dialogue, and selects the predetermined comment to be output according to the emotions of the plurality of participants.
  • the communication terminal determines the emotions of the plurality of participants based on at least one of the video and audio of each of the participants at the place of the dialogue, and, depending on the emotions of the plurality of participants, does not select the predetermined comment to be output.
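
The silence check, the priority-based selection with random tie-breaking, and the emotion-based output control described in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function names, the (text, priority) and (emotion type, score) tuple layouts, the fixed sampling period, and the rule that only "excited" and "distracted" allow output are all hypothetical. Priority 1 is treated as the highest, following the example in the text.

```python
import random
from collections import defaultdict


def silence_continued(levels_db, s_db, t_seconds, sample_period_s):
    """True if the most recent volume samples stay below s_db for t_seconds."""
    needed = int(round(t_seconds / sample_period_s))
    if needed <= 0 or len(levels_db) < needed:
        return False
    return all(level < s_db for level in levels_db[-needed:])


def select_comment(comments, rng=random):
    """Pick the comment with the best priority; break ties at random.

    `comments` is a list of (text, priority) pairs (assumed layout);
    the smallest number is assumed to be the highest priority.
    """
    if not comments:
        return None
    best = min(priority for _, priority in comments)
    candidates = [text for text, priority in comments if priority == best]
    return rng.choice(candidates)


def dialogue_emotion(emotion_info):
    """Sum the scores per emotion type over all participants; return the largest."""
    totals = defaultdict(float)
    for emotion_type, score in emotion_info:
        totals[emotion_type] += score
    return max(totals, key=totals.get) if totals else None


def should_output(emotion):
    """Output only for the emotions the text names as suitable (assumption:
    every other emotion, including "concentrated", suppresses output)."""
    return emotion in ("excited", "distracted")


# Example wiring of the steps for one pass of the loop.
levels = [30.0, 28.0, 25.0, 27.0]            # hypothetical volume samples in dB
if silence_continued(levels, s_db=35.0, t_seconds=2.0, sample_period_s=0.5):
    comment = select_comment([("Try the riverside park", 1), ("Visit the museum", 2)])
    mood = dialogue_emotion([("excited", 0.6), ("concentrated", 0.5), ("excited", 0.2)])
    if should_output(mood):
        print(comment)
```

Keeping the three checks as separate functions mirrors the note in the text that the silence check may be moved later in the flow and that the emotion step may be omitted.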

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The purpose of the present disclosure is to output a comment proposed by a person at a suitable timing in the flow of a dialogue. To this end, the present disclosure is a communication terminal that outputs a comment during a dialogue between a plurality of participants, the communication terminal including: a topic input unit that converts, into text data, speech data that indicates the content of a dialogue that includes a topic of the dialogue; a topic determining unit that determines the content of the topic on the basis of a term that has appeared at least a prescribed number of times in a prescribed time period in the text data; a comment selecting unit that acquires data about a comment pertaining to the determined content of the topic from a storage unit in which the data about the comment proposed by a person has been previously stored, thereby selecting a prescribed comment to be outputted; and an output unit that outputs the data about the selected prescribed comment.

Description

Communication terminal, comment output method, and program
 The present disclosure relates to a technology that automatically outputs comments, such as answers to topics, in a dialogue between multiple people.
 In dialogue situations where participants share anecdotes, brainstorm ideas, or discuss to reach a common understanding, introducing the opinions of people other than the participants can liven up the conversation and curb biased decision-making by the participants, which makes a meaningful discussion possible.
 For this reason, a technique has been proposed in which a computer automatically generates or selects an appropriate answer sentence for a topic sentence (see Patent Document 1). A technique has also been proposed in which a computer generates natural utterance candidates for a user's utterances (see Patent Document 2).
Patent Document 1: JP 2020-4224 A
Patent Document 2: JP 2015-79383 A
 However, unlike the case where a person explicitly asks a computer a question and requests an answer, in a dialogue in which multiple participants are discussing among themselves it is difficult for a computer to output comments at an appropriate timing in the flow of the dialogue. In addition, if the participants recognize that a comment originated from a computer, they prioritize the dialogue among themselves and tend not to listen closely to the opinions voiced by the computer, or not to adopt them as one of the opinions.
 In order to solve the above-mentioned problems, the present invention aims to output comments devised by a person at an appropriate timing in the flow of dialogue.
 In order to solve the above problem, the invention according to claim 1 is a communication terminal that outputs comments during a dialogue between a plurality of participants, the communication terminal having: a topic input unit that converts audio data indicating the content of the dialogue, including a topic of the dialogue, into text data; a topic determination unit that determines the content of the topic based on words that appear a predetermined number of times or more within a predetermined time in the text data; a comment selection unit that selects a predetermined comment to be output by acquiring comment data related to the determined content of the topic from a storage unit in which comment data devised by a person has been stored in advance; and an output unit that outputs data of the selected predetermined comment.
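
The topic determination rule stated above (a word counts toward the topic when it appears a predetermined number of times or more within a predetermined time) can be sketched with a sliding time window. This is only an illustration under assumptions: the class name `TopicDetector`, its parameters, and the defaults of 60 seconds and 3 occurrences (taken from the examples given elsewhere in this document) are hypothetical, the words are assumed to have been extracted from the text data already, and timestamps are assumed to be non-decreasing.

```python
from collections import Counter, deque


class TopicDetector:
    """Flags words seen at least `min_count` times within `window_s` seconds."""

    def __init__(self, window_s=60.0, min_count=3):
        self.window_s = window_s
        self.min_count = min_count
        self._events = deque()    # (timestamp, word) pairs, oldest first
        self._counts = Counter()  # occurrences currently inside the window

    def add(self, timestamp, words):
        """Record words observed at `timestamp`; return the current topic words."""
        for word in words:
            self._events.append((timestamp, word))
            self._counts[word] += 1
        # Drop occurrences that have fallen out of the time window.
        while self._events and timestamp - self._events[0][0] > self.window_s:
            _, old_word = self._events.popleft()
            self._counts[old_word] -= 1
            if self._counts[old_word] == 0:
                del self._counts[old_word]
        return sorted(w for w, n in self._counts.items() if n >= self.min_count)


detector = TopicDetector(window_s=60.0, min_count=3)
detector.add(0.0, ["park", "station"])
detector.add(20.0, ["park"])
print(detector.add(40.0, ["park", "cafe"]))  # "park" has now appeared three times
```

A deque plus a counter keeps each update proportional to the number of words entering or leaving the window, which suits text that arrives continuously from speech recognition.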
 以上説明したように本発明によれば、対話の流れの中の適切なタイミングで、人が案出したコメントを出力することができるという効果を奏する。 As explained above, according to the present invention, it is possible to output comments devised by a person at an appropriate timing in the flow of dialogue.
FIG. 1 is an overall configuration diagram of a communication system according to the present embodiment.
FIG. 2 is an image diagram of a dialogue at the base α.
FIG. 3 is an electrical hardware configuration diagram of each communication terminal and the server according to the present embodiment.
FIG. 4 is a functional configuration diagram of the communication terminal 5.
FIG. 5 is a diagram showing the processing or operation of the communication terminal 5.
Hereinafter, embodiments of the present invention will be described with reference to the drawings. Note that the communication terminal 5 according to the present embodiment provides a specific improvement over the conventional technology, and the present embodiment represents an improvement in the technical field of automatically outputting comments, such as answers to topics, in a dialogue among multiple people.
[Overall system configuration]
FIG. 1 is an overall configuration diagram of a communication system according to an embodiment.
As shown in FIG. 1, the communication system 1 of the present embodiment is constructed from a display device 2, a communication terminal 5, an input/output device 6, a server 7, a communication terminal 8, and a communication terminal 9. The display device 2, the communication terminal 5, and the input/output device 6 are used at the base α (first base) during the dialogue. The communication terminal 8 is used at the base β (second base) during the dialogue at the base α. The communication terminal 9 is used by any person among an unspecified number of people before the dialogue at the base α. The server 7 is installed on a cloud or the like, and stores comments sent from the communication terminal 9 before the dialogue at the base α as well as comments sent from the communication terminal 8 at the base β during the dialogue at the base α. The server 7 also transmits the stored comments to the communication terminal 5 during the dialogue at the base α.
The communication terminals 5, 8, and 9 are PCs (Personal Computers) or the like, and can communicate via a communication network 100 such as the Internet.
The display device 2 is a display or the like.
Although two bases are shown in FIG. 1, communication among three or more bases is also possible. In this case, the number of communication terminals increases according to the number of bases.
The input/output device 6 is a device that transmits, to the communication terminal 5, video data obtained by photographing the surroundings and sound data obtained by collecting surrounding sounds. The transmission may be wired or wireless.
At the base α, the communication terminal 5 acquires the video data and sound data output from the input/output device 6 and transmits them to the communication terminal 8 at the base β. Similarly, the communication terminal 8 at the base β may acquire video data and sound data on the base β side and transmit them to the communication terminal 5 at the base α. The communication terminal 5 will be described in detail later with reference to FIG. 4.
<Server 7>
The server 7 serves as a DB (database) server. Before the dialogue at the base α, the server 7 acquires information such as comments from the communication terminal 9 and stores each comment in association with identification information and, in some cases, a priority. The identification information and the priority will be described later.
During the dialogue at the base α, the server 7 acquires information such as comments from the communication terminal 8 and stores each comment in association with identification information and, in some cases, a priority. Also during the dialogue, the server 7 sends information such as comments to the communication terminal 5 in response to requests from the communication terminal 5.
<Communication terminal 8>
The communication terminal 8 at the base β is a PC or the like used by an observer who does not speak directly in the dialogue at the base α but views the video and audio of that dialogue. By using the communication terminal 8, the observer at the base β can grasp the flow of the dialogue at the base α and how the participants are responding to the current dialogue. An output device and an input device are connected to the communication terminal 8.
Examples of the output device include a headphone-type speaker that reproduces the audio of the dialogue, a display that plays video showing the scene of the dialogue, a display that shows the content of the dialogue converted into text, and headphones or a speaker that read this text aloud.
Examples of the input device include a microphone that enables voice input and a keyboard that enables character input. When the observer uses voice input, the communication terminal 8 converts the voice of the input comment into text information.
Furthermore, the communication terminal 8 attaches, to each comment obtained from the observer, identification information for classifying the content of the comment. For example, if the topic asks about a place, the observer is asked to input latitude and longitude on a map, and this latitude and longitude are stored in the server 7 as the identification information. Alternatively, the observer may be asked to input tags such as "park", "commercial facility", or "cultural facility" based on the content of the comment, and these tags may be used as the identification information. The observer may also be asked to input any other information, which may likewise be used as the identification information. The identification information may be input by a third party other than the observer.
The identification information may also be numerical coordinate values. In this case, the server 7 converts the collected comments into high-dimensional coordinates using an API (Application Programming Interface) such as doc2vec or fast2text, reduces these coordinates to two or three dimensions by principal component analysis, and uses the resulting coordinates as the identification information. Alternatively, the server 7 may calculate the Euclidean distances between the coordinates of all the collected comments, cluster the comments that are close to one another, divide the comments into categories according to the resulting clusters, and use each category as the identification information. The category in this case may be the coordinates of the center of each group after clustering.
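The clustering variant described above can be sketched as follows. This is only a minimal sketch, assuming each comment has already been reduced to 2-D coordinates; the function names `cluster_by_distance` and `centroid`, the distance threshold, and the single-linkage grouping are illustrative assumptions and are not specified in this disclosure.

```python
import math

def cluster_by_distance(points, threshold):
    """Group point indices so that two points within `threshold`
    (Euclidean distance) of each other share a cluster (single linkage)."""
    n = len(points)
    parent = list(range(n))  # union-find forest over point indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Compare every pair of comments and merge the close ones.
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())

def centroid(points, members):
    """Center of a cluster, usable as the category's identification information."""
    dims = len(points[0])
    return tuple(sum(points[i][d] for i in members) / len(members)
                 for d in range(dims))

# Hypothetical 2-D coordinates of four collected comments.
coords = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.0, 5.4)]
groups = cluster_by_distance(coords, threshold=1.0)
categories = [centroid(coords, g) for g in groups]
```

With these hypothetical coordinates, the first two comments fall into one category and the last two into another, and each category is represented by the center of its group.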
Furthermore, a priority (a numerical value or the like) with which the comment is to be used in the dialogue may be added to the identification information. The priority is assigned by the observer or a third party. The higher the priority, the more likely the comment is to appear in the dialogue, so an observer who strongly wishes to convey a comment sets the priority to, for example, "1". There is no particular upper limit on the numerical value of the priority, and any value can be set. As a result, the server 7 stores the comment in association with the identification information and/or the priority.
<Communication terminal 9>
The communication terminal 9 is a PC or the like used, before the dialogue is held at the base α, by any person (respondent) among an unspecified number of people who are interested in the content of the dialogue or among people who meet predetermined conditions. Before the dialogue at the base α, the respondent uses the communication terminal 9 to send his or her comments to the server 7, where they are stored. The predetermined conditions include, for example, place of residence, age, gender, family composition, and hobbies. For example, the server 7 sends the topic, before the dialogue at the base α, to people who live in a certain area and are subscribed to a mailing list, and each respondent sends a comment on the topic from his or her communication terminal 9 to the server 7 using an Internet form.
In this case, the server 7 attaches identification information to the comments collected from the respondents and stores them. The identification information is the same as that described above for the communication terminal 8, so its description is omitted. A priority (a numerical value or the like) with which the comment is to be used in the dialogue may also be added to the identification information. The priority is likewise the same as that described above for the communication terminal 8, so its description is omitted. As a result, the server 7 stores the comment in association with the identification information and/or the priority.
[Usage image]
FIG. 2 is an image diagram of the dialogue at the base α.
As shown in FIG. 2, for example, four people participate in a meeting or the like held at the base α, and the input/output device 6 is installed on a desk 110. The input/output device 6 is composed of input means such as a microphone and a camera, and output means such as a speaker. The display device 2 displays materials and the like (a shared screen). The communication terminal 5 in FIG. 1 is the communication terminal of one of the participants in FIG. 2. The display device 2 and the input/output device 6 are connected to the communication terminal 5. The number of participants at the base α may be any number of two or more.
<Hardware configuration of communication terminal>
Next, the electrical hardware configuration of the communication terminal 5 will be described with reference to FIG. 3. FIG. 3 is an electrical hardware configuration diagram of the communication terminal.
As shown in FIG. 3, the communication terminal 5 is a computer that includes a CPU 501, a ROM 502, a RAM 503, an SSD 504, an external device connection I/F (Interface) 505, a network I/F 506, a display 507, an input device 508, a media I/F 509, and a bus line 510.
Of these, the CPU 501, serving as a processor, controls the operation of the communication terminal 5 as a whole. The ROM 502 stores programs used to start up the CPU 501, such as an IPL. The RAM 503 is used as a work area for the CPU 501.
The SSD 504 reads or writes various data under the control of the CPU 501. Note that an HDD (Hard Disk Drive) may be used instead of the SSD 504.
The external device connection I/F 505 is an interface for connecting various external devices. The external devices in this case include a display, a speaker, a keyboard, a mouse, a USB memory, a printer, and the like.
The network I/F 506 is an interface for data communication via the communication network 100.
The display 507 is a type of display means, such as a liquid crystal or organic EL (Electro Luminescence) display, that displays various images.
The input device 508 is a keyboard, a pointing device, or the like, and is a type of input means for inputting, selecting, and executing various instructions. The input device 508 can be used together with an external keyboard or mouse.
The media I/F 509 controls the reading or writing (storage) of data on a recording medium 509m such as a flash memory. The recording medium 509m may also be a DVD, a Blu-ray Disc (registered trademark), or the like.
The bus line 510 is an address bus, a data bus, or the like for electrically connecting the components shown in FIG. 3, such as the CPU 501.
The communication terminal 5 may be provided with at least one of a microphone, a camera, and a speaker. Since the server 7 and the communication terminals 8 and 9 have basically the same hardware configuration as the communication terminal 5, their description is omitted.
Note that the communication terminal 8 at the base β may be provided with a microphone, a camera, and a speaker, which may be used in place of the input device and output device described above.
[Functional configuration of communication terminal]
The functional configuration of the communication terminal 5 according to the present embodiment will be described. FIG. 4 is a functional configuration diagram of the communication terminal 5.
As shown in FIG. 4, the communication terminal 5 includes an initial value setting unit 50, a topic input unit 51, a topic determination unit 53, a comment selection unit 55, a participant emotion determination unit 57, and an output unit 59. Each of these units is a function realized by instructions from the CPU 501 in FIG. 3 based on a program.
The initial value setting unit 50 accepts, from a participant or the like before the dialogue, input of a volume s (dB) and a time T (seconds) that serve as the criteria by which the comment selection unit 55 determines that silence has continued during the dialogue.
The topic input unit 51 accepts as input audio data indicating the content of the dialogue, including the topic under discussion. In the case of voice input through the microphone of the input/output device 6, the topic input unit 51 converts the voice input into text (character) data. The topic input unit 51 also acquires text (character) data such as comments from the communication terminal 5 or the server 7. The text data indicating the content of the dialogue, including the input topic, is sent to the topic determination unit 53. This text data is also sent to the communication terminal 8 at the base β as the content of the dialogue at the base α.
The topic determination unit 53 determines the content of the topic being discussed in the dialogue. To do so, the topic determination unit 53 performs morphological analysis on the text data received from the topic input unit 51 and extracts predetermined words (for example, only nouns). Among the extracted words, the topic determination unit 53 determines that the frequently appearing words represent the content of the "topic". A word is judged to be frequent if it appears a predetermined number of times (for example, three times) or more within a predetermined time (for example, 60 seconds). Any part of speech can be specified for the words to be extracted. Alternatively, the topic determination unit 53 may convert the text received from the topic input unit 51 into high-dimensional coordinates using an API such as doc2vec or fast2text, and reduce these coordinates to two or three dimensions by principal component analysis. In this case, the resulting coordinate values indicate the "topic". The obtained data indicating the topic is sent to the comment selection unit 55.
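The frequency-based judgment described above can be sketched as follows. This is a minimal sketch, assuming the nouns have already been extracted by a morphological analyzer and paired with the time (in seconds) at which they were spoken; the function name `detect_topics` and the sample data are hypothetical, while the defaults of 60 seconds and 3 occurrences follow the examples given in the text.

```python
from collections import Counter

def detect_topics(timed_words, now, window_sec=60.0, min_count=3):
    """Return the words that occur at least `min_count` times within
    the `window_sec` seconds up to and including `now`."""
    recent = [word for t, word in timed_words if now - window_sec <= t <= now]
    counts = Counter(recent)
    return sorted(word for word, c in counts.items() if c >= min_count)

# Hypothetical (time, noun) pairs extracted from the transcribed dialogue.
words = [(2.0, "park"), (10.0, "park"), (15.0, "bus"),
         (40.0, "park"), (70.0, "bus")]
topics = detect_topics(words, now=60.0)
```

Here "park" appears three times in the first 60 seconds and is treated as the topic, while "bus" does not reach the threshold.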
The comment selection unit 55 acquires, from the input/output device 6, the audio data of all the participants present at the base α. The comment selection unit 55 then analyzes the acquired audio data and determines whether silence has continued for a predetermined time. For example, the comment selection unit 55 determines that silence has continued for the predetermined time when the volume remains at or below s decibels for T seconds.
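The silence criterion above (volume at or below s decibels for T seconds) might be checked along the following lines. This is a sketch under stated assumptions: the audio is assumed to have been reduced beforehand to one level reading in dB per fixed-length frame, and the function name `silence_continued` and its parameters are illustrative.

```python
def silence_continued(levels_db, frame_sec, s_db, t_sec):
    """Return True if the level stays at or below `s_db` for at least
    `t_sec` consecutive seconds somewhere in the sequence of frames."""
    quiet = 0.0  # length of the current run of quiet frames, in seconds
    for level in levels_db:
        quiet = quiet + frame_sec if level <= s_db else 0.0
        if quiet >= t_sec:
            return True
    return False

# Hypothetical per-second level readings at the base α.
levels = [50.0, 30.0, 30.0, 30.0, 55.0]
is_silent = silence_continued(levels, frame_sec=1.0, s_db=35.0, t_sec=3.0)
```

Any frame louder than s resets the run, so only uninterrupted quiet counts toward T.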
Upon acquiring the topic data from the topic determination unit 53, the comment selection unit 55 acquires comment data related to the content of the topic from the server 7. The comment selection unit 55 also acquires the participants' emotion information from the participant emotion determination unit 57. The emotion information will be described later. The comment selection unit 55 then selects a predetermined comment from among the comments acquired from the server 7 according to the emotions of the participants in the dialogue indicated by the emotion information. The predetermined comment is sent to the output unit 59. Depending on the emotion, the comment selection unit 55 does not select a predetermined comment to be output, or does not send an already selected predetermined comment to the output unit 59. The detailed processing of the comment selection unit 55 will be described with reference to FIG. 5.
The participant emotion determination unit 57 determines the emotions of the participants present in the dialogue. The participant emotion determination unit 57 acquires, from the input/output device 6, at least one of the video data and the audio data of the participants engaged in the dialogue at the base α. Instead of the microphone of the input/output device 6, headset microphones, lavalier microphones, gooseneck microphones, or the like may be used to capture each participant's voice individually.
A first example of the processing performed by the participant emotion determination unit 57 is disclosed in Reference 1. With the technique of Reference 1, a value representing what a person subjectively feels can be predicted numerically from camera video or from information on the presence or absence of speech.
<Reference 1> 大土隼平, 石井陽子, 中谷桃子, 大塚和弘, "Prediction of subjective impressions of dialogue participants in multi-party dialogue using head-movement functions", IEICE Technical Report, vol. 121, no. 143, HCS2021-20, pp. 19-24 (2021)
A second example of the processing performed by the participant emotion determination unit 57 is disclosed in Reference 2. With the technique of Reference 2, a person's emotion can be predicted as a numerical value from the voice quality, the content of the words, and the like, using uttered speech or similar data as input.
<Reference 2> https://group.ntt/jp/newsrelease/2021/11/01/211101b.html
The type of emotion and the numerical value indicating the degree of emotion obtained by the participant emotion determination unit 57 are sent to the comment selection unit 55 as "emotion information", together with utterance time information t indicating when the participant's utterance occurred.
The output unit 59 outputs comments devised by a person, from the communication terminal 8, at an appropriate timing in the flow of the dialogue.
[Processing or operation of communication terminal]
Next, the comment selection processing performed by the communication terminal 5 will be described with reference to FIG. 5. Note that the following processing is an example of the processing of each unit shown in FIG. 4.
S11: The comment selection unit 55 acquires, from the input/output device 6, the audio data of all the participants present at the base α.
S12: The comment selection unit 55 analyzes the acquired audio data and determines whether silence has continued for a predetermined time. For example, the comment selection unit 55 determines that silence has continued for the predetermined time when the volume remains at or below s decibels for T seconds. If the comment selection unit 55 does not determine that silence has continued for the predetermined time (S12; NO), the process returns to S11. If it determines that silence has continued for the predetermined time (S12; YES), the process proceeds to S13.
Here, beforehand, the topic input unit 51 converts the audio data indicating the content of the topic under discussion into text data, and the topic determination unit 53 determines the content of the topic based on words that appear a predetermined number of times or more within a predetermined time in the text data, and sends the topic data to the comment selection unit 55.
S13: The comment selection unit 55 acquires the topic data from the topic determination unit 53.
S14: The comment selection unit 55 searches the identification information stored in the server 7 for comment data whose identification information is related to the topic, and thereby selects the corresponding predetermined comment. If there are one or more comments with related identification information, the comment selection unit 55 selects the one predetermined comment with the highest priority among them. If multiple comments share the same priority, the comment selection unit 55 selects one of them at random as the predetermined comment. The comment selection unit 55 also outputs, as a numerical value, the degree of similarity to the selected predetermined comment.
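The selection rule in S14 can be sketched as follows. This is a minimal sketch with several assumptions: identification information is simplified to a single tag that must match the topic exactly, a smaller priority number is taken to mean a higher priority (the text gives "1" as the value for a strongly held comment but does not state the ordering explicitly), and the names `StoredComment` and `select_comment` are hypothetical.

```python
import random
from dataclasses import dataclass

@dataclass
class StoredComment:
    text: str
    tag: str       # identification information (simplified to one tag)
    priority: int  # assumption: a smaller number means a higher priority

def select_comment(comments, topic_tag, rng=random):
    """Pick the highest-priority comment related to the topic,
    breaking ties at random; return None if nothing is related."""
    related = [c for c in comments if c.tag == topic_tag]
    if not related:
        return None
    best = min(c.priority for c in related)
    return rng.choice([c for c in related if c.priority == best])

# Hypothetical comments stored in advance on the server 7.
stored = [
    StoredComment("The riverside park is quiet in the morning.", "park", 1),
    StoredComment("The park has a new playground.", "park", 2),
    StoredComment("The mall is crowded on weekends.", "commercial facility", 1),
]
chosen = select_comment(stored, "park")
```

Passing a seeded `random.Random` instance as `rng` makes the tie-breaking reproducible, which can be convenient when testing.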
S15: The comment selection unit 55 acquires, from the participant emotion determination unit 57, the emotion information (the type of emotion, the numerical value indicating the degree of emotion, and the utterance time) of each participant in the dialogue taking place at the base α.
S16: Based on the emotion information of each participant, the comment selection unit 55 estimates the emotion of the dialogue as a whole, as described below.
For example, the comment selection unit 55 sums the numerical values of all the participants at the base α for each type of emotion, and determines the emotion with the largest total to be the predetermined emotion of the dialogue.
Alternatively, instead of dividing by type of emotion, the comment selection unit 55 may classify the characteristics of the distribution of emotion types and values over all the participants into several categories in advance, and define an emotion of the dialogue for each category. In this case, the comment selection unit 55 determines the emotion associated with the category whose characteristics are closest to the obtained distribution of emotion types and values to be the predetermined emotion of the dialogue.
Examples of the emotion of the dialogue include "lively", "concentrated", and "distracted".
The comment selection unit 55 controls whether or not a comment is output for each emotion of the dialogue. For example, when the emotion is "concentrated", the comment selection unit 55 does not send the data of the predetermined comment to the output unit 59. This is because the participants are concentrating, so even if the predetermined comment were output, they might not look at it or might ignore it. On the other hand, when the emotion is "lively" or "distracted", the comment selection unit 55 sends the data of the predetermined comment to the output unit 59.
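The first estimation method in S16 (summing the values per emotion type and taking the largest total) and the resulting output control can be sketched as follows. This is only an illustrative sketch: the tuple layout of the emotion information, the English emotion labels, the `SUPPRESS` set, and the function names are assumptions, and the behavior when no emotion information is available is not specified in the text.

```python
from collections import defaultdict

# Emotions of the dialogue for which no comment is output (per the example in S16).
SUPPRESS = {"concentrated"}

def scene_emotion(emotion_info):
    """Sum the values per emotion type over all participants and
    return the type with the largest total (None if there is no data)."""
    totals = defaultdict(float)
    for kind, score, _utterance_time in emotion_info:
        totals[kind] += score
    if not totals:
        return None
    return max(totals, key=totals.get)

def should_output(emotion_info):
    """Decide whether the selected comment is passed to the output unit."""
    return scene_emotion(emotion_info) not in SUPPRESS

# Hypothetical emotion information: (type, degree, utterance time t).
info = [("lively", 0.4, 1.0), ("concentrated", 0.9, 2.0), ("lively", 0.7, 3.0)]
emotion = scene_emotion(info)
send = should_output(info)
```

In this example "lively" totals about 1.1 against 0.9 for "concentrated", so the comment would be passed on to the output unit.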
S17: The output unit 59 outputs the comment data received from the comment selection unit 55. The output unit 59 may also acquire the degree of similarity (numerical value) to the predetermined comment from the comment selection unit 55 and output it together with the comment. As an output method, the comment may be output as audio from the input/output device 6, or displayed as text on the display device 2. As for the output content, the comment may be output as it is (in the case of audio, read aloud as it is by speech synthesis), or predetermined sentences, words, conjunctions, or the like may be added before and after the comment, as in "I think that ...".
Note that the processing of S12 may instead be performed between the processing of S16 and that of S17. In this case, by the time silence has continued for the predetermined time, the communication terminal 5 has already selected the predetermined comment to be output, so it can output the predetermined comment quickly. This is effective when, for example, the communication speed of the communication network is slow.
The processing of S15 may also be omitted. That is, the communication terminal 5 may output a comment related to the topic without considering the atmosphere (degree of emotion) of the participants at the base α.
This concludes the description of FIG. 5.
[Effects of the embodiment]
As described above, according to the present embodiment, the communication terminal 5 outputs, from among a plurality of comments devised by people (observers, respondents) rather than by a computer, a predetermined comment related to the topic raised by the participants at the base α. A human comment that fits the topic can therefore be output at an appropriate timing in the flow of the dialogue.
Furthermore, because the output of the predetermined comment is controlled according to the degree of emotion of the participants at the base α, the predetermined comment can be output at an even more appropriate timing in the flow of the dialogue.
●Supplement
The present invention is not limited to the embodiment described above and may have the following configurations or processing (operations).
(1) The communication terminal 5 can also be realized by a computer and a program; this program can be recorded on a (non-transitory) recording medium or provided via the communication network 100.
(2) In the above embodiment, a notebook computer is shown as an example of the communication terminals 5, 8, and 9, but the terminals are not limited to this and may be, for example, a desktop computer, a tablet terminal, or a smartphone.
(3) Each CPU 501 may be a single CPU or a plurality of CPUs.
(4) In the above embodiment, the comments are stored on the server, but the present invention is not limited to this. For example, the comments may be stored in the RAM 503 or the SSD 504 in the communication terminal 5. In this case, the comment selection unit 55 reads the comment data stored in the communication terminal 5 and selects the predetermined comment. Note that the RAM 503 or the SSD 504 in the communication terminal 5 and the server 7 are examples of the storage unit.
●Additional notes
The embodiment described above can also be expressed as the inventions shown below.
[Additional Note 1]
A communication terminal having a processor that outputs comments during a dialogue between a plurality of participants, wherein
the processor:
converts audio data indicating the content of the dialogue, including a topic under discussion, into text data;
determines the content of the topic based on words that appear a predetermined number of times or more within a predetermined time in the text data;
selects a predetermined comment to be output by acquiring, from a storage unit that stores data of comments devised in advance by a person, data of a comment related to the determined content of the topic; and
outputs the data of the selected predetermined comment.
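The topic determination step in Additional Note 1 (words appearing a predetermined number of times or more within a predetermined time) can be sketched as follows. This is an illustration under stated assumptions: the input is taken to be already tokenized words with timestamps in seconds, and the function and parameter names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def determine_topics(timed_words: Iterable[Tuple[float, str]],
                     now: float,
                     window_s: float,
                     min_count: int) -> List[str]:
    """Return words seen at least min_count times in the last window_s seconds."""
    # Keep only words whose timestamp falls inside the time window ending at `now`.
    recent = [word for t, word in timed_words if now - window_s <= t <= now]
    counts = Counter(recent)
    # Sort for a deterministic result when several words qualify.
    return sorted(word for word, c in counts.items() if c >= min_count)
```

With a 30-second window and a threshold of three occurrences, a word repeated three times in the last half minute is reported as a topic, while earlier mentions outside the window are ignored.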
[Additional Note 2]
The communication terminal according to Additional Note 1, wherein the processor, based on identification information that is associated with the data of each comment and classifies the content of the comment, acquires from the storage unit the data of the comment related to the content of the topic as the data of the comment corresponding to the identification information.
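The lookup by identification information in Additional Note 2 might be sketched as below. The record type, its field names, and the use of an exact match between the topic and the identification information are assumptions for illustration; the note does not specify how the correspondence is evaluated.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class StoredComment:
    tag: str   # identification information classifying the comment content (hypothetical field)
    text: str  # comment devised in advance by a person


def comments_for_topic(storage: Iterable[StoredComment], topic: str) -> List[str]:
    """Return the text of every stored comment whose tag matches the topic."""
    return [c.text for c in storage if c.tag == topic]
```

The same function works whether the records are loaded from a server or from local storage in the terminal, since it only iterates over the records it is given.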
[Additional Note 3]
The communication terminal according to Additional Note 1 or 2, wherein
the processor:
determines the emotions of the plurality of participants based on at least one of video and audio of the plurality of participants in the dialogue setting; and
selects the predetermined comment to be output according to the emotions of the plurality of participants.
[Additional Note 4]
The communication terminal according to Additional Note 1 or 2, wherein
the processor:
determines the emotions of the plurality of participants based on at least one of video and audio of each of the participants in the dialogue setting; and
depending on the emotions of the plurality of participants, does not select the predetermined comment to be output.
[Additional Note 5]
The communication terminal according to Additional Note 1 or 2, wherein the processor determines, as the content of the topic, the content of a word that appears a predetermined number of times or more within a predetermined time among a plurality of words extracted by morphological analysis of the text data.
[Additional Note 6]
The communication terminal according to Additional Note 1 or 2, wherein the processor outputs the data of the predetermined comment when silence has continued for a predetermined time in the dialogue setting.
[Additional Note 7]
A comment output method executed by a communication terminal having a processor that outputs comments during a dialogue between a plurality of participants, wherein
the processor:
converts audio data indicating the content of the dialogue, including a topic under discussion, into text data;
determines the content of the topic based on words that appear a predetermined number of times or more within a predetermined time in the text data;
selects a predetermined comment to be output by acquiring, from a storage unit that stores data of comments devised in advance by a person, data of a comment related to the determined content of the topic; and
outputs the data of the selected predetermined comment.
[Additional Note 8]
A non-transitory recording medium on which is recorded a program that causes a computer to execute the method according to Additional Note 7.
1 Communication system
2 Display device
5 Communication terminal
6 Input/output device
7 Server (an example of a storage unit)
8 Communication terminal
9 Communication terminal
50 Initial value setting unit
51 Topic input unit
53 Topic determination unit
55 Comment selection unit
57 Participant emotion determination unit
59 Output unit
503 RAM (an example of a storage unit)
504 SSD (an example of a storage unit)

Claims (8)

  1.  A communication terminal that outputs comments during a dialogue between a plurality of participants, comprising:
     a topic input unit that converts audio data indicating the content of the dialogue, including a topic under discussion, into text data;
     a topic determination unit that determines the content of the topic based on words that appear a predetermined number of times or more within a predetermined time in the text data;
     a comment selection unit that selects a predetermined comment to be output by acquiring, from a storage unit that stores data of comments devised in advance by a person, data of a comment related to the determined content of the topic; and
     an output unit that outputs the data of the selected predetermined comment.
  2.  The communication terminal according to claim 1, wherein the comment selection unit, based on identification information that is associated with the data of each comment and classifies the content of the comment, acquires from the storage unit the data of the comment related to the content of the topic as the data of the comment corresponding to the identification information.
  3.  The communication terminal according to claim 1 or 2, further comprising
     a participant emotion determination unit that determines the emotions of the plurality of participants based on at least one of video and audio of the plurality of participants in the dialogue setting, wherein
     the comment selection unit selects the predetermined comment to be output according to the emotions of the plurality of participants.
  4.  The communication terminal according to claim 1 or 2, further comprising
     a participant emotion determination unit that determines the emotions of the plurality of participants based on at least one of video and audio of the plurality of participants in the dialogue setting, wherein
     the comment selection unit, depending on the emotions of the plurality of participants, does not select the predetermined comment to be output.
  5.  The communication terminal according to claim 1 or 2, wherein the topic determination unit determines, as the content of the topic, the content of a word that appears a predetermined number of times or more within a predetermined time among a plurality of words extracted by morphological analysis of the text data.
  6.  The communication terminal according to claim 1 or 2, wherein the output unit outputs the data of the predetermined comment when silence has continued for a predetermined time in the dialogue setting.
  7.  A comment output method executed by a communication terminal that outputs comments during a dialogue between a plurality of participants, wherein
     the communication terminal:
     converts audio data indicating the content of the dialogue, including a topic under discussion, into text data;
     determines the content of the topic based on words that appear a predetermined number of times or more within a predetermined time in the text data;
     selects a predetermined comment to be output by acquiring, from a storage unit that stores data of comments devised in advance by a person, data of a comment related to the determined content of the topic; and
     outputs the data of the selected predetermined comment.
  8.  A program that causes a computer to execute the method according to claim 7.
PCT/JP2022/028670 2022-07-25 2022-07-25 Communication terminal, comment output method, and program WO2024023901A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/028670 WO2024023901A1 (en) 2022-07-25 2022-07-25 Communication terminal, comment output method, and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/028670 WO2024023901A1 (en) 2022-07-25 2022-07-25 Communication terminal, comment output method, and program

Publications (1)

Publication Number Publication Date
WO2024023901A1 (en) 2024-02-01

Family

ID=89705765

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/028670 WO2024023901A1 (en) 2022-07-25 2022-07-25 Communication terminal, comment output method, and program

Country Status (1)

Country Link
WO (1) WO2024023901A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000339250A (en) * 1999-05-25 2000-12-08 Nippon Telegr & Teleph Corp <Ntt> Dialog message generating method, its device and medium in which its program is recorded
JP2018097185A (en) * 2016-12-14 2018-06-21 パナソニックIpマネジメント株式会社 Voice dialogue device, voice dialogue method, voice dialogue program and robot
JP2021135426A (en) * 2020-02-28 2021-09-13 ホロアッシュインク Online conversation support method
JP2021144370A (en) * 2020-03-11 2021-09-24 本田技研工業株式会社 Vehicle share-ride support system and vehicle share-ride support method
JP2022038423A (en) * 2020-08-26 2022-03-10 トヨタ自動車株式会社 Vehicular agent device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22952999

Country of ref document: EP

Kind code of ref document: A1