WO2020194828A1 - Information processing system, information processing device, and information processing method - Google Patents
- Publication number
- WO2020194828A1 (PCT/JP2019/042079)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- information
- receiver
- sender
- processing
- communication
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M11/00—Telephonic communication systems specially adapted for combination with other electrical systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/56—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/15—Conference systems
Definitions
- The present invention relates to an information processing system, an information processing device, an information processing method, and a program.
- Patent Document 1 discloses a technique for a video conferencing system that adjusts the resolution and frame rate according to whether a person appears in the image data, in order to keep the amount of image data to be transmitted within the network bandwidth.
- The present invention has been made in view of such a situation, and an object of the present invention is to provide a technique for assisting smooth communication between a sender and a receiver of information.
- The information processing system of one aspect of the present invention is used for communication by a predetermined method between a sender and a receiver. It includes an information acquisition means for acquiring, as original information, information about an object transmitted from the sender to the receiver by the predetermined method; a determination means for determining a method for processing at least a part of the original information; a processing means for processing the original information based on the method; and a presenting means for presenting the information processed by the processing means to the receiver. The processing means changes at least a part of the original information as the receiver prefers, without changing the content itself that the sender wants to convey through the predetermined communication.
- The information processing system of another aspect of the present invention is used for communication by a predetermined method between a sender and a receiver, in which at least the receiver uses a device for realizing the predetermined method. It includes an information acquisition means for acquiring from the sender, as original information, information about an object transmitted from the sender to the receiver by the predetermined method; a determination means for determining a method for processing at least a part of the original information so as to suit the receiver; a processing means for processing the original information based on the method; and a presenting means for causing the device to present the information processed by the processing means to the receiver.
- The processing means changes at least a part of the original information as the receiver prefers, without changing the content itself that the sender wants to convey through the predetermined communication.
- The information processing device of one aspect of the present invention is used for communication by a predetermined method between a sender and a receiver, and can communicate with a device used by the receiver in the communication.
- The information processing device includes a means for transmitting the processed information to the device. The processing means changes at least a part of the acquired information as the receiver prefers, without changing the content itself that the sender wants to convey through the predetermined communication, and the transmitted processed information is presented to the receiver by the device.
- The information processing device of another aspect of the present invention is used by the receiver in communication by a predetermined method between the sender and the receiver, and acquires information transmitted from the sender to the receiver by the predetermined method.
- The processing means changes at least a part of the acquired information as the receiver prefers, without changing the content itself that the sender wants to convey through the predetermined communication.
- The information processing device of a further aspect of the present invention is used for predetermined communication between the sender and the receiver, and can communicate with both a first device used by the sender and a second device used by the receiver in the communication.
- From the first device, it handles original information that is input by the sender by a transmission method of a first form and that indicates the content to be transmitted from the sender to the receiver.
- FIG. 1 is a diagram showing an outline of a service to which the information processing system of one embodiment of the present invention is applied.
- FIG. 2 is a diagram showing an example, different from that of FIG. 1, of a service to which the information processing system of one embodiment of the present invention is applied.
- FIG. 3 is a system block diagram of the information processing system of one embodiment of the present invention.
- FIG. 4 is a hardware block diagram of the server of one embodiment of the present invention.
- FIG. 5 is a functional block diagram of the server of one embodiment of the present invention.
- FIG. 6 is a flowchart explaining an example of the personalization process of one embodiment of the present invention.
- FIG. 7 is a diagram showing an example, different from those of FIG. 1 and FIG. 2, of a service to which the information processing system of one embodiment of the present invention is applied.
- FIG. 8 is a diagram showing an example, different from those of FIG. 1, FIG. 2, and FIG. 7, of a service to which the information processing system of one embodiment of the present invention is applied.
- FIG. 9 is a diagram showing an outline of a service to which the information processing system of one embodiment of the present invention is applied.
- FIG. 10 is a diagram showing a specific example of personalization in a personalization service.
- FIG. 11 is a system block diagram of the information processing system of one embodiment of the present invention.
- FIG. 12 is a functional block diagram of the server of one embodiment of the present invention.
- FIG. 13 is a diagram showing an outline of a service to which the information processing system of one embodiment of the present invention is applied.
- FIG. 1 is a diagram showing an outline of a service (hereinafter, referred to as “the service”) to which the information processing system according to the embodiment of the present invention is applied.
- The service concerns communication between the sender S and the receiver R of information that uses a device (interface) for realizing a predetermined information transmission method (video conference, telephone, e-mail, chat, etc.).
- A predetermined device is used in this communication, together with the elements related to the communication from the sender S (the facial expression and gestures of the sender S, the transmission medium such as voice, and so on).
- Such services include, for example, video and audio communication services such as video conferencing services and virtual assistants, audio communication services such as telephone services, text communication services such as e-mail services and chat, and services in which video, audio, and text are appropriately mixed. Furthermore, this service can also be applied to communication in which the receiver R wears smart glasses, a head-mounted display, or a hearable device (including a hearing aid) and meets the sender S face to face.
- This service processes at least a part of the original information IS, that is, the information about an object transmitted from the sender S to the receiver R by the information transmission method in use (hereinafter referred to as "original information"). The processing follows a method determined for the receiver R, preferably based on a mode suitable for the receiver R (hereinafter referred to as "processing method"), and the result is presented to the receiver R.
- When the method of information transmission is a video conference, the object to be transmitted is, for example, the physical manner of the sender at the time of communication (facial expression, gestures, etc.) and the voice indicating the content to be conveyed. When the method is a telephone, the object to be transmitted is the voice indicating the content to be conveyed. When the method is e-mail or chat, the object to be transmitted is a character string (a sentence or the like) indicating the content to be conveyed.
- The original information therefore includes not only the information indicating the content to be conveyed but also information that the sender S emits in the communication of the corresponding method and that does not necessarily represent that content itself.
- The form of the original information IS is not particularly limited, and various forms may be included.
- For example, the original information IS includes an image (a "moving image" or a "still image") showing at least a part of the body (face, upper body, whole body, etc.) of the sender S in a certain communication, a sound emitted by the sender S, and information about a sentence or the like indicating the content that the sender S wants to convey.
- In communication with the sender S, the receiver R thus does not acquire the original information IS transmitted from the sender S as it is, but acquires the original information IS after it has been processed in a way determined to suit the receiver R, for example according to his or her taste (hereinafter referred to as "processed information IP"). As a result, even if the original information IS transmitted from the sender S contains content that causes stress to the receiver R, the processed information IP has been processed so that the receiver R is less likely to feel that stress. The receiver R can therefore reduce the stress felt in the target communication.
- In step SS1, the receiver R inputs a request for the desired processing method to the receiving terminal 2a described later as the request information IR, and the receiving terminal 2a transmits that information to the server 1a described later.
- Here, the request information IR is "information instructing that the face of the sender S be changed to a smile".
- The method by which the receiver R inputs the request information IR and the content of the specific request are not particularly limited. For example, the receiver R may select from a list presented via an application described later. In this case, the receiver R may select from a list of processing items such as "replace the head of the sender S with the head of the character A" and "change the facial expression of the sender S to a smile". When the head is to be replaced, the character A may also be selected by the receiver R. Specific examples of the processing method and the contents of the request information IR will be described later.
- In step SS2, the server 1a determines the processing method of the original information IS based on the request information IR transmitted from the receiving terminal 2a.
- Here, the processing method is to change the face detected in a given image to a smile.
- In step SS3, the sender S performs an operation for communicating with the receiver R by using the transmitting terminal 3a (determined for each type of service) as the interface of the service in use. Since FIG. 1 shows a video conference, when the sender S talks in front of the camera provided in the transmitting terminal 3a, the camera captures the sender S, and the transmitting terminal 3a acquires the captured image and the voice as the original information and transmits it to the server 1a. The server 1a acquires the original information IS transmitted from the transmitting terminal 3a.
- If the service is a telephone service, the sender S inputs voice to the microphone of the transmitting terminal 3a; if the service is e-mail or chat, the sender S inputs the content to be conveyed via the input unit (keyboard or touch panel) of the transmitting terminal 3a. In either case, the transmitting terminal 3a thereby acquires the original information.
- In step SS4, the server 1a processes the original information IS acquired in step SS3 according to the processing method determined in step SS2 (hereinafter referred to as "personalization processing").
- Specifically, the server 1a processes the image included in the original information. For example, the server 1a performs face detection on the image included in the acquired original information and changes the detected face to a smile. Facial expression analysis may also be performed at this time: if the face detected in the original information is already smiling, the change need not be made, or the face may be changed to a smile only when the analysis finds a predetermined facial expression (expressionlessness, anger, etc.).
- The server 1a transmits the processed original information IS, as the processed information IP, to the receiving terminal 2a, which is the device used by the receiver R for this service.
- In step SS5, the receiving terminal 2a receives the processed information IP transmitted from the server 1a and presents it to the receiver R in accordance with this service.
- The above is the outline of this service.
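The flow of steps SS1 to SS5 can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Frame` type, the `METHODS` table, and the function names do not appear in the source, and a real system would apply face detection and image synthesis rather than rewrite a label. The point it shows is that the processing changes the presented expression while leaving the conveyed content untouched.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Frame:
    """Toy stand-in for one unit of original information IS (hypothetical)."""
    speech: str      # the content the sender wants to convey
    expression: str  # e.g. "neutral", "angry", "smile"


def to_smile(frame: Frame) -> Frame:
    """Change the expression to a smile; a frame that already smiles is left as is."""
    if frame.expression == "smile":
        return frame
    # Only the expression changes; the speech (conveyed content) is preserved.
    return Frame(speech=frame.speech, expression="smile")


# Hypothetical table mapping request information IR to a processing method.
METHODS: Dict[str, Callable[[Frame], Frame]] = {
    "change the sender's face to a smile": to_smile,
}


def determine_method(request_ir: str) -> Callable[[Frame], Frame]:
    """Step SS2: determine the processing method from the request information IR."""
    return METHODS[request_ir]


def personalize(original_is: Frame, method: Callable[[Frame], Frame]) -> Frame:
    """Step SS4: process the original information IS into processed information IP."""
    return method(original_is)


request_ir = "change the sender's face to a smile"          # SS1: receiver's request
method = determine_method(request_ir)                       # SS2: server picks a method
original_is = Frame("The deadline is Friday.", "neutral")   # SS3: sender's input
processed_ip = personalize(original_is, method)             # SS4: personalization
print(processed_ip)                                         # SS5: present to receiver
```

Keeping the request-to-method lookup separate from the processing step mirrors the split between steps SS2 and SS4 in the description.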
- The image IS1 included in the original information IS transmitted from the sender S is a captured image showing the expressionless sender S.
- In contrast, the image IP1, which corresponds to the image IS1 and is included in the processed information IP acquired by the receiver R, shows the sender S smiling.
- That is, in this example, the image capturing the expressionless sender S is processed, based on the request of the receiver R, into an image showing the smiling sender S and is presented to the receiver R. The receiver R therefore sees the image of the smiling sender S instead of the image of the expressionless sender S, and does not have to wonder what the sender is thinking behind a blank expression. As a result, stress can be reduced without worrying about unnecessary things. This effect is also useful from the standpoint of the sender S, who can give a good impression to the receiver R without causing needless anxiety.
- It is assumed that a predetermined application (hereinafter referred to as "application") provided by the provider of this service or the like is installed in advance on each of the receiving terminal 2a and the transmitting terminal 3a, and that the various information required to provide the service, including by the server 1a, is stored in advance in a DB or the like (not shown). For example, when this service is a video conference, the applications installed on the receiving terminal 2a and the transmitting terminal 3a are video conference applications.
- FIG. 2 is a diagram showing an example different from the example of FIG. 1 with respect to a service to which the information processing system of one embodiment of the present invention is applied.
- FIG. 2 shows examples of images obtained by processing the image IS1 included in the original information IS, as variations of the image IP1.
- The image IP2, processed to "replace the head of the sender S with the head of the character A", is displayed.
- The receiver R may want such a processing method when, for example, the receiver R has an aversion to the face of the sender S, so that being presented with an image of that face would diminish the receiver R's motivation to communicate. In such a case, the receiver R can use the processing method of replacing the head of the sender S with the head of a favorite character A as a countermeasure against this loss of motivation.
- The image IP3, processed to "mask the eye portion of the sender S", is displayed.
- The receiver R may request such a processing method when, for example, the face of the sender S in the captured image feels intimidating because of its sharp eyes. By hiding the eye portion of the image of the sender S's face, the receiver R can aim to remove the cause of the intimidation and concentrate better on the communication.
- The image IP4, processed to "add a gesture of dynamically moving the hand as the movement of the sender S", is displayed.
- Such processing can be requested, for example, when the receiver R gets a bad impression of a sender S who speaks flatly without moving his or her body. It is useful in that it removes the bad impression of the sender S and reduces the stress related to communication.
- The specific processing method may be determined in step SS1 according to the wishes of the receiver R, or may be determined randomly by the server 1a.
- Specific processing examples in this service have been described above. To summarize, when the original information IS is information about an image presumed to show the head of the sender S, the processing methods can be roughly classified into five types: a process of changing at least a part of the body of the sender S, a process of replacing at least a part of the body of the sender S, a process of hiding at least a part of the body of the sender S, a process of changing the movement of the sender S, and a process of adding something to the body of the sender S.
- These are merely examples, and the provider of this service may adopt any other processing method.
- The cases in which the original information IS is information related to voice or information related to characters will be described later with reference to FIGS. 7 and 8.
- FIG. 3 is a system configuration diagram of an information processing system according to an embodiment of the present invention.
- The information processing system shown in FIG. 1 includes a server 1a managed by the provider of this service, a receiving terminal 2a used by the receiver R, and a transmitting terminal 3a used by the sender S.
- The exchange of information may be interactive (for example, video conferencing or telephone). In that case, the side that sends information at a given time is the sender S and the side that receives it is the receiver R, and the two roles are interchanged depending on the situation. Therefore, in this system, the receiving terminal 2a may act as the transmitting terminal, and the transmitting terminal 3a may act as the receiving terminal.
- That is, in the present specification, the "receiving terminal" and the "transmitting terminal" do not mean reception-only and transmission-only devices (although they may be). At a given timing of communication, the device on the side transmitting information is referred to as the transmitting terminal, and the device on the side receiving that information is referred to as the receiving terminal. When a head-mounted display or hearable device is used, the transmitting terminal may not exist.
- The server 1a, the receiving terminal 2a, and the transmitting terminal 3a are connected to each other via a predetermined network N such as the Internet.
- FIG. 4 is a block diagram showing an example of the hardware configuration of the server 1a in the information processing system according to the embodiment of the present invention.
- The server 1a includes a CPU (Central Processing Unit) 11a, a ROM (Read Only Memory) 12a, a RAM (Random Access Memory) 13a, a bus 14a, an input / output interface 15a, an output unit 16a, an input unit 17a, a storage unit 18a, a communication unit 19a, and a drive 20a.
- The CPU 11a executes various processes according to the program recorded in the ROM 12a or the program loaded from the storage unit 18a into the RAM 13a. Information necessary for the CPU 11a to execute various processes is also stored in the RAM 13a as appropriate.
- The CPU 11a, the ROM 12a, and the RAM 13a are connected to each other via the bus 14a.
- The input / output interface 15a is also connected to the bus 14a.
- The output unit 16a, the input unit 17a, the storage unit 18a, the communication unit 19a, and the drive 20a are connected to the input / output interface 15a.
- The drive 20a is provided as needed.
- Removable media 31a, composed of a magnetic disk or the like, is mounted on the drive 20a as appropriate.
- The program read from the removable media 31a by the drive 20a is installed in the storage unit 18a as needed.
- FIG. 5 is a block diagram showing an example of the functional configuration that can execute the personalization processing, among the functional configurations of the server 1a, the receiving terminal 2a, and the transmitting terminal 3a.
- In the present embodiment, the communication methods are video conferencing (video chat), voice call, and chat, and the application installed on the receiving terminal 2a and the transmitting terminal 3a is assumed to be able to realize each of these three methods. The sender S and the receiver R therefore select the method to be used at a predetermined timing (for example, at the start) of the desired communication.
- Each of the terminals denoted by reference numerals 2a and 3a becomes a receiving terminal or a transmitting terminal depending on the situation.
- In the server 1a, the request information receiving unit 111a, the processing method determination unit 112a, the original information acquisition unit 113a, the processing unit 114a, and the presentation unit 115a function.
- A processing information DB 200a is provided in one area of the storage unit 18a of the server 1a.
- The receiving terminal 2a transmits the request information IR for the processing method, input by the receiver R, to the server 1a. The request information receiving unit 111a of the server 1a acquires the request information IR transmitted from the receiving terminal 2a via the communication unit 19a.
- The method by which the receiver R inputs the request information IR and its specific form are not particularly limited, and the provider of this service can determine them freely.
- The processing method determination unit 112a determines the processing method of the original information IS, described later, based on the mode in which the content is finally presented to the receiver R (that is, how the receiver R actually perceives the transmitted content at the receiving terminal 2a). In other words, the processing method determination unit 112a determines the processing method of the original information IS based on the request information IR transmitted from the receiving terminal 2a.
- The mode presented to the receiver R is, for example, an explicit request for a processing method by the receiver R. However, this is merely an example, and the mode presented to the receiver R includes all information about the receiver R that contributes to the determination of the processing method.
- The processing method determination unit 112a determines the processing method of the original information IS by acquiring, from the processing information DB 200a, the specific content of the processing (for example, a predetermined processing algorithm) associated in advance with the selected request.
- The processing information DB 200a stores specific algorithms and the like that can be used for processing, in association with the content of the requests that the receiver R can select, regardless of their form or specific format.
- The method of associating such a request with the specific content of processing is not particularly limited.
- For example, the receiving terminal 2a may generate an ID (Identification) that uniquely identifies the specific content of the processing, and the ID may be transmitted from the receiving terminal 2a to the server 1a.
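One way to picture the association between a request ID and the specific content of processing is a lookup table keyed by ID. The following Python sketch is only an illustration under stated assumptions: the ID strings, the dictionary standing in for the processing information DB 200a, and the toy processing functions are all hypothetical and are not taken from the source.

```python
from typing import Callable, Dict

Info = Dict[str, str]
ProcessingFn = Callable[[Info], Info]


def replace_head_with_character_a(info: Info) -> Info:
    """Toy processing: mark the head as replaced by character A."""
    return {**info, "head": "character A"}


def mask_eyes(info: Info) -> Info:
    """Toy processing: mark the eye region as masked."""
    return {**info, "eyes": "masked"}


# Hypothetical stand-in for the processing information DB 200a:
# each ID uniquely identifies one processing algorithm.
PROCESSING_INFO_DB: Dict[str, ProcessingFn] = {
    "replace-head-character-a": replace_head_with_character_a,
    "mask-eyes": mask_eyes,
}


def determine_processing_method(processing_id: str) -> ProcessingFn:
    """Look up the algorithm associated with the ID sent by the receiving terminal."""
    try:
        return PROCESSING_INFO_DB[processing_id]
    except KeyError:
        raise ValueError(f"unknown processing ID: {processing_id!r}") from None


method = determine_processing_method("mask-eyes")
print(method({"head": "sender S", "eyes": "visible"}))
```

Rejecting unknown IDs explicitly is a design choice of this sketch; the source does not say how an unrecognized request should be handled.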
- The original information acquisition unit 113a acquires from the sender S, as the original information IS, information about what is to be transmitted from the sender side to the receiver side through communication by the method selected by the user. That is, the original information acquisition unit 113a acquires the original information IS transmitted from the transmitting terminal 3a via the communication unit 19a.
- The original information IS may include any form of information, including information about images, sounds, characters, and the like. Specifically, for example, the original information IS is information about a video of the sender S attending a video conference, information about a voice emitted by the sender S in a telephone call or video conference, or information about sentences written by the sender S in e-mail or chat.
- The processing unit 114a processes the original information IS acquired by the original information acquisition unit 113a based on the processing method determined by the processing method determination unit 112a, and acquires the result as the processed information IP.
- The processing unit 114a is provided with an image processing unit 121a, a voice processing unit 122a, and a character processing unit 123a. As described above, in the present embodiment one of the three methods is selected for each communication, and which of the image processing unit 121a, the voice processing unit 122a, and the character processing unit 123a functions differs depending on the method. The functions of these functional blocks are described in detail below.
- For example, the request information IR transmitted to the request information receiving unit 111a can be "process into the voice spoken by the character B".
- In this case, the voice processing unit 122a processes the voice presumed to include the voice of the sender S into the voice of the character B, and can further add features of the Kansai dialect (for example, the first-person pronouns "Washi" and "Uchi" and the sentence endings "Nen" and "Yan") to the sentences expressed by the words.
- The method of estimating whether the voice of the sender S is included and the specific method of processing the voice are not particularly limited, and the provider of this service can adopt any technology related to voice recognition, sentence analysis, language analysis, and the like.
- The character processing unit 123a processes the characters included in the original information IS acquired by the original information acquisition unit 113a, and functions in chat. Specifically, for example, when the request of the receiver R is "change the ending of the sentence to "dechu"", the character processing unit 123a executes the following series of processing. The character processing unit 123a first extracts the portion corresponding to the ending of each sentence from the sentence information by using sentence analysis techniques. It then acquires, as the processed information IP, the sentence information in which the extracted portion has been changed to "dechu".
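The sentence-ending change can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the method of the source: it swaps real sentence analysis for a regular expression that splits on terminal punctuation, it appends the ending rather than replacing an extracted ending, and the function name `change_ending` and the space before the ending (chosen for romanized text) are assumptions.

```python
import re

# A sentence is a run of non-terminal characters followed by optional
# terminal punctuation (ASCII or full-width).
_SENTENCE = re.compile(r"([^.!?。！？]+)([.!?。！？]*)")


def change_ending(text: str, ending: str = "dechu") -> str:
    """Append `ending` to each sentence, keeping its terminal punctuation."""

    def _rewrite(match: "re.Match[str]") -> str:
        body, punctuation = match.group(1), match.group(2)
        if not body.strip():
            # Whitespace-only fragments (e.g. trailing spaces) are left untouched.
            return match.group(0)
        return f"{body.rstrip()} {ending}{punctuation}"

    return _SENTENCE.sub(_rewrite, text)


print(change_ending("The meeting starts at ten. Please be on time!"))
```

The words before the ending are passed through unchanged, which reflects the requirement that the content the sender wants to convey is not altered.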
- The processing unit 114a generates and acquires the processed information IP based on the various information processed as appropriate by the image processing unit 121a, the voice processing unit 122a, and the character processing unit 123a.
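How the processing unit might route original information to its three sub-units can be sketched as a dispatch table keyed by communication method. The mapping used here (image and voice for a video conference, voice for a voice call, characters for chat) is an assumption for illustration; the source only states that the functioning unit differs by method and that the character processing unit functions in chat. All names in the sketch are hypothetical.

```python
from enum import Enum
from typing import Callable, Dict, List

Info = Dict[str, str]
Unit = Callable[[Info], Info]


class Method(Enum):
    VIDEO_CONFERENCE = "video conference"
    VOICE_CALL = "voice call"
    CHAT = "chat"


def _make_unit(field: str) -> Unit:
    """Build a toy unit that marks one field of the information as processed."""

    def unit(info: Info) -> Info:
        if field not in info:
            return info
        return {**info, field: f"processed({info[field]})"}

    return unit


# Toy stand-ins for the image, voice, and character processing units.
image_unit = _make_unit("image")
voice_unit = _make_unit("voice")
character_unit = _make_unit("text")

# Assumed mapping from the selected method to the units that function.
UNITS_BY_METHOD: Dict[Method, List[Unit]] = {
    Method.VIDEO_CONFERENCE: [image_unit, voice_unit],
    Method.VOICE_CALL: [voice_unit],
    Method.CHAT: [character_unit],
}


def process(original_is: Info, method: Method) -> Info:
    """Run only the units that function for the selected method."""
    processed_ip = dict(original_is)
    for unit in UNITS_BY_METHOD[method]:
        processed_ip = unit(processed_ip)
    return processed_ip


print(process({"image": "frame", "voice": "audio"}, Method.VIDEO_CONFERENCE))
print(process({"text": "hello"}, Method.CHAT))
```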
- The presentation unit 115a causes a device corresponding to the selected method (for example, a display or a speaker in the case of a video conference) to present the processed information IP acquired by the processing unit 114a to the receiver R.
- The terminals 2a and 3a transmit the information indicating the necessity to the server 1a.
- A person to be targeted by personalization may be registered in advance, and the personalization process may be executed when that person appears (for example, when the predetermined person is detected by image recognition). That is, the choice is not limited to whether to execute the personalization process itself; the personalization process may be executed when a predetermined condition is satisfied.
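The condition-based trigger can be expressed as a simple membership check. In this minimal Python sketch the person identifiers and the function name are hypothetical, and the detection step itself (image recognition in the source's example) is assumed to have already produced a list of detected people.

```python
from typing import Iterable, Set


def should_personalize(detected_people: Iterable[str], registered_people: Set[str]) -> bool:
    """Return True when any detected person was registered for personalization."""
    return any(person in registered_people for person in detected_people)


# Hypothetical registration made in advance by the receiver.
registered = {"sender-S"}

for detected in (["sender-S", "guest"], ["guest"]):
    if should_personalize(detected, registered):
        print(detected, "-> run personalization processing")
    else:
        print(detected, "-> present the original information as is")
```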
- The terminals 2a and 3a may prompt the user to input them and transmit the input information to the server 1a.
- This personalization process may also always be executed in a predetermined communication.
- The user of the terminal 2a is referred to as the receiver R, and the terminal 2a as the receiving terminal 2a (the terminal that receives the personalization processing).
- The user of the terminal 3a is referred to as the sender S, and the terminal 3a as the transmitting terminal 3a. That is, the person who receives the personalized processing according to the present invention is the receiver R, and the terminal that outputs the personalized content (processed information) is the receiving terminal 2a.
- Since only the user (receiver R) of the receiving terminal 2a selects the personalization process in the video conference, the receiving terminal 2a displays the request information input screen to the receiver R and prompts input of the request information IR.
- the receiving terminal 2a transmits the input request information IR to the server 1a.
- the request information receiving unit 111a acquires the request information IR transmitted from the receiving terminal 2a via the communication unit 19a.
- the processing method determination unit 112a determines the processing method for the original information IS based on the request information IR acquired in step S11.
- the original information acquisition unit 113a acquires the original information IS acquired and transmitted by the transmission terminal 3a via the communication unit 19a. Since the video conference is selected as the communication method here, in this step, the original information acquisition unit 113a acquires a still image corresponding to one frame of the moving image in the video conference.
- the processing unit 114a processes the original information IS acquired in step S13 based on the processing method determined in step S12, and acquires the result as post-processing information IP. For example, when the determined processing method is "change the face of the sender S to a smile", the image processing unit 121a processes the still image corresponding to one frame included in the original information accordingly.
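The flow of steps S11 to S14 can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Frame` type, the `PROCESSING_METHODS` table, and the function names are hypothetical and do not come from the specification, and a real image processing unit would operate on pixel data rather than on a label.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict


@dataclass(frozen=True)
class Frame:
    """One still image of the video conference, reduced to labels for this sketch."""
    sender_id: str
    expression: str


ProcessingMethod = Callable[[Frame], Frame]

# Hypothetical stand-in for the processing information DB:
# request information IR -> processing method.
PROCESSING_METHODS: Dict[str, ProcessingMethod] = {
    "smile": lambda frame: replace(frame, expression="smile"),
}


def determine_processing_method(request_ir: str) -> ProcessingMethod:
    """Step S12: map the request to a method; unknown requests leave the frame as-is."""
    return PROCESSING_METHODS.get(request_ir, lambda frame: frame)


def process_frame(request_ir: str, original_is: Frame) -> Frame:
    """Steps S12 and S14: determine the method, then produce post-processing information IP."""
    method = determine_processing_method(request_ir)
    return method(original_is)
```

Calling `process_frame("smile", Frame("S", "neutral"))` returns a frame whose expression is `"smile"`, while an unrecognized request returns the original frame unchanged.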
- the words actually uttered by the sender S and the words presented to the receiver R may differ. Specifically, for example, even if the sender S actually utters "XX", the receiver R may be presented with the voice "XX dechu". That is, the meaning and content of the utterance itself are not altered; the voice is processed only to the extent that it changes the impression formed by the receiver R. What is important is that the impression received by the receiver R, and the stress that may be placed on the receiver R, are changed while communication still takes place.
- one of the problems premised on the present invention is, for example, to improve the impression of the recipient (for example, to reduce the stress of the recipient of the information) in communication with the sender of the information.
- such communication between two parties is generally performed through an interface that is responsible for acquiring and presenting information regarding, for example, images, sounds, and characters. That is, the information sent by the sender is presented through the interface of the receiver.
- the information emitted by the sender is adapted to the receiver (e.g., based on the preferences of the receiver) before being presented through the interface of the receiver.
- this concerns the impression the recipient side has of elements other than the content to be transmitted by the above communication (for example, the sender personally, the facial expression of the sender, or the intonation of the sender's voice). Obstacles to communication caused by such elements can be reduced (for example, when the receiver dislikes the sender, the receiver may be unable to concentrate on the communication merely because it comes from that person). Therefore, communication can proceed more smoothly. In addition, ordinarily, when the receiver side feels uncomfortable with the sender side in communication, the receiver side must either endure the discomfort or ask the sender side to remedy its cause.
- a subject that transmits information via some information transmission means can also be a sender.
- some information transmission means (for example, a smartphone, a tablet, a webcam, a computer, a smart speaker, or a telephone)
- the sender is a predetermined program or artificial intelligence (virtual assistant or the like)
- an information processing device (a server or the like)
- the present invention focuses on the receiver side that receives the information sent by the sender in a certain communication; how the sender transmits the information is not at issue. That is, as described above, it does not matter whether the sender uses any device, and the sender does not have to use any information transmitting means.
- the sender may speak directly to the receiver equipped with the hearable device or the head-mounted display.
- the means by which the recipient of the information receives the information is not particularly limited. That is, for example, a subject that receives information via some information receiving means (for example, a smartphone, a tablet, a personal computer, a telephone, a smart speaker, a hearable device, a head-mounted display, etc.) can also be a receiver.
- the recipient must use some kind of information receiving means in the target communication in order to benefit from the personalized processing according to the present invention.
- the above interface may be connected to, for example, the receiving terminal 2a or the transmitting terminal 3a as an external device, or may be structurally integrated with the receiving terminal 2a or the transmitting terminal 3a.
- it is sufficient that the information processing system of FIG. 3, including the server 1a, the receiving terminal 2a, and the transmitting terminal 3a, can as a whole appropriately acquire the original information IS and present the processed information IP.
- the concept of the service in the present invention is further supplemented.
- the "elements" that characterize the sender presented via a predetermined interface are processed to suit the recipient (for example, according to the taste of the recipient), or to suit the place and purpose of communication.
- the element means a unit or a set of specific information that can be processed in any form of information including images, sounds, and characters.
- examples of elements include the head, facial expression, eye region, and movement of a person in image information; the voice itself and its speed, pitch, tone, intonation, dialect, and sentence endings in voice information; and sentence endings, punctuation marks, and the like in character information. These are merely examples, and the elements are not particularly limited to them.
- the processing method may be determined, regardless of the sender, based on the elements and processing destinations recommended by the server 1a; alternatively, the receiver may decide the elements to be processed and their processing destinations only the first time, and from the second time onward the processing method may be adjusted automatically according to that default processing method. In any case, what is important is that the elements of the sender's communication (image, sound, text, etc.) can be processed into a form that reflects the taste of the recipient and improves the recipient's impression as much as possible.
- the mode presented to the recipient is not limited to a mode that the recipient consciously determines. Broadly, it may also include a mode that is determined, independently of the recipient, based on any information about the recipient that can be acquired, and is then presented to the recipient. Specifically, for example, an entity other than the recipient (for example, the server 1a) can acquire the recipient's intention and various information about the recipient, determine the mode presented to the recipient, and determine the processing method.
- a specific symbol may be presented on the screen. For example, if the sender touches the tie, an exclamation mark may be presented on the screen.
- the recipient can learn, for example, that the sender is a person who cares about clothes, which can help build a relationship with the sender, for example by sending a tie on the sender's birthday.
- as visual elements that characterize the sender, for example, information about what the sender is wearing may be presented on the screen. For example, information such as the brand names, product names, and prices of the sender's clothes, eyeglasses, and watch may be presented on the screen.
- the receiver can obtain detailed information on, for example, the hobbies and tastes of the sender, which can be useful for building a relationship with the sender.
- the sender does not actually move their hands, but the processed information presented to the receiver shows an image in which the sender appears to move their hands dynamically.
- An example of improving the impression of the recipient (reducing the stress of the recipient) has been described in the above embodiment.
- an image of the sender nodding may be presented in response to the content of the voice or words uttered by the receiver. This gives the recipient the impression that the sender is listening to and trying to understand the recipient's words, and as a result, the stress of the recipient is reduced and smoother communication can be achieved.
- the function of the voice processing unit 122a for acquiring a sentence from speech may be combined with functions such as sentence analysis, language analysis, and machine learning to obtain a new function.
- the minutes and summary of the contents of the meeting may be automatically generated.
- the person who made the statement may be specified by using the functions of image recognition and facial expression recognition, and the emotion of the person at the time of the statement may be estimated.
- information associating, for example, a person, a statement, and the emotion accompanying the statement may be acquired. Such information makes it possible to analyze the person more deeply, which can be useful, for example, in building a relationship with the person.
- unlike ordinary video conferencing, this service does not aim to faithfully reproduce the actions actually taken by the parties involved in communication. What is important is to process the movements of the other party so that communication proceeds smoothly (at least so that the recipient can engage in the communication with less stress).
- the request that can be selected by the receiver is described on the premise that an algorithm or the like required for processing is set in advance in association with the request and stored in the processing information DB 200a.
- the processing method determination unit 112a may automatically determine the processing method by the feedback of the receiver or the sender.
- the processing method may be determined (such as changing the character to be processed) according to the vital data of the recipient (heartbeat, pulse, body temperature, sweating amount, etc.).
- a sensor for acquiring them may be provided separately, or they may be acquired by image processing of the recipient's face.
- the method of acquiring vital data is not itself essential, so any method for acquiring them may be used. More specifically, for example, when the recipient's vital data is low (for example, when the recipient is sleepy), the character may be changed to one that raises the recipient's energy, and when the vital data is high (for example, when the recipient is agitated), it may be changed to a character that gives a calm impression.
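The vital-data rule described above amounts to a simple threshold decision, sketched below in Python. The use of heart rate as the vital signal, the threshold values, and the character labels are all assumptions chosen for illustration; the specification does not fix any of them.

```python
def choose_character(heart_rate_bpm: float,
                     low_threshold: float = 60.0,
                     high_threshold: float = 100.0) -> str:
    """Pick a character type from one vital reading.

    The thresholds are illustrative assumptions, not values from the specification.
    """
    if heart_rate_bpm < low_threshold:
        # Low vital data (for example, the recipient is sleepy): raise the energy.
        return "energetic"
    if heart_rate_bpm > high_threshold:
        # High vital data (for example, the recipient is agitated): calm things down.
        return "calm"
    # Otherwise keep whatever character is currently in use.
    return "unchanged"
```

A practical system would likely smooth the readings over time before switching characters, so that a single noisy sample does not change the presentation.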
- the processing method determination unit 112a may estimate the recipient's preference from the characters selected by the receiver in the past, and determine the processing method based on the result. In that case, information (characters, etc.) that can be candidates for selection may be stored in advance in a database or the like (not shown), or may be collected by crawling the Internet or the like.
- the processing method determination unit 112a may automatically determine the processing method according to the past history (behavior history and processing history) of the receiver, and may detect the state of the receiver or the sender.
- the processing method may be automatically determined according to the detected result.
- the server 1a can accumulate the past history of a given receiver and use a combination of various technologies related to AI (Artificial Intelligence), including machine learning, image analysis, facial expression recognition, voice recognition, language analysis, and the like.
- the processing method determination unit 112a may determine not only the processing method but also whether or not to process, the degree of processing, and the conditions for executing the processing, according to the other party and the situation. Specifically, for example, persons to be processed may be registered in advance among the senders, and the processing method may be determined automatically or manually only when the communication partner is a registered person. Also, for example, if the sender is a person the receiver particularly dislikes (a person whose face the receiver does not want to see), the entire face may be processed; if the receiver dislikes only the voice, only the voice may be processed without processing the face; or if the receiver is uncomfortable with expressionless people, only the facial expression may be changed to a smile.
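The per-sender settings described above can be sketched as a registry lookup. The registry contents, the sender identifiers, and the element names (`"face"`, `"voice"`, `"expression"`) are hypothetical and serve only to show the idea that processing is applied solely to registered senders and only to the elements the receiver has chosen.

```python
from typing import Dict, FrozenSet, Optional

# Hypothetical registry: sender id -> elements the receiver wants processed.
REGISTERED_SENDERS: Dict[str, FrozenSet[str]] = {
    "sender-disliked": frozenset({"face", "voice"}),
    "sender-voice-only": frozenset({"voice"}),
    "sender-expressionless": frozenset({"expression"}),
}


def elements_to_process(
    sender_id: str,
    registry: Optional[Dict[str, FrozenSet[str]]] = None,
) -> FrozenSet[str]:
    """Return the elements to process for this sender.

    An empty set means the original information is presented unprocessed,
    which is the case for any sender who is not registered.
    """
    active = REGISTERED_SENDERS if registry is None else registry
    return active.get(sender_id, frozenset())
```

Returning an empty set for unregistered senders keeps "do not process" as the default, which matches the option of prioritizing the accuracy of communication.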
- a setting in which the original information IS from the sender is not processed may be adopted to prioritize the accuracy of communication.
- the provider or recipient of this service can proceed with communication more smoothly by using these settings selectively.
- the provider of this service may also take the following measures. Specifically, for example, the provider of this service may present two images, namely an image showing the actual state of the sender and a processed image on which the personalization process has been executed, side by side on two screens, or may allow the presentation to be switched between them. In this case, for example, one of the two images may be presented at a small size.
- the provider of this service may present the processed image to the receiver and at the same time record the video of the sender, so that the state of the other party can be checked later at any time. In this case, the voice of the other party and the text of e-mails may also be recorded.
- a request regarding a processing method is input by selection from a receiver, and a specific processing method is determined based on the request in the processing method determination unit 112a.
- the request from the receiver need not be input by selection and may be input in any format. Moreover, what the receiver inputs need not be a request as such; anything from which the processing method determination unit 112a can determine the processing method is sufficient.
- when the processing method determination unit 112a automatically determines the processing method as described above, the request itself need not be input; it is sufficient that the server 1a can acquire the behavior history, context information, vital data, and the like of the receiver and the sender in order to determine the mode presented to the receiver as described above.
- the context information of the receiver and the sender refers to all the internal and external states of the receiver and the sender.
- the internal state of the receiver or sender is a state obtained by sensing the receiver or sender personally. Specifically, for example, the height, weight, age, physical condition (body temperature, pulse, etc.), and emotions of the receiver or sender are examples of the internal state of the receiver or sender.
- the external state of the receiver and the sender is a state obtained by sensing the environment around the receiver and the sender. Specifically, for example, the weather, air temperature, room temperature, and spatial or temporal position (including forecasts; the temporal position means, for example, the current time), as well as a predetermined state distributed in at least one of the spatial direction and the time direction around the receiver and the sender, are examples of the external state of the receiver and the sender.
- the processing method determination unit 112a has been described as determining a single processing method based on a request from the receiver, but the present invention is not particularly limited to this. That is, for example, the processing method determination unit 112a may select a plurality of processing methods. In this case, for example, the presentation unit 115a may present a plurality of types of processed information to the receiver, or may present only one of them. Specifically, the processing method determination unit 112a automatically and provisionally determines one or more processing methods according to, for example, the vital data and past history of the recipient, and the provisionally determined processing methods are presented to the receiver as candidates.
- the receiver selects one processing method that he / she desires from the presented processing method candidates.
- the processing method determination unit 112a determines, as the processing method, the one selected by the receiver from the processing method candidates.
- the character processing unit 123a processes the original information IS based on the processing method determined by the processing method determination unit 112a.
- the receiver can select not to process by not selecting any of the processing method candidates.
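The candidate flow described above (provisionally determine candidates, present them, let the receiver pick one or none) can be sketched as follows. Ranking candidates by how often the receiver chose them in the past is an assumption made for this sketch; the specification only says that past history and vital data may be used. All function and method names are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Sequence


def recommend_candidates(history: Sequence[str],
                         available: Sequence[str],
                         k: int = 3) -> List[str]:
    """Provisionally pick up to k processing methods for presentation.

    Candidates are ranked by past selection count; ties keep the order
    in which the methods appear in `available`.
    """
    counts = Counter(h for h in history if h in available)
    ranked = sorted(range(len(available)),
                    key=lambda i: (-counts[available[i]], i))
    return [available[i] for i in ranked[:k]]


def select_method(candidates: Sequence[str],
                  choice: Optional[int]) -> Optional[str]:
    """Return the chosen method, or None when the receiver selects nothing
    (meaning the original information is presented without processing)."""
    if choice is None:
        return None
    if not 0 <= choice < len(candidates):
        raise IndexError("choice is outside the presented candidates")
    return candidates[choice]
```

For example, with a history of `["calm", "calm", "smile"]` and available methods `["smile", "calm", "robot"]`, the two candidates presented would be `"calm"` followed by `"smile"`.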
- the timing at which the processing method determination unit 112a presents (recommends) a processing method candidate to the receiver may be, for example, at the start or in the middle of communication.
- the processing method determination unit 112a may select a different processing method for each time depending on the content and situation of communication. As a result, the recipient can more effectively reduce the stress of communication without getting bored.
- the above-mentioned series of processes can be executed by hardware or software.
- the functional configuration of FIG. 5 is merely an example and is not particularly limited. That is, it is sufficient that the information processing system is provided with a function capable of executing the above-mentioned series of processes as a whole, and what kind of functional block is used to realize this function is not particularly limited to the example of FIG.
- the location of the functional block is not particularly limited to FIG. 5, and may be arbitrary.
- the functional block of the server 1a may be transferred to a receiving terminal 2a, a transmitting terminal 3a, or the like.
- the functional block of the receiving terminal 2a or the transmitting terminal 3a may be transferred to the server 1a or the like.
- one functional block may be configured by a single piece of hardware, a single piece of software, or a combination thereof.
- the recording medium containing such a program may be a removable medium (not shown) distributed separately from the main body of the device in order to provide the program to the receiving terminal 2a or the transmitting terminal 3a, or a recording medium or the like provided to the receiving terminal 2a or the transmitting terminal 3a in a state of being incorporated in the main body of the device in advance.
- text data as original information is converted into post-processed information consisting of voice data and image data (conversion from a text transmission method to a voice and video transmission method).
- text data as original information is converted into post-processed information consisting of text data and image data (conversion from a text transmission method to a text and video transmission method).
- text data as original information is converted into post-processed information consisting of text data, voice data, and image data (conversion from a text transmission method to a text, voice and video transmission method).
- (5) voice data as original information is converted into post-processed information consisting of text data, voice data, and image data (conversion from a voice transmission method to a text, voice and video transmission method).
- (6) image data and voice data as original information are converted into post-processed information consisting of text data and voice data (conversion from a video transmission method to a text and voice transmission method).
- (7) voice data as original information is converted into post-processed information consisting of voice data and image data (conversion from a voice transmission method to a voice and video transmission method).
- further, as a mode of personalization, (8) text data as original information is converted into post-processed information consisting of voice data (conversion from a text transmission method to a voice transmission method).
- (9) voice data as original information is converted into post-processed information consisting of text data (conversion from a voice transmission method to a text transmission method).
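Each of these conversions can be viewed as a change from one set of data types to another. The following Python sketch, offered only as an illustration, uses set arithmetic to work out which data types carry over, which must be newly generated, and which are dropped. The `Modality` enum and `plan_conversion` function are hypothetical names, not terms from the specification.

```python
from enum import Enum
from typing import Dict, FrozenSet


class Modality(Enum):
    TEXT = "text"
    VOICE = "voice"
    IMAGE = "image"


def plan_conversion(source: FrozenSet[Modality],
                    target: FrozenSet[Modality]) -> Dict[str, FrozenSet[Modality]]:
    """Split a transmission-method conversion into three groups of data types."""
    return {
        # Present in both: the original data can be processed in place.
        "process": source & target,
        # Only in the target: new data must be generated from the original.
        "generate": target - source,
        # Only in the source: not presented to the receiver.
        "drop": source - target,
    }


# Conversion from a text transmission method to a voice and video transmission method.
plan = plan_conversion(frozenset({Modality.TEXT}),
                       frozenset({Modality.VOICE, Modality.IMAGE}))
```

Here `plan["generate"]` contains `VOICE` and `IMAGE`, `plan["drop"]` contains `TEXT`, and `plan["process"]` is empty.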
- "processing" does not change the content that the sender wants to convey in a predetermined transmission (the meaning of the thought or expression to be transmitted, the subject, etc.) into different content. That is, the "processing" of the present invention may change the manner of expression, but it does not change the content itself to be conveyed; it changes the transmission method, which is the means of communication.
- the “processing” of the present invention also includes changing the way of expression as long as the content itself is not changed.
- the AI artificial intelligence
- the transmission method so as to match the receiver R regarding the situation at that time in a certain communication, so that the situation information includes at least the request information.
- in step SS1, the sender S inputs the original information as the information of the first form by operating the sender terminal 2b or the like.
- in step SS2, the sender terminal 2b transmits the original information input in step SS1 to the server 1b, either based on the operation of the sender S or automatically.
- in step SS3, the server 1b acquires the status information transmitted from the receiver terminal 3b.
- in step SS4, the server 1b acquires the original information transmitted from the sender terminal 2b and personalizes it according to a predetermined mode determined based on the situation information (a transmission method suited to the receiver, for example, a mode designated by the receiver R or a mode matching the taste and condition of the receiver R).
- the input original information is personalized into, and output as, post-processing information suited to the situation of the receiver R (for example, the personality, values, tastes, mood, physical condition, or surrounding environment of the receiver R, or the power balance between the receiver R and the sender).
- a message (text data) indicating denial such as "NO” or "No” is output to the receiver R side as post-processing information together with a predetermined image.
- the transmission method at input is text, but by the time the content reaches the receiver R, the transmission method has been converted to text and an image.
- the image is of a character preferred by the receiver R. Therefore, according to this example, even if the content input by the sender S is unfavorable to the receiver R, or the receiver R dislikes the sender S personally (that is, even where contact with content input by the sender S would otherwise cause the receiver R disadvantages such as stress, emotional upset, depression, or illness), the receiver R can be made to feel, on coming into contact with the content, as if the message were from the character C that the receiver R likes. The disadvantages that the receiver R may suffer because of the sender S can therefore be reduced. As a result, comfortable communication can be realized not only for the receiver R but for both parties, including the sender S.
- FIG. 11 is a system configuration diagram of an information processing system according to an embodiment of the present invention.
- the information processing system shown in FIG. 11 is configured to include a server 1b, a sender terminal 2b, and a receiver terminal 3b.
- the server 1b, the sender terminal 2b, and the receiver terminal 3b are each connected to each other via a predetermined network N such as the Internet.
- the server 1b is managed by the service provider G, and executes various processes for realizing the personalized service while appropriately communicating with the sender terminal 2b and the receiver terminal 3b.
- the sender terminal 2b is an information processing device operated by the sender S, and is, for example, a personal computer, a smartphone, a tablet, or the like.
- the receiver terminal 3b is an information processing device operated by the receiver R, and is, for example, a personal computer, a smartphone, a tablet, or the like.
- the hardware configurations of the server 1b, the sender terminal 2b, and the receiver terminal 3b can be basically the same as the hardware configuration of the server 1a in FIG.
- FIG. 12 is a functional block diagram of the server 1b for executing the personalization process.
- the status acquisition unit 101b, the mode determination unit 102b, the original information acquisition unit 103b, the personalization unit 104b, and the presentation control unit 105b function.
- a processing information DB 181b is provided in one area of the storage unit 18b of the server 1b.
- the exchange of information may be interactive (for example, video conferencing, telephone, etc.).
- the side that sends the information becomes the sender S
- the side that receives the information becomes the receiver R
- the sender S and the receiver R are interchanged depending on the situation. Therefore, in the system shown in FIG.
- the receiver terminal 3b transmits the status information including the request information indicating the request regarding the mode of personalization input by the receiver R to the server 1b.
- the receiver terminal 3b may acquire the request information based on the desired personalization mode (the transmission method to change to) input by the receiver R, or may determine the request information by automatically deriving the personalization mode from sensors carried by the receiver R, the usage status of the receiver terminal 3b, the application currently in use, or the like.
- the request information can include requests for a personalization mode involving a change in a specific manner of expression, such as "replace the head of the sender S with the head of the character C", "change the words spoken by the sender S to Kansai dialect", and "change the sentence endings to 'dechu'".
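A request such as changing sentence endings to "dechu" alters only the manner of expression, not the content. A minimal Python sketch of such a text transformation follows; the function name and the handling of trailing punctuation are assumptions made for illustration, and a real text processing unit for Japanese would need language-aware handling of sentence endings.

```python
def change_sentence_ending(sentence: str, ending: str = "dechu") -> str:
    """Append a stylistic ending before any trailing punctuation.

    The words of the sentence are left untouched, so the content conveyed
    by the sender is unchanged.
    """
    body = sentence.rstrip()
    trailing = ""
    # Peel off trailing punctuation so the ending lands before it.
    while body and body[-1] in ".!?":
        trailing = body[-1] + trailing
        body = body[:-1]
    if not body:
        # Nothing but punctuation or whitespace: leave the input as it is.
        return sentence
    return f"{body} {ending}{trailing}"
```

For example, `change_sentence_ending("I agree.")` yields `"I agree dechu."`.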
- the original information acquisition unit 103b acquires the information by the transmission method of the first form input by the sender S as the original information. Specifically, when the original information input to the sender terminal 2b is transmitted to the server 1b, the original information acquisition unit 103b acquires the original information via the communication unit 19b.
- the original information can include any form of information, including text data, audio data, and image data. Specifically, for example, text data of a message input by the sender S in an email or chat, audio data of the voice uttered by the sender S in a telephone call or video conference, and image data of the sender S captured while attending a video conference are all examples of original information.
- the personalization unit 104b executes control for personalizing the original information acquired by the original information acquisition unit 103b based on the personalization mode determined by the mode determination unit 102b, and outputs post-processing information.
- the processing control unit 140b, the text processing unit 141b, the voice processing unit 142b, and the image processing unit 143b function.
- the processing control unit 140b appropriately controls the text processing unit 141b, the voice processing unit 142b, and the image processing unit 143b so that the content of the original information takes the transmission method to be used when it is presented to the receiver R, based on the personalization mode determined by the mode determination unit 102b.
- the processing control unit 140b selects the necessary processing unit or units from among the text processing unit 141b, the voice processing unit 142b, and the image processing unit 143b according to the destination transmission method determined by the mode determination unit 102b, and causes the selected processing unit to execute the necessary processing.
- the processing control unit 140b causes the corresponding processing unit to generate the above new data based on the original information acquired by the original information acquisition unit 103b.
- the processing control unit 140b causes the corresponding processing unit to process the original information acquired by the original information acquisition unit 103b.
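The role of the processing control unit 140b, selecting the processing units required by the destination transmission method and having each one run, resembles a simple dispatcher. The sketch below illustrates that idea only. The unit functions are placeholders that return tagged strings, and the names `UNITS` and `personalize` are hypothetical; actual units would synthesize speech, render images, and so on.

```python
from typing import Callable, Dict, Iterable


def text_unit(original: str) -> str:
    """Placeholder for the text processing unit."""
    return f"text:{original}"


def voice_unit(original: str) -> str:
    """Placeholder for the voice processing unit."""
    return f"voice:{original}"


def image_unit(original: str) -> str:
    """Placeholder for the image processing unit."""
    return f"image:{original}"


# Destination transmission method -> processing unit that produces it.
UNITS: Dict[str, Callable[[str], str]] = {
    "text": text_unit,
    "voice": voice_unit,
    "image": image_unit,
}


def personalize(original: str, target_methods: Iterable[str]) -> Dict[str, str]:
    """Run only the units needed for the destination transmission method."""
    outputs: Dict[str, str] = {}
    for method in target_methods:
        if method not in UNITS:
            raise ValueError(f"no processing unit for {method!r}")
        outputs[method] = UNITS[method](original)
    return outputs
```

Calling `personalize("NO", ["text", "image"])` runs the text and image placeholders and leaves the voice unit untouched, mirroring a conversion from a text transmission method to a text and image transmission method.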
- the presentation control unit 105b executes control for presenting to the receiver R the processed information produced under the control of the personalization unit 104b. Specifically, the presentation control unit 105b executes control for causing the receiver terminal 3b to present the processed information. For example, the presentation control unit 105b executes control for presenting the processed information on various devices (for example, a display or a speaker in the case of a video conference) according to a method selected by the receiver R. Regarding the configuration illustrated in FIG. 12, for example, the sender S inputs the content to be transmitted by a text-type transmission method such as an email, while the receiver R receives the mail from the sender S by a method different from that text method.
- the presentation control unit 105b transmits the processed information to the receiver terminal 3b, and causes the display of the receiver terminal 3b to display text indicating the content input by the sender S.
- since the receiver R perceives the content transmitted by voice from the sender S as text (while the receiver R's own input is by voice), the receiver R can recognize the content transmitted from the sender S without lowering or muting the volume of the content being enjoyed. Therefore, the receiver R can communicate interactively while enjoying music or video content, without spoiling that enjoyment.
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Transfer Between Computers (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2021508709A JPWO2020194828A1 | 2019-03-22 | 2019-10-27 | |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2019055398 | 2019-03-22 | ||
| JP2019-055398 | 2019-03-22 | ||
| JP2019149095 | 2019-08-15 | ||
| JP2019-149095 | 2019-08-15 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2020194828A1 (ja) | 2020-10-01 |
Family
ID=72608878
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2019/042079 Ceased WO2020194828A1 (ja) | 2019-03-22 | 2019-10-27 | Information processing system, information processing device, and information processing method |
Country Status (2)
| Country | Link |
|---|---|
| JP (1) | JPWO2020194828A1 |
| WO (1) | WO2020194828A1 |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
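| JP2005346252A (ja) * | 2004-06-01 | 2005-12-15 | Nec Corp | Information transmission system and information transmission method |
| JP2007537650A (ja) * | 2004-05-14 | 2007-12-20 | Koninklijke Philips Electronics N.V. | Method for transmitting a message from a recipient to a recipient, message transmission system, and message conversion means |
| WO2018185988A1 (ja) * | 2017-04-06 | 2018-10-11 | Sony Corporation | Information processing system and storage medium |
| JP2018163657A (ja) * | 2017-03-24 | 2018-10-18 | NHN Entertainment Corporation | Automatic translation providing method, storage medium, and automatic translation providing server |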
2019
- 2019-10-27: JP application JP2021508709A (publication JPWO2020194828A1), status: Pending
- 2019-10-27: WO application PCT/JP2019/042079 (publication WO2020194828A1), status: Ceased
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP7138997B1 (ja) * | 2021-10-14 | 2022-09-20 | I’mbesideyou Inc. | Video meeting evaluation terminal |
| WO2023062795A1 (ja) * | 2021-10-14 | 2023-04-20 | I’mbesideyou Inc. | Video meeting evaluation terminal |
Also Published As
| Publication number | Publication date |
|---|---|
| JPWO2020194828A1 | 2020-10-01 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11948241B2 (en) | | Robot and method for operating same |
| US20090079816A1 (en) | | Method and system for modifying non-verbal behavior for social appropriateness in video conferencing and other computer mediated communications |
| KR20220024557A (ko) | | Detection and/or registration of hot commands to trigger responsive actions by an automated assistant |
| CN108701142A (zh) | | Information processing system, client terminal, information processing method, and recording medium |
| US20110283190A1 (en) | | Electronic personal interactive device |
| US20240012839A1 (en) | | Apparatus, systems and methods for providing conversational assistance |
| CN119731730A (zh) | | Avatar representation and audio generation |
| WO2020026850A1 (ja) | | Information processing device, information processing method, and program |
| CN119768846A (zh) | | Avatar facial expressions based on semantic context |
| WO2017062165A1 (en) | | Facilitating awareness and conversation throughput in an augmentative and alternative communication system |
| US20240323332A1 (en) | | System and method for generating and interacting with conversational three-dimensional subjects |
| JPWO2019026360A1 (ja) | | Information processing device and information processing method |
| CN114566187A (zh) | | System, electronic device, and related method with post-conversation representation |
| US20210136323A1 (en) | | Information processing device, information processing method, and program |
| JP7716917B2 (ja) | | Conference control device, conference control method, and computer program |
| WO2020194828A1 (ja) | | Information processing system, information processing device, and information processing method |
| JP7152453B2 (ja) | | Information processing device, information processing method, information processing program, and information processing system |
| US20240330380A1 (en) | | Real-time AI-driven speaking suggestions during asynchronous video capture |
| EP4297018A1 (en) | | Techniques for presenting textual messages using a user-specific voice model |
| JP7845940B2 (ja) | | Information processing device, information processing method, and information processing program |
| WO2019142420A1 (ja) | | Information processing device and information processing method |
| JP2026072545A (ja) | | System |
| JP2026070960A (ja) | | System |
| JP2026033171A (ja) | | System |
| WO2024062779A1 (ja) | | Information processing device, information processing system, and information processing method |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 19922206; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWE | WIPO information: entry into national phase | Ref document number: 2021508709; Country of ref document: JP |
| | 122 | Ep: PCT application non-entry in European phase | Ref document number: 19922206; Country of ref document: EP; Kind code of ref document: A1 |