WO2020188885A1 - Information processing method, program, and terminal - Google Patents

Information processing method, program, and terminal

Info

Publication number
WO2020188885A1
WO2020188885A1 (PCT/JP2019/045439)
Authority
WO
WIPO (PCT)
Prior art keywords
terminal
call
information
user
content
Prior art date
Application number
PCT/JP2019/045439
Other languages
French (fr)
Japanese (ja)
Inventor
亮介 濱窄
Original Assignee
Line株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Line株式会社 filed Critical Line株式会社
Publication of WO2020188885A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F 3/0481 Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M 1/00 Substation equipment, e.g. for use by subscribers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M 1/00 Substation equipment, e.g. for use by subscribers
    • H04M 1/64 Automatic arrangements for answering calls; Automatic arrangements for recording messages for absent subscribers; Arrangements for recording conversations
    • H04M 1/65 Recording arrangements for recording a message from the calling party

Definitions

  • This disclosure relates to an information processing method of a terminal, a program, and a terminal.
  • Patent Document 1 discloses an example of such a system.
  • A first aspect of the present invention is an information processing method for a terminal that transmits content to a first terminal or receives content transmitted from the first terminal. The method includes: displaying, in a display area of the terminal, first content transmitted from the first terminal and second content transmitted to the first terminal by a communication unit of the terminal; controlling, by a control unit of the terminal, a call with the first terminal based on an input by the user of the terminal to the display area in which the first content and the second content are displayed; acquiring, by the control unit and based on the call with the first terminal, first information based on the voice of the user of the first terminal and second information based on the voice of the user of the terminal; and displaying, in the display area, call information based on the first information and the second information.
  • Another aspect is a program to be executed by a computer of a terminal that transmits content to a first terminal or receives content transmitted from the first terminal. The program causes the terminal to: display, in a display area of the terminal, first content transmitted from the first terminal and second content transmitted to the first terminal by a communication unit of the terminal; control, by a control unit of the terminal, a call with the first terminal based on an input by the user of the terminal to the display area in which the first content and the second content are displayed; acquire, by the control unit and based on the call with the first terminal, first information based on the voice of the user of the first terminal and second information based on the voice of the user of the terminal; and display, in the display area, call information based on the first information and the second information.
  • Yet another aspect is a terminal that transmits content to a first terminal or receives content transmitted from the first terminal. The terminal includes: a display unit that displays first content transmitted from the first terminal and second content transmitted to the first terminal by a communication unit of the terminal; and a control unit that controls a call with the first terminal based on an input by the user of the terminal to the display unit displaying the first content and the second content. The control unit acquires, based on the call with the first terminal, first information based on the voice of the user of the first terminal and second information based on the voice of the user of the terminal, and the display unit displays call information based on the first information and the second information.
  • (B) is a screen view showing an example of a talk room after a call on a terminal.
  • (A) is a screen view showing an example of a talk room in a state where the contents of a call on the terminal are not expanded.
  • (B) is a screen view showing an example of displaying a talk room on a terminal and expanding the contents of a call.
  • (A) is a diagram showing how the user brings his/her finger closer to the call icon.
  • (B) is a screen view showing an example in which the call icon is enlarged and displayed.
  • (A) is a screen view showing an example of displaying a message indicating the content of a call by pop-up.
  • (B) is a screen view showing an example of transitioning to another screen and displaying a message indicating the contents of a call.
  • (A) is a schematic diagram showing a part of a call
  • (b) is a schematic diagram showing an example of a situation following FIG. 12 (a)
  • (c) is a display example of a talk room after a call.
  • (A) is a screen view showing an example of displaying an image relating to the position of a terminal as a background image.
  • (B) is a screen view showing a display example in which the background image and the contents of the call are linked.
  • (A) is a schematic diagram showing a part of a call
  • (b) is a schematic diagram showing an example of a situation following FIG. 13 (a)
  • (c) is a display example of a talk room after a call.
  • (A) is a screen view showing a display example when the case where the call volume is relatively small is expressed by the display size of the call icon.
  • (B) is a screen view showing a display example of a call icon when the call volume is larger than that of (a).
  • (A) is a screen view showing a display example in which the case where the call volume is relatively small is represented by the color of the call icon.
  • (B) is a screen view showing a display example of a call icon when the call volume is larger than that of (a).
  • (A) and (b) are screen views showing an example of displaying an image related to the contents of a call in a talk room as an alternative to a call icon.
  • (A) is a screen view showing an example in which an image relating to the contents during a call is displayed in the talk room as an alternative to the call icon, and the call icon is also displayed.
  • (B) is a screen view showing an example in which an image is enlarged and displayed as a substitute for a call icon.
  • (A) is a screen view showing a display example of a message showing the contents of a call.
  • (B) is a screen view showing an example of a heading (summary) to be displayed in the case of the call content shown in (a).
  • FIG. 1 shows the configuration of the communication system 1 according to the embodiment of the present disclosure.
  • the server 10 and the terminal 20 are connected via the network 30.
  • The server 10 provides, via the network 30, a service for transmitting and receiving messages between the terminals 20 owned by users.
  • the number of terminals 20 connected to the network 30 is not limited.
  • the network 30 plays a role of connecting one or more terminals 20 and one or more servers 10. That is, the network 30 means a communication network that provides a connection route so that data can be transmitted and received after the terminal 20 connects to the server 10.
  • The network 30 may be a wired network or a wireless network.
  • Non-limiting examples of the network 30 include an ad hoc network, an intranet, an extranet, a virtual private network (VPN), a local area network (LAN), and a wireless network.
  • the network 30 may include one or more networks 30.
  • the terminal 20 may be any information processing terminal that can realize the functions described in each embodiment.
  • Non-limiting examples of the terminal 20 include a smartphone, a mobile phone (feature phone), a computer (as non-limiting examples, a desktop, a laptop, or a tablet), a media computer platform (as non-limiting examples, a cable or satellite set-top box, or a digital video recorder), a handheld computer device (as non-limiting examples, a PDA (personal digital assistant) or an email client), a wearable terminal (a glasses-type device, a watch-type device, or the like), or another type of computer or communication platform. The terminal 20 may also be expressed as an information processing terminal.
  • Since the configurations of the terminals 20A, 20B, and 20C are basically the same, they are collectively described as the terminal 20 in the following description. Where necessary, the terminal used by user X is denoted as terminal 20X, and the user information in the predetermined service associated with user X or terminal 20X is denoted as user information X.
  • the user information is user information associated with an account used by the user in a predetermined service.
  • Non-limiting examples of the user information include information associated with the user that is input by the user or assigned by the predetermined service, such as the user's name, icon image, age, gender, address, hobbies and preferences, and identifier; the user information may be any one of these or a combination thereof. A sketch of such a record follows.
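  • As a non-limiting illustration, such user information can be modeled as a simple record. The field names and types in the sketch below are assumptions made for illustration only, not a data layout defined by this disclosure.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class UserInfo:
    """Hypothetical model of the user information tied to a service account."""
    user_id: str                           # the user's identifier
    name: str                              # the user's name
    icon_image_url: Optional[str] = None   # the user's icon image
    age: Optional[int] = None
    gender: Optional[str] = None
    address: Optional[str] = None
    hobbies: list[str] = field(default_factory=list)  # hobbies and preferences

# Example: user information X associated with user X (terminal 20X)
user_info_x = UserInfo(user_id="user-x", name="User X", hobbies=["music"])
print(user_info_x)
```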
  • the server 10 has a function of providing a predetermined service to the terminal 20.
  • the server 10 may be any device as long as it is an information processing device that can realize the functions described in each embodiment.
  • Non-limiting examples of the server 10 include a server device, a computer (as non-limiting examples, a desktop, a laptop, or a tablet), a media computer platform (as non-limiting examples, a cable or satellite set-top box, or a digital video recorder), a handheld computer device (as non-limiting examples, a PDA or an email client), or another type of computer or communication platform.
  • The server 10 may be expressed as an information processing device. When it is not necessary to distinguish between the server 10 and the terminal 20, each may be expressed as an information processing device.
  • the terminal 20 includes a control unit 21 (CPU: central processing unit), a storage unit 28, a communication I / F 22 (interface), an input / output unit 23, a display unit 24, and a position information acquisition unit 25.
  • As a non-limiting example, the components of the HW (hardware) of the terminal 20 are connected to one another via a bus B. The HW configuration of the terminal 20 need not include all of these components.
  • The terminal 20 may be configured without individual components such as the microphone 232, the camera 234, or the position information acquisition unit 25, or without several of these components.
  • the communication I / F 22 transmits and receives various data via the network 30.
  • the communication may be executed by wire or wirelessly, and any communication protocol may be used as long as mutual communication can be executed.
  • the communication I / F 22 has a function of executing communication with the server 10 via the network 30.
  • the communication I / F 22 transmits various data to the server 10 according to an instruction from the control unit 21. Further, the communication I / F 22 receives various data transmitted from the server 10 and transmits the various data to the control unit 21. Further, the communication I / F 22 may be simply expressed as a communication unit. Further, when the communication I / F 22 is composed of a physically structured circuit, it may be expressed as a communication circuit.
  • the input / output unit 23 includes a device for inputting various operations to the terminal 20 and a device for outputting the processing result processed by the terminal 20.
  • The input/output unit 23 may be a device in which the input unit and the output unit are integrated, or may be separated into an input unit and an output unit.
  • The input unit is realized by any of various devices, or a combination thereof, capable of receiving input from the user and transmitting information related to the input to the control unit 21.
  • Non-limiting examples of the input unit include a touch panel 231, a touch display, hardware keys such as a keyboard, a pointing device such as a mouse, a camera 234 (operation input via moving images), and a microphone 232 (operation input by voice).
  • The output unit is realized by any of various devices, or a combination thereof, capable of outputting the processing results of the control unit 21.
  • Non-limiting examples of the output unit include a touch panel, a touch display, a speaker 233 (audio output), a lens (as non-limiting examples, for 3D (three-dimensional) output or hologram output), and a printer.
  • The display unit 24 is realized by any of various devices, or a combination thereof, capable of displaying in accordance with display data written in a frame buffer.
  • Non-limiting examples of the display unit 24 include a touch panel, a touch display, a monitor (as non-limiting examples, a liquid crystal display or an OELD (organic electroluminescence display)), a head-mounted display (HMD), projection mapping, a hologram, and a device capable of displaying images, text information, and the like in the air (which may or may not be a vacuum). These display units 24 may or may not be capable of displaying the display data in 3D.
  • When the input/output unit 23 is a touch panel, the input/output unit 23 and the display unit 24 may be arranged facing each other with substantially the same size and shape.
  • The control unit 21 has a physically structured circuit for executing functions realized by code or instructions contained in a program, and is realized by, as a non-limiting example, a data processing device built into hardware. Accordingly, the control unit 21 may be expressed as a control circuit.
  • Non-limiting examples of the control unit 21 include a central processing unit (CPU), a microprocessor, a processor core, a multiprocessor, an ASIC (application-specific integrated circuit), and an FPGA (field-programmable gate array).
  • the storage unit 28 has a function of storing various programs and various data required for the terminal 20 to operate.
  • Non-limiting examples of the storage unit 28 include various storage media such as an HDD (hard disk drive), an SSD (solid state drive), flash memory, RAM (random access memory), and ROM (read only memory). The storage unit 28 may also be expressed as a memory.
  • The terminal 20 stores the program P in the storage unit 28, and by executing this program P, the control unit 21 performs the processing of each unit included in the control unit 21. That is, the program P stored in the storage unit 28 causes the terminal 20 to realize each function executed by the control unit 21. This program P may also be expressed as a program module.
  • the microphone 232 is used for inputting voice data.
  • the speaker 233 is used for outputting audio data.
  • the camera 234 is used for acquiring moving image data.
  • Cameras 234 may be provided on both the side of the terminal 20 on which the display unit 24 is provided and the opposite side; these are sometimes called the in-camera and the out-camera, respectively. Switching between the in-camera and the out-camera is executed by input from the user of the terminal 20.
  • the server 10 includes a control unit 11 (CPU), a storage unit 15, a communication I / F 14 (interface), an input / output unit 12, and a display unit 13.
  • As a non-limiting example, the components of the HW of the server 10 are connected to one another via a bus B.
  • The HW of the server 10 need not include all of these components.
  • As a non-limiting example, the HW of the server 10 may be configured without the display unit 13.
  • The control unit 11 has a physically structured circuit for executing functions realized by code or instructions contained in a program, and is realized by, as a non-limiting example, a data processing device built into hardware.
  • The control unit 11 is typically a central processing unit (CPU), and may be a microprocessor, a processor core, a multiprocessor, an ASIC, or an FPGA. In the present disclosure, the control unit 11 is not limited to these.
  • the storage unit 15 has a function of storing various programs and various data required for the server 10 to operate.
  • The storage unit 15 is realized by various storage media such as an HDD, an SSD, and flash memory. However, in the present disclosure, the storage unit 15 is not limited to these. The storage unit 15 may also be expressed as a memory.
  • the communication I / F 14 transmits and receives various data via the network 30.
  • the communication may be executed by wire or wirelessly, and any communication protocol may be used as long as mutual communication can be executed.
  • the communication I / F 14 has a function of executing communication with the terminal 20 via the network 30.
  • the communication I / F 14 transmits various data to the terminal 20 according to an instruction from the control unit 11. Further, the communication I / F 14 receives various data transmitted from the terminal 20 and transmits the various data to the control unit 11. Further, the communication I / F 14 may be simply expressed as a communication unit. Further, when the communication I / F 14 is composed of a physically structured circuit, it may be expressed as a communication circuit.
  • the input / output unit 12 is realized by a device that inputs various operations to the server 10.
  • the input / output unit 12 is realized by any or a combination of all kinds of devices capable of receiving an input from a user and transmitting information related to the input to the control unit 11.
  • the input / output unit 12 is typically realized by a hardware key typified by a keyboard or the like, or a pointing device such as a mouse.
  • The input/output unit 12 may also include, as non-limiting examples, a touch panel, a camera (operation input via moving images), and a microphone (operation input by voice). However, in the present disclosure, the input/output unit 12 is not limited to these.
  • the display unit 13 is typically realized by a monitor (not limited to, for example, a liquid crystal display or an OELD (organic electroluminescence display)).
  • The display unit 13 may be a head-mounted display (HMD) or the like. These display units 13 may or may not be capable of displaying the display data in 3D. However, in the present disclosure, the display unit 13 is not limited to these.
  • The server 10 stores the program P in the storage unit 15, and by executing the program P, the control unit 11 performs the processing of each unit included in the control unit 11. That is, the program P stored in the storage unit 15 causes the server 10 to realize each function executed by the control unit 11. This program P may also be expressed as a program module.
  • The processes of the control unit 21 of the terminal 20 and/or the control unit 11 of the server 10 may be realized not only by a CPU having a control circuit but also by a logic circuit (hardware) or a dedicated circuit formed in an integrated circuit (an IC (Integrated Circuit) chip, an LSI (Large Scale Integration), or the like). These circuits may be realized by one or more integrated circuits, and the plurality of processes shown in each embodiment may be realized by a single integrated circuit. Depending on the degree of integration, an LSI may be referred to as a VLSI, a super LSI, an ultra LSI, or the like. Accordingly, the control unit 21 may be expressed as a control circuit.
  • The program P of each embodiment of the present disclosure (as non-limiting examples, a software program, a computer program, or a program module) may, but need not, be provided stored in a computer-readable storage medium.
  • The storage medium can store the program P in a "non-transitory tangible medium".
  • The program P may realize a part of the functions of each embodiment of the present disclosure. Further, the program P may be a so-called difference file (difference program) that realizes the functions of each embodiment of the present disclosure in combination with a program P already recorded on the storage medium.
  • the storage medium may be one or more semiconductor-based or other integrated circuits (ICs) (for example, but not limited to field programmable gate arrays (FPGAs) or application-specific ICs (ASICs)), hardware.
  • the storage medium may be volatile, non-volatile, or a combination of volatile and non-volatile, where appropriate.
  • The storage medium is not limited to these examples, and may be any device or medium capable of storing the program P. The storage medium may also be expressed as a memory.
  • the server 10 and / or the terminal 20 can read the program P stored in the storage medium and execute the read program P to realize the functions of the plurality of functional units shown in each embodiment.
  • The program P of the present disclosure may be provided to the server 10 and/or the terminal 20 via an arbitrary transmission medium (a communication network, a broadcast wave, or the like) capable of transmitting the program.
  • the server 10 and / or the terminal 20 realizes the functions of the plurality of functional units shown in each embodiment by executing the program P downloaded via the Internet or the like, as an example without limitation.
  • Each embodiment of the present disclosure may also be realized in the form of a data signal embedded in a carrier wave, in which the program P is embodied by electronic transmission.
  • At least part of the processing on the server 10 and/or the terminal 20 may be realized by cloud computing composed of one or more computers.
  • At least a part of the processing in the terminal 20 may be performed by the server 10. As a non-limiting example, at least a part of the processing of each functional unit of the control unit 21 of the terminal 20 may be performed by the server 10.
  • At least a part of the processing in the server 10 may be performed by the terminal 20. As a non-limiting example, at least a part of the processing of each functional unit of the control unit 11 of the server 10 may be performed by the terminal 20.
  • The determination configurations in the embodiments of the present disclosure are not essential; a predetermined process may be operated when a determination condition is satisfied, or a predetermined process may be performed when the determination condition is not satisfied.
  • The program of this disclosure is implemented using, as non-limiting examples, a scripting language such as ActionScript or JavaScript (registered trademark), an object-oriented programming language such as Objective-C or Java (registered trademark), or a markup language such as HTML5.
  • messages can be exchanged on the talk room between the terminals 20 via the server 10 and via the messaging application.
  • the talk room is a place where users who use the messaging service exchange contents in the messaging service provided by the server 10.
  • The content exchanged in the talk room includes character information input by a user using his/her own terminal 20, image information including photographs and stamps, and various file information such as audio files, video files, and data files, but is not limited to these.
  • the users of the terminal 20 can further execute a call via the talk room.
  • the users 10a and 10b make a call as shown in FIG. 2A.
  • After the call, image information indicating that the users made a call (hereinafter referred to as a call icon; the image indicating that a call was made is not limited to an icon, and this image information is a non-limiting example of information related to a call) is displayed in the talk room.
  • The terminal further displays a message (a non-limiting example of call information) indicating the content of the call as text.
  • FIG. 2B is a diagram showing an example of a display screen of the terminal 20b of the user 10b. The details will be described below.
  • The terminal 20 includes, as functions realized by the control unit 21, a message processing unit 211, a call unit 212, a voice recognition unit 213, and a display processing unit 214.
  • The message processing unit 211 receives input from the user and/or content including messages received by the communication I/F 22, in accordance with the messaging application of the messaging service provided by the server 10, and instructs the display processing unit 214 to display it.
  • The message processing unit 211 also instructs the communication I/F 22 to transmit the received input content to the server 10.
  • The target processed by the message processing unit 211 is not limited to text messages input by the user to the talk room, and may also include image information including photographs and stamps, audio files, video files, data files, and the like.
  • The message processing unit 211 may determine the display size of the call icon according to the amount of text data generated by the voice recognition unit 213, and may, but need not, instruct the display processing unit 214 to display a call icon of a size corresponding to the amount of text.
  • This way, the amount of conversation can be estimated from the size of the call icon when the user reviews it later.
  • The amount of conversation may also be expressed by changing the color of the call icon according to the amount of text, as in the sketch below.
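  • A minimal, non-authoritative sketch of this sizing idea follows: the function maps the amount of recognized text to an icon size and color. The pixel range, the thresholds, and the color values are illustrative assumptions, not values from the disclosure.

```python
def call_icon_style(recognized_text: str,
                    min_px: int = 48, max_px: int = 96,
                    full_chars: int = 2000) -> dict:
    """Derive a call-icon display size and color from the amount of text
    produced by voice recognition: longer calls yield a larger, warmer icon."""
    ratio = min(len(recognized_text) / full_chars, 1.0)   # 0.0 .. 1.0
    size_px = round(min_px + (max_px - min_px) * ratio)   # linear size ramp
    # Short calls render in gray, medium in green, long in red (assumed ramp).
    color = "#9e9e9e" if ratio < 0.33 else "#4caf50" if ratio < 0.66 else "#f44336"
    return {"size_px": size_px, "color": color}

print(call_icon_style("hello"))     # small gray icon for a short call
print(call_icon_style("a" * 1500))  # larger, highlighted icon for a long call
```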
  • the call unit 212 has a function of executing a call with another user who uses the messaging service via the server 10 on the messaging service.
  • The call unit 212 has a function of making a call to a designated party upon receiving a call input on the messaging service from the user of the terminal 20, and a function of receiving a call from another user who uses the messaging service.
  • the call unit 212 executes a call by a function called VoIP (Voice over Internet Protocol), for example, without limitation.
  • The call unit 212 may, but need not, record the contents of the call in the storage unit 28 during the call. Further, the call unit 212 may have a video call function.
  • During a call, the call unit 212 transmits the sound collected by the microphone 232 and the video captured by the camera 234 to the server 10 via the communication I/F 22, receives via the communication I/F 22 the voice signal and the video signal transmitted from the other party through the server 10, outputs sound based on the voice signal from the speaker 233, and instructs the display processing unit 214 to display video based on the video signal on the display unit 24. Further, the call unit 212 may, but need not, start a call (calling) based on an input to the image information displayed in the talk room indicating that a call has been made (the call icon, or an image for calls different from the call icon).
  • That is, call processing may, but need not, be executed so as to start a call with the user corresponding to the talk room when a predetermined input is made to the call icon in the talk room indicating that a call has been made.
  • The call may be made via a speaker having an AI assistant function, such as a smart speaker, held by the user of the terminal 20. In that case, the call with the other terminal is made through the smart speaker: the voice collected by the smart speaker is transmitted directly to the server 10, and the server 10 transmits it to the terminal of the other party.
  • The smart speaker itself may perform the voice recognition processing and transmit a text message to the server 10, and the server 10 may transmit text indicating the content of the call to the talk room of the terminal 20 of the user associated with the smart speaker, so that the display unit 24 of the terminal 20 displays a message indicating the content of the call in the talk room. Alternatively, the smart speaker may transmit only the voice to the server 10, the server 10 may perform the voice recognition processing and transmit a text message indicating the content of the call to the terminal 20 of the user corresponding to the smart speaker, and the display unit 24 of the terminal 20 may display the message indicating the content of the call in the talk room.
  • Alternatively, the communication I/F 22 of the terminal 20 may receive the user's voice collected by the smart speaker, and the call unit 212 may transmit that voice to the server 10 via the communication I/F 22.
  • the voice recognition unit 213 has a function of recognizing the voice of the call executed by the call unit 212 and converting it into text data.
  • the voice recognition by the voice recognition unit 213 may be executed for the recorded data of the call recorded in the storage unit 28 by the call unit 212.
  • The voice recognition unit 213 may, but need not, record the text data obtained by voice recognition in the storage unit 28.
  • the voice recognition unit 213 transmits the text data obtained by voice recognition to the message processing unit 211.
  • The voice recognition unit 213 divides the text data obtained by voice recognition by speaker, in chronological order, associates information indicating the speaker with each divided piece of text data, and transmits them to the message processing unit 211.
  • To identify a speaker from the voice itself, a feature amount of the voice during the conversation (as a non-limiting example, its frequency spectrum) is extracted, and the speakers are classified based on it, as in the sketch below.
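  • The following is a minimal sketch of that classification idea under strong simplifying assumptions: fixed-length frames, a truncated magnitude spectrum as the voice feature amount, and naive 2-means clustering for exactly two speakers. A practical diarization system would be considerably more elaborate.

```python
import numpy as np

def frame_features(signal: np.ndarray, rate: int, frame_s: float = 0.5) -> np.ndarray:
    """Split the call audio into fixed-length frames and use each frame's
    low-frequency magnitude spectrum (an assumed feature amount) as its feature."""
    n = int(rate * frame_s)
    frames = [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]
    return np.array([np.abs(np.fft.rfft(f))[:256] for f in frames])

def assign_speakers(feats: np.ndarray, iters: int = 20) -> np.ndarray:
    """Naive 2-means clustering: frames with similar spectra get the same label,
    so labels[i] names the (assumed) speaker of frame i."""
    centers = feats[[0, -1]].copy()   # initialize with first and last frames
    for _ in range(iters):
        dists = np.linalg.norm(feats[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for k in (0, 1):
            if (labels == k).any():
                centers[k] = feats[labels == k].mean(axis=0)
    return labels

rate = 8000
t = np.arange(rate * 2) / rate
# Toy signal: one second dominated by 200 Hz, then one second by 500 Hz.
audio = np.concatenate([np.sin(2 * np.pi * 200 * t[:rate]),
                        np.sin(2 * np.pi * 500 * t[:rate])])
print(assign_speakers(frame_features(audio, rate)))  # two frames per "speaker"
```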
  • The display processing unit 214 displays, in the display area of the display unit 24, the content that the message processing unit 211 instructs it to display.
  • The message processing unit 211 displays content transmitted by the terminal 20 (a non-limiting example of the second content) and content transmitted by a terminal held by another user (a non-limiting example of the first content) in the display area of the display unit 24. As a non-limiting example, content transmitted by another user is displayed on the left side of the display area, and content transmitted by the user of the terminal 20 is displayed on the right side.
  • Displaying content transmitted by another user on the left side of the display area of the display unit 24 means displaying the content closer to the left edge of the display area. That is, as shown in the talk room display example of FIG. 2B, the left end of a message corresponding to voice spoken by another user is displayed near the left side of the display area.
  • Similarly, displaying content transmitted by the user of the terminal 20 on the right side of the display area of the display unit 24 means displaying the content (message) closer to the right edge of the display area, as shown in the talk room display example of FIG. 2B.
  • Based on the voice recognized by the voice recognition unit 213, the display processing unit 214 displays in the display area a message indicating the content of the call spoken by the user of the terminal 20 (a non-limiting example of the second information) in association with the user of the terminal 20, and displays in the display area a message indicating the content of the call spoken by the user of the other party (a non-limiting example of the first information) in association with the user of the other party.
  • the server 10 includes a message processing unit 111 as a function realized by the control unit 11.
  • the message processing unit 111 has a function of managing a talk room for exchanging information between users.
  • The message processing unit 111 relays the exchange of content, including messages, between the terminals provided with the messaging service by the server 10. That is, when content is transmitted from a user to a talk room, the message processing unit 111 identifies the talk room and transmits the content to the other users belonging to that talk room.
  • FIG. 3 is a sequence diagram showing an example of communication between each device in the communication system 1 according to the present embodiment.
  • the sequence diagram shown in FIG. 3 is a diagram showing exchanges when users make a call on a message application.
  • the terminal 20a makes a call by designating a call partner from the message application according to the input from the user (step S301). That is, the terminal 20a transmits a call request including the information of the other party to the server 10.
  • When the server 10 receives a call request from the terminal 20a, the server 10 identifies the user of the call partner (terminal 20b) from the information of the call partner included in the call request, and transmits a call signal to the identified user (terminal 20b) (step S302).
  • The terminal 20b receives the call signal transmitted from the server 10. That is, the terminal 20b receives a call request from the user of the terminal 20a on the message application (step S303). Then, the terminals 20a and 20b make a call on the message application via the server 10 (step S304). Here, the content of the call may, but need not, be recorded. Then, the user of the terminal 20a and the user of the terminal 20b each make an input to end the call on their terminals, and the call ends (step S305).
  • After the end of the call, the terminal 20b performs voice recognition on the content of the call and converts it into text information (step S306).
  • When the content of the call is recorded, the voice recognition process can be executed after the call ends; when it is not recorded, real-time voice recognition is executed from immediately after the call starts.
  • the terminal 20b stores a message (text message) obtained by voice recognition (step S307).
  • the message obtained by voice recognition may be transmitted not only to the terminal 20b but also to the server 10 and the terminal 20a and stored in the server 10 and the terminal 20a. Further, it may be stored only in the server 10 instead of the terminal 20b.
  • the data of the text message obtained by voice recognition can be stored and displayed in the talk room.
  • the voice-recognized text data is displayed as a message in the display area of the display unit 24 of the terminal 20 (step S308).
  • Although not shown, the terminal 20a may, but need not, also execute the processes of steps S306 to S308, that is, performing voice recognition on the content of the call and displaying the recognized text data. Further, since the call is made via the server 10, the voice recognition process may be executed by the server 10; in that case, the text data indicating the content of the call obtained by the server 10 through voice recognition is transmitted to each user (terminal 20) involved in the call and displayed in the talk room of each terminal. In this way, the content of the call is automatically converted into text data and displayed in the talk room, so that the user can reliably check the content of a past call later. A sketch of this exchange follows.
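  • Below is a compact, non-authoritative sketch of the step S301 to S308 exchange, with a toy in-memory relay standing in for the server 10 and a stubbed recognizer. All class and method names (Server.route_call, Terminal.end_call, and so on) are assumptions for illustration, not the system's actual API.

```python
class Server:
    """Toy stand-in for server 10: relays the call request (S302) and, here,
    also fans recognized text out to every terminal in the talk room (S308)."""
    def __init__(self):
        self.terminals = {}

    def route_call(self, caller: str, callee: str):
        self.terminals[callee].on_incoming_call(caller)      # S302 -> S303

    def broadcast_transcript(self, room: list, text: str):
        for name in room:
            self.terminals[name].show_in_talk_room(text)     # S308

def recognize(audio: bytes) -> str:
    return "<recognized call text>"   # stub for voice recognition (S306)

class Terminal:
    def __init__(self, name: str, server: Server):
        self.name, self.server = name, server
        server.terminals[name] = self

    def call(self, callee: str):
        self.server.route_call(self.name, callee)            # S301

    def on_incoming_call(self, caller: str):
        print(f"{self.name}: incoming call from {caller}")   # S303

    def end_call(self, recorded_audio: bytes, room: list):
        text = recognize(recorded_audio)                     # S305 -> S306
        self.server.broadcast_transcript(room, text)         # S307 -> S308

    def show_in_talk_room(self, text: str):
        print(f"{self.name} talk room: {text}")

server = Server()
a, b = Terminal("20a", server), Terminal("20b", server)
a.call("20b")                               # steps S301 to S303
b.end_call(b"...", room=["20a", "20b"])     # steps S305 to S308
```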
  • FIG. 4 is a flowchart showing an operation example of the terminal 20 for realizing the processing of the sequence diagram shown in FIG.
  • The control unit 21 of the terminal 20 detects whether or not a call has been started on the messaging application (step S401). The call unit 212 can detect this based on whether there is a response to an outgoing call made from the terminal 20 according to input from the user, or whether there is an answer input for an incoming call from another terminal.
  • the control unit 21 of the terminal 20 records the voice of the call while the call unit 212 is talking, and stores the recorded voice data in the storage unit 28 (step S402).
  • the control unit 21 of the terminal 20 determines whether or not the call has ended based on whether or not there is a call end input from the user via the input / output unit 23 (step S403). If the call has not ended (NO in step S403), it waits until the call ends.
  • If it is determined that the call has ended (YES in step S403), the control unit 21 ends the recording.
  • the voice recognition unit 213 executes voice recognition processing on the recorded voice data. Then, the text message obtained by voice recognition is stored in the storage unit 28 (step S404). That is, the voice recognition unit 213 converts the recorded voice data into text data indicating the contents of the call.
  • The text message obtained by voice recognition may, but need not, be transmitted to the server 10; further, when the server 10 receives the text message, it may, but need not, transmit it to the terminal of the other party. By transmitting the text message obtained by the terminal 20 through voice recognition to the server 10 or to the other party's terminal, a message indicating the contents of the call can also be displayed as text on the other party's terminal, so that the other party can likewise review the contents of the call later by looking at the message. The other party's terminal may, but need not, display the received text message in the talk room in the same manner as the terminal 20.
  • The voice recognition unit 213 divides the text data obtained by voice recognition by speaker, in chronological order (step S405). At this time, the voice recognition unit 213 may, but need not, further divide the text data of content spoken by the same speaker according to a predetermined standard; as a non-limiting example, it may be divided by sentence. The voice recognition unit 213 transmits the divided text data to the display processing unit 214.
  • The display processing unit 214 displays each piece of text data divided by the voice recognition unit 213 on the display unit 24 as a message in the talk room, in association with the corresponding speaker (step S406). That is, the control unit 21 of the terminal 20 displays a text message obtained by voice-recognizing the voice of the user holding the terminal 20 (a non-limiting example of the second information) in association with the user of the terminal 20, and displays a text message obtained by voice-recognizing the voice of the other party (a non-limiting example of the first information) in association with the other party (see the sketch below).
  • the control unit 21 determines whether or not there is a termination input of the messaging application from the user via the input / output unit 23 (step S407). If there is no end input (NO in step S407), the process returns to the process of step S401. On the other hand, if there is an end input (YES in step S407), the process ends.
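  • A minimal sketch of the association in steps S405 and S406, assuming the recognizer already yields chronological (speaker, text) pairs: the terminal's own user renders on the right and the other party on the left, matching the display rules described earlier. The function name and the console rendering are assumptions for illustration.

```python
def render_talk_room(segments: list, me: str, width: int = 44):
    """segments: chronological (speaker, text) pairs from voice recognition.
    Each divided piece is shown as a talk-room message tied to its speaker."""
    for speaker, text in segments:
        line = f"{speaker}: {text}"
        if speaker == me:
            print(line.rjust(width))   # own speech: right side (second information)
        else:
            print(line.ljust(width))   # other party: left side (first information)

render_talk_room(
    [("userB", "Hello, about tomorrow..."),
     ("userA", "Yes, let's meet at ten."),
     ("userB", "Got it, see you then.")],
    me="userA",
)
```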
  • As described above, according to the terminal 20 of the present embodiment, when a call is executed on the messaging application as shown in FIG. 2A, the content of the call can be automatically converted into text and displayed as messages as shown in FIG. 2B. This can later help the user recall the content of the conversation at the time of the call.
  • FIG. 5 is a flowchart showing an operation example of the process related to the display of the message indicating the content of the call on the terminal 20.
  • The terminal 20 may, but need not, have a function of switching between displaying and hiding the message of the call contents when users have made a call in the talk room.
  • FIG. 5 is a flowchart showing an operation example of the terminal 20 when the display / non-display of the message can be switched.
  • FIG. 5 is a flowchart showing the operation of the terminal 20 when the talk room is displayed on the display unit 24 of the terminal 20 and a call has been made on the messaging application in the past.
  • The process shown in FIG. 5 assumes that the user has executed the messaging application on the terminal 20 and displayed the talk room.
  • When the talk room is displayed on the display unit 24 of the terminal 20 and there has been a call in the past on the messaging application, image information (a call icon) indicating that the call was made is displayed in the talk room.
  • The control unit 21 of the terminal 20 determines whether or not an input (as a non-limiting example, a touch input) to the call icon displayed in the talk room has been made via the input/output unit 23 (step S501).
  • When there is a touch input to the call icon (YES in step S501), the control unit 21 determines whether or not the content of the message corresponding to the call icon is currently expanded (step S502). Expanding a message is synonymous with displaying the message indicating the content of the call.
  • If the message is expanded (YES in step S502), the display processing unit 214 hides the displayed call message (step S503) and the process ends.
  • If the message is not expanded (NO in step S502), the display processing unit 214 displays the content of the call message on the display unit 24 (step S504) and the process ends. A sketch of this toggle follows.
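  • A minimal sketch of this step S501 to S504 toggle, assuming the expanded or collapsed state is held per call icon; the class name and the return convention are illustrative assumptions.

```python
class CallIconEntry:
    """Talk-room entry for a past call: tapping the icon (S501) toggles the
    transcript between hidden (S503) and shown (S504)."""
    def __init__(self, transcript: list):
        self.transcript = transcript
        self.expanded = False   # the initial state may also come from user settings

    def on_tap(self) -> list:
        self.expanded = not self.expanded                  # S502: was it expanded?
        return self.transcript if self.expanded else []    # S504 / S503

entry = CallIconEntry(["A: hello", "B: hi, about tomorrow..."])
print(entry.on_tap())   # first tap expands the call message
print(entry.on_tap())   # second tap hides it again
```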
  • Whether the terminal 20 initially displays the message in the expanded state or the unexpanded state in the talk room may be determined by a setting made to the terminal 20 by the user.
  • All of the text messages obtained by voice recognition of the call contents may be displayed, or only a partial excerpt may be displayed.
  • Alternatively, the text messages may be analyzed so that only the content presumed to be important in the call is displayed, as in the sketch below.
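  • One simple, non-authoritative way to realize such an excerpt is keyword scoring; the keyword list and the scoring rule below are purely illustrative assumptions, and the disclosure does not prescribe any particular analysis method.

```python
def important_excerpt(messages: list,
                      keywords: tuple = ("tomorrow", "meet", "deadline"),
                      limit: int = 2) -> list:
    """Score each recognized message by how many assumed-important keywords
    it contains, and keep the top `limit` messages as the displayed excerpt."""
    scored = sorted(messages,
                    key=lambda m: sum(k in m.lower() for k in keywords),
                    reverse=True)
    return scored[:limit]

calls = ["uh, hello", "let's meet tomorrow at ten", "ok", "the deadline is friday"]
print(important_excerpt(calls))
# -> ["let's meet tomorrow at ten", "the deadline is friday"]
```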
  • FIG. 6 is a diagram showing an example of how the display of the talk room changes before and after a call when a call is made in the talk room on the terminal 20.
  • FIG. 6A shows an example of displaying the talk room before the call
  • FIG. 6B shows an example of displaying the talk room after the call.
  • FIG. 6A shows a display example of a certain talk room of the user of the terminal 20, and shows a state in which the message 601 sent at 22:11 is displayed.
  • the user of the terminal 20 makes a call with another user related to the talk room.
  • the content of this call is recorded and converted into a text message by voice recognition processing.
  • the text message is displayed in association with each user related to the call. That is, as shown in FIG. 6B, the terminal 20 displays the call icon 611 indicating that the call has been made on the talk room, following the message 601.
  • The call icon 611 may, but need not, be displayed in association with date and time information 612 indicating when the call was made (which may be the start date and time or the end date and time of the call).
  • the terminal 20 displays the call content as a message in which the content of the call is converted into text by voice recognition, as shown in the portion surrounded by the dotted line 613.
  • the terminal 20 can leave information indicating the contents of the call on the talk room in the form of a message.
  • FIG. 7 is a diagram showing a display example when the processing shown in FIG. 5 is performed on the terminal 20.
  • FIG. 7A is a screen view showing a state in which the message indicating the content of the call is not displayed, and FIG. 7B is a screen view showing a state in which the message indicating the content of the call is expanded and displayed.
  • the talk room of the messaging application is displayed on the display unit 24 of the terminal 20. Then, it is assumed that the call icon 611 indicating that the call has been made is displayed on the talk room.
  • The user touches the call icon 611 with his/her finger or a stylus, that is, instructs expansion of the message of the call contents.
  • In response, the terminal 20 expands the message indicating the content of the corresponding call and displays it on the display unit 24, as shown in FIG. 7B. In FIG. 7B, the content of the call is displayed in message format below the call icon 611.
  • The display processing unit 214 of the terminal 20 can likewise change from the display mode shown in FIG. 7(b) back to the display mode shown in FIG. 7(a).
  • the first display mode after the call may be the display mode shown in FIG. 6 (b) or the display mode shown in FIG. 7 (a).
  • The messaging application may be configured so that the user of the terminal 20 can make this setting, and the terminal 20 may display either the display mode shown in FIG. 6(b) or the display mode shown in FIG. 7(a) according to the setting made by the user.
  • the terminal 20 can remind the user of the conversation he / she wants to remember.
  • The method of displaying the message indicating the content of the call is not limited to expansion; as an example, a message may pop up when the user touches the vicinity of the call icon 611, or the display may transition to a screen different from the talk room.
  • An image of a user involved in the call may also be displayed. In that case, the image may be displayed as a substitute for the call icon 611, or together with the call icon 611.
  • Non-limiting examples of the user's image include the user's face photograph, the profile image used by the user on the messaging application, and a face photograph of the user taken with the in-camera during the call (or a processed version thereof); however, the present invention is not limited to these.
  • FIG. 8 is a diagram showing one display mode of the call icon 611.
  • FIG. 8A shows an example in which the user brings his / her finger closer to the call icon 611
  • FIG. 8B shows an example in which the user's finger approaches the call icon 611 by a certain amount or more.
  • FIG. 8 shows a state in which a message indicating the content of the call is not expanded.
  • the user brings his or her finger closer to the call icon 611a.
  • The touch panel 231 of the terminal detects a state in which the user's finger is in contact with, or within a certain proximity of, the touch panel 231, and detects the operation position.
  • the control unit 21 of the terminal 20 determines whether the coordinates on the touch panel 231 indicated by the detected operation position are close to the display coordinates of the call icon 611a.
  • When it is determined that the operation position is close to the display coordinates of the call icon 611a, the control unit 21 of the terminal 20 may, but need not, enlarge the call icon as the call icon 611b shown in FIG. 8B (see the sketch below).
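  • A minimal sketch of this proximity-driven enlargement, assuming the touch position and the icon center share one pixel coordinate space; the distance threshold and maximum scale are illustrative assumptions.

```python
import math

def icon_scale(touch: tuple, icon_center: tuple,
               near_px: float = 80.0, max_scale: float = 1.5) -> float:
    """Return the call icon's display scale: 1.0 while the finger is far away,
    growing toward max_scale as the detected operation position approaches."""
    dist = math.dist(touch, icon_center)
    if dist >= near_px:
        return 1.0
    return 1.0 + (max_scale - 1.0) * (1.0 - dist / near_px)

print(icon_scale((200, 200), (210, 205)))  # finger near the icon: enlarged
print(icon_scale((10, 10), (210, 205)))    # finger far away: normal size
```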
  • the terminal 20 may be configured to display the content of a message indicating the content of a call as a pop-up message 901 as shown in FIG. 9A.
  • the terminal 20 may be configured to transition to a screen different from the talk room and display the content of the message indicating the content of the call, as shown in FIG. 9B.
  • In this case, a return icon 902 for returning to the original talk room display may, but need not, be displayed; touching the return icon 902 returns the display to the original talk room.
  • In the above description, the speaker in a call is identified using a voice feature amount; alternatively, the terminal that acquired each utterance may attach to the voice signal information that can identify that terminal (or its user), so that the speaker of each voice can be distinguished.
  • For voice picked up by a smart speaker, the speaker may be identified by receiving position information of each speaker together with the voice.
  • Since a smart speaker can distinguish from which direction a voice was heard, speakers can also be distinguished by attaching to each voice information indicating the direction from which the smart speaker received it.
  • the message processing unit 211 can display a message indicating the contents of the call in association with the speaker.
  • Even when the same speaker continues talking, the voice recognition unit 213 may, but need not, divide the text data obtained by voice recognition based on sentence breaks, conversation breaks, context breaks, and the like. This division may also, but need not, simply occur whenever the number of characters exceeds a predetermined count, as in the sketch below.
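  • A minimal sketch of these division rules, assuming a split at sentence-ending punctuation plus a forced break at a fixed character count; both the regular expression and the limit are illustrative assumptions.

```python
import re

def split_transcript(text: str, max_chars: int = 60) -> list:
    """Divide one speaker's recognized text first at sentence breaks, then
    force a break whenever a piece still exceeds max_chars characters."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks = []
    for s in sentences:
        while len(s) > max_chars:        # predetermined character-count limit
            chunks.append(s[:max_chars])
            s = s[max_chars:]
        chunks.append(s)
    return chunks

print(split_transcript("Sure. Let me check the schedule and call you back tomorrow!"))
# -> ['Sure.', 'Let me check the schedule and call you back tomorrow!']
```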
  • The voice recognition unit 213 may delete content related to ambient noise from the text data obtained by voice recognition. This may be achieved using known noise-canceling techniques, or by using context analysis to remove unnatural words, if any, from the text data. In addition, the voice recognition unit 213 may, but need not, delete messages corresponding to short backchannel responses (such as "uh-huh") from the obtained text data.
  • In that case, image information (as a non-limiting example, a stamp depicting a backchannel response such as nodding) may be used to indicate that a backchannel response was made.
  • As described above, the user of the terminal 20 uses the terminal 20 to make a call with another user via the messaging application provided by the server 10. The terminal 20 then displays information indicating the contents of the call in the talk room of the messaging application, in the display area of the display unit 24. Specifically, the terminal 20 converts the content of the call into text data by voice recognition processing and displays the converted text data in the talk room of the messaging application.
  • the terminal 20 can convert the contents of the call into a text message and display it without forcing the user to perform a special operation.
  • the terminal 20 may display a call icon indicating that a call has been made on the talk room. Then, the display or non-display of the message indicating the content of the call may be switched by the input from the user to the call icon.
  • By hiding the message, the terminal 20 can prevent the talk room from becoming hard to read due to the large number of messages that would result from displaying the contents of a prolonged call.
  • When needed, the message can be expanded so that the user can check the content of the call.
  • The terminal 20 may display all, a part, or none of the text messages obtained by performing voice recognition processing on the contents of the call. Which display mode is used may be determined by the user's settings on the terminal 20.
  • By letting the user select and set the display mode, the terminal 20 can provide convenience to the user.
  • The terminal 20 may use an image of the other party (as non-limiting examples, a face image or the profile image used on the messaging application) as the information indicating that a call was made. Further, an image of the user of the terminal 20 (likewise, a face image or a profile image used on the messaging application) may also be displayed.
  • the terminal 20 can make the user recognize at a glance that the call was made and who the other party was.
  • When converting the contents of the call into text data by the voice recognition process, the terminal 20 identifies which user is speaking. The text data is then associated with the identified user and displayed as if it were a message sent by that user.
  • The terminal 20 can thus display messages while distinguishing between the user of the terminal 20 and the user of the other party during a call, so that it is possible to confirm later who said each statement.
  • the terminal 20 displays image information indicating that a call has been made on the messaging application on the talk room, and when there is a user input for the image information, the terminal 20 is associated with the talk room. It may be configured to initiate a call with a user.
  • FIG. 10 is a flowchart showing an operation example of the terminal when the user makes a video call.
  • a call by a video call is also possible.
  • a video call is a so-called video telephone function.
  • the call unit 212 of the terminal 20 starts a video call with the other party via the server 10 (step S1001). This is started when the user of the terminal 20 gives a call instruction or receives a call from another user on the messaging application.
  • the call unit 212 instructs the camera 234 of the input / output unit 23 to start imaging.
  • the camera 234 images the display unit 24 side of the terminal 20, that is, the user of the terminal 20.
  • the call unit 212 instructs the microphone 232 to acquire the conversation sound of the user of the terminal 20.
  • the call unit 212 transmits the video captured by the camera 234 and the voice acquired by the microphone 232 to the server 10 via the communication I / F 22 during the video call.
  • the video captured by the camera 234 and the voice acquired by the microphone 232 are transmitted from the server 10 to the terminal of the other party.
  • The communication I/F 22 of the terminal 20 sequentially receives the video and audio transmitted from the other party's terminal via the server 10, and the terminal instructs the display processing unit 214 to display the received video on the display unit 24.
  • the input / output unit 23 is instructed to output the received voice from the speaker 233.
  • the call unit 212 stores the video captured by the terminal 20 and the acquired voice, and the video and voice transmitted from the terminal of the other party in the call in the storage unit 28.
  • the terminal 20 ends the video call by inputting an instruction to end the video call from the terminal 20 or by the other party disconnecting the call (step S1003).
  • the voice recognition unit 213 of the terminal 20 performs voice recognition on the recorded voice of the video call (step S1004). Further, the control unit 21 of the terminal 20 may or may not specify the user's emotion from the content of the image.
  • the text message obtained by the voice recognition is displayed in the talk room (step S1005).
  • the message may or may not be displayed in a display mode according to the emotion of the user who specified the message.
  • the display mode according to the user's emotion is, as an example, to change the shape of the bubble (speech balloon) for displaying the message (for example, when the user is angry, the shape of the balloon is made jagged).
  • alternatively, a character indicating the specific emotion may be appended to the message (for example, if the user is angry, # may be appended at the end of the message, or if the user is happy, a symbol expressing that emotion may be appended at the end), or the characters may be displayed in a color according to the emotion.
  • an emoticon or image information indicating the user's emotion (not limited, but a stamp as an example) may be displayed together.
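  • a minimal sketch of mapping a recognized emotion label to the display attributes listed above (balloon shape, appended character, text color); the labels, the "♪" placeholder for the happiness symbol (which is garbled in the source), and the color names are illustrative assumptions:

```python
# Hypothetical mapping from an emotion label to display attributes.
EMOTION_STYLES = {
    "angry":   {"balloon": "jagged",  "suffix": "#", "color": "red"},
    "happy":   {"balloon": "rounded", "suffix": "♪", "color": "orange"},  # "♪" is a placeholder
    "neutral": {"balloon": "rounded", "suffix": "",  "color": "black"},
}

def style_message(text, emotion):
    """Attach the emotion-dependent display mode to a recognized message."""
    style = EMOTION_STYLES.get(emotion, EMOTION_STYLES["neutral"])
    return {"text": text + style["suffix"],
            "balloon": style["balloon"],
            "color": style["color"]}

print(style_message("Why are you late", "angry"))
```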
  • the control unit 21 of the terminal 20 determines whether or not the user has switched to the out-camera or activated the out-camera during the video call (step S1006). When the user of the terminal 20 switches to or activates the out-camera, this can be detected from the user's input to the terminal 20; when the other party switches to the out-camera, an unnatural break occurs in the transmitted video, and the switch can be detected by detecting that break.
  • when the out-camera has been switched to (YES in step S1006), the control unit 21 inserts one frame of the video taken by the out-camera as a still image, or the video obtained while shooting with the out-camera, into the talk room in association with the displayed text messages converted from the video call (step S1007).
  • the moving image may span from the timing of switching to the out-camera to the timing of switching back to the in-camera, but the present invention is not limited to this.
  • the insertion position of the still image or the moving image may be arbitrary; for example, it may be at the beginning or the end of the text messages converted by voice recognition of the video call, or at the timing when the switch to the out-camera occurred. If the out-camera is not switched to during the video call (NO in step S1006), the process proceeds to step S1008.
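  • a minimal sketch of picking the still frame for step S1007, assuming the recording exposes per-frame timestamps and the camera-switch time was logged (both assumptions, not details from the disclosure):

```python
def frame_at_switch(frames, switch_time):
    """Pick the first frame captured at or after the out-camera switch.

    `frames` is a list of (timestamp, frame) pairs in capture order;
    the selected frame is inserted into the talk room as a still image.
    """
    for ts, frame in frames:
        if ts >= switch_time:
            return frame
    return None  # the switch happened after the last recorded frame

frames = [(0.0, "frame0"), (5.2, "frame1"), (10.8, "frame2")]
print(frame_at_switch(frames, switch_time=5.0))  # -> "frame1"
```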
  • the control unit 21 determines whether or not there is an input related to position information during a call (step S1008).
  • the input related to the position information may be any form of input as long as it is an input of information that can specify the position of the terminal 20 or the terminal of the other party. Not limited, but as examples, it may be input of a place name or facility name by voice or by direct input from the user or the other party, input of an instruction from the user to acquire location information (location information by GPS), automatic acquisition of location information by GPS that is always activated, transmission of position information from the call partner, or input from the user of an image or information that can specify the position. If there is no input regarding the position information during the call (NO in step S1008), the process ends.
  • the control unit 21 inserts an image related to the position information into the talk room (step S1009).
  • the image related to the position information is an image related to the position of the terminal 20 or the position of the terminal of the other party, and may be any image as long as it is related.
  • when a place name or facility name is input, map information including the area around the place name may be acquired as an image and inserted, map information showing the location of the facility may be inserted, or a photograph showing the appearance of the facility may be acquired and inserted.
  • the image of the surrounding map including the acquired location information may be acquired and inserted.
  • the image of the surrounding map including the received location information may be acquired and inserted.
  • the homepage of the store or facility where the user (or the other party) is located may be accepted as information on the user's position, and the address and a representative image of the homepage may be acquired and inserted, or map information indicating a location that can be identified from the homepage may be acquired and inserted.
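  • a minimal sketch of resolving the position inputs described above into an image to insert; the URL templates and the `fetch_image` helper are hypothetical placeholders, not a real map provider's API:

```python
def image_for_position(position):
    """Return an image for one of the position-input forms described above."""
    if position["type"] == "gps":
        # Latitude / longitude -> map image of the surrounding area
        url = f"https://maps.example.com/render?lat={position['lat']}&lng={position['lng']}"
    elif position["type"] == "place_name":
        # Place or facility name -> map of the area around it
        url = f"https://maps.example.com/search?q={position['name']}"
    elif position["type"] == "homepage":
        # Store / facility homepage -> its representative image
        url = position["url"] + "/representative-image"
    else:
        return None
    return fetch_image(url)

def fetch_image(url):
    # Stub: a real implementation would download the image over HTTP.
    return {"source": url}

print(image_for_position({"type": "place_name", "name": "AA Mart"}))
```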
  • steps S1008 and S1009 may be executed not only during a video call but also during a normal call.
  • the number of images to be inserted is not limited to one, and may be any number, and the number may or may not be limited.
  • the three processes of steps S1004 and S1005, steps S1006 and S1007, and steps S1008 and S1009 need not all be performed; at least one may be performed. Alternatively, at least two of these three processes may be combined and executed.
  • in the above description, the image (still image or moving image) captured when the out-camera is activated (switched to) during a video call is displayed in the talk room as information indicating the content of the call, together with (or instead of) the message indicating the content of the call; however, the present invention is not limited to this.
  • the image displayed in the talk room as an image showing the contents of the call is not limited to one captured by the out-camera, and may be one captured by the in-camera. As an example of an image captured by the in-camera, the face image of each user involved in the call may be displayed in the talk room.
  • the display of the image is not limited to the mode of displaying by inserting it between messages.
  • it may be displayed as a background image of a section displaying a message indicating the content of the call.
  • the display is not limited to the background image of the entire message, and may be configured to display only the period during which the conversation related to the acquired image is taking place.
  • the period of conversation related to the image can be identified by analyzing the text messages obtained by voice recognition processing of the contents of the call. An example of this will be described with reference to FIG.
  • the following describes an example of inputting information regarding the position during a call, and a specific example of the talk room display at that time.
  • FIG. 11 shows an example of a call and a display example of a talk room displayed after the call at that time.
  • FIG. 11 (a) shows a part of the call
  • FIG. 11 (b) shows an example of the situation following FIG. 11 (a).
  • FIG. 11C shows an example of displaying the talk room after a call.
  • in FIG. 11A, it is assumed that the user 10a of the terminal 20a makes a voice call or a video call to the user 10b of the terminal 20b about the place to visit.
  • the user 10b has taken a picture of a nearby facility as information on the place where he / she exists, as shown in FIG. 11B.
  • when the exchanges shown in FIGS. 11 (a) and 11 (b) are performed during a call, as an example, an image 1101 based on the position information of the terminal 20b acquired by the terminal 20b is inserted into the talk room, as shown in FIG. 11 (c).
  • the terminal 20 may display the image obtained by the shooting shown in FIG. 11B as it is in the talk room as an image relating to the position of the terminal 20b, or information related to the position may be extracted from the shot image by image recognition processing, and an image may then be acquired from the network based on that information and displayed.
  • the word "AA Mart" is extracted from the image captured by the user 10b using the terminal 20b, the wording is searched on the Internet, and the image obtained by the search ( As an example, not a limitation), it is displayed as shown in FIG. 11 (c).
  • the image 1101 is displayed following the utterance of the user 10b in FIG. 11B so as to match the timing at which the image was taken.
  • the image 1101 may be displayed as a background image of the talk room. Alternatively, it may be inserted at the beginning of a message indicating the content of the call, or at the end of the message.
  • FIG. 12A is a diagram showing a display example in which the acquired image is displayed as the background of the talk room based on the information regarding the position of the terminal.
  • FIG. 12B is a diagram showing a display example in a state in which the talk room shown in FIG. 12A is scrolled up and displayed.
  • the terminal 20 displays an image specified from the position information regarding the terminal specified during a call as a background image of the talk room.
  • the terminal 20 displays an image acquired during a call (not limited, but as examples, an image relating to the position of the terminal, an image input by the user during the call, an image taken by the user during the call, or an image related to the content of the call) as the background image of the talk room, and displays the messages indicating the content of the call superimposed on the background image.
  • as shown in FIG. 12A, by displaying an image specified from the position information about the terminal during the call as the background image of the messages, it is possible to make it easier for the user to recall the contents of the call together with the contents of the messages indicating what was said during the call.
  • the background image may or may not be displayed only during the section T1 in which the messages of the related topic are displayed. That is, as shown in FIG. 12B, in the section T2, an image based on the information regarding the position of the terminal acquired during the topic is displayed as the background image, and in the section T3, the background image is not displayed. By linking the display section of the topic messages related to the image with the display section in which the image acquired based on the information about the position of the terminal during the topic is displayed as the background image, it is possible to reproduce the sense of reality during the call and make it easier for the user to recall the contents of the call.
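  • a minimal sketch of identifying the message span related to an image (the sections T1 / T2 above), assuming keywords such as "AA Mart" have been extracted from the image; keyword matching stands in for the text analysis the description mentions:

```python
def related_span(messages, image_keywords):
    """Return (start, end) indices of the run of messages that mention
    any of the image's keywords; the background image is shown only
    while this span of messages is on screen."""
    hits = [i for i, m in enumerate(messages)
            if any(k in m["text"] for k in image_keywords)]
    if not hits:
        return None
    return hits[0], hits[-1]

msgs = [{"text": "Where are you?"},
        {"text": "In front of AA Mart."},
        {"text": "AA Mart? I'll head over."},
        {"text": "By the way, about tomorrow..."}]
print(related_span(msgs, ["AA Mart"]))  # -> (1, 2): the related-topic span
```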
  • FIG. 13A shows a part of the call
  • FIG. 13B shows an example of the situation following FIG. 13A.
  • FIG. 13 (c) shows a display example of a talk room displayed on the terminal 20 when the call shown in FIGS. 13 (a) and 13 (b) is made.
  • in FIGS. 13 (a) and 13 (b), the user 10a proposes, via a voice call or a video call, that the user 10b of the terminal 20b visit a certain place, whereas the user 10b asks for a description of the location.
  • the user 10a uses his / her own terminal 20a to input location information during a call.
  • the input of the location information may be, for example, an instruction input for acquiring location information if the user is at the destination store (or near the destination), location information of the destination known to the user (not limited, but as examples, latitude and longitude information or address information), or a web page containing information related to the destination.
  • based on the position information input in FIG. 13 (b), the terminal 20 inserts a map 1301 showing the location of the destination between messages and displays it.
  • the image related to the location information is not limited to the map 1301, and may be another image, for example, image information about the homepage of the destination, or address information thereof.
  • the terminal 20 not only displays messages based on the conversation between users during the call, but can also automatically collect and display images based on the location information input during the call. As a result, when a call is made through the talk room, the terminal 20 can provide more information indicating the content of the call.
  • in the above description, the voice recognition unit 213 executes voice recognition after the video call is completed, but this is not a limitation, and it may be executed during the call. Furthermore, when the user makes a call using the speakerphone of the terminal 20, the terminal 20 may perform voice recognition in real time and display, in the talk room during the call, the messages analyzed and converted in real time. In this way, even in the case of a video call, the terminal 20 can display the content of the conversation between the users as messages in the talk room. Further, a video call may be used to give lessons, specifically English conversation (language) lessons; in such a case, the terminal 20 may collect a more appropriate phrase in that language from a network or the like and display it together with the text message.
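  • a minimal sketch of the real-time path described above: audio chunks captured during a speakerphone call are fed to a recognizer and each finalized result is posted to the talk room immediately; `recognize_chunk` is a stub standing in for whatever speech-recognition engine the terminal actually uses:

```python
import queue

audio_chunks = queue.Queue()  # filled by the audio-capture side of the call

def recognize_chunk(chunk):
    # Stub: pretend recognition already produced text for this chunk.
    return chunk.get("text")

def realtime_transcription(post_message, stop_marker=None):
    """Consume audio chunks until the call ends, posting each result."""
    while True:
        chunk = audio_chunks.get()
        if chunk is stop_marker:   # the call has ended
            break
        text = recognize_chunk(chunk)
        if text:                   # display as soon as the result is final
            post_message(text)

audio_chunks.put({"text": "Hello, can you hear me?"})
audio_chunks.put(None)
realtime_transcription(print)
```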
  • the user of the terminal 20 makes a call with another user by a video call via the messaging application provided by the server 10.
  • the terminal 20 may display an image taken during a call including a video call, or an image taken by the terminal of the user of the other party, or information based on the image in the talk room.
  • the terminal 20 can make it easier for the user to recall the content of the conversation during a call.
  • the terminal 20 may display, in the talk room, an image taken during a call by the out-camera provided on the side opposite to the side where the display unit 24 of the terminal 20 is located, based on a camera switching instruction input by the user.
  • the terminal 20 may display the image input by the user in the talk room during the video call. At this time, the terminal 20 may display the image between messages indicating the contents of the call so as to match the timing at which the image is input, but the present invention is not limited to this. It may be displayed at the beginning of the message or at the end of the message.
  • the terminal 20 can easily remind the user of the contents of the call by viewing the image later.
  • the terminal 20 may display the image acquired during the call as the background image of the message indicating the content of the call.
  • it becomes easier for the user to recall the contents of the call by checking the contents of the messages (text) indicating the contents of the call together with the image seen, photographed, or acquired during the call as the background image.
  • in addition to images input by the user and images obtained by taking pictures during the call, the terminal 20 may acquire an image based on information on the position of the terminal 20 or information on the position of the terminal of the other party, and display the image in association with the messages.
  • by acquiring an image based on the position of the terminal 20 or the position of the other party's terminal during a call, the terminal 20 can remind the user of the content of the call, not limited, but as an example, by letting the user recognize from what kind of place the call was made or where the other party was.
  • the terminal 20 may display an image according to the contents of the call. After converting the contents of the call into text messages by voice recognition processing, the terminal 20 analyzes the contents of the call by morphological analysis, context analysis, and the like, and displays a highly relevant image based on the results obtained by the analysis.
  • not limited, but as an example, when a certain store is a topic in the contents of a call, the terminal 20 may display a picture of the store as an image in association with a message, or when a certain food is a topic, a photograph of the food may be displayed as an image in association with the message.
  • the terminal 20 can easily remind the user of the content of the call.
  • FIG. 14 is a flowchart showing an operation example of processing for realizing a display mode for allowing the user to easily recognize the contents of the call when the call is made on the talk room.
  • the terminal 20 may or may not execute the process shown in FIG. Further, although not shown, the terminal 20 may be configured to be able to select and set whether or not to execute the process shown in FIG. 14 according to the input from the user.
  • the process shown in FIG. 14 shows an example of the process after step S404 shown in FIG.
  • the voice recognition unit 213 executes voice recognition processing on the recorded voice (step S404).
  • the control unit 21 specifies the amount of text data obtained by the voice recognition unit 213 converted by voice recognition (step S1405).
  • not limited, but as an example, the control unit 21 may specify the number of characters of the text data or the data capacity of the text data as the amount of text.
  • the control unit 21 determines the display size of the call icon 611 based on the specified amount of text (step S1406). Specifically, the control unit 21 determines the display size so that the larger the amount of text, the larger the display size of the call icon 611. As an example and not a limitation, the control unit 21 may determine the display size by a function that takes a given amount of text as input and outputs a display size, or a table that associates ranges of text amount with display sizes may be stored in the storage unit 28 in advance and the display size determined according to the table.
  • in the above description, the display size of the call icon 611 is determined based on the amount of characters after the text conversion, but the length of the call time may be used instead of the amount of characters. That is, the longer the call time, the deeper the conversation is assumed to be, so the display size of the call icon 611 is increased; the shorter the call time, the simpler the conversation is assumed to be, so the display size of the call icon 611 is reduced.
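  • a minimal sketch of step S1406 showing both strategies mentioned above: a continuous function of the text amount, and a pre-stored table of ranges; the pixel values and thresholds are illustrative assumptions:

```python
def icon_size_by_function(char_count, base=48, scale=0.05, max_size=160):
    """More characters -> larger icon, capped at max_size (pixels)."""
    return min(max_size, int(base + scale * char_count))

SIZE_TABLE = [  # (upper bound of the character-count range, display size in px)
    (200, 48),
    (1000, 80),
    (float("inf"), 120),
]

def icon_size_by_table(char_count):
    """Look up the display size from a pre-stored range table."""
    for upper_bound, size in SIZE_TABLE:
        if char_count <= upper_bound:
            return size

print(icon_size_by_function(800))  # -> 88
print(icon_size_by_table(800))     # -> 80
```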
  • the control unit 21 executes context analysis on the text by using morphological analysis or the like (step S1407). This can be achieved by using existing text mining techniques. The control unit 21 then determines a heading that is presumed to be appropriate as the title of the call content from the analysis result (step S1408). Not limited, but as examples, words that frequently appear in the analyzed text data can be used for this heading, or words inferred to represent some schedule from the analysis result of the text data can be used. Further, the wording used in the heading may be based on the content spoken by the user of the terminal 20, may be based on the content spoken by the user of the other party, or may be based on both. In addition, when the conversation contains content related to a schedule, the terminal 20 may or may not start a schedule management application that manages schedules, different from the messaging application, and register the schedule on its calendar.
  • the control unit 21 causes the display processing unit 214 to display the call icon 611 in the talk room at the determined display size, and displays the determined heading in association with the call icon 611 (step S1409), and the process ends.
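  • a minimal sketch of step S1408's frequent-word strategy: pick the most frequent content words of the recognized text as a heading candidate; the stop-word list is an illustrative assumption, and a production system would use proper morphological analysis as the description says:

```python
from collections import Counter

STOP_WORDS = {"the", "a", "to", "is", "it", "on", "at", "we", "i", "you"}

def heading_from_text(text, top_n=2):
    """Return the top_n most frequent non-stop-words as a heading."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    counts = Counter(w for w in words if w and w not in STOP_WORDS)
    return " ".join(w for w, _ in counts.most_common(top_n))

transcript = "Drinking party on Saturday? Saturday works. Drinking party it is."
print(heading_from_text(transcript))  # -> "drinking party"
```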
  • FIG. 15 shows a display example in which the size of the call icon is changed and displayed according to the call volume (the text volume of the message of the call content).
  • FIG. 15A shows a display example of the call icon 1501 displayed when the call volume (the text volume of the message of the call content) is relatively small. Note that FIG. 15 shows a state in which the message is not expanded for the sake of readability.
  • FIG. 15 (b) shows an example of displaying the call icon when a call volume larger than the call volume of the call corresponding to the call icon 1501 is made with respect to FIG. 15 (a).
  • the call icon 1502 is displayed in a size larger than that of the call icon 1501 shown in FIG. 15A.
  • as shown in FIG. 15, by displaying the call icon according to the call volume (the amount of text of the messages of the call content), the user can recognize at a glance how much was talked about.
  • the method of displaying the call volume is not limited to the size of the call icon as described above.
  • the call volume may be expressed by the shade of color of the call icon.
  • in FIG. 16, the shades of color are represented by hatching.
  • FIG. 16A shows a call icon 1601 when the call volume is relatively small.
  • FIG. 16B shows a display example of a call icon 1602 for a call volume larger than that of the call corresponding to the call icon 1601 of FIG. 16A.
  • the call icon 1602 is displayed in a darker color to show that the call volume is larger. That is, the shade of the color of the call icon allows the user to recognize the call volume at a glance. In this way, the call volume can be expressed by the display mode of the call icon.
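  • a minimal sketch of expressing the call volume as the shade of the icon's color (more characters, darker shade); the RGB endpoints and the maximum character count are illustrative assumptions:

```python
def icon_color(char_count, max_chars=2000):
    """Interpolate from a light blue to a dark blue as call volume grows."""
    ratio = min(1.0, char_count / max_chars)
    r = int(200 * (1 - ratio))        # light (200,220,255) -> dark (0,40,120)
    g = int(220 - 180 * ratio)
    b = int(255 - 135 * ratio)
    return f"#{r:02x}{g:02x}{b:02x}"

print(icon_color(100))   # light shade for a short call
print(icon_color(1900))  # dark shade for a long call
```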
  • the image displayed as the call icon is not limited to a symbol indicating a call as shown in FIGS. 15 and 16, and may be an image related to the call, or a combination with the icon. That is, an image related to the call (an image relating to the position of the terminal during the call) may be displayed as a background image of the call symbol of the call icon shown in FIGS. 15 and 16. Specifically, as shown in FIG. 17, the terminal 20 may display, at the display position of the call icon, an image related to the call (not limited, but as examples of information related to the call, an image relating to the position of the terminal, an image the user input during the call, an image taken by the user during the call, or an image about the content of the call) as an alternative to the call icon.
  • as shown in the image 1701, the terminal 20 displays a part of the image related to the call acquired at the time of the call within the outer shape of the call icon.
  • the image related to the call may be displayed as it is, as shown in image 1702, regardless of the outer shape of the call icon.
  • the terminal 20 may also display a call icon 1801 so that it can be seen that the displayed image 1702 is related to the call, as shown in FIG. 18 (a).
  • in FIG. 18 (a), the call icon 1801 is superimposed on the image 1702, but it may be displayed outside the frame of the image 1702 as long as it can be understood that the call icon 1801 is associated with the image 1702.
  • the terminal 20 may enlarge and display the image 1702 as shown in FIG. 18B by detecting a touch input from the user on a portion of the image 1702 other than the call icon 1801. At this time, the call icon 1801 may or may not be displayed.
  • FIG. 18B shows an example in which the call icon 1801 is not displayed.
  • when an input on the call icon 1801 is detected, the call unit 212 of the terminal 20 may be configured to start a call with the user corresponding to the talk room.
  • FIG. 19 is a diagram showing a display example when a heading is added to the content of the call.
  • FIG. 19A shows an example in which the voice of a call is converted into a text message by voice recognition processing and displayed as a message on the talk room. As shown in the message of FIG. 19A, it can be understood that the users have promised a drinking party.
  • the control unit 21 of the terminal 20 performs morphological analysis and context analysis on the text of the messages and, as an example, identifies that a drinking party will be held on Saturday. The control unit 21 of the terminal 20 then displays the heading 1902 indicating the content of the call in association with the call icon 1901. In the example shown in FIG. 19, the heading 1902 with the content "Saturday drinking party" is displayed.
  • the terminal 20 can not only display a message indicating the content of the call, but also display a heading 1902 indicating the content of the call.
  • the user can recognize the contents of the call without reading all the messages indicating the contents of the call.
  • FIG. 19 shows an example of assigning a heading; in addition, to make the contents of the call easier to recognize, the terminal 20 may convert the contents of the call into text messages by voice recognition processing and then use analysis techniques such as morphological analysis and context analysis to recognize the content of the call and display a summarized sentence.
  • if the call is prolonged and the amount of text displayed as messages increases, it takes time and effort for the user to read the contents when everything is displayed. By summarizing the contents of the call, it is possible to make the user recognize the content of the call while simplifying the displayed messages.
  • the summary may or may not be displayed as if it were a conversation of any of the users involved in the call.
  • the schedule may or may not be included in the summary.
  • the terminal 20 may display image information indicating that a call has been made (not limited, but a call icon as an example) based on the amount of characters in the text data obtained by voice recognition of the call content or the length of the call time. Displaying the call icon based on the amount of characters or the call time means, for example, changing the display size of the call icon or changing the display color of the call icon depending on the amount of characters or the length of the call time.
  • the terminal 20 may display, in the talk room, an image related to the call (not limited, but as examples, an image relating to the position of the terminal, an image the user input during the call, an image taken by the user during the call, or an image related to the contents of the call). That is, instead of the call icon, an image related to the contents of the call may be displayed in the talk room as information indicating that the call has been made.
  • the image related to the actual call is displayed instead of the call icon as the image related to the call, so that the user can recognize the content of the call without looking at the message indicating the content of the call.
  • the terminal 20 may analyze the contents of the call, convert the analysis result into a summary sentence indicating the contents of the call, and display the summary.
  • a learning model may be generated by a learning process that uses the contents of calls between users and summaries of those contents as teacher data, and a summary may be created by inputting into the learning model the text data obtained by voice recognition processing.
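  • a minimal sketch of producing a summary sentence. The description envisions a learned model trained on (call text, summary) pairs; as a stand-in, the code below picks the sentence that shares the most vocabulary with the rest of the transcript, a simple extractive heuristic and not the learned method itself:

```python
import re

def _words(s):
    return set(re.findall(r"[a-z]+", s.lower()))

def extractive_summary(sentences):
    """Pick the sentence sharing the most vocabulary with the others."""
    def score(s):
        return sum(len(_words(s) & _words(o)) for o in sentences if o != s)
    return max(sentences, key=score)

calls = ["Shall we have a drinking party?",
         "A drinking party on Saturday sounds good.",
         "Saturday it is then."]
print(extractive_summary(calls))  # -> "A drinking party on Saturday sounds good."
```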
  • the terminal 20 can make the user recognize the content of the call with simple content.
  • the convenience of the messaging application can be improved, and the design of the display can be improved.
  • 1 Communication system
  • 10 Server
  • 11 Control unit
  • 111 Message processing unit
  • 12 Input / output unit
  • 13 Display unit
  • 14 Communication I/F (communication unit)
  • 20 Terminal
  • 21 Control unit
  • 211 Message processing unit
  • 212 Call unit
  • 213 Voice recognition unit
  • 214 Display processing unit
  • 22 Communication I/F
  • 23 Input / output unit
  • 231 Touch panel
  • 232 Microphone
  • 233 Speaker
  • 234 Camera
  • 24 Display unit (display)
  • 25 Location information acquisition unit
  • 28 Storage unit
  • 30 Network

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • User Interface Of Digital Computer (AREA)
  • Information Transfer Between Computers (AREA)
  • Telephone Function (AREA)
  • Telephonic Communication Services (AREA)

Abstract

This information processing method of a terminal, which transmits content to a first terminal or receives content that has been transmitted from the first terminal, includes: displaying, on a display region of the terminal, first content that has been transmitted from the first terminal and second content transmitted to the first terminal by a communication unit of the terminal; performing, by a control unit of the terminal, a control pertaining to a call with the first terminal on the basis of an input by a user of the terminal onto the display region on which the first content and the second content are displayed; acquiring, by the control unit, first information based on the voice of a user of the first terminal and second information based on the voice of the user of the terminal on the basis of the call with the first terminal; and displaying, on the display region, call information based on the first information and the second information.

Description

Information processing method, program, and terminal

The present disclosure relates to an information processing method of a terminal, a program, and a terminal.

In recent years, users have been exchanging messages by communication via messaging services. Among such messaging services, there are also services in which users can make a voice call or a video call with each other. Patent Document 1 discloses an example of such a system.

JP-A-2014-232502
According to a first aspect of the present invention, an information processing method of a terminal that transmits content to a first terminal or receives content transmitted from the first terminal includes: displaying, in a display area of the terminal, first content transmitted from the first terminal and second content transmitted to the first terminal by a communication unit of the terminal; performing, by a control unit of the terminal, control relating to a call with the first terminal based on an input by a user of the terminal to the display area in which the first content and the second content are displayed; acquiring, by the control unit, first information based on the voice of a user of the first terminal and second information based on the voice of the user of the terminal, based on the call with the first terminal; and displaying call information based on the first information and the second information in the display area.

According to a second aspect of the present invention, a program to be executed by a computer of a terminal that transmits content to a first terminal or receives content transmitted from the first terminal causes the computer to execute: displaying, in a display area of the terminal, first content transmitted from the first terminal and second content transmitted to the first terminal by a communication unit of the terminal; performing, by a control unit of the terminal, control relating to a call with the first terminal based on an input by the user of the terminal to the display area in which the first content and the second content are displayed; acquiring, by the control unit, first information based on the voice of a user of the first terminal and second information based on the voice of the user of the terminal, based on the call with the first terminal; and displaying call information based on the first information and the second information in the display area.

According to a third aspect of the present invention, a terminal that transmits content to a first terminal or receives content transmitted from the first terminal includes: a display unit that displays first content transmitted from the first terminal and second content transmitted to the first terminal by a communication unit of the terminal; and a control unit that performs control relating to a call with the first terminal based on an input by the user of the terminal to the display unit on which the first content and the second content are displayed. The control unit acquires first information based on the voice of a user of the first terminal and second information based on the voice of the user of the terminal, based on the call with the first terminal, and the display unit displays call information based on the first information and the second information.
FIG. 1 is a diagram showing the configuration of a communication system according to one aspect of the embodiment.
FIG. 2 shows one embodiment of the communication system: (a) shows a call between users via the messaging service, and (b) shows a display example of a talk room in the messaging service after the call.
FIG. 3 is a sequence diagram showing exchanges in the communication system.
FIG. 4 is a flowchart showing an example of a call and call-content display processing on the terminal.
FIG. 5 is a flowchart showing an operation example of switching the display / non-display of call content in the talk room by the terminal.
FIG. 6: (a) is a screen view showing an example of a talk room on the terminal before a call; (b) is a screen view showing an example of the talk room after the call.
FIG. 7: (a) is a screen view showing an example of a talk room in which the call content is not expanded; (b) is a screen view showing a display example of the talk room with the call content expanded.
FIG. 8: (a) is a diagram showing the user bringing a finger close to the call icon; (b) is a screen view showing an example in which the call icon is enlarged.
FIG. 9: (a) is a screen view showing an example of displaying a message indicating the content of a call in a pop-up; (b) is a screen view showing an example of transitioning to another screen to display the message indicating the content of the call.
FIG. 10 is a flowchart showing an operation example when a video call is executed on the terminal.
FIG. 11: (a) is a schematic diagram showing a part of a call; (b) is a schematic diagram showing an example of the situation following (a); (c) is a screen view showing a display example of the talk room after the call.
FIG. 12: (a) is a screen view showing an example of displaying an image relating to the position of the terminal as a background image; (b) is a screen view showing a display example in which the background image and the content of the call are linked.
FIG. 13: (a) is a schematic diagram showing a part of a call; (b) is a schematic diagram showing an example of the situation following (a); (c) is a screen view showing a display example of the talk room after the call.
FIG. 14 is a flowchart showing an operation example related to the display of the call icon on the terminal.
FIG. 15: (a) is a screen view showing a display example in which a relatively small call volume is expressed by the display size of the call icon; (b) is a screen view showing a display example of the call icon when the call volume is larger than in (a).
FIG. 16: (a) is a screen view showing a display example in which a relatively small call volume is expressed by the color of the call icon; (b) is a screen view showing a display example of the call icon when the call volume is larger than in (a).
FIG. 17: (a) and (b) are screen views showing examples of displaying an image related to the content of a call in the talk room as an alternative to the call icon.
FIG. 18: (a) is a screen view showing an example of displaying an image related to the content of a call in the talk room as an alternative to the call icon, together with the call icon; (b) is a screen view showing an example in which the image is enlarged as an alternative to the call icon.
FIG. 19: (a) is a screen view showing a display example of messages indicating the content of a call; (b) is a screen view showing an example of a heading (summary) displayed for the call content shown in (a).
<Compliance with legal matters>
It should be noted that the disclosure described herein is premised on compliance with the legal matters of the implementing country necessary for the implementation of this disclosure, such as the secrecy of communications.

An embodiment for implementing a display method and the like according to the present disclosure, with which the status of transmission or reception by a terminal can be confirmed, will be described with reference to the drawings.
<System configuration>
FIG. 1 shows the configuration of the communication system 1 according to an embodiment of the present disclosure. As disclosed in FIG. 1, in the communication system 1, the server 10 and the terminals 20 (terminal 20A, terminal 20B, terminal 20C) are connected via the network 30. The server 10 provides, via the network 30, a service for transmitting and receiving messages between the terminals 20 owned by users. The number of terminals 20 connected to the network 30 is not limited.
The network 30 plays the role of connecting one or more terminals 20 and one or more servers 10. That is, the network 30 means a communication network that provides a connection route so that data can be transmitted and received after the terminal 20 connects to the server 10.
One or more portions of the network 30 may or may not be a wired network or a wireless network. The network 30 can include, as examples and not limitations, an ad hoc network, an intranet, an extranet, a virtual private network (VPN), a local area network (LAN), a wireless LAN (WLAN), a wide area network (WAN), a wireless WAN (WWAN), a metropolitan area network (MAN), a part of the Internet, a part of the Public Switched Telephone Network (PSTN), a mobile phone network, ISDN (integrated service digital networks), LTE (long term evolution), CDMA (code division multiple access), Bluetooth (registered trademark), satellite communication, and the like, or a combination of two or more of these. The network 30 can include one or more networks 30.
The terminal 20 (terminal 20A, terminal 20B, terminal 20C) may be any information processing terminal that can realize the functions described in each embodiment. The terminal 20 includes, as examples and not limitations, a smartphone, a mobile phone (feature phone), a computer (as examples and not limitations, a desktop, a laptop, a tablet, and the like), a media computer platform (as examples and not limitations, a cable or satellite set-top box, a digital video recorder), a handheld computer device (as examples and not limitations, a PDA (personal digital assistant), an e-mail client, and the like), a wearable terminal (a glasses-type device, a watch-type device, and the like), another type of computer, or a communication platform. The terminal 20 may also be expressed as an information processing terminal.
Since the configurations of the terminal 20A, the terminal 20B, and the terminal 20C are basically the same, the following description refers to the terminal 20. Further, as necessary, a terminal used by a user X is expressed as a terminal 20X, and user information in a predetermined service associated with the user X or the terminal 20X is expressed as user information X. The user information is information on a user associated with an account used by the user in a predetermined service. The user information includes, as examples and not limitations, information associated with the user that is input by the user or given by the predetermined service, such as the user's name, the user's icon image, the user's age, the user's gender, the user's address, the user's hobbies and preferences, and the user's identifier, and may or may not be any one or a combination of these.
The server 10 has a function of providing a predetermined service to the terminal 20. The server 10 may be any information processing device that can realize the functions described in each embodiment. The server 10 includes, as examples and not limitations, a server device, a computer (as examples and not limitations, a desktop, a laptop, a tablet, and the like), a media computer platform (as examples and not limitations, a cable or satellite set-top box, a digital video recorder), a handheld computer device (as examples and not limitations, a PDA, an e-mail client, and the like), another type of computer, or a communication platform. The server 10 may also be expressed as an information processing device. When it is not necessary to distinguish between the server 10 and the terminal 20, each of them may or may not be expressed as an information processing device.
<Hardware (HW) configuration>
The HW configuration of each device included in the communication system 1 will be described with reference to FIG. 1.

(1) HW configuration of the terminal
The terminal 20 includes a control unit 21 (CPU: central processing unit), a storage unit 28, a communication I/F 22 (interface), an input / output unit 23, a display unit 24, and a position information acquisition unit 25. The components of the HW of the terminal 20 are, as an example and not a limitation, connected to each other via a bus B. It is not essential that the HW configuration of the terminal 20 include all of these components. As an example and not a limitation, the terminal 20 may or may not be configured such that individual components, or a plurality of components, such as the microphone 232, the camera 234, and the position information acquisition unit 25, can be removed.
The communication I/F 22 transmits and receives various data via the network 30. The communication may be executed by wire or wirelessly, and any communication protocol may be used as long as mutual communication can be executed. The communication I/F 22 has a function of executing communication with the server 10 via the network 30. The communication I/F 22 transmits various data to the server 10 according to instructions from the control unit 21. The communication I/F 22 also receives various data transmitted from the server 10 and conveys them to the control unit 21. The communication I/F 22 may be simply expressed as a communication unit. When the communication I/F 22 is composed of a physically structured circuit, it may be expressed as a communication circuit.
The input / output unit 23 includes devices for inputting various operations to the terminal 20 and devices for outputting the results of processing performed by the terminal 20. In the input / output unit 23, the input unit and the output unit may be integrated, or may or may not be separated into an input unit and an output unit.
The input unit is realized by any one or a combination of all types of devices that can receive an input from the user and convey information related to the input to the control unit 21. The input unit includes, as examples and not limitations, hardware keys such as a touch panel 231, a touch display, and a keyboard, a pointing device such as a mouse, a camera 234 (operation input via moving images), and a microphone 232 (operation input by voice).
The output unit is realized by any one or a combination of all types of devices that can output the results of processing performed by the control unit 21. The output unit includes, as examples and not limitations, a touch panel, a touch display, a speaker 233 (audio output), a lens (as examples and not limitations, 3D (three dimensions) output and hologram output), and a printer.
The display unit 24 is realized by any one or a combination of all types of devices that can display according to the display data written in the frame buffer. The display unit 24 includes, as examples and not limitations, a touch panel, a touch display, a monitor (as examples and not limitations, a liquid crystal display and an OELD (organic electroluminescence display)), a head mounted display (HMD: Head Mounted Display), projection mapping, a hologram, and a device that can display images, text information, and the like in the air (which may or may not be a vacuum). These display units 24 may or may not be able to display the display data in 3D.
When the input / output unit 23 is a touch panel, the input / output unit 23 and the display unit 24 may be arranged to face each other with substantially the same size and shape.
The control unit 21 has a circuit physically structured to execute functions realized by the codes or instructions included in a program, and is realized, as an example and not a limitation, by a data processing device built into hardware. Therefore, the control unit 21 may or may not be expressed as a control circuit.

The control unit 21 includes, as examples and not limitations, a central processing unit (CPU), a microprocessor, a processor core, a multiprocessor, an ASIC (application-specific integrated circuit), and an FPGA (field programmable gate array).
The storage unit 28 has a function of storing various programs and various data required for the terminal 20 to operate. The storage unit 28 includes, as examples and not limitations, various storage media such as an HDD (hard disk drive), an SSD (solid state drive), a flash memory, a RAM (random access memory), and a ROM (read only memory). The storage unit 28 may or may not be expressed as a memory.
The terminal 20 stores a program P in the storage unit 28, and by executing this program P, the control unit 21 executes the processing of each unit included in the control unit 21. That is, the program P stored in the storage unit 28 causes the terminal 20 to realize each function executed by the control unit 21. This program P may or may not be expressed as a program module.
The microphone 232 is used for inputting voice data. The speaker 233 is used for outputting voice data. The camera 234 is used for acquiring moving image data. Cameras 234 may be provided on both the side of the terminal 20 where the display unit 24 is provided and the side opposite to it, and may be called an in-camera and an out-camera, respectively. Switching between the in-camera and the out-camera is executed by an input from the user of the terminal 20.
(2) HW configuration of the server
The server 10 includes a control unit 11 (CPU), a storage unit 15, a communication I/F 14 (interface), an input / output unit 12, and a display unit 13. The components of the HW of the server 10 are, as an example and not a limitation, connected to each other via a bus B. It is not essential that the HW configuration of the server 10 include all of these components. As an example and not a limitation, the HW of the server 10 may or may not be configured such that the display unit 13 can be removed.
The control unit 11 has a circuit physically structured to execute functions realized by the codes or instructions included in a program, and is realized, as an example and not a limitation, by a data processing device built into hardware.

The control unit 11 is typically a central processing unit (CPU), and may or may not additionally be a microprocessor, a processor core, a multiprocessor, an ASIC, or an FPGA. In the present disclosure, the control unit 11 is not limited to these.
The storage unit 15 has a function of storing various programs and various data required for the server 10 to operate. The storage unit 15 is realized by various storage media such as an HDD, an SSD, and a flash memory. However, in the present disclosure, the storage unit 15 is not limited to these. The storage unit 15 may or may not be expressed as a memory.
The communication I/F 14 transmits and receives various data via the network 30. The communication may be executed by wire or wirelessly, and any communication protocol may be used as long as mutual communication can be executed. The communication I/F 14 has a function of executing communication with the terminal 20 via the network 30. The communication I/F 14 transmits various data to the terminal 20 according to instructions from the control unit 11. The communication I/F 14 also receives various data transmitted from the terminal 20 and conveys them to the control unit 11. The communication I/F 14 may be simply expressed as a communication unit. When the communication I/F 14 is composed of a physically structured circuit, it may be expressed as a communication circuit.
The input/output unit 12 is realized by a device that inputs various operations to the server 10. The input/output unit 12 is realized by any one, or a combination, of all kinds of devices that can receive input from a user and pass information relating to that input to the control unit 11. The input/output unit 12 is typically realized by hardware keys, represented by a keyboard, or by a pointing device such as a mouse. As an example and not a limitation, the input/output unit 12 may or may not include a touch panel, a camera (operation input via moving images), and a microphone (operation input by voice). However, in the present disclosure, the input/output unit 12 is not limited to these.
The display unit 13 is typically realized by a monitor (as an example and not a limitation, a liquid crystal display or an OELD (organic electroluminescence display)). The display unit 13 may or may not be a head-mounted display (HMD) or the like, and these display units 13 may or may not be capable of displaying display data in 3D. However, in the present disclosure, the display unit 13 is not limited to these.
The server 10 stores a program P in the storage unit 15, and by executing this program P, the control unit 11 executes processing as each of the units included in the control unit 11. That is, the program P stored in the storage unit 15 causes the server 10 to realize each of the functions executed by the control unit 11. This program P may or may not be referred to as a program module.
In each embodiment of the present disclosure, the functions are described as being realized by the CPU of the terminal 20 and/or of the server 10 executing the program P.
The control unit 21 of the terminal 20 and/or the control unit 11 of the server 10 may or may not realize each process not only by a CPU having a control circuit but also by a logic circuit (hardware) or a dedicated circuit formed in an integrated circuit (an IC (Integrated Circuit) chip, an LSI (Large Scale Integration)) or the like. These circuits may be realized by one or more integrated circuits, and the plurality of processes shown in each embodiment may or may not be realized by a single integrated circuit. An LSI may also be called a VLSI, a super LSI, an ultra LSI, or the like depending on the degree of integration. For this reason, the control unit 21 may or may not be referred to as a control circuit.
The program P of each embodiment of the present disclosure (as examples and not limitations, a software program, a computer program, or a program module) may or may not be provided in a state of being stored in a computer-readable storage medium. The storage medium can store the program P in a "non-transitory tangible medium". The program P may or may not be one for realizing some of the functions of each embodiment of the present disclosure. Furthermore, it may or may not be a so-called difference file (difference program) that realizes the functions of each embodiment of the present disclosure in combination with a program P already recorded on the storage medium.
The storage medium can include one or more semiconductor-based or other integrated circuits (ICs) (as examples and not limitations, field-programmable gate arrays (FPGAs) or application-specific ICs (ASICs)), hard disk drives (HDDs), hybrid hard drives (HHDs), optical discs, optical disc drives (ODDs), magneto-optical discs, magneto-optical drives, floppy diskettes, floppy disk drives (FDDs), magnetic tapes, solid-state drives (SSDs), RAM drives, secure digital cards or drives, any other suitable storage media, or a suitable combination of two or more of these. The storage medium may, where appropriate, be volatile, non-volatile, or a combination of volatile and non-volatile. The storage medium is not limited to these examples, and may be any device or medium as long as it can store the program P. The storage medium may or may not be referred to as a memory.
The server 10 and/or the terminal 20 can realize the functions of the plurality of functional units shown in each embodiment by reading the program P stored in the storage medium and executing the read program P.
The program P of the present disclosure may or may not be provided to the server 10 and/or the terminal 20 via any transmission medium capable of transmitting the program (a communication network, a broadcast wave, or the like). As an example and not a limitation, the server 10 and/or the terminal 20 realizes the functions of the plurality of functional units shown in each embodiment by executing a program P downloaded via the Internet or the like.
Each embodiment of the present disclosure may also be realized in the form of a data signal embedded in a carrier wave, in which the program P is embodied by electronic transmission.
At least part of the processing in the server 10 and/or the terminal 20 may or may not be realized by cloud computing constituted by one or more computers.
At least part of the processing in the terminal 20 may or may not be performed by the server 10. In this case, at least part of the processing of each functional unit of the control unit 21 of the terminal 20 may or may not be performed by the server 10.
At least part of the processing in the server 10 may or may not be performed by the terminal 20. In this case, at least part of the processing of each functional unit of the control unit 11 of the server 10 may or may not be performed by the terminal 20.
Unless explicitly mentioned, the determination configurations in the embodiments of the present disclosure are not essential; a predetermined process may or may not be performed when a determination condition is satisfied, or when it is not satisfied.
As examples and not limitations, the programs of the present disclosure are implemented using a script language such as ActionScript or JavaScript (registered trademark), an object-oriented programming language such as Objective-C or Java (registered trademark), a markup language such as HTML5, or the like.
<Functional configuration>
<Embodiment 1>
<Overview>
In the communication system 1 according to the present embodiment, terminals 20 can exchange messages with one another in a talk room, via the server 10 and a messaging application. A talk room is a place, in the messaging service provided by the server 10, where users of the messaging service exchange content with one another. The content exchanged in a talk room includes, but is not limited to, character information input by a user using his or her own terminal 20, image information including photographs and stamps, and various kinds of file information such as audio files, video files, and data files.
In the communication system 1, users of terminals 20 can further execute a call with each other via a talk room. In the communication system 1, users 10a and 10b make a call as shown in FIG. 2(a). After the call ends, image information indicating that the users made a call (hereinafter referred to as a call icon; the image indicating that a call was made is not limited to an icon, and the image information is one example, not a limitation, of information related to the call) is displayed in the talk room. In the present embodiment, as shown in FIG. 2(b), the terminal further displays, as text, a message indicating the content of the call (one example, not a limitation, of call information). FIG. 2(b) is a diagram showing an example of the display screen of the terminal 20b of the user 10b. A detailed description follows.
(1) Functional configuration of the terminal
As shown in FIG. 1, the terminal 20 includes, as functions realized by the control unit 21, a message processing unit 211, a call unit 212, a voice recognition unit 213, and a display processing unit 214.
The message processing unit 211 receives, in accordance with the messaging application provided by the messaging service of the server 10, input from the user and/or input of content including messages received by the communication I/F 22, and instructs the display processing unit 214 to display it. When input from the user is received, the message processing unit 211 instructs the communication I/F 22 to transmit the received input content to the server 10. The targets processed by the message processing unit 211 are not limited to text messages input by the user into the talk room, and may include image information including photographs and stamps, audio files, video files, data files, and the like.
The message processing unit 211 may or may not determine the display size of the call icon according to the amount of text in the text data generated by the voice recognition unit 213 through voice recognition, and instruct the display processing unit 214 to display a call icon sized accordingly. By displaying the call icon at a size corresponding to the amount of text, the user can later estimate the volume of the call from the size of the icon. Estimating the call volume and checking the date and time of the call makes it easier for the user to recall the content of the call at that time. Instead of the size of the call icon, the volume of the call may be expressed by a change of color according to the amount of text.
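As a rough illustration of this sizing rule, the following Python sketch maps the length of a recognized transcript to an icon size and tint. The thresholds, pixel sizes, and colors are illustrative assumptions, not values taken from this disclosure.

```python
# Minimal sketch: scale (and optionally tint) the call icon with the
# amount of recognized text. All concrete numbers here are assumptions.

def call_icon_style(transcript: str) -> dict:
    """Pick a display size (px) and tint for the call icon based on how
    much text the speech recognition produced."""
    n = len(transcript)
    if n < 200:      # short call
        return {"size_px": 40, "tint": "#8bc34a"}
    elif n < 1000:   # medium call
        return {"size_px": 56, "tint": "#ffc107"}
    else:            # long call
        return {"size_px": 72, "tint": "#f44336"}

print(call_icon_style("Hello, are we still on for tomorrow?"))
# {'size_px': 40, 'tint': '#8bc34a'}
```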
The call unit 212 has a function of executing, on the messaging service and via the server 10, a call with another user of the messaging service. The call unit 212 has a function of placing a call to a designated party when it receives a call input on the messaging service from the user of the terminal 20, and a function of accepting (receiving) a call placed by another user of the messaging service. As one example and not a limitation, the call unit 212 executes the call by a function known as VoIP (Voice over Internet Protocol). The call unit 212 may or may not record the content of the call in the storage unit 28 during the call. The call unit 212 may also have a video call function. That is, during a video call, the call unit 212 transmits the sound collected by the microphone 232 and the video captured by the camera 234 to the server 10 via the communication I/F 22, receives via the communication I/F 22 the audio signal and the video signal transmitted from the other party via the server 10, outputs the sound based on the audio signal from the speaker 233, and instructs the display processing unit 214 to display the video based on the video signal on the display unit 24. The call unit 212 may or may not start a call (place a call) based on an input to image information displayed in the talk room indicating that a call was made (the call icon, or an image for calling that is separate from the call icon). That is, by making a predetermined input to the call icon indicating that a call was made in the talk room, call-origination processing may or may not be executed so as to call the user corresponding to that talk room. The call may also be a call made via a speaker having an AI assistant function, such as a smart speaker held by the user of the terminal 20. In that case, the call with the other terminal is made through the smart speaker; the voice collected by the smart speaker is transmitted directly to the server 10, and the server 10 transmits it to the terminal of the other party.
In this case, the smart speaker itself may perform the voice recognition processing and transmit a text message to the server 10, and the server 10 may transmit a text message indicating the content of the call to the talk room of the terminal 20 of the user associated with the smart speaker, so that the display unit 24 of the terminal 20 displays a message indicating the content of the call in the talk room. Alternatively, the smart speaker may only transmit the voice to the server 10, and the server 10 may perform the voice recognition processing and transmit a text message indicating the content of the call to the terminal 20 of the user corresponding to the smart speaker, so that the display unit 24 of the terminal 20 displays a message indicating the content of the call in the talk room. As another technique using a smart speaker, the communication I/F 22 of the terminal 20 may first receive the user's voice from the smart speaker, and the call unit 212 may receive the voice collected by the smart speaker and transmit that voice to the server 10 via the communication I/F 22.
The voice recognition unit 213 has a function of recognizing the voice of the call being executed by the call unit 212 and converting it into text data. The voice recognition by the voice recognition unit 213 may be executed on the recorded data of the call recorded in the storage unit 28 by the call unit 212. The voice recognition unit 213 may or may not record the text data obtained by voice recognition in the storage unit 28. The voice recognition unit 213 transmits the text data obtained by voice recognition to the message processing unit 211; it divides the text data by speaker along the time series, associates information indicating the speaker with each divided piece of text data, and transmits them to the message processing unit 211. To identify a speaker from the content of the voice, a feature amount of the voice in the conversation (as one example and not a limitation, its frequency spectrum) is extracted, whereby each piece of the conversation can be classified and the speaker can be identified.
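The following is a minimal Python sketch of this kind of spectrum-based speaker grouping: each utterance is reduced to a normalized magnitude spectrum and assigned to the nearest existing speaker centroid, with a new speaker opened when nothing is close enough. The feature, distance measure, and threshold are simplifying assumptions, not the method actually claimed.

```python
# Illustrative sketch of grouping utterances by speaker from a frequency-
# spectrum feature. Threshold and feature choice are assumptions.
import numpy as np

def spectral_feature(samples: np.ndarray, sr: int = 16000) -> np.ndarray:
    """Normalized magnitude spectrum of one utterance."""
    spec = np.abs(np.fft.rfft(samples, n=sr))
    return spec / (np.linalg.norm(spec) + 1e-9)

def assign_speakers(utterances, threshold=0.35):
    """Assign each utterance (an array of samples) a speaker id by nearest
    spectral centroid; open a new speaker when no centroid is near."""
    centroids, labels = [], []
    for u in utterances:
        f = spectral_feature(u)
        dists = [np.linalg.norm(f - c) for c in centroids]
        if dists and min(dists) < threshold:
            labels.append(int(np.argmin(dists)))
        else:
            centroids.append(f)
            labels.append(len(centroids) - 1)
    return labels
```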
The display processing unit 214 displays, in accordance with the messaging application provided by the messaging service of the server 10, content including input from the user and/or messages received by the communication I/F 22. The content transmitted by the terminal 20 (one example, not a limitation, of second content) and the content transmitted by terminals held by users other than the user of the terminal 20 (one example, not a limitation, of first content) may be displayed in different display modes (as one example and not a limitation, content transmitted by other users is displayed on the left side of the display area of the display unit 24 and content transmitted by the user of the terminal 20 on the right side, or the background color of the content is changed for each user). Displaying content transmitted by another user on the left side of the display area of the display unit 24 means displaying the content aligned to the left side of the display area. That is, as shown in the talk room display example of FIG. 2(b), the left edge of a message corresponding to speech uttered by another user is displayed aligned to the left side of the display area. Similarly, displaying content transmitted by the user of the terminal 20 on the right side of the display area of the display unit 24 means displaying the right edge of the content (message) aligned to the right side of the display area. That is, as shown in the talk room display example of FIG. 2(b), the right edge of a message corresponding to an utterance of the user of the terminal 20 is displayed aligned to the right side of the display area of the terminal 20. For the text messages based on the voice recognized by the voice recognition unit 213, the display processing unit 214 displays a message indicating the call content uttered by the user of the terminal 20 (one example, not a limitation, of second information) in the display area in association with the user of the terminal 20, and displays a message indicating the call content uttered by the user at the other end (one example, not a limitation, of first information) in the display area in association with that user.
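A minimal sketch of this alignment rule follows; the data shapes and identifiers are hypothetical.

```python
# Sketch: choose the display side for each transcript message, mirroring
# the talk-room convention above (own messages right, the other party's
# messages left). Rendering details are assumptions.

def message_alignment(sender_id: str, own_user_id: str) -> str:
    return "right" if sender_id == own_user_id else "left"

transcript = [("user_b", "Hello?"), ("user_a", "Hi, about tomorrow...")]
for sender, text in transcript:
    side = message_alignment(sender, own_user_id="user_a")
    print(f"[{side:>5}] {sender}: {text}")
```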
(2) Functional configuration of the server
As shown in FIG. 1, the server 10 includes a message processing unit 111 as a function realized by the control unit 11.
The message processing unit 111 has a function of managing the talk rooms in which users exchange messages with one another. The message processing unit 111 relays the exchange of content between the terminals that receive the messaging service provided by the server 10. That is, when content is transmitted from a certain user to a talk room, the message processing unit 111 identifies that talk room and transmits the content to the other users belonging to it.
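A minimal sketch of this relay, assuming a simple in-memory mapping of talk rooms to members and an abstract send callback (both hypothetical):

```python
# Sketch of the relay in message processing unit 111: look up the talk
# room a content item was posted to and forward it to the other members.

talk_rooms = {"room1": {"user_a", "user_b"}}  # room id -> member ids

def relay(room_id: str, sender_id: str, content: bytes, send):
    """Forward `content` to every room member except the sender.
    `send(user_id, content)` is an assumed transport callback."""
    for member in talk_rooms[room_id] - {sender_id}:
        send(member, content)

relay("room1", "user_a", b"hello", send=lambda u, c: print(u, c))
```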
<Operation>
FIG. 3 is a sequence diagram showing an example of the exchanges among the devices in the communication system 1 according to the present embodiment. The sequence diagram shown in FIG. 3 illustrates the exchanges when users make a call on the messaging application.
As shown in FIG. 3, the terminal 20a first designates a call partner on the messaging application in accordance with input from the user and places a call (step S301). That is, the terminal 20a transmits to the server 10 a call request including information on the call partner.
Upon receiving the call request from the terminal 20a, the server 10 identifies the user at the other end (terminal 20b) from the call-partner information included in the call request, and transmits a call signal to the identified user (terminal 20b) (step S302).
The terminal 20b receives the call signal transmitted from the server 10. That is, the terminal 20b receives, on the messaging application, the call request from the user of the terminal 20a (step S303). The terminals 20a and 20b then make a call on the messaging application via the server 10 (step S304). Here, the content of the call may or may not be recorded. The user of the terminal 20a and the user of the terminal 20b then each make an input on their respective terminals to end the call, and the call ends (step S305).
After the call ends, the terminal 20b performs voice recognition on the content of the call and converts it into text information (step S306). When the content of the call is recorded in step S304, the voice recognition processing can be executed after the call ends; when it is not recorded, voice recognition processing is executed in real time from immediately after the start of the call. The terminal 20b stores the message (text message) obtained by voice recognition (step S307). The message obtained by voice recognition may be transmitted not only to the terminal 20b but also to the server 10 and the terminal 20a and stored there, or it may be stored only in the server 10 rather than in the terminal 20b. As long as the text message data obtained by voice recognition is stored in some device involved in the communication system, display in the talk room can be realized.
When the terminal 20b has executed the voice recognition processing on the call content, it displays the recognized text data as messages in the display area of the display unit 24 of the terminal 20 by means of the messaging application (step S308).
Although not shown in FIG. 3, the terminal 20a may or may not also execute the processing of steps S306 to S308, that is, the processing of executing voice recognition on the content of the call and displaying the recognized text data. Since the call is made via the server 10, the voice recognition processing may instead be executed by the server 10; in that case, the text data indicating the content of the call obtained by the server 10 through voice recognition is transmitted to each user (terminal 20) involved in the call and displayed in the talk room of each terminal. By automatically converting the content of a call into text data and displaying it in the talk room in this way, the user can reliably recognize the content of the call even when he or she later wants to recall it.
FIG. 4 is a flowchart showing an operation example of the terminal 20 for realizing the processing of the sequence diagram shown in FIG. 3.
The control unit 21 of the terminal 20 detects whether a call has been started on the messaging application (step S401). The call unit 212 can detect this according to whether, on the messaging application, there has been a response to a call placed from the terminal 20 in accordance with input from the user, or there has been an input to accept an incoming call from another terminal.
While the call unit 212 is on the call, the control unit 21 of the terminal 20 records the voice of the call and stores the recorded voice data in the storage unit 28 (step S402).
The control unit 21 of the terminal 20 determines whether the call has ended based on whether there has been a call-end input from the user via the input/output unit 23 (step S403). If the call has not ended (NO in step S403), the control unit 21 waits until the call ends.
If it is determined that the call has ended (YES in step S403), the control unit 21 ends the recording. The voice recognition unit 213 executes voice recognition processing on the recorded voice data, and the text messages obtained by voice recognition are stored in the storage unit 28 (step S404). That is, the voice recognition unit 213 converts the recorded voice data into text data indicating the content of the call.
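A minimal sketch of this step follows; `stt.transcribe` stands in for whatever recognizer the terminal actually uses (a hypothetical interface, not a named library API).

```python
# Sketch of step S404 only: turn the recorded call audio into text and
# keep it alongside other stored data.

class VoiceRecognitionUnit:
    def __init__(self, stt):
        self.stt = stt  # assumed engine exposing transcribe(audio) -> str

    def recognize_recording(self, recorded_audio: bytes, storage: dict) -> str:
        text = self.stt.transcribe(recorded_audio)  # audio -> call transcript
        storage["call_transcript"] = text           # analogous to storage unit 28
        return text
```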
The text message obtained by voice recognition may or may not be transmitted to the server 10. Furthermore, when the server 10 receives the text message, the message may or may not be transmitted to the terminal of the other party. By transmitting the text message obtained by the terminal 20 through voice recognition to the server 10 or to the terminal of the other party, a message indicating the content of the call can also be displayed as text on the other party's terminal, so that the other party, too, can later check the content of the call by looking at the messages. The other party's terminal may or may not use the received text message to display it in the talk room in the same way as the terminal 20.
The voice recognition unit 213 divides the text data obtained by voice recognition by speaker, in chronological order (step S405). At this time, the voice recognition unit 213 may or may not further divide text data spoken by the same speaker according to a predetermined criterion; as one example and not a limitation, it may or may not divide the data sentence by sentence. The voice recognition unit 213 passes the divided text data to the display processing unit 214.
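The chronological, per-speaker, per-sentence division might look like the following sketch; the input shape (speaker, start time, text) is an assumption about what the recognizer hands over.

```python
# Sketch of step S405: order recognized utterances by time and split each
# one at sentence boundaries, keeping the speaker attached to each chunk.
import re

def segment(utterances):
    """utterances: list of (speaker_id, start_time, text) tuples."""
    chunks = []
    for speaker, t, text in sorted(utterances, key=lambda u: u[1]):
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            if sentence:
                chunks.append((speaker, t, sentence))
    return chunks
```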
The display processing unit 214 then displays each piece of text data divided by the voice recognition unit 213 on the display unit 24 as a message in the talk room, in association with the corresponding speaker (step S406). That is, the control unit 21 of the terminal 20 displays the text message obtained by recognizing the voice of the user holding the terminal 20 (an example, not a limitation, of second information) in association with the user of the terminal 20, and displays the text message obtained by recognizing the voice of the other party (an example, not a limitation, of first information) in association with the other party.
The control unit 21 determines, via the input/output unit 23, whether there has been an input from the user to terminate the messaging application (step S407). If there is no termination input (NO in step S407), the processing returns to step S401. If there is a termination input (YES in step S407), the processing ends. As described above, according to the terminal 20 of the present embodiment, when a call is executed on the messaging application as shown in FIG. 2(a), the content of the call can be automatically converted into text and displayed as messages as shown in FIG. 2(b). This can later help the user recall the content of the conversation held during the call.
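Putting the steps of FIG. 4 together, a skeleton of the loop could look like the sketch below; every object passed in (call unit, recorder, recognizer, display) is an assumed interface, and `segment` is the helper sketched above.

```python
# End-to-end sketch of the flow of Fig. 4 (S401-S407) under assumed
# interfaces; not a definitive implementation.

def run_messaging_app(call_unit, recorder, recognizer, display):
    while True:
        if call_unit.wait_for_call_start():            # S401
            audio = recorder.record_until_call_end()   # S402-S403
            text = recognizer.transcribe(audio)        # S404
            utterances = recognizer.diarize(audio, text)
            for speaker, _, sentence in segment(utterances):  # S405
                display.show_message(speaker, sentence)       # S406
        if display.user_requested_exit():              # S407
            break
```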
FIG. 5 is a flowchart showing an operation example of the processing for displaying the messages indicating the content of a call on the terminal 20. The terminal 20 may or may not have a function of switching between showing and hiding the messages of the call content when users have made a call in the talk room. FIG. 5 shows an operation example of the terminal 20 when the display of the messages can be toggled, for the case where a talk room is displayed on the display unit 24 of the terminal 20 and a call has been made on the messaging application in the past. The processing shown in FIG. 5 takes place while the user is running the messaging application on the terminal 20 and displaying the talk room.
A talk room is displayed on the display unit 24 of the terminal 20, and when a call has been made on the messaging application in the past, image information (a call icon) indicating that a call was made is displayed in the talk room. The control unit 21 of the terminal 20 determines whether an input to the call icon displayed in the talk room (as one example and not a limitation, a touch input) has been made on the input/output unit 23 (step S501).
When there is a touch input on the call icon (YES in step S501), the control unit 21 determines whether the content of the messages corresponding to the call icon has already been expanded (step S502). The messages being expanded is synonymous with the messages indicating the content of the call being displayed.
If the call messages have already been expanded (YES in step S502), the display processing unit 214 hides the displayed call messages (step S503). If the call messages have not been expanded (NO in step S502), the display processing unit 214 displays the content of the call messages on the display unit 24 (step S504), and the processing ends. Whether the terminal 20 displays the messages in the talk room in the expanded or the collapsed state at the end of a call is arbitrary, and may be determined by a setting the user has made on the terminal 20. When displaying the messages of the call content, all of the text messages obtained by voice recognition of the call content may be displayed, or only a partial excerpt may be displayed. When a partial excerpt is displayed, the text messages may be analyzed so as to display those text messages showing the content inferred to be important in the call.
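A minimal sketch of this toggle (steps S501 to S504), with the expansion state kept on a hypothetical icon object:

```python
# Sketch of Fig. 5: a tap on the call icon toggles the transcript between
# expanded and collapsed. State handling is an assumption.

class CallIcon:
    def __init__(self, transcript):
        self.transcript = transcript
        self.expanded = False  # initial state may follow a user setting

    def on_tap(self):
        if self.expanded:        # S502 -> S503: hide the messages
            self.expanded = False
            return []
        self.expanded = True     # S502 -> S504: show the messages
        return self.transcript

icon = CallIcon(["A: Hello?", "B: Hi, about tomorrow..."])
print(icon.on_tap())  # expands:  ['A: Hello?', 'B: Hi, about tomorrow...']
print(icon.on_tap())  # collapses: []
```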
FIG. 6 is a diagram showing an example of how the display of the talk room changes before and after a call when a call is made in the talk room on the terminal 20 shown in FIG. 4. FIG. 6(a) shows a display example of the talk room before the call, and FIG. 6(b) shows a display example of the talk room after the call.
FIG. 6(a) shows a display example of a certain talk room of the user of the terminal 20, in the state where a message 601 transmitted at 22:11 is displayed. Suppose that, in this state, the user of the terminal 20 makes a call with another user associated with the talk room. The content of this call is recorded and converted into text messages by voice recognition processing, and the text messages are displayed in association with each user involved in the call. That is, as shown in FIG. 6(b), the terminal 20 displays, following the message 601, a call icon 611 in the talk room indicating that a call was made. Date and time information 612 of the call (which may be the start date and time or the end date and time of the call) may or may not be displayed in association with the call icon 611. Following the call icon 611, the terminal 20 then displays the call content, as messages obtained by converting the content of the call into text by voice recognition, in the part enclosed by the dotted line 613. In this way, the terminal 20 can leave information indicating the content of the call in the talk room in the form of messages.
FIG. 7 is a diagram showing a display example when the processing in the terminal 20 shown in FIG. 5 is performed. FIG. 7(a) is a screen diagram showing the state where the messages indicating the content of the call are not displayed, and FIG. 7(b) is a screen diagram showing the state where the messages indicating the content of the call are expanded and displayed.
As shown in FIG. 7(a), the talk room of the messaging application is displayed on the display unit 24 of the terminal 20, and a call icon 611 indicating that a call was made is displayed in the talk room. When the user wants to know the content of the call, the user, as shown in FIG. 7(a), makes a touch input on the call icon 611 with a finger, a stylus, or the like, that is, instructs the expansion of the messages of the call content.
As shown in FIG. 7(a), when a touch input on the call icon 611 is detected in the state where the call messages are not expanded (displayed), the terminal 20 expands the messages indicating the content of the corresponding call, that is, displays them on the display unit 24 as shown in FIG. 7(b). FIG. 7(b) shows an example in which the content of the call is displayed in message form below the call icon 611.
When a touch input on the call icon 611 is detected in the state where the messages indicating the content of the call are displayed, as in the display mode shown in FIG. 7(b), the display processing unit 214 of the terminal 20 can change from the display mode shown in FIG. 7(b) to the display mode shown in FIG. 7(a). The first display mode after a call may be the display mode shown in FIG. 6(b) or the display mode shown in FIG. 7(a). Which display mode serves as the initial one may be configurable by the user of the terminal 20 in the messaging application, and the terminal 20 may display either the display mode shown in FIG. 6(b) or the display mode shown in FIG. 7(a) in accordance with the setting made by the user.
As shown in FIGS. 6 and 7, by having the terminal display the messages indicating the content of a call from the call icon 611, the terminal 20 can remind the user of a conversation the user wants to remember. Although an example of expanding the messages is shown here, the display method of the messages indicating the content of the call is not limited to expansion; as one example and not a limitation, the messages may pop up while the user is touching the vicinity of the call icon 611, or the display may transition to a screen separate from the talk room. An image of a user involved in the call may be displayed as the call icon 611; in that case, it may be displayed in place of the call icon 611 or together with it. As examples and not limitations, a photograph of the user's face, the profile image the user uses on the messaging application, or a photograph of the user's face captured with the in-camera during the call (or a processed version thereof) can be used as the user's image, but the image is not limited to these.
FIG. 8 is a diagram showing one display mode of the call icon 611. FIG. 8(a) shows an example in which the user is bringing a finger close to the call icon 611, and FIG. 8(b) shows an example in which the user's finger has come within a certain distance of the call icon 611. FIG. 8 shows the state where the messages indicating the content of the call are not expanded.
As indicated by the arrow 801 in FIG. 8, suppose the user brings a finger close to the call icon 611a. At this time, the touch panel 231 of the terminal detects the state in which the user's finger is in contact with the touch panel 231, or within a certain proximity of it, and detects the operation position. The control unit 21 of the terminal 20 then determines whether the coordinates on the touch panel 231 indicated by the detected operation position are approaching the display coordinates of the call icon 611a. When it is determined that the user's finger is approaching the call icon 611a, the control unit 21 of the terminal 20 may or may not display the call icon 611b enlarged, as shown in FIG. 8(b). Enlarging the call icon 611b makes it easier for the user to touch it. By touching the enlarged call icon 611b, the user can then perform the operation of toggling the expansion of the messages, as shown in FIG. 7.
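A minimal sketch of this proximity-based enlargement follows; the radius and scale factor are illustrative assumptions.

```python
# Sketch of the Fig. 8 behaviour: enlarge the call icon while a detected
# touch (or hover) position is within some radius of it.
import math

def icon_scale(touch_xy, icon_xy, near_radius=80.0, enlarged=1.5):
    """Return the display scale for the call icon given the current
    touch-panel coordinates (None when no finger is detected)."""
    if touch_xy is None:
        return 1.0
    return enlarged if math.dist(touch_xy, icon_xy) <= near_radius else 1.0

print(icon_scale((110, 200), icon_xy=(100, 210)))  # 1.5 (finger is near)
print(icon_scale((400, 50), icon_xy=(100, 210)))   # 1.0
```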
Although FIGS. 6 and 7 show examples in which the messages are expanded below the call icon 611, the method of displaying the messages indicating the content of the call is not limited to these examples. As one example and not a limitation, the terminal 20 may be configured to display the content of the messages indicating the call content as a pop-up message 901, as shown in FIG. 9(a). Alternatively, the terminal 20 may be configured to transition to a screen separate from the talk room and display the content of the messages there, as shown in FIG. 9(b). In that case, a return icon 902 for returning to the display of the original talk room may or may not be displayed; touching the return icon 902 returns the display to the original talk room.
In the embodiment, the speaker in a call is identified using voice feature amounts, but the system may instead be configured so that, for each utterance, the terminal that acquired the utterance attaches information capable of identifying that terminal (or its user) to the voice signal, so that the speaker of each voice can be distinguished. When a smart speaker picks up the voices of a plurality of users and holds a call with the user of another terminal, each speaker of the voices picked up by the smart speaker may be identified by receiving each speaker's position information together with the voice. By using a directional microphone as the microphone of the smart speaker, the speaker can be distinguished by the direction from which the voice arrives, so the smart speaker can attach to the voice information indicating the direction from which it received the voice, enabling the speakers to be distinguished. This allows the message processing unit 211 to display the messages indicating the call content in association with the speaker. Even when the same speaker continues talking, the voice recognition unit 213 may or may not divide the text data obtained by voice recognition at sentence breaks, conversation breaks, context breaks, and the like, and this division may or may not simply be made at the point where the number of characters exceeds a predetermined number. The voice recognition unit 213 may also delete content related to ambient noise from the text data obtained by voice recognition; this may be realized using a known noise-canceling technique, or by using context analysis to remove unnatural words when they appear in the text data. In addition, the voice recognition unit 213 may or may not delete messages corresponding to backchannel responses from the obtained text data. Alternatively, when a backchannel response is made, it may be expressed using image information (as one example and not a limitation, a stamp showing a nodding response) as information indicating that the response was made.
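As a rough illustration of the backchannel handling, the sketch below drops short interjections or swaps them for a stamp placeholder; the word list and the stamp substitution are assumptions.

```python
# Sketch: drop or replace short backchannel utterances before display.

BACKCHANNELS = {"uh-huh", "mm-hm", "yeah", "i see", "right"}

def filter_backchannels(chunks, replace_with_stamp=False):
    """chunks: list of (speaker, text). Returns the list to display."""
    out = []
    for speaker, text in chunks:
        if text.strip().lower().rstrip(".!") in BACKCHANNELS:
            if replace_with_stamp:
                out.append((speaker, "[nod stamp]"))  # image stand-in
            continue  # otherwise drop the interjection entirely
        out.append((speaker, text))
    return out
```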
<Effects of the embodiment>
The effects of Embodiment 1 are described below.
The user of the terminal 20 according to the above embodiment uses the terminal 20 to make a call with another user via the messaging application provided by the server 10. The terminal 20 then displays information indicating the content of the call in the display area of the display unit 24 of the terminal 20, within the talk room of the messaging application. Specifically, the terminal 20 converts the content of the call into text data by performing voice recognition processing on it, and displays the converted text data in the talk room of the messaging application.
With this configuration, when the user of the terminal 20 later wants to remember the content of a call, checking the information indicating the content of the call can help the user recall it. Moreover, the terminal 20 can convert the content of the call into text messages and display them without forcing the user to perform any special operation.
The terminal 20 may display in the talk room a call icon indicating that a call was made, and may switch between showing and hiding the messages indicating the content of the call according to user input on that call icon.
By hiding the messages, the terminal 20 can prevent the talk room from becoming hard to read due to the enormous volume of messages that displaying the content of a prolonged call would produce, while an input on the call icon expands the messages and lets the user see the content of the call.
The terminal 20 need not display all of the text messages obtained by performing voice recognition processing on the content of the call; it may display only some of them, or all of them. Which display mode is used may be determined by the user's settings on the terminal 20.
When not all of the messages are displayed, the display content of the talk room becomes concise and the talk room becomes easier for the user to operate; displaying only part of the call content balances the conciseness of the talk room with letting the user recognize the content of the call, while displaying everything lets the user recognize the content of the call in more detail. By letting the user select and set which display mode to use, the terminal 20 can offer the user greater convenience.
When a call is made, the terminal 20 may use, as the information indicating that the call was made, an image of the other party (as one example and not a limitation, a face image, or the profile image used on the messaging application), and may further display an image of the user of the terminal 20 (likewise, a face image or the profile image used on the messaging application) as well.
This allows the terminal 20 to let the user recognize at a glance that a call was made and who the other party was.
When converting the content of a call into text data by voice recognition processing, the terminal 20 identifies which user is speaking, and displays the converted text data in correspondence with the identified user, as if it were a message sent to the other user.
This allows the terminal 20 to display the messages while distinguishing the user of the terminal 20 from the user at the other end of the call, so that it can later be confirmed who made each utterance.
The terminal 20 may also be configured to display, in the talk room on the messaging application, image information indicating that a call was made, and to start a call with the user linked to that talk room when there is user input on that image information.
With this configuration, even when the user wants to call the user associated with the talk room once again, the user can easily place the call without any complicated input.
<Embodiment 2>
Embodiment 1 described an example in which an ordinary voice call is made between users of the messaging application. Embodiment 2 describes an example in which a video call is made between users of the messaging application.
FIG. 10 is a flowchart showing an operation example of the terminal when the user makes a video call. The messaging application according to the present embodiment also supports calls by video call, that is, a so-called videophone function. As shown in FIG. 10, the call unit 212 of the terminal 20 starts a video call with the other party via the server 10 (step S1001). This starts when the user of the terminal 20 issues a call instruction on the messaging application, or receives a call from another user.
 When the video call starts, the call unit 212 instructs the camera 234 of the input/output unit 23 to start capturing. As the in-camera, the camera 234 images the display unit 24 side of the terminal 20, that is, the user of the terminal 20. The call unit 212 also instructs the microphone 232 to pick up the speech of the user of the terminal 20. During the video call, the call unit 212 transmits the video captured by the camera 234 and the audio picked up by the microphone 232 to the server 10 via the communication I/F 22, and the server 10 forwards them to the call partner's terminal. In turn, the communication I/F 22 of the terminal 20 sequentially receives, via the server 10, the video and audio transmitted from the call partner's terminal, instructs the display processing unit 214 to display the received video on the display unit 24, and instructs the input/output unit 23 to output the received audio from the speaker 233. During the video call, the call unit 212 stores in the storage unit 28 the video and audio captured by the terminal 20 as well as the video and audio transmitted from the call partner's terminal.
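 A minimal sketch of this media loop is shown below, assuming hypothetical camera, mic, server_link, display, speaker, and recorder objects with the methods used here (none of these are real library APIs); it only illustrates the bidirectional relay and local recording described above.

```python
import threading

def run_video_call(camera, mic, server_link, display, speaker, recorder):
    """Relay own media to the partner via the server while playing back and
    recording the partner's media (illustrative only)."""
    stop = threading.Event()

    def uplink():
        while not stop.is_set():
            frame, audio = camera.read(), mic.read()
            server_link.send(frame, audio)        # to the partner via server 10
            recorder.store_local(frame, audio)    # kept in storage unit 28

    def downlink():
        while not stop.is_set():
            frame, audio = server_link.receive()  # partner's media via server 10
            display.show(frame)                   # display unit 24
            speaker.play(audio)                   # speaker 233
            recorder.store_remote(frame, audio)

    for fn in (uplink, downlink):
        threading.Thread(target=fn, daemon=True).start()
    return stop  # set by the caller at step S1003 to end the call
```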
 The terminal 20 ends the video call when an instruction to end the video call is input at the terminal 20, or when the call partner hangs up (step S1003).
 The voice recognition unit 213 of the terminal 20 performs voice recognition on the recorded audio of the video call (step S1004). The control unit 21 of the terminal 20 may, but need not, identify the user's emotion from the content of the video.
 When the voice recognition unit 213 of the terminal 20 finishes voice recognition, it displays the text messages obtained by the voice recognition in the talk room (step S1005). If the control unit 21 has identified the user's emotion, it may, but need not, display each message in a display mode corresponding to the emotion of the user identified for that message. Here, a display mode corresponding to the user's emotion may be, for example: changing the shape of the bubble (speech balloon) in which the message is displayed (e.g., giving the balloon a jagged outline when the user is angry); appending a character expressing a specific emotion to the message (e.g., appending '#' to the end of the message when the user is angry, or a '♪' symbol when the user is happy); or displaying the text in a color corresponding to the emotion. Alternatively, an emoticon or image information expressing the user's emotion (as one non-limiting example, a sticker) may be displayed together with the message.
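 As a non-limiting sketch of the emotion-dependent display modes just described, one might keep a small mapping from emotion labels to bubble shape, suffix character, and text color; the labels and style fields below are illustrative assumptions, not part of the disclosure.

```python
# Illustrative emotion-to-style table (assumed labels and values).
EMOTION_STYLES = {
    "angry":   {"bubble": "jagged", "suffix": "#", "color": "red"},
    "happy":   {"bubble": "round",  "suffix": "♪", "color": "orange"},
    "neutral": {"bubble": "round",  "suffix": "",  "color": "black"},
}

def style_message(text: str, emotion: str) -> dict:
    """Attach a bubble shape, suffix character, and text color to a message
    according to the emotion identified for its speaker."""
    style = EMOTION_STYLES.get(emotion, EMOTION_STYLES["neutral"])
    return {"text": text + style["suffix"],
            "bubble": style["bubble"],
            "color": style["color"]}

print(style_message("Why are you late?", "angry"))
# {'text': 'Why are you late?#', 'bubble': 'jagged', 'color': 'red'}
```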
 The control unit 21 of the terminal 20 determines whether the user switched to, or activated, the out-camera during the video call (step S1006). When the user of the terminal 20 switches to or activates the out-camera, this can be detected from the user's input to the terminal 20; when the call partner switches to the out-camera and transmits the captured video, an unnatural break occurs in the video, and the switch can be detected by detecting that break.
 If a switch to the out-camera occurred during the video call (YES in step S1006), the control unit 21 displays either one frame of the video captured by the out-camera as a still image, or the video captured while the out-camera was active as a moving image, in association with the messages obtained by converting the content of the video call into text in the talk room (step S1007). In the case of a moving image, it may be, but is not limited to, the video from the moment of switching to the out-camera until the moment of switching back to the in-camera. The insertion position of the still image or moving image may be arbitrary: for example, at the beginning or at the end of the text messages obtained by voice recognition of the video call, or at the point corresponding to the timing at which the switch to the out-camera occurred. If no switch to the out-camera occurred during the video call (NO in step S1006), the process proceeds to step S1008.
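 The detection and insertion logic of steps S1006 and S1007 could be sketched as follows; the event names and insertion policies are illustrative assumptions, not the disclosed implementation.

```python
def extract_outcamera_clip(events: list[tuple[float, str]]) -> tuple[float, float] | None:
    """events: (timestamp, name) pairs such as ("switch_out", "switch_in").
    Returns the (start, end) of the first out-camera interval, if any."""
    start = None
    for ts, name in events:
        if name == "switch_out" and start is None:
            start = ts
        elif name == "switch_in" and start is not None:
            return (start, ts)
    return None

def insert_media(messages: list[dict], media: dict, policy: str = "at_switch") -> list[dict]:
    """Insert the media item first, last, or at the message whose timestamp
    follows the camera switch (the three positions named in the text)."""
    if policy == "first":
        return [media] + messages
    if policy == "last":
        return messages + [media]
    idx = next((i for i, m in enumerate(messages)
                if m["ts"] >= media["ts"]), len(messages))
    return messages[:idx] + [media] + messages[idx:]

print(extract_outcamera_clip([(1.0, "switch_out"), (4.5, "switch_in")]))  # (1.0, 4.5)
```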
 The control unit 21 determines whether there is any input relating to position information during the call (step S1008). Here, an input relating to position information may take any form as long as it is an input of information from which the position of the terminal 20 or the call partner's terminal can be identified. Non-limiting examples include: input of a place name or facility name by voice or by direct input from the user or the call partner; an instruction input from the user to acquire position information (GPS position information); automatic acquisition of position information by an always-on GPS; transmission of position information from the call partner; and input from the user of an image or information from which a position can be identified. If there is no input relating to position information during the call (NO in step S1008), the process ends.
 On the other hand, if there is an input relating to position information during the call (YES in step S1008), the control unit 21 inserts an image related to the position information into the talk room (step S1009). Here, an image related to the position information is an image related to the position of the terminal 20 or the position of the call partner's terminal, and may be any image as long as it is so related.
 If a place name or facility is input by voice or direct input from the user or the call partner during the call, map information including the area around that place name may be acquired and inserted as an image, or map information showing the location of the facility, or a photograph showing the appearance of the facility, may be acquired and inserted.
 When the user inputs an instruction to acquire position information, an image of a surrounding map including the acquired position information may be acquired and inserted. Similarly, when the call partner transmits position information during the call, an image of a surrounding map including the received position information may be acquired and inserted.
 The homepage of the store, facility, or the like where the user (or the call partner) is located may also be accepted as information on the user's position; in that case, the address of the homepage and its representative image may be acquired and inserted, an image of the homepage may be inserted, or map information indicating the location identifiable from the homepage may be acquired and inserted.
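 One way to read the above is as a dispatch from input type to inserted image, sketched below with stub helpers standing in for real geocoding, map, and web services (all hypothetical assumptions for illustration).

```python
def geocode(name: str) -> tuple[float, float]:
    # Stub: a real implementation would query a geocoding service.
    return (35.0, 139.0)

def fetch_map(latlon: tuple[float, float]) -> str:
    # Stub: a real implementation would render or download a map tile.
    return f"map_image_around_{latlon[0]}_{latlon[1]}.png"

def fetch_page_image(url: str) -> str:
    # Stub: a real implementation would fetch the page's representative image.
    return f"representative_image_of_{url}"

def resolve_location_image(kind: str, value):
    """Map the position-related inputs named in the text to an image to insert."""
    if kind == "place_name":   # place or facility name, spoken or typed
        return fetch_map(geocode(value))
    if kind == "gps":          # (lat, lon) acquired from the terminal
        return fetch_map(value)
    if kind == "homepage":     # URL of the store/facility the user is at
        return fetch_page_image(value)
    return None                # no position-related input: insert nothing

print(resolve_location_image("place_name", "AA Mart"))
```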
 Note that the processing of steps S1008 and S1009 may be executed not only for video calls but also for ordinary calls. The number of images inserted is not limited to one and may be any number; a limit on the number may or may not be provided. Furthermore, it is not necessary to perform all three of the following pairs of processing: steps S1004 and S1005, steps S1006 and S1007, and steps S1008 and S1009; at least one of them may be performed, or at least two of the three may be combined and executed.
 In the above, images (still images, moving images) captured when the out-camera was activated (switched to) during a video call are displayed in the talk room as information indicating the content of the call, together with (or without) the messages indicating the content of the call; however, this is not limiting either. First, the image displayed in the talk room as an image showing the content of the call is not limited to one captured by the out-camera and may be one captured by the in-camera. Accordingly, as one example of images captured by the in-camera, the face images of the respective users involved in the call may be displayed in the talk room.
 The display of images is not limited to insertion between messages. For example, an image may be displayed as the background image of the section displaying the messages indicating the content of the call. In this case, the image need not be displayed as the background of all the messages; it may be configured to be displayed only for the period during which the conversation related to the acquired image took place. The period of conversation related to the image can be determined by analyzing the text messages obtained by voice recognition processing of the content of the call. An example of this will be described with reference to FIG. 11.
 A specific example of an input of information relating to position during a call, and of the talk room display at that time, is described below.
 FIG. 11 shows an example of a call and a display example of the talk room displayed after that call. FIG. 11(a) shows part of the call, and FIG. 11(b) shows an example of the situation following FIG. 11(a). FIG. 11(c) shows a display example of the talk room after the call.
 As shown in FIG. 11(a), suppose that the user 10a of the terminal 20a makes a voice call or video call to the user 10b of the terminal 20b asking where the user 10b is. In response, as shown in FIG. 11(b), the user 10b takes a photograph of a nearby facility as information on the place where he or she is.
 When an exchange such as that shown in FIGS. 11(a) and 11(b) takes place during a call, as one example, the terminal 20 inserts into the talk room an image 1101 based on the position information relating to the terminal 20b acquired by the terminal 20b, as shown in FIG. 11(c). Here, the terminal 20b may display the image obtained by the photographing shown in FIG. 11(b) as-is in the talk room as an image relating to the position of the terminal 20b, or may extract position-related information from the photographed image by image recognition processing and then acquire and display an image from the network based on that information. In the example of FIG. 11(b), the wording "AA Mart" is extracted from the image captured by the user 10b with the terminal 20b, that wording is searched for on the Internet, and an image obtained by the search (as one non-limiting example, an image of a homepage) is displayed as shown in FIG. 11(c). In the example of FIG. 11(c), the image 1101 is displayed following the utterance of the user 10b in FIG. 11(b) so as to be synchronized with the timing at which the photograph was taken; however, as described above, the image 1101 may instead be displayed as the background image of the talk room, or may be inserted at the beginning or at the end of the messages indicating the content of the call.
 FIG. 12(a) is a diagram showing a display example in which an image acquired based on information on the position of a terminal is displayed as the background of the talk room, and FIG. 12(b) is a diagram showing a display example of the talk room of FIG. 12(a) scrolled up. As shown in FIG. 12(a), the terminal 20 displays, as the background image of the talk room, an image identified from the information on the position of the terminal identified during the call.
 As shown in FIG. 12(a), the terminal 20 displays, as the background image of the talk room, an image acquired during the call (non-limiting examples include an image relating to the position of the terminal, an image input by the user during the call, an image photographed by the user during the call, and an image relating to the content of the call), and displays the messages indicating the content of the call superimposed on that background image. By displaying as the background of the messages an image identified from the position information on the terminal identified during the call, as shown in FIG. 12(a), the content of the call can be made easier for the user to recall, together with the content of the messages themselves. The background image may, but need not, be displayed only during the section T1 in which the messages of the related topic are displayed. That is, as shown in FIG. 12(b), in the section T2 an image based on the information on the position of the terminal acquired during the topic is displayed as the background image, while in the section T3 no background image is displayed. By linking the display section of the messages of a topic related to an image with the display section in which the image acquired based on the terminal position information obtained during that topic is displayed as the background, the atmosphere at the time of the call can be reproduced, making the content of the call easier for the user to recall.
 Another display example is described with reference to FIG. 13. FIG. 13(a) shows part of a call, and FIG. 13(b) shows an example of the situation following FIG. 13(a). FIG. 13(c) shows a display example of the talk room displayed on the terminal 20 when the call shown in FIGS. 13(a) and 13(b) is made.
 As shown in FIG. 13(a), the user 10a proposes, via a voice call or video call, that the user 10b of the terminal 20b visit a certain place, and in response the user 10b asks for a description of that place.
 In response to the request from the user 10b, the user 10a inputs position information during the call using his or her own terminal 20a. This input of position information may be, for example, an instruction input to acquire position information if the user is at (or near) the destination store, a direct input of the position information of the destination as recognized by the user (non-limiting examples include latitude/longitude information and address information), or a web page carrying information related to the destination.
 When a call including exchanges such as those shown in FIGS. 13(a) and 13(b) takes place, the terminal 20 inserts between the messages and displays, based on the position information input in FIG. 13(b), a map 1301 showing the location of the destination, as shown in FIG. 13(c). The image related to the position information is not limited to the map 1301 and may be another image: for example, image information relating to the homepage of the destination, or its address information.
 As shown in FIGS. 12 and 13, the terminal 20 can not only display messages based on the conversation between the users during the call, but also automatically collect and display images based on the position information input during the call. As a result, when a call is made through the talk room, the terminal 20 can provide more information indicating the content of that call.
 In the above, the voice recognition unit 213 executes voice recognition after the video call ends; however, this is not limiting, and the recognition may be executed during the call. Furthermore, when the user makes a call using the terminal 20 in speakerphone mode, the terminal 20 may perform voice recognition in real time so that messages analyzed and converted in real time are displayed in the talk room while the call is in progress. In this way, even for a video call, the terminal 20 can display the content of the users' conversation in that video call as messages in the talk room. It is also conceivable that some kind of lesson, specifically an English conversation (language) lesson, is conducted over a video call; in such a case, the terminal 20 may collect more appropriate phrasings in that language from a network or the like and display them together with the text messages.
<Effects of Embodiment>
 The effects of Embodiment 2 are described below.
 The user of the terminal 20 makes a call with another user by video call via the messaging application provided by the server 10. In this case, the terminal 20 may display in the talk room an image photographed during the call (including a video call), or photographed by the call partner's terminal, or information based on such an image.
 This allows the terminal 20 to make the content of the conversation during the call easier for the user to recall.
 Based on a camera-switching instruction from the user, the terminal 20 may display in the talk room an image photographed during the call by the out-camera, which is provided on the side opposite to the side on which the display unit 24 of the terminal 20 is located.
 When an image is photographed with the out-camera, particularly during a video call, it is highly likely that the photographing was closely related to the content of the call; by displaying information based on that image in the talk room, the user can more easily be reminded later of the content of the call. In addition, by displaying the image photographed by the out-camera in the talk room with the user's switching of the active camera (from the in-camera to the out-camera) as the trigger, information that makes the content of the call easier to recall can be generated and displayed automatically.
 The terminal 20 may also display in the talk room an image input by the user during the video call. In this case, the terminal 20 may display the image between the messages indicating the content of the call so as to match the timing at which the image was input; however, this is not limiting, and the image may be displayed at the beginning or at the end of the messages relating to the call.
 This allows the terminal 20 to make it easier for the user to remember the content of the call by looking at the image later.
 The terminal 20 may also display an image acquired during the call as the background image of the messages indicating the content of the call.
 In this way, by checking the content of the messages (text) indicating the call content while also seeing, as a background image, the images viewed, photographed, or acquired during the call, the user can more easily recall the content of the call.
 In addition to images input by the user and images obtained by photographing during the call, the terminal 20 may acquire an image based on information on the position of the terminal 20, or on the position of the call partner's terminal, and display it in association with the messages.
 By acquiring an image based on the position of the terminal 20 during the call or the position of the call partner's terminal, the terminal 20 can, as one non-limiting example, remind the user of the content of the call by letting the user recognize where the call was made or where the call partner was.
 When a call is made, the terminal 20 may also display images in accordance with the content of the call. After converting the content of the call into text messages by voice recognition processing, the terminal 20 analyzes the content of the call by morphological analysis, context analysis, and the like, and displays highly relevant images based on the results of the analysis. As one non-limiting example, if the call includes a topic about a certain store, the terminal 20 may display a photograph of that store as an image associated with the messages; if the call includes a topic about a certain food, it may display a photograph of that food as an image associated with the messages.
 By displaying images highly relevant to the topic, the terminal 20 can make it easy for the user to recall the content of the call.
<Embodiment 3>
 FIG. 14 is a flowchart showing an operation example of processing for realizing a display mode that allows the user to easily recognize the content of a call made in the talk room. The terminal 20 may or may not execute the processing shown in FIG. 14. Although not illustrated, the terminal 20 may also be configured so that whether or not to execute the processing shown in FIG. 14 can be selectively set according to input from the user. The processing shown in FIG. 14 is an example of the processing from step S404 in FIG. 4 onward.
 As shown in FIG. 14, the voice recognition unit 213 executes voice recognition processing on the recorded audio (step S404).
 The control unit 21 determines the amount of text in the text data obtained by the voice recognition unit 213 through voice recognition (step S1405). As one non-limiting example, the control unit 21 may use the number of characters in the text data, or the data size of the text data, as the amount of text. The control unit 21 then determines the display size of the call icon 611 based on the determined amount of text (step S1406). Specifically, the control unit 21 determines the display size so that the larger the amount of text, the larger the display size of the call icon 611. As one non-limiting example, the control unit 21 may determine the display size by a predefined function that takes the amount of text as input, or may store in advance in the storage unit 28 a table in which display sizes are defined for ranges of text amounts and determine the display size according to that table. Although the display size of the call icon 611 is determined here based on the amount of text after conversion, the length of the call time may be used instead. That is, since a longer call suggests a denser conversation, the display size of the call icon 611 is made larger; since a shorter call suggests a more concise conversation, the display size of the call icon 611 is made smaller.
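 Both the table-based size determination of step S1406 and the color-depth variant described later with reference to FIG. 16 reduce to a monotone mapping from transcript length (or call duration) to a visual attribute. The breakpoints and pixel sizes in this sketch are illustrative assumptions, not values from the disclosure.

```python
SIZE_TABLE = [        # (text-length upper bound, icon size in pixels) - assumed values
    (100, 48),
    (500, 64),
    (2000, 80),
]

def icon_size(char_count: int) -> int:
    """Larger transcripts yield a larger call icon (table-based variant)."""
    for upper, size in SIZE_TABLE:
        if char_count <= upper:
            return size
    return 96  # anything beyond the last breakpoint

def icon_shade(char_count: int, max_chars: int = 2000) -> float:
    """Color-depth variant: 0.0 = lightest, 1.0 = darkest (cf. FIG. 16)."""
    return min(char_count / max_chars, 1.0)

print(icon_size(250), icon_shade(250))  # -> 64 0.125
```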
 The control unit 21 also executes context analysis on the text using morphological analysis and the like (step S1407). This can be realized using existing text-mining techniques. From the analysis results, the control unit 21 then determines a heading presumed to be appropriate as the title of the call content (step S1408). As one non-limiting example, this heading may use wording that appears frequently in the analyzed text data, or wording inferred as some kind of schedule from the analysis results of the text data. The wording used for the heading may be based on what the user of the terminal 20 said, on what the call partner said, or on both. Furthermore, when the conversation includes content relating to a schedule, the terminal 20 may, but need not, start a schedule management application, separate from the messaging application, and register that schedule on a calendar.
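 As one non-limiting sketch of steps S1407 and S1408, a heading can be derived from word frequencies in the transcript. A production system would use proper morphological analysis; the tokenizer and stop-word list here are illustrative assumptions.

```python
from collections import Counter
import re

STOP_WORDS = {"the", "a", "an", "is", "are", "to", "of", "on", "and", "we"}

def derive_heading(transcript: str, max_words: int = 2) -> str:
    """Pick the most frequent content words as a candidate heading."""
    words = [w for w in re.findall(r"[A-Za-z']+", transcript.lower())
             if w not in STOP_WORDS]
    most_common = [w for w, _ in Counter(words).most_common(max_words)]
    return " ".join(most_common).title()

print(derive_heading(
    "Saturday works for me. Let's have a drinking party on Saturday. "
    "A drinking party sounds great."
))  # -> "Saturday Drinking" (frequency-based, illustrative only)
```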
 The control unit 21 then causes the display processing unit 214 to display the call icon 611 in the talk room at the determined display size, together with the determined heading associated with that call icon 611 (step S1409), and the processing ends.
 FIG. 15 shows a display example in which the size of the call icon is changed according to the call volume (the amount of text in the messages of the call content). FIG. 15(a) shows a display example of the call icon 1501 displayed when the call volume is relatively small. Note that, for readability, FIG. 15 shows a state in which the messages are not expanded. FIG. 15(b) shows a display example of the call icon when a call with a larger call volume than the call corresponding to the call icon 1501 in FIG. 15(a) has been made. As shown in FIG. 15(b), the call icon 1502 is displayed at a larger size than the call icon 1501 shown in FIG. 15(a). As shown in FIG. 15, by displaying the call icon according to the call volume, the user can intuitively recognize at a glance how much was talked about.
 The method of indicating the call volume is not limited to the size of the call icon, as described above. For example, as shown in FIG. 16, the call volume may be expressed by the shading of the color of the call icon. In FIG. 16, colors are indicated by hatching. FIG. 16(a) shows the call icon 1601 when the call volume was relatively small. In contrast, FIG. 16(b) shows a display example of the call icon 1602 for a call volume larger than that of the call corresponding to the call icon 1601 in FIG. 16(a). As shown by the call icon 1602, when the call volume is larger than that of the call corresponding to the call icon 1601 shown in FIG. 16(a), the call volume is indicated by displaying the call icon 1602 in a darker color. That is, the shading of the color of the call icon allows the user to recognize the call volume at a glance. In this way, the call volume can be expressed by the display mode of the call icon.
 The image displayed as the call icon is not limited to a symbol indicating a call as shown in FIGS. 15 and 16; it may be an image related to the call, or a combination of such an image with the icon. That is, an image related to the call (an image relating to the position of the terminal during the call) may be displayed as the background image of the call symbol of the call icons shown in FIGS. 15 and 16. Specifically, as shown in FIG. 17, the terminal 20 may display, at the display position where the call icon would be displayed, an image related to the call (non-limiting examples of call-related information include an image relating to the position of the terminal, an image input by the user during the call, an image photographed by the user during the call, and an image relating to the content of the call) as a substitute for the call icon. In this case, as shown in FIG. 17(a), the terminal 20 may display part of the call-related image acquired during the call within the outline of the call icon, as shown by the image 1701, or, as shown in FIG. 17(b), may display the call-related image as-is, as shown by the image 1702, regardless of the outline of the call icon. Furthermore, when displaying the image 1702 as shown in FIG. 17(b), the terminal 20 may also display the call icon 1801, as shown in FIG. 18(a), to make it clear that the displayed image 1702 relates to a call. In FIG. 18(a), the call icon 1801 is displayed superimposed on the image 1702; however, it may be displayed outside the frame of the image 1702 as long as it can be understood that the call icon 1801 is associated with the image 1702. The terminal 20 may further enlarge the image 1702 as shown in FIG. 18(b) upon detecting a touch input from the user on a portion of the image 1702 other than the call icon 1801. At this time, the call icon 1801 may or may not be displayed; FIG. 18(b) shows an example in which the call icon 1801 is not displayed. In addition, when the terminal 20 detects a touch input from the user on the call icon 1801 shown in FIG. 18(a), the call unit 212 of the terminal 20 may be configured to initiate a call to the user corresponding to the talk room.
 FIG. 19 is a diagram showing a display example when a heading is attached to the content of a call. FIG. 19(a) shows an example in which the audio of a call has been converted into text messages by voice recognition processing and displayed as messages in the talk room. As shown in the messages of FIG. 19(a), it can be understood that the users are making plans for a drinking party. Given such an exchange, the control unit 21 of the terminal 20 performs morphological analysis and context analysis on the text of the messages and identifies, as one example, that a drinking party will be held and that it will be held on Saturday. The control unit 21 of the terminal 20 then displays a heading 1902 indicating the content of the call in association with the call icon 1901. In the example shown in FIG. 19(b), the heading 1902 with the content "Saturday drinking party" is displayed. In this way, the terminal 20 can display not only messages indicating the content of a call but also a heading 1902 indicating the content of that call. This allows the user to recognize the content of the call without reading all the messages indicating the content of the call.
 While FIG. 19 shows an example in which a heading is attached, the terminal 20 may instead, to make the content of the call even easier to recognize, convert the content of the call into text messages by voice recognition processing, then recognize the content of the call using analysis techniques such as morphological analysis and context analysis, and display a summarized text. Summarizing addresses the problem that, when a call runs long and the amount of text to be displayed as messages grows, displaying everything would require time and effort for the user to read; by summarizing, the displayed messages are simplified while still allowing the user to recognize the content of the call. The summary may or may not be displayed as if it were the conversation of one of the users involved in the call. Also, when the conversation includes content relating to some kind of schedule, the summary may, but need not, always include that schedule.
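 A naive extractive summarizer along these lines is sketched below: it keeps the sentences whose words are most frequent in the transcript. The actual summarization technique is left open by the text, so this frequency heuristic is purely illustrative.

```python
from collections import Counter
import re

def summarize(transcript: str, max_sentences: int = 2) -> str:
    """Keep the highest-scoring sentences, re-emitted in original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", transcript) if s.strip()]
    freq = Counter(re.findall(r"[A-Za-z']+", transcript.lower()))
    # Score each sentence by the total corpus frequency of its words.
    scored = sorted(
        sentences,
        key=lambda s: sum(freq[w] for w in re.findall(r"[A-Za-z']+", s.lower())),
        reverse=True,
    )
    kept = set(scored[:max_sentences])
    return " ".join(s for s in sentences if s in kept)

print(summarize("Are you free on Saturday? Yes. Then let's have a drinking "
                "party on Saturday. Sounds good, Saturday it is."))
```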
<Effects of Embodiment>
 The effects of the embodiment are described below.
 The terminal 20 may display the image information indicating that a call has been made (as one non-limiting example, the call icon) based on the amount of characters in the text data obtained by voice recognition of the call content, or on the length of the call time. Displaying the call icon based on the character amount or call time may mean changing the display size of the call icon, or changing its display color, according to how large the character amount is or how long the call time is.
 This makes it easier to infer the liveliness and volume of the conversation at that time merely by looking at the size of the call icon, without viewing the content of the messages about the call, which can be one factor in reminding the user of the content of the call.
 In addition to displaying the call icon as an image indicating that a call was made in the talk room, the terminal 20 may display in the talk room an image relating to the call (non-limiting examples include an image relating to the position of the terminal, an image input by the user during the call, an image photographed by the user during the call, and an image relating to the content of the call). That is, instead of the call icon, an image relating to the content of the call may be displayed in the talk room as information indicating that the call was made.
 In this way, by displaying an image of the actual call rather than a call icon as the call-related image, the user can be made to recognize the content of the call without viewing the messages indicating the call content.
 When displaying information indicating the content of a call, the terminal 20 may also analyze the call content, perform processing to convert the analysis results into a summary sentence indicating the content of the call, and display that summary. As one non-limiting example, this may be realized by generating a learning model through a learning process that uses, as teacher data, the contents of calls between users and summaries of those contents, and creating the summary by inputting the text data obtained by the voice recognition processing into that learning model.
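 The learning-based variant is described only at the level of teacher data, that is, pairs of call transcripts and summaries. The toy "model" below merely returns the summary of the most word-overlapping training transcript; it stands in for whatever learning model is actually used and is purely illustrative.

```python
from collections import Counter
import re

def _bag(text: str) -> Counter:
    return Counter(re.findall(r"[A-Za-z']+", text.lower()))

class OverlapSummarizer:
    def __init__(self):
        self.examples: list[tuple[Counter, str]] = []

    def train(self, pairs: list[tuple[str, str]]):
        """pairs: (call transcript, human-written summary) teacher data."""
        self.examples = [(_bag(t), s) for t, s in pairs]

    def summarize(self, transcript: str) -> str:
        query = _bag(transcript)
        # Return the summary of the most word-overlapping training transcript.
        best = max(self.examples, key=lambda ex: sum((ex[0] & query).values()))
        return best[1]

model = OverlapSummarizer()
model.train([("Let's meet Saturday for drinks.", "Drinks on Saturday"),
             ("Can you send the report today?", "Report due today")])
print(model.summarize("Drinks this Saturday then? Sure, Saturday."))
```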
 This allows the terminal 20 to make the user recognize the content of the call in simple form. Moreover, by displaying a summary rather than the entirety of a long call, the convenience of the messaging application can be improved, and the visual design of the display can also be improved.
1 Communication system
  10 Server
    11 Control unit
      111 Message processing unit
    12 Input/output unit
    13 Display unit
    14 Communication I/F (communication unit)
  20 Terminal
    21 Control unit
      211 Message processing unit
      212 Call unit
      213 Voice recognition unit
      214 Display processing unit
    22 Communication I/F
    23 Input/output unit
      231 Touch panel
      232 Microphone
      233 Speaker
      234 Camera
    24 Display unit (display)
    25 Position information acquisition unit
    28 Storage unit
  30 Network

Claims (22)

  1.  An information processing method for a terminal that transmits content to a first terminal or receives content transmitted from the first terminal, the method comprising:
     displaying, in a display area of the terminal, first content transmitted from the first terminal and second content transmitted to the first terminal by a communication unit of the terminal;
     performing, by a control unit of the terminal, control relating to a call with the first terminal based on an input by a user of the terminal to the display area displaying the first content and the second content;
     acquiring, by the control unit, first information based on a voice of a user of the first terminal and second information based on a voice of the user of the terminal, based on the call with the first terminal; and
     displaying, in the display area, call information based on the first information and the second information.
  2.  The information processing method according to claim 1, further comprising:
     acquiring, by the control unit of the terminal, at least an image captured during the call; and
     controlling, by the control unit, a display mode in which the call information is displayed in the display area, based on the image.
  3.  The information processing method according to claim 1, further comprising:
     acquiring, by the control unit of the terminal, an image captured during the call; and
     displaying the call information and the image in the display area.
  4.  The information processing method according to claim 3, further comprising:
     controlling, by the control unit, capture of the image based on activation of an imaging unit of the terminal on a side opposite to the display area.
  5.  The information processing method according to claim 3 or 4, wherein
     the image is a moving image, and
     the method further comprises displaying an image input by the user of the terminal while the moving image is being captured, between the first information and the second information.
  6.  The information processing method according to claim 3 or 4, wherein the call information is displayed in the display area superimposed on the image.
  7.  The information processing method according to claim 1, further comprising:
     acquiring, by the control unit, an image based on the terminal; and
     displaying the call information and the image in the display area.
  8.  The information processing method according to claim 7, wherein the image is acquired based on information on a position of the terminal based on the call, or on information on a position of the first terminal.
  9.  The information processing method according to any one of claims 1 to 8, further comprising:
     displaying, in the display area, a first display for displaying the call information; and
     displaying the call information in the display area based on an input by the user of the terminal to the first display.
  10.  The information processing method according to claim 9, further comprising controlling, by the control unit, a display mode of the first display based on at least one of a duration of the call and the call information.
  11.  The information processing method according to claim 9 or 10, wherein the first display includes at least a part of the call information.
  12.  The information processing method according to claim 9 or 10, wherein the first display is an image related to the call.
  13.  The information processing method according to claim 12, wherein the image related to the call is an image related to a position of the terminal, or of the first terminal, at the time of the call.
  14.  The information processing method according to any one of claims 1 to 13, wherein the call information includes content based on the first information and content based on the second information.
  15.  The information processing method according to claim 14, further comprising:
     displaying, in the display area, an image representing the user of the first terminal corresponding to the content based on the first information; and
     displaying, in the display area, an image representing the user of the terminal corresponding to the content based on the second information.
  16.  The information processing method according to claim 15, wherein
     the first information is identified as voice information of the user of the first terminal based on information of the first terminal, and
     the second information is identified as voice information of the user of the terminal based on information of the terminal.
  17.  The information processing method according to any one of claims 1 to 13, wherein the call information is information in which the content of the call is summarized based on the first information and the second information.
  18.  The information processing method according to any one of claims 1 to 17, further comprising making, by the control unit, a setting not to display the call information in the display area.
  19.  The information processing method according to any one of claims 1 to 18, wherein, in the control relating to the call with the first terminal, the first content, the second content, and an image for controlling the call are displayed in the display area, and control relating to starting the call with the first terminal is performed by the control unit based on an input by the user of the terminal to the image for controlling the call.
  20.  The information processing method according to any one of claims 1 to 19, wherein
     the first content, the second content, and the call information are displayed in the display area by application software stored in a storage unit of the terminal, and
     the call is executed by the application software.
  21.  A program causing a computer of a terminal that transmits content to a first terminal or receives content transmitted from the first terminal to execute:
     displaying, in a display area of the terminal, first content transmitted from the first terminal and second content transmitted to the first terminal by a communication unit of the terminal;
     performing, by a control unit of the terminal, control relating to a call with the first terminal based on an input by a user of the terminal to the display area displaying the first content and the second content;
     acquiring, by the control unit, first information based on a voice of a user of the first terminal and second information based on a voice of the user of the terminal, based on the call with the first terminal; and
     displaying, in the display area, call information based on the first information and the second information.
  22.  A terminal that transmits content to a first terminal or receives content transmitted from the first terminal, the terminal comprising:
     a display unit that displays first content transmitted from the first terminal and second content transmitted to the first terminal by a communication unit of the terminal; and
     a control unit that performs control relating to a call with the first terminal based on an input by a user of the terminal to the display unit displaying the first content and the second content, wherein
     the control unit acquires first information based on a voice of a user of the first terminal and second information based on a voice of the user of the terminal, based on the call with the first terminal, and
     the display unit displays call information based on the first information and the second information.
PCT/JP2019/045439 2019-03-19 2019-11-20 Information processing method, program, and terminal WO2020188885A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2019-051969 2019-03-19
JP2019051969A JP6832971B2 (en) 2019-03-19 2019-03-19 Programs, information processing methods, terminals

Publications (1)

Publication Number Publication Date
WO2020188885A1 true WO2020188885A1 (en) 2020-09-24

Family

ID=72520751

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2019/045439 WO2020188885A1 (en) 2019-03-19 2019-11-20 Information processing method, program, and terminal

Country Status (2)

Country Link
JP (2) JP6832971B2 (en)
WO (1) WO2020188885A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7489152B2 2022-02-25 2024-05-23 Bsize Inc. Information processing terminal, information processing device, information processing method, and information processing program


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017517228A * 2014-05-23 2017-06-22 Samsung Electronics Co., Ltd. System and method for providing voice/text call service
US20160205049A1 * 2015-01-08 2016-07-14 LG Electronics Inc. Mobile terminal and controlling method thereof
JP2016149158A * 2016-04-28 2016-08-18 Casio Computer Co., Ltd. Method of generating social timeline, social network service system, server, terminal, and program

Also Published As

Publication number Publication date
JP7057455B2 (en) 2022-04-19
JP6832971B2 (en) 2021-02-24
JP2021119455A (en) 2021-08-12
JP2020154652A (en) 2020-09-24

Similar Documents

Publication Publication Date Title
CN106024014B (en) A kind of phonetics transfer method, device and mobile terminal
US8373799B2 (en) Visual effects for video calls
EP2210214B1 (en) Automatic identifying
JP2023539820A (en) Interactive information processing methods, devices, equipment, and media
JP6219642B2 (en) Intelligent service providing method and apparatus using input characters in user device
EP2607994A1 (en) Stylus device
KR20170048964A (en) Method and apparatus of providing message, Method and apparatus of controlling display and computer program for executing one of the method
JP2005346252A (en) Information transmission system and information transmission method
CN113259740A (en) Multimedia processing method, device, equipment and medium
KR20110052898A (en) Method for setting background screen and mobile terminal using the same
EP2747464A1 (en) Sent message playing method, system and related device
KR20140078258A (en) Apparatus and method for controlling mobile device by conversation recognition, and apparatus for providing information by conversation recognition during a meeting
JP2022020659A (en) Method and system for recognizing feeling during conversation, and utilizing recognized feeling
CN110704647A (en) Content processing method and device
US10965629B1 (en) Method for generating imitated mobile messages on a chat writer server
US9110888B2 (en) Service server apparatus, service providing method, and service providing program for providing a service other than a telephone call during the telephone call on a telephone
JP7057455B2 (en) Programs, information processing methods, terminals
KR102086780B1 (en) Method, apparatus and computer program for generating cartoon data
US20140129228A1 (en) Method, System, and Relevant Devices for Playing Sent Message
KR20150129182A (en) Method and apparatus of providing messages
JP7307228B2 (en) program, information processing method, terminal
KR20140097668A (en) Method for providing mobile photobook service based on online
CN115048949A (en) Multilingual text replacement method, system, equipment and medium based on term base
CN113450762A (en) Character reading method, device, terminal and storage medium
CN112562733A (en) Media data processing method and device, storage medium and computer equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19920164

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19920164

Country of ref document: EP

Kind code of ref document: A1