WO2020188885A1 - Information processing method, program, and terminal - Google Patents

Information processing method, program, and terminal

Info

Publication number
WO2020188885A1
WO2020188885A1 (PCT/JP2019/045439)
Authority
WO
WIPO (PCT)
Prior art keywords
terminal
call
information
user
content
Prior art date
Application number
PCT/JP2019/045439
Other languages
French (fr)
Japanese (ja)
Inventor
亮介 濱窄
Original Assignee
Line株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Line株式会社 filed Critical Line株式会社
Publication of WO2020188885A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F 3/0481 Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M 1/00 Substation equipment, e.g. for use by subscribers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M 1/00 Substation equipment, e.g. for use by subscribers
    • H04M 1/64 Automatic arrangements for answering calls; Automatic arrangements for recording messages for absent subscribers; Arrangements for recording conversations
    • H04M 1/65 Recording arrangements for recording a message from the calling party

Definitions

  • This disclosure relates to an information processing method of a terminal, a program, and a terminal.
  • Patent Document 1 discloses an example of such a system.
  • A first aspect of the present invention is an information processing method for a terminal that transmits content to a first terminal or receives content transmitted from the first terminal. The method includes: displaying, in a display area of the terminal, first content transmitted from the first terminal and second content transmitted to the first terminal by a communication unit of the terminal; controlling, by a control unit of the terminal, a call with the first terminal based on an input by the user of the terminal to the display area in which the first content and the second content are displayed; acquiring, by the control unit and based on the call with the first terminal, first information based on the voice of the user of the first terminal and second information based on the voice of the user of the terminal; and displaying, in the display area, call information based on the first information and the second information.
  • Another aspect is a program to be executed by a computer of a terminal that transmits content to a first terminal or receives content transmitted from the first terminal. The program causes the terminal to: display, in a display area of the terminal, first content transmitted from the first terminal and second content transmitted to the first terminal by a communication unit of the terminal; control, by a control unit of the terminal, a call with the first terminal based on an input by the user of the terminal to the display area in which the first content and the second content are displayed; acquire, by the control unit and based on the call with the first terminal, first information based on the voice of the user of the first terminal and second information based on the voice of the user of the terminal; and display, in the display area, call information based on the first information and the second information.
  • Yet another aspect is a terminal that transmits content to a first terminal or receives content transmitted from the first terminal. The terminal includes: a display unit that displays first content transmitted from the first terminal and second content transmitted to the first terminal by a communication unit of the terminal; and a control unit that controls a call with the first terminal based on an input by the user of the terminal to the display unit displaying the first content and the second content. The control unit acquires, based on the call with the first terminal, first information based on the voice of the user of the first terminal and second information based on the voice of the user of the terminal, and the display unit displays call information based on the first information and the second information.
  • (B) is a screen view showing an example of a talk room after a call on a terminal.
  • (A) is a screen view showing an example of a talk room in a state where the contents of a call on the terminal are not expanded.
  • (B) is a screen view showing an example of displaying a talk room on a terminal and expanding the contents of a call.
  • (A) is a diagram showing how the user brings his/her finger closer to the call icon.
  • (B) is a screen view showing an example in which the call icon is enlarged and displayed.
  • (A) is a screen view showing an example of displaying a message indicating the content of a call by pop-up.
  • (B) is a screen view showing an example of transitioning to another screen and displaying a message indicating the contents of a call.
  • (A) is a schematic diagram showing a part of a call
  • (b) is a schematic diagram showing an example of a situation following FIG. 12 (a)
  • (c) is a display example of a talk room after a call.
  • (A) is a screen view showing an example of displaying an image relating to the position of a terminal as a background image.
  • (B) is a screen view showing a display example in which the background image and the contents of the call are linked.
  • (A) is a schematic diagram showing a part of a call
  • (b) is a schematic diagram showing an example of a situation following FIG. 13 (a)
  • (c) is a display example of a talk room after a call.
  • (A) is a screen view showing a display example when the case where the call volume is relatively small is expressed by the display size of the call icon.
  • (B) is a screen view showing a display example of a call icon when the call volume is larger than that of (a).
  • (A) is a screen view showing a display example in which the case where the call volume is relatively small is represented by the color of the call icon.
  • (B) is a screen view showing a display example of a call icon when the call volume is larger than that of (a).
  • (A) and (b) are screen views showing an example of displaying an image related to the contents of a call in a talk room as an alternative to a call icon.
  • (A) is a screen view showing an example in which an image relating to the contents during a call is displayed in the talk room as an alternative to the call icon, and the call icon is also displayed.
  • (B) is a screen view showing an example in which an image is enlarged and displayed as a substitute for a call icon.
  • (A) is a screen view showing a display example of a message showing the contents of a call.
  • (B) is a screen view showing an example of a heading (summary) to be displayed in the case of the call content shown in (a).
  • FIG. 1 shows the configuration of the communication system 1 according to the embodiment of the present disclosure.
  • the server 10 and the terminal 20 are connected via the network 30.
  • The server 10 provides, via the network 30, a service for transmitting and receiving messages between the terminals 20 owned by users.
  • the number of terminals 20 connected to the network 30 is not limited.
  • the network 30 plays a role of connecting one or more terminals 20 and one or more servers 10. That is, the network 30 means a communication network that provides a connection route so that data can be transmitted and received after the terminal 20 connects to the server 10.
  • The network 30 may be a wired network or a wireless network.
  • Non-limiting examples of the network 30 include an ad hoc network, an intranet, an extranet, a virtual private network (VPN), a local area network (LAN), and a wireless network.
  • the network 30 may include one or more networks 30.
  • the terminal 20 may be any information processing terminal that can realize the functions described in each embodiment.
  • Non-limiting examples of the terminal 20 include a smartphone, a mobile phone (feature phone), a computer (as non-limiting examples, a desktop, a laptop, or a tablet), a media computer platform (as non-limiting examples, a cable or satellite set-top box, or a digital video recorder), a handheld computer device (as non-limiting examples, a PDA (personal digital assistant) or an email client), a wearable terminal (a glasses-type device, a watch-type device, or the like), or another type of computer or communication platform. The terminal 20 may also be expressed as an information processing terminal.
  • Since the configurations of the terminals 20A, 20B, and 20C are basically the same, they are collectively described as the terminal 20 in the following description. Where necessary, the terminal used by user X is denoted as terminal 20X, and the user information in the predetermined service associated with user X or terminal 20X is denoted as user information X.
  • the user information is user information associated with an account used by the user in a predetermined service.
  • Non-limiting examples of the user information include information associated with the user that is input by the user or assigned by the predetermined service, such as the user's name, icon image, age, gender, address, hobbies and preferences, and identifier; the user information may be any one of these or a combination thereof. A sketch of such a record follows.
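  • As a non-limiting illustration, such user information can be modeled as a simple record. The field names and types in the sketch below are assumptions made for illustration only, not a data layout defined by this disclosure.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class UserInfo:
    """Hypothetical model of the user information tied to a service account."""
    user_id: str                           # the user's identifier
    name: str                              # the user's name
    icon_image_url: Optional[str] = None   # the user's icon image
    age: Optional[int] = None
    gender: Optional[str] = None
    address: Optional[str] = None
    hobbies: list[str] = field(default_factory=list)  # hobbies and preferences

# Example: user information X associated with user X (terminal 20X)
user_info_x = UserInfo(user_id="user-x", name="User X", hobbies=["music"])
print(user_info_x)
```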
  • the server 10 has a function of providing a predetermined service to the terminal 20.
  • the server 10 may be any device as long as it is an information processing device that can realize the functions described in each embodiment.
  • Non-limiting examples of the server 10 include a server device, a computer (as non-limiting examples, a desktop, a laptop, or a tablet), a media computer platform (as non-limiting examples, a cable or satellite set-top box, or a digital video recorder), a handheld computer device (as non-limiting examples, a PDA or an email client), or another type of computer or communication platform.
  • The server 10 may be expressed as an information processing device. When it is not necessary to distinguish between the server 10 and the terminal 20, each may be expressed as an information processing device.
  • the terminal 20 includes a control unit 21 (CPU: central processing unit), a storage unit 28, a communication I / F 22 (interface), an input / output unit 23, a display unit 24, and a position information acquisition unit 25.
  • As a non-limiting example, the components of the HW (hardware) of the terminal 20 are connected to one another via a bus B. The HW configuration of the terminal 20 need not include all of these components.
  • The terminal 20 may be configured without individual components such as the microphone 232, the camera 234, or the position information acquisition unit 25, or without several of these components.
  • the communication I / F 22 transmits and receives various data via the network 30.
  • the communication may be executed by wire or wirelessly, and any communication protocol may be used as long as mutual communication can be executed.
  • the communication I / F 22 has a function of executing communication with the server 10 via the network 30.
  • the communication I / F 22 transmits various data to the server 10 according to an instruction from the control unit 21. Further, the communication I / F 22 receives various data transmitted from the server 10 and transmits the various data to the control unit 21. Further, the communication I / F 22 may be simply expressed as a communication unit. Further, when the communication I / F 22 is composed of a physically structured circuit, it may be expressed as a communication circuit.
  • the input / output unit 23 includes a device for inputting various operations to the terminal 20 and a device for outputting the processing result processed by the terminal 20.
  • The input/output unit 23 may be a device in which the input unit and the output unit are integrated, or may be separated into an input unit and an output unit.
  • The input unit is realized by any of various devices, or a combination thereof, capable of receiving input from the user and transmitting information related to the input to the control unit 21.
  • Non-limiting examples of the input unit include a touch panel 231, a touch display, hardware keys such as a keyboard, a pointing device such as a mouse, a camera 234 (operation input via moving images), and a microphone 232 (operation input by voice).
  • The output unit is realized by any of various devices, or a combination thereof, capable of outputting the processing results of the control unit 21.
  • Non-limiting examples of the output unit include a touch panel, a touch display, a speaker 233 (audio output), a lens (as non-limiting examples, for 3D (three-dimensional) output or hologram output), and a printer.
  • The display unit 24 is realized by any of various devices, or a combination thereof, capable of displaying in accordance with display data written in a frame buffer.
  • Non-limiting examples of the display unit 24 include a touch panel, a touch display, a monitor (as non-limiting examples, a liquid crystal display or an OELD (organic electroluminescence display)), a head-mounted display (HMD), projection mapping, a hologram, and a device capable of displaying images, text information, and the like in the air (which may or may not be a vacuum). These display units 24 may or may not be capable of displaying the display data in 3D.
  • When the input/output unit 23 is a touch panel, the input/output unit 23 and the display unit 24 may be arranged facing each other with substantially the same size and shape.
  • The control unit 21 has a physically structured circuit for executing functions realized by code or instructions contained in a program, and is realized by, as a non-limiting example, a data processing device built into hardware. Accordingly, the control unit 21 may be expressed as a control circuit.
  • Non-limiting examples of the control unit 21 include a central processing unit (CPU), a microprocessor, a processor core, a multiprocessor, an ASIC (application-specific integrated circuit), and an FPGA (field-programmable gate array).
  • the storage unit 28 has a function of storing various programs and various data required for the terminal 20 to operate.
  • Non-limiting examples of the storage unit 28 include various storage media such as an HDD (hard disk drive), an SSD (solid state drive), flash memory, RAM (random access memory), and ROM (read only memory). The storage unit 28 may also be expressed as a memory.
  • The terminal 20 stores the program P in the storage unit 28, and by executing this program P, the control unit 21 performs the processing of each unit included in the control unit 21. That is, the program P stored in the storage unit 28 causes the terminal 20 to realize each function executed by the control unit 21. This program P may also be expressed as a program module.
  • the microphone 232 is used for inputting voice data.
  • the speaker 233 is used for outputting audio data.
  • the camera 234 is used for acquiring moving image data.
  • Cameras 234 may be provided on both the side of the terminal 20 on which the display unit 24 is provided and the opposite side; these are sometimes called the in-camera and the out-camera, respectively. Switching between the in-camera and the out-camera is executed by input from the user of the terminal 20.
  • the server 10 includes a control unit 11 (CPU), a storage unit 15, a communication I / F 14 (interface), an input / output unit 12, and a display unit 13.
  • As a non-limiting example, the components of the HW of the server 10 are connected to one another via a bus B.
  • The HW of the server 10 need not include all of these components.
  • As a non-limiting example, the HW of the server 10 may be configured without the display unit 13.
  • The control unit 11 has a physically structured circuit for executing functions realized by code or instructions contained in a program, and is realized by, as a non-limiting example, a data processing device built into hardware.
  • The control unit 11 is typically a central processing unit (CPU), and may be a microprocessor, a processor core, a multiprocessor, an ASIC, or an FPGA. In the present disclosure, the control unit 11 is not limited to these.
  • the storage unit 15 has a function of storing various programs and various data required for the server 10 to operate.
  • The storage unit 15 is realized by various storage media such as an HDD, an SSD, and flash memory. However, in the present disclosure, the storage unit 15 is not limited to these. The storage unit 15 may also be expressed as a memory.
  • the communication I / F 14 transmits and receives various data via the network 30.
  • the communication may be executed by wire or wirelessly, and any communication protocol may be used as long as mutual communication can be executed.
  • the communication I / F 14 has a function of executing communication with the terminal 20 via the network 30.
  • the communication I / F 14 transmits various data to the terminal 20 according to an instruction from the control unit 11. Further, the communication I / F 14 receives various data transmitted from the terminal 20 and transmits the various data to the control unit 11. Further, the communication I / F 14 may be simply expressed as a communication unit. Further, when the communication I / F 14 is composed of a physically structured circuit, it may be expressed as a communication circuit.
  • the input / output unit 12 is realized by a device that inputs various operations to the server 10.
  • the input / output unit 12 is realized by any or a combination of all kinds of devices capable of receiving an input from a user and transmitting information related to the input to the control unit 11.
  • the input / output unit 12 is typically realized by a hardware key typified by a keyboard or the like, or a pointing device such as a mouse.
  • The input/output unit 12 may also include, as non-limiting examples, a touch panel, a camera (operation input via moving images), and a microphone (operation input by voice). However, in the present disclosure, the input/output unit 12 is not limited to these.
  • the display unit 13 is typically realized by a monitor (not limited to, for example, a liquid crystal display or an OELD (organic electroluminescence display)).
  • The display unit 13 may be a head-mounted display (HMD) or the like. These display units 13 may or may not be capable of displaying the display data in 3D. However, in the present disclosure, the display unit 13 is not limited to these.
  • The server 10 stores the program P in the storage unit 15, and by executing the program P, the control unit 11 performs the processing of each unit included in the control unit 11. That is, the program P stored in the storage unit 15 causes the server 10 to realize each function executed by the control unit 11. This program P may also be expressed as a program module.
  • The processes of the control unit 21 of the terminal 20 and/or the control unit 11 of the server 10 may be realized not only by a CPU having a control circuit but also by a logic circuit (hardware) or a dedicated circuit formed in an integrated circuit (an IC (Integrated Circuit) chip, an LSI (Large Scale Integration), or the like). These circuits may be realized by one or more integrated circuits, and the plurality of processes shown in each embodiment may be realized by a single integrated circuit. Depending on the degree of integration, an LSI may be referred to as a VLSI, a super LSI, an ultra LSI, or the like. Accordingly, the control unit 21 may be expressed as a control circuit.
  • The program P of each embodiment of the present disclosure (as non-limiting examples, a software program, a computer program, or a program module) may, but need not, be provided stored in a computer-readable storage medium.
  • The storage medium can store the program P in a "non-transitory tangible medium".
  • The program P may realize a part of the functions of each embodiment of the present disclosure. Further, the program P may be a so-called difference file (difference program) that realizes the functions of each embodiment of the present disclosure in combination with a program P already recorded on the storage medium.
  • the storage medium may be one or more semiconductor-based or other integrated circuits (ICs) (for example, but not limited to field programmable gate arrays (FPGAs) or application-specific ICs (ASICs)), hardware.
  • the storage medium may be volatile, non-volatile, or a combination of volatile and non-volatile, where appropriate.
  • The storage medium is not limited to these examples, and may be any device or medium capable of storing the program P. The storage medium may also be expressed as a memory.
  • the server 10 and / or the terminal 20 can read the program P stored in the storage medium and execute the read program P to realize the functions of the plurality of functional units shown in each embodiment.
  • The program P of the present disclosure may be provided to the server 10 and/or the terminal 20 via an arbitrary transmission medium (a communication network, a broadcast wave, or the like) capable of transmitting the program.
  • the server 10 and / or the terminal 20 realizes the functions of the plurality of functional units shown in each embodiment by executing the program P downloaded via the Internet or the like, as an example without limitation.
  • Each embodiment of the present disclosure may also be realized in the form of a data signal embedded in a carrier wave, in which the program P is embodied by electronic transmission.
  • At least part of the processing on the server 10 and/or the terminal 20 may be realized by cloud computing composed of one or more computers.
  • At least a part of the processing in the terminal 20 may be performed by the server 10. As a non-limiting example, at least a part of the processing of each functional unit of the control unit 21 of the terminal 20 may be performed by the server 10.
  • At least a part of the processing in the server 10 may be performed by the terminal 20. As a non-limiting example, at least a part of the processing of each functional unit of the control unit 11 of the server 10 may be performed by the terminal 20.
  • The determination configurations in the embodiments of the present disclosure are not essential; a predetermined process may be operated when a determination condition is satisfied, or a predetermined process may be performed when the determination condition is not satisfied.
  • The program of this disclosure is implemented using, as non-limiting examples, a scripting language such as ActionScript or JavaScript (registered trademark), an object-oriented programming language such as Objective-C or Java (registered trademark), or a markup language such as HTML5.
  • messages can be exchanged on the talk room between the terminals 20 via the server 10 and via the messaging application.
  • the talk room is a place where users who use the messaging service exchange contents in the messaging service provided by the server 10.
  • The content exchanged in the talk room includes character information input by a user using his/her own terminal 20, image information including photographs and stamps, and various file information such as audio files, video files, and data files, but is not limited to these.
  • the users of the terminal 20 can further execute a call via the talk room.
  • the users 10a and 10b make a call as shown in FIG. 2A.
  • After the call, image information indicating that the users made a call (hereinafter referred to as a call icon; the image indicating that a call was made is not limited to an icon, and this image information is a non-limiting example of information related to a call) is displayed in the talk room.
  • The terminal further displays a message (a non-limiting example of call information) indicating the content of the call as text.
  • FIG. 2B is a diagram showing an example of a display screen of the terminal 20b of the user 10b. The details will be described below.
  • The terminal 20 includes, as functions realized by the control unit 21, a message processing unit 211, a call unit 212, a voice recognition unit 213, and a display processing unit 214.
  • The message processing unit 211 receives input from the user and/or content including messages received by the communication I/F 22, in accordance with the messaging application of the messaging service provided by the server 10, and instructs the display processing unit 214 to display it.
  • The message processing unit 211 also instructs the communication I/F 22 to transmit the received input content to the server 10.
  • The target processed by the message processing unit 211 is not limited to text messages input by the user to the talk room, and may also include image information including photographs and stamps, audio files, video files, data files, and the like.
  • The message processing unit 211 may determine the display size of the call icon according to the amount of text data generated by the voice recognition unit 213, and may, but need not, instruct the display processing unit 214 to display a call icon of a size corresponding to the amount of text.
  • This way, the amount of conversation can be estimated from the size of the call icon when the user reviews it later.
  • The amount of conversation may also be expressed by changing the color of the call icon according to the amount of text, as in the sketch below.
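  • A minimal, non-authoritative sketch of this sizing idea follows: the function maps the amount of recognized text to an icon size and color. The pixel range, the thresholds, and the color values are illustrative assumptions, not values from the disclosure.

```python
def call_icon_style(recognized_text: str,
                    min_px: int = 48, max_px: int = 96,
                    full_chars: int = 2000) -> dict:
    """Derive a call-icon display size and color from the amount of text
    produced by voice recognition: longer calls yield a larger, warmer icon."""
    ratio = min(len(recognized_text) / full_chars, 1.0)   # 0.0 .. 1.0
    size_px = round(min_px + (max_px - min_px) * ratio)   # linear size ramp
    # Short calls render in gray, medium in green, long in red (assumed ramp).
    color = "#9e9e9e" if ratio < 0.33 else "#4caf50" if ratio < 0.66 else "#f44336"
    return {"size_px": size_px, "color": color}

print(call_icon_style("hello"))     # small gray icon for a short call
print(call_icon_style("a" * 1500))  # larger, highlighted icon for a long call
```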
  • the call unit 212 has a function of executing a call with another user who uses the messaging service via the server 10 on the messaging service.
  • The call unit 212 has a function of making a call to a designated party upon receiving a call input on the messaging service from the user of the terminal 20, and a function of receiving a call from another user who uses the messaging service.
  • the call unit 212 executes a call by a function called VoIP (Voice over Internet Protocol), for example, without limitation.
  • The call unit 212 may, but need not, record the contents of the call in the storage unit 28 during the call. Further, the call unit 212 may have a video call function.
  • During a call, the call unit 212 transmits the sound collected by the microphone 232 and the video captured by the camera 234 to the server 10 via the communication I/F 22, receives via the communication I/F 22 the voice signal and the video signal transmitted from the other party through the server 10, outputs sound based on the voice signal from the speaker 233, and instructs the display processing unit 214 to display video based on the video signal on the display unit 24. Further, the call unit 212 may, but need not, start a call (calling) based on an input to the image information displayed in the talk room indicating that a call has been made (the call icon, or an image for calls different from the call icon).
  • That is, call processing may, but need not, be executed so as to start a call with the user corresponding to the talk room when a predetermined input is made to the call icon in the talk room indicating that a call has been made.
  • The call may be made via a speaker having an AI assistant function, such as a smart speaker, held by the user of the terminal 20. In that case, the call with the other terminal is made through the smart speaker: the voice collected by the smart speaker is transmitted directly to the server 10, and the server 10 transmits it to the terminal of the other party.
  • The smart speaker itself may perform the voice recognition processing and transmit a text message to the server 10, and the server 10 may transmit text indicating the content of the call to the talk room of the terminal 20 of the user associated with the smart speaker, so that the display unit 24 of the terminal 20 displays a message indicating the content of the call in the talk room. Alternatively, the smart speaker may transmit only the voice to the server 10, the server 10 may perform the voice recognition processing and transmit a text message indicating the content of the call to the terminal 20 of the user corresponding to the smart speaker, and the display unit 24 of the terminal 20 may display the message indicating the content of the call in the talk room.
  • Alternatively, the communication I/F 22 of the terminal 20 may receive the user's voice collected by the smart speaker, and the call unit 212 may transmit that voice to the server 10 via the communication I/F 22.
  • the voice recognition unit 213 has a function of recognizing the voice of the call executed by the call unit 212 and converting it into text data.
  • the voice recognition by the voice recognition unit 213 may be executed for the recorded data of the call recorded in the storage unit 28 by the call unit 212.
  • The voice recognition unit 213 may, but need not, record the text data obtained by voice recognition in the storage unit 28.
  • the voice recognition unit 213 transmits the text data obtained by voice recognition to the message processing unit 211.
  • The voice recognition unit 213 divides the text data obtained by voice recognition by speaker, in chronological order, associates information indicating the speaker with each divided piece of text data, and transmits them to the message processing unit 211.
  • To identify a speaker from the voice itself, a feature amount of the voice during the conversation (as a non-limiting example, its frequency spectrum) is extracted, and the speakers are classified based on it, as in the sketch below.
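  • The following is a minimal sketch of that classification idea under strong simplifying assumptions: fixed-length frames, a truncated magnitude spectrum as the voice feature amount, and naive 2-means clustering for exactly two speakers. A practical diarization system would be considerably more elaborate.

```python
import numpy as np

def frame_features(signal: np.ndarray, rate: int, frame_s: float = 0.5) -> np.ndarray:
    """Split the call audio into fixed-length frames and use each frame's
    low-frequency magnitude spectrum (an assumed feature amount) as its feature."""
    n = int(rate * frame_s)
    frames = [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]
    return np.array([np.abs(np.fft.rfft(f))[:256] for f in frames])

def assign_speakers(feats: np.ndarray, iters: int = 20) -> np.ndarray:
    """Naive 2-means clustering: frames with similar spectra get the same label,
    so labels[i] names the (assumed) speaker of frame i."""
    centers = feats[[0, -1]].copy()   # initialize with first and last frames
    for _ in range(iters):
        dists = np.linalg.norm(feats[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for k in (0, 1):
            if (labels == k).any():
                centers[k] = feats[labels == k].mean(axis=0)
    return labels

rate = 8000
t = np.arange(rate * 2) / rate
# Toy signal: one second dominated by 200 Hz, then one second by 500 Hz.
audio = np.concatenate([np.sin(2 * np.pi * 200 * t[:rate]),
                        np.sin(2 * np.pi * 500 * t[:rate])])
print(assign_speakers(frame_features(audio, rate)))  # two frames per "speaker"
```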
  • The display processing unit 214 displays, in the display area of the display unit 24, the content that the message processing unit 211 instructs it to display.
  • The message processing unit 211 displays content transmitted by the terminal 20 (a non-limiting example of the second content) and content transmitted by a terminal held by another user (a non-limiting example of the first content) in the display area of the display unit 24. As a non-limiting example, content transmitted by another user is displayed on the left side of the display area, and content transmitted by the user of the terminal 20 is displayed on the right side.
  • Displaying content transmitted by another user on the left side of the display area of the display unit 24 means displaying the content closer to the left edge of the display area. That is, as shown in the talk room display example of FIG. 2B, the left end of a message corresponding to voice spoken by another user is displayed near the left side of the display area.
  • Similarly, displaying content transmitted by the user of the terminal 20 on the right side of the display area of the display unit 24 means displaying the content (message) closer to the right edge of the display area, as shown in the talk room display example of FIG. 2B.
  • Based on the voice recognized by the voice recognition unit 213, the display processing unit 214 displays in the display area a message indicating the content of the call spoken by the user of the terminal 20 (a non-limiting example of the second information) in association with the user of the terminal 20, and displays in the display area a message indicating the content of the call spoken by the user of the other party (a non-limiting example of the first information) in association with the user of the other party.
  • the server 10 includes a message processing unit 111 as a function realized by the control unit 11.
  • the message processing unit 111 has a function of managing a talk room for exchanging information between users.
  • The message processing unit 111 relays the exchange of content, including messages, between the terminals provided with the messaging service by the server 10. That is, when content is transmitted from a user to a talk room, the message processing unit 111 identifies the talk room and transmits the content to the other users belonging to that talk room.
  • FIG. 3 is a sequence diagram showing an example of communication between each device in the communication system 1 according to the present embodiment.
  • the sequence diagram shown in FIG. 3 is a diagram showing exchanges when users make a call on a message application.
  • the terminal 20a makes a call by designating a call partner from the message application according to the input from the user (step S301). That is, the terminal 20a transmits a call request including the information of the other party to the server 10.
  • When the server 10 receives a call request from the terminal 20a, the server 10 identifies the user of the call partner (terminal 20b) from the information of the call partner included in the call request, and transmits a call signal to the identified user (terminal 20b) (step S302).
  • The terminal 20b receives the call signal transmitted from the server 10. That is, the terminal 20b receives a call request from the user of the terminal 20a on the message application (step S303). Then, the terminals 20a and 20b make a call on the message application via the server 10 (step S304). Here, the content of the call may, but need not, be recorded. Then, the user of the terminal 20a and the user of the terminal 20b each make an input to end the call on their terminals, and the call ends (step S305).
  • After the end of the call, the terminal 20b performs voice recognition on the content of the call and converts it into text information (step S306).
  • When the content of the call is recorded, the voice recognition process can be executed after the call ends; when it is not recorded, real-time voice recognition is executed from immediately after the call starts.
  • the terminal 20b stores a message (text message) obtained by voice recognition (step S307).
  • the message obtained by voice recognition may be transmitted not only to the terminal 20b but also to the server 10 and the terminal 20a and stored in the server 10 and the terminal 20a. Further, it may be stored only in the server 10 instead of the terminal 20b.
  • the data of the text message obtained by voice recognition can be stored and displayed in the talk room.
  • the voice-recognized text data is displayed as a message in the display area of the display unit 24 of the terminal 20 (step S308).
  • Although not shown, the terminal 20a may, but need not, also execute the processes of steps S306 to S308, that is, performing voice recognition on the content of the call and displaying the recognized text data. Further, since the call is made via the server 10, the voice recognition process may be executed by the server 10; in that case, the text data indicating the content of the call obtained by the server 10 through voice recognition is transmitted to each user (terminal 20) involved in the call and displayed in the talk room of each terminal. In this way, the content of the call is automatically converted into text data and displayed in the talk room, so that the user can reliably check the content of a past call later. A sketch of this exchange follows.
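  • Below is a compact, non-authoritative sketch of the step S301 to S308 exchange, with a toy in-memory relay standing in for the server 10 and a stubbed recognizer. All class and method names (Server.route_call, Terminal.end_call, and so on) are assumptions for illustration, not the system's actual API.

```python
class Server:
    """Toy stand-in for server 10: relays the call request (S302) and, here,
    also fans recognized text out to every terminal in the talk room (S308)."""
    def __init__(self):
        self.terminals = {}

    def route_call(self, caller: str, callee: str):
        self.terminals[callee].on_incoming_call(caller)      # S302 -> S303

    def broadcast_transcript(self, room: list, text: str):
        for name in room:
            self.terminals[name].show_in_talk_room(text)     # S308

def recognize(audio: bytes) -> str:
    return "<recognized call text>"   # stub for voice recognition (S306)

class Terminal:
    def __init__(self, name: str, server: Server):
        self.name, self.server = name, server
        server.terminals[name] = self

    def call(self, callee: str):
        self.server.route_call(self.name, callee)            # S301

    def on_incoming_call(self, caller: str):
        print(f"{self.name}: incoming call from {caller}")   # S303

    def end_call(self, recorded_audio: bytes, room: list):
        text = recognize(recorded_audio)                     # S305 -> S306
        self.server.broadcast_transcript(room, text)         # S307 -> S308

    def show_in_talk_room(self, text: str):
        print(f"{self.name} talk room: {text}")

server = Server()
a, b = Terminal("20a", server), Terminal("20b", server)
a.call("20b")                               # steps S301 to S303
b.end_call(b"...", room=["20a", "20b"])     # steps S305 to S308
```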
  • FIG. 4 is a flowchart showing an operation example of the terminal 20 for realizing the processing of the sequence diagram shown in FIG.
  • The control unit 21 of the terminal 20 detects whether or not a call has been started on the messaging application (step S401). The call unit 212 can detect this based on whether there is a response to an outgoing call made from the terminal 20 according to input from the user, or whether there is an answer input for an incoming call from another terminal.
  • the control unit 21 of the terminal 20 records the voice of the call while the call unit 212 is talking, and stores the recorded voice data in the storage unit 28 (step S402).
  • the control unit 21 of the terminal 20 determines whether or not the call has ended based on whether or not there is a call end input from the user via the input / output unit 23 (step S403). If the call has not ended (NO in step S403), it waits until the call ends.
  • If it is determined that the call has ended (YES in step S403), the control unit 21 ends the recording.
  • the voice recognition unit 213 executes voice recognition processing on the recorded voice data. Then, the text message obtained by voice recognition is stored in the storage unit 28 (step S404). That is, the voice recognition unit 213 converts the recorded voice data into text data indicating the contents of the call.
  • The text message obtained by voice recognition may, but need not, be transmitted to the server 10; further, when the server 10 receives the text message, it may, but need not, transmit it to the terminal of the other party. By transmitting the text message obtained by the terminal 20 through voice recognition to the server 10 or to the other party's terminal, a message indicating the contents of the call can also be displayed as text on the other party's terminal, so that the other party can likewise review the contents of the call later by looking at the message. The other party's terminal may, but need not, display the received text message in the talk room in the same manner as the terminal 20.
  • The voice recognition unit 213 divides the text data obtained by voice recognition by speaker, in chronological order (step S405). At this time, the voice recognition unit 213 may, but need not, further divide the text data of content spoken by the same speaker according to a predetermined standard; as a non-limiting example, it may be divided by sentence. The voice recognition unit 213 transmits the divided text data to the display processing unit 214.
  • The display processing unit 214 displays each piece of text data divided by the voice recognition unit 213 on the display unit 24 as a message in the talk room, in association with the corresponding speaker (step S406). That is, the control unit 21 of the terminal 20 displays a text message obtained by voice-recognizing the voice of the user holding the terminal 20 (a non-limiting example of the second information) in association with the user of the terminal 20, and displays a text message obtained by voice-recognizing the voice of the other party (a non-limiting example of the first information) in association with the other party (see the sketch below).
  • the control unit 21 determines whether or not there is a termination input of the messaging application from the user via the input / output unit 23 (step S407). If there is no end input (NO in step S407), the process returns to the process of step S401. On the other hand, if there is an end input (YES in step S407), the process ends.
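  • A minimal sketch of the association in steps S405 and S406, assuming the recognizer already yields chronological (speaker, text) pairs: the terminal's own user renders on the right and the other party on the left, matching the display rules described earlier. The function name and the console rendering are assumptions for illustration.

```python
def render_talk_room(segments: list, me: str, width: int = 44):
    """segments: chronological (speaker, text) pairs from voice recognition.
    Each divided piece is shown as a talk-room message tied to its speaker."""
    for speaker, text in segments:
        line = f"{speaker}: {text}"
        if speaker == me:
            print(line.rjust(width))   # own speech: right side (second information)
        else:
            print(line.ljust(width))   # other party: left side (first information)

render_talk_room(
    [("userB", "Hello, about tomorrow..."),
     ("userA", "Yes, let's meet at ten."),
     ("userB", "Got it, see you then.")],
    me="userA",
)
```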
  • As described above, according to the terminal 20 of the present embodiment, when a call is executed on the messaging application as shown in FIG. 2A, the content of the call can be automatically converted into text and displayed as messages as shown in FIG. 2B. This can later help the user recall the content of the conversation at the time of the call.
  • FIG. 5 is a flowchart showing an operation example of the process related to the display of the message indicating the content of the call on the terminal 20.
  • The terminal 20 may, but need not, have a function of switching between displaying and hiding the message of the call contents when users have made a call in the talk room.
  • FIG. 5 is a flowchart showing an operation example of the terminal 20 when the display / non-display of the message can be switched.
  • FIG. 5 is a flowchart showing the operation of the terminal 20 when the talk room is displayed on the display unit 24 of the terminal 20 and a call has been made on the messaging application in the past.
  • The process shown in FIG. 5 assumes that the user has executed the messaging application on the terminal 20 and displayed the talk room.
  • When the talk room is displayed on the display unit 24 of the terminal 20 and there has been a call in the past on the messaging application, image information (a call icon) indicating that the call was made is displayed in the talk room.
  • The control unit 21 of the terminal 20 determines whether or not an input (as a non-limiting example, a touch input) to the call icon displayed in the talk room has been made via the input/output unit 23 (step S501).
  • When there is a touch input to the call icon (YES in step S501), the control unit 21 determines whether or not the content of the message corresponding to the call icon is currently expanded (step S502). Expanding a message is synonymous with displaying the message indicating the content of the call.
  • If the message is expanded (YES in step S502), the display processing unit 214 hides the displayed call message (step S503) and the process ends.
  • If the message is not expanded (NO in step S502), the display processing unit 214 displays the content of the call message on the display unit 24 (step S504) and the process ends. A sketch of this toggle follows.
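  • A minimal sketch of this step S501 to S504 toggle, assuming the expanded or collapsed state is held per call icon; the class name and the return convention are illustrative assumptions.

```python
class CallIconEntry:
    """Talk-room entry for a past call: tapping the icon (S501) toggles the
    transcript between hidden (S503) and shown (S504)."""
    def __init__(self, transcript: list):
        self.transcript = transcript
        self.expanded = False   # the initial state may also come from user settings

    def on_tap(self) -> list:
        self.expanded = not self.expanded                  # S502: was it expanded?
        return self.transcript if self.expanded else []    # S504 / S503

entry = CallIconEntry(["A: hello", "B: hi, about tomorrow..."])
print(entry.on_tap())   # first tap expands the call message
print(entry.on_tap())   # second tap hides it again
```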
  • Whether the terminal 20 initially displays the message in the expanded state or the unexpanded state in the talk room may be determined by a setting made to the terminal 20 by the user.
  • All of the text messages obtained by voice recognition of the call contents may be displayed, or only a partial excerpt may be displayed.
  • Alternatively, the text messages may be analyzed so that only the content presumed to be important in the call is displayed, as in the sketch below.
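  • One simple, non-authoritative way to realize such an excerpt is keyword scoring; the keyword list and the scoring rule below are purely illustrative assumptions, and the disclosure does not prescribe any particular analysis method.

```python
def important_excerpt(messages: list,
                      keywords: tuple = ("tomorrow", "meet", "deadline"),
                      limit: int = 2) -> list:
    """Score each recognized message by how many assumed-important keywords
    it contains, and keep the top `limit` messages as the displayed excerpt."""
    scored = sorted(messages,
                    key=lambda m: sum(k in m.lower() for k in keywords),
                    reverse=True)
    return scored[:limit]

calls = ["uh, hello", "let's meet tomorrow at ten", "ok", "the deadline is friday"]
print(important_excerpt(calls))
# -> ["let's meet tomorrow at ten", "the deadline is friday"]
```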
  • FIG. 6 is a diagram showing an example of how the display of the talk room changes before and after a call when a call is made in the talk room on the terminal 20.
  • FIG. 6A shows an example of displaying the talk room before the call
  • FIG. 6B shows an example of displaying the talk room after the call.
  • FIG. 6A shows a display example of a certain talk room of the user of the terminal 20, and shows a state in which the message 601 sent at 22:11 is displayed.
  • the user of the terminal 20 makes a call with another user related to the talk room.
  • the content of this call is recorded and converted into a text message by voice recognition processing.
  • the text message is displayed in association with each user related to the call. That is, as shown in FIG. 6B, the terminal 20 displays the call icon 611 indicating that the call has been made on the talk room, following the message 601.
  • The call icon 611 may, but need not, be displayed in association with date and time information 612 indicating when the call was made (which may be the start date and time or the end date and time of the call).
  • the terminal 20 displays the call content as a message in which the content of the call is converted into text by voice recognition, as shown in the portion surrounded by the dotted line 613.
  • the terminal 20 can leave information indicating the contents of the call on the talk room in the form of a message.
  • FIG. 7 is a diagram showing a display example when the processing shown in FIG. 5 is performed on the terminal 20.
  • FIG. 7A is a screen view showing a state in which the message indicating the content of the call is not displayed, and FIG. 7B is a screen view showing a state in which the message indicating the content of the call is expanded and displayed.
  • the talk room of the messaging application is displayed on the display unit 24 of the terminal 20. Then, it is assumed that the call icon 611 indicating that the call has been made is displayed on the talk room.
  • The user touches the call icon 611 with his/her finger or a stylus, that is, instructs expansion of the message of the call contents.
  • In response, the terminal 20 expands the message indicating the content of the corresponding call and displays it on the display unit 24, as shown in FIG. 7B. In FIG. 7B, the content of the call is displayed in message format below the call icon 611.
  • The display processing unit 214 of the terminal 20 can likewise change from the display mode shown in FIG. 7(b) back to the display mode shown in FIG. 7(a).
  • the first display mode after the call may be the display mode shown in FIG. 6 (b) or the display mode shown in FIG. 7 (a).
  • The messaging application may be configured so that the user of the terminal 20 can make this setting, and the terminal 20 may display either the display mode shown in FIG. 6(b) or the display mode shown in FIG. 7(a) according to the setting made by the user.
  • the terminal 20 can remind the user of the conversation he / she wants to remember.
  • The method of displaying the message indicating the content of the call is not limited to expansion; as an example, a message may pop up when the user touches the vicinity of the call icon 611, or the display may transition to a screen different from the talk room.
  • An image of a user involved in the call may also be displayed. In that case, the image may be displayed as a substitute for the call icon 611, or together with the call icon 611.
  • Non-limiting examples of the user's image include the user's face photograph, the profile image used by the user on the messaging application, and a face photograph of the user taken with the in-camera during the call (or a processed version thereof); however, the present invention is not limited to these.
  • FIG. 8 is a diagram showing one display mode of the call icon 611.
  • FIG. 8A shows an example in which the user brings his / her finger closer to the call icon 611
  • FIG. 8B shows an example in which the user's finger approaches the call icon 611 by a certain amount or more.
  • FIG. 8 shows a state in which a message indicating the content of the call is not expanded.
  • the user brings his or her finger closer to the call icon 611a.
  • The touch panel 231 of the terminal detects a state in which the user's finger is in contact with, or within a certain proximity of, the touch panel 231, and detects the operation position.
  • the control unit 21 of the terminal 20 determines whether the coordinates on the touch panel 231 indicated by the detected operation position are close to the display coordinates of the call icon 611a.
  • When it is determined that the operation position is close to the display coordinates of the call icon 611a, the control unit 21 of the terminal 20 may, but need not, enlarge the call icon as the call icon 611b shown in FIG. 8B (see the sketch below).
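  • A minimal sketch of this proximity-driven enlargement, assuming the touch position and the icon center share one pixel coordinate space; the distance threshold and maximum scale are illustrative assumptions.

```python
import math

def icon_scale(touch: tuple, icon_center: tuple,
               near_px: float = 80.0, max_scale: float = 1.5) -> float:
    """Return the call icon's display scale: 1.0 while the finger is far away,
    growing toward max_scale as the detected operation position approaches."""
    dist = math.dist(touch, icon_center)
    if dist >= near_px:
        return 1.0
    return 1.0 + (max_scale - 1.0) * (1.0 - dist / near_px)

print(icon_scale((200, 200), (210, 205)))  # finger near the icon: enlarged
print(icon_scale((10, 10), (210, 205)))    # finger far away: normal size
```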
  • the terminal 20 may be configured to display the content of a message indicating the content of a call as a pop-up message 901 as shown in FIG. 9A.
  • the terminal 20 may be configured to transition to a screen different from the talk room and display the content of the message indicating the content of the call, as shown in FIG. 9B.
  • In this case, a return icon 902 for returning to the original talk room display may, but need not, be displayed; touching the return icon 902 returns the display to the original talk room.
  • In the above description, the speaker in a call is identified using a voice feature amount; alternatively, the terminal that acquired each utterance may attach to the voice signal information that can identify that terminal (or its user), so that the speaker of each voice can be distinguished.
  • For voice picked up by a smart speaker, the speaker may be identified by receiving position information of each speaker together with the voice.
  • Since a smart speaker can distinguish from which direction a voice was heard, speakers can also be distinguished by attaching to each voice information indicating the direction from which the smart speaker received it.
  • the message processing unit 211 can display a message indicating the contents of the call in association with the speaker.
  • Even when the same speaker continues talking, the voice recognition unit 213 may, but need not, divide the text data obtained by voice recognition based on sentence breaks, conversation breaks, context breaks, and the like. This division may also, but need not, simply occur whenever the number of characters exceeds a predetermined count, as in the sketch below.
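  • A minimal sketch of these division rules, assuming a split at sentence-ending punctuation plus a forced break at a fixed character count; both the regular expression and the limit are illustrative assumptions.

```python
import re

def split_transcript(text: str, max_chars: int = 60) -> list:
    """Divide one speaker's recognized text first at sentence breaks, then
    force a break whenever a piece still exceeds max_chars characters."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks = []
    for s in sentences:
        while len(s) > max_chars:        # predetermined character-count limit
            chunks.append(s[:max_chars])
            s = s[max_chars:]
        chunks.append(s)
    return chunks

print(split_transcript("Sure. Let me check the schedule and call you back tomorrow!"))
# -> ['Sure.', 'Let me check the schedule and call you back tomorrow!']
```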
  • The voice recognition unit 213 may delete content related to ambient noise from the text data obtained by voice recognition. This may be achieved using known noise-canceling techniques, or by using context analysis to remove unnatural words, if any, from the text data. In addition, the voice recognition unit 213 may, but need not, delete messages corresponding to short backchannel responses (such as "uh-huh") from the obtained text data.
  • In that case, image information (as a non-limiting example, a stamp depicting a backchannel response such as nodding) may be used to indicate that a backchannel response was made.
  • As described above, the user of the terminal 20 uses the terminal 20 to make a call with another user via the messaging application provided by the server 10. The terminal 20 then displays information indicating the contents of the call in the talk room of the messaging application, in the display area of the display unit 24. Specifically, the terminal 20 converts the content of the call into text data by voice recognition processing and displays the converted text data in the talk room of the messaging application.
  • the terminal 20 can convert the contents of the call into a text message and display it without forcing the user to perform a special operation.
  • the terminal 20 may display a call icon indicating that a call has been made on the talk room. Then, the display or non-display of the message indicating the content of the call may be switched by the input from the user to the call icon.
  • By hiding the message, the terminal 20 can prevent the talk room from becoming hard to read due to the large number of messages that would result from displaying the contents of a prolonged call.
  • When needed, the message can be expanded so that the user can check the content of the call.
  • The terminal 20 may display all, a part, or none of the text messages obtained by performing voice recognition processing on the contents of the call. Which display mode is used may be determined by the user's settings on the terminal 20.
  • By letting the user select and set the display mode, the terminal 20 can provide convenience to the user.
  • The terminal 20 may use an image of the other party (as non-limiting examples, a face image or the profile image used on the messaging application) as the information indicating that a call was made. Further, an image of the user of the terminal 20 (likewise, a face image or a profile image used on the messaging application) may also be displayed.
  • the terminal 20 can make the user recognize at a glance that the call was made and who the other party was.
  • When converting the contents of the call into text data by the voice recognition process, the terminal 20 identifies which user is speaking. The text data is then associated with the identified user and displayed as if it were a message sent by that user.
  • The terminal 20 can thus display messages while distinguishing between the user of the terminal 20 and the user of the other party during a call, so that it is possible to confirm later who said each statement.
  • the terminal 20 displays image information indicating that a call has been made on the messaging application on the talk room, and when there is a user input for the image information, the terminal 20 is associated with the talk room. It may be configured to initiate a call with a user.
  • FIG. 10 is a flowchart showing an operation example of the terminal when the user makes a video call.
  • a call by a video call is also possible.
  • a video call is a so-called video telephone function.
  • the call unit 212 of the terminal 20 starts a video call with the other party via the server 10 (step S1001). This is started when the user of the terminal 20 gives a call instruction or receives a call from another user on the messaging application.
  • the call unit 212 instructs the camera 234 of the input / output unit 23 to start imaging.
  • the camera 234 images the display unit 24 side of the terminal 20, that is, the user of the terminal 20.
  • the call unit 212 instructs the microphone 232 to acquire the conversation sound of the user of the terminal 20.
  • the call unit 212 transmits the video captured by the camera 234 and the voice acquired by the microphone 232 to the server 10 via the communication I / F 22 during the video call.
  • the video captured by the camera 234 and the voice acquired by the microphone 232 are transmitted from the server 10 to the terminal of the other party.
  • The communication I/F 22 of the terminal 20 sequentially receives the video and audio transmitted from the other party's terminal via the server 10, and the terminal instructs the display processing unit 214 to display the received video on the display unit 24.
  • the input / output unit 23 is instructed to output the received voice from the speaker 233.
  • the call unit 212 stores the video captured by the terminal 20 and the acquired voice, and the video and voice transmitted from the terminal of the other party in the call in the storage unit 28.
  • the terminal 20 ends the video call by inputting an instruction to end the video call from the terminal 20 or by the other party disconnecting the call (step S1003).
  • the voice recognition unit 213 of the terminal 20 performs voice recognition on the recorded voice of the video call (step S1004). Further, the control unit 21 of the terminal 20 may or may not specify the user's emotion from the content of the image.
  • the text message obtained by the voice recognition is displayed in the talk room (step S1005).
  • the message may or may not be displayed in a display mode according to the emotion of the user who specified the message.
  • the display mode according to the user's emotion is, as an example, to change the shape of the bubble (speech balloon) for displaying the message (for example, when the user is angry, the shape of the balloon is made jagged).
  • alternatively, a character indicating the specific emotion may be appended to the message (for example, if the user is angry, # may be appended at the end of the message, or if the user is happy, a symbol expressing that emotion may be appended at the end), or the characters may be displayed in a color according to the emotion.
  • an emoticon or image information indicating the user's emotion (not limited, but a stamp as an example) may be displayed together.
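  • a minimal sketch of mapping a recognized emotion label to the display attributes listed above (balloon shape, appended character, text color); the labels, the "♪" placeholder for the happiness symbol (which is garbled in the source), and the color names are illustrative assumptions:

```python
# Hypothetical mapping from an emotion label to display attributes.
EMOTION_STYLES = {
    "angry":   {"balloon": "jagged",  "suffix": "#", "color": "red"},
    "happy":   {"balloon": "rounded", "suffix": "♪", "color": "orange"},  # "♪" is a placeholder
    "neutral": {"balloon": "rounded", "suffix": "",  "color": "black"},
}

def style_message(text, emotion):
    """Attach the emotion-dependent display mode to a recognized message."""
    style = EMOTION_STYLES.get(emotion, EMOTION_STYLES["neutral"])
    return {"text": text + style["suffix"],
            "balloon": style["balloon"],
            "color": style["color"]}

print(style_message("Why are you late", "angry"))
```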
  • the control unit 21 of the terminal 20 determines whether or not the user has switched to the out-camera or activated the out-camera during the video call (step S1006). When the user of the terminal 20 switches to or activates the out-camera, this can be detected from the user's input to the terminal 20; when the other party switches to the out-camera, an unnatural break occurs in the transmitted video, and the switch can be detected by detecting that break.
  • when the out-camera has been switched to (YES in step S1006), the control unit 21 inserts one frame of the video taken by the out-camera as a still image, or the video obtained while shooting with the out-camera, into the talk room in association with the displayed text messages converted from the video call (step S1007).
  • the moving image may span from the timing of switching to the out-camera to the timing of switching back to the in-camera, but the present invention is not limited to this.
  • the insertion position of the still image or the moving image may be arbitrary; for example, it may be at the beginning or the end of the text messages converted by voice recognition of the video call, or at the timing when the switch to the out-camera occurred. If the out-camera is not switched to during the video call (NO in step S1006), the process proceeds to step S1008.
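  • a minimal sketch of picking the still frame for step S1007, assuming the recording exposes per-frame timestamps and the camera-switch time was logged (both assumptions, not details from the disclosure):

```python
def frame_at_switch(frames, switch_time):
    """Pick the first frame captured at or after the out-camera switch.

    `frames` is a list of (timestamp, frame) pairs in capture order;
    the selected frame is inserted into the talk room as a still image.
    """
    for ts, frame in frames:
        if ts >= switch_time:
            return frame
    return None  # the switch happened after the last recorded frame

frames = [(0.0, "frame0"), (5.2, "frame1"), (10.8, "frame2")]
print(frame_at_switch(frames, switch_time=5.0))  # -> "frame1"
```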
  • the control unit 21 determines whether or not there is an input related to position information during a call (step S1008).
  • the input related to the position information may be any form of input as long as it is an input of information that can specify the position of the terminal 20 or the terminal of the other party. Not limited, but as examples, it may be input of a place name or facility name by voice or by direct input from the user or the other party, input of an instruction from the user to acquire location information (location information by GPS), automatic acquisition of location information by GPS that is always activated, transmission of position information from the call partner, or input from the user of an image or information that can specify the position. If there is no input regarding the position information during the call (NO in step S1008), the process ends.
  • the control unit 21 inserts an image related to the position information into the talk room (step S1009).
  • the image related to the position information is an image related to the position of the terminal 20 or the position of the terminal of the other party, and may be any image as long as it is related.
  • when a place name or facility name is input, map information including the area around the place name may be acquired as an image and inserted, map information showing the location of the facility may be inserted, or a photograph showing the appearance of the facility may be acquired and inserted.
  • the image of the surrounding map including the acquired location information may be acquired and inserted.
  • the image of the surrounding map including the received location information may be acquired and inserted.
  • the homepage of the store or facility where the user (or the other party) is located may be accepted as information on the user's position, and the address and a representative image of the homepage may be acquired and inserted, or map information indicating a location that can be identified from the homepage may be acquired and inserted.
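  • a minimal sketch of resolving the position inputs described above into an image to insert; the URL templates and the `fetch_image` helper are hypothetical placeholders, not a real map provider's API:

```python
def image_for_position(position):
    """Return an image for one of the position-input forms described above."""
    if position["type"] == "gps":
        # Latitude / longitude -> map image of the surrounding area
        url = f"https://maps.example.com/render?lat={position['lat']}&lng={position['lng']}"
    elif position["type"] == "place_name":
        # Place or facility name -> map of the area around it
        url = f"https://maps.example.com/search?q={position['name']}"
    elif position["type"] == "homepage":
        # Store / facility homepage -> its representative image
        url = position["url"] + "/representative-image"
    else:
        return None
    return fetch_image(url)

def fetch_image(url):
    # Stub: a real implementation would download the image over HTTP.
    return {"source": url}

print(image_for_position({"type": "place_name", "name": "AA Mart"}))
```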
  • steps S1008 and S1009 may be executed not only during a video call but also during a normal call.
  • the number of images to be inserted is not limited to one, and may be any number, and the number may or may not be limited.
  • the three processes of steps S1004 and S1005, steps S1006 and S1007, and steps S1008 and S1009 need not all be performed; at least one may be performed. Alternatively, at least two of these three processes may be combined and executed.
  • in the above description, the image (still image or moving image) captured when the out-camera is activated (switched to) during a video call is displayed in the talk room as information indicating the content of the call, together with (or instead of) the message indicating the content of the call; however, the present invention is not limited to this.
  • the image displayed in the talk room as an image showing the contents of the call is not limited to one captured by the out-camera, and may be one captured by the in-camera. As an example of an image captured by the in-camera, the face image of each user involved in the call may be displayed in the talk room.
  • the display of the image is not limited to the mode of displaying by inserting it between messages.
  • it may be displayed as a background image of a section displaying a message indicating the content of the call.
  • the display is not limited to the background image of the entire message, and may be configured to display only the period during which the conversation related to the acquired image is taking place.
  • the period of conversation related to the image can be identified by analyzing the text messages obtained by voice recognition processing of the contents of the call. An example of this will be described with reference to FIG.
  • the following describes an example of inputting information regarding the position during a call, and a specific example of the talk room display at that time.
  • FIG. 11 shows an example of a call and a display example of a talk room displayed after the call at that time.
  • FIG. 11 (a) shows a part of the call
  • FIG. 11 (b) shows an example of the situation following FIG. 11 (a).
  • FIG. 11C shows an example of displaying the talk room after a call.
  • in FIG. 11A, it is assumed that the user 10a of the terminal 20a makes a voice call or a video call to the user 10b of the terminal 20b about the place to visit.
  • the user 10b has taken a picture of a nearby facility as information on the place where he / she exists, as shown in FIG. 11B.
  • when the exchanges shown in FIGS. 11 (a) and 11 (b) are performed during a call, as an example, an image 1101 based on the position information of the terminal 20b acquired by the terminal 20b is inserted into the talk room, as shown in FIG. 11 (c).
  • the terminal 20 may display the image obtained by the shooting shown in FIG. 11B as it is in the talk room as an image relating to the position of the terminal 20b, or information related to the position may be extracted from the shot image by image recognition processing, and an image may then be acquired from the network based on that information and displayed.
  • the word "AA Mart" is extracted from the image captured by the user 10b using the terminal 20b, the wording is searched on the Internet, and the image obtained by the search ( As an example, not a limitation), it is displayed as shown in FIG. 11 (c).
  • the image 1101 is displayed following the utterance of the user 10b in FIG. 11B so as to match the timing at which the image was taken.
  • the image 1101 may be displayed as a background image of the talk room. Alternatively, it may be inserted at the beginning of a message indicating the content of the call, or at the end of the message.
  • FIG. 12A is a diagram showing a display example in which the acquired image is displayed as the background of the talk room based on the information regarding the position of the terminal.
  • FIG. 12B is a diagram showing a display example in a state in which the talk room shown in FIG. 12A is scrolled up and displayed.
  • the terminal 20 displays an image specified from the position information regarding the terminal specified during a call as a background image of the talk room.
  • the terminal 20 displays an image acquired during a call (not limited, but as examples, an image relating to the position of the terminal, an image input by the user during the call, an image taken by the user during the call, or an image related to the content of the call) as the background image of the talk room, and displays the messages indicating the content of the call superimposed on the background image.
  • as shown in FIG. 12A, by displaying an image specified from the position information about the terminal during the call as the background image of the messages, it is possible to make it easier for the user to recall the contents of the call together with the contents of the messages indicating what was said during the call.
  • the background image may or may not be displayed only during the section T1 in which the messages of the related topic are displayed. That is, as shown in FIG. 12B, in the section T2, an image based on the information regarding the position of the terminal acquired during the topic is displayed as the background image, and in the section T3, the background image is not displayed. By linking the display section of the topic messages related to the image with the display section in which the image acquired based on the information about the position of the terminal during the topic is displayed as the background image, it is possible to reproduce the sense of reality during the call and make it easier for the user to recall the contents of the call.
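  • a minimal sketch of identifying the message span related to an image (the sections T1 / T2 above), assuming keywords such as "AA Mart" have been extracted from the image; keyword matching stands in for the text analysis the description mentions:

```python
def related_span(messages, image_keywords):
    """Return (start, end) indices of the run of messages that mention
    any of the image's keywords; the background image is shown only
    while this span of messages is on screen."""
    hits = [i for i, m in enumerate(messages)
            if any(k in m["text"] for k in image_keywords)]
    if not hits:
        return None
    return hits[0], hits[-1]

msgs = [{"text": "Where are you?"},
        {"text": "In front of AA Mart."},
        {"text": "AA Mart? I'll head over."},
        {"text": "By the way, about tomorrow..."}]
print(related_span(msgs, ["AA Mart"]))  # -> (1, 2): the related-topic span
```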
  • FIG. 13A shows a part of the call
  • FIG. 13B shows an example of the situation following FIG. 13A.
  • FIG. 13 (c) shows a display example of a talk room displayed on the terminal 20 when the call shown in FIGS. 13 (a) and 13 (b) is made.
  • in FIGS. 13 (a) and 13 (b), the user 10a proposes, via a voice call or a video call, that the user 10b of the terminal 20b visit a certain place, whereas the user 10b asks for a description of the location.
  • the user 10a uses his / her own terminal 20a to input location information during a call.
  • the input of the location information may be, for example, an instruction input for acquiring location information if the user is at the destination store (or near the destination), location information of the destination known to the user (not limited, but as examples, latitude and longitude information or address information), or a web page containing information related to the destination.
  • based on the position information input in FIG. 13 (b), the terminal 20 inserts a map 1301 showing the location of the destination between messages and displays it.
  • the image related to the location information is not limited to the map 1301, and may be another image, for example, image information about the homepage of the destination, or address information thereof.
  • the terminal 20 not only displays messages based on the conversation between users during the call, but can also automatically collect and display images based on the location information input during the call. As a result, when a call is made through the talk room, the terminal 20 can provide more information indicating the content of the call.
  • in the above description, the voice recognition unit 213 executes voice recognition after the video call is completed, but this is not a limitation, and it may be executed during the call. Furthermore, when the user makes a call using the speakerphone of the terminal 20, the terminal 20 may perform voice recognition in real time and display, in the talk room during the call, the messages analyzed and converted in real time. In this way, even in the case of a video call, the terminal 20 can display the content of the conversation between the users as messages in the talk room. Further, a video call may be used to give lessons, specifically English conversation (language) lessons; in such a case, the terminal 20 may collect a more appropriate phrase in that language from a network or the like and display it together with the text message.
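  • a minimal sketch of the real-time path described above: audio chunks captured during a speakerphone call are fed to a recognizer and each finalized result is posted to the talk room immediately; `recognize_chunk` is a stub standing in for whatever speech-recognition engine the terminal actually uses:

```python
import queue

audio_chunks = queue.Queue()  # filled by the audio-capture side of the call

def recognize_chunk(chunk):
    # Stub: pretend recognition already produced text for this chunk.
    return chunk.get("text")

def realtime_transcription(post_message, stop_marker=None):
    """Consume audio chunks until the call ends, posting each result."""
    while True:
        chunk = audio_chunks.get()
        if chunk is stop_marker:   # the call has ended
            break
        text = recognize_chunk(chunk)
        if text:                   # display as soon as the result is final
            post_message(text)

audio_chunks.put({"text": "Hello, can you hear me?"})
audio_chunks.put(None)
realtime_transcription(print)
```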
  • the user of the terminal 20 makes a call with another user by a video call via the messaging application provided by the server 10.
  • the terminal 20 may display an image taken during a call including a video call, or an image taken by the terminal of the user of the other party, or information based on the image in the talk room.
  • the terminal 20 can make it easier for the user to recall the content of the conversation during a call.
  • the terminal 20 may display, in the talk room, an image taken during a call by the out-camera provided on the side opposite to the side where the display unit 24 of the terminal 20 is located, based on a camera switching instruction input by the user.
  • the terminal 20 may display the image input by the user in the talk room during the video call. At this time, the terminal 20 may display the image between messages indicating the contents of the call so as to match the timing at which the image is input, but the present invention is not limited to this. It may be displayed at the beginning of the message or at the end of the message.
  • the terminal 20 can easily remind the user of the contents of the call by viewing the image later.
  • the terminal 20 may display the image acquired during the call as the background image of the message indicating the content of the call.
  • it becomes easier for the user to recall the contents of the call by checking the contents of the messages (text) indicating the contents of the call together with the image seen, photographed, or acquired during the call as the background image.
  • in addition to images input by the user and images obtained by taking pictures during the call, the terminal 20 may acquire an image based on information on the position of the terminal 20 or information on the position of the terminal of the other party, and display the image in association with the messages.
  • by acquiring an image based on the position of the terminal 20 or the position of the other party's terminal during a call, the terminal 20 can remind the user of the content of the call, not limited, but as an example, by letting the user recognize from what kind of place the call was made or where the other party was.
  • the terminal 20 may display an image according to the contents of the call. After converting the contents of the call into text messages by voice recognition processing, the terminal 20 analyzes the contents of the call by morphological analysis, context analysis, and the like, and displays a highly relevant image based on the results obtained by the analysis.
  • not limited, but as an example, when a certain store is a topic in the contents of a call, the terminal 20 may display a picture of the store as an image in association with a message, or when a certain food is a topic, a photograph of the food may be displayed as an image in association with the message.
  • the terminal 20 can easily remind the user of the content of the call.
  • FIG. 14 is a flowchart showing an operation example of processing for realizing a display mode for allowing the user to easily recognize the contents of the call when the call is made on the talk room.
  • the terminal 20 may or may not execute the process shown in FIG. Further, although not shown, the terminal 20 may be configured to be able to select and set whether or not to execute the process shown in FIG. 14 according to the input from the user.
  • the process shown in FIG. 14 shows an example of the process after step S404 shown in FIG.
  • the voice recognition unit 213 executes voice recognition processing on the recorded voice (step S404).
  • the control unit 21 specifies the amount of text data obtained by the voice recognition unit 213 converted by voice recognition (step S1405).
  • not limited, but as an example, the control unit 21 may specify the number of characters of the text data or the data capacity of the text data as the amount of text.
  • the control unit 21 determines the display size of the call icon 611 based on the specified amount of text (step S1406). Specifically, the control unit 21 determines the display size so that the larger the amount of text, the larger the display size of the call icon 611. As an example and not a limitation, the control unit 21 may determine the display size by a function that takes a given amount of text as input and outputs a display size, or a table that associates ranges of text amount with display sizes may be stored in the storage unit 28 in advance and the display size determined according to the table.
  • in the above description, the display size of the call icon 611 is determined based on the amount of characters after the text conversion, but the length of the call time may be used instead of the amount of characters. That is, the longer the call time, the deeper the conversation is assumed to be, so the display size of the call icon 611 is increased; the shorter the call time, the simpler the conversation is assumed to be, so the display size of the call icon 611 is reduced.
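  • a minimal sketch of step S1406 showing both strategies mentioned above: a continuous function of the text amount, and a pre-stored table of ranges; the pixel values and thresholds are illustrative assumptions:

```python
def icon_size_by_function(char_count, base=48, scale=0.05, max_size=160):
    """More characters -> larger icon, capped at max_size (pixels)."""
    return min(max_size, int(base + scale * char_count))

SIZE_TABLE = [  # (upper bound of the character-count range, display size in px)
    (200, 48),
    (1000, 80),
    (float("inf"), 120),
]

def icon_size_by_table(char_count):
    """Look up the display size from a pre-stored range table."""
    for upper_bound, size in SIZE_TABLE:
        if char_count <= upper_bound:
            return size

print(icon_size_by_function(800))  # -> 88
print(icon_size_by_table(800))     # -> 80
```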
  • the control unit 21 executes context analysis on the text by using morphological analysis or the like (step S1407). This can be achieved by using existing text mining techniques. The control unit 21 then determines a heading that is presumed to be appropriate as the title of the call content from the analysis result (step S1408). Not limited, but as examples, words that frequently appear in the analyzed text data can be used for this heading, or words inferred to represent some schedule from the analysis result of the text data can be used. Further, the wording used in the heading may be based on the content spoken by the user of the terminal 20, may be based on the content spoken by the user of the other party, or may be based on both. In addition, when the conversation contains content related to a schedule, the terminal 20 may or may not start a schedule management application that manages schedules, different from the messaging application, and register the schedule on its calendar.
  • the control unit 21 causes the display processing unit 214 to display the call icon 611 in the talk room at the determined display size, and displays the determined heading in association with the call icon 611 (step S1409), and the process ends.
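  • a minimal sketch of step S1408's frequent-word strategy: pick the most frequent content words of the recognized text as a heading candidate; the stop-word list is an illustrative assumption, and a production system would use proper morphological analysis as the description says:

```python
from collections import Counter

STOP_WORDS = {"the", "a", "to", "is", "it", "on", "at", "we", "i", "you"}

def heading_from_text(text, top_n=2):
    """Return the top_n most frequent non-stop-words as a heading."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    counts = Counter(w for w in words if w and w not in STOP_WORDS)
    return " ".join(w for w, _ in counts.most_common(top_n))

transcript = "Drinking party on Saturday? Saturday works. Drinking party it is."
print(heading_from_text(transcript))  # -> "drinking party"
```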
  • FIG. 15 shows a display example in which the size of the call icon is changed and displayed according to the call volume (the text volume of the message of the call content).
  • FIG. 15A shows a display example of the call icon 1501 displayed when the call volume (the text volume of the message of the call content) is relatively small. Note that FIG. 15 shows a state in which the message is not expanded for the sake of readability.
  • FIG. 15 (b) shows an example of displaying the call icon when a call volume larger than the call volume of the call corresponding to the call icon 1501 is made with respect to FIG. 15 (a).
  • the call icon 1502 is displayed in a size larger than that of the call icon 1501 shown in FIG. 15A.
  • as shown in FIG. 15, by displaying the call icon according to the call volume (the amount of text of the messages of the call content), the user can recognize at a glance how much was talked about.
  • the method of displaying the call volume is not limited to the size of the call icon as described above.
  • the call volume may be expressed by the shade of color of the call icon.
  • in FIG. 16, the shades of color are represented by hatching.
  • FIG. 16A shows a call icon 1601 when the call volume is relatively small.
  • FIG. 16B shows a display example of a call icon 1602 for a call volume larger than that of the call corresponding to the call icon 1601 of FIG. 16A.
  • the call icon 1602 is displayed in a darker color to show that the call volume is larger. That is, the shade of the color of the call icon allows the user to recognize the call volume at a glance. In this way, the call volume can be expressed by the display mode of the call icon.
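  • a minimal sketch of expressing the call volume as the shade of the icon's color (more characters, darker shade); the RGB endpoints and the maximum character count are illustrative assumptions:

```python
def icon_color(char_count, max_chars=2000):
    """Interpolate from a light blue to a dark blue as call volume grows."""
    ratio = min(1.0, char_count / max_chars)
    r = int(200 * (1 - ratio))        # light (200,220,255) -> dark (0,40,120)
    g = int(220 - 180 * ratio)
    b = int(255 - 135 * ratio)
    return f"#{r:02x}{g:02x}{b:02x}"

print(icon_color(100))   # light shade for a short call
print(icon_color(1900))  # dark shade for a long call
```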
  • the image displayed as the call icon is not limited to a symbol indicating a call as shown in FIGS. 15 and 16, and may be an image related to the call, or a combination with the icon. That is, an image related to the call (an image relating to the position of the terminal during the call) may be displayed as a background image of the call symbol of the call icon shown in FIGS. 15 and 16. Specifically, as shown in FIG. 17, the terminal 20 may display, at the display position of the call icon, an image related to the call (not limited, but as examples of information related to the call, an image relating to the position of the terminal, an image the user input during the call, an image taken by the user during the call, or an image about the content of the call) as an alternative to the call icon.
  • as shown in the image 1701, the terminal 20 displays a part of the image related to the call acquired at the time of the call within the outer shape of the call icon.
  • the image related to the call may be displayed as it is, as shown in image 1702, regardless of the outer shape of the call icon.
  • the terminal 20 may also display a call icon 1801 so that it can be seen that the displayed image 1702 is related to the call, as shown in FIG. 18 (a).
  • in FIG. 18 (a), the call icon 1801 is superimposed on the image 1702, but it may be displayed outside the frame of the image 1702 as long as it can be understood that the call icon 1801 is associated with the image 1702.
  • the terminal 20 may enlarge and display the image 1702 as shown in FIG. 18B by detecting a touch input from the user on a portion of the image 1702 other than the call icon 1801. At this time, the call icon 1801 may or may not be displayed.
  • FIG. 18B shows an example in which the call icon 1801 is not displayed.
  • when an input on the call icon 1801 is detected, the call unit 212 of the terminal 20 may be configured to start a call with the user corresponding to the talk room.
  • FIG. 19 is a diagram showing a display example when a heading is added to the content of the call.
  • FIG. 19A shows an example in which the voice of a call is converted into a text message by voice recognition processing and displayed as a message on the talk room. As shown in the message of FIG. 19A, it can be understood that the users have promised a drinking party.
  • the control unit 21 of the terminal 20 performs morphological analysis and context analysis on the text of the messages and, as an example, identifies that a drinking party will be held on Saturday. The control unit 21 of the terminal 20 then displays the heading 1902 indicating the content of the call in association with the call icon 1901. In the example shown in FIG. 19, the heading 1902 with the content "Saturday drinking party" is displayed.
  • the terminal 20 can not only display a message indicating the content of the call, but also display a heading 1902 indicating the content of the call.
  • the user can recognize the contents of the call without reading all the messages indicating the contents of the call.
  • FIG. 19 shows an example of assigning a heading; in addition, to make the contents of the call easier to recognize, the terminal 20 may convert the contents of the call into text messages by voice recognition processing and then use analysis techniques such as morphological analysis and context analysis to recognize the content of the call and display a summarized sentence.
  • if the call is prolonged and the amount of text displayed as messages increases, it takes time and effort for the user to read the contents when everything is displayed. By summarizing the contents of the call, it is possible to make the user recognize the content of the call while simplifying the displayed messages.
  • the summary may or may not be displayed as if it were a conversation of any of the users involved in the call.
  • the schedule may or may not be included in the summary.
  • the terminal 20 may display image information indicating that a call has been made (not limited, but a call icon as an example) based on the amount of characters in the text data obtained by voice recognition of the call content or the length of the call time. Displaying the call icon based on the amount of characters or the call time means, for example, changing the display size of the call icon or changing the display color of the call icon depending on the amount of characters or the length of the call time.
  • the terminal 20 may display, in the talk room, an image related to the call (not limited, but as examples, an image relating to the position of the terminal, an image the user input during the call, an image taken by the user during the call, or an image related to the contents of the call). That is, instead of the call icon, an image related to the contents of the call may be displayed in the talk room as information indicating that the call has been made.
  • the image related to the actual call is displayed instead of the call icon as the image related to the call, so that the user can recognize the content of the call without looking at the message indicating the content of the call.
  • the terminal 20 may analyze the contents of the call, convert the analysis result into a summary sentence indicating the contents of the call, and display the summary.
  • a learning model may be generated by a learning process that uses the contents of calls between users and summaries of those contents as teacher data, and a summary may be created by inputting into the learning model the text data obtained by voice recognition processing.
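  • a minimal sketch of producing a summary sentence. The description envisions a learned model trained on (call text, summary) pairs; as a stand-in, the code below picks the sentence that shares the most vocabulary with the rest of the transcript, a simple extractive heuristic and not the learned method itself:

```python
import re

def _words(s):
    return set(re.findall(r"[a-z]+", s.lower()))

def extractive_summary(sentences):
    """Pick the sentence sharing the most vocabulary with the others."""
    def score(s):
        return sum(len(_words(s) & _words(o)) for o in sentences if o != s)
    return max(sentences, key=score)

calls = ["Shall we have a drinking party?",
         "A drinking party on Saturday sounds good.",
         "Saturday it is then."]
print(extractive_summary(calls))  # -> "A drinking party on Saturday sounds good."
```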
  • the terminal 20 can make the user recognize the content of the call with simple content.
  • the convenience of the messaging application can be improved, and the design of the display can be improved.
  • 1 Communication system
  • 10 Server
  • 11 Control unit
  • 111 Message processing unit
  • 12 Input / output unit
  • 13 Display unit
  • 14 Communication I/F (communication unit)
  • 20 Terminal
  • 21 Control unit
  • 211 Message processing unit
  • 212 Call unit
  • 213 Voice recognition unit
  • 214 Display processing unit
  • 22 Communication I/F
  • 23 Input / output unit
  • 231 Touch panel
  • 232 Microphone
  • 233 Speaker
  • 234 Camera
  • 24 Display unit (display)
  • 25 Location information acquisition unit
  • 28 Storage unit
  • 30 Network

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • User Interface Of Digital Computer (AREA)
  • Information Transfer Between Computers (AREA)
  • Telephone Function (AREA)
  • Telephonic Communication Services (AREA)

Abstract

This information processing method of a terminal, which transmits content to a first terminal or receives content that has been transmitted from the first terminal, includes: displaying, on a display region of the terminal, first content that has been transmitted from the first terminal and second content transmitted to the first terminal by a communication unit of the terminal; performing, by a control unit of the terminal, a control pertaining to a call with the first terminal on the basis of an input by a user of the terminal onto the display region on which the first content and the second content are displayed; acquiring, by the control unit, first information based on the voice of a user of the first terminal and second information based on the voice of the user of the terminal on the basis of the call with the first terminal; and displaying, on the display region, call information based on the first information and the second information.

Description

Information processing method, program, and terminal

The present disclosure relates to an information processing method of a terminal, a program, and a terminal.

In recent years, users have been exchanging messages by communication via messaging services. Among such messaging services, there are also services in which users can make a voice call or a video call with each other. Patent Document 1 discloses an example of such a system.

JP-A-2014-232502
According to a first aspect of the present invention, an information processing method of a terminal that transmits content to a first terminal or receives content transmitted from the first terminal includes: displaying, in a display area of the terminal, first content transmitted from the first terminal and second content transmitted to the first terminal by a communication unit of the terminal; performing, by a control unit of the terminal, control relating to a call with the first terminal based on an input by a user of the terminal to the display area in which the first content and the second content are displayed; acquiring, by the control unit, first information based on the voice of a user of the first terminal and second information based on the voice of the user of the terminal, based on the call with the first terminal; and displaying call information based on the first information and the second information in the display area.

According to a second aspect of the present invention, a program to be executed by a computer of a terminal that transmits content to a first terminal or receives content transmitted from the first terminal causes the computer to execute: displaying, in a display area of the terminal, first content transmitted from the first terminal and second content transmitted to the first terminal by a communication unit of the terminal; performing, by a control unit of the terminal, control relating to a call with the first terminal based on an input by the user of the terminal to the display area in which the first content and the second content are displayed; acquiring, by the control unit, first information based on the voice of a user of the first terminal and second information based on the voice of the user of the terminal, based on the call with the first terminal; and displaying call information based on the first information and the second information in the display area.

According to a third aspect of the present invention, a terminal that transmits content to a first terminal or receives content transmitted from the first terminal includes: a display unit that displays first content transmitted from the first terminal and second content transmitted to the first terminal by a communication unit of the terminal; and a control unit that performs control relating to a call with the first terminal based on an input by the user of the terminal to the display unit on which the first content and the second content are displayed. The control unit acquires first information based on the voice of a user of the first terminal and second information based on the voice of the user of the terminal, based on the call with the first terminal, and the display unit displays call information based on the first information and the second information.
FIG. 1 is a diagram showing the configuration of a communication system according to one aspect of the embodiment.
FIG. 2 shows one embodiment of the communication system: (a) shows a call between users via the messaging service, and (b) shows a display example of a talk room in the messaging service after the call.
FIG. 3 is a sequence diagram showing exchanges in the communication system.
FIG. 4 is a flowchart showing an example of a call and call-content display processing on the terminal.
FIG. 5 is a flowchart showing an operation example of switching the display / non-display of call content in the talk room by the terminal.
FIG. 6: (a) is a screen view showing an example of a talk room on the terminal before a call; (b) is a screen view showing an example of the talk room after the call.
FIG. 7: (a) is a screen view showing an example of a talk room in which the call content is not expanded; (b) is a screen view showing a display example of the talk room with the call content expanded.
FIG. 8: (a) is a diagram showing the user bringing a finger close to the call icon; (b) is a screen view showing an example in which the call icon is enlarged.
FIG. 9: (a) is a screen view showing an example of displaying a message indicating the content of a call in a pop-up; (b) is a screen view showing an example of transitioning to another screen to display the message indicating the content of the call.
FIG. 10 is a flowchart showing an operation example when a video call is executed on the terminal.
FIG. 11: (a) is a schematic diagram showing a part of a call; (b) is a schematic diagram showing an example of the situation following (a); (c) is a screen view showing a display example of the talk room after the call.
FIG. 12: (a) is a screen view showing an example of displaying an image relating to the position of the terminal as a background image; (b) is a screen view showing a display example in which the background image and the content of the call are linked.
FIG. 13: (a) is a schematic diagram showing a part of a call; (b) is a schematic diagram showing an example of the situation following (a); (c) is a screen view showing a display example of the talk room after the call.
FIG. 14 is a flowchart showing an operation example related to the display of the call icon on the terminal.
FIG. 15: (a) is a screen view showing a display example in which a relatively small call volume is expressed by the display size of the call icon; (b) is a screen view showing a display example of the call icon when the call volume is larger than in (a).
FIG. 16: (a) is a screen view showing a display example in which a relatively small call volume is expressed by the color of the call icon; (b) is a screen view showing a display example of the call icon when the call volume is larger than in (a).
FIG. 17: (a) and (b) are screen views showing examples of displaying an image related to the content of a call in the talk room as an alternative to the call icon.
FIG. 18: (a) is a screen view showing an example of displaying an image related to the content of a call in the talk room as an alternative to the call icon, together with the call icon; (b) is a screen view showing an example in which the image is enlarged as an alternative to the call icon.
FIG. 19: (a) is a screen view showing a display example of messages indicating the content of a call; (b) is a screen view showing an example of a heading (summary) displayed for the call content shown in (a).
<Compliance with legal matters>
It should be noted that the disclosure described herein is premised on compliance with the legal matters of the implementing country necessary for the implementation of this disclosure, such as the secrecy of communications.

An embodiment for implementing a display method and the like according to the present disclosure, with which the status of transmission or reception by a terminal can be confirmed, will be described with reference to the drawings.
<System configuration>
FIG. 1 shows the configuration of the communication system 1 according to an embodiment of the present disclosure. As disclosed in FIG. 1, in the communication system 1, the server 10 and the terminals 20 (terminal 20A, terminal 20B, terminal 20C) are connected via the network 30. The server 10 provides, via the network 30, a service for transmitting and receiving messages between the terminals 20 owned by users. The number of terminals 20 connected to the network 30 is not limited.
The network 30 plays the role of connecting one or more terminals 20 and one or more servers 10. That is, the network 30 means a communication network that provides a connection route so that data can be transmitted and received after the terminal 20 connects to the server 10.
One or more portions of the network 30 may or may not be a wired network or a wireless network. The network 30 can include, as examples and not limitations, an ad hoc network, an intranet, an extranet, a virtual private network (VPN), a local area network (LAN), a wireless LAN (WLAN), a wide area network (WAN), a wireless WAN (WWAN), a metropolitan area network (MAN), a part of the Internet, a part of the Public Switched Telephone Network (PSTN), a mobile phone network, ISDN (integrated service digital networks), LTE (long term evolution), CDMA (code division multiple access), Bluetooth (registered trademark), satellite communication, and the like, or a combination of two or more of these. The network 30 can include one or more networks 30.
The terminal 20 (terminal 20A, terminal 20B, terminal 20C) may be any information processing terminal that can realize the functions described in each embodiment. The terminal 20 includes, as examples and not limitations, a smartphone, a mobile phone (feature phone), a computer (as examples and not limitations, a desktop, a laptop, a tablet, and the like), a media computer platform (as examples and not limitations, a cable or satellite set-top box, a digital video recorder), a handheld computer device (as examples and not limitations, a PDA (personal digital assistant), an e-mail client, and the like), a wearable terminal (a glasses-type device, a watch-type device, and the like), another type of computer, or a communication platform. The terminal 20 may also be expressed as an information processing terminal.
Since the configurations of the terminal 20A, the terminal 20B, and the terminal 20C are basically the same, the following description refers to the terminal 20. Further, as necessary, a terminal used by a user X is expressed as a terminal 20X, and user information in a predetermined service associated with the user X or the terminal 20X is expressed as user information X. The user information is information on a user associated with an account used by the user in a predetermined service. The user information includes, as examples and not limitations, information associated with the user that is input by the user or given by the predetermined service, such as the user's name, the user's icon image, the user's age, the user's gender, the user's address, the user's hobbies and preferences, and the user's identifier, and may or may not be any one or a combination of these.
The server 10 has a function of providing a predetermined service to the terminal 20. The server 10 may be any information processing device that can realize the functions described in each embodiment. The server 10 includes, as examples and not limitations, a server device, a computer (as examples and not limitations, a desktop, a laptop, a tablet, and the like), a media computer platform (as examples and not limitations, a cable or satellite set-top box, a digital video recorder), a handheld computer device (as examples and not limitations, a PDA, an e-mail client, and the like), another type of computer, or a communication platform. The server 10 may also be expressed as an information processing device. When it is not necessary to distinguish between the server 10 and the terminal 20, each of them may or may not be expressed as an information processing device.
<Hardware (HW) configuration>
The HW configuration of each device included in the communication system 1 will be described with reference to FIG. 1.

(1) HW configuration of the terminal
The terminal 20 includes a control unit 21 (CPU: central processing unit), a storage unit 28, a communication I/F 22 (interface), an input / output unit 23, a display unit 24, and a position information acquisition unit 25. The components of the HW of the terminal 20 are, as an example and not a limitation, connected to each other via a bus B. It is not essential that the HW configuration of the terminal 20 include all of these components. As an example and not a limitation, the terminal 20 may or may not be configured such that individual components, or a plurality of components, such as the microphone 232, the camera 234, and the position information acquisition unit 25, can be removed.
The communication I/F 22 transmits and receives various data via the network 30. The communication may be executed by wire or wirelessly, and any communication protocol may be used as long as mutual communication can be executed. The communication I/F 22 has a function of executing communication with the server 10 via the network 30. The communication I/F 22 transmits various data to the server 10 according to instructions from the control unit 21. The communication I/F 22 also receives various data transmitted from the server 10 and conveys them to the control unit 21. The communication I/F 22 may be simply expressed as a communication unit. When the communication I/F 22 is composed of a physically structured circuit, it may be expressed as a communication circuit.
The input / output unit 23 includes devices for inputting various operations to the terminal 20 and devices for outputting the results of processing performed by the terminal 20. In the input / output unit 23, the input unit and the output unit may be integrated, or may or may not be separated into an input unit and an output unit.
The input unit is realized by any one or a combination of all types of devices that can receive an input from the user and convey information related to the input to the control unit 21. The input unit includes, as examples and not limitations, hardware keys such as a touch panel 231, a touch display, and a keyboard, a pointing device such as a mouse, a camera 234 (operation input via moving images), and a microphone 232 (operation input by voice).
The output unit is realized by any one or a combination of all types of devices that can output the results of processing performed by the control unit 21. The output unit includes, as examples and not limitations, a touch panel, a touch display, a speaker 233 (audio output), a lens (as examples and not limitations, 3D (three dimensions) output and hologram output), and a printer.
The display unit 24 is realized by any one or a combination of all types of devices that can display according to the display data written in the frame buffer. The display unit 24 includes, as examples and not limitations, a touch panel, a touch display, a monitor (as examples and not limitations, a liquid crystal display and an OELD (organic electroluminescence display)), a head mounted display (HMD: Head Mounted Display), projection mapping, a hologram, and a device that can display images, text information, and the like in the air (which may or may not be a vacuum). These display units 24 may or may not be able to display the display data in 3D.
When the input / output unit 23 is a touch panel, the input / output unit 23 and the display unit 24 may be arranged to face each other with substantially the same size and shape.
The control unit 21 has a circuit physically structured to execute functions realized by the codes or instructions included in a program, and is realized, as an example and not a limitation, by a data processing device built into hardware. Therefore, the control unit 21 may or may not be expressed as a control circuit.

The control unit 21 includes, as examples and not limitations, a central processing unit (CPU), a microprocessor, a processor core, a multiprocessor, an ASIC (application-specific integrated circuit), and an FPGA (field programmable gate array).
The storage unit 28 has a function of storing various programs and various data required for the terminal 20 to operate. The storage unit 28 includes, as examples and not limitations, various storage media such as an HDD (hard disk drive), an SSD (solid state drive), a flash memory, a RAM (random access memory), and a ROM (read only memory). The storage unit 28 may or may not be expressed as a memory.
The terminal 20 stores a program P in the storage unit 28, and by executing this program P, the control unit 21 executes the processing of each unit included in the control unit 21. That is, the program P stored in the storage unit 28 causes the terminal 20 to realize each function executed by the control unit 21. This program P may or may not be expressed as a program module.
The microphone 232 is used for inputting voice data. The speaker 233 is used for outputting voice data. The camera 234 is used for acquiring moving image data. Cameras 234 may be provided on both the side of the terminal 20 where the display unit 24 is provided and the side opposite to it, and may be called an in-camera and an out-camera, respectively. Switching between the in-camera and the out-camera is executed by an input from the user of the terminal 20.
(2) HW configuration of the server
The server 10 includes a control unit 11 (CPU), a storage unit 15, a communication I/F 14 (interface), an input / output unit 12, and a display unit 13. The components of the HW of the server 10 are, as an example and not a limitation, connected to each other via a bus B. It is not essential that the HW configuration of the server 10 include all of these components. As an example and not a limitation, the HW of the server 10 may or may not be configured such that the display unit 13 can be removed.
The control unit 11 has a circuit physically structured to execute functions realized by the codes or instructions included in a program, and is realized, as an example and not a limitation, by a data processing device built into hardware.

The control unit 11 is typically a central processing unit (CPU), and may or may not additionally be a microprocessor, a processor core, a multiprocessor, an ASIC, or an FPGA. In the present disclosure, the control unit 11 is not limited to these.
The storage unit 15 has a function of storing various programs and various data required for the server 10 to operate. The storage unit 15 is realized by various storage media such as an HDD, an SSD, and a flash memory. However, in the present disclosure, the storage unit 15 is not limited to these. The storage unit 15 may or may not be expressed as a memory.
The communication I/F 14 transmits and receives various data via the network 30. The communication may be executed by wire or wirelessly, and any communication protocol may be used as long as mutual communication can be executed. The communication I/F 14 has a function of executing communication with the terminal 20 via the network 30. The communication I/F 14 transmits various data to the terminal 20 according to instructions from the control unit 11. The communication I/F 14 also receives various data transmitted from the terminal 20 and conveys them to the control unit 11. The communication I/F 14 may be simply expressed as a communication unit. When the communication I/F 14 is composed of a physically structured circuit, it may be expressed as a communication circuit.
The input/output unit 12 is realized by a device that inputs various operations to the server 10. The input/output unit 12 is realized by any one, or a combination, of all kinds of devices that can receive input from a user and pass information relating to that input to the control unit 11. The input/output unit 12 is typically realized by hardware keys, represented by a keyboard, or by a pointing device such as a mouse. As an example and not a limitation, the input/output unit 12 may or may not include a touch panel, a camera (operation input via moving images), and a microphone (operation input by voice). However, in the present disclosure, the input/output unit 12 is not limited to these.
The display unit 13 is typically realized by a monitor (as an example and not a limitation, a liquid crystal display or an OELD (organic electroluminescence display)). The display unit 13 may or may not be a head-mounted display (HMD) or the like, and these display units 13 may or may not be capable of displaying display data in 3D. However, in the present disclosure, the display unit 13 is not limited to these.
The server 10 stores a program P in the storage unit 15, and by executing this program P, the control unit 11 executes processing as each of the units included in the control unit 11. That is, the program P stored in the storage unit 15 causes the server 10 to realize each of the functions executed by the control unit 11. This program P may or may not be referred to as a program module.
In each embodiment of the present disclosure, the functions are described as being realized by the CPU of the terminal 20 and/or of the server 10 executing the program P.
The control unit 21 of the terminal 20 and/or the control unit 11 of the server 10 may or may not realize each process not only by a CPU having a control circuit but also by a logic circuit (hardware) or a dedicated circuit formed in an integrated circuit (an IC (Integrated Circuit) chip, an LSI (Large Scale Integration)) or the like. These circuits may be realized by one or more integrated circuits, and the plurality of processes shown in each embodiment may or may not be realized by a single integrated circuit. An LSI may also be called a VLSI, a super LSI, an ultra LSI, or the like depending on the degree of integration. For this reason, the control unit 21 may or may not be referred to as a control circuit.
The program P of each embodiment of the present disclosure (as examples and not limitations, a software program, a computer program, or a program module) may or may not be provided in a state of being stored in a computer-readable storage medium. The storage medium can store the program P in a "non-transitory tangible medium". The program P may or may not be one for realizing some of the functions of each embodiment of the present disclosure. Furthermore, it may or may not be a so-called difference file (difference program) that realizes the functions of each embodiment of the present disclosure in combination with a program P already recorded on the storage medium.
The storage medium can include one or more semiconductor-based or other integrated circuits (ICs) (as examples and not limitations, field-programmable gate arrays (FPGAs) or application-specific ICs (ASICs)), hard disk drives (HDDs), hybrid hard drives (HHDs), optical discs, optical disc drives (ODDs), magneto-optical discs, magneto-optical drives, floppy diskettes, floppy disk drives (FDDs), magnetic tapes, solid-state drives (SSDs), RAM drives, secure digital cards or drives, any other suitable storage media, or a suitable combination of two or more of these. The storage medium may, where appropriate, be volatile, non-volatile, or a combination of volatile and non-volatile. The storage medium is not limited to these examples, and may be any device or medium as long as it can store the program P. The storage medium may or may not be referred to as a memory.
The server 10 and/or the terminal 20 can realize the functions of the plurality of functional units shown in each embodiment by reading the program P stored in the storage medium and executing the read program P.
The program P of the present disclosure may or may not be provided to the server 10 and/or the terminal 20 via any transmission medium capable of transmitting the program (a communication network, a broadcast wave, or the like). As an example and not a limitation, the server 10 and/or the terminal 20 realizes the functions of the plurality of functional units shown in each embodiment by executing a program P downloaded via the Internet or the like.
Each embodiment of the present disclosure may also be realized in the form of a data signal embedded in a carrier wave, in which the program P is embodied by electronic transmission.
At least part of the processing in the server 10 and/or the terminal 20 may or may not be realized by cloud computing constituted by one or more computers.
At least part of the processing in the terminal 20 may or may not be performed by the server 10. In this case, at least part of the processing of each functional unit of the control unit 21 of the terminal 20 may or may not be performed by the server 10.
At least part of the processing in the server 10 may or may not be performed by the terminal 20. In this case, at least part of the processing of each functional unit of the control unit 11 of the server 10 may or may not be performed by the terminal 20.
Unless explicitly mentioned, the determination configurations in the embodiments of the present disclosure are not essential; a predetermined process may or may not be performed when a determination condition is satisfied, or when it is not satisfied.
As examples and not limitations, the programs of the present disclosure are implemented using a script language such as ActionScript or JavaScript (registered trademark), an object-oriented programming language such as Objective-C or Java (registered trademark), a markup language such as HTML5, or the like.
<Functional configuration>
<Embodiment 1>
<Overview>
In the communication system 1 according to the present embodiment, terminals 20 can exchange messages with one another in a talk room, via the server 10 and a messaging application. A talk room is a place, in the messaging service provided by the server 10, where users of the messaging service exchange content with one another. The content exchanged in a talk room includes, but is not limited to, character information input by a user using his or her own terminal 20, image information including photographs and stamps, and various kinds of file information such as audio files, video files, and data files.
In the communication system 1, users of terminals 20 can further execute a call with each other via a talk room. In the communication system 1, users 10a and 10b make a call as shown in FIG. 2(a). After the call ends, image information indicating that the users made a call (hereinafter referred to as a call icon; the image indicating that a call was made is not limited to an icon, and the image information is one example, not a limitation, of information related to the call) is displayed in the talk room. In the present embodiment, as shown in FIG. 2(b), the terminal further displays, as text, a message indicating the content of the call (one example, not a limitation, of call information). FIG. 2(b) is a diagram showing an example of the display screen of the terminal 20b of the user 10b. A detailed description follows.
(1) Functional configuration of the terminal
As shown in FIG. 1, the terminal 20 includes, as functions realized by the control unit 21, a message processing unit 211, a call unit 212, a voice recognition unit 213, and a display processing unit 214.
The message processing unit 211 receives, in accordance with the messaging application provided by the messaging service of the server 10, input from the user and/or input of content including messages received by the communication I/F 22, and instructs the display processing unit 214 to display it. When input from the user is received, the message processing unit 211 instructs the communication I/F 22 to transmit the received input content to the server 10. The targets processed by the message processing unit 211 are not limited to text messages input by the user into the talk room, and may include image information including photographs and stamps, audio files, video files, data files, and the like.
The message processing unit 211 may or may not determine the display size of the call icon according to the amount of text in the text data generated by the voice recognition unit 213 through voice recognition, and instruct the display processing unit 214 to display a call icon sized accordingly. By displaying the call icon at a size corresponding to the amount of text, the user can later estimate the volume of the call from the size of the icon. Estimating the call volume and checking the date and time of the call makes it easier for the user to recall the content of the call at that time. Instead of the size of the call icon, the volume of the call may be expressed by a change of color according to the amount of text.
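As a rough illustration of this sizing rule, the following Python sketch maps the length of a recognized transcript to an icon size and tint. The thresholds, pixel sizes, and colors are illustrative assumptions, not values taken from this disclosure.

```python
# Minimal sketch: scale (and optionally tint) the call icon with the
# amount of recognized text. All concrete numbers here are assumptions.

def call_icon_style(transcript: str) -> dict:
    """Pick a display size (px) and tint for the call icon based on how
    much text the speech recognition produced."""
    n = len(transcript)
    if n < 200:      # short call
        return {"size_px": 40, "tint": "#8bc34a"}
    elif n < 1000:   # medium call
        return {"size_px": 56, "tint": "#ffc107"}
    else:            # long call
        return {"size_px": 72, "tint": "#f44336"}

print(call_icon_style("Hello, are we still on for tomorrow?"))
# {'size_px': 40, 'tint': '#8bc34a'}
```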
The call unit 212 has a function of executing, on the messaging service and via the server 10, a call with another user of the messaging service. The call unit 212 has a function of placing a call to a designated party when it receives a call input on the messaging service from the user of the terminal 20, and a function of accepting (receiving) a call placed by another user of the messaging service. As one example and not a limitation, the call unit 212 executes the call by a function known as VoIP (Voice over Internet Protocol). The call unit 212 may or may not record the content of the call in the storage unit 28 during the call. The call unit 212 may also have a video call function. That is, during a video call, the call unit 212 transmits the sound collected by the microphone 232 and the video captured by the camera 234 to the server 10 via the communication I/F 22, receives via the communication I/F 22 the audio signal and the video signal transmitted from the other party via the server 10, outputs the sound based on the audio signal from the speaker 233, and instructs the display processing unit 214 to display the video based on the video signal on the display unit 24. The call unit 212 may or may not start a call (place a call) based on an input to image information displayed in the talk room indicating that a call was made (the call icon, or an image for calling that is separate from the call icon). That is, by making a predetermined input to the call icon indicating that a call was made in the talk room, call-origination processing may or may not be executed so as to call the user corresponding to that talk room. The call may also be a call made via a speaker having an AI assistant function, such as a smart speaker held by the user of the terminal 20. In that case, the call with the other terminal is made through the smart speaker; the voice collected by the smart speaker is transmitted directly to the server 10, and the server 10 transmits it to the terminal of the other party.
In this case, the smart speaker itself may perform the voice recognition processing and transmit a text message to the server 10, and the server 10 may transmit a text message indicating the content of the call to the talk room of the terminal 20 of the user associated with the smart speaker, so that the display unit 24 of the terminal 20 displays a message indicating the content of the call in the talk room. Alternatively, the smart speaker may only transmit the voice to the server 10, and the server 10 may perform the voice recognition processing and transmit a text message indicating the content of the call to the terminal 20 of the user corresponding to the smart speaker, so that the display unit 24 of the terminal 20 displays a message indicating the content of the call in the talk room. As another technique using a smart speaker, the communication I/F 22 of the terminal 20 may first receive the user's voice from the smart speaker, and the call unit 212 may receive the voice collected by the smart speaker and transmit that voice to the server 10 via the communication I/F 22.
The voice recognition unit 213 has a function of recognizing the voice of the call being executed by the call unit 212 and converting it into text data. The voice recognition by the voice recognition unit 213 may be executed on the recorded data of the call recorded in the storage unit 28 by the call unit 212. The voice recognition unit 213 may or may not record the text data obtained by voice recognition in the storage unit 28. The voice recognition unit 213 transmits the text data obtained by voice recognition to the message processing unit 211; it divides the text data by speaker along the time series, associates information indicating the speaker with each divided piece of text data, and transmits them to the message processing unit 211. To identify a speaker from the content of the voice, a feature amount of the voice in the conversation (as one example and not a limitation, its frequency spectrum) is extracted, whereby each piece of the conversation can be classified and the speaker can be identified.
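The following is a minimal Python sketch of this kind of spectrum-based speaker grouping: each utterance is reduced to a normalized magnitude spectrum and assigned to the nearest existing speaker centroid, with a new speaker opened when nothing is close enough. The feature, distance measure, and threshold are simplifying assumptions, not the method actually claimed.

```python
# Illustrative sketch of grouping utterances by speaker from a frequency-
# spectrum feature. Threshold and feature choice are assumptions.
import numpy as np

def spectral_feature(samples: np.ndarray, sr: int = 16000) -> np.ndarray:
    """Normalized magnitude spectrum of one utterance."""
    spec = np.abs(np.fft.rfft(samples, n=sr))
    return spec / (np.linalg.norm(spec) + 1e-9)

def assign_speakers(utterances, threshold=0.35):
    """Assign each utterance (an array of samples) a speaker id by nearest
    spectral centroid; open a new speaker when no centroid is near."""
    centroids, labels = [], []
    for u in utterances:
        f = spectral_feature(u)
        dists = [np.linalg.norm(f - c) for c in centroids]
        if dists and min(dists) < threshold:
            labels.append(int(np.argmin(dists)))
        else:
            centroids.append(f)
            labels.append(len(centroids) - 1)
    return labels
```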
The display processing unit 214 displays, in accordance with the messaging application provided by the messaging service of the server 10, content including input from the user and/or messages received by the communication I/F 22. The content transmitted by the terminal 20 (one example, not a limitation, of second content) and the content transmitted by terminals held by users other than the user of the terminal 20 (one example, not a limitation, of first content) may be displayed in different display modes (as one example and not a limitation, content transmitted by other users is displayed on the left side of the display area of the display unit 24 and content transmitted by the user of the terminal 20 on the right side, or the background color of the content is changed for each user). Displaying content transmitted by another user on the left side of the display area of the display unit 24 means displaying the content aligned to the left side of the display area. That is, as shown in the talk room display example of FIG. 2(b), the left edge of a message corresponding to speech uttered by another user is displayed aligned to the left side of the display area. Similarly, displaying content transmitted by the user of the terminal 20 on the right side of the display area of the display unit 24 means displaying the right edge of the content (message) aligned to the right side of the display area. That is, as shown in the talk room display example of FIG. 2(b), the right edge of a message corresponding to an utterance of the user of the terminal 20 is displayed aligned to the right side of the display area of the terminal 20. For the text messages based on the voice recognized by the voice recognition unit 213, the display processing unit 214 displays a message indicating the call content uttered by the user of the terminal 20 (one example, not a limitation, of second information) in the display area in association with the user of the terminal 20, and displays a message indicating the call content uttered by the user at the other end (one example, not a limitation, of first information) in the display area in association with that user.
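A minimal sketch of this alignment rule follows; the data shapes and identifiers are hypothetical.

```python
# Sketch: choose the display side for each transcript message, mirroring
# the talk-room convention above (own messages right, the other party's
# messages left). Rendering details are assumptions.

def message_alignment(sender_id: str, own_user_id: str) -> str:
    return "right" if sender_id == own_user_id else "left"

transcript = [("user_b", "Hello?"), ("user_a", "Hi, about tomorrow...")]
for sender, text in transcript:
    side = message_alignment(sender, own_user_id="user_a")
    print(f"[{side:>5}] {sender}: {text}")
```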
(2) Functional configuration of the server
As shown in FIG. 1, the server 10 includes a message processing unit 111 as a function realized by the control unit 11.
The message processing unit 111 has a function of managing the talk rooms in which users exchange messages with one another. The message processing unit 111 relays the exchange of content between the terminals that receive the messaging service provided by the server 10. That is, when content is transmitted from a certain user to a talk room, the message processing unit 111 identifies that talk room and transmits the content to the other users belonging to it.
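A minimal sketch of this relay, assuming a simple in-memory mapping of talk rooms to members and an abstract send callback (both hypothetical):

```python
# Sketch of the relay in message processing unit 111: look up the talk
# room a content item was posted to and forward it to the other members.

talk_rooms = {"room1": {"user_a", "user_b"}}  # room id -> member ids

def relay(room_id: str, sender_id: str, content: bytes, send):
    """Forward `content` to every room member except the sender.
    `send(user_id, content)` is an assumed transport callback."""
    for member in talk_rooms[room_id] - {sender_id}:
        send(member, content)

relay("room1", "user_a", b"hello", send=lambda u, c: print(u, c))
```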
<Operation>
FIG. 3 is a sequence diagram showing an example of the exchanges among the devices in the communication system 1 according to the present embodiment. The sequence diagram shown in FIG. 3 illustrates the exchanges when users make a call on the messaging application.
As shown in FIG. 3, the terminal 20a first designates a call partner on the messaging application in accordance with input from the user and places a call (step S301). That is, the terminal 20a transmits to the server 10 a call request including information on the call partner.
Upon receiving the call request from the terminal 20a, the server 10 identifies the user at the other end (terminal 20b) from the call-partner information included in the call request, and transmits a call signal to the identified user (terminal 20b) (step S302).
The terminal 20b receives the call signal transmitted from the server 10. That is, the terminal 20b receives, on the messaging application, the call request from the user of the terminal 20a (step S303). The terminals 20a and 20b then make a call on the messaging application via the server 10 (step S304). Here, the content of the call may or may not be recorded. The user of the terminal 20a and the user of the terminal 20b then each make an input on their respective terminals to end the call, and the call ends (step S305).
After the call ends, the terminal 20b performs voice recognition on the content of the call and converts it into text information (step S306). When the content of the call is recorded in step S304, the voice recognition processing can be executed after the call ends; when it is not recorded, voice recognition processing is executed in real time from immediately after the start of the call. The terminal 20b stores the message (text message) obtained by voice recognition (step S307). The message obtained by voice recognition may be transmitted not only to the terminal 20b but also to the server 10 and the terminal 20a and stored there, or it may be stored only in the server 10 rather than in the terminal 20b. As long as the text message data obtained by voice recognition is stored in some device involved in the communication system, display in the talk room can be realized.
When the terminal 20b has executed the voice recognition processing on the call content, it displays the recognized text data as messages in the display area of the display unit 24 of the terminal 20 by means of the messaging application (step S308).
Although not shown in FIG. 3, the terminal 20a may or may not also execute the processing of steps S306 to S308, that is, the processing of executing voice recognition on the content of the call and displaying the recognized text data. Since the call is made via the server 10, the voice recognition processing may instead be executed by the server 10; in that case, the text data indicating the content of the call obtained by the server 10 through voice recognition is transmitted to each user (terminal 20) involved in the call and displayed in the talk room of each terminal. By automatically converting the content of a call into text data and displaying it in the talk room in this way, the user can reliably recognize the content of the call even when he or she later wants to recall it.
FIG. 4 is a flowchart showing an operation example of the terminal 20 for realizing the processing of the sequence diagram shown in FIG. 3.
The control unit 21 of the terminal 20 detects whether a call has been started on the messaging application (step S401). The call unit 212 can detect this according to whether, on the messaging application, there has been a response to a call placed from the terminal 20 in accordance with input from the user, or there has been an input to accept an incoming call from another terminal.
While the call unit 212 is on the call, the control unit 21 of the terminal 20 records the voice of the call and stores the recorded voice data in the storage unit 28 (step S402).
The control unit 21 of the terminal 20 determines whether the call has ended based on whether there has been a call-end input from the user via the input/output unit 23 (step S403). If the call has not ended (NO in step S403), the control unit 21 waits until the call ends.
If it is determined that the call has ended (YES in step S403), the control unit 21 ends the recording. The voice recognition unit 213 executes voice recognition processing on the recorded voice data, and the text messages obtained by voice recognition are stored in the storage unit 28 (step S404). That is, the voice recognition unit 213 converts the recorded voice data into text data indicating the content of the call.
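A minimal sketch of this step follows; `stt.transcribe` stands in for whatever recognizer the terminal actually uses (a hypothetical interface, not a named library API).

```python
# Sketch of step S404 only: turn the recorded call audio into text and
# keep it alongside other stored data.

class VoiceRecognitionUnit:
    def __init__(self, stt):
        self.stt = stt  # assumed engine exposing transcribe(audio) -> str

    def recognize_recording(self, recorded_audio: bytes, storage: dict) -> str:
        text = self.stt.transcribe(recorded_audio)  # audio -> call transcript
        storage["call_transcript"] = text           # analogous to storage unit 28
        return text
```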
The text message obtained by voice recognition may or may not be transmitted to the server 10. Furthermore, when the server 10 receives the text message, the message may or may not be transmitted to the terminal of the other party. By transmitting the text message obtained by the terminal 20 through voice recognition to the server 10 or to the terminal of the other party, a message indicating the content of the call can also be displayed as text on the other party's terminal, so that the other party, too, can later check the content of the call by looking at the messages. The other party's terminal may or may not use the received text message to display it in the talk room in the same way as the terminal 20.
The voice recognition unit 213 divides the text data obtained by voice recognition by speaker, in chronological order (step S405). At this time, the voice recognition unit 213 may or may not further divide text data spoken by the same speaker according to a predetermined criterion; as one example and not a limitation, it may or may not divide the data sentence by sentence. The voice recognition unit 213 passes the divided text data to the display processing unit 214.
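The chronological, per-speaker, per-sentence division might look like the following sketch; the input shape (speaker, start time, text) is an assumption about what the recognizer hands over.

```python
# Sketch of step S405: order recognized utterances by time and split each
# one at sentence boundaries, keeping the speaker attached to each chunk.
import re

def segment(utterances):
    """utterances: list of (speaker_id, start_time, text) tuples."""
    chunks = []
    for speaker, t, text in sorted(utterances, key=lambda u: u[1]):
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            if sentence:
                chunks.append((speaker, t, sentence))
    return chunks
```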
The display processing unit 214 then displays each piece of text data divided by the voice recognition unit 213 on the display unit 24 as a message in the talk room, in association with the corresponding speaker (step S406). That is, the control unit 21 of the terminal 20 displays the text message obtained by recognizing the voice of the user holding the terminal 20 (an example, not a limitation, of second information) in association with the user of the terminal 20, and displays the text message obtained by recognizing the voice of the other party (an example, not a limitation, of first information) in association with the other party.
The control unit 21 determines, via the input/output unit 23, whether there has been an input from the user to terminate the messaging application (step S407). If there is no termination input (NO in step S407), the processing returns to step S401. If there is a termination input (YES in step S407), the processing ends. As described above, according to the terminal 20 of the present embodiment, when a call is executed on the messaging application as shown in FIG. 2(a), the content of the call can be automatically converted into text and displayed as messages as shown in FIG. 2(b). This can later help the user recall the content of the conversation held during the call.
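Putting the steps of FIG. 4 together, a skeleton of the loop could look like the sketch below; every object passed in (call unit, recorder, recognizer, display) is an assumed interface, and `segment` is the helper sketched above.

```python
# End-to-end sketch of the flow of Fig. 4 (S401-S407) under assumed
# interfaces; not a definitive implementation.

def run_messaging_app(call_unit, recorder, recognizer, display):
    while True:
        if call_unit.wait_for_call_start():            # S401
            audio = recorder.record_until_call_end()   # S402-S403
            text = recognizer.transcribe(audio)        # S404
            utterances = recognizer.diarize(audio, text)
            for speaker, _, sentence in segment(utterances):  # S405
                display.show_message(speaker, sentence)       # S406
        if display.user_requested_exit():              # S407
            break
```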
FIG. 5 is a flowchart showing an operation example of the processing for displaying the messages indicating the content of a call on the terminal 20. The terminal 20 may or may not have a function of switching between showing and hiding the messages of the call content when users have made a call in the talk room. FIG. 5 shows an operation example of the terminal 20 when the display of the messages can be toggled, for the case where a talk room is displayed on the display unit 24 of the terminal 20 and a call has been made on the messaging application in the past. The processing shown in FIG. 5 takes place while the user is running the messaging application on the terminal 20 and displaying the talk room.
A talk room is displayed on the display unit 24 of the terminal 20, and when a call has been made on the messaging application in the past, image information (a call icon) indicating that a call was made is displayed in the talk room. The control unit 21 of the terminal 20 determines whether an input to the call icon displayed in the talk room (as one example and not a limitation, a touch input) has been made on the input/output unit 23 (step S501).
When there is a touch input on the call icon (YES in step S501), the control unit 21 determines whether the content of the messages corresponding to the call icon has already been expanded (step S502). The messages being expanded is synonymous with the messages indicating the content of the call being displayed.
If the call messages have already been expanded (YES in step S502), the display processing unit 214 hides the displayed call messages (step S503). If the call messages have not been expanded (NO in step S502), the display processing unit 214 displays the content of the call messages on the display unit 24 (step S504), and the processing ends. Whether the terminal 20 displays the messages in the talk room in the expanded or the collapsed state at the end of a call is arbitrary, and may be determined by a setting the user has made on the terminal 20. When displaying the messages of the call content, all of the text messages obtained by voice recognition of the call content may be displayed, or only a partial excerpt may be displayed. When a partial excerpt is displayed, the text messages may be analyzed so as to display those text messages showing the content inferred to be important in the call.
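A minimal sketch of this toggle (steps S501 to S504), with the expansion state kept on a hypothetical icon object:

```python
# Sketch of Fig. 5: a tap on the call icon toggles the transcript between
# expanded and collapsed. State handling is an assumption.

class CallIcon:
    def __init__(self, transcript):
        self.transcript = transcript
        self.expanded = False  # initial state may follow a user setting

    def on_tap(self):
        if self.expanded:        # S502 -> S503: hide the messages
            self.expanded = False
            return []
        self.expanded = True     # S502 -> S504: show the messages
        return self.transcript

icon = CallIcon(["A: Hello?", "B: Hi, about tomorrow..."])
print(icon.on_tap())  # expands:  ['A: Hello?', 'B: Hi, about tomorrow...']
print(icon.on_tap())  # collapses: []
```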
FIG. 6 is a diagram showing an example of how the display of the talk room changes before and after a call when a call is made in the talk room on the terminal 20 shown in FIG. 4. FIG. 6(a) shows a display example of the talk room before the call, and FIG. 6(b) shows a display example of the talk room after the call.
FIG. 6(a) shows a display example of a certain talk room of the user of the terminal 20, in the state where a message 601 transmitted at 22:11 is displayed. Suppose that, in this state, the user of the terminal 20 makes a call with another user associated with the talk room. The content of this call is recorded and converted into text messages by voice recognition processing, and the text messages are displayed in association with each user involved in the call. That is, as shown in FIG. 6(b), the terminal 20 displays, following the message 601, a call icon 611 in the talk room indicating that a call was made. Date and time information 612 of the call (which may be the start date and time or the end date and time of the call) may or may not be displayed in association with the call icon 611. Following the call icon 611, the terminal 20 then displays the call content, as messages obtained by converting the content of the call into text by voice recognition, in the part enclosed by the dotted line 613. In this way, the terminal 20 can leave information indicating the content of the call in the talk room in the form of messages.
FIG. 7 is a diagram showing a display example when the processing in the terminal 20 shown in FIG. 5 is performed. FIG. 7(a) is a screen diagram showing the state where the messages indicating the content of the call are not displayed, and FIG. 7(b) is a screen diagram showing the state where the messages indicating the content of the call are expanded and displayed.
As shown in FIG. 7(a), the talk room of the messaging application is displayed on the display unit 24 of the terminal 20, and a call icon 611 indicating that a call was made is displayed in the talk room. When the user wants to know the content of the call, the user, as shown in FIG. 7(a), makes a touch input on the call icon 611 with a finger, a stylus, or the like, that is, instructs the expansion of the messages of the call content.
As shown in FIG. 7(a), when a touch input on the call icon 611 is detected in the state where the call messages are not expanded (displayed), the terminal 20 expands the messages indicating the content of the corresponding call, that is, displays them on the display unit 24 as shown in FIG. 7(b). FIG. 7(b) shows an example in which the content of the call is displayed in message form below the call icon 611.
When a touch input on the call icon 611 is detected in the state where the messages indicating the content of the call are displayed, as in the display mode shown in FIG. 7(b), the display processing unit 214 of the terminal 20 can change from the display mode shown in FIG. 7(b) to the display mode shown in FIG. 7(a). The first display mode after a call may be the display mode shown in FIG. 6(b) or the display mode shown in FIG. 7(a). Which display mode serves as the initial one may be configurable by the user of the terminal 20 in the messaging application, and the terminal 20 may display either the display mode shown in FIG. 6(b) or the display mode shown in FIG. 7(a) in accordance with the setting made by the user.
As shown in FIGS. 6 and 7, by having the terminal display the messages indicating the content of a call from the call icon 611, the terminal 20 can remind the user of a conversation the user wants to remember. Although an example of expanding the messages is shown here, the display method of the messages indicating the content of the call is not limited to expansion; as one example and not a limitation, the messages may pop up while the user is touching the vicinity of the call icon 611, or the display may transition to a screen separate from the talk room. An image of a user involved in the call may be displayed as the call icon 611; in that case, it may be displayed in place of the call icon 611 or together with it. As examples and not limitations, a photograph of the user's face, the profile image the user uses on the messaging application, or a photograph of the user's face captured with the in-camera during the call (or a processed version thereof) can be used as the user's image, but the image is not limited to these.
FIG. 8 is a diagram showing one display mode of the call icon 611. FIG. 8(a) shows an example in which the user is bringing a finger close to the call icon 611, and FIG. 8(b) shows an example in which the user's finger has come within a certain distance of the call icon 611. FIG. 8 shows the state where the messages indicating the content of the call are not expanded.
As indicated by the arrow 801 in FIG. 8, suppose the user brings a finger close to the call icon 611a. At this time, the touch panel 231 of the terminal detects the state in which the user's finger is in contact with the touch panel 231, or within a certain proximity of it, and detects the operation position. The control unit 21 of the terminal 20 then determines whether the coordinates on the touch panel 231 indicated by the detected operation position are approaching the display coordinates of the call icon 611a. When it is determined that the user's finger is approaching the call icon 611a, the control unit 21 of the terminal 20 may or may not display the call icon 611b enlarged, as shown in FIG. 8(b). Enlarging the call icon 611b makes it easier for the user to touch it. By touching the enlarged call icon 611b, the user can then perform the operation of toggling the expansion of the messages, as shown in FIG. 7.
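A minimal sketch of this proximity-based enlargement follows; the radius and scale factor are illustrative assumptions.

```python
# Sketch of the Fig. 8 behaviour: enlarge the call icon while a detected
# touch (or hover) position is within some radius of it.
import math

def icon_scale(touch_xy, icon_xy, near_radius=80.0, enlarged=1.5):
    """Return the display scale for the call icon given the current
    touch-panel coordinates (None when no finger is detected)."""
    if touch_xy is None:
        return 1.0
    return enlarged if math.dist(touch_xy, icon_xy) <= near_radius else 1.0

print(icon_scale((110, 200), icon_xy=(100, 210)))  # 1.5 (finger is near)
print(icon_scale((400, 50), icon_xy=(100, 210)))   # 1.0
```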
Although FIGS. 6 and 7 show examples in which the messages are expanded below the call icon 611, the method of displaying the messages indicating the content of the call is not limited to these examples. As one example and not a limitation, the terminal 20 may be configured to display the content of the messages indicating the call content as a pop-up message 901, as shown in FIG. 9(a). Alternatively, the terminal 20 may be configured to transition to a screen separate from the talk room and display the content of the messages there, as shown in FIG. 9(b). In that case, a return icon 902 for returning to the display of the original talk room may or may not be displayed; touching the return icon 902 returns the display to the original talk room.
In the embodiment, the speaker in a call is identified using voice feature amounts, but the system may instead be configured so that, for each utterance, the terminal that acquired the utterance attaches information capable of identifying that terminal (or its user) to the voice signal, so that the speaker of each voice can be distinguished. When a smart speaker picks up the voices of a plurality of users and holds a call with the user of another terminal, each speaker of the voices picked up by the smart speaker may be identified by receiving each speaker's position information together with the voice. By using a directional microphone as the microphone of the smart speaker, the speaker can be distinguished by the direction from which the voice arrives, so the smart speaker can attach to the voice information indicating the direction from which it received the voice, enabling the speakers to be distinguished. This allows the message processing unit 211 to display the messages indicating the call content in association with the speaker. Even when the same speaker continues talking, the voice recognition unit 213 may or may not divide the text data obtained by voice recognition at sentence breaks, conversation breaks, context breaks, and the like, and this division may or may not simply be made at the point where the number of characters exceeds a predetermined number. The voice recognition unit 213 may also delete content related to ambient noise from the text data obtained by voice recognition; this may be realized using a known noise-canceling technique, or by using context analysis to remove unnatural words when they appear in the text data. In addition, the voice recognition unit 213 may or may not delete messages corresponding to backchannel responses from the obtained text data. Alternatively, when a backchannel response is made, it may be expressed using image information (as one example and not a limitation, a stamp showing a nodding response) as information indicating that the response was made.
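As a rough illustration of the backchannel handling, the sketch below drops short interjections or swaps them for a stamp placeholder; the word list and the stamp substitution are assumptions.

```python
# Sketch: drop or replace short backchannel utterances before display.

BACKCHANNELS = {"uh-huh", "mm-hm", "yeah", "i see", "right"}

def filter_backchannels(chunks, replace_with_stamp=False):
    """chunks: list of (speaker, text). Returns the list to display."""
    out = []
    for speaker, text in chunks:
        if text.strip().lower().rstrip(".!") in BACKCHANNELS:
            if replace_with_stamp:
                out.append((speaker, "[nod stamp]"))  # image stand-in
            continue  # otherwise drop the interjection entirely
        out.append((speaker, text))
    return out
```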
<Effects of the embodiment>
The effects of Embodiment 1 are described below.
The user of the terminal 20 according to the above embodiment uses the terminal 20 to make a call with another user via the messaging application provided by the server 10. The terminal 20 then displays information indicating the content of the call in the display area of the display unit 24 of the terminal 20, within the talk room of the messaging application. Specifically, the terminal 20 converts the content of the call into text data by performing voice recognition processing on it, and displays the converted text data in the talk room of the messaging application.
With this configuration, when the user of the terminal 20 later wants to remember the content of a call, checking the information indicating the content of the call can help the user recall it. Moreover, the terminal 20 can convert the content of the call into text messages and display them without forcing the user to perform any special operation.
The terminal 20 may display in the talk room a call icon indicating that a call was made, and may switch between showing and hiding the messages indicating the content of the call according to user input on that call icon.
By hiding the messages, the terminal 20 can prevent the talk room from becoming hard to read due to the enormous volume of messages that displaying the content of a prolonged call would produce, while an input on the call icon expands the messages and lets the user see the content of the call.
The terminal 20 need not display all of the text messages obtained by performing voice recognition processing on the content of the call; it may display only some of them, or all of them. Which display mode is used may be determined by the user's settings on the terminal 20.
When not all of the messages are displayed, the display content of the talk room becomes concise and the talk room becomes easier for the user to operate; displaying only part of the call content balances the conciseness of the talk room with letting the user recognize the content of the call, while displaying everything lets the user recognize the content of the call in more detail. By letting the user select and set which display mode to use, the terminal 20 can offer the user greater convenience.
When a call is made, the terminal 20 may use, as the information indicating that the call was made, an image of the other party (as one example and not a limitation, a face image, or the profile image used on the messaging application), and may further display an image of the user of the terminal 20 (likewise, a face image or the profile image used on the messaging application) as well.
This allows the terminal 20 to let the user recognize at a glance that a call was made and who the other party was.
When converting the content of a call into text data by voice recognition processing, the terminal 20 identifies which user is speaking, and displays the converted text data in correspondence with the identified user, as if it were a message sent to the other user.
This allows the terminal 20 to display the messages while distinguishing the user of the terminal 20 from the user at the other end of the call, so that it can later be confirmed who made each utterance.
The terminal 20 may also be configured to display, in the talk room on the messaging application, image information indicating that a call was made, and to start a call with the user linked to that talk room when there is user input on that image information.
With this configuration, even when the user wants to call the user associated with the talk room once again, the user can easily place the call without any complicated input.
<Embodiment 2>
Embodiment 1 described an example in which an ordinary voice call is made between users of the messaging application. Embodiment 2 describes an example in which a video call is made between users of the messaging application.
FIG. 10 is a flowchart showing an operation example of the terminal when the user makes a video call. The messaging application according to the present embodiment also supports calls by video call, that is, a so-called videophone function. As shown in FIG. 10, the call unit 212 of the terminal 20 starts a video call with the other party via the server 10 (step S1001). This starts when the user of the terminal 20 issues a call instruction on the messaging application, or receives a call from another user.
 When the video call starts, the call unit 212 instructs the camera 234 of the input/output unit 23 to start capturing. As the in-camera, the camera 234 images the display unit 24 side of the terminal 20, that is, the user of the terminal 20. The call unit 212 also instructs the microphone 232 to pick up the speech of the user of the terminal 20. During the video call, the call unit 212 transmits the video captured by the camera 234 and the audio picked up by the microphone 232 to the server 10 via the communication I/F 22, and the server 10 forwards them to the call partner's terminal. In turn, the communication I/F 22 of the terminal 20 sequentially receives, via the server 10, the video and audio transmitted from the call partner's terminal, instructs the display processing unit 214 to display the received video on the display unit 24, and instructs the input/output unit 23 to output the received audio from the speaker 233. During the video call, the call unit 212 stores in the storage unit 28 the video and audio captured by the terminal 20 as well as the video and audio transmitted from the call partner's terminal.
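 A minimal sketch of this media loop is shown below, assuming hypothetical camera, mic, server_link, display, speaker, and recorder objects with the methods used here (none of these are real library APIs); it only illustrates the bidirectional relay and local recording described above.

```python
import threading

def run_video_call(camera, mic, server_link, display, speaker, recorder):
    """Relay own media to the partner via the server while playing back and
    recording the partner's media (illustrative only)."""
    stop = threading.Event()

    def uplink():
        while not stop.is_set():
            frame, audio = camera.read(), mic.read()
            server_link.send(frame, audio)        # to the partner via server 10
            recorder.store_local(frame, audio)    # kept in storage unit 28

    def downlink():
        while not stop.is_set():
            frame, audio = server_link.receive()  # partner's media via server 10
            display.show(frame)                   # display unit 24
            speaker.play(audio)                   # speaker 233
            recorder.store_remote(frame, audio)

    for fn in (uplink, downlink):
        threading.Thread(target=fn, daemon=True).start()
    return stop  # set by the caller at step S1003 to end the call
```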
 The terminal 20 ends the video call when an instruction to end the video call is input at the terminal 20, or when the call partner hangs up (step S1003).
 The voice recognition unit 213 of the terminal 20 performs voice recognition on the recorded audio of the video call (step S1004). The control unit 21 of the terminal 20 may, but need not, identify the user's emotion from the content of the video.
 When the voice recognition unit 213 of the terminal 20 finishes voice recognition, it displays the text messages obtained by the voice recognition in the talk room (step S1005). If the control unit 21 has identified the user's emotion, it may, but need not, display each message in a display mode corresponding to the emotion of the user identified for that message. Here, a display mode corresponding to the user's emotion may be, for example: changing the shape of the bubble (speech balloon) in which the message is displayed (e.g., giving the balloon a jagged outline when the user is angry); appending a character expressing a specific emotion to the message (e.g., appending '#' to the end of the message when the user is angry, or a '♪' symbol when the user is happy); or displaying the text in a color corresponding to the emotion. Alternatively, an emoticon or image information expressing the user's emotion (as one non-limiting example, a sticker) may be displayed together with the message.
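 As a non-limiting sketch of the emotion-dependent display modes just described, one might keep a small mapping from emotion labels to bubble shape, suffix character, and text color; the labels and style fields below are illustrative assumptions, not part of the disclosure.

```python
# Illustrative emotion-to-style table (assumed labels and values).
EMOTION_STYLES = {
    "angry":   {"bubble": "jagged", "suffix": "#", "color": "red"},
    "happy":   {"bubble": "round",  "suffix": "♪", "color": "orange"},
    "neutral": {"bubble": "round",  "suffix": "",  "color": "black"},
}

def style_message(text: str, emotion: str) -> dict:
    """Attach a bubble shape, suffix character, and text color to a message
    according to the emotion identified for its speaker."""
    style = EMOTION_STYLES.get(emotion, EMOTION_STYLES["neutral"])
    return {"text": text + style["suffix"],
            "bubble": style["bubble"],
            "color": style["color"]}

print(style_message("Why are you late?", "angry"))
# {'text': 'Why are you late?#', 'bubble': 'jagged', 'color': 'red'}
```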
 The control unit 21 of the terminal 20 determines whether the user switched to, or activated, the out-camera during the video call (step S1006). When the user of the terminal 20 switches to or activates the out-camera, this can be detected from the user's input to the terminal 20; when the call partner switches to the out-camera and transmits the captured video, an unnatural break occurs in the video, and the switch can be detected by detecting that break.
 If a switch to the out-camera occurred during the video call (YES in step S1006), the control unit 21 displays either one frame of the video captured by the out-camera as a still image, or the video captured while the out-camera was active as a moving image, in association with the messages obtained by converting the content of the video call into text in the talk room (step S1007). In the case of a moving image, it may be, but is not limited to, the video from the moment of switching to the out-camera until the moment of switching back to the in-camera. The insertion position of the still image or moving image may be arbitrary: for example, at the beginning or at the end of the text messages obtained by voice recognition of the video call, or at the point corresponding to the timing at which the switch to the out-camera occurred. If no switch to the out-camera occurred during the video call (NO in step S1006), the process proceeds to step S1008.
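 The detection and insertion logic of steps S1006 and S1007 could be sketched as follows; the event names and insertion policies are illustrative assumptions, not the disclosed implementation.

```python
def extract_outcamera_clip(events: list[tuple[float, str]]) -> tuple[float, float] | None:
    """events: (timestamp, name) pairs such as ("switch_out", "switch_in").
    Returns the (start, end) of the first out-camera interval, if any."""
    start = None
    for ts, name in events:
        if name == "switch_out" and start is None:
            start = ts
        elif name == "switch_in" and start is not None:
            return (start, ts)
    return None

def insert_media(messages: list[dict], media: dict, policy: str = "at_switch") -> list[dict]:
    """Insert the media item first, last, or at the message whose timestamp
    follows the camera switch (the three positions named in the text)."""
    if policy == "first":
        return [media] + messages
    if policy == "last":
        return messages + [media]
    idx = next((i for i, m in enumerate(messages)
                if m["ts"] >= media["ts"]), len(messages))
    return messages[:idx] + [media] + messages[idx:]

print(extract_outcamera_clip([(1.0, "switch_out"), (4.5, "switch_in")]))  # (1.0, 4.5)
```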
 The control unit 21 determines whether there is any input relating to position information during the call (step S1008). Here, an input relating to position information may take any form as long as it is an input of information from which the position of the terminal 20 or the call partner's terminal can be identified. Non-limiting examples include: input of a place name or facility name by voice or by direct input from the user or the call partner; an instruction input from the user to acquire position information (GPS position information); automatic acquisition of position information by an always-on GPS; transmission of position information from the call partner; and input from the user of an image or information from which a position can be identified. If there is no input relating to position information during the call (NO in step S1008), the process ends.
 On the other hand, if there is an input relating to position information during the call (YES in step S1008), the control unit 21 inserts an image related to the position information into the talk room (step S1009). Here, an image related to the position information is an image related to the position of the terminal 20 or the position of the call partner's terminal, and may be any image as long as it is so related.
 If a place name or facility is input by voice or direct input from the user or the call partner during the call, map information including the area around that place name may be acquired and inserted as an image, or map information showing the location of the facility, or a photograph showing the appearance of the facility, may be acquired and inserted.
 When the user inputs an instruction to acquire position information, an image of a surrounding map including the acquired position information may be acquired and inserted. Similarly, when the call partner transmits position information during the call, an image of a surrounding map including the received position information may be acquired and inserted.
 The homepage of the store, facility, or the like where the user (or the call partner) is located may also be accepted as information on the user's position; in that case, the address of the homepage and its representative image may be acquired and inserted, an image of the homepage may be inserted, or map information indicating the location identifiable from the homepage may be acquired and inserted.
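 One way to read the above is as a dispatch from input type to inserted image, sketched below with stub helpers standing in for real geocoding, map, and web services (all hypothetical assumptions for illustration).

```python
def geocode(name: str) -> tuple[float, float]:
    # Stub: a real implementation would query a geocoding service.
    return (35.0, 139.0)

def fetch_map(latlon: tuple[float, float]) -> str:
    # Stub: a real implementation would render or download a map tile.
    return f"map_image_around_{latlon[0]}_{latlon[1]}.png"

def fetch_page_image(url: str) -> str:
    # Stub: a real implementation would fetch the page's representative image.
    return f"representative_image_of_{url}"

def resolve_location_image(kind: str, value):
    """Map the position-related inputs named in the text to an image to insert."""
    if kind == "place_name":   # place or facility name, spoken or typed
        return fetch_map(geocode(value))
    if kind == "gps":          # (lat, lon) acquired from the terminal
        return fetch_map(value)
    if kind == "homepage":     # URL of the store/facility the user is at
        return fetch_page_image(value)
    return None                # no position-related input: insert nothing

print(resolve_location_image("place_name", "AA Mart"))
```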
 Note that the processing of steps S1008 and S1009 may be executed not only for video calls but also for ordinary calls. The number of images inserted is not limited to one and may be any number; a limit on the number may or may not be provided. Furthermore, it is not necessary to perform all three of the following pairs of processing: steps S1004 and S1005, steps S1006 and S1007, and steps S1008 and S1009; at least one of them may be performed, or at least two of the three may be combined and executed.
 In the above, images (still images, moving images) captured when the out-camera was activated (switched to) during a video call are displayed in the talk room as information indicating the content of the call, together with (or without) the messages indicating the content of the call; however, this is not limiting either. First, the image displayed in the talk room as an image showing the content of the call is not limited to one captured by the out-camera and may be one captured by the in-camera. Accordingly, as one example of images captured by the in-camera, the face images of the respective users involved in the call may be displayed in the talk room.
 The display of images is not limited to insertion between messages. For example, an image may be displayed as the background image of the section displaying the messages indicating the content of the call. In this case, the image need not be displayed as the background of all the messages; it may be configured to be displayed only for the period during which the conversation related to the acquired image took place. The period of conversation related to the image can be determined by analyzing the text messages obtained by voice recognition processing of the content of the call. An example of this will be described with reference to FIG. 11.
 A specific example of an input of information relating to position during a call, and of the talk room display at that time, is described below.
 FIG. 11 shows an example of a call and a display example of the talk room displayed after that call. FIG. 11(a) shows part of the call, and FIG. 11(b) shows an example of the situation following FIG. 11(a). FIG. 11(c) shows a display example of the talk room after the call.
 As shown in FIG. 11(a), suppose that the user 10a of the terminal 20a makes a voice call or video call to the user 10b of the terminal 20b asking where the user 10b is. In response, as shown in FIG. 11(b), the user 10b takes a photograph of a nearby facility as information on the place where he or she is.
 When an exchange such as that shown in FIGS. 11(a) and 11(b) takes place during a call, as one example, the terminal 20 inserts into the talk room an image 1101 based on the position information relating to the terminal 20b acquired by the terminal 20b, as shown in FIG. 11(c). Here, the terminal 20b may display the image obtained by the photographing shown in FIG. 11(b) as-is in the talk room as an image relating to the position of the terminal 20b, or may extract position-related information from the photographed image by image recognition processing and then acquire and display an image from the network based on that information. In the example of FIG. 11(b), the wording "AA Mart" is extracted from the image captured by the user 10b with the terminal 20b, that wording is searched for on the Internet, and an image obtained by the search (as one non-limiting example, an image of a homepage) is displayed as shown in FIG. 11(c). In the example of FIG. 11(c), the image 1101 is displayed following the utterance of the user 10b in FIG. 11(b) so as to be synchronized with the timing at which the photograph was taken; however, as described above, the image 1101 may instead be displayed as the background image of the talk room, or may be inserted at the beginning or at the end of the messages indicating the content of the call.
 FIG. 12(a) is a diagram showing a display example in which an image acquired based on information on the position of a terminal is displayed as the background of the talk room, and FIG. 12(b) is a diagram showing a display example of the talk room of FIG. 12(a) scrolled up. As shown in FIG. 12(a), the terminal 20 displays, as the background image of the talk room, an image identified from the information on the position of the terminal identified during the call.
 As shown in FIG. 12(a), the terminal 20 displays, as the background image of the talk room, an image acquired during the call (non-limiting examples include an image relating to the position of the terminal, an image input by the user during the call, an image photographed by the user during the call, and an image relating to the content of the call), and displays the messages indicating the content of the call superimposed on that background image. By displaying as the background of the messages an image identified from the position information on the terminal identified during the call, as shown in FIG. 12(a), the content of the call can be made easier for the user to recall, together with the content of the messages themselves. The background image may, but need not, be displayed only during the section T1 in which the messages of the related topic are displayed. That is, as shown in FIG. 12(b), in the section T2 an image based on the information on the position of the terminal acquired during the topic is displayed as the background image, while in the section T3 no background image is displayed. By linking the display section of the messages of a topic related to an image with the display section in which the image acquired based on the terminal position information obtained during that topic is displayed as the background, the atmosphere at the time of the call can be reproduced, making the content of the call easier for the user to recall.
 Another display example is described with reference to FIG. 13. FIG. 13(a) shows part of a call, and FIG. 13(b) shows an example of the situation following FIG. 13(a). FIG. 13(c) shows a display example of the talk room displayed on the terminal 20 when the call shown in FIGS. 13(a) and 13(b) is made.
 As shown in FIG. 13(a), the user 10a proposes, via a voice call or video call, that the user 10b of the terminal 20b visit a certain place, and in response the user 10b asks for a description of that place.
 In response to the request from the user 10b, the user 10a inputs position information during the call using his or her own terminal 20a. This input of position information may be, for example, an instruction input to acquire position information if the user is at (or near) the destination store, a direct input of the position information of the destination as recognized by the user (non-limiting examples include latitude/longitude information and address information), or a web page carrying information related to the destination.
 When a call including exchanges such as those shown in FIGS. 13(a) and 13(b) takes place, the terminal 20 inserts between the messages and displays, based on the position information input in FIG. 13(b), a map 1301 showing the location of the destination, as shown in FIG. 13(c). The image related to the position information is not limited to the map 1301 and may be another image: for example, image information relating to the homepage of the destination, or its address information.
 As shown in FIGS. 12 and 13, the terminal 20 can not only display messages based on the conversation between the users during the call, but also automatically collect and display images based on the position information input during the call. As a result, when a call is made through the talk room, the terminal 20 can provide more information indicating the content of that call.
 In the above, the voice recognition unit 213 executes voice recognition after the video call ends; however, this is not limiting, and the recognition may be executed during the call. Furthermore, when the user makes a call using the terminal 20 in speakerphone mode, the terminal 20 may perform voice recognition in real time so that messages analyzed and converted in real time are displayed in the talk room while the call is in progress. In this way, even for a video call, the terminal 20 can display the content of the users' conversation in that video call as messages in the talk room. It is also conceivable that some kind of lesson, specifically an English conversation (language) lesson, is conducted over a video call; in such a case, the terminal 20 may collect more appropriate phrasings in that language from a network or the like and display them together with the text messages.
<Effects of Embodiment>
 The effects of Embodiment 2 are described below.
 The user of the terminal 20 makes a call with another user by video call via the messaging application provided by the server 10. In this case, the terminal 20 may display in the talk room an image photographed during the call (including a video call), or photographed by the call partner's terminal, or information based on such an image.
 This allows the terminal 20 to make the content of the conversation during the call easier for the user to recall.
 Based on a camera-switching instruction from the user, the terminal 20 may display in the talk room an image photographed during the call by the out-camera, which is provided on the side opposite to the side on which the display unit 24 of the terminal 20 is located.
 When an image is photographed with the out-camera, particularly during a video call, it is highly likely that the photographing was closely related to the content of the call; by displaying information based on that image in the talk room, the user can more easily be reminded later of the content of the call. In addition, by displaying the image photographed by the out-camera in the talk room with the user's switching of the active camera (from the in-camera to the out-camera) as the trigger, information that makes the content of the call easier to recall can be generated and displayed automatically.
 The terminal 20 may also display in the talk room an image input by the user during the video call. In this case, the terminal 20 may display the image between the messages indicating the content of the call so as to match the timing at which the image was input; however, this is not limiting, and the image may be displayed at the beginning or at the end of the messages relating to the call.
 This allows the terminal 20 to make it easier for the user to remember the content of the call by looking at the image later.
 The terminal 20 may also display an image acquired during the call as the background image of the messages indicating the content of the call.
 In this way, by checking the content of the messages (text) indicating the call content while also seeing, as a background image, the images viewed, photographed, or acquired during the call, the user can more easily recall the content of the call.
 In addition to images input by the user and images obtained by photographing during the call, the terminal 20 may acquire an image based on information on the position of the terminal 20, or on the position of the call partner's terminal, and display it in association with the messages.
 By acquiring an image based on the position of the terminal 20 during the call or the position of the call partner's terminal, the terminal 20 can, as one non-limiting example, remind the user of the content of the call by letting the user recognize where the call was made or where the call partner was.
 When a call is made, the terminal 20 may also display images in accordance with the content of the call. After converting the content of the call into text messages by voice recognition processing, the terminal 20 analyzes the content of the call by morphological analysis, context analysis, and the like, and displays highly relevant images based on the results of the analysis. As one non-limiting example, if the call includes a topic about a certain store, the terminal 20 may display a photograph of that store as an image associated with the messages; if the call includes a topic about a certain food, it may display a photograph of that food as an image associated with the messages.
 By displaying images highly relevant to the topic, the terminal 20 can make it easy for the user to recall the content of the call.
<Embodiment 3>
 FIG. 14 is a flowchart showing an operation example of processing for realizing a display mode that allows the user to easily recognize the content of a call made in the talk room. The terminal 20 may or may not execute the processing shown in FIG. 14. Although not illustrated, the terminal 20 may also be configured so that whether or not to execute the processing shown in FIG. 14 can be selectively set according to input from the user. The processing shown in FIG. 14 is an example of the processing from step S404 in FIG. 4 onward.
 As shown in FIG. 14, the voice recognition unit 213 executes voice recognition processing on the recorded audio (step S404).
 The control unit 21 determines the amount of text in the text data obtained by the voice recognition unit 213 through voice recognition (step S1405). As one non-limiting example, the control unit 21 may use the number of characters in the text data, or the data size of the text data, as the amount of text. The control unit 21 then determines the display size of the call icon 611 based on the determined amount of text (step S1406). Specifically, the control unit 21 determines the display size so that the larger the amount of text, the larger the display size of the call icon 611. As one non-limiting example, the control unit 21 may determine the display size by a predefined function that takes the amount of text as input, or may store in advance in the storage unit 28 a table in which display sizes are defined for ranges of text amounts and determine the display size according to that table. Although the display size of the call icon 611 is determined here based on the amount of text after conversion, the length of the call time may be used instead. That is, since a longer call suggests a denser conversation, the display size of the call icon 611 is made larger; since a shorter call suggests a more concise conversation, the display size of the call icon 611 is made smaller.
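 Both the table-based size determination of step S1406 and the color-depth variant described later with reference to FIG. 16 reduce to a monotone mapping from transcript length (or call duration) to a visual attribute. The breakpoints and pixel sizes in this sketch are illustrative assumptions, not values from the disclosure.

```python
SIZE_TABLE = [        # (text-length upper bound, icon size in pixels) - assumed values
    (100, 48),
    (500, 64),
    (2000, 80),
]

def icon_size(char_count: int) -> int:
    """Larger transcripts yield a larger call icon (table-based variant)."""
    for upper, size in SIZE_TABLE:
        if char_count <= upper:
            return size
    return 96  # anything beyond the last breakpoint

def icon_shade(char_count: int, max_chars: int = 2000) -> float:
    """Color-depth variant: 0.0 = lightest, 1.0 = darkest (cf. FIG. 16)."""
    return min(char_count / max_chars, 1.0)

print(icon_size(250), icon_shade(250))  # -> 64 0.125
```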
 The control unit 21 also executes context analysis on the text using morphological analysis and the like (step S1407). This can be realized using existing text-mining techniques. From the analysis results, the control unit 21 then determines a heading presumed to be appropriate as the title of the call content (step S1408). As one non-limiting example, this heading may use wording that appears frequently in the analyzed text data, or wording inferred as some kind of schedule from the analysis results of the text data. The wording used for the heading may be based on what the user of the terminal 20 said, on what the call partner said, or on both. Furthermore, when the conversation includes content relating to a schedule, the terminal 20 may, but need not, start a schedule management application, separate from the messaging application, and register that schedule on a calendar.
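 As one non-limiting sketch of steps S1407 and S1408, a heading can be derived from word frequencies in the transcript. A production system would use proper morphological analysis; the tokenizer and stop-word list here are illustrative assumptions.

```python
from collections import Counter
import re

STOP_WORDS = {"the", "a", "an", "is", "are", "to", "of", "on", "and", "we"}

def derive_heading(transcript: str, max_words: int = 2) -> str:
    """Pick the most frequent content words as a candidate heading."""
    words = [w for w in re.findall(r"[A-Za-z']+", transcript.lower())
             if w not in STOP_WORDS]
    most_common = [w for w, _ in Counter(words).most_common(max_words)]
    return " ".join(most_common).title()

print(derive_heading(
    "Saturday works for me. Let's have a drinking party on Saturday. "
    "A drinking party sounds great."
))  # -> "Saturday Drinking" (frequency-based, illustrative only)
```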
 The control unit 21 then causes the display processing unit 214 to display the call icon 611 in the talk room at the determined display size, together with the determined heading associated with that call icon 611 (step S1409), and the processing ends.
 FIG. 15 shows a display example in which the size of the call icon is changed according to the call volume (the amount of text in the messages of the call content). FIG. 15(a) shows a display example of the call icon 1501 displayed when the call volume is relatively small. Note that, for readability, FIG. 15 shows a state in which the messages are not expanded. FIG. 15(b) shows a display example of the call icon when a call with a larger call volume than the call corresponding to the call icon 1501 in FIG. 15(a) has been made. As shown in FIG. 15(b), the call icon 1502 is displayed at a larger size than the call icon 1501 shown in FIG. 15(a). As shown in FIG. 15, by displaying the call icon according to the call volume, the user can intuitively recognize at a glance how much was talked about.
 The method of indicating the call volume is not limited to the size of the call icon, as described above. For example, as shown in FIG. 16, the call volume may be expressed by the shading of the color of the call icon. In FIG. 16, colors are indicated by hatching. FIG. 16(a) shows the call icon 1601 when the call volume was relatively small. In contrast, FIG. 16(b) shows a display example of the call icon 1602 for a call volume larger than that of the call corresponding to the call icon 1601 in FIG. 16(a). As shown by the call icon 1602, when the call volume is larger than that of the call corresponding to the call icon 1601 shown in FIG. 16(a), the call volume is indicated by displaying the call icon 1602 in a darker color. That is, the shading of the color of the call icon allows the user to recognize the call volume at a glance. In this way, the call volume can be expressed by the display mode of the call icon.
 The image displayed as the call icon is not limited to a symbol indicating a call as shown in FIGS. 15 and 16; it may be an image related to the call, or a combination of such an image with the icon. That is, an image related to the call (an image relating to the position of the terminal during the call) may be displayed as the background image of the call symbol of the call icons shown in FIGS. 15 and 16. Specifically, as shown in FIG. 17, the terminal 20 may display, at the display position where the call icon would be displayed, an image related to the call (non-limiting examples of call-related information include an image relating to the position of the terminal, an image input by the user during the call, an image photographed by the user during the call, and an image relating to the content of the call) as a substitute for the call icon. In this case, as shown in FIG. 17(a), the terminal 20 may display part of the call-related image acquired during the call within the outline of the call icon, as shown by the image 1701, or, as shown in FIG. 17(b), may display the call-related image as-is, as shown by the image 1702, regardless of the outline of the call icon. Furthermore, when displaying the image 1702 as shown in FIG. 17(b), the terminal 20 may also display the call icon 1801, as shown in FIG. 18(a), to make it clear that the displayed image 1702 relates to a call. In FIG. 18(a), the call icon 1801 is displayed superimposed on the image 1702; however, it may be displayed outside the frame of the image 1702 as long as it can be understood that the call icon 1801 is associated with the image 1702. The terminal 20 may further enlarge the image 1702 as shown in FIG. 18(b) upon detecting a touch input from the user on a portion of the image 1702 other than the call icon 1801. At this time, the call icon 1801 may or may not be displayed; FIG. 18(b) shows an example in which the call icon 1801 is not displayed. In addition, when the terminal 20 detects a touch input from the user on the call icon 1801 shown in FIG. 18(a), the call unit 212 of the terminal 20 may be configured to initiate a call to the user corresponding to the talk room.
 FIG. 19 is a diagram showing a display example when a heading is attached to the content of a call. FIG. 19(a) shows an example in which the audio of a call has been converted into text messages by voice recognition processing and displayed as messages in the talk room. As shown in the messages of FIG. 19(a), it can be understood that the users are making plans for a drinking party. Given such an exchange, the control unit 21 of the terminal 20 performs morphological analysis and context analysis on the text of the messages and identifies, as one example, that a drinking party will be held and that it will be held on Saturday. The control unit 21 of the terminal 20 then displays a heading 1902 indicating the content of the call in association with the call icon 1901. In the example shown in FIG. 19(b), the heading 1902 with the content "Saturday drinking party" is displayed. In this way, the terminal 20 can display not only messages indicating the content of a call but also a heading 1902 indicating the content of that call. This allows the user to recognize the content of the call without reading all the messages indicating the content of the call.
 While FIG. 19 shows an example in which a heading is attached, the terminal 20 may instead, to make the content of the call even easier to recognize, convert the content of the call into text messages by voice recognition processing, then recognize the content of the call using analysis techniques such as morphological analysis and context analysis, and display a summarized text. Summarizing addresses the problem that, when a call runs long and the amount of text to be displayed as messages grows, displaying everything would require time and effort for the user to read; by summarizing, the displayed messages are simplified while still allowing the user to recognize the content of the call. The summary may or may not be displayed as if it were the conversation of one of the users involved in the call. Also, when the conversation includes content relating to some kind of schedule, the summary may, but need not, always include that schedule.
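 A naive extractive summarizer along these lines is sketched below: it keeps the sentences whose words are most frequent in the transcript. The actual summarization technique is left open by the text, so this frequency heuristic is purely illustrative.

```python
from collections import Counter
import re

def summarize(transcript: str, max_sentences: int = 2) -> str:
    """Keep the highest-scoring sentences, re-emitted in original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", transcript) if s.strip()]
    freq = Counter(re.findall(r"[A-Za-z']+", transcript.lower()))
    # Score each sentence by the total corpus frequency of its words.
    scored = sorted(
        sentences,
        key=lambda s: sum(freq[w] for w in re.findall(r"[A-Za-z']+", s.lower())),
        reverse=True,
    )
    kept = set(scored[:max_sentences])
    return " ".join(s for s in sentences if s in kept)

print(summarize("Are you free on Saturday? Yes. Then let's have a drinking "
                "party on Saturday. Sounds good, Saturday it is."))
```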
<Effects of Embodiment>
 The effects of the embodiment are described below.
 The terminal 20 may display the image information indicating that a call has been made (as one non-limiting example, the call icon) based on the amount of characters in the text data obtained by voice recognition of the call content, or on the length of the call time. Displaying the call icon based on the character amount or call time may mean changing the display size of the call icon, or changing its display color, according to how large the character amount is or how long the call time is.
 This makes it easier to infer the liveliness and volume of the conversation at that time merely by looking at the size of the call icon, without viewing the content of the messages about the call, which can be one factor in reminding the user of the content of the call.
 In addition to displaying the call icon as an image indicating that a call was made in the talk room, the terminal 20 may display in the talk room an image relating to the call (non-limiting examples include an image relating to the position of the terminal, an image input by the user during the call, an image photographed by the user during the call, and an image relating to the content of the call). That is, instead of the call icon, an image relating to the content of the call may be displayed in the talk room as information indicating that the call was made.
 In this way, by displaying an image of the actual call rather than a call icon as the call-related image, the user can be made to recognize the content of the call without viewing the messages indicating the call content.
 When displaying information indicating the content of a call, the terminal 20 may also analyze the call content, perform processing to convert the analysis results into a summary sentence indicating the content of the call, and display that summary. As one non-limiting example, this may be realized by generating a learning model through a learning process that uses, as teacher data, the contents of calls between users and summaries of those contents, and creating the summary by inputting the text data obtained by the voice recognition processing into that learning model.
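 The learning-based variant is described only at the level of teacher data, that is, pairs of call transcripts and summaries. The toy "model" below merely returns the summary of the most word-overlapping training transcript; it stands in for whatever learning model is actually used and is purely illustrative.

```python
from collections import Counter
import re

def _bag(text: str) -> Counter:
    return Counter(re.findall(r"[A-Za-z']+", text.lower()))

class OverlapSummarizer:
    def __init__(self):
        self.examples: list[tuple[Counter, str]] = []

    def train(self, pairs: list[tuple[str, str]]):
        """pairs: (call transcript, human-written summary) teacher data."""
        self.examples = [(_bag(t), s) for t, s in pairs]

    def summarize(self, transcript: str) -> str:
        query = _bag(transcript)
        # Return the summary of the most word-overlapping training transcript.
        best = max(self.examples, key=lambda ex: sum((ex[0] & query).values()))
        return best[1]

model = OverlapSummarizer()
model.train([("Let's meet Saturday for drinks.", "Drinks on Saturday"),
             ("Can you send the report today?", "Report due today")])
print(model.summarize("Drinks this Saturday then? Sure, Saturday."))
```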
 This allows the terminal 20 to make the user recognize the content of the call in simple form. Moreover, by displaying a summary rather than the entirety of a long call, the convenience of the messaging application can be improved, and the visual design of the display can also be improved.
1 Communication system
  10 Server
    11 Control unit
      111 Message processing unit
    12 Input/output unit
    13 Display unit
    14 Communication I/F (communication unit)
  20 Terminal
    21 Control unit
      211 Message processing unit
      212 Call unit
      213 Voice recognition unit
      214 Display processing unit
    22 Communication I/F
    23 Input/output unit
      231 Touch panel
      232 Microphone
      233 Speaker
      234 Camera
    24 Display unit (display)
    25 Position information acquisition unit
    28 Storage unit
  30 Network

Claims (22)

  1.  An information processing method for a terminal that transmits content to a first terminal or receives content transmitted from the first terminal, the method comprising:
     displaying, in a display area of the terminal, first content transmitted from the first terminal and second content transmitted to the first terminal by a communication unit of the terminal;
     performing, by a control unit of the terminal, control relating to a call with the first terminal based on an input by a user of the terminal to the display area displaying the first content and the second content;
     acquiring, by the control unit, first information based on a voice of a user of the first terminal and second information based on a voice of the user of the terminal, based on the call with the first terminal; and
     displaying, in the display area, call information based on the first information and the second information.
  2.  The information processing method according to claim 1, further comprising:
     acquiring, by the control unit of the terminal, at least an image captured during the call; and
     controlling, by the control unit, a display mode in which the call information is displayed in the display area, based on the image.
  3.  The information processing method according to claim 1, further comprising:
     acquiring, by the control unit of the terminal, an image captured during the call; and
     displaying the call information and the image in the display area.
  4.  The information processing method according to claim 3, further comprising:
     controlling, by the control unit, capture of the image based on activation of an imaging unit of the terminal on a side opposite to the display area.
  5.  The information processing method according to claim 3 or 4, wherein
     the image is a moving image, and
     the method further comprises displaying an image input by the user of the terminal while the moving image is being captured, between the first information and the second information.
  6.  The information processing method according to claim 3 or 4, wherein the call information is displayed in the display area superimposed on the image.
  7.  The information processing method according to claim 1, further comprising:
     acquiring, by the control unit, an image based on the terminal; and
     displaying the call information and the image in the display area.
  8.  The information processing method according to claim 7, wherein the image is acquired based on information on a position of the terminal based on the call, or on information on a position of the first terminal.
  9.  The information processing method according to any one of claims 1 to 8, further comprising:
     displaying, in the display area, a first display for displaying the call information; and
     displaying the call information in the display area based on an input by the user of the terminal to the first display.
  10.  The information processing method according to claim 9, further comprising controlling, by the control unit, a display mode of the first display based on at least one of a duration of the call and the call information.
  11.  The information processing method according to claim 9 or 10, wherein the first display includes at least a part of the call information.
  12.  The information processing method according to claim 9 or 10, wherein the first display is an image related to the call.
  13.  The information processing method according to claim 12, wherein the image related to the call is an image related to a position of the terminal, or of the first terminal, at the time of the call.
  14.  The information processing method according to any one of claims 1 to 13, wherein the call information includes content based on the first information and content based on the second information.
  15.  The information processing method according to claim 14, further comprising:
     displaying, in the display area, an image representing the user of the first terminal corresponding to the content based on the first information; and
     displaying, in the display area, an image representing the user of the terminal corresponding to the content based on the second information.
  16.  The information processing method according to claim 15, wherein
     the first information is identified as voice information of the user of the first terminal based on information of the first terminal, and
     the second information is identified as voice information of the user of the terminal based on information of the terminal.
  17.  The information processing method according to any one of claims 1 to 13, wherein the call information is information in which the content of the call is summarized based on the first information and the second information.
  18.  The information processing method according to any one of claims 1 to 17, further comprising making, by the control unit, a setting not to display the call information in the display area.
  19.  The information processing method according to any one of claims 1 to 18, wherein, in the control relating to the call with the first terminal, the first content, the second content, and an image for controlling the call are displayed in the display area, and control relating to starting the call with the first terminal is performed by the control unit based on an input by the user of the terminal to the image for controlling the call.
  20.  The information processing method according to any one of claims 1 to 19, wherein
     the first content, the second content, and the call information are displayed in the display area by application software stored in a storage unit of the terminal, and
     the call is executed by the application software.
  21.  A program causing a computer of a terminal that transmits content to a first terminal or receives content transmitted from the first terminal to execute:
     displaying, in a display area of the terminal, first content transmitted from the first terminal and second content transmitted to the first terminal by a communication unit of the terminal;
     performing, by a control unit of the terminal, control relating to a call with the first terminal based on an input by a user of the terminal to the display area displaying the first content and the second content;
     acquiring, by the control unit, first information based on a voice of a user of the first terminal and second information based on a voice of the user of the terminal, based on the call with the first terminal; and
     displaying, in the display area, call information based on the first information and the second information.
  22.  A terminal that transmits content to a first terminal or receives content transmitted from the first terminal, the terminal comprising:
     a display unit that displays first content transmitted from the first terminal and second content transmitted to the first terminal by a communication unit of the terminal; and
     a control unit that performs control relating to a call with the first terminal based on an input by a user of the terminal to the display unit displaying the first content and the second content, wherein
     the control unit acquires first information based on a voice of a user of the first terminal and second information based on a voice of the user of the terminal, based on the call with the first terminal, and
     the display unit displays call information based on the first information and the second information.
PCT/JP2019/045439 2019-03-19 2019-11-20 Information processing method, program, and terminal WO2020188885A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2019-051969 2019-03-19
JP2019051969A JP6832971B2 (en) 2019-03-19 2019-03-19 Programs, information processing methods, terminals

Publications (1)

Publication Number Publication Date
WO2020188885A1 true WO2020188885A1 (en) 2020-09-24

Family

ID=72520751

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2019/045439 WO2020188885A1 (en) 2019-03-19 2019-11-20 Information processing method, program, and terminal

Country Status (2)

Country Link
JP (2) JP6832971B2 (en)
WO (1) WO2020188885A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7489152B2 2022-02-25 2024-05-23 Bsize Inc. Information processing terminal, information processing device, information processing method, and information processing program


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017517228A * 2014-05-23 2017-06-22 Samsung Electronics Co., Ltd. System and method for providing voice/text call service
US20160205049A1 * 2015-01-08 2016-07-14 LG Electronics Inc. Mobile terminal and controlling method thereof
JP2016149158A * 2016-04-28 2016-08-18 Casio Computer Co., Ltd. Method of generating social timeline, social network service system, server, terminal, and program

Also Published As

Publication number Publication date
JP7057455B2 (en) 2022-04-19
JP6832971B2 (en) 2021-02-24
JP2021119455A (en) 2021-08-12
JP2020154652A (en) 2020-09-24

Similar Documents

Publication Publication Date Title
CN106024014B (en) A kind of phonetics transfer method, device and mobile terminal
US8373799B2 (en) Visual effects for video calls
EP2210214B1 (en) Automatic identifying
JP2023539820A (en) Interactive information processing methods, devices, equipment, and media
JP6219642B2 (en) Intelligent service providing method and apparatus using input characters in user device
EP2607994A1 (en) Stylus device
KR20170048964A (en) Method and apparatus of providing message, Method and apparatus of controlling display and computer program for executing one of the method
JP2005346252A (en) Information transmission system and information transmission method
CN113259740A (en) Multimedia processing method, device, equipment and medium
KR20110052898A (en) Method for setting background screen and mobile terminal using the same
EP2747464A1 (en) Sent message playing method, system and related device
KR20140078258A (en) Apparatus and method for controlling mobile device by conversation recognition, and apparatus for providing information by conversation recognition during a meeting
JP2022020659A (en) Method and system for recognizing feeling during conversation, and utilizing recognized feeling
CN110704647A (en) Content processing method and device
US10965629B1 (en) Method for generating imitated mobile messages on a chat writer server
US9110888B2 (en) Service server apparatus, service providing method, and service providing program for providing a service other than a telephone call during the telephone call on a telephone
JP7057455B2 (en) Programs, information processing methods, terminals
KR102086780B1 (en) Method, apparatus and computer program for generating cartoon data
US20140129228A1 (en) Method, System, and Relevant Devices for Playing Sent Message
KR20150129182A (en) Method and apparatus of providing messages
JP7307228B2 (en) program, information processing method, terminal
KR20140097668A (en) Method for providing mobile photobook service based on online
CN115048949A (en) Multilingual text replacement method, system, equipment and medium based on term base
CN113450762A (en) Character reading method, device, terminal and storage medium
CN112562733A (en) Media data processing method and device, storage medium and computer equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19920164

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19920164

Country of ref document: EP

Kind code of ref document: A1