WO2022196769A1 - Information processing method, program and information processing device - Google Patents


Info

Publication number
WO2022196769A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
state
information
communication device
call
Prior art date
Application number
PCT/JP2022/012347
Other languages
French (fr)
Japanese (ja)
Inventor
未知 佐藤
Original Assignee
株式会社チカク
Priority date
Filing date
Publication date
Application filed by 株式会社チカク (Chikaku Inc.)
Publication of WO2022196769A1 publication Critical patent/WO2022196769A1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M3/00Automatic or semi-automatic exchanges
    • H04M3/42Systems providing special services or facilities to subscribers

Definitions

  • the present disclosure relates to an information processing method, a program, and an information processing device.
  • An information processing method includes: obtaining image data and/or sound data by a communication device; determining a state of a user using the communication device based on the image data and/or the sound data; and outputting the information about the state of the user to another communication device associated with the communication device before the call.
  • FIG. 1 is a diagram for explaining a system overview of the present disclosure.
  • FIG. 2 is a diagram showing an example of a schematic configuration of the information processing system 1 according to the first embodiment.
  • A diagram showing an example of the hardware configuration of the server according to the first embodiment.
  • A diagram showing an example of the hardware configuration of the user terminal according to the first embodiment.
  • A diagram illustrating an example of the functions of each device of the information processing system according to the first embodiment.
  • A diagram showing screen example 1 including the state of a call destination user according to the first embodiment.
  • A diagram showing screen example 2 including the state of a call destination user according to the first embodiment.
  • A diagram showing screen example 3 including the state of a call destination user according to the first embodiment.
  • A diagram showing screen example 4 including the state of a call destination user according to the first embodiment.
  • A flowchart showing an example of operation processing of the server according to the first embodiment.
  • A flowchart showing an example of operation processing on the side that displays a state according to the first embodiment.
  • FIG. 10 is a diagram illustrating an overview of a system according to a second embodiment of the present disclosure;
  • FIG. 1 is a diagram for explaining the system outline of the present disclosure.
  • A user UA uses an information processing device 20A (first communication device) such as a mobile terminal, and a user UB (for example, a child) uses an information processing device 20B (second communication device) such as a mobile terminal.
  • the server 10 is an information processing device capable of controlling IP (Internet Protocol) phone calls and videophone calls.
  • FIG. 1 includes an example in which a call is realized by P2P communication using IP telephony, WebRTC (Web Real-time Communication) technology, or the like.
  • the information processing device 20A turns on the camera and the microphone to acquire images and sounds around the information processing device 20A.
  • The information processing device 20A transmits media data related to the acquired image data and/or sound data (hereinafter also referred to as "image data/sound data") to the server 10.
  • the information processing device 20A obtains image data/sound data by turning on at least one of the camera and the microphone at all times or periodically.
  • the server 10 receives media data related to image data/sound data from the information processing device 20A, and determines the state of the user UA who uses the information processing device 20A based on this media data.
  • the server 10 performs image analysis and sound analysis, for example, to determine the current state of the user UA.
  • The current state of the user UA includes, for example, at least one of: a state regarding whether or not the user UA is near the information processing device 20A, based on an image; an emotional state based on the facial expression of the user UA; a state of the place based on sound information around the user UA; a state related to liveliness; and a state indicating whether or not the user UA is available for a call.
  • the server 10 transmits information about the determined state of the user UA (hereinafter also referred to as "state information") to the information processing device 20B associated with the identification information (ID) of the information processing device 20A.
  • the server 10 holds, for example, call handling information that associates identification information between devices that can communicate with each other.
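As a concrete illustration, such call handling information can be pictured as a bidirectional mapping between device identifiers. The sketch below is hypothetical (the function names and the IDs "20A"/"20B" are invented for illustration), not the disclosed implementation:

```python
# Hypothetical sketch of call handling information: a table associating
# identification information (IDs) of devices that can call each other.
call_handling = {}  # device ID -> set of associated device IDs

def associate(device_a, device_b):
    """Register two devices as mutually callable."""
    call_handling.setdefault(device_a, set()).add(device_b)
    call_handling.setdefault(device_b, set()).add(device_a)

def peers_of(device):
    """Devices associated with `device` (e.g., recipients of its state)."""
    return call_handling.get(device, set())

associate("20A", "20B")
```

With such an association in place, state information determined for device 20A would be sent to every device returned by `peers_of("20A")`.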
  • the information processing device 20B used by the user UB displays a screen informing the user UB of the state of the user UA on the display screen.
  • the information processing device 20B displays a UI component M1 including a video, animation, text, or the like indicating the current state of the user UA.
  • the UI component M1 is an icon, window, popup, or a predetermined area within the application screen.
  • Current or past image data including the user UA acquired by the information processing device 20A may be displayed. At this time, in order to protect privacy, persons other than the user UA may be mosaicked in the image, or the background may be replaced with a different one.
  • the information processing device 20B accepts an operation to make a call from the user UB to the user UA. At this time, the information processing device 20B transmits a call request to the server 10 .
  • When the server 10 receives a call request from the information processing device 20B, it transmits the call request to the information processing device 20A associated with the identification information of the information processing device 20B. When the information processing device 20A receives the call request and accepts an answering operation from the user UA, it transmits a call response to the server 10. After that, when a session is established between the information processing devices 20A and 20B, transmission and reception of data for the call (hereinafter also referred to as "call data") starts. If the server 10 does not receive a call response within a predetermined period of time, it may send a message to that effect to the information processing device 20B and discard the call request.
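The request/response flow above can be summarized by the following sketch. It is an illustrative simplification (synchronous, with invented names and a fixed timeout), not the server's actual call-control code:

```python
# Simplified outcome logic for a call request, per the flow described:
# forward the request, wait for a call response, and discard the
# request (notifying the caller) if none arrives within the timeout.
def handle_call_request(callee_answered, waited_s, timeout_s=30.0):
    """Return the outcome of a forwarded call request."""
    if callee_answered and waited_s <= timeout_s:
        return "session_established"  # call data relay can start
    return "request_discarded"        # caller is notified, request dropped
```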
  • The user UB can appropriately grasp the current state of the callee user UA before the call just by looking at the display screen of the information processing device 20B. For example, if the UI component M1 shows that the user UA is in the room and relaxed, the user UB can judge that a call is acceptable and instruct the information processing device 20B to start the call.
  • The user UA's state is thus conveyed to the user UB without the user UA being particularly conscious of it, and the user UB can confirm the state of the user UA and grasp a suitable timing for the call without performing any special operation.
  • Since image data and sound data are used to determine the state, the state of the user UA can be determined appropriately, and the call can be started at an appropriate time.
  • Similar processing may be performed from the user UB side to the user UA side. That is, the state of the user UB is displayed on the information processing device 20A of the user UA, and the user UA can grasp the current state of the user UB.
  • FIG. 2 is a diagram showing an example of a schematic configuration of the information processing system 1 according to the first embodiment.
  • The information processing system 1 is configured by connecting a server 10 and user terminals 20A and 20B via a network N.
  • symbol 20 is used when it is not necessary to distinguish a user terminal individually.
  • the number of servers 10 connected to the network N may be plural.
  • The server 10 is, for example, an information processing device capable of transmitting and receiving IP-packetized voice data and image data, and performs call control such as calling (outgoing), receiving (incoming), answering, and disconnecting a telephone call. For call control, signaling protocols such as H.323, MGCP (Media Gateway Control Protocol), and SIP (Session Initiation Protocol) may be used.
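For orientation, a SIP request such as those exchanged under these protocols begins with a request line of the form `METHOD URI SIP/2.0`. The snippet below sketches only that first line; it is not the server's signaling code, and the URI is a made-up example:

```python
# Build the request line of a SIP message (e.g., an INVITE sent when
# originating a call). Only the first line of the message is shown.
def sip_request_line(method, request_uri):
    return f"{method} {request_uri} SIP/2.0"

invite = sip_request_line("INVITE", "sip:userUA@example.com")
```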
  • the user terminal (communication device) 20 is an information processing device that can access a network.
  • Media computer platforms include set-top boxes, digital video recorders, etc.
  • handheld computing devices include personal digital assistants (PDAs), email clients, etc.
  • Examples of wearable terminals include, without limitation, glasses-type devices and watch-type devices.
  • the network N can be configured to include multiple types of communication lines, communication networks, and various network devices.
  • The network N may include, for example, base stations and wireless LAN access points (WiFi routers, etc.) to which the user terminals 20 connect wirelessly, a mobile communication network connected to the base stations, a public line such as a telephone line, cable television line, or optical communication line connected from an access point via a router or modem, the Internet to which the server 10 is connected, and gateway devices connecting the public line, the mobile communication network, and the Internet.
  • the configuration of the network N is not limited to the above example.
  • the user terminal 20B receives the state information of the first user acquired via the server 10, and generates state notification information capable of notifying this state information. It is assumed that the second user, for example, sees the status notification information displayed on the image display screen, determines whether or not the first user can make a call, and performs an operation to start a call. At this time, the user terminal 20B transmits a call request to the user terminal 20A via the server 10.
  • When the user terminal 20A receives the call request and the first user performs an operation to answer, the user terminal 20A transmits a call response to the server 10. As a result, a call session is established, call data including call content such as voice and images is IP-packetized, and the IP-packetized call data is transmitted and received between the user terminals 20A and 20B, so that a call is made.
  • FIG. 3 is a diagram showing an example of the hardware configuration of the server 10 according to the first embodiment.
  • the server 10 has a control section 102 , a communication interface 104 and a storage section 106 , each section being connected via a bus line 112 .
  • The control unit 102 includes a CPU (Central Processing Unit), ROM (Read Only Memory), RAM (Random Access Memory), and the like. By executing an application or the like stored in the storage unit 106, the control unit 102 may be configured to realize, in addition to the functions of a general web server, functions for performing call control such as call origination, call reception, answering, and disconnection. The control unit 102 may also be capable of executing a voice recognition function on acquired sound data and an object recognition function on acquired image data.
  • the communication interface 104 controls communication with the user terminal 20 via the network N.
  • The storage unit 106 includes, for example, a large-capacity HDD (Hard Disk Drive) or SSD (Solid State Drive), and stores applications and data (not shown) for realizing a server function that performs call control.
  • the storage unit 106 also stores a control program 108 .
  • Storage unit 106 also has information storage unit 110 .
  • the control program 108 is a program that determines the user's state based on image data/sound data, performs call control, and executes applications.
  • the control program 108 is a program for transmitting a request from the calling side to the called side, receiving a response to the request from the called side, and returning the response to the calling side.
  • the information storage unit 110 stores information of each user terminal that uses the function of determining the user's status.
  • the information storage unit 110 may function as a location server that stores a URI (Uniform Resource Identifier) corresponding to the IP address of the user terminal 20 .
  • the information storage unit 110 also stores image data/sound data included in media data transmitted from the user terminal 20 in association with identification information of the user terminal 20 or the like.
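The location-server role mentioned above (resolving a URI to a terminal's current IP address) can be pictured as a small registry. The URIs and addresses below are illustrative placeholders only:

```python
# Minimal location-server sketch: terminal URI -> current IP address.
locations = {}

def register(uri, ip_address):
    """Record (or update) the terminal's current IP address."""
    locations[uri] = ip_address

def resolve(uri):
    """Return the IP address for a callee URI, or None if unregistered."""
    return locations.get(uri)

register("sip:userUA@example.com", "192.0.2.10")
```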
  • FIG. 4 is a diagram showing an example of the hardware configuration of the user terminal 20 according to the first embodiment.
  • the user terminal 20 includes a control unit 202, a communication interface 206, a storage unit 208, a display unit 214, an input unit 216, a microphone 220, a speaker 222, and an imaging device 224. Each part is connected via a bus line 218 .
  • the control unit 202 includes a CPU, ROM, RAM 204, and the like.
  • the control unit 202 is configured to implement, for example, an IP call function in addition to functions as a general information processing apparatus by executing applications and the like stored in the storage unit 208 . Further, the control unit 202 may be capable of executing a voice recognition function on input sound data and an object recognition function on input image data.
  • the RAM 204 temporarily holds various information and is used as a work area when the CPU executes various processes.
  • the communication interface 206 controls communication with the server 10 via the network N.
  • the storage unit 208 includes, for example, an HDD or SSD, and stores application programs 210 in addition to storing applications and data (not shown) for realizing functions as a general information processing device.
  • the storage unit 208 also has an information storage unit 212 .
  • the application program 210 is a program for executing the media data transmission function related to the image data/sound data described above and the user status display function.
  • the application program 210 is a program for transmitting a request to the called party via the server 10, receiving a response to this request from the called party via the server 10, and starting a call.
  • the information storage unit 212 stores information (call handling information) that associates the identification information of devices that can communicate with each other and the identification information of users.
  • the call handling information may further hold the user status information transmitted from the server 10 in association with the identification information of the transmitted device.
  • the display unit 214 is a display such as a touch panel or liquid crystal monitor, and displays information to the user.
  • the display unit 214 displays an application execution screen, and more specifically, displays a screen showing the status of the call destination, a screen during IP call, and the like.
  • the input unit 216 receives input from the user and receives instructions from the user. Note that the display unit 214 and the input unit 216 may be configured as a touch panel.
  • the microphone 220 is a device that collects sound such as voice, and may have a noise canceling function.
  • the speaker 222 is a device that converts sound data into physical vibrations and outputs sounds such as music and voice.
  • The imaging device 224 is, for example, a camera, and captures images of the surroundings of the user terminal 20.
  • the user terminal 20A is the side that transmits media data and the side that receives call requests.
  • the user terminal 20B is the side that displays the state information of the user using the user terminal 20A, and is the side that transmits the call request.
  • each user terminal 20 may be a calling side and a called side, and may have respective functions.
  • the server 10 has the function of controlling calls using the Internet. For example, when using SIP communication, the server 10 performs call control by exchanging a request (SIP method) based on HTTP (Hyper Text Transfer Protocol) and a response (response code).
  • image data/sound data is transmitted and user status information is displayed on the screen before the above-described request and response are exchanged.
  • For the user terminal 20A shown in FIG. 5, mainly the function of transmitting image data/sound data will be explained; for the user terminal 20B, mainly the function of displaying user state information will be explained.
  • the symbols A and B are omitted when the respective functions are not distinguished.
  • the user terminal 20 has the image acquisition unit 302 .
  • the user terminal 20 may have either the function of transmitting media data or the function of displaying user status information.
  • the user terminal 20A has an image acquisition section 302A, a sound acquisition section 304A, a transmission section 306A, a reception section 312A, a data acquisition section 314A, a generation section 316A, an output control section 318A, and a call processing section 320A.
  • the function of each unit in the user terminal 20A is realized by executing the application program 210 by the control unit 202 shown in FIG.
  • In the user terminal 20A, the camera function and the microphone function are set to ON, and image data and sound data around the user terminal 20A are input.
  • the image data is image data including the first user or image data including the state of the room.
  • the acquired image data is stored in a buffer, for example, and deleted after a predetermined period of time has elapsed.
  • the sound acquisition unit 304A acquires sound data collected by the microphone 220.
  • the sound data is sound data including the voice of the first user, sound data including life sounds, and the like.
  • Acquired sound data is stored in a buffer, for example, and deleted after a predetermined period of time has elapsed.
  • the transmission unit 306A transmits media data including image data acquired by the image acquisition unit 302A and/or sound data acquired by the sound acquisition unit 304A to the server 10.
  • a destination is set in the server 10 in advance.
  • the transmission unit 306A may also transmit the identification information of the user terminal 20A or the identification information of the first user.
  • Image data and sound data can thus be included in media data and transmitted to the server 10 without the first user performing any special operation. For example, even if the first user is an elderly person unfamiliar with computers or the like, no special operation is required, so the transmission process is executed easily.
  • the receiving unit 312A receives the call request via the server 10.
  • the user terminal 20A notifies the first user that there is a call request from the second user of the user terminal 20B.
  • the user terminal 20A transmits the call response to the user terminal 20B via the server 10.
  • The call processing unit 320A converts call data into call packets and transmits and receives them.
  • When the receiving unit 312A directly or indirectly receives a call termination request from the user terminal 20B, the call processing unit 320A directly or indirectly transmits a response to the termination request to the user terminal 20B. This ends the session and the call. Note that the call termination request may also be transmitted from the user terminal 20A side.
  • the server 10 has a reception unit 402 , a call control unit 404 , an acquisition unit 406 , a determination unit 408 and a transmission unit 410 .
  • the function of each unit in the server 10 is realized by executing the control program 108 by the control unit 102 shown in FIG.
  • the receiving unit 402 of the server 10 receives the image data/sound data transmitted from the transmitting unit 306A of the user terminal 20A.
  • the image data/sound data included in the received media data are stored in the RAM of the control unit 102, for example.
  • the acquisition unit 406 acquires image data/sound data stored in RAM or the like. Note that the acquisition unit 406 may acquire the image data/sound data transmitted by the user terminal 20A from the reception unit 402 .
  • the determination unit 408 determines the state of the first user of the user terminal 20A based on the acquired image data/sound data. For example, the determination unit 408 analyzes the image data/sound data and determines the state of the first user, such as whether the first user is around the user terminal 20A.
  • the state of the user may be, for example, one state selected from a plurality of states according to the analysis result of the image data/sound data.
  • The determination unit 408 determines the state each time image data/sound data is acquired, but the updated state information may be transmitted from the transmission unit 410 only when the state changes. Thereby, the load of the transmission processing of the server 10 can be reduced.
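The change-only transmission rule above can be sketched as follows: the server remembers the last state sent per terminal and skips transmission when the newly determined state matches it. Names and state strings are illustrative:

```python
# Transmit updated state information only when the state has changed,
# reducing the server's transmission load.
last_sent = {}  # terminal ID -> last transmitted state

def should_transmit(terminal_id, new_state):
    """True if the state changed since the last transmission."""
    if last_sent.get(terminal_id) == new_state:
        return False  # unchanged: skip this transmission
    last_sent[terminal_id] = new_state
    return True
```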
  • the determination unit 408 may determine the state of the first user using a learning model that has learned the state of the first user using image data/sound data as learning data.
  • the transmission unit 410 transmits information (state information) about the state of the first user to the user terminal 20B that is associated with the identification information of the user terminal 20A and that can communicate with the user terminal 20A. For example, the transmission unit 410 transmits state information including whether or not the first user is available for communication to the user terminal 20B. Alternatively, the transmission unit 410 may transmit the state information to the user terminal 20B only when the state is updated.
  • The state of the first user is determined using image data and sound data, and the state information of the first user, who is the callee, is notified to the user who is the caller, so that the caller can easily grasp the status of the called party.
  • the determining unit 408 may perform object recognition on the acquired image data and determine the state of the user based on the result of the object recognition. For example, the determination unit 408 may detect whether or not a person is included by object recognition for image data. Object recognition may be performed using a learning model that has undergone machine learning for each object in advance. Object recognition may be performed using a learning model that has undergone supervised learning using images of the first user as learning data.
  • the determination unit 408 determines whether or not there is a user (for example, the first user) around the user terminal 20A based on the object recognition result.
  • The determination unit 408 determines that a call is possible when the user is present, and that a call is not possible when the user is absent. If the determination unit 408 detects, as a result of object recognition, that the user is sitting on a sofa or the like and has not moved for a predetermined time, it determines that the user is in a relaxed state; if it detects that the user is moving, it may determine that the user is busy.
  • the determination unit 408 may identify the state from the recognition result by associating each recognition result with each state.
  • the determining unit 408 can determine not only whether or not the first user can make a call, but also a more detailed state. For example, the determination unit 408 can determine states such as whether the first user is absent or moving.
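The decision rules just described (presence, sitting still for a predetermined time, moving) could be combined as in the sketch below. The recognition results themselves are assumed to come from an object-recognition model elsewhere; only the state mapping is shown, with invented names and an arbitrary threshold:

```python
# Map (hypothetical) object-recognition results to a user state.
def determine_state(person_detected, sitting, still_for_s, still_threshold_s=60.0):
    if not person_detected:
        return "absent"   # no user around the terminal: call not possible
    if sitting and still_for_s >= still_threshold_s:
        return "relaxed"  # e.g., sitting on a sofa without moving
    return "busy"         # user detected but moving around
```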
  • the determination unit 408 may perform facial expression recognition of the first user's face included in the image data, and determine the state of the first user based on the result of the facial expression recognition. For example, when detecting that the first user is included in the image data, the determination unit 408 recognizes the facial expression of the first user. A known technique may be used for facial expression recognition.
  • the determination unit 408 uses AI (artificial intelligence) to estimate the first user's joy, surprise, anger, sadness, serious face, and the like.
  • the determination unit 408 can not only determine whether or not the first user is able to make a call, but also estimate a more detailed emotional state. For example, the determination unit 408 can estimate what kind of emotional state the first user is currently in, depending on what the facial expression of the first user is.
  • the determining unit 408 may perform frequency analysis on the sound data and determine the state of the first user based on the result of this frequency analysis. For example, as a result of the frequency analysis, the determination unit 408 determines whether the first user's voice is included, whether the user is on a train, whether the user is outside, whether the user is listening to music, and the like. judge.
  • the determination unit 408 can estimate the detailed sound-related state in addition to simply determining whether or not the first user can make a call. For example, the determination unit 408 can estimate whether the surroundings of the first user are lively or quiet.
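As a toy illustration of frequency analysis on a sound frame, the sketch below computes the DFT magnitude at a chosen frequency; comparing energy across bands is one simple way to guess whether voice-band sound is present. The parameters are arbitrary, and real analysis would be considerably more involved:

```python
import math

# Magnitude of the DFT bin nearest freq_hz for one frame of samples.
def band_magnitude(samples, sample_rate, freq_hz):
    n = len(samples)
    k = round(freq_hz * n / sample_rate)  # nearest DFT bin index
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = -sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    return math.hypot(re, im) / n

# A pure 300 Hz tone (roughly voice-band) sampled at 8 kHz for one second:
# energy appears at 300 Hz and essentially none at 2 kHz.
rate = 8000
frame = [math.sin(2 * math.pi * 300 * t / rate) for t in range(rate)]
```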
  • the call control unit 404 identifies the user terminal 20A based on the call request received by the receiving unit 402 from the user terminal 20B. For example, the call control unit 404 identifies call destination information (for example, URI) from the call request, and acquires the IP address of the user terminal 20A using the identified call destination information. The call control unit 404 may query a location server (not shown) to identify the IP address from the destination information.
  • the transmission unit 410 transmits a call request to the user terminal 20A identified by the call control unit 404.
  • the call control unit 404 transmits a call request to the user terminal 20A having the identified IP address.
  • the receiving unit 402 also receives a call response from the user terminal 20A in response to the transmitted call request. After that, the call control unit 404 establishes a call session and controls mutual transmission and reception of call data.
  • the user terminal 20B has an image acquisition section 302B, a sound acquisition section 304B, a transmission section 306B, a reception section 312B, a data acquisition section 314B, a generation section 316B, an output control section 318B, and a call processing section 320B.
  • the function of each unit in the user terminal 20B is realized by executing the application program 210 by the control unit 202 shown in FIG. Also, the functions of the user terminal 20B are the same as those of the user terminal 20A.
  • the status notification function will be mainly described below.
  • the receiving unit 312B receives the state information of the first user of the user terminal 20A from the server 10. For example, the received state information of the first user is stored in the RAM of the control unit 202 or the like. At this time, the receiving unit 312B may also receive the identification information of the user terminal 20A that transmitted the state information of the first user or the identification information of the first user, and store it in the RAM.
  • the data acquisition unit 314B acquires the state information of the first user stored in the RAM, for example. Also, the data acquisition unit 314B may acquire the state information of the first user from the reception unit 312B.
  • the generation unit 316B generates state notification information that can notify the state of the first user based on the state information of the first user. For example, the generation unit 316B generates a UI component including character data, image data, etc. representing the state of the first user indicated by the state information.
  • the output control unit 318B controls the output of the state notification information by the generation unit 316B.
  • the output control unit 318B controls display on the display screen of the display unit 214 if the state notification information is state display information, and controls so that sound is output from the speaker 222 if the state notification information is state sound information.
  • the output control unit 318B controls the state notification information to be displayed as a popup.
  • The second user can grasp the current situation of the first user by looking at the screen of the user terminal 20B. As a result, if the first user is available for a call, the second user can decide to call the first user.
  • the output control unit 318B may control the display of the status notification information within a predetermined area of the display screen. For example, the output control unit 318B controls to display the state notification information in a window set in advance within the display screen. Further, the output control unit 318B may control to display the status notification information in a predetermined area set in the execution screen of the application that realizes the disclosed call function. Also, the predetermined area may be an area in which a widget capable of executing the application described above is displayed. For example, status notification information is displayed on this widget, and this status notification information is appropriately updated on the widget.
  • the second user can grasp the current situation of the first user by looking at the predetermined area in the screen.
  • the output control unit 318B may control display of the updated state notification information within a predetermined area. For example, when the state notification information is updated from the first state to indicate the second state, the output control unit 318B controls display of the state notification information indicating the second state within a predetermined area.
  • Since the display screen is updated each time the state notification information is updated, rather than each time it is generated, the processing load of the user terminal 20B can be reduced.
  • When an icon of an application that executes calls via the server 10 exists on the display screen, the generation unit 316B may generate an image of that icon as the state notification information. For example, using a function for dynamically changing an icon's display (such as that used for calendar icons), the generation unit 316B sets an icon image or characters corresponding to each state, and identifies the image or characters that correspond to the state information obtained from the server 10. In this case, the output control unit 318B performs control so that the image or characters identified by the generation unit 316B are displayed on the icon.
  • the second user can grasp the state of the first user by looking at the icon of the application that executes the functions of the present disclosure. For example, by looking at icons displayed on the home screen, it is possible to determine whether or not the first user is ready to talk.
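The dynamic-icon behavior above can be sketched minimally. In the sketch below, the state names and badge strings are hypothetical illustrations, not values from this disclosure:

```python
# Hypothetical mapping from a determined user state to the characters
# shown on the application icon; the states and strings are illustrative.
STATE_TO_BADGE = {
    "relaxing": "On the sofa",
    "moving": "Active",
    "absent": "Away",
    "busy": "Busy",
}

def badge_for_state(state):
    """Return the icon badge text for a determined user state."""
    return STATE_TO_BADGE.get(state, "Unknown")
```

The output control unit would then draw the returned characters onto the icon, so that a glance at the home screen shows the callee's state.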
  • the generation unit 316B may also generate state notification information including a moving image or animation indicating the state of the first user.
  • the generation unit 316B may use a moving image, an animation, or the like representing the state of the first user indicated by the state information as the state notification information.
  • For example, the moving image is a past moving image of the first user captured in a predetermined room. In this case, if the state information indicates that the first user is relaxing (for example, sitting on a sofa), the generation unit 316B selects a past moving image of the first user sitting on the sofa in that room; if it indicates that the first user is absent, a past moving image in which the first user is not in the room is selected.
  • When image data captured by the user terminal 20A is acquired as the state information, the generation unit 316B may apply an effect (a mosaic, a substitute background, etc.) to the images of users other than the first user in order to protect privacy.
  • The generation unit 316B may also express the state of the first user as an animation using the movement of a predetermined object. For example, the generation unit 316B moves the object quickly when the first user is moving or exercising, hides the object when the first user is absent, and moves the object slowly when the first user is relaxing.
  • When the information regarding the state of the first user indicates that a call is possible, the generation unit 316B may specify a UI component for starting call processing with the user terminal 20A.
  • the output control unit 318B may control the display of the UI component on the display screen together with the output of the status notification information.
  • the generation unit 316B identifies or selects a button to call the first user.
  • the state information may include a plurality of states, and whether or not a call is possible may be set in advance for each state.
  • the output control unit 318B may perform control so that the UI component is displayed in association with the status notification information. For example, control is performed so that the UI component is displayed at a position adjacent to the display information.
  • The second user can thus confirm the first user's state and then smoothly start a call by operating the UI component.
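The per-state callability setting described above can be sketched as follows; the state names, the preset flags, and the component labels are illustrative assumptions:

```python
# Illustrative preset: for each state, whether a call is possible.
CALLABLE_STATES = {
    "relaxing": True,
    "quiet": True,
    "moving": False,
    "absent": False,
}

def ui_components(state):
    """Return the UI components to display for the given callee state."""
    parts = ["status_display"]            # the state itself is always shown
    if CALLABLE_STATES.get(state, False):
        parts.append("call_button")       # shown adjacent to the status
    return parts
```

Displaying the call button only for callable states corresponds to showing the UI component in association with the state notification information.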
  • the transmission unit 306B transmits a call request to the user terminal 20A via the server 10. If an OK call response is returned from the user terminal 20A in response to this call request, the call is started.
  • The user who is the caller can grasp the current state of the user who is the callee before starting the call.
  • Since the current state includes the emotions and movements of the called user, the caller can place the call while grasping that state, so the start of the call proceeds smoothly.
  • FIG. 6A is a diagram showing a screen example 1 including a state of a call destination user according to the first embodiment.
  • the status notification information is displayed on the screen as a popup A1.
  • Since the state information indicates that the user (Mr. A) is sitting on the sofa, it is determined that a call is possible.
  • the UI component A2 of the "call" button is displayed on the screen.
  • FIG. 6B is a diagram showing Screen Example 2 including the state of the called user according to the first embodiment.
  • the status notification information is displayed on the icon B1 of the application that executes the call function of the present disclosure.
  • a button to make a call may be displayed.
  • FIG. 6C is a diagram showing a screen example 3 including the state of the called user according to the first embodiment.
  • the status notification information is displayed in a predetermined area C1 of the execution screen of the application that executes the call function of the present disclosure.
  • Since the state information indicates that the user is sitting on the sofa, it is determined that a call is possible.
  • the UI component A2 of the "call" button is displayed in the execution screen.
  • FIG. 6D is a diagram showing a screen example 4 including the state of the called user according to the first embodiment.
  • Mr. B is absent, and in the predetermined area D1 where the status notification information is displayed, a black screen or the like is displayed to give the impression that the room is dark.
  • the server 10 sets a plurality of users (Mr. A and Mr. B in the example shown in FIG. 6D) in the call handling information as call destination candidates for a predetermined user.
  • the server 10 acquires the image data/sound data from the user terminals 20 of Mr. A and Mr. B, and determines the state of each.
  • the user terminal 20 can display the state notification information of a plurality of users on the screen as shown in FIG. 6D.
  • FIG. 7 is a flowchart showing an example of operation processing of the server 10 according to the first embodiment.
  • In the example of FIG. 7, the user terminal 20A of the first user (the callee) transmits image data and the like, and the user terminal 20B of the second user (the caller) displays the first user's state.
  • the receiving unit 402 of the server 10 receives the image data/sound data included in the media data transmitted from the user terminal 20A of the first user.
  • the acquisition unit 406 of the server 10 acquires the received image data/sound data.
  • the determination unit 408 of the server 10 determines the current state of the first user based on the acquired image data/sound data.
  • In step S106, the determination unit 408 of the server 10 determines whether the determined state has changed from the previously determined state. If the state has changed (step S106-YES), the process proceeds to step S108; if not (step S106-NO), the process returns to step S102.
  • Also, the determination unit 408 of the server 10 specifies, as the notification destination, the user terminal 20B that is associated with the identification information of the user terminal 20A that transmitted the media data including the image data/sound data (or with the identification information of the first user) and that can communicate with the user terminal 20A.
  • the determination unit 408 can identify the notification destination by referring to the call handling information described above, for example.
  • In step S108, the transmission unit 410 of the server 10 transmits information regarding the determined state of the first user (state information) to the user terminal 20B specified as the notification destination.
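The server-side flow of FIG. 7 (steps S102-S108) can be condensed into a single-iteration sketch. Here `determine_state()` is a trivial stand-in for the determination unit 408's image/sound analysis, and `call_handling` is a simplified stand-in for the call handling information; all names are illustrative:

```python
def determine_state(media_data):
    # Stand-in for image/sound analysis: here the state is read directly.
    return media_data.get("state", "unknown")

def server_step(media_data, prev_state, call_handling, sender_id, send):
    """One iteration: determine the state, notify destinations on change."""
    state = determine_state(media_data)                  # S102-S104
    if state == prev_state:                              # S106-NO: back to S102
        return prev_state
    for dest in call_handling.get(sender_id, []):        # identify destination
        send(dest, {"user": sender_id, "state": state})  # S108
    return state
```

Notifying only on a state change is what keeps the destination terminal's update load down, as noted above.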
  • FIG. 8 is a flowchart showing an example of operation processing on the side that displays the state according to the first embodiment.
  • the processing performed by the user terminal 20B will be mainly described.
  • the example shown in FIG. 8 is an example in which the state notification information is state display information, and the output control unit 318B controls the display of the state display information on the display screen.
  • In step S202, the receiving unit 312B of the user terminal 20B receives the state information of the first user from the server 10, and the data acquisition unit 314B of the user terminal 20B acquires the received state information.
  • In step S204, the generation unit 316B of the user terminal 20B generates state display information based on the acquired state information of the first user.
  • the generation unit 316B generates a UI component or the like including an image, characters, or the like indicating the state of the first user.
  • the output control unit 318B of the user terminal 20B controls display of the generated status display information on the display screen.
  • In step S208, the user terminal 20B determines whether the second user has operated the UI component for starting a call. If there is a call start operation (step S208-YES), the transmission unit 306B transmits a call request to the user terminal 20A via the server 10 and the process proceeds to step S210; if there is no call start operation (step S208-NO), the process returns to step S202.
  • In step S210, the receiving unit 312B of the user terminal 20B determines whether a call response has been received from the user terminal 20A via the server 10. If a call response is received (step S210-YES), the process proceeds to step S212; if not (step S210-NO), the call processing ends because the first user did not answer, and the process returns to step S202.
  • In step S212, the user terminal 20B starts a call session and transmits/receives call data to/from the user terminal 20A, and the call begins.
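The caller-side flow of FIG. 8 (steps S202-S212) can likewise be condensed into a sketch; the display string and the outcome labels are illustrative stand-ins:

```python
def caller_flow(state_info, call_pressed, callee_answers):
    """One pass through the FIG. 8 flow, returning (display, outcome)."""
    display = "%s: %s" % (state_info["user"], state_info["state"])  # S202-S206
    if not call_pressed:                                            # S208-NO
        return display, "idle"
    if not callee_answers:                                          # S210-NO
        return display, "no_answer"
    return display, "in_call"                                       # S212
```

The point of the ordering is that the callee's state is rendered before any call request is sent, so the caller decides with that information in hand.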
  • the user who is the caller can grasp the current state of the user who is the callee before starting the call.
  • the information processing described in the first embodiment may be implemented as a program to be executed by a computer.
  • By installing this program from a server or the like and causing a computer to execute it, the information processing described above can be realized.
  • The program can be stored on a recording medium as a "non-transitory tangible medium".
  • Various types of recording media can be used, such as media that record information optically, electrically, or magnetically (CD-ROMs, flexible disks, magneto-optical disks, etc.) and semiconductor memories that record information electrically (ROMs, flash memories, etc.).
  • FIG. 9 is a diagram explaining an overview of the system 2 according to the second embodiment of the present disclosure.
  • The system 2 shown in FIG. 9 applies the system described in the first embodiment to a system for managing images, and includes a management device 10A, a server 10B, a user terminal 20, an image output device 30, and a camera 60.
  • the image output device 30 is also connected to the display device 40 , and the display device 40 is controlled by the remote control device 50 .
  • a camera 60 is attached to the display device 40 , and an image acquired by the camera 60 is input to the image output device 30 .
  • the camera 60 is a web camera and may have a built-in microphone.
  • the server 10B has the functions of the server 10 described in the first embodiment.
  • The management device 10A or the server 10B and the user terminal 20 can communicate with each other via a network N1 such as a wireless LAN, a fourth- or fifth-generation mobile communication system (4G or 5G), or LTE (Long Term Evolution). Also, the management device 10A and the image output device 30 can communicate with each other via a wireless network N2, such as a third-generation mobile communication system (3G), which has lower communication charges than the network N1 but is slower than the network N1.
  • the network N1 and the wireless network N2 are described separately for the sake of explanation, these networks can be connected to each other via the Internet. Also, the network configuration is not limited to the above example.
  • the management device 10A first acquires information such as the name and address of the viewer or poster of the image from the poster of the image through the website on the Internet. Information such as the name and address of the viewer or contributor is not essential information. At this time, the management device 10A generates an identifier (referred to as a device ID) of the image output device 30 used by the viewer. Next, the device ID is notified to the contributor of the image, for example, by e-mail.
  • the management device 10A sets the generated device ID in the image output device 30. Thereafter, the image output device 30 is shipped to the address of the viewer of the image by the administrator of the management device 10A.
  • the image output device 30 incorporates, for example, a 3G communication module, and is configured to immediately start communication with the management device 10A using the set device ID when the power is turned on.
  • the person who posted the image downloads an application for image sharing that operates on the user terminal 20, such as a smartphone or tablet.
  • This application uses the notified device ID to access the management device 10A.
  • the management device 10A can associate the user terminal 20 (contributor) and the image output device 30 (viewer) using the device ID notified from the application as a key. This information corresponds to the call handling information described above.
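The device-ID association described above might be sketched as follows; the class and method names are hypothetical, and the ID format is an arbitrary choice:

```python
import uuid

class ManagementDevice:
    """Sketch of the management device 10A's association bookkeeping."""

    def __init__(self):
        # device_id -> contributor's terminal; corresponds to the
        # call handling information described in the text.
        self.call_handling = {}

    def issue_device_id(self):
        """Generate the ID that is set into the viewer's image output device."""
        return uuid.uuid4().hex

    def register_app(self, device_id, terminal_id):
        """Called when the contributor's app accesses with the notified ID."""
        self.call_handling[device_id] = terminal_id

    def viewer_for(self, terminal_id):
        """Find the image output device associated with a contributor."""
        for dev, term in self.call_handling.items():
            if term == terminal_id:
                return dev
        return None
```

Because the device ID is generated before shipping and keyed into the app, the association needs no account setup on the viewer's side.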
  • the contributor can use the application to shoot various subjects.
  • the application automatically transmits image data obtained by shooting to the management device 10A via the network N1.
  • the contributor is not required to perform any special operation in order to transmit the image data to the management device 10A.
  • the management device 10A accumulates the image data transmitted from the application of the user terminal 20 and sequentially distributes it to the image output device 30.
  • the image output device 30 displays the image data on the display device 40 according to an instruction from the viewer.
  • the image output device 30 may have a microphone and a speaker, and IP calls are possible.
  • the display device 40 is, for example, a television widely used in ordinary homes, and the remote control device 50 is a remote controller.
  • The image output device 30 is connected to the display device 40 via, for example, HDMI (High-Definition Multimedia Interface) (registered trademark), and can acquire control signals emitted from the remote control device 50 via HDMI.
  • the image output device 30 can acquire the control signal issued from the remote control device 50 and grasp the content of the operation input by the viewer. In other words, the viewer can view the distributed image data using the familiar TV remote control.
  • the camera 60 is, for example, a web camera, and may have a built-in microphone.
  • Image (still image or moving image) data/sound data captured by the camera 60 is input to the image output device 30 .
  • the image output device 30 transmits media data related to the input image data/sound data to the server 10 .
  • the function for starting a call described in the first embodiment is implemented in the application for image sharing described above.
  • the user terminal 20A shown in FIG. 1 etc. corresponds to the image output device 30.
  • The image output device 30 has the transmission unit 306, reception unit 312, data acquisition unit 314, generation unit 316, output control unit 318, and call processing unit 320 of the user terminal 20.
  • media data including image data/sound data acquired by the camera 60 is transmitted to the server 10B via the image output device 30.
  • The server 10B determines the state of the viewer of the image based on the image data/sound data included in the acquired media data. For example, the server 10B analyzes the image data/sound data to determine whether, in the room where the display device 40 is located, the viewer is moving, sitting on a chair, absent, busy, or quiet.
  • the server 10B refers to the information held in the management device 10A, identifies the user terminal 20 associated with the image output device 30, and transmits the status information of the viewer to this user terminal 20.
  • the user terminal 20 generates state notification information (for example, state display information) based on the state information and controls to display it on the display screen (for example, FIG. 6).
  • When the user terminal 20 detects that the contributor has operated the call button, it transmits a call request to the image output device 30 via the server 10B.
  • Upon receiving the call request, the image output device 30 displays on the display device 40 a prompt asking whether to answer. For example, a first button on the remote control device 50 is assigned to answering and a second button is assigned to declining.
  • the image output device 30 detects the operation of the first button and transmits a call response to the user terminal 20 via the server 10B. After this, the call is started.
  • a system for sharing images between viewers and contributors can have the call function disclosed in the first embodiment.
  • the management device 10A and the server 10B may be the same server.
  • the user terminal 20 is provided with the function of the determination unit 408 described in the first embodiment.
  • The determination unit of the user terminal 20 determines the state of the user using its own terminal, based on the image data acquired by the image acquisition unit 302 and/or the sound data acquired by the sound acquisition unit 304.
  • the method for determining the user's status is as described above.
  • the determination unit of the user terminal 20 performs object recognition on image data and/or frequency analysis on sound data, and determines the state of the user using the respective results.
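A minimal sketch of such an on-device determination follows, with the object recognition and frequency analysis themselves stubbed out as inputs; the decision rules and thresholds are purely illustrative assumptions, not the disclosure's actual method:

```python
def determine_user_state(objects_detected, sound_level_db):
    """Map recognition results plus a sound level to an illustrative state."""
    if "person" not in objects_detected:
        return "absent"
    if sound_level_db > 70:          # loud room -> treat as busy/lively
        return "busy"
    if "sofa" in objects_detected:
        return "relaxing"            # e.g. sitting on the sofa
    return "present"
```

In the third embodiment this function would run entirely on the terminal, so the raw image and sound never leave the device.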
  • the transmission unit 306 of the user terminal 20 transmits information (state information) regarding the state of the user determined by the determination unit to other user terminals before the call.
  • the transmitting unit 306 may first transmit the state information to the server 10 in order to transmit the state information to other user terminals.
  • After acquiring the state information, the server 10 identifies a notification destination (another user terminal) associated with the user terminal 20 that transmitted the state information, and transmits the state information to the identified notification destination.
  • the user terminal 20 of the notification destination executes the same processing as in the first embodiment.
  • According to the third embodiment, the load on the server 10 can be distributed.
  • Also, since the user terminal 20 processes the image data and sound data by itself and does not transmit them to the outside, privacy can be protected.
  • the user terminal 20 is provided with some functions of the determination unit 408 described in the first embodiment.
  • The preprocessing unit of the user terminal 20 performs processing such as object detection and frequency feature extraction, that is, object recognition on the image data acquired by the image acquisition unit 302 and/or frequency analysis on the sound data acquired by the sound acquisition unit 304.
  • the transmission unit 306 of the user terminal 20 transmits to the server 10 media data including processing result information including object recognition results and/or feature amounts (frequency analysis results) processed by the preprocessing unit.
  • the determining unit 408 determines the state of the user based on the processing result information included in the acquired media data. For example, the determination unit 408 can determine the user's state from the result information by holding a table or the like in which the user's state is associated with each obtainable result information.
  • After determining the user's state, the server 10 identifies the notification destination associated with the user terminal 20 that transmitted the media data, and transmits information regarding the user's state (state information) to the identified destination. Upon receiving the state information, the notification-destination user terminal 20 executes the same processing as in the first embodiment.
  • According to the fourth embodiment, the amount of communication can be reduced compared to the first embodiment, and the computational load of the determination unit can be shared between the terminal and the server.
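The split between terminal-side preprocessing and server-side determination could be sketched as below; the feature names and the contents of the lookup table are illustrative assumptions:

```python
def terminal_preprocess(detected_objects, spectrum):
    """Terminal side: reduce raw media to compact processing results."""
    return {
        "objects": tuple(sorted(detected_objects)),
        "dominant_hz": max(spectrum, key=spectrum.get),  # strongest frequency
    }

# Server side: a table associating obtainable result information with a state,
# as the text describes the determination unit holding.
RESULT_TO_STATE = {
    ("person", "sofa"): "relaxing",
    (): "absent",
}

def server_determine(result):
    """Map the terminal's processing results to a user state."""
    return RESULT_TO_STATE.get(result["objects"], "unknown")
```

Only the small result dictionary crosses the network, which is what reduces the amount of communication relative to sending raw image/sound data.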
  • the output control unit 318 of the user terminal 20 may control the operation of the device according to the state information, in addition to controlling the output of the state notification information generated by the generation unit 316 .
  • For example, the output control unit 318 may also function as an operation control unit that controls devices connected via a network according to the state information, for example by adjusting the brightness of the room lighting, the volume of a device playing music (the terminal itself or a music player), or the pausing or volume of a device playing TV or video (the terminal itself or a TV).
  • the output control unit 318 may brighten the lighting in the room, reduce the volume of the music, or pause the video being played, if the call destination is in a state where the call can be made. As a result, it is possible to create an environment in which the caller can easily make a call.
  • the output control unit 318 may store control commands corresponding to each state of the user in association with each other, and execute the control command specified by the state information.
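Storing control commands per state, as just described, might look like the following sketch; the state names and command strings are placeholders for whatever device control API is actually used:

```python
# Illustrative association of user states with device control commands,
# executed when the corresponding state information is received.
STATE_COMMANDS = {
    "callable": ["lights:brighten", "music:volume_down", "video:pause"],
    "absent": [],
}

def commands_for(state):
    """Return the control commands to execute for a reported state."""
    return STATE_COMMANDS.get(state, [])
```

Executing these commands when the callee becomes available is what prepares an environment in which the call is easy to start.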
  • The generation unit 316 is provided in the user terminal 20 in the above embodiments, but may instead be provided in the server 10 (Modification 2).
  • the generation unit of the server 10 generates state notification information according to the state of the user determined by the determination unit 408 .
  • the transmitting unit 410 of the server 10 transmits state notification information to the user terminal 20 as information (state information) regarding the state of the user. That is, in the case of Modification 2, the information about the state of the user transmitted from the server 10 includes the state notification information.
  • the output control unit 318 of the user terminal 20 controls to display the status notification information on the screen when the status notification information is acquired. As a result, the server 10 can easily add or modify the status notification information.
  • The determination unit 408 may use information acquired from a predetermined sensor (for example, an illuminance sensor, a photosensor such as a motion sensor, or an ultrasonic sensor) instead of, or in addition to, the media data to determine the state of the user. For example, the determination unit 408 can determine that the user to be called is present when the detection signal of the predetermined sensor indicates ON. With this, although introducing the sensor may add cost, the state of the callee can be determined before the call in the same manner as in the above-described embodiments. In addition, by using both the media data and the signal acquired from the predetermined sensor, it becomes possible to determine the callee's state before the call more appropriately.

Abstract

The present invention appropriately ascertains the state at a call destination before calling. This information processing method involves execution, by a communication device, of: the acquisition of image data and/or audio data; the determination of the state of a user who is using the communication device on the basis of the image data and/or audio data; and the output of information pertaining to the state of the user to another communication device which is associated with said communication device before calling.

Description

Information processing method, program, and information processing device
The present disclosure relates to an information processing method, a program, and an information processing device.
Conventionally, there are techniques for informing a caller, before a voice call is started, of whether the other party is available. For example, a technique is known in which the caller operates a telephone and connects to the Internet using the other party's e-mail address obtained from a notification destination to obtain information on whether a call is possible, or in which a presence sensor is provided, so that the caller can know whether the other party is available (see, for example, Patent Document 1).
JP-A-2005-159748
However, with the conventional technology, although it is possible to know whether the callee can take a call, it is not possible to know what state the callee is actually in. For example, the callee may not be at their desk even though their Internet connection is active. Furthermore, when a presence sensor is provided to determine whether the callee is at their seat, the cost of installing the sensor increases. Therefore, there is a demand for a technique for appropriately grasping the callee's state before a call.
Accordingly, an object of the present disclosure is to provide an information processing method, a program, and an information processing device capable of appropriately grasping the state of a callee before a call.
An information processing method according to one aspect of the present disclosure causes a communication device to execute: acquiring image data and/or sound data; determining the state of a user using the communication device based on the image data and/or sound data; and outputting information regarding the user's state to another communication device that is associated with the communication device, before a call.
According to certain aspects of the present disclosure, the state of a call destination can be appropriately grasped before a call.
FIG. 1 is a diagram for explaining a system overview of the present disclosure.
FIG. 2 is a diagram showing an example of a schematic configuration of the information processing system 1 according to the first embodiment.
FIG. 3 is a diagram showing an example of the hardware configuration of the server according to the first embodiment.
FIG. 4 is a diagram showing an example of the hardware configuration of the user terminal according to the first embodiment.
FIG. 5 is a diagram showing an example of the functions of each device of the information processing system according to the first embodiment.
FIG. 6A is a diagram showing screen example 1 including the state of the call destination user according to the first embodiment.
FIG. 6B is a diagram showing screen example 2 including the state of the call destination user according to the first embodiment.
FIG. 6C is a diagram showing screen example 3 including the state of the call destination user according to the first embodiment.
FIG. 6D is a diagram showing screen example 4 including the state of the call destination user according to the first embodiment.
FIG. 7 is a flowchart showing an example of the operation processing of the server according to the first embodiment.
FIG. 8 is a flowchart showing an example of the operation processing on the side that displays the state according to the first embodiment.
FIG. 9 is a diagram explaining an overview of the system according to the second embodiment of the present disclosure.
Hereinafter, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings.
<System Overview>
FIG. 1 is a diagram for explaining the system overview of the present disclosure. In the example shown in FIG. 1, it is assumed that a user UB (e.g., a child) wants to start a voice call, video call, or videophone call (hereinafter collectively referred to as a "call") with a user UA (e.g., a parent). The user UA uses an information processing device 20A (first communication device) such as a mobile terminal, and the user UB uses an information processing device 20B (second communication device) such as a mobile terminal. The server 10 is an information processing device capable of controlling IP (Internet Protocol) phone and videophone calls. Note that the overview shown in FIG. 1 includes an example in which a call is realized by P2P communication using IP telephony, WebRTC (Web Real-Time Communication) technology, or the like.
(1) Transmission of image data/sound data
First, the information processing device 20A turns on its camera and microphone and acquires images and sounds around the information processing device 20A. The information processing device 20A transmits media data related to the acquired image data and/or sound data (hereinafter also referred to as "image data/sound data") to the server 10. Note that the information processing device 20A acquires the image data/sound data by keeping at least one of the camera and the microphone on at all times, or by turning it on periodically.
(2) Transmission of state information
The server 10 receives the media data related to the image data/sound data from the information processing device 20A and, based on this media data, determines the state of user UA who uses the information processing device 20A. The server 10 performs, for example, image analysis and sound analysis to determine the current state of user UA. The current state of user UA includes, for example, at least one of the following: a state regarding whether user UA is near the information processing device 20A based on an image; a state regarding emotion based on the facial expression of user UA; a state of the location, or of its liveliness, based on sound information around the information processing device 20A; and a state indicating whether user UA is available for a call.
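By way of illustration only, the state determination in (2) could be sketched as follows. The detector outputs (`person_detected`, `motion_level`, `noise_level`), the thresholds, and the state labels are assumptions made for this sketch and are not part of the disclosure.

```python
# Hypothetical sketch: map simple image/sound analysis results to a state
# record. All names, thresholds, and labels are illustrative assumptions.

def determine_user_state(person_detected: bool, motion_level: float,
                         noise_level: float) -> dict:
    """Select one state from several candidates based on analysis results."""
    if not person_detected:
        # Nobody near the terminal: not available for a call.
        return {"present": False, "available": False, "label": "away"}
    busy = motion_level > 0.5    # user appears to be moving around
    noisy = noise_level > 0.7    # surroundings are lively
    return {
        "present": True,
        "available": not busy,
        "label": "busy" if busy else ("lively" if noisy else "relaxed"),
    }

state = determine_user_state(person_detected=True,
                             motion_level=0.1, noise_level=0.2)
```

In this sketch a single state is selected from a plurality of candidates, matching the idea that the server chooses one state according to the analysis result.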
The server 10 transmits information about the determined state of user UA (hereinafter also referred to as "state information") to the information processing device 20B, which is associated with the identification information (ID) of the information processing device 20A. The server 10 holds, for example, call correspondence information that associates the identification information of devices that can call each other.
(3) Transmission of a call request after confirming the user's state
When the information processing device 20B used by user UB receives the state information of user UA, it displays on its display screen a screen informing user UB of the state of user UA. For example, the information processing device 20B displays a UI component M1 including a video, an animation, or text indicating the current state of user UA. The UI component M1 is, for example, an icon, a window, a popup, or a predetermined area within an application screen. Current or past image data including user UA acquired by the information processing device 20A may also be displayed. At this time, to protect privacy, the image of anyone other than user UA may be pixelated, or the background may be replaced.
Suppose the information processing device 20B accepts an operation by user UB to place a call to user UA. The information processing device 20B then transmits a call request to the server 10.
(4) Establishment of a call session
When the server 10 receives the call request from the information processing device 20B, it transmits the call request to the information processing device 20A, which is associated with the identification information of the information processing device 20B. When the information processing device 20A receives the call request and accepts an operation by user UA to answer the call, it transmits a call response to the call request to the server 10. After that, when a session is established between the information processing device 20A and the information processing device 20B, the server 10 starts transmitting and receiving data for the call (hereinafter also referred to as "call data"). If the server 10 does not receive a call response within a predetermined time, it may notify the information processing device 20B to that effect and discard the call request.
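The request/response flow of steps (3) and (4), including the discard-on-timeout behavior, could be sketched as follows. The `CallServer` class and its method names are illustrative assumptions, not part of the disclosure.

```python
# Hypothetical sketch of the server-side call-request handling: forward a
# request to the callee, then either establish the session on a timely
# response or treat the request as expired.

class CallServer:
    def __init__(self, timeout_s: float = 30.0):
        self.timeout_s = timeout_s
        self.pending = {}  # request id -> (callee id, issued-at timestamp)

    def forward_request(self, req_id: str, callee: str, now: float) -> None:
        """Step (3): record and forward a call request to the callee."""
        self.pending[req_id] = (callee, now)

    def on_response(self, req_id: str, now: float) -> str:
        """Step (4): handle the call response from the callee."""
        callee, issued = self.pending.pop(req_id, (None, None))
        if callee is None:
            return "unknown"            # no such pending request
        if now - issued > self.timeout_s:
            return "expired"            # notify the caller; request discarded
        return "session-established"    # call data exchange can begin

server = CallServer(timeout_s=30.0)
server.forward_request("r1", "20A", now=0.0)
result = server.on_response("r1", now=5.0)
```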
This allows user UB to appropriately grasp the current state of the called user UA before the call simply by looking at the display screen of the information processing device 20B. For example, if the UI component M1 shows that user UA is in a room and relaxing, user UB can judge that it is acceptable to call and instruct the information processing device 20B to start the call.
In addition, user UA's state is reported to user UB without user UA being particularly conscious of it, and user UB can confirm user UA's state and grasp the right timing for a call without any special operation. Furthermore, because image data and sound data are used to determine the state, an appropriate state of user UA can be determined, and user UB can start a call with an understanding of user UA's current state.
Note that the same processing may be performed from the user UB side toward the user UA side. That is, the state of user UB may be displayed on the information processing device 20A of user UA, allowing user UA to grasp the current state of user UB.
[First embodiment]
Next, an example system configuration for realizing the system described above will be described. FIG. 2 is a diagram showing an example of the schematic configuration of the information processing system 1 according to the first embodiment. As shown in FIG. 2, the information processing system 1 is configured by connecting a server 10 and user terminals 20A, 20B, 20C, 20D, ... used by respective users so that they can communicate with each other via a network N. Hereinafter, the reference numeral 20 is used when there is no need to distinguish individual user terminals. A plurality of servers 10 may also be connected to the network N.
The server 10 is, for example, an information processing device capable of transmitting and receiving IP-packetized voice data and image data, and is also a device that performs call control such as placing (originating), receiving (terminating), answering, and disconnecting calls. Signaling protocols such as H.323, MGCP (Media Gateway Control Protocol), and SIP (Session Initiation Protocol) may be used for the call control.
The user terminal (communication device) 20 is an information processing device that can access the network and includes, by way of example and not limitation, a mobile terminal such as a smartphone, a computer (e.g., desktop, laptop, tablet), a media computer platform (e.g., set-top box, digital video recorder), a handheld computing device (e.g., PDA (Personal Digital Assistant), e-mail client), a wearable terminal (e.g., glasses-type device, watch-type device), or another type of computer.
The network N may be configured to include multiple types of communication lines, communication networks, and various network devices. For example, the network N may include a base station wirelessly connected to the server 10, a wireless LAN access point (such as a WiFi router), a mobile communication network connected to the base station, a public line such as a telephone line, cable television line, or optical communication line connected from the access point via a router or modem, the Internet connected to the user terminal 20, and a gateway device connecting the public line and the Internet. The configuration of the network N is not limited to the above example.
In the system configuration shown in FIG. 2, the current state determined based on the media data related to the image data/sound data transmitted from the user terminal 20A of the first user (the called party) is displayed on the user terminal 20B of the second user (the calling party).
The user terminal 20B receives the state information of the first user acquired via the server 10 and generates state notification information capable of reporting this state information. Suppose the second user looks at, for example, the state notification information displayed on the display screen, judges whether the first user is available for a call, and performs an operation to start a call. The user terminal 20B then transmits a call request to the user terminal 20A via the server 10.
When the user terminal 20A acquires the call request and the first user performs an answering operation, the user terminal 20A transmits a call response to the server 10. A call session is thereby established, call data including call content such as voice and images is IP-packetized, and the IP-packetized call data is transmitted and received between the user terminal 20A and the user terminal 20B, so that the call takes place.
<Hardware configuration>
Next, the hardware configuration of each device of the information processing system 1 of the present disclosure will be described. FIG. 3 is a diagram showing an example of the hardware configuration of the server 10 according to the first embodiment. As shown in FIG. 3, the server 10 has a control unit 102, a communication interface 104, and a storage unit 106, and these units are connected via a bus line 112.
The control unit 102 includes a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), and the like. By executing applications and the like stored in the storage unit 106, the control unit 102 may be configured to realize, in addition to the functions of a general web server, functions for performing call control such as call origination, call termination, answering, and disconnection. The control unit 102 may also be capable of executing a speech recognition function on acquired sound data and an object recognition function on acquired image data.
The communication interface 104 controls communication with the user terminal 20 via the network N.
The storage unit 106 includes, for example, a large-capacity HDD (Hard Disk Drive) or SSD (Solid State Drive), and stores applications and data (not shown) for realizing a server function that performs call control. The storage unit 106 also stores a control program 108 and has an information storage unit 110.
The control program 108 is a program that determines the user's state based on image data/sound data, performs call control, and executes applications. The control program 108 is also a program for transmitting a request from the calling side to the called side, receiving a response to the request from the called side, and returning the response to the calling side.
The information storage unit 110 stores information such as the information of each user terminal that uses the function of determining the user's state. For example, the information storage unit 110 may function as a location server that stores a URI (Uniform Resource Identifier) corresponding to the IP address of the user terminal 20. The information storage unit 110 also stores the image data/sound data included in the media data transmitted from the user terminal 20 in association with the identification information of the user terminal 20 and the like.
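The location-server role described above, which maps a terminal's URI to its current IP address, could be illustrated minimally as follows. The URIs and addresses are made-up examples and do not come from the disclosure.

```python
# Hypothetical sketch of a location-server lookup table: resolve a call
# destination URI to the registered IP address of a user terminal.
from typing import Optional

location_table = {
    "sip:userUA@example.net": "192.0.2.10",   # e.g., user terminal 20A
    "sip:userUB@example.net": "192.0.2.20",   # e.g., user terminal 20B
}

def resolve(uri: str) -> Optional[str]:
    """Return the registered IP address for a URI, or None if unregistered."""
    return location_table.get(uri)

addr = resolve("sip:userUA@example.net")
```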
Next, the hardware configuration of the user terminal 20 will be described. FIG. 4 is a diagram showing an example of the hardware configuration of the user terminal 20 according to the first embodiment. As shown in FIG. 4, the user terminal 20 has a control unit 202, a communication interface 206, a storage unit 208, a display unit 214, an input unit 216, a microphone 220, a speaker 222, and an imaging device 224, and these units are connected via a bus line 218.
The control unit 202 includes a CPU, a ROM, a RAM 204, and the like. By executing applications and the like stored in the storage unit 208, the control unit 202 is configured to realize, for example, an IP call function in addition to the functions of a general information processing device. The control unit 202 may also be capable of executing a speech recognition function on input sound data and an object recognition function on input image data.
The RAM 204 temporarily holds various types of information and is used as a work area when the CPU executes various processes.
The communication interface 206 controls communication with the server 10 via the network N.
The storage unit 208 includes, for example, an HDD or SSD, and stores applications and data (not shown) for realizing the functions of a general information processing device, as well as an app program 210. The storage unit 208 also has an information storage unit 212.
The app program 210 is a program for executing the above-described function of transmitting media data related to image data/sound data and the function of displaying the user's state. The app program 210 is also a program for transmitting a request to the called side via the server 10, receiving a response to this request from the called side via the server 10, and starting a call.
The information storage unit 212 stores information (call correspondence information) that associates the identification information of devices that can call each other with the identification information of users. The call correspondence information may further hold the user state information transmitted from the server 10 in association with the identification information of the transmitting device.
The display unit 214 is a display such as a touch panel or liquid crystal monitor and displays information to the user. For example, the display unit 214 displays an application execution screen, specifically, a screen showing the state of the called party, a screen during an IP call, and the like.
The input unit 216 receives input and instructions from the user. The display unit 214 and the input unit 216 may be configured as a touch panel.
The microphone 220 is a device that collects sounds such as voice and may have a noise-canceling function. The speaker 222 is a device that converts sound data into physical vibration and outputs sounds such as music and voice. The imaging device 224 is, for example, a camera, and captures images of the surroundings of the information processing device 20.
<Functional configuration>
Next, the functions of each device of the information processing system 1 according to the first embodiment will be described with reference to FIG. 5. In the example shown in FIG. 5, the user terminal 20A is the side that transmits media data and receives the call request. The user terminal 20B is the side that displays the state information of the user using the user terminal 20A and transmits the call request. Note that since each user terminal 20 can be both a calling side and a called side, it may have both sets of functions.
As described above, the server 10 has a function of controlling calls using the Internet. For example, when SIP communication is used, the server 10 performs call control by exchanging requests (SIP methods) and responses (response codes) based on HTTP (Hyper Text Transfer Protocol).
In the call function of the present disclosure, before the above-described exchange of requests and responses, image data/sound data is transmitted and a screen related to the user's state information is displayed. For the user terminal 20A shown in FIG. 5, mainly the functions on the image data/sound data transmitting side will be described, and for the user terminal 20B, mainly the functions on the user state information display side will be described.
Since the user terminal 20A and the user terminal 20B are described as having the same functions, the suffix A or B is omitted when there is no need to distinguish them; for example, it is possible to say that the user terminal 20 has an image acquisition unit 302. Note that the user terminal 20 may have only one of the media data transmitting function and the user state information display function.
The user terminal 20A has an image acquisition unit 302A, a sound acquisition unit 304A, a transmission unit 306A, a reception unit 312A, a data acquisition unit 314A, a generation unit 316A, an output control unit 318A, and a call processing unit 320A. The function of each unit in the user terminal 20A is realized by the control unit 202 shown in FIG. 4 executing the app program 210.
Note that while the user terminal 20A is executing the application related to the call function of the present disclosure, the camera function and the microphone function are set to on, and image data and sound data around the user terminal 20A are input.
(Data transmission function of the user terminal 20A)
The image acquisition unit 302A acquires image data captured by the imaging device 224. For example, the image data is image data including the first user or image data showing the state of the room. The acquired image data is stored, for example, in a buffer and deleted after a predetermined time has elapsed.
The sound acquisition unit 304A acquires sound data collected by the microphone 220. For example, the sound data is sound data including the voice of the first user, sound data including everyday household sounds, and the like. The acquired sound data is stored, for example, in a buffer and deleted after a predetermined time has elapsed.
The transmission unit 306A transmits media data including the image data acquired by the image acquisition unit 302A and/or the sound data acquired by the sound acquisition unit 304A to the server 10. The destination, the server 10, is set in advance. When transmitting data to the server 10, the transmission unit 306A may also transmit the identification information of the user terminal 20A or the identification information of the first user.
Through the above processing, the first user can include the image data and sound data in the media data and transmit it to the server 10 without performing any special operation. For example, even an elderly first user who is unfamiliar with computers and the like can easily have the transmission processing carried out, because no special operation is required.
(Session establishment function of the user terminal 20A)
As described later, when a call request is transmitted from the user terminal 20B, the reception unit 312A receives the call request via the server 10. Upon receiving the call request, the user terminal 20A notifies the first user that there is a call request from the second user of the user terminal 20B. When the first user performs an operation to answer the call request, the user terminal 20A transmits a call response to the user terminal 20B via the server 10.
When the call response is transmitted and a session is established between the user terminal 20A and the user terminal 20B via the server 10, the call processing unit 320A transmits and receives call packets in which the call data is packetized.
When the reception unit 312A directly or indirectly receives a call termination request from the user terminal 20B, the call processing unit 320A directly or indirectly transmits a response to the termination request to the user terminal 20B. The session thereby ends and the call is terminated. Note that the call termination request may also be transmitted from the user terminal 20A.
(Call control function of the server 10)
The server 10 has a reception unit 402, a call control unit 404, an acquisition unit 406, a determination unit 408, and a transmission unit 410. The function of each unit in the server 10 is realized by the control unit 102 shown in FIG. 3 executing the control program 108.
The reception unit 402 of the server 10 receives the image data/sound data transmitted from the transmission unit 306A of the user terminal 20A. The image data/sound data included in the received media data is stored, for example, in the RAM of the control unit 102.
The acquisition unit 406 acquires the image data/sound data stored in the RAM or the like. Alternatively, the acquisition unit 406 may acquire the image data/sound data transmitted by the user terminal 20A from the reception unit 402.
The determination unit 408 determines the state of the first user of the user terminal 20A based on the acquired image data/sound data. For example, the determination unit 408 analyzes the image data/sound data and determines the state of the first user, such as whether the first user is near the user terminal 20A. The user's state may be, for example, one state selected from among a plurality of states according to the analysis result of the image data/sound data.
The determination unit 408 determines the state each time image data/sound data is acquired, but state information for the updated state may be transmitted from the transmission unit 410 only when the state has changed. This can reduce the transmission processing load on the server 10.
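The send-only-on-change behavior described above could be sketched as follows. The `StateNotifier` name and the state labels are illustrative assumptions for the sketch.

```python
# Hypothetical sketch: re-evaluate the state on every media-data arrival,
# but notify the other terminal only on a state transition, reducing the
# server's transmission load.

class StateNotifier:
    def __init__(self):
        self.last_state = None
        self.sent = []          # stand-in for transmissions to terminal 20B

    def on_new_state(self, state: str) -> bool:
        """Return True if a notification was (notionally) transmitted."""
        if state == self.last_state:
            return False        # unchanged: skip the send
        self.last_state = state
        self.sent.append(state)
        return True

n = StateNotifier()
results = [n.on_new_state(s) for s in ["relaxed", "relaxed", "busy"]]
```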
The determination unit 408 may also determine the state of the first user using a learning model that has learned the first user's state with image data/sound data as training data.
The transmission unit 410 transmits information about the state of the first user (state information) to the user terminal 20B, which is associated with the identification information of the user terminal 20A and can call the user terminal 20A. For example, the transmission unit 410 transmits state information including whether the first user is available for a call to the user terminal 20B. The transmission unit 410 may also transmit the state information to the user terminal 20B only when the state has been updated.
Through the above processing, the state of the first user is determined using image data and sound data, and by informing the calling user of the state information of the called first user, the calling side can easily grasp the state of the called side.
The determination unit 408 may also perform object recognition on the acquired image data and determine the user's state based on the result of the object recognition. For example, the determination unit 408 may detect whether a person is included by performing object recognition on the image data. The object recognition may be performed using a learning model on which machine learning has been performed in advance for each object. It may also be performed using a learning model trained by supervised learning with images of the first user as training data.
In this case, the determination unit 408 determines whether a user (for example, the first user) is near the user terminal 20A based on the object recognition result. The determination unit 408 determines that a call is possible when the user is present and that a call is not possible when the user is absent. If the object recognition result shows that the user is sitting on a sofa or the like and has not moved for a predetermined time, the determination unit 408 may determine that the user is in a relaxed state, and if it detects that the user is moving, it may determine that the user is in a busy state. The determination unit 408 may identify the state from a recognition result by associating each recognition result with a state in advance.
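The rules in the preceding paragraph, which map presence and motion results from object recognition to a state, could be sketched as follows. The threshold and labels are assumptions made for illustration.

```python
# Hypothetical sketch: map object-recognition results (presence, time
# without motion) to an availability state. Threshold and labels are
# illustrative assumptions.

def state_from_recognition(person_present: bool,
                           seconds_without_motion: float) -> str:
    if not person_present:
        return "unavailable"           # nobody near the terminal
    if seconds_without_motion >= 60:   # e.g., sitting still on a sofa
        return "relaxed"
    return "busy"                      # user detected and moving

labels = [state_from_recognition(False, 0),
          state_from_recognition(True, 120),
          state_from_recognition(True, 5)]
```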
Through the above processing, the determination unit 408 can determine not only whether the first user is simply available for a call, but also more detailed states, for example, whether the first user is absent or is moving.
The determination unit 408 may also perform facial expression recognition on the first user's face included in the image data and determine the state of the first user based on the result of the facial expression recognition. For example, upon detecting that the first user is included in the image data, the determination unit 408 recognizes the facial expression of the first user. A known technique may be used for the facial expression recognition. The determination unit 408 uses AI (artificial intelligence) to estimate whether the first user is joyful, surprised, angry, sad, expressionless, and so on.
Through the above processing, the determination unit 408 can not only determine whether the first user is available for a call, but also estimate a more detailed emotional state. For example, the determination unit 408 can estimate what emotional state the first user is currently in from the first user's facial expression.
 The determination unit 408 may also perform frequency analysis on the sound data and determine the state of the first user based on the result of this frequency analysis. For example, from the result of the frequency analysis, the determination unit 408 determines states such as whether the first user's voice is included, whether the user is on a train, whether the user is outside, and whether the user is listening to music.
 Through the above processing, the determination unit 408 can estimate a more detailed sound-related state in addition to simply determining whether the first user is able to take a call. For example, the determination unit 408 can estimate whether the first user's surroundings are lively or quiet.
 The call control unit 404 identifies the user terminal 20A based on the call request received by the receiving unit 402 from the user terminal 20B. For example, the call control unit 404 identifies call destination information (for example, a URI) from the call request, and acquires the IP address of the user terminal 20A using the identified call destination information. The call control unit 404 may query a location server (not shown) to identify the IP address from the call destination information.
 The transmission unit 410 transmits a call request to the user terminal 20A identified by the call control unit 404. For example, the transmission unit 410 transmits the call request to the user terminal 20A having the identified IP address.
 The receiving unit 402 also receives a call response from the user terminal 20A in response to the transmitted call request. After that, the call control unit 404 establishes a call session and performs control so that call data is transmitted and received between the two terminals.
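The destination lookup performed by the call control unit 404 can be sketched as follows. The URI, address, and table contents are illustrative assumptions; in practice the location server would be queried over the network:

```python
# Assumed in-memory table standing in for the location server (not shown).
LOCATION_TABLE = {"sip:first-user@example.com": "192.0.2.10"}

def resolve_destination(uri: str) -> str:
    """Identify the called terminal's IP address from destination info."""
    address = LOCATION_TABLE.get(uri)
    if address is None:
        raise LookupError(f"unknown call destination: {uri}")
    return address
```

The call request would then be forwarded to the resolved address, and a session established once the call response arrives.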
 (Status notification function of the user terminal 20B)
 The user terminal 20B has an image acquisition unit 302B, a sound acquisition unit 304B, a transmission unit 306B, a receiving unit 312B, a data acquisition unit 314B, a generation unit 316B, an output control unit 318B, and a call processing unit 320B. The function of each unit in the user terminal 20B is realized by the control unit 202 shown in FIG. 4 executing the application program 210. The functions of the user terminal 20B are the same as those of the user terminal 20A. The status notification function is mainly described below.
 The receiving unit 312B receives, from the server 10, the state information of the first user of the user terminal 20A. For example, the received state information of the first user is stored in the RAM of the control unit 202 or the like. At this time, the receiving unit 312B may also receive the identification information of the user terminal 20A that transmitted the state information of the first user, or the identification information of the first user, and store it in the RAM.
 The data acquisition unit 314B acquires the state information of the first user stored in, for example, the RAM. The data acquisition unit 314B may also acquire the state information of the first user from the receiving unit 312B.
 The generation unit 316B generates, based on the state information of the first user, state notification information capable of notifying the state of the first user. For example, the generation unit 316B generates a UI component including character data, image data, or the like representing the state of the first user indicated by the state information.
 The output control unit 318B controls output of the state notification information generated by the generation unit 316B. For example, if the state notification information is state display information, the output control unit 318B controls display on the display screen of the display unit 214; if the state notification information is state sound information, it controls the speaker 222 to output sound. For example, the output control unit 318B may control the state notification information to be displayed as a popup.
 Through the above processing, the second user can grasp the current situation of the first user by looking at the screen of the user terminal 20B. As a result, if the first user is in a state in which a call is possible, the second user can simply call the first user.
 The output control unit 318B may also control display of the state notification information within a predetermined area of the display screen. For example, the output control unit 318B performs control so that the state notification information is displayed in a window set in advance within the display screen. The output control unit 318B may also perform control so that the state notification information is displayed in a predetermined area set within the execution screen of the application that realizes the disclosed call function. The predetermined area may also be an area in which a widget capable of executing the above-described application is displayed. For example, the state notification information is displayed on this widget and is updated on the widget as appropriate.
 Through the above processing, the second user can grasp the current situation of the first user by looking at the predetermined area in the screen.
 Further, when the state notification information is updated, the output control unit 318B may control display of the updated state notification information within the predetermined area. For example, when the state notification information is updated from indicating a first state to indicating a second state, the output control unit 318B controls display of the state notification information indicating the second state within the predetermined area.
 Through the above processing, the display screen is updated each time the state notification information is updated, rather than each time it is generated. This suppresses the frequency of screen updates caused by the state notification information and reduces the processing load on the user terminal 20B.
 Further, when an icon of the application that executes calls via the server 10 exists on the display screen, the generation unit 316B may generate an image of the icon as the state notification information. For example, using a function for dynamically changing the display of an icon, such as that of a calendar icon, the generation unit 316B sets in advance an icon image or characters corresponding to each state indicated by the state notification information, and identifies the icon image or characters corresponding to the state information acquired from the server 10. In this case, the output control unit 318B performs control so that the image or characters identified by the generation unit 316B are displayed on the icon.
 Through the above processing, the second user can grasp the state of the first user by looking at the icon of the application that executes the functions of the present disclosure. For example, by looking at the icon displayed on the home screen, it is possible to determine whether the first user is in a state in which a call is possible.
 The generation unit 316B may also generate state notification information including a moving image or an animation indicating the state of the first user. For example, the generation unit 316B may use, as the state notification information, a moving image, an animation, or the like representing the state of the first user indicated by the state information. As a specific example, the moving image is a past moving image of the first user captured in a predetermined room. In this case, when the state information indicates that the first user is relaxed (for example, sitting on a sofa), the generation unit 316B selects a past moving image of the first user sitting on the sofa in the room; when the state information indicates that the first user is absent, it selects a past moving image in which the first user is not in the room.
 Further, when acquiring image data captured by the user terminal 20A as the state information, the generation unit 316B may apply an effect (a mosaic, a substitute background, or the like) to the images of persons other than the first user to protect their privacy.
 The generation unit 316B may also express the state of the first user, as an animation, through the movement of a predetermined object. For example, the generation unit 316B moves the object quickly when the first user is moving or exercising, hides the object when the first user is absent, and moves the object slowly when the first user is in a relaxed state.
 Through the above processing, privacy can be protected when the state of the first user is displayed, and the first user can permit execution of the functions of the present disclosure with peace of mind.
 Further, when the information regarding the state of the first user indicates that a call is possible, the generation unit 316B may identify a UI component for starting call processing with the user terminal 20A. In this case, the output control unit 318B may control display of the UI component on the display screen together with output of the state notification information.
 For example, when the state information indicates that a call is possible, the generation unit 316B identifies or selects a button for calling the first user. Note that the state information may include a plurality of states; it suffices that whether a call is possible is set in advance for each state. The output control unit 318B may also perform control so that the UI component is displayed in association with the state notification information, for example, at a position adjacent to the displayed information.
 Through the above processing, if the first user is in a state in which a call is possible, the second user can check the first user's state and then smoothly make a call by operating the UI component.
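The behavior described above, in which a call button is attached only when the received state indicates that a call is possible, might look like the following sketch. The state names and the per-state callable setting are assumptions:

```python
# Assumed per-state setting of whether a call is possible.
CALLABLE_STATES = {"relaxed", "watching_tv"}

def build_state_notification(user_name: str, state: str) -> dict:
    """Build a minimal UI description: state text, plus a call button
    displayed adjacent to it when the state allows a call."""
    ui = {"text": f"{user_name}: {state}"}
    if state in CALLABLE_STATES:
        ui["button"] = "call"
    return ui
```

Pressing the button would then trigger the call request described next.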
 When the second user operates the UI component for calling the first user, the transmission unit 306B transmits a call request to the user terminal 20A via the server 10. If an OK call response is returned from the user terminal 20A in response to this call request, the call is started.
 As described above, by executing the functions of each device, the user who originates a call can grasp the current state of the called user before starting the call. In addition, when the current state includes the emotions, movements, and the like of the called user, the call can be made with that state already understood, so the start of the call can proceed smoothly.
 <Screen examples>
 Next, with reference to FIG. 6, screen examples including the state of the called user according to the first embodiment will be described. FIG. 6A is a diagram showing a screen example 1 including the state of the called user according to the first embodiment. In the example shown in FIG. 6A, the state notification information is displayed on the screen as a popup A1. Since the state information indicates that the user (Mr. A) is sitting on a sofa, it is determined that a call is possible. Accompanying this determination, the UI component A2 of a "call" button is displayed in the screen.
 FIG. 6B is a diagram showing a screen example 2 including the state of the called user according to the first embodiment. In the example shown in FIG. 6B, the state notification information is displayed on the icon B1 of the application that executes the call function of the present disclosure. In this case, a button for making a call may be displayed after the user clicks the icon B1 and transitions to the execution screen of the application.
 FIG. 6C is a diagram showing a screen example 3 including the state of the called user according to the first embodiment. In the example shown in FIG. 6C, the state notification information is displayed in a predetermined area C1 of the execution screen of the application that executes the call function of the present disclosure. Since the state information indicates that the user is sitting on a sofa, it is determined that a call is possible. Accompanying this determination, the UI component A2 of the "call" button is displayed in the execution screen.
 FIG. 6D is a diagram showing a screen example 4 including the states of called users according to the first embodiment. In the example shown in FIG. 6D, the execution screen of the application that executes the call function of the present disclosure has a plurality of predetermined areas, and the current states of different users are displayed in the respective predetermined areas C1 and D1. For example, Mr. B is absent, and a black screen or the like suggesting a dark room is displayed in the predetermined area D1 where his state notification information is displayed.
 In the example shown in FIG. 6D, the server 10 sets a plurality of users (Mr. A and Mr. B in the example shown in FIG. 6D) in the call handling information as call destination candidates for a predetermined user. The server 10 acquires image data/sound data from the user terminals 20 of Mr. A and Mr. B, and determines the state of each. Upon receiving the state information of each user from the server 10, the user terminal 20 can display the state notification information of the plurality of users on the screen, as shown in FIG. 6D.
 As a result, the user who originates a call can easily grasp the current states of a plurality of call destination candidates.
 <Operation>
 Next, each operation of the information processing system 1 will be described. FIG. 7 is a flowchart showing an example of operation processing of the server 10 according to the first embodiment. The following describes an example in which the user terminal 20A of the first user, who is the called party, transmits image data and the like, and the user terminal 20B of the second user, who is the calling party, displays the state of the first user.
 In step S102, the receiving unit 402 of the server 10 receives the image data/sound data included in the media data transmitted from the user terminal 20A of the first user. Next, the acquisition unit 406 of the server 10 acquires the received image data/sound data.
 In step S104, the determination unit 408 of the server 10 determines the current state of the first user based on the acquired image data/sound data.
 In step S106, the determination unit 408 of the server 10 determines whether the determined state has changed from the current state. If the state has changed (step S106-YES), the process proceeds to step S108; if the state has not changed (step S106-NO), the process returns to step S102.
 In step S106, the determination unit 408 of the server 10 also identifies, as the notification destination, the user terminal 20B that is associated with the identification information of the user terminal 20A that transmitted the media data including the image data/sound data, or with the identification information of the first user, and that can make calls with the user terminal 20A. The determination unit 408 can identify the notification destination by referring to, for example, the call handling information described above.
 In step S108, the transmission unit 410 of the server 10 transmits the information regarding the determined state of the first user (state information) to the user terminal 20B identified as the notification destination.
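The server-side loop of FIG. 7 (steps S102 to S108) can be condensed into a few lines: the state is re-determined for each received batch of media data, and state information is sent to the notification destination only when the state has changed. The callables below are placeholders for the units described above:

```python
def server_step(media_data, previous_state, determine, notify):
    """One pass of FIG. 7: S102/S104 determine the state, S106 checks for
    a change, S108 notifies the destination terminal only on a change."""
    state = determine(media_data)      # S102 + S104
    if state != previous_state:        # S106
        notify(state)                  # S108
    return state
```

Returning the state lets the caller carry it into the next pass as `previous_state`, which is how repeated identical states are suppressed.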
 FIG. 8 is a flowchart showing an example of operation processing on the side that displays the state according to the first embodiment. The example shown in FIG. 8 mainly describes the processing performed by the user terminal 20B. In this example, the state notification information is state display information, and the output control unit 318B controls display of the state display information on the display screen.
 In step S202, the receiving unit 312B of the user terminal 20B receives the state information of the first user from the server 10. The data acquisition unit 314B of the user terminal 20B then acquires the received state information.
 In step S204, the generation unit 316B of the user terminal 20B generates state display information based on the acquired state information of the first user. For example, the generation unit 316B generates a UI component or the like including an image, characters, or the like indicating the state of the first user.
 In step S206, the output control unit 318B of the user terminal 20B controls display of the generated state display information on the display screen.
 In step S208, the user terminal 20B determines whether the second user has operated the UI component for starting a call. If there is a call start operation (step S208-YES), the transmission unit 306B transmits a call request to the user terminal 20A via the server 10 and the process proceeds to step S210; if there is no call start operation (step S208-NO), the process returns to step S202.
 In step S210, the receiving unit 312B of the user terminal 20B determines whether a call response has been received from the user terminal 20A via the server. If a call response is received (step S210-YES), the process proceeds to step S212; if no call response is received (step S210-NO), the call processing ends because the first user did not answer, and the process returns to step S202.
 In step S212, the user terminal 20B starts a call session, transmits and receives call data to and from the user terminal 20A, and starts the call.
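Steps S208 to S212 on the terminal side can be sketched the same way; the three callables stand in for transmitting the request via the server, waiting for the call response, and starting the call session:

```python
def start_call(send_request, wait_for_response, start_session) -> str:
    """Sketch of FIG. 8, steps S208-S212: send a call request, then start
    the call session only if a call response is received."""
    send_request()                     # S208: call request via the server 10
    if wait_for_response():            # S210: call response received?
        start_session()                # S212: exchange call data
        return "talking"
    return "no_answer"                 # first user did not respond
```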
 As described above, according to the information processing system of the first embodiment, the user who originates a call can grasp the current state of the called user before starting the call.
 The information processing described in the first embodiment may also be implemented as a program to be executed by a computer. By installing this program from a server or the like and causing a computer to execute it, the information processing described above can be realized.
 It is also possible to record this program on a recording medium and cause a computer to read the recording medium on which the program is recorded, thereby realizing the information processing described above. The recording medium can store the program in a "non-transitory tangible medium".
 Various types of recording media can be used, including recording media that record information optically, electrically, or magnetically, such as CD-ROMs, flexible disks, and magneto-optical disks, and semiconductor memories that record information electrically, such as ROMs and flash memories.
 [Second embodiment]
 FIG. 9 is a diagram explaining an overview of a system 2 according to the second embodiment of the present disclosure. The system 2 shown in FIG. 9 applies the system described in the first embodiment to a system for managing images, and includes a management device 10A, a server 10B, a user terminal 20, an image output device 30, and a camera 60. The image output device 30 is connected to a display device 40, and the display device 40 is controlled by a remote control device 50. The camera 60 is attached to the display device 40, and images acquired by the camera 60 are input to the image output device 30. For example, the camera 60 is a web camera and may have a built-in microphone. The server 10B has the functions of the server 10 described in the first embodiment.
 The management device 10A or the server 10B and the user terminal 20 can communicate with each other via a network N1 using, as examples of communication technology, a wireless LAN, a fourth- or fifth-generation mobile communication system (4G or 5G), LTE (Long Term Evolution), or the like. The management device 10A and the image output device 30 can communicate with each other via a wireless network N2, such as a third-generation mobile communication system (3G), whose communication charges are lower than those of the network N1 but which is slower. Although the network N1 and the wireless network N2 are described separately for the sake of explanation, these networks can be connected to each other via the Internet. The network configuration is not limited to the above example.
 The management device 10A first acquires, from an image contributor through a website on the Internet, information such as the name and address of the image viewer or the contributor. Note that information such as the name and address of the viewer or contributor is not essential. At this time, the management device 10A generates an identifier (referred to as a device ID) of the image output device 30 used by the viewer. The device ID is then notified to the image contributor, for example, by e-mail.
 The management device 10A sets the generated device ID in the image output device 30. The image output device 30 is then shipped by the administrator of the management device 10A to the address of the image viewer. The image output device 30 incorporates, for example, a 3G communication module and is configured so that, when the power is turned on, it immediately starts communication with the management device 10A using the set device ID.
 Meanwhile, the image contributor downloads an image-sharing application that runs on the user terminal 20, such as a smartphone or tablet. This application accesses the management device 10A using the notified device ID. The management device 10A can associate the user terminal 20 (contributor) with the image output device 30 (viewer) using the device ID notified from the application as a key. This information corresponds to the call handling information described above.
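The association keyed on the device ID might be sketched as below; the registry shape and names are illustrative assumptions:

```python
# Hypothetical registry: device ID -> linked viewer device and contributor.
associations = {}

def register_device(device_id, image_output_device):
    """Record the device ID generated for the viewer's image output device."""
    associations[device_id] = {"viewer_device": image_output_device}

def link_contributor(device_id, user_terminal):
    """Associate the contributor's terminal using the notified device ID."""
    associations[device_id]["contributor_terminal"] = user_terminal
```

Once both entries exist, the record plays the role of the call handling information: it tells the server which terminal to notify about which viewer's state.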
 The contributor can then use the application to photograph various subjects. The application automatically transmits image data obtained by photographing to the management device 10A via the network N1. The contributor is not required to perform any special operation to transmit the image data to the management device 10A.
 The management device 10A accumulates the image data transmitted from the application on the user terminal 20 and sequentially distributes it to the image output device 30. The image output device 30 displays the image data on the display device 40 in accordance with an instruction from the viewer. The image output device 30 may also have a microphone and a speaker and is capable of IP calls.
 Here, the display device 40 is, for example, a television widely used in ordinary homes, and the remote control device 50 is a remote controller. The image output device 30 is connected to the display device 40 via, for example, HDMI (High-Definition Multimedia Interface) (registered trademark), and can acquire, via HDMI, control signals emitted from the remote control device 50.
 The image output device 30 acquires the control signals emitted from the remote control device 50 and can thereby grasp the content of the operations input by the viewer. That is, the viewer can browse the distributed image data using the familiar television remote control.
 The camera 60 is, for example, a web camera and may have a built-in microphone. Image (still image or moving image) data/sound data captured by the camera 60 is input to the image output device 30. The image output device 30 transmits media data related to the input image data/sound data to the server 10B.
 上述した画像共有を行うアプリケーションに、第1実施形態で説明した通話を開始することができる機能が実装される。この場合、図1等に示すユーザ端末20Aは、画像出力装置30に相当する。例えば、画像出力装置30は、ユーザ端末20の送信部306、受信部312、データ取得部314、生成部316、出力制御部318、通話処理部320を有する。 The function for starting a call described in the first embodiment is implemented in the application for image sharing described above. In this case, the user terminal 20A shown in FIG. 1 etc. corresponds to the image output device 30. FIG. For example, the image output device 30 has a transmission section 306 , a reception section 312 , a data acquisition section 314 , a generation section 316 , an output control section 318 and a call processing section 320 of the user terminal 20 .
 Specifically, media data including image data and/or sound data acquired by the camera 60 is transmitted to the server 10B via the image output device 30. The server 10B determines the state of the viewer of the image based on the image data and/or sound data included in the acquired media data. For example, by analyzing the image data and/or sound data, the server 10B determines states such as whether the viewer is moving around the room containing the display device 40, sitting in a chair, or absent from the room, and whether the room is lively or quiet.
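The publication does not specify how analysis results are mapped to a viewer state. A minimal sketch of the mapping described above, assuming hypothetical detection labels and a hypothetical loudness threshold (none of which are defined in the source), might look like this:

```python
def determine_viewer_state(detections, sound_level_db):
    """Map illustrative analysis results to a coarse viewer state.

    `detections` is assumed to be a list of labels from an image-recognition
    step; `sound_level_db` is a loudness estimate from the sound data.
    The labels, the 60 dB threshold, and the state names are all assumptions.
    """
    if "person" not in detections:
        return "absent"          # nobody in the room with the display
    if sound_level_db > 60:
        return "lively"          # the room is noisy/busy
    if "person_seated" in detections:
        return "seated"          # viewer sitting in a chair
    return "quiet"               # present but the room is quiet

print(determine_viewer_state(["person", "person_seated"], 40))  # seated
```

A real system would obtain `detections` from an object-recognition model and `sound_level_db` from the audio stream; the table-like conditionals here only illustrate the judgment step.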
 The server 10B refers to the information held in the management device 10A, identifies the user terminal 20 associated with the image output device 30, and transmits the viewer's state information to that user terminal 20. As described above, the user terminal 20 generates state notification information (for example, state display information) based on the state information and controls its display on the screen (for example, FIG. 6).
 When the user terminal 20 detects that the contributor has operated the call button, it transmits a call request to the image output device 30 via the server 10B. Upon receiving the call request, the image output device 30 displays on the display device 40 a prompt asking whether to answer. For example, a first button on the remote control device 50 is assigned to answering, and a second button is assigned to declining.
 When the viewer operates the first button, the image output device 30 detects the operation and transmits a call response to the user terminal 20 via the server 10B. The call then begins.
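The button handling above can be sketched as follows. The button identifiers and the message shape are assumptions for illustration; the publication defines no wire format for the call request/response exchange:

```python
def handle_remote_button(button, send):
    """Hypothetical handling of the remote control's answer/decline buttons
    on the image output device 30. `send` forwards the call response toward
    the user terminal 20 (via the server 10B in the described system)."""
    if button == "button_1":   # first button: assigned to "answer"
        send({"type": "call_response", "accepted": True})
        return "call_started"
    if button == "button_2":   # second button: assigned to "decline"
        send({"type": "call_response", "accepted": False})
        return "call_declined"
    return "ignored"           # any other remote key is not part of the flow

sent = []
print(handle_remote_button("button_1", sent.append))  # call_started
```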
 As described above, according to the second embodiment, a system for sharing images between viewers and contributors can be given the call function disclosed in the first embodiment. The management device 10A and the server 10B may also be the same server.
 [Third embodiment]
 Next, the third embodiment of the present disclosure will be described. In the third embodiment, the function of the determination unit 408 described in the first embodiment is provided in the user terminal 20. For example, the determination unit of the user terminal 20 determines the state of the user using the terminal based on the image data acquired by the image acquisition unit 302 and/or the sound data acquired by the sound acquisition unit 304.
 The method for determining the user's state is as described above. For example, the determination unit of the user terminal 20 performs object recognition on the image data and/or frequency analysis on the sound data, and determines the user's state using the results.
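The frequency-analysis step mentioned above is not detailed in the publication. As a minimal illustration, a naive discrete Fourier transform can locate the strongest frequency component of a sound frame; a real implementation would use an FFT library, and the signal here is synthetic:

```python
import math
import cmath

def dominant_frequency(samples, sample_rate):
    """Naive DFT returning the strongest frequency (Hz) in `samples`.

    Illustrative only: O(n^2) and limited to real-valued input frames.
    """
    n = len(samples)
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2):  # skip DC, stop at Nyquist
        s = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
        if abs(s) > best_mag:
            best_k, best_mag = k, abs(s)
    return best_k * sample_rate / n

# A 5 Hz sine sampled at 100 Hz for one second
sig = [math.sin(2 * math.pi * 5 * t / 100) for t in range(100)]
print(dominant_frequency(sig, 100))  # 5.0
```

Features such as this dominant frequency (or the overall energy) could then feed a state judgment like "speech present" versus "silence".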
 The transmission unit 306 of the user terminal 20 transmits the information about the user's state (state information) determined by the determination unit toward other user terminals before a call. For example, to deliver the state information to another user terminal, the transmission unit 306 may first transmit it to the server 10. Upon acquiring the state information, the server 10 identifies the notification destination (another user terminal) associated with the user terminal 20 that transmitted it, and transmits the state information to the identified destination. Upon receiving the state information, the destination user terminal 20 executes the same processing as in the first embodiment.
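The relay step above can be sketched as a simple lookup-and-forward on the server side. The association table, terminal identifiers, and `send` callback are illustrative assumptions:

```python
# Hypothetical association table: which terminals are notified of each
# terminal's user state. In the described system this mapping is held
# on the server 10 side.
DESTINATIONS = {"terminal_A": ["terminal_B", "terminal_C"]}

def relay_state_info(sender_id, state_info, send):
    """Forward state information to every terminal associated with the sender.

    `send(target, state_info)` stands in for the actual transmission;
    returns the list of notified destinations.
    """
    targets = DESTINATIONS.get(sender_id, [])
    for target in targets:
        send(target, state_info)
    return targets
```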
 As described above, according to the third embodiment, the load on the server 10 can be distributed. In addition, since the user terminal 20 processes the image data and sound data itself and does not transmit them externally, privacy can also be protected.
 [Fourth embodiment]
 Next, the fourth embodiment of the present disclosure will be described. In the fourth embodiment, some of the functions of the determination unit 408 described in the first embodiment are provided in the user terminal 20. For example, the preprocessing unit of the user terminal 20 performs object recognition on the image data acquired by the image acquisition unit 302 and/or frequency analysis on the sound data acquired by the sound acquisition unit 304, detecting objects and extracting frequency features.
 The transmission unit 306 of the user terminal 20 transmits to the server 10 media data containing processing result information, which includes the object recognition results and/or the feature values (frequency analysis results) produced by the preprocessing unit. When the server 10 acquires the media data, the determination unit 408 determines the user's state based on the processing result information included in it. For example, by holding a table that associates a user state with each obtainable result, the determination unit 408 can determine the user's state from the result information.
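An illustrative version of the table mentioned above, keyed by (object recognition result, frequency analysis result) pairs received as processing result information. Both the keys and the state names are assumptions, since the publication does not enumerate them:

```python
# Hypothetical lookup table that the determination unit 408 might hold.
STATE_TABLE = {
    ("person_detected", "speech"): "talking",
    ("person_detected", "silence"): "idle",
    ("no_person", "silence"): "absent",
}

def judge_state(object_result, sound_result):
    """Determine the user's state from preprocessed result information."""
    return STATE_TABLE.get((object_result, sound_result), "unknown")

print(judge_state("person_detected", "speech"))  # talking
```

Because only these compact results (not raw image/sound frames) cross the network, this arrangement also explains the communication savings claimed for the fourth embodiment.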
 After determining the user's state, the server 10 identifies the notification destination associated with the user terminal 20 that transmitted the media data, and transmits information about the user's state (state information) to the identified destination. Upon receiving the state information, the destination user terminal 20 executes the same processing as in the first embodiment.
 As described above, according to the fourth embodiment, the amount of communication can be reduced compared with the first embodiment, and the computational load of the determination unit can be shared between the terminal and the server.
 The embodiments described above are examples for explaining the present invention and are not intended to limit the present invention to those embodiments; various modifications are possible without departing from the gist of the invention.
 [Modifications]
 For example, as Modification 1, the output control unit 318 of the user terminal 20 may control the operation of devices according to the state information, in addition to controlling the output of the state notification information generated by the generation unit 316. As a specific example, the output control unit 318 may also function as an operation control unit that controls devices connected via a network, for instance adjusting the brightness of the room lighting, the volume of a device playing music (the terminal itself or a music player), or the pausing or volume of a device playing TV or video (the terminal itself or a television).
 For example, if the called party's state indicates that a call is possible, the output control unit 318 may brighten the room lighting, lower the music volume, or pause the video being played. This creates an environment in which the calling user can converse comfortably. The output control unit 318 may also hold a control command associated with each user state and execute the control command specified by the state information.
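The state-to-command association above can be sketched as a dispatch table. The command table, device names, and actions are illustrative; the publication only requires that each state map to some control command:

```python
def apply_state_commands(state, devices):
    """Execute the control commands associated with the called party's state.

    `devices` maps a device name to a callable that performs an action,
    standing in for actual network control of home equipment.
    """
    commands = {
        "call_possible": [
            ("light", "brighten"),      # make the room brighter
            ("music", "volume_down"),   # quieten background music
            ("video", "pause"),         # pause playback for the call
        ],
    }
    executed = []
    for device, action in commands.get(state, []):
        devices[device](action)
        executed.append((device, action))
    return executed
```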
 As Modification 2, although the generation unit 316 is provided in the user terminal 20 in each of the embodiments described above, it may instead be provided in the server 10. For example, the generation unit of the server 10 generates the state notification information according to the user's state determined by the determination unit 408.
 In this case, the transmission unit 410 of the server 10 transmits the state notification information to the user terminal 20 as the information about the user's state (state information). That is, in Modification 2, the information about the user's state transmitted from the server 10 includes the state notification information. Upon acquiring the state notification information, the output control unit 318 of the user terminal 20 controls its display on the screen. This makes it easy to add or modify state notification information on the server 10 side.
 As Modification 3, the determination unit 408 may determine the user's state using information acquired from predetermined sensors (for example, photosensors such as illuminance sensors and motion sensors, or ultrasonic sensors) instead of, or in addition to, the media data. For example, the determination unit 408 can determine that the called party is present when the detection signal of a predetermined sensor is ON. Although introducing such sensors may incur cost, this makes it possible to determine the called party's pre-call state in the same manner as in the embodiments described above. Moreover, using both the media data and the signals acquired from the predetermined sensors makes it possible to determine the called party's pre-call state more appropriately.
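The combined judgment above can be sketched as follows. The publication leaves the fusion rule open; a simple OR is assumed here, so either source alone can establish presence:

```python
def user_present(media_says_present, sensor_signal_on):
    """Combine the media-data judgment with a presence-sensor signal.

    `media_says_present`: result of image/sound analysis (assumed boolean).
    `sensor_signal_on`: True when the predetermined sensor's detection
    signal is ON. The OR fusion is an assumption for illustration.
    """
    return bool(media_says_present) or bool(sensor_signal_on)
```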
Reference Signs List: 1...information processing system, 2...system, 10...server, 10A...management device, 10B...server, 20...user terminal, 30...image output device, 40...display device, 60...camera, 102...control unit, 106...storage unit, 202...control unit, 204...RAM, 208...storage unit, 220...microphone, 222...speaker, 224...imaging device, 302...image acquisition unit, 304...sound acquisition unit, 306...transmission unit, 312...reception unit, 314...data acquisition unit, 316...generation unit, 318...output control unit, 320...call processing unit, 402...reception unit, 404...call control unit, 406...acquisition unit, 408...determination unit, 410...transmission unit

Claims (17)

  1.  An information processing method in which a communication device executes:
      acquiring image data and/or sound data;
      determining a state of a user using the communication device based on the image data and/or the sound data; and
      outputting information about the user's state toward another communication device associated with the communication device before a call.
  2.  The information processing method according to claim 1, wherein the determining includes performing object recognition on the image data and determining the user's state based on at least a result of the object recognition.
  3.  The information processing method according to claim 1 or 2, wherein the determining includes recognizing a facial expression of the user included in the image data and determining the user's state based on at least a result of the facial expression recognition.
  4.  The information processing method according to any one of claims 1 to 3, wherein the determining includes performing frequency analysis on the sound data and determining the user's state based on at least a result of the frequency analysis.
  5.  A program that causes a communication device to execute:
      acquiring image data and/or sound data;
      determining a state of a user using the communication device based on the image data and/or the sound data; and
      outputting information about the user's state toward another communication device associated with the communication device before a call.
  6.  A communication device comprising:
      an acquisition unit that acquires image data and/or sound data;
      a determination unit that determines a state of a user using the communication device based on the image data and/or the sound data; and
      a transmission unit that outputs information about the user's state toward another communication device associated with the communication device before a call.
  7.  An information processing method in which a communication device executes:
      acquiring information about a state of a user using another communication device, the state being determined based on media data related to image data and/or sound data acquired by the other communication device;
      generating, based on the information about the user's state, state notification information capable of notifying the user's state; and
      controlling output of the state notification information.
  8.  The information processing method according to claim 7, wherein the controlling output includes:
      controlling display of the state notification information within a predetermined area of a display screen; and
      when the state notification information is updated, controlling display of the updated state notification information within the predetermined area.
  9.  The information processing method according to claim 8, wherein the generating includes, when an icon of an application that executes calls via an information processing device is present on the display screen, generating an image of the icon as the state notification information.
  10.  The information processing method according to any one of claims 7 to 9, wherein the generating includes generating state notification information including a moving image or animation indicating the user's state.
  11.  The information processing method according to any one of claims 7 to 9, wherein:
      the generating includes, when the information about the user's state indicates that a call is possible, identifying a UI component that initiates call processing with the other communication device; and
      the controlling output includes controlling display of the UI component on a display screen together with output of the state notification information.
  12.  A program that causes a communication device to execute:
      acquiring information about a state of a user using another communication device, the state being determined based on media data related to image data and/or sound data acquired by the other communication device;
      generating, based on the information about the user's state, state notification information capable of notifying the user's state; and
      controlling output of the state notification information.
  13.  A communication device comprising:
      an acquisition unit that acquires information about a state of a user using another communication device, the state being determined based on media data related to image data and/or sound data acquired by the other communication device;
      a generation unit that generates, based on the information about the user's state, state notification information capable of notifying the user's state; and
      an output control unit that controls output of the state notification information.
  14.  An information processing method in which an information processing device executes:
      acquiring media data related to image data and/or sound data from a first communication device;
      determining a state of a user of the first communication device based on the media data; and
      transmitting information about the user's state to a second communication device that is associated with identification information of the first communication device and that can conduct calls with the first communication device.
  15.  The information processing method according to claim 14, wherein the determining includes determining the user's state based on at least processing result information included in the media data, the processing result information including an object recognition result for the image data and/or a frequency analysis result for the sound data.
  16.  A program that causes an information processing device to execute:
      acquiring media data related to image data and/or sound data transmitted from a first communication device;
      determining a state of a user of the first communication device based on the media data; and
      transmitting information about the user's state to a second communication device that is associated with identification information of the first communication device and that can conduct calls with the first communication device.
  17.  An information processing device comprising:
      an acquisition unit that acquires media data related to image data and/or sound data from a first communication device;
      a determination unit that determines a state of a user of the first communication device based on the media data; and
      a transmission unit that transmits information about the user's state to a second communication device that is associated with identification information of the first communication device and that can conduct calls with the first communication device.
PCT/JP2022/012347 2021-03-19 2022-03-17 Information processing method, program and information processing device WO2022196769A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021-045779 2021-03-19
JP2021045779A JP2022144667A (en) 2021-03-19 2021-03-19 Program, information processing method, and information processing device

Publications (1)

Publication Number Publication Date
WO2022196769A1 true WO2022196769A1 (en) 2022-09-22

Family

ID=83321082

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/012347 WO2022196769A1 (en) 2021-03-19 2022-03-17 Information processing method, program and information processing device

Country Status (2)

Country Link
JP (1) JP2022144667A (en)
WO (1) WO2022196769A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0758823A (en) * 1993-08-12 1995-03-03 Nippon Telegr & Teleph Corp <Ntt> Telephone dial system
JPH09307868A (en) * 1996-03-15 1997-11-28 Toshiba Corp Communication equipment and communication method
JP2003023474A (en) * 2001-07-06 2003-01-24 Nec Corp Mobile object terminal and ringing method for incoming call
JP2004214934A (en) * 2002-12-27 2004-07-29 Matsushita Electric Ind Co Ltd Terminal and program for presence information processing, and presence service providing server

Also Published As

Publication number Publication date
JP2022144667A (en) 2022-10-03

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22771514

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22771514

Country of ref document: EP

Kind code of ref document: A1