WO2015184701A1 - Single-call live communication terminal, method and tool - Google Patents
- Publication number
- WO2015184701A1 (PCT/CN2014/086574)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- audio
- communication terminal
- trusted user
- call
- video
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/70—Circuitry for compensating brightness variation in the scene
- H04N23/71—Circuitry for evaluating the brightness variation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/80—Camera processing pipelines; Components thereof
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/141—Systems for two-way working between two video terminals, e.g. videophone
- H04N7/142—Constructional details of the terminal equipment, e.g. arrangements of the camera and the display
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/141—Systems for two-way working between two video terminals, e.g. videophone
- H04N7/147—Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/18—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/18—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
- H04N7/183—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast for receiving images from a single remote source
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/44—Event detection
-
- G—PHYSICS
- G08—SIGNALLING
- G08B—SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
- G08B13/00—Burglar, theft or intruder alarms
- G08B13/16—Actuation by interference with mechanical vibrations in air or other fluid
- G08B13/1654—Actuation by interference with mechanical vibrations in air or other fluid using passive vibration detection systems
- G08B13/1672—Actuation by interference with mechanical vibrations in air or other fluid using passive vibration detection systems using sonic detecting means, e.g. a microphone operating in the audio frequency range
Definitions
- The present invention relates to communication technologies, and in particular to a single-call live communication terminal, method and tool.
- In existing remote video surveillance, a camera is installed in the home and the collected video signal is sent to a remote monitoring terminal (such as a user's mobile phone).
- Remote monitoring is performed by displaying the captured video on the screen of the monitoring terminal.
- However, such remote video surveillance is not two-way communication. Although the user on the monitoring side can see the situation inside the home, the people in the home cannot hear the monitoring user's voice, so the two sides cannot interact and the user experience is poor.
- One of the technical problems solved by the present invention is to enhance real-time interaction between a person who needs care and stays in a fixed place and a mobile user who is elsewhere, thereby improving the communication experience. This corresponds to a communication pattern that is ubiquitous in real life: a specific social relationship exists between the visiting user and the visited place and person (for example, the elderly and their children, or parents and children), so an identity confirmation step of the kind needed in a conversation between strangers is unnecessary.
- According to one embodiment, a single-call live communication terminal is provided, including a camera, an audio collection unit, a speaker, and a transceiver. The video and audio respectively collected by the camera and the audio collection unit are sent by the transceiver, and the audio received by the transceiver is output through the speaker. In response to receiving a connection request from a trusted user, the transceiver automatically sends a response to the connection request, thereby automatically establishing IP communication with the trusted user.
- "Single-call" means that two-way communication can be carried out automatically after a one-way call.
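- The auto-answer behaviour described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name `LiveTerminal`, the string user IDs, and the choice to silently ignore untrusted callers are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class LiveTerminal:
    """Hypothetical model of the terminal's auto-answer logic."""
    trusted_users: set = field(default_factory=set)  # assumed whitelist of user IDs
    sessions: set = field(default_factory=set)       # users with an open IP session

    def on_connection_request(self, user_id: str) -> bool:
        """Answer a trusted user's request automatically, with no manual confirmation."""
        if user_id not in self.trusted_users:
            # Assumption: requests from untrusted callers are simply not answered.
            return False
        self.sessions.add(user_id)  # response sent, IP session established
        return True
```

- For example, `LiveTerminal(trusted_users={"dad"}).on_connection_request("dad")` returns `True` without anyone at the terminal having to pick up.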
- After automatically establishing IP communication with a trusted user, the transceiver at first transmits only the video and audio collected by the camera and the audio collection unit to the trusted user. In response to a two-way communication request from the trusted user, it outputs the audio from the trusted user through the speaker while continuing to transmit the collected video and audio to the trusted user.
- Alternatively, after automatically establishing IP communication with a trusted user, the transceiver outputs the audio from the trusted user through the speaker while transmitting the video and audio collected by the camera and the audio collection unit to the trusted user.
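- The two variants above (monitor-only first, or two-way from the start) can be sketched as a small session state machine. The names `Mode`, `Session`, and `route_incoming_audio` are hypothetical and exist only for this illustration.

```python
from enum import Enum


class Mode(Enum):
    MONITOR_ONLY = "monitor_only"  # terminal sends video/audio, speaker stays silent
    TWO_WAY = "two_way"            # trusted user's audio is also played on the speaker


class Session:
    """Hypothetical per-user session held by the terminal."""

    def __init__(self, start_two_way: bool = False):
        # start_two_way=True models the variant that is two-way immediately.
        self.mode = Mode.TWO_WAY if start_two_way else Mode.MONITOR_ONLY

    def on_two_way_request(self) -> None:
        """The trusted user asked to be heard at the terminal."""
        self.mode = Mode.TWO_WAY

    def route_incoming_audio(self, frame: bytes):
        """Return the frame to play on the speaker, or None to discard it."""
        return frame if self.mode is Mode.TWO_WAY else None
```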
- The single-call live communication terminal further includes a display. Once the transceiver has established IP communication with the trusted user, the display shows the video if the transceiver receives video, and shows the identifier of the trusted user if the transceiver does not receive video.
- After establishing IP communication with the trusted user, the transceiver, in response to receiving a connection request from another trusted user, sends the other trusted user a response to communicate over IP via a server, and sends the first trusted user a request to switch to IP communication via the server.
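- The switch from direct communication to server-relayed communication when a second trusted user connects can be sketched as below. The message strings (`accept:direct`, `accept:relay`, `switch:relay`) and the class `TransportManager` are invented for the example; the source does not specify a signalling format.

```python
class TransportManager:
    """Hypothetical bookkeeping for direct vs. server-relayed sessions."""

    def __init__(self):
        self.peers = {}  # user_id -> "direct" or "relay"

    def on_trusted_connect(self, user_id):
        """Return the (recipient, message) signalling tuples the terminal would send."""
        if not self.peers:
            # Single trusted user: communicate directly, no server resources used.
            self.peers[user_id] = "direct"
            return [(user_id, "accept:direct")]
        messages = []
        for existing, transport in list(self.peers.items()):
            if transport == "direct":
                # Ask the already connected user to move to the server.
                self.peers[existing] = "relay"
                messages.append((existing, "switch:relay"))
        self.peers[user_id] = "relay"
        messages.append((user_id, "accept:relay"))
        return messages
```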
- The display simultaneously displays the videos or identifiers of a plurality of trusted users.
- In response to one or more of the videos or identifiers of the plurality of trusted users being selected, the transceiver disconnects IP communication with the trusted users corresponding to the selected videos or identifiers, or the speaker stops outputting the voice of those trusted users.
- In response to one of the videos or identifiers of the plurality of trusted users being selected, the video or identifier of the selected trusted user becomes the enlarged main picture.
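- The three selection outcomes described above (disconnect, mute, enlarge) can be modelled with a small state holder. `MultiUserView` and its method names are hypothetical; the sketch only tracks state and does not touch real media streams.

```python
class MultiUserView:
    """Hypothetical state for a display showing several trusted users."""

    def __init__(self, users):
        self.connected = list(users)  # users with an open IP session
        self.muted = set()            # users whose voice is not sent to the speaker
        self.main = None              # user shown as the enlarged main picture

    def disconnect(self, selected):
        """Drop IP communication with the selected users."""
        self.connected = [u for u in self.connected if u not in selected]
        self.muted -= set(selected)
        if self.main in selected:
            self.main = None

    def mute(self, selected):
        """Keep the session and picture, but stop playing the selected users' voices."""
        self.muted |= {u for u in selected if u in self.connected}

    def enlarge(self, user):
        """Promote one connected user to the main picture."""
        if user in self.connected:
            self.main = user

    def audible(self):
        """Users whose voice is currently output by the speaker."""
        return [u for u in self.connected if u not in self.muted]
```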
- The transceiver transmits reminder information to the trusted user in response to a person, or a specific person, being identified from the video and audio respectively collected by the camera and the audio collection unit.
- The person or the specific person is identified based on one or more of face recognition, height recognition, voice recognition, and an identity indicated by a wireless signal transmitted by a mobile phone.
- the transceiver transmits the reminder information to the trusted user in response to the specific action being recognized from the video and audio respectively collected by the camera and the audio collection unit.
- The specific action is recognized by establishing a model for a predetermined action in advance and searching the video and audio collected by the camera and the audio collection unit for an action that matches the established model.
- the model is generated by self-learning.
- The single-call live communication terminal further includes a depth sensor, and the specific action is recognized based on the video and audio respectively collected by the camera and the audio collection unit together with the depth sensed by the depth sensor.
- the transceiver transmits the reminder information to the trusted user in response to the abnormal condition being recognized from the video and audio respectively collected by the camera and the audio collection unit.
- The abnormal condition is identified by recognizing one or more of the following: a dramatic change in the video captured by the camera; audio collected by the audio collection unit exceeding a certain threshold; a dramatic change in the audio collected by the audio collection unit; a predetermined event recognized from the video and audio respectively collected by the camera and the audio collection unit, wherein a model of the predetermined event has been established in advance and the collected video and audio are searched for an event that matches the established model.
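- The first three abnormal-condition cues (dramatic video change, loud audio, dramatic audio change) can be sketched with simple thresholds. The metrics (mean absolute pixel difference, RMS level) and every threshold value below are assumptions chosen for illustration; the source does not specify how "dramatic" or "a certain threshold" are measured. Model-based event matching is not covered here.

```python
def mean_abs_diff(a, b):
    """Average absolute difference between two equally sized pixel sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def rms(samples):
    """Root-mean-square level of an audio block (samples assumed in [-1, 1])."""
    return (sum(s * s for s in samples) / len(samples)) ** 0.5


def detect_abnormal(prev_frame, frame, prev_audio, audio,
                    video_change=40.0, loud=0.5, audio_change=0.3):
    """Return the list of cues that fired; an empty list means nothing abnormal."""
    reasons = []
    if mean_abs_diff(prev_frame, frame) > video_change:
        reasons.append("video_change")   # dramatic change in the picture
    level, prev_level = rms(audio), rms(prev_audio)
    if level > loud:
        reasons.append("loud_audio")     # audio above the threshold
    if abs(level - prev_level) > audio_change:
        reasons.append("audio_change")   # dramatic change in the audio
    return reasons
```

- A non-empty result would be the point at which the transceiver sends reminder information to the trusted user.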
- the single-call or live communication terminal further includes: a rotating device that rotates the camera.
- In response to one of the following elements being recognized from the video and audio respectively collected by the camera and the audio collection unit, the rotating device rotates the camera toward the recognized element: a person or a specific person; a specific action; an abnormal condition.
- The single-call live communication terminal further includes a light sensor for sensing changes in the ambient light around the terminal, wherein the display brightness of the display is adjusted according to the change in light.
- According to one embodiment, a tool installed in a mobile terminal is provided, comprising: a transmitting unit configured to transmit a connection request to a specific communication terminal in response to a first trigger; and a receiving unit configured to receive an automatic response from the specific communication terminal, whereby IP communication with the specific communication terminal is automatically established.
- After IP communication with the specific communication terminal is automatically established, the receiving unit receives the video and audio from the specific communication terminal while the transmitting unit does not send the user's audio and video; in response to a second trigger, the transmitting unit sends the user's audio and video to the specific communication terminal while the receiving unit receives the video and audio from the specific communication terminal.
- Alternatively, after IP communication with the specific communication terminal is automatically established, the transmitting unit sends the user's audio and video to the specific communication terminal while the receiving unit receives the video and audio from the specific communication terminal.
- The first trigger includes any one of: booting of the mobile terminal; activation of the tool while the mobile terminal is powered on; a specific action on the user interface while the mobile terminal is powered on; a specific voice received by the mobile terminal while it is powered on; the light sensed by the mobile terminal while it is powered on becoming stronger.
- the second trigger comprises any one of: a specific action on the user interface in an activated state of the tool; a specific voice received in an activated state of the tool.
- Where the mobile terminal stores connections for a plurality of communication terminals, the transmitting unit is configured to transmit a connection request to the specific communication terminal selected by the user, in response to the user's selection input.
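- The mobile-side tool described above can be sketched as follows. The trigger names, the class `MonitorTool`, and the tuple returned for a connection request are hypothetical. The sketch also assumes the terminal's automatic response always arrives, so the tool is treated as connected as soon as the request is sent.

```python
# Trigger names are illustrative labels for the triggers listed in the text.
FIRST_TRIGGERS = {"boot", "tool_activated", "ui_action", "voice_command", "light_increase"}
SECOND_TRIGGERS = {"ui_action", "voice_command"}


class MonitorTool:
    """Hypothetical model of the tool installed in the mobile terminal."""

    def __init__(self, terminals):
        self.terminals = list(terminals)  # stored connections to communication terminals
        self.connected_to = None
        self.sending_own_media = False    # monitor-only until a second trigger

    def on_first_trigger(self, trigger, selected=None):
        """Send a connection request to the selected (or first stored) terminal."""
        if trigger not in FIRST_TRIGGERS:
            raise ValueError(f"not a first trigger: {trigger}")
        target = selected if selected is not None else self.terminals[0]
        if target not in self.terminals:
            raise ValueError(f"unknown terminal: {target}")
        self.connected_to = target  # assumption: the automatic response is received
        return ("connect_request", target)

    def on_second_trigger(self, trigger):
        """Start sending the user's own audio/video; returns True if it took effect."""
        if self.connected_to is None or trigger not in SECOND_TRIGGERS:
            return False
        self.sending_own_media = True
        return True
```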
- According to one embodiment, a single-call live communication method is provided, comprising: receiving a connection request from a trusted user; in response to receiving the connection request from the trusted user, automatically sending a response to the connection request, thereby automatically establishing IP communication with the trusted user; and, during IP communication with the trusted user, sending the collected video and audio to the trusted user and receiving at least audio from the trusted user.
- The single-call live communication method further includes: sending reminder information to the trusted user in response to identifying one of the following elements from the collected video and audio: a person or a specific person; a specific action; an abnormal condition.
- The single-call live communication method further includes: in response to receiving a connection request from another trusted user after establishing IP communication with the trusted user, sending the other trusted user a response to communicate over IP via a server, and sending the first trusted user a request to switch to IP communication via the server.
- In the single-call live communication terminal provided by an embodiment of the present invention, the transceiver automatically sends a response to a connection request from a trusted user, thereby automatically establishing IP communication with the trusted user. Not only can the monitoring end user view the situation at the live communication terminal at any time, but the person at the live communication terminal can also interact with the monitoring end user in real time, which improves the user experience.
- Because IP communication is established without the user of the live communication terminal manually confirming the connection request, live monitoring is not prevented when nobody is at the live communication terminal, or when someone is there but cannot answer normally.
- Although one embodiment of the present invention makes two-way interaction possible between the monitoring end user and the person at the live communication terminal, the monitoring end user sometimes does not want the person at the live communication terminal to know that he or she is monitoring. Therefore, after automatically establishing IP communication with the trusted user, the transceiver may at first send only the video and audio collected by the camera and the audio collection unit to the trusted user and, only in response to a two-way communication request from the trusted user, output the audio from the trusted user through the speaker while sending the collected video and audio to the trusted user. In this way, the monitoring end user can flexibly choose whether to let the person at the live communication terminal know that he or she is monitoring, which improves flexibility on the monitoring user's side.
- the single-call or live communication terminal provided by one embodiment of the present invention displays different information based on whether it receives the video, so that the manner of displaying the information and the format of the data transmission are more flexible.
- the single-call or live communication terminal provided by one embodiment of the present invention adopts end-to-end direct communication when communicating with a single trusted user, and performs IP communication through the server when communicating with multiple trusted users.
- This flexible communication mode lets the single-call live communication terminal avoid wasting server resources when communicating with a single trusted user, and lets it forward data through the server when communicating with multiple trusted users, so that large amounts of data are transferred faster and more accurately.
- The single-call live communication terminal provided by one embodiment of the present invention can display the videos or identifiers of multiple trusted users simultaneously on the display during IP communication with multiple trusted users, thereby improving the user's visual experience.
- The single-call live communication terminal provided by one embodiment of the present invention can, during IP communication with a plurality of trusted users, disconnect IP communication with one or more of them through the transceiver, so that the user of the single-call live communication terminal can freely select whom to communicate with. The speaker of the terminal can also output, or not output, the voice of one or more trusted users, which further enhances flexibility in choosing between video communication, voice communication, and picture-only communication.
- In the single-call live communication terminal provided by one embodiment of the present invention, in response to one of the videos or identifiers of the plurality of trusted users being selected, the video or identifier of the selected trusted user becomes the enlarged main picture. This highlights the communication between the terminal and the trusted user corresponding to the main picture, further enhancing the user's visual experience.
- The single-call live communication terminal provided by an embodiment of the present invention can send a reminder message to a trusted user when a person or a specific person is recognized from the video and audio respectively collected by the camera and the audio collection unit. This suits a trusted user who only needs to monitor when someone, or a specific person, appears in a specific environment, and avoids continuous monitoring.
- The single-call live communication terminal provided by one embodiment of the present invention can identify a person based on one or more of face recognition, height recognition, voice recognition, and an identity indicated by a wireless signal transmitted by a mobile phone, which effectively improves the sensitivity with which the terminal recognizes its surroundings.
- The single-call live communication terminal provided by one embodiment of the present invention can identify a specific action or abnormal condition based on the video and audio respectively collected by the camera and the audio collection unit, and send reminder information to the trusted user. This suits a trusted user who only needs to monitor when certain situations occur at the terminal, and avoids continuous monitoring.
- The single-call live communication terminal provided by one embodiment of the present invention can establish a model for a predetermined action in advance, or generate the model by self-learning, and search the video and audio respectively collected by the camera and the audio collection unit for an action that matches the established model. This makes the recognition of specific actions more flexible, smarter, and more accurate, so that the surroundings are monitored better.
- the single-call or live communication terminal provided by one embodiment of the present invention performs the depth recognition of the surrounding situation by using the depth sensor, and has higher accuracy in recognizing the three-dimensional object and the person, the specific person, the action, and the like.
- the camera of the single-call or live communication terminal provided by one embodiment of the present invention can be rotated, and can further rotate toward the identified elements to collect events more intelligently and flexibly.
- The display brightness of the display can be adjusted according to the sensed change in ambient light around the single-call live communication terminal, thereby improving the comfort of viewing the display.
- The tool installed in the mobile terminal provided by one embodiment of the present invention transmits a connection request to a specific communication terminal and is configured to receive an automatic response from the specific communication terminal, thereby automatically establishing IP communication with the specific communication terminal. Because IP communication is established without the user of the live communication terminal manually confirming the connection request, live monitoring is not prevented when nobody is at the live communication terminal.
- After IP communication is established, the receiving unit receives the video and audio from the specific communication terminal while the transmitting unit does not send the user's audio; only in response to the second trigger does the transmitting unit send the user's audio to the specific communication terminal while the receiving unit continues to receive the video and audio. Thus, when the monitoring user does not want the person at the live communication terminal to know that he or she is monitoring, the second trigger is simply not performed. The monitoring end user can therefore flexibly choose whether to let the person at the live communication terminal know that he or she is monitoring, which improves flexibility on the monitoring user's side.
- The trigger may be any one of: booting of the mobile terminal; activation of the tool while the mobile terminal is powered on; a specific action on the user interface while the mobile terminal is powered on; a specific voice received while the mobile terminal is powered on; and the light sensed by the mobile terminal while it is powered on becoming stronger. This improves the flexibility with which the mobile terminal can be triggered.
- The mobile terminal may store connections for multiple communication terminals and allow the user to select one of them to communicate with, so that one mobile terminal can be bound to multiple single-call live communication terminals at the same time, thereby improving user convenience.
- FIG. 1 shows a schematic block diagram of a single-call or live communication terminal in accordance with one embodiment of the present invention
- FIG. 2(a) is a diagram showing a single-call or live communication terminal and a single user performing IP communication according to an embodiment of the present invention
- FIG. 2(b) is a schematic diagram showing a single-call or live communication terminal and a plurality of users performing IP communication according to another embodiment of the present invention
- FIG. 3 shows an external left side view of a single-call or live communication terminal in accordance with one embodiment of the present invention
- FIG. 4 shows a block diagram of a mobile terminal in accordance with one embodiment of the present invention
- FIG. 5 shows a flow chart of a single-call or live communication method in accordance with yet another embodiment of the present invention.
- the single-call live communication terminal 1 includes a camera 101, an audio collection unit 102, a speaker 104, and a transceiver 105.
- the video and audio respectively collected by the camera 101 and the audio collection unit 102 are transmitted through the transceiver 105.
- the audio received by the transceiver 105 is output through the speaker 104.
- The transceiver 105, in response to receiving a connection request from a trusted user, automatically issues a response to the connection request, thereby automatically establishing IP communication with the trusted user.
- "Single-call" means that two-way communication can be carried out automatically after a one-way call.
- Two-way communication between the trusted user and the person at the single-call live communication terminal 1 can be established automatically. That is, the audio from the trusted user is output through the speaker 104 while the video and audio collected by the camera 101 and the audio collection unit 102 are transmitted to the trusted user. It is also possible at first only to show the trusted user the situation at the single-call live communication terminal 1, without transmitting the trusted user's audio and the like to the terminal 1 side. That is, only the video and audio collected by the camera 101 and the audio collection unit 102 are sent to the trusted user.
- Then, in response to a two-way communication request from the trusted user, the trusted user's audio and the like are transmitted to the single-call live communication terminal 1 side. That is, the audio from the trusted user is output through the speaker 104 while the video and audio collected by the camera 101 and the audio collection unit 102 are sent to the trusted user.
- the camera 101 is a camera at the upper end of the live communication terminal 1, but it will be understood by those skilled in the art that it may also be other camera devices located at other positions of the live communication terminal 1.
- the audio collection unit 102 is, for example, a microphone on the outer surface of the live communication terminal 1, but may be other audio collection devices.
- the speaker 104 is, for example, a sound player on the outer surface of the live communication terminal 1, but may be another audio output device.
- The transceiver 105 is, for example, an antenna, but may be another transceiver device, such as a built-in wireless transceiver module.
- The single-call live communication terminal includes, but is not limited to, any electronic product that can interact with a user through a touch pad, a voice control device, a remote control device, or a keyboard, such as a computer, a tablet (PAD), or an Internet TV (IPTV). Those skilled in the art should understand that other user equipment, where applicable to the present invention, is also included in the scope of the present invention.
- The single-call live communication terminal 1 may further include a display 103. When the transceiver 105 has established IP communication with a trusted user, the display 103 shows the video if the transceiver 105 receives video, and shows the identifier of the trusted user if the transceiver 105 does not receive video. Of course, the display 103 can show only the identifier of the trusted user even if video can be received.
- The identifier of the trusted user may be a video screenshot of the trusted user, an avatar, or another identifier.
- The single-call live communication terminal 1 may also not include the display 103. In that case, during IP communication between the live communication terminal 1 and the trusted user, the image of the trusted user cannot be seen and only the voice of the trusted user can be heard.
- FIG. 2(a) is a diagram showing a single-call live communication terminal 1 and a single trusted user performing IP communication according to an embodiment of the present invention.
- IP communication based on the Point-to-Point Protocol is preferably performed to save resources of the server.
- FIG. 2(b) is a diagram showing a single-call live communication terminal 1 and a plurality of trusted users performing IP communication according to another embodiment of the present invention.
- the information is transmitted and received through the server 5 via the IP network 4.
- When the single-call live communication terminal 1 performs IP communication only with the trusted user A, the IP communication is performed directly based on the point-to-point protocol; when a connection request from the trusted user B is then received, the single-call live communication terminal 1 issues to the trusted user B a response to communicate over IP via the server.
- the server may include a network host, a single network server, a plurality of network server collections, or a cloud computing based computer collection.
- The display 103 of the single-call live communication terminal 1 can simultaneously display the videos or identifiers of multiple trusted users.
- In response to one or more of the videos or identifiers of the plurality of trusted users being selected, the transceiver 105 of the single-call live communication terminal 1 disconnects IP communication with the trusted users corresponding to the selected videos or identifiers.
- Alternatively, the transceiver 105 remains in IP communication with those trusted users, but the speaker 104 does not output the voice of the trusted users corresponding to the selected videos or identifiers, and only their video images are shown on the display 103.
- In this way, the voices of multiple trusted users heard by the person at the live communication terminal 1 are prevented from interfering with each other.
- In response to one of the videos or identifiers of the plurality of trusted users being selected, the single-call live communication terminal 1 changes the video or identifier of the selected trusted user from its original picture to the enlarged main picture.
- According to an embodiment of the present invention, to alert the trusted user more intelligently to the situation at the terminal, the single-call live communication terminal 1 may, in response to a person or a specific person being recognized in the video and audio collected by the camera 101 and the audio collection unit 102 respectively, send reminder information to the trusted user through the transceiver 105.
- Typically, when the environment of the single-call live communication terminal 1 changes from unoccupied to occupied, that is, when the camera 101 and the audio collection unit 102 detect that a person has appeared in the current location, the transceiver 105 actively sends reminder information to the trusted user at the other end, informing that user that someone is present in the current environment.
- Typically, the single-call live communication terminal 1 can also have the transceiver 105 actively send reminder information to the trusted user when a specific person is recognized by the camera 101 and the audio collection unit 102. For example, in a real-life scenario, the babysitter has been at home all along and the child then comes back from school. The terminal 1 placed in the home recognizes the child through the camera 101 and the audio collection unit 102, and the transceiver 105 promptly or in real time sends reminder information to a remote user (such as the father in the office).
- Optionally, the single-call live communication terminal 1 can identify a person or a specific person through the camera 101, the audio collection unit 102, and other devices or units, based on one or more of face recognition, height recognition, voice recognition, and the identity indicated by the wireless signal emitted by a carried mobile phone.
- In the case of identifying a person: human face patterns are much alike, the height of most people falls within a certain range, and the frequency of the human voice also falls within a certain range. The presence of a person can therefore be recognized when, for example, a region of the captured image is similar to a stored face pattern, and/or the height determined from the distance between the face and the terminal 1, as sensed by the position sensor and/or the depth sensor, is within the certain range, and/or the audio collected by the audio collection unit 102 is also within the certain range.
- In the case of identifying a specific person, the face pattern and/or height and/or voice frequency of the specific person may be stored in the memory in advance.
- When a region of the captured image matches the stored pattern of the specific face, and/or the height determined from the distance between the specific face and the terminal 1, as sensed by the position sensor and/or the depth sensor, matches the stored height, and/or the frequency of the audio collected by the audio collection unit 102 matches the stored frequency of the specific person's voice, the presence of the specific person can be identified.
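As a rough illustration of the "and/or" matching described above, the sketch below compares whichever cues are available against a stored profile. Everything here is an assumption made for the example: the `Profile` fields, the string stand-in for a face template, the tolerance values, and the policy that every supplied cue must match are not specified in the patent.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    """Hypothetical stored profile of a specific person."""
    name: str
    face_pattern: str   # stand-in for a stored face template
    height_cm: float
    voice_hz: float


def matches(profile, face=None, height_cm=None, voice_hz=None,
            height_tol=5.0, voice_tol=20.0):
    """Return True if every cue that was supplied matches the profile.

    Cues left as None are treated as unavailable and skipped; with no
    cues at all, nothing can be concluded and the result is False.
    """
    checks = []
    if face is not None:
        checks.append(face == profile.face_pattern)
    if height_cm is not None:
        checks.append(abs(height_cm - profile.height_cm) <= height_tol)
    if voice_hz is not None:
        checks.append(abs(voice_hz - profile.voice_hz) <= voice_tol)
    return bool(checks) and all(checks)
```

Requiring all supplied cues to agree is only one possible policy; the text equally allows a single cue to be sufficient.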
- Self-learning methods can also be used to identify the presence of a person or a specific person. For example, if a certain pattern in the captured images always appears together with a certain frequency in the collected sound, a prompt may be shown on the display stating that a person has been recognized and asking the person beside the live communication terminal 1 to confirm and name that person. If the person beside the live communication terminal 1 finds that the identification is wrong, they give feedback on the display interface. After such feedback is received, the next time this pattern in a captured image appears together with this frequency in the collected sound, a person or a specific person is not considered to be present. In the self-learning mode, the face pattern and/or height and/or voice frequency of the specific person need not be stored in the memory in advance.
- In addition, a person or a specific person can be identified based on the identity indicated by the wireless signal emitted by a carried mobile phone. For example, the single-call live communication terminal 1 is a Bluetooth device, and the user's mobile phone also has a Bluetooth wireless unit.
- When the single-call live communication terminal 1 recognizes that a Bluetooth wireless unit with a specific identity has appeared within a certain distance, the specific person is considered to have been identified.
- The manner in which the single-call live communication terminal 1 recognizes a person or a specific person is not limited here. Any device or unit capable of identifying a person or a specific person, if applicable to the present invention, should be included in the scope of protection of the present invention and is hereby incorporated by reference.
- Optionally, the single-call live communication terminal 1 can also identify a specific action from the video and audio collected through the camera 101 and the audio collection unit 102, for example an elderly person falling or a child dancing, whereupon the transceiver 105 actively sends reminder information to the trusted user at the other end.
- Optionally, actions can be set manually in advance and models built for the set actions.
- When a specific action matching a stored model is found in the video and audio collected by the camera 101 and the audio collection unit 102, the transceiver 105 actively sends reminder information to the trusted user at the other end. For example, for an action such as watching TV, a model is built as follows: a person is identified sitting on the sofa; there is an object in the person's gaze direction; the object is recognized as a TV; the person's gaze stays on the TV for at least 10 seconds.
- In operation, a person is first detected in the image captured by the camera 101 and is then detected to be sitting on the sofa. Recognition of the sofa is similar to face recognition and can also be performed by pattern matching; the image of the person sitting on the sofa can also be treated as a whole as one object for pattern matching.
- The person's gaze direction is then detected, followed by a check of whether the object in that direction is a television (for example, by pattern matching the television as an object). If so, a 10-second count is started.
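The last step of the TV-watching model above, counting 10 seconds of uninterrupted gaze, can be sketched as a simple dwell timer. The sketch assumes that upstream recognizers (not shown, and not specified in the patent) already produce, for each frame, a timestamp, a flag for "person sitting on the sofa", and a label for the object in the gaze direction; the function name and the tuple layout are hypothetical.

```python
def watching_tv(frames, min_seconds=10.0):
    """Return True once the gaze has rested on the TV for min_seconds.

    frames: iterable of (timestamp_s, on_sofa, gaze_target) tuples,
    assumed to be ordered by time.
    """
    start = None  # time at which the current uninterrupted gaze began
    for timestamp, on_sofa, gaze_target in frames:
        if on_sofa and gaze_target == "tv":
            if start is None:
                start = timestamp
            if timestamp - start >= min_seconds:
                return True
        else:
            # Any interruption resets the count.
            start = None
    return False
```

Resetting the timer on any interruption is a design choice made for the example; a more tolerant model could allow brief glances away.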
- Of course, the single-call live communication terminal 1 can also build an action model automatically by self-learning methods such as machine learning.
- For example, the single-call live communication terminal 1 extracts action features from the video and audio collected by the camera 101 and the audio collection unit 102, and builds an action model based on the extracted features. For example, if the collected video and audio show a person sitting on the sofa with a TV in the direction of the person's gaze, and the frequency of events in which the person's gaze stays on the TV for more than 10 seconds exceeds a threshold, this is considered a model of a specific action.
- In this case, the action model need not be stored in a database in advance; instead, the model of the action is extracted by learning from the video and audio collected by the camera 101 and the audio collection unit 102.
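- To identify specific actions more accurately, the single-call live communication terminal 1 further includes a depth sensor 197, and the specific action is identified jointly by the camera 101, the audio collection unit 102, and the depth sensor 197 from the collected video and audio and the sensed depth.
- The depth sensor senses the depth, that is, the distance between a person or object and the single-call live communication terminal 1.
- Although the depth sensor 197 is located to the left of the center of the upper frame of the display in FIG. 2(a), it may be disposed at other reasonable physical locations. When a person or object performs an action, the same amplitude of movement produces a different amount of change in the captured image depending on the distance from the terminal 1. Combining the depth sensor therefore allows actions to be recognized more accurately.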
- Optionally, when the single-call live communication terminal 1 identifies an abnormal condition from the video and audio collected by the camera 101 and the audio collection unit 102, the transceiver 105 actively sends reminder information to the trusted user at the other end.
- Abnormal conditions include, for example, a stranger visiting, fire, crying, loud noise, and electrical accidents.
- Typically, the abnormal condition is identified by recognizing one or more of the following: a dramatic change in the video captured by the camera; audio collected by the audio collection unit that is above a certain threshold; a dramatic change in the audio collected by the audio collection unit; a predetermined event recognized from the video and audio collected by the camera 101 and the audio collection unit 102 respectively.
- A predetermined event is an event specified in advance, such as a fire or an electrical accident.
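The first three abnormal-condition criteria above are threshold tests, which can be sketched as below. The inputs and thresholds are assumptions for illustration only: audio is represented as a list of loudness values, video as a list of per-frame difference scores between 0 and 1, and the default limits are arbitrary; the patent does not specify units or values. Recognition of predetermined events is omitted.

```python
def abnormal_reasons(audio_levels, frame_diffs,
                     loud=80.0, audio_jump=30.0, video_jump=0.5):
    """Return the list of abnormal-condition criteria that are triggered.

    audio_levels: loudness samples in time order (hypothetical units).
    frame_diffs: change scores between consecutive video frames (0 to 1).
    """
    reasons = []
    # Audio above a certain threshold.
    if any(level > loud for level in audio_levels):
        reasons.append("audio above threshold")
    # Dramatic change between consecutive audio samples.
    pairs = zip(audio_levels, audio_levels[1:])
    if any(abs(later - earlier) > audio_jump for earlier, later in pairs):
        reasons.append("dramatic audio change")
    # Dramatic change in the captured video.
    if any(diff > video_jump for diff in frame_diffs):
        reasons.append("dramatic video change")
    return reasons
```

A non-empty result would correspond to the point at which the transceiver sends reminder information to the trusted user.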
- For predetermined events, specifically, the single-call live communication terminal 1 recognizes a predetermined event based on the camera 101 and the audio collection unit 102. A model of the predetermined event has been built in advance, and the predetermined event is identified by searching the video and audio collected by the camera 101 and the audio collection unit 102 respectively for events that match the built model.
- Here, the single-call live communication terminal 1 can automatically build a model of a predetermined event by self-learning methods such as machine learning.
- Typically, the single-call live communication terminal 1 extracts event features from the video and audio collected by the camera 101 and the audio collection unit 102, and builds a model of the predetermined event based on the extracted event features. Of course, instead of building the model by self-learning, models of a number of predetermined events can also be specified directly.
- According to an embodiment of the present invention, to collect information better, the single-call live communication terminal 1 further includes a rotating device 199 for rotating the camera 101.
- Preferably, in response to one of the following elements being recognized in the video and audio collected by the camera 101 and the audio collection unit 102 respectively, the rotating device 199 rotates the camera 101 to face the identified element: a person or a specific person; a specific action; an abnormal condition.
- In one embodiment, the camera 101 shown in FIG. 3 can be rotated left and right toward the identified element. In another embodiment, the camera 101 shown in FIG. 3 can be rotated up, down, left, and right toward the identified element.
- As shown in FIG. 2(a), the single-call live communication terminal 1 may further include a light sensor 198 for sensing changes in the ambient light around the terminal 1, wherein the display brightness of the display 103 is adjusted according to the change in the light. If the ambient light is strong, the display brightness can be increased. If the ambient light is weak, the display brightness can be reduced. In this way, eye discomfort when viewing the display can be reduced. Although the light sensor in FIG. 2(a) is located to the right of the center of the upper frame of the display, it may be disposed at any other reasonable physical location.
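The brightness adjustment described above (brighter display in strong light, dimmer display in weak light) can be sketched as a clamped linear mapping. The function name, the lux scale, and the default limits are assumptions for the example; the patent only states the direction of the adjustment, not a formula.

```python
def display_brightness(ambient_lux, min_level=0.2, max_level=1.0, full_lux=500.0):
    """Map an ambient light reading to a display brightness level.

    The reading is clamped to [0, full_lux] and mapped linearly onto
    [min_level, max_level], so stronger light gives a brighter display.
    """
    ratio = max(0.0, min(ambient_lux / full_lux, 1.0))
    return min_level + (max_level - min_level) * ratio
```

A real device would probably also smooth the sensor reading over time to avoid flicker, which is left out here.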
- It should be understood that the block diagram shown in FIG. 1 is for illustrative purposes only and is not intended to limit the scope of the invention. In some cases, certain units or devices may be added or removed as appropriate.
- It should be noted that the single-call live communication terminal 1 sends reminder information to the trusted user via the transceiver 105 mainly by sending the trusted user a short message (SMS), a Fetion or WeChat message, or a customized message under a private protocol.
- Here, the trusted user at the other end mainly performs IP communication with the single-call live communication terminal 1 in a Wi-Fi network environment. Of course, the trusted user at the other end can also communicate with the single-call live communication terminal 1 over a 3G, 2G, or 4G network or a similar communication method.
- According to another embodiment of the present invention, as shown in FIG. 4, a tool 31 installed on a mobile terminal 3 is provided, including a transmitting unit 301 and a receiving unit 302.
- The transmitting unit 301 is configured to transmit, in response to a first trigger, a connection request for a specific communication terminal (corresponding to the aforementioned single-call live communication terminal).
- The receiving unit 302 is configured to receive an automatic response from the specific communication terminal, so that IP communication with the specific mobile terminal is established automatically.
- The mobile terminal includes electronic devices such as smartphones and tablet computers. The tool may be installed on the mobile terminal as an application (app) and displayed as an application icon; it may also be built into the mobile terminal as a plug-in.
- When the mobile terminal is in a network environment such as Wi-Fi, 3G, or 4G, it performs IP communication with the single-call live communication terminal; when the mobile terminal is in a network environment such as 2G, the single-call live communication terminal can send reminder information to the mobile terminal.
- After IP communication with the specific mobile terminal is established automatically, the transmitting unit 301 may transmit audio to the specific communication terminal while the receiving unit 302 receives the video and audio from the specific communication terminal.
- Alternatively, after IP communication with the specific mobile terminal is established automatically, the receiving unit 302 receives the video and audio from the specific communication terminal and the transmitting unit 301 does not transmit the user's audio. Only in response to a second trigger does the transmitting unit 301 transmit audio to the specific communication terminal while the receiving unit 302 receives the video and audio from the specific communication terminal.
- In this way, if the user of the mobile terminal 3 does not want the person at the specific communication terminal to know that the terminal is being monitored, the second trigger need not be performed. Then only the video and audio from the specific communication terminal are transmitted to the mobile terminal 3, and information such as the audio of the user of the mobile terminal 3 is not transmitted to the specific communication terminal.
- The first trigger includes any one of the following: the power-on of the mobile terminal; activation of the tool in the power-on state of the mobile terminal; a specific action on the user interface in the power-on state of the mobile terminal; a specific voice received in the power-on state of the mobile terminal; the light sensed in the power-on state of the mobile terminal becoming stronger.
- When the first trigger is the power-on of the mobile terminal, the communication connection with the single-call live communication terminal 1 is made automatically as the mobile terminal is powered on. This allows the phone, once powered on, to automatically enter the state of monitoring the environment in which the terminal 1 is located, improving user efficiency.
- When the first trigger is the activation of the tool in the power-on state of the mobile terminal, a specific action on the user interface in the power-on state of the mobile terminal, or a specific voice received in the power-on state of the mobile terminal, whether to enter the state of monitoring the environment in which the terminal 1 is located can be decided according to the user's needs, increasing flexibility for the user.
- Specific actions are, for example, swiping, clicking, or double-clicking an icon, or entering specific content at a specific location on the touch screen.
- When the first trigger is the light sensed in the power-on state of the mobile terminal becoming stronger, then when the user takes the mobile terminal out of a pocket, the light is sensed to become stronger and the communication connection with the single-call live communication terminal 1 is made automatically.
- This avoids the waste of resources that would result if the mobile terminal, placed in a pocket because the user does not wish to monitor the environment of the terminal 1, still occupied connection resources with the terminal 1.
- In this mode, a light sensor is provided in the mobile terminal or the tool for sensing changes in light on the surface of the mobile terminal.
- The second trigger can include any one of the following: a specific action on the user interface in the activated state of the tool; a specific voice received in the activated state of the tool.
- A specific action can be an action at a certain location on the user interface (such as swiping, clicking, or double-clicking).
- For example, the first trigger may be an action on a first icon on the user interface, while the second trigger is an action on a second icon on the user interface that is different from the first icon, and so on.
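The relationship between the first and second triggers described above can be viewed as a small state machine on the mobile-terminal side. The sketch below is an assumption-laden illustration: the state names, the `Tool` class, and the rule that the second trigger only has an effect after the first are hypothetical, and the triggers themselves (icon taps, voice, light) are reduced to plain method calls.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"                # no connection requested yet
    LISTEN_ONLY = "listen_only"  # receiving video and audio, sending nothing
    TWO_WAY = "two_way"          # also sending the user's audio


class Tool:
    """Hypothetical model of the tool's trigger handling."""

    def __init__(self):
        self.state = State.IDLE

    def first_trigger(self):
        # First trigger: send the connection request; the terminal answers
        # automatically, so monitoring starts without sending user audio.
        if self.state is State.IDLE:
            self.state = State.LISTEN_ONLY
        return self.state

    def second_trigger(self):
        # Second trigger: start sending the user's audio as well.
        if self.state is State.LISTEN_ONLY:
            self.state = State.TWO_WAY
        return self.state
```

This models the variant in which audio is held back until the second trigger; in the other variant described in the text, the first trigger would lead directly to two-way communication.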
- Optionally, the transmitting unit 301 is configured so that, where the mobile terminal stores connections for a plurality of communication terminals, it transmits, in response to a selection input by the user, a connection request for the specific communication terminal selected by the user. For example, a list of the communication terminals can be displayed for the user to select one of them. In response to this selection, a connection request is sent to the selected communication terminal.
- FIG. 5 shows a flowchart of a single-call live communication method 2 in accordance with yet another embodiment of the present invention.
- According to FIG. 5, the single-call live communication method 2 includes:
- Step S1: the single-call live communication terminal receives a connection request from a trusted user;
- Step S2: in response to receiving the connection request from the trusted user, a response to the connection request is issued automatically, thereby automatically establishing IP communication with the trusted user;
- Step S3: in the IP communication with the trusted user, the collected video and audio are sent to the trusted user, and at least the audio from the trusted user is received.
- Further, the single-call live communication method also includes: in response to one of the following elements being identified in the collected video and audio, sending reminder information to the trusted user: a person or a specific person; a specific action; an abnormal condition.
- Further, the single-call live communication method also includes: in response to receiving a connection request from another trusted user after IP communication with the trusted user has been established, issuing to the other trusted user a response for IP communication via a server, and issuing to the trusted user a request to switch to IP communication via the server.
- Those skilled in the art will appreciate that the present invention can be implemented as a device, apparatus, method, or computer program product. Therefore, the present disclosure may be embodied entirely in hardware, entirely in software, or in a combination of hardware and software.
- The flowcharts and block diagrams in the drawings illustrate the possible architecture, functions, and operations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block of a flowchart or block diagram can represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function.
- It should also be noted that in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending upon the functionality involved.
- It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- General Health & Medical Sciences (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Computational Linguistics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Acoustics & Sound (AREA)
- Telephonic Communication Services (AREA)
Abstract
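The present invention discloses a single-call live communication terminal, a method, and a tool installed on a mobile terminal. The single-call live communication terminal receives a connection request from a trusted user; in response to receiving the connection request from the trusted user, it automatically issues a response to the connection request, thereby automatically establishing IP communication with the trusted user; and, in the IP communication with the trusted user, it sends the collected video and audio to the trusted user and receives at least audio from the trusted user. Compared with the prior art, the present invention has the single-call live communication terminal respond automatically to the trusted user's connection request, which strengthens the interaction between the trusted user at the monitoring end and the monitored end and thereby improves the user's communication experience.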
Description
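This application claims priority to Chinese patent application No. 201410247191.1, filed on June 5, 2014 and entitled "Single-call live communication terminal, method and tool" (单呼即通实况通信终端、方法及工具), the entire contents of which are incorporated herein by reference.
The present invention relates to communication technology, and in particular to a single-call live communication terminal, method, and tool.
In the prior art there is a type of home camera monitoring system. A camera installed in the home sends the captured video signal to a remote monitoring end (such as a mobile phone user). The captured video is displayed on the screen of the monitoring end, thereby achieving remote monitoring. However, remote video monitoring is not two-way communication. Although the user at the monitoring end can see what is happening at home, the people at home cannot hear the voice of the monitoring-end user, so there is no two-way interaction and the user experience is poor.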
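Summary of the Invention
One of the technical problems solved by the present invention is to strengthen real-time interaction between people at a fixed location who need to be looked after or visited and mobile users at other, non-fixed locations, thereby improving the communication experience. It corresponds to a communication model that is common in real life, in which a specific social relationship exists between the visiting user and the visited place or visited person, such as elderly people and their children, or parents and children, so that no identity confirmation step of the kind used in a conversation between strangers is needed.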
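According to an embodiment of one aspect of the present invention, a single-call live communication terminal is provided, including a camera, an audio collection unit, a speaker, and a transceiver. The video and audio collected by the camera and the audio collection unit respectively are sent through the transceiver, and the audio received through the transceiver is output through the speaker, wherein the transceiver, in response to receiving a connection request from a trusted user, automatically issues a response to the connection request, thereby automatically establishing IP communication with the trusted user. "Single-call" (单呼即通) means that two-way communication can proceed automatically after a one-way call.
According to an embodiment of the present invention, after automatically establishing IP communication with the trusted user, the transceiver sends only the video and audio collected by the camera and the audio collection unit to the trusted user; only in response to a two-way communication request from the trusted user is the audio from the trusted user output through the speaker while the video and audio collected by the camera and the audio collection unit are sent to the trusted user.
According to an embodiment of the present invention, after automatically establishing IP communication with the trusted user, the transceiver outputs the audio from the trusted user through the speaker while sending the video and audio collected by the camera and the audio collection unit to the trusted user.
According to an embodiment of the present invention, the single-call live communication terminal further includes a display. Where the transceiver has established IP communication with the trusted user, the display shows the video if the transceiver receives video, and shows the identifier of the trusted user if the transceiver does not receive video.
According to an embodiment of the present invention, the transceiver, in response to receiving a connection request from another trusted user after IP communication with the trusted user has been established, issues to the other trusted user a response for IP communication via a server, and issues to the trusted user a request to switch to IP communication via the server.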
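According to an embodiment of the present invention, where the transceiver has established IP communication with a plurality of trusted users at the same time, the display simultaneously shows the videos or identifiers of the plurality of trusted users.
According to an embodiment of the present invention, in response to one or more of the videos or identifiers of the plurality of trusted users being selected, the transceiver disconnects the IP communication with the trusted users corresponding to the one or more videos or identifiers, or the speaker does not output the voice of the trusted users corresponding to the one or more videos or identifiers.
According to an embodiment of the present invention, in response to one of the videos or identifiers of the plurality of trusted users being selected, the video or identifier of the selected trusted user becomes an enlarged main picture.
According to an embodiment of the present invention, in response to a person or a specific person being recognized in the video and audio collected by the camera and the audio collection unit respectively, the transceiver sends reminder information to the trusted user.
According to an embodiment of the present invention, the person or specific person is identified based on one or more of face recognition, height recognition, voice recognition, and the identity indicated by the wireless signal emitted by a carried mobile phone.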
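According to an embodiment of the present invention, in response to a specific action being recognized in the video and audio collected by the camera and the audio collection unit respectively, the transceiver sends reminder information to the trusted user.
According to an embodiment of the present invention, the specific action is identified by building a model for a predetermined action in advance and searching the video and audio collected by the camera and the audio collection unit respectively for a match with the built model.
According to an embodiment of the present invention, the model is generated by self-learning.
According to an embodiment of the present invention, the single-call live communication terminal further includes a depth sensor, and the specific action is identified based on the video and audio collected by the camera and the audio collection unit respectively and the depth sensed by the depth sensor.
According to an embodiment of the present invention, in response to an abnormal condition being recognized in the video and audio collected by the camera and the audio collection unit respectively, the transceiver sends reminder information to the trusted user.
According to an embodiment of the present invention, the abnormal condition is identified by recognizing one or more of the following: a dramatic change in the video captured by the camera; audio collected by the audio collection unit that is above a certain threshold; a dramatic change in the audio collected by the audio collection unit; a predetermined event recognized from the video and audio collected by the camera and the audio collection unit respectively, wherein a model of the predetermined event has been built in advance and the predetermined event is identified by searching the video and audio collected by the camera and the audio collection unit respectively for events matching the built model.
According to an embodiment of the present invention, the single-call live communication terminal further includes a rotating device for rotating the camera.
According to an embodiment of the present invention, in response to one of the following elements being recognized in the video and audio collected by the camera and the audio collection unit respectively, the rotating device rotates the camera to face the identified element: a person or a specific person; a specific action; an abnormal condition.
According to an embodiment of the present invention, the single-call live communication terminal further includes a light sensor for sensing changes in the ambient light around the single-call live communication terminal, wherein the display brightness of the display is adjusted according to the change in the light.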
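According to an embodiment of another aspect of the present invention, a tool installed on a mobile terminal is also provided, including: a transmitting unit configured to transmit, in response to a trigger, a connection request for a specific communication terminal; and a receiving unit configured to receive an automatic response from the specific communication terminal, so that IP communication with the specific mobile terminal is established automatically.
According to an embodiment of the present invention, after IP communication with the specific mobile terminal is established automatically, the receiving unit receives video and audio from the specific communication terminal and the transmitting unit does not transmit the user's audio or video; only in response to a second trigger does the transmitting unit transmit audio and video to the specific communication terminal while the receiving unit receives video and audio from the specific communication terminal.
According to an embodiment of the present invention, after IP communication with the specific mobile terminal is established automatically, the transmitting unit transmits audio and video to the specific communication terminal while the receiving unit receives video and audio from the specific communication terminal.
According to an embodiment of the present invention, the first trigger includes any one of the following: the power-on of the mobile terminal; activation of the tool in the power-on state of the mobile terminal; a specific action on the user interface in the power-on state of the mobile terminal; a specific voice received in the power-on state of the mobile terminal; the light sensed in the power-on state of the mobile terminal becoming stronger.
According to an embodiment of the present invention, the second trigger includes any one of the following: a specific action on the user interface in the activated state of the tool; a specific voice received in the activated state of the tool.
According to an embodiment of the present invention, the transmitting unit is configured so that, where the mobile terminal stores connections for a plurality of communication terminals, it transmits, in response to a selection input by the user, a connection request for the specific communication terminal selected by the user.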
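According to an embodiment of yet another aspect of the present invention, a single-call live communication method is also provided, including: receiving a connection request from a trusted user; in response to receiving the connection request from the trusted user, automatically issuing a response to the connection request, thereby automatically establishing IP communication with the trusted user; and, in the IP communication with the trusted user, sending the collected video and audio to the trusted user and receiving at least audio from the trusted user.
According to an embodiment of the present invention, the single-call live communication method further includes: in response to one of the following elements being identified in the collected video and audio, sending reminder information to the user: a person or a specific person; a specific action; an abnormal condition.
According to an embodiment of the present invention, the single-call live communication method further includes: in response to receiving a connection request from another trusted user after IP communication with the trusted user has been established, issuing to the other trusted user a response for IP communication via a server, and issuing to the trusted user a request to switch to IP communication via the server.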
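Compared with the prior art, the single-call live communication terminal provided by an embodiment of the present invention uses the transceiver to respond to a connection request from a trusted user by automatically issuing a response to it, thereby automatically establishing IP communication with the trusted user. Compared with prior-art solutions, this makes it possible not only for the monitoring-end user to check the situation at the live communication terminal at any time, but also for the person at the live communication terminal to interact with the monitoring-end user in real time, which improves the user experience. Because IP communication is established without the user of the live communication terminal manually confirming the connection request, live monitoring is not prevented when nobody is at the live communication terminal or when someone is there but cannot answer normally.
Although the configuration of an embodiment of the present invention makes two-way interaction between the monitoring-end user and the person at the live communication terminal possible, the monitoring-end user sometimes does not want the person at the live communication terminal to know who is monitoring. Therefore, after automatically establishing IP communication with the trusted user, the transceiver may send only the video and audio collected by the camera and the audio collection unit to the trusted user, and only in response to a two-way communication request from the trusted user output the audio from the trusted user through the speaker while sending the collected video and audio to the trusted user. In this way the monitoring-end user can flexibly choose whether to let the person at the live communication terminal know about the monitoring, which increases flexibility on the monitoring-end user's side.
Moreover, the single-call live communication terminal provided by an embodiment of the present invention displays different information depending on whether it receives video, making the manner of information display and the format of data transmission more flexible.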
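Moreover, the single-call live communication terminal provided by an embodiment of the present invention uses direct end-to-end communication when communicating with a single trusted user and switches to IP communication via a server when communicating with multiple trusted users. This flexible approach effectively avoids wasting server resources when the terminal communicates with a single trusted user, and lets the terminal forward data through the server when communicating with multiple trusted users, transferring large amounts of data faster and more accurately.
Moreover, the single-call live communication terminal provided by an embodiment of the present invention can, when in IP communication with multiple trusted users, have the display simultaneously show the videos or identifiers of the multiple trusted users, thereby improving the user's visual experience.
Moreover, the single-call live communication terminal provided by an embodiment of the present invention can, when in IP communication with multiple trusted users, have the transceiver disconnect the IP communication with one or more of them, so that the trusted user of the single-call live communication terminal can freely choose communication partners. In addition, the speaker of the terminal can output or not output sound for one or more trusted users, further increasing the flexibility with which trusted users conduct video communication, voice communication, or picture-only communication.
Moreover, in the single-call live communication terminal provided by an embodiment of the present invention, in response to one of the videos or identifiers of the plurality of trusted users being selected, the video or identifier of the selected trusted user becomes an enlarged main picture, which highlights the communication between the terminal and the trusted user corresponding to the main picture and further improves the user's visual experience.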
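Moreover, the single-call live communication terminal provided by an embodiment of the present invention can recognize a person or a specific person from the video and audio collected by the camera and the audio collection unit respectively and send reminder information to the trusted user, thereby meeting the trusted user's need to monitor only when a person or a specific person appears in a particular environment and avoiding continuous monitoring.
Moreover, the single-call live communication terminal provided by an embodiment of the present invention can perform identification based on one or more of face recognition, height recognition, voice recognition, and the identity indicated by the wireless signal emitted by a carried mobile phone, which can effectively improve the sensitivity with which the terminal recognizes its surroundings.
Moreover, the single-call live communication terminal provided by an embodiment of the present invention can recognize a specific action or an abnormal condition from the video and audio collected by the camera and the audio collection unit respectively and send reminder information to the trusted user, thereby meeting the trusted user's need to monitor only when certain situations arise at the terminal and avoiding continuous monitoring.
Moreover, the single-call live communication terminal provided by an embodiment of the present invention can either build a model for a predetermined action in advance or generate a model by self-learning, and can search the video and audio collected by the camera and the audio collection unit respectively for actions matching the built model, thereby identifying specific actions more flexibly, intelligently, and accurately and monitoring the surroundings better.
Moreover, the single-call live communication terminal provided by an embodiment of the present invention uses a depth sensor for depth recognition of its surroundings and thereby achieves higher accuracy in recognizing three-dimensional objects, people, specific people, actions, and the like.
Moreover, the camera of the single-call live communication terminal provided by an embodiment of the present invention is rotatable and can further rotate toward the identified element, capturing events more intelligently and flexibly.
Moreover, because in an embodiment of the present invention the display brightness can be adjusted according to sensed changes in the ambient light around the single-call live communication terminal, the comfort of viewing the display is improved.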
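The tool installed on a mobile terminal provided by an embodiment of the present invention sends a connection request for a specific communication terminal and is configured to receive an automatic response from the specific communication terminal, so that IP communication with the specific mobile terminal is established automatically. Because IP communication is established without the user of the live communication terminal manually confirming the connection request, live monitoring is not prevented when nobody is at the live communication terminal.
In an embodiment of the present invention, after IP communication with the specific mobile terminal is established automatically, the receiving unit receives video and audio from the specific communication terminal and the transmitting unit does not transmit the user's audio; only in response to a second trigger does the transmitting unit transmit audio to the specific communication terminal while the receiving unit receives video and audio from the specific communication terminal. In this way, a monitoring-end user who does not want the person at the live communication terminal to know about the monitoring can simply not perform the second trigger, so the monitoring-end user can flexibly choose whether to let the person at the live communication terminal know about the monitoring, which increases flexibility on the monitoring-end user's side.
In an embodiment of the present invention, the trigger may be any one of the power-on of the mobile terminal, activation of the tool in the power-on state of the mobile terminal, a specific action on the user interface in the power-on state of the mobile terminal, a specific voice received in the power-on state of the mobile terminal, and the light sensed in the power-on state of the mobile terminal becoming stronger, which increases the flexibility with which the mobile terminal is triggered.
In addition, in an embodiment of the present invention, the mobile terminal may store connections for a plurality of communication terminals and let the user select one of them to communicate with, so that one mobile terminal can be bound to multiple single-call live communication terminals at the same time, improving user convenience.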
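Those of ordinary skill in the art will appreciate that although the following detailed description refers to illustrated embodiments and drawings, the present invention is not limited to these embodiments. Rather, the scope of the invention is broad and is intended to be defined only by the appended claims.
Other features, objects, and advantages of the present invention will become more apparent upon reading the detailed description of non-limiting embodiments given with reference to the following drawings:
FIG. 1 shows a schematic block diagram of a single-call live communication terminal according to an embodiment of the present invention;
FIG. 2(a) shows a schematic diagram of a single-call live communication terminal performing IP communication with a single user according to an embodiment of the present invention;
FIG. 2(b) shows a schematic diagram of a single-call live communication terminal performing IP communication with a plurality of users according to another embodiment of the present invention;
FIG. 3 shows an external left side view of a single-call live communication terminal according to an embodiment of the present invention;
FIG. 4 shows a block diagram of a mobile terminal according to an embodiment of the present invention;
FIG. 5 shows a flowchart of a single-call live communication method according to yet another embodiment of the present invention.
The same or similar reference numerals in the drawings denote the same or similar components.
The present invention is described in further detail below with reference to the drawings.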
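FIG. 1 shows a schematic diagram of a single-call live communication terminal 1 according to an embodiment of the present invention. The single-call live communication terminal 1 according to an embodiment of the present invention includes a camera 101, an audio collection unit 102, a speaker 104, and a transceiver 105. The video and audio collected by the camera 101 and the audio collection unit 102 respectively are sent through the transceiver 105. The audio received through the transceiver 105 is output through the speaker 104. In response to receiving a connection request from a user, the transceiver 105 automatically issues a response to the connection request, thereby automatically establishing IP communication with the user. "Single-call" means that two-way communication can proceed automatically after a one-way call.
After the transceiver 105 has automatically established IP communication with the user, two-way communication between the trusted user and the person at the single-call live communication terminal 1 can be established automatically. That is, the audio from the trusted user is output through the speaker 104 while the video and audio collected by the camera 101 and the audio collection unit 102 are sent to the trusted user. Alternatively, the trusted user can at first only be informed of the situation at the single-call live communication terminal 1, without the trusted user's audio and the like being transmitted to the terminal 1 side. That is, only the video and audio collected by the camera 101 and the audio collection unit 102 are sent to the trusted user. Only after the trusted user issues a two-way communication request are the trusted user's audio and the like transmitted to the terminal 1 side, that is, the audio from the trusted user is output through the speaker 104 while the video and audio collected by the camera 101 and the audio collection unit 102 are sent to the trusted user.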
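In FIG. 2, the camera 101 is the camera at the upper end of the live communication terminal 1, but those skilled in the art will understand that it may also be another imaging device located elsewhere on the live communication terminal 1. The audio collection unit 102 is, for example, a microphone on the outer surface of the live communication terminal 1, but may also be another audio collection device. The speaker 104 is, for example, a loudspeaker on the outer surface of the live communication terminal 1, but may also be another audio output device. The transceiver 105 is, for example, an antenna, and may also be another transceiving device such as a built-in wireless transceiver module.
Here, the single-call live communication terminal includes, but is not limited to, any electronic product capable of human-computer interaction with a user through a touchpad, voice-control device, remote-control device, keyboard, or the like, such as a computer, a tablet (PAD), or an Internet Protocol television (IPTV). Those skilled in the art will understand that other user equipment, if applicable to the present invention, should also be included in the scope of protection of the present invention.
Further, still referring to FIG. 1, the single-call live communication terminal 1 may also include a display 103. Where the transceiver 105 has established IP communication with the trusted user, the display shows the video if the transceiver 105 receives video, and shows the identifier of the trusted user if the transceiver 105 does not receive video. Of course, even when the transceiver 105 can receive video, the display 103 may show only the identifier of the trusted user. The identifier of the trusted user may be a video screenshot or avatar of the trusted user or another identifier. Of course, the single-call live communication terminal 1 may also omit the display 103, in which case, during IP communication with the trusted user, the image of the trusted user cannot be seen at the live communication terminal 1 and only the trusted user's voice can be heard.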
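FIG. 2(a) shows a schematic diagram of the single-call live communication terminal 1 performing IP communication with a single trusted user according to an embodiment of the present invention. According to FIG. 2(a), when the single-call live communication terminal 1 performs IP communication with a single trusted user, the IP communication is preferably based on the Point-to-Point Protocol, which saves server resources. FIG. 2(b) shows a schematic diagram of the single-call live communication terminal 1 performing IP communication with a plurality of trusted users according to another embodiment of the present invention. According to FIG. 2(b), when the single-call live communication terminal 1 performs IP communication with a plurality of trusted users, information is transmitted and received through the server 5 via the IP network 4.
Specifically, taking trusted user A and trusted user B as an example: when the single-call live communication terminal 1 performs IP communication only with trusted user A, the IP communication is performed directly based on the point-to-point protocol. When the terminal 1 has already established IP communication with trusted user A and then receives a connection request from trusted user B, it issues to trusted user B a response for IP communication via the server, and issues to trusted user A a request to switch to IP communication via the server. Thereafter both trusted user A and trusted user B communicate with the terminal 1 through the server; that is, the IP communication between trusted user A and the terminal 1 switches from point-to-point communication to communication via the server. Here, the server may include a network host, a single network server, a set of multiple network servers, or a cloud-computing-based collection of computers.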
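Optionally, when the transceiver 105 of the single-call live communication terminal 1 is in IP communication with a plurality of trusted users at the same time, the display 103 of the terminal 1 can simultaneously show the videos or identifiers of the plurality of trusted users. Preferably, so that the terminal 1 can choose its communication partners more freely, in response to one or more of the videos or identifiers of the plurality of trusted users being selected, the transceiver 105 disconnects the IP communication with the trusted users corresponding to the one or more videos or identifiers. Alternatively, while the transceiver 105 remains in IP communication with the one or more trusted users, the speaker 104 does not output the voice of the trusted users corresponding to the selected videos or identifiers, and only their video images are shown on the display 103, which prevents the voices of multiple trusted users, as heard by the person at the terminal 1, from interfering with one another.
Optionally, to better highlight the main picture on the display 103 of the single-call live communication terminal 1, in response to one of the videos or identifiers of the plurality of trusted users being selected, the video or identifier of the selected trusted user is promoted from its original picture to an enlarged main picture.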
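According to an embodiment of the present invention, to alert the trusted user more intelligently to the situation at the single-call live communication terminal, the terminal 1 may, in response to a person or a specific person being recognized in the video and audio collected by the camera 101 and the audio collection unit 102 respectively, send reminder information to the trusted user through the transceiver 105. Typically, when the environment of the terminal 1 changes from unoccupied to occupied, that is, when the camera 101 and the audio collection unit 102 detect that a person has appeared in the current location, the transceiver 105 actively sends reminder information to the trusted user at the other end, informing that user that someone is present in the current environment. Typically, the terminal 1 can also have the transceiver 105 actively send reminder information to the trusted user when a specific person is recognized by the camera 101 and the audio collection unit 102. For example, in a real-life scenario, the babysitter has been at home all along and the child then comes back from school; the terminal 1 placed in the home recognizes the child through the camera 101 and the audio collection unit 102, and the transceiver 105 promptly or in real time sends reminder information to a remote user (such as the father in the office).
Optionally, the single-call live communication terminal 1 can identify a person or a specific person through the camera 101, the audio collection unit 102, and other devices or units, based on one or more of face recognition, height recognition, voice recognition, and the identity indicated by the wireless signal emitted by a carried mobile phone.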
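In the case of identifying a person: human face patterns are much alike, the height of most people falls within a certain range, and the frequency of the human voice also falls within a certain range. The presence of a person can therefore be recognized when, for example, a region of the captured image is similar to a stored face pattern, and/or the height determined from the distance between the face and the single-call live communication terminal 1, as sensed by the position sensor and/or the depth sensor, is within the certain range, and/or the audio collected by the audio collection unit 102 is also within the certain range.
In the case of identifying a specific person, the face pattern and/or height and/or voice frequency of the specific person may be stored in the memory in advance. When a region of the captured image matches the stored pattern of the specific face, and/or the height determined from the distance between the specific face and the terminal 1, as sensed by the position sensor and/or the depth sensor, matches the stored height, and/or the frequency of the audio collected by the audio collection unit 102 matches the stored frequency of the specific person's voice, the presence of the specific person can be identified.
Self-learning methods can also be used to identify the presence of a person or a specific person. For example, if a certain pattern in the captured images always appears together with a certain frequency in the collected sound, a prompt may be shown on the display stating that a person has been recognized and asking the person beside the live communication terminal 1 to confirm and name that person. If the person beside the live communication terminal 1 finds that the identification is wrong, they give feedback on the display interface. After such feedback is received, the next time this pattern in a captured image appears together with this frequency in the collected sound, a person or a specific person is not considered to be present. In the self-learning mode, the face pattern and/or height and/or voice frequency of the specific person need not be stored in the memory in advance.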
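In addition, a person or a specific person can be identified based on the identity indicated by the wireless signal emitted by a carried mobile phone. For example, the single-call live communication terminal 1 is a Bluetooth device, and the user's mobile phone also has a Bluetooth wireless unit. When the terminal 1 recognizes that a Bluetooth wireless unit with a specific identity has appeared within a certain distance, the specific person is considered to have been identified.
The manner in which the single-call live communication terminal 1 recognizes a person or a specific person is not limited here. Any device or unit capable of identifying a person or a specific person, if applicable to the present invention, should be included in the scope of protection of the present invention and is hereby incorporated by reference.
Optionally, the single-call live communication terminal 1 can also identify a specific action from the video and audio collected through the camera 101 and the audio collection unit 102, for example an elderly person falling or a child dancing, whereupon the transceiver 105 actively sends reminder information to the trusted user at the other end.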
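Optionally, actions can be set manually in advance and models built for the set actions. When a specific action matching a stored model is found in the video and audio collected by the camera 101 and the audio collection unit 102, the transceiver 105 actively sends reminder information to the trusted user at the other end. For example, for an action such as watching TV, a model is built as follows: a person is identified sitting on the sofa; there is an object in the person's gaze direction; the object is recognized as a TV; the person's gaze stays on the TV for at least 10 seconds. In operation, a person is first detected in the image captured by the camera 101 and is then detected to be sitting on the sofa (recognition of the sofa is similar to face recognition and can also be performed by pattern matching; the image of the person sitting on the sofa can also be treated as a whole as one object for pattern matching). The person's gaze direction is then detected, followed by a check of whether the object in that direction is a television (for example, by pattern matching the television as an object). If so, a 10-second count is started.
Of course, the single-call live communication terminal 1 can also build an action model automatically by self-learning methods such as machine learning. For example, the terminal 1 extracts action features from the video and audio collected by the camera 101 and the audio collection unit 102, and builds an action model based on the extracted features. For example, if the collected video and audio show a person sitting on the sofa with a TV in the direction of the person's gaze, and the frequency of events in which the person's gaze stays on the TV for more than 10 seconds exceeds a threshold, this is considered a model of a specific action. In this case, the action model need not be stored in a database in advance; instead, the model of the action is extracted by learning from the video and audio collected by the camera 101 and the audio collection unit 102.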
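To identify specific actions more accurately, the single-call live communication terminal 1 further includes a depth sensor 197, and the specific action is identified jointly by the camera 101, the audio collection unit 102, and the depth sensor from the collected video and audio and the sensed depth. The depth sensor senses the depth, that is, the distance between a person or object and the single-call live communication terminal 1. Although the depth sensor 197 is located to the left of the center of the upper frame of the display in FIG. 2(a), it may be disposed at other reasonable physical locations. When a person or object performs an action, the same amplitude of movement produces a different amount of change in the captured image depending on the distance from the terminal 1. Combining the depth sensor therefore allows actions to be recognized more accurately, improving recognition precision.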
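Optionally, when the single-call live communication terminal 1 identifies an abnormal condition from the video and audio collected by the camera 101 and the audio collection unit 102, the transceiver 105 actively sends reminder information to the trusted user at the other end. Abnormal conditions include, for example, a stranger visiting, fire, crying, loud noise, and electrical accidents. Typically, the abnormal condition is identified by recognizing one or more of the following: a dramatic change in the video captured by the camera; audio collected by the audio collection unit that is above a certain threshold; a dramatic change in the audio collected by the audio collection unit; a predetermined event recognized from the video and audio collected by the camera 101 and the audio collection unit 102 respectively. A predetermined event is an event specified in advance, such as a fire or an electrical accident.
For predetermined events, specifically, the single-call live communication terminal 1 recognizes a predetermined event based on the camera 101 and the audio collection unit 102. A model of the predetermined event has been built in advance, and the predetermined event is identified by searching the video and audio collected by the camera 101 and the audio collection unit 102 respectively for events that match the built model. Here, the terminal 1 can automatically build a model of a predetermined event by self-learning methods such as machine learning. Typically, the terminal 1 extracts event features from the video and audio collected by the camera 101 and the audio collection unit 102, and builds a model of the predetermined event based on the extracted event features. Of course, instead of building the model by self-learning, models of a number of predetermined events can also be specified directly.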
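FIG. 3 shows an external left side view of a single-call live communication terminal according to an embodiment of the present invention. According to an embodiment of the present invention, to collect information better, the single-call live communication terminal 1 further includes a rotating device 199 for rotating the camera 101. Preferably, in response to one of the following elements being recognized in the video and audio collected by the camera 101 and the audio collection unit 102 respectively, the rotating device 199 rotates the camera 101 to face the identified element: a person or a specific person; a specific action; an abnormal condition.
In one embodiment, the camera 101 shown in FIG. 3 can be rotated left and right toward the identified element. In another embodiment, the camera 101 shown in FIG. 3 can be rotated up, down, left, and right toward the identified element.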
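As shown in FIG. 2(a), the single-call live communication terminal 1 may further include a light sensor 198 for sensing changes in the ambient light around the terminal 1, wherein the display brightness of the display 103 is adjusted according to the change in the light. If the ambient light is strong, the display brightness can be increased. If the ambient light is weak, the display brightness can be reduced. In this way, eye discomfort when viewing the display can be reduced.
Although the light sensor in FIG. 2(a) is located to the right of the center of the upper frame of the display, it may also be disposed at any other reasonable physical location.
It should be understood that the block diagram shown in FIG. 1 is for illustrative purposes only and is not a limitation on the scope of the present invention. In some cases, certain units or devices may be added or removed as appropriate.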
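It should be noted that the single-call live communication terminal 1 sends reminder information to the trusted user via the transceiver 105 mainly by sending the trusted user a short message (SMS), a Fetion or WeChat message, or a customized message under a private protocol.
Here, the trusted user at the other end mainly performs IP communication with the single-call live communication terminal 1 in a Wi-Fi network environment. Of course, the trusted user at the other end can also communicate with the single-call live communication terminal 1 over a 3G, 2G, or 4G network or a similar communication method.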
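According to another embodiment of the present invention, as shown in FIG. 4, a tool 31 installed on a mobile terminal 3 is provided, including a transmitting unit 301 and a receiving unit 302. The transmitting unit 301 is configured to transmit, in response to a first trigger, a connection request for a specific communication terminal (corresponding to the aforementioned single-call live communication terminal). The receiving unit 302 is configured to receive an automatic response from the specific communication terminal, so that IP communication with the specific mobile terminal is established automatically. The mobile terminal includes electronic devices such as smartphones and tablet computers. The tool may be installed on the mobile terminal as an application (app) and displayed as an application icon; it may also be built into the mobile terminal as a plug-in. When the mobile terminal is in a network environment such as Wi-Fi, 3G, or 4G, it performs IP communication with the single-call live communication terminal; when the mobile terminal is in a network environment such as 2G, the single-call live communication terminal can send reminder information to the mobile terminal.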
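After IP communication with the specific mobile terminal is established automatically, the transmitting unit 301 may transmit audio to the specific communication terminal while the receiving unit 302 receives the video and audio from the specific communication terminal. Alternatively, after IP communication with the specific mobile terminal is established automatically, the receiving unit 302 receives the video and audio from the specific communication terminal and the transmitting unit 301 does not transmit the user's audio; only in response to a second trigger does the transmitting unit 301 transmit audio to the specific communication terminal while the receiving unit 302 receives the video and audio from the specific communication terminal. In this way, if the user of the mobile terminal 3 does not want the person at the specific communication terminal to know that the terminal is being monitored, the second trigger need not be performed. Then only the video and audio from the specific communication terminal are transmitted to the mobile terminal 3, and information such as the audio of the user of the mobile terminal 3 is not transmitted to the specific communication terminal.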
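The first trigger includes any one of the following: the power-on of the mobile terminal; activation of the tool in the power-on state of the mobile terminal; a specific action on the user interface in the power-on state of the mobile terminal; a specific voice received in the power-on state of the mobile terminal; the light sensed in the power-on state of the mobile terminal becoming stronger.
When the first trigger is the power-on of the mobile terminal, the communication connection with the single-call live communication terminal 1 is made automatically as the mobile terminal is powered on. This allows the phone, once powered on, to automatically enter the state of monitoring the environment in which the terminal 1 is located, improving user efficiency.
When the first trigger is the activation of the tool in the power-on state of the mobile terminal, a specific action on the user interface in the power-on state of the mobile terminal, or a specific voice received in the power-on state of the mobile terminal, whether to enter the state of monitoring the environment in which the terminal 1 is located can be decided according to the user's needs, increasing flexibility for the user. Specific actions are, for example, swiping, clicking, or double-clicking an icon, or entering specific content at a specific location on the touch screen.
When the first trigger is the light sensed in the power-on state of the mobile terminal becoming stronger, then when the user takes the mobile terminal out of a pocket, the light is sensed to become stronger and the communication connection with the single-call live communication terminal 1 is made automatically. This avoids the waste of resources that would result if the mobile terminal, placed in a pocket because the user does not wish to monitor the environment of the terminal 1, still occupied connection resources with the terminal 1. In this mode, a light sensor is provided in the mobile terminal or the tool for sensing changes in light on the surface of the mobile terminal.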
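The second trigger may include any one of the following: a specific action on the user interface while the tool is active; a specific voice input received while the tool is active. The specific action may be an action at some position on the user interface (such as sliding, single-clicking or double-clicking), and so on. For example, the first trigger may be an action on a first icon on the user interface, while the second trigger is an action on a second icon on the user interface that is different from the first icon, and so on.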
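Optionally, the sending unit 301 is configured to, where the mobile terminal stores connections for multiple communication terminals, send a connection request directed to the specific communication terminal selected by the user in response to a selection entered by the user. For example, a list of the communication terminals may be displayed to the user so that the user can select one of them. In response to this selection, a connection request is sent to the selected specific communication terminal.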
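Selecting one of several stored communication terminals and building the corresponding connection request might look like the sketch below. The request is represented as a plain dictionary with hypothetical field names; the description does not define a message format.

```python
def build_connection_request(stored_terminals, selected_index):
    """Build a connection request for the terminal the user picked from the list.

    `stored_terminals` is the list shown to the user; the dictionary layout
    of the request is an illustrative assumption.
    """
    if not 0 <= selected_index < len(stored_terminals):
        raise IndexError("selection is outside the stored terminal list")
    return {"type": "connect", "target": stored_terminals[selected_index]}
```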
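FIG. 5 shows a flowchart of a single call-to-connect live communication method 2 according to yet another embodiment of the present invention. According to FIG. 5, the single call-to-connect live communication method 2 includes:
Step S1: the single call-to-connect live communication terminal receives a connection request from a trusted user;
Step S2: in response to receiving the connection request from the trusted user, an answer to the connection request is issued automatically, so that IP communication with the trusted user is established automatically;
Step S3: during the IP communication with the trusted user, the captured video and audio are sent to the trusted user, and at least audio from the trusted user is received.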
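Steps S1 to S3 can be sketched as a single handler on the terminal side. The trusted-user set, the caller identifiers and the returned dictionary are hypothetical, and declining requests from callers outside the trusted set is an assumption, since the steps above only cover trusted users.

```python
def handle_connection_request(caller_id, trusted_users):
    """S1: receive a request; S2: auto-answer trusted users; S3: set media flow."""
    if caller_id not in trusted_users:
        # Assumed behavior: no automatic answer for unknown callers.
        return {"answered": False}
    return {
        "answered": True,              # S2: automatic answer, no user action
        "send": ["video", "audio"],    # S3: captured media sent to the user
        "receive": ["audio"],          # S3: at least audio is received
    }
```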
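Further, the single call-to-connect live communication method also includes: in response to one of the following elements being recognized from the captured video and audio, sending an alert message to the trusted user: a person or a specific person; a specific action; an abnormal condition.
Further, the single call-to-connect live communication method also includes: in response to receiving a connection request from another trusted user after IP communication with the trusted user has been established, issuing to the other trusted user an answer for IP communication via a server, and issuing to the trusted user a request to switch to IP communication via the server.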
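The alerting behavior can be sketched as a filter over recognized elements. The element labels and the message wording are hypothetical; only the three categories of elements come from the text.

```python
# Element categories that lead to an alert, per the text.
ALERT_ELEMENTS = {"person", "specific_person", "specific_action",
                  "abnormal_condition"}


def alerts_for(recognized_elements, trusted_users):
    """Return (user, message) pairs for each alert-worthy recognized element."""
    alerts = []
    for element in recognized_elements:
        if element in ALERT_ELEMENTS:
            for user in trusted_users:
                alerts.append((user, f"Recognized: {element}"))
    return alerts
```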
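The handover to server-mediated communication when a second trusted user connects can be sketched as follows. The sketch assumes that the first connection is a direct (non-server) IP link, which the phrase "switch to IP communication via the server" suggests but the text does not state outright; the class, mode labels and message names are hypothetical.

```python
class TerminalConnections:
    """Tracks how each trusted user is connected: 'direct' or 'server'."""

    def __init__(self):
        self.peers = {}

    def on_connection_request(self, user):
        """Return the (recipient, message) pairs the terminal sends out."""
        if not self.peers:
            # Assumed: a sole trusted user is connected directly.
            self.peers[user] = "direct"
            return [(user, "answer_direct")]
        # Another trusted user is answered for communication via the server.
        messages = [(user, "answer_via_server")]
        for existing, mode in list(self.peers.items()):
            if mode == "direct":
                self.peers[existing] = "server"
                messages.append((existing, "request_switch_to_server"))
        self.peers[user] = "server"
        return messages
```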
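Those skilled in the art will appreciate that the present invention may be implemented as a device, an apparatus, a method or a computer program product. Accordingly, the present disclosure may be embodied entirely in hardware, entirely in software, or in a combination of hardware and software.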
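The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.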
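It will be apparent to those skilled in the art that the present invention is not limited to the details of the exemplary embodiments described above and that it can be embodied in other specific forms without departing from its spirit or essential characteristics. The embodiments should therefore be regarded in all respects as exemplary and non-restrictive; the scope of the invention is defined by the appended claims rather than by the foregoing description, and all changes that fall within the meaning and range of equivalency of the claims are therefore intended to be embraced in the invention. No reference sign in a claim should be construed as limiting the claim concerned.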
Claims (27)
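- A single call-to-connect live communication terminal (1), comprising a camera (101), an audio capture unit (102), a speaker (104) and a transceiver (105), wherein the video and audio captured by the camera (101) and the audio capture unit (102), respectively, are sent via the transceiver (105), and audio received via the transceiver (105) is output through the speaker (104), and wherein the transceiver (105), in response to receiving a connection request from a trusted user, automatically issues an answer to the connection request, thereby automatically establishing IP communication with the trusted user.
- The single call-to-connect live communication terminal (1) according to claim 1, wherein, after IP communication with the trusted user is established automatically, the transceiver (105) only sends the video and audio captured by the camera (101) and the audio capture unit (102) to the trusted user, and only in response to a two-way communication request from the trusted user is the audio from the trusted user output through the speaker (104) while the video and audio captured by the camera (101) and the audio capture unit (102) are sent to the trusted user.
- The single call-to-connect live communication terminal (1) according to claim 1, wherein, after IP communication with the trusted user is established automatically by the transceiver (105), the audio from the trusted user is output through the speaker (104) while the video and audio captured by the camera (101) and the audio capture unit (102) are sent to the trusted user.
- The single call-to-connect live communication terminal (1) according to any one of claims 1 to 3, further comprising a display (103) which, when the transceiver (105) has established IP communication with the trusted user, displays the video if the transceiver (105) receives video, and displays an identifier of the trusted user if the transceiver (105) does not receive video.
- The single call-to-connect live communication terminal (1) according to claim 4, wherein the transceiver (105), in response to receiving a connection request from another trusted user after IP communication with the trusted user has been established, issues to the other trusted user an answer for IP communication via a server, and issues to the trusted user a request to switch to IP communication via the server.
- The single call-to-connect live communication terminal (1) according to claim 5, wherein, when the transceiver (105) has simultaneously established IP communication with a plurality of trusted users, the display (103) simultaneously displays the videos or identifiers of the plurality of trusted users.
- The single call-to-connect live communication terminal (1) according to claim 5, wherein, in response to one or more of the videos or identifiers of the plurality of trusted users being selected, the transceiver (105) disconnects the IP communication with the trusted users corresponding to the selected one or more videos or identifiers, or the speaker (104) does not output the sound of the trusted users corresponding to the selected one or more videos or identifiers.
- The single call-to-connect live communication terminal (1) according to claim 5, wherein, in response to one of the videos or identifiers of the plurality of trusted users being selected, the video or identifier of the selected trusted user becomes an enlarged main picture.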
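- The single call-to-connect live communication terminal (1) according to claim 1, wherein, in response to a person or a specific person being recognized from the video and audio captured by the camera (101) and the audio capture unit (102), respectively, the transceiver (105) sends an alert message to the trusted user.
- The single call-to-connect live communication terminal (1) according to claim 9, wherein the person or specific person is recognized based on one or more of face recognition, height recognition and voice recognition.
- The single call-to-connect live communication terminal (1) according to claim 9, wherein the transceiver (105) further receives a wireless signal emitted by a carried mobile phone, and the person or specific person is recognized based on the identity of the carried mobile phone indicated in the wireless signal.
- The single call-to-connect live communication terminal (1) according to claim 1, wherein, in response to a specific action being recognized from the video and audio captured by the camera (101) and the audio capture unit (102), respectively, the transceiver (105) sends an alert message to the trusted user.
- The single call-to-connect live communication terminal (1) according to claim 12, further comprising a depth sensor (197), wherein the specific action is recognized based on the video and audio captured by the camera (101) and the audio capture unit (102), respectively, and on the depth sensed by the depth sensor (197).
- The single call-to-connect live communication terminal (1) according to claim 1, wherein, in response to an abnormal condition being recognized from the video and audio captured by the camera (101) and the audio capture unit (102), respectively, the transceiver (105) sends an alert message to the trusted user.
- The single call-to-connect live communication terminal (1) according to claim 14, wherein the abnormal condition is recognized by recognizing one or more of the following: a drastic change in the video captured by the camera (101); audio captured by the audio capture unit (102) that is above a specific threshold; a drastic change in the audio captured by the audio capture unit (102); a predetermined event recognized based on the video and audio captured by the camera (101) and the audio capture unit (102), respectively, wherein a model of the predetermined event is established in advance, and the predetermined event is recognized by searching the video and audio captured by the camera (101) and the audio capture unit (102), respectively, for an event that matches the established model.
- The single call-to-connect live communication terminal (1) according to claim 1, further comprising a rotating device (199) that rotates the camera (101).
- The single call-to-connect live communication terminal (1) according to claim 16, wherein, in response to one of the following elements being recognized from the video and audio captured by the camera (101) and the audio capture unit (102), respectively, the rotating device (199) rotates the camera (101) to face the recognized element: a person or a specific person; a specific action; an abnormal condition.
- The single call-to-connect live communication terminal (1) according to claim 4, further comprising a light sensor (198) for sensing changes in the ambient light around the single call-to-connect live communication terminal (1), wherein the display brightness of the display (103) is adjusted according to the sensed change in the light.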
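- A tool (31) installed on a mobile terminal (3), comprising: a sending unit (301) configured to send, in response to a first trigger, a connection request directed to a specific communication terminal; and a receiving unit (302) configured to receive an automatic answer from the specific communication terminal, thereby automatically establishing IP communication with the specific communication terminal.
- The tool (31) according to claim 19, wherein, after IP communication with the specific communication terminal is established automatically, the receiving unit (302) receives video and audio from the specific communication terminal and the sending unit (301) does not send the user's audio, and only in response to a second trigger does the sending unit (301) send audio to the specific communication terminal while the receiving unit (302) receives video and audio from the specific communication terminal.
- The tool (31) according to claim 19, wherein, after IP communication with the specific communication terminal is established automatically, the sending unit (301) sends audio to the specific communication terminal while the receiving unit (302) receives video and audio from the specific communication terminal.
- The tool (31) according to claim 19, wherein the first trigger includes any one of the following: power-on of the mobile terminal; activation of the tool while the mobile terminal is powered on; a specific action on the user interface while the mobile terminal is powered on; a specific voice input received while the mobile terminal is powered on; an increase in light sensed while the mobile terminal is powered on.
- The tool (31) according to claim 20, wherein the second trigger includes any one of the following: a specific action on the user interface while the tool is active; a specific voice input received while the tool is active.
- The tool (31) according to claim 19, wherein the sending unit (301) is configured to, where the mobile terminal stores connections for multiple communication terminals, send a connection request directed to the specific communication terminal selected by the user in response to a selection entered by the user.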
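- A single call-to-connect live communication method (2), comprising: receiving a connection request from a trusted user (S1); in response to receiving the connection request from the trusted user, automatically issuing an answer to the connection request, thereby automatically establishing IP communication with the trusted user (S2); and, during the IP communication with the trusted user, sending captured video and audio to the trusted user and receiving at least audio from the trusted user (S3).
- The single call-to-connect live communication method (2) according to claim 25, further comprising: in response to one of the following elements being recognized from the captured video and audio, sending an alert message to the trusted user: a person or a specific person; a specific action; an abnormal condition.
- The single call-to-connect live communication method (2) according to claim 25 or 26, further comprising: in response to receiving a connection request from another trusted user after IP communication with the trusted user has been established, issuing to the other trusted user an answer for IP communication via a server, and issuing to the trusted user a request to switch to IP communication via the server.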
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/316,449 US20180039836A1 (en) | 2014-06-05 | 2014-09-15 | Single call-to-connect live communication terminal, method and tool |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410247191.1A CN104023207A (zh) | 2014-06-05 | 2014-06-05 | Single call-to-connect live communication terminal, method and tool |
CN201410247191.1 | 2014-06-05 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015184701A1 (zh) | 2015-12-10 |
Family
ID=51439751
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2014/086574 WO2015184701A1 (zh) | Single call-to-connect live communication terminal, method and tool | 2014-06-05 | 2014-09-15 |
Country Status (3)
Country | Link |
---|---|
US (1) | US20180039836A1 (zh) |
CN (1) | CN104023207A (zh) |
WO (1) | WO2015184701A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111064928A (zh) * | 2019-12-10 | 2020-04-24 | 湖北牡丹科技发展有限公司 | Video surveillance system with face recognition function |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104023207A (zh) * | 2014-06-05 | 2014-09-03 | 北京小鱼儿科技有限公司 | Single call-to-connect live communication terminal, method and tool |
CN105848002A (zh) * | 2016-04-01 | 2016-08-10 | 太仓日森信息技术有限公司 | Picture prompt method upon a network video access request |
US11321655B2 (en) * | 2019-11-26 | 2022-05-03 | Ncr Corporation | Frictionless and autonomous control processing |
US11818086B1 (en) * | 2022-07-29 | 2023-11-14 | Sony Group Corporation | Group voice chat using a Bluetooth broadcast |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
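CN101656874A (zh) * | 2009-09-17 | 2010-02-24 | 杭州智傲科技有限公司 | Remote video monitoring method |
CN101821765A (zh) * | 2007-10-16 | 2010-09-01 | 朴相来 | System and method for protecting and managing children using a wireless communication network |
CN102333202A (zh) * | 2010-07-14 | 2012-01-25 | 山东省普来特能源与电器研究院 | Network video monitoring device |
CN102572388A (zh) * | 2011-10-31 | 2012-07-11 | 东莞市中控电子技术有限公司 | Face-recognition-based network video monitoring device and monitoring recognition method |
CN104023207A (zh) * | 2014-06-05 | 2014-09-03 | 北京小鱼儿科技有限公司 | Single call-to-connect live communication terminal, method and tool |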
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101790003A (zh) * | 2009-01-23 | 2010-07-28 | 英华达(上海)电子有限公司 | Monitoring or listening method for a mobile device having audio or video functions |
US8917306B2 (en) * | 2011-07-29 | 2014-12-23 | Cisco Technology, Inc. | Previewing video data in a video communication environment |
-
2014
- 2014-06-05 CN CN201410247191.1A patent/CN104023207A/zh active Pending
- 2014-09-15 US US15/316,449 patent/US20180039836A1/en not_active Abandoned
- 2014-09-15 WO PCT/CN2014/086574 patent/WO2015184701A1/zh active Application Filing
Also Published As
Publication number | Publication date |
---|---|
US20180039836A1 (en) | 2018-02-08 |
CN104023207A (zh) | 2014-09-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2016008210A1 (zh) | | Communication terminal and tool installed on a mobile terminal |
US11115227B2 | | Terminal and method for bidirectional live sharing and smart monitoring |
JP6445173B2 (ja) | | Device control method and apparatus |
US9626496B2 | | Method and apparatus for processing sensor data of detected objects |
US11570354B2 | | Display assistant device having a monitoring mode and an assistant mode |
US10063760B2 | | Photographing control methods and devices |
US10055094B2 | | Method and apparatus for dynamically displaying device list |
EP3144910A1 | | Method and device for locating a wearable device |
US9742582B2 | | House monitoring system |
WO2015184701A1 (zh) | | Single call-to-connect live communication terminal, method and tool |
US11700071B2 | | Method, device, system, and storage medium for live broadcast detection and data processing |
CN105872952A (zh) | | Information sending method and apparatus based on a wearable device |
CN109061903B (zh) | | Data display method and apparatus, smart glasses and storage medium |
CN104352228A (zh) | | Application processing method and apparatus |
CN104332037A (zh) | | Alarm detection method and apparatus |
CN104903844A (zh) | | Method for presenting data in a network and associated mobile devices |
CN110603813A (zh) | | System and method for video calling and security monitoring through a smart TV |
KR102291482B1 (ko) | | Care system for elderly people living alone and operating method thereof |
CN108882212A (zh) | | Health data transmission method and apparatus |
JP6145905B1 (ja) | | Lighting control system and lighting control method |
CN105100749A (zh) | | Image capturing method, apparatus and terminal |
JP2019208172A (ja) | | Communication system |
WO2017101404A1 (zh) | | Apparatus and method for a mobile device and a television, and data transmission system |
JP2014230095A (ja) | | Image sending device |
CN116709207A (zh) | | Control method for a smart wristband, smart wristband and computer-readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 14894075; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 27-03-2017) |
122 | Ep: pct application non-entry in european phase | Ref document number: 14894075; Country of ref document: EP; Kind code of ref document: A1 |
WWE | Wipo information: entry into national phase | Ref document number: 15316449; Country of ref document: US |